Re: PATCH: adding invoking_program to plugin_gcc_version
On Wed, 1 Jun 2011 07:52:48 +0200 Basile Starynkevitch wrote: > > Hello All, > > The attached patch to trunk 174518 adds a field invoking_program to the > plugin_gcc_version structure. It informs the plugin about the program > "cc1", "cc1plus", "lto1" using them. Wrong patch, here is a better one # gcc/ChangeLog entry ## 2011-06-01 Basile Starynkevitch * gcc-plugin.h (struct plugin_gcc_version): Add invoking_program field. * configure.ac: Ditto. * configure: Regenerate. * plugin.c (initialize_plugins): Set invoking_program. Ok if it bootstraps? Cheers -- Basile STARYNKEVITCH http://starynkevitch.net/Basile/ email: basilestarynkevitchnet mobile: +33 6 8501 2359 8, rue de la Faiencerie, 92340 Bourg La Reine, France *** opinions {are only mine, sont seulement les miennes} *** Index: gcc/configure === --- gcc/configure (revision 174518) +++ gcc/configure (working copy) @@ -10977,7 +10977,9 @@ static char revision[] = "$gcc_REVISION"; static struct plugin_gcc_version gcc_version = {basever, datestamp, devphase, revision, - configuration_arguments}; + configuration_arguments, +/* Field invoking_program is set in general_init inside toplev.c. */ + NULL}; EOF # Internationalization @@ -17517,7 +17519,7 @@ else lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2 lt_status=$lt_dlunknown cat > conftest.$ac_ext <<_LT_EOF -#line 17520 "configure" +#line 17522 "configure" #include "confdefs.h" #if HAVE_DLFCN_H @@ -17623,7 +17625,7 @@ else lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2 lt_status=$lt_dlunknown cat > conftest.$ac_ext <<_LT_EOF -#line 17626 "configure" +#line 17628 "configure" #include "confdefs.h" #if HAVE_DLFCN_H Index: gcc/configure.ac === --- gcc/configure.ac (revision 174518) +++ gcc/configure.ac (working copy) @@ -1521,7 +1521,9 @@ static char revision[] = "$gcc_REVISION"; static struct plugin_gcc_version gcc_version = {basever, datestamp, devphase, revision, - configuration_arguments}; + configuration_arguments, +/* Field invoking_program is set in general_init inside toplev.c. */ + NULL}; EOF changequote([,])dnl Index: gcc/gcc-plugin.h === --- gcc/gcc-plugin.h (revision 174518) +++ gcc/gcc-plugin.h (working copy) @@ -70,6 +70,8 @@ struct plugin_gcc_version const char *devphase; const char *revision; const char *configuration_arguments; + /* Short string like "cc1" or "lto1" giving the invoking program. */ + const char *invoking_program; }; /* Object that keeps track of the plugin name and its arguments. */ Index: gcc/plugin.c === --- gcc/plugin.c (revision 174518) +++ gcc/plugin.c (working copy) @@ -632,6 +632,7 @@ initialize_plugins (void) timevar_push (TV_PLUGIN_INIT); #ifdef ENABLE_PLUGIN + gcc_version.invoking_program = progname; /* Traverse and initialize each plugin specified in the command-line. */ htab_traverse_noresize (plugin_name_args_tab, init_one_plugin, NULL); #endif
PATCH: adding invoking_program to plugin_gcc_version
Hello All, The attached patch to trunk 174518 adds a field invoking_program to the plugin_gcc_version structure. It informs the plugin about the program "cc1", "cc1plus", "lto1" using them. # gcc/ChangeLog entry ## 2011-06-01 Basile Starynkevitch * gcc-plugin.h (struct plugin_gcc_version): Add invoking_program field. * toplev.c (general_init): Fill invoking_program field of gcc_version. * configure.ac: Ditto. * configure: Regenerate. See http://gcc.gnu.org/ml/gcc/2011-06/msg0.html & http://gcc.gnu.org/ml/gcc/2011-05/msg00324.html for why I believe it is useful. Ok for trunk? Regards. -- Basile STARYNKEVITCH http://starynkevitch.net/Basile/ email: basilestarynkevitchnet mobile: +33 6 8501 2359 8, rue de la Faiencerie, 92340 Bourg La Reine, France *** opinions {are only mine, sont seulement les miennes} *** Index: gcc/configure === --- gcc/configure (revision 174518) +++ gcc/configure (working copy) @@ -10977,7 +10977,9 @@ static char revision[] = "$gcc_REVISION"; static struct plugin_gcc_version gcc_version = {basever, datestamp, devphase, revision, - configuration_arguments}; + configuration_arguments, +/* Field invoking_program is set in general_init inside toplev.c. */ + NULL}; EOF # Internationalization @@ -17517,7 +17519,7 @@ else lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2 lt_status=$lt_dlunknown cat > conftest.$ac_ext <<_LT_EOF -#line 17520 "configure" +#line 17522 "configure" #include "confdefs.h" #if HAVE_DLFCN_H @@ -17623,7 +17625,7 @@ else lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2 lt_status=$lt_dlunknown cat > conftest.$ac_ext <<_LT_EOF -#line 17626 "configure" +#line 17628 "configure" #include "confdefs.h" #if HAVE_DLFCN_H Index: gcc/toplev.c === --- gcc/toplev.c (revision 174518) +++ gcc/toplev.c (working copy) @@ -1148,6 +1148,7 @@ general_init (const char *argv0) --p; progname = p; + gcc_version.invoking_program = progname; xmalloc_set_program_name (progname); hex_init (); Index: gcc/configure.ac === --- gcc/configure.ac (revision 174518) +++ gcc/configure.ac (working copy) @@ -1521,7 +1521,9 @@ static char revision[] = "$gcc_REVISION"; static struct plugin_gcc_version gcc_version = {basever, datestamp, devphase, revision, - configuration_arguments}; + configuration_arguments, +/* Field invoking_program is set in general_init inside toplev.c. */ + NULL}; EOF changequote([,])dnl Index: gcc/gcc-plugin.h === --- gcc/gcc-plugin.h (revision 174518) +++ gcc/gcc-plugin.h (working copy) @@ -70,6 +70,8 @@ struct plugin_gcc_version const char *devphase; const char *revision; const char *configuration_arguments; + /* Short string like "cc1" or "lto1" giving the invoking program. */ + const char *invoking_program; }; /* Object that keeps track of the plugin name and its arguments. */
Re: [build] Move Solaris 2 startup files to toplevel libgcc, revised
I am on travel and only doing email from my phone. Ralf and I had dinner together last night and he mentioned not understanding some parts of the patch. I won't be home until next week to actually test this. I guess hoping due diligence was used and knowing I may whine next week if it breaks RTEMS, do a visual double check please on the RTEMS specific parts and commit it. Either that or wait for a test report from me. --joel Rainer Orth wrote: >Paolo Bonzini writes: > >> On 05/30/2011 04:29 PM, Rainer Orth wrote: > * Non-Solaris SPARC changes: > >> > >> After I had moved sparc/sol2-c[in].asm to libgcc, I noticed that > >> despite the name a few non-Solaris targets uses those files, too: > >> > >> sparc-*-elf*, sparc-*-rtems*, sparc64-*-elf*, sparc64-*-rtems* > > Yes, I tried to untangle that, but the RTEMS folks complained so I had > to > backpedal. Note that this is also the case on the i386 side. >>> Drats, I hadn't expected anything like this ;-( >>> >>> Here's the updated patch which takes care of this. I've taken the >>> liberty to rename gcc/config/i386/t-rtems-i386 to >>> gcc/config/i386/t-rtems, following all other RTEMS makefile fragments. >>> I'd be really grateful if the RTEMS maintainers could give it a whirl. >>> >>> Besides, I still need build and SPARC maintainer approval for the >>> non-Solaris parts of the patch from as outlined in the previous >>> submission: >>> >>> http://gcc.gnu.org/ml/gcc-patches/2011-05/msg02181.html >>> >>> Bootstrapped without regressions on i386-pc-solaris2.10, >>> i386-pc-solaris2.11 and sparc-sun-solaris2.11. >>> >>> Ok for mainline after the RTEMS parts have been tested/approved? >> >> Looks good as far as I'm concerned. As a followup, please delete >> t-slibgcc-darwin and use the new t-slibgcc-dummy instead (even better, you >> could "svn mv" it). > >I think I just need RTEMS testing/approval now, everything else should >be set? > > Rainer > >-- >- >Rainer Orth, Center for Biotechnology, Bielefeld University
[lto] Merge streamer hooks from pph branch. (issue4568043)
This patch merges the LTO streamer hooks from the pph branch. These hooks are meant to separate streaming work that is specific to GIMPLE from other modules that may want to stream their own data structures. In the PPH branch, we are using the streamer to save front end data structures, so there was a bunch of work and checks that made no sense in the front end. All this module-specific work can be moved to the hooks defined in struct lto_streamer_hooks. One of the concerns I had with the hooks is potential overhead when streaming because of the new indirect calls added by the patch. This does not seem to be the case. I did several timing runs using insn-emit.o (the biggest object file we generate in a bootstrap) and using an LTO link of cc1. The times are slightly in favour of the hooks, actually (the patch removes some checks from the core tree pickling routines), but the difference is insignificant. Over 5 runs on an idle system: Without this patch, we compile insn-emit.o in 41.56 secs (wall time), with the patch the time is 41.34 secs (wall time). The other change in the patch is to reserve enough tags in enum LTO_tags to cover all the possible tree codes. This exposes the bug that I fixed yesterday in the pph branch (the sign of the shifts in lto_output_int_in_range). This means that we need to use output_record_start (LTO_null) instead of output_zero to output 0-delimiters, which increases the size of the output by less than 1% on average. Tested with LTO profiledbootstrap on x86_64. OK for trunk? Diego. * Makefile.in (cgraphunit.o): Add dependency on LTO_STREAMER_H. * cgraphunit.c: Include lto-streamer.h (cgraph_finalize_compilation_unit): Call gimple_streamer_hooks_init if LTO is enabled. * lto-streamer-in.c (unpack_value_fields): Do not check for TS_SSA_NAME, TS_STATEMENT_LIST and TS_OMP_CLAUSE. Call lto_streamer_hooks.unpack_value_fields if set. (lto_materialize_tree): For unhandled nodes, first try to call lto_streamer_hooks.alloc_tree, if it exists. (lto_input_ts_decl_common_tree_pointers): Move reading of DECL_INITIAL to gimple_streamer_read_tree. (lto_input_tree_pointers): Do not handle TS_SSA_NAME, TS_STATEMENT_LIST, TS_OMP_CLAUSE and TS_OPTIMIZATION. (lto_read_tree): Call lto_streamer_hooks.read_tree if set. Only register symbols in symtab if lto_streamer_hooks.register_decls_in_symtab_p is set. (gimple_streamer_read_tree): New. (lto_reader_init): Rename from lto_init_reader. Move initialization code to gimple_streamer_reader_init. Call lto_streamer_hooks.reader_init if set. (gimple_streamer_reader_init): New. * lto-streamer-out.c (pack_value_fields): Do not handle TS_SSA_NAME, TS_STATEMENT_LIST and TS_OMP_CLAUSE. Call lto_streamer_hooks.pack_value_fields if set. (lto_output_tree_ref): For tree nodes that are not normally indexable, call lto_streamer_hooks.indexable_with_decls_p before giving up. (lto_output_ts_decl_common_tree_pointers): Move handling for FUNCTION_DECL and TRANSLATION_UNIT_DECL to gimple_streamer_write_tree. Move assertion for NULL DECL_SAVED_TREEs to gimple_streamer_write_tree. (lto_output_ts_decl_with_vis_tree_pointers): Call output_record_start with LTO_null instead of output_zero. (lto_output_ts_binfo_tree_pointers): Likewise. (lto_output_tree): Likewise. (output_eh_try_list): Likewise. (output_eh_region): Likewise. (output_eh_lp): Likewise. (output_eh_regions): Likewise. (output_bb): Likewise. (output_function): Likewise. (output_unreferenced_globals): Likewise. (lto_output_tree_pointers): Do not handle TS_SSA_NAME, TS_STATEMENT_LIST, TS_OMP_CLAUSE and TS_OPTIMIZATION. (lto_output_tree_header): Call lto_streamer_hooks.is_streamable instead of lto_is_streamable. Call lto_streamer_hooks.output_tree_header if set. (lto_write_tree): Call lto_streamer_hooks.write_tree if set. (gimple_streamer_write_tree): New. (lto_output_tree): If lto_streamer_hooks.has_unique_integer_csts_p is set, lookup the constant in the streamer cache first. * lto-streamer.c (lto_streamer_cache_create): Call lto_streamer_hooks.preload_common_nodes instead of lto_preload_common_nodes. (lto_is_streamable): Move from lto-streamer.h (gimple_streamer_hooks_init): New. (streamer_hooks): New. (streamer_hooks_): Declare. (streamer_hooks_init): New. * lto-streamer.h (struct output_block): Forward declare. (struct lto_input_block): Likewise. (struct data_in): Likewise. (struct bitpack_d): Likewise. (enum LTO_tags): Reserve MAX_TREE_CODES
Re: New options to disable/enable any pass for any functions (issue4550056)
Please discard the previous one. This is the right one: David On Tue, May 31, 2011 at 5:01 PM, Xinliang David Li wrote: > The new patch is attached. The test (c,c++,fortran, java, ada) is on going. > > Thanks, > > David > > On Tue, May 31, 2011 at 9:06 AM, Xinliang David Li wrote: >> On Tue, May 31, 2011 at 2:05 AM, Richard Guenther >> wrote: >>> On Mon, May 30, 2011 at 10:16 PM, Xinliang David Li >>> wrote: The attached are two simple follow up patches 1) the first patch does some refactorization on function header dumping (with more information printed) 2) the second patch cleans up some pass names. Part of the cleanup results from a previous discussion with Honza -- a) rename 'tree_profile_ipa' into 'profile', and make 'ipa_profile' and 'profile' into 'profile_estimate'. The rest of cleanups are needed to make sure pass names are unique. Ok for trunk? >>> >>> + >>> +void >>> +pass_dump_function_header (FILE *dump_file, tree fdecl, struct function >>> *fun) >>> >>> This function needs documentation, the ChangeLog entry misses >>> the tree-pretty-print.h change. >>> >>> +struct function; >>> >>> instead of this please include coretypes.h from tree-pretty-print.h >>> and add the struct function forward declaration there if it isn't already >>> present. >>> >>> You change the output of the header, so please make sure you >>> have bootstrapped and tested with _all_ languages included >>> (and also watch for bugreports for target specific bugs). >>> >> >> Ok. >> >>> + fprintf (dump_file, "\n;; Function %s (%s, funcdef_no=%d, uid=%d)", >>> + dname, aname, fun->funcdef_no, node->uid); >>> >>> I see no point in dumping funcdef_no - it wasn't dumped before in >>> any place. Instead I miss dumping of the DECL_UID and thus >>> a more specific 'uid', like 'cgraph-uid'. >> >> Ok will add decl_uid. Funcdef_no is very useful for debugging FDO >> coverage mismatch related problems as it is the id used in profile >> hashing. >> >>> >>> + aname = (IDENTIFIER_POINTER >>> + (DECL_ASSEMBLER_NAME (fdecl))); >>> >>> using DECL_ASSEMBLER_NAME is bad - it might trigger computation >>> of DECL_ASSEMBLER_NAME which certainly shouldn't be done >>> only for dumping purposes. Instead do sth like >>> >>> if (DECL_ASSEMBLER_NAME_SET_P (fdecl)) >>> aname = DECL_ASSEMBLER_NAME (fdecl); >>> else >>> aname = ''; >> >> Ok. >> >>> >>> and please also watch for cgraph_get_node returning NULL. >>> >>> Also please call the function dump_function_header instead of >>> pass_dump_function_header. >>> >> >> Ok. >> >> Thanks, >> >> David >>> Please re-post with appropriate changes. >>> >>> Thanks, >>> Richard. >>> Thanks, David On Fri, May 27, 2011 at 2:58 AM, Richard Guenther wrote: > On Fri, May 27, 2011 at 12:02 AM, Xinliang David Li > wrote: >> The latest version that only exports 2 functions from passes.c. > > Ok with ... > > @@ -637,4 +637,8 @@ extern bool first_pass_instance; > /* Declare for plugins. */ > extern void do_per_function_toporder (void (*) (void *), void *); > > +extern void disable_pass (const char *); > +extern void enable_pass (const char *); > +struct function; > + > > struct function forward decl removed. > > + explicitly_enabled = is_pass_explicitly_enabled (pass, func); > + explicitly_disabled = is_pass_explicitly_disabled (pass, func); > > both functions inlined here and removed. > > +#define MAX_PASS_ID 512 > > this removed and instead a VEC_safe_grow_cleared () or VEC_length () > before the accesses. > > +-fenable-ipa-@var{pass} @gol > +-fenable-rtl-@var{pass} @gol > +-fenable-rtl-@var{pass}=@var{range-list} @gol > +-fenable-tree-@var{pass} @gol > +-fenable-tree-@var{pass}=@var{range-list} @gol > > -fenable-@var{kind}-@var{pass}, etc. > > +@item -fdisable-@var{ipa|tree|rtl}-@var{pass} > +@itemx -fenable-@var{ipa|tree|rtl}-@var{pass} > +@itemx -fdisable-@var{tree|rtl}-@var{pass}=@var{range-list} > +@itemx -fenable-@var{tree|rtl}-@var{pass}=@var{range-list} > > likewise. > > Thanks, > Richard. > >> David >> >> On Thu, May 26, 2011 at 2:04 PM, Xinliang David Li >> wrote: >>> On Thu, May 26, 2011 at 2:04 AM, Richard Guenther >>> wrote: On Wed, May 25, 2011 at 6:53 PM, Joseph S. Myers wrote: > On Wed, 25 May 2011, Xinliang David Li wrote: > >> Ping. The link to the message: >> >> http://gcc.gnu.org/ml/gcc-patches/2011-05/msg01303.html > > I don't consider this an option handling patch. Patches adding whole > new > features involving new options should be reviewed by maintainers for > the > part of the compiler relevant to those features (since there isn't a > pass >>
[google]Skip target-libiberty for arm*-*-linux-androideabi (issue4564050)
Building gcc-4.6 arm android toolchain fails because of conflicting getpagesize() definition between libiberty and bionic. Based on discussions on another thread (http://www.mail-archive.com/gcc-patches@gcc.gnu.org/msg06627.html), reviewer recommended ripping out all support for building libiberty for the target side as it is not needed. However, I don't know if someone is working on that. Before the ideal patch is out, we have to clear the blocking issue and make the toolchain built. The following patch simply skips building target-libiberty for arm*-*-linux-androideabi target. I have tested the patch by building arm-linux-androideabi toolchain. I sent the similar patch to trunk for review. The patch to trunk is slightly different because this part of configure.ac has been modified. The logic is the same though. http://gcc.gnu.org/ml/gcc-patches/2011-05/msg02457.html The review is on going. I am not sure how long it would be. I would suggest we first commit this tiny patch in google/main and make our toolchain built. Then do further update if the trunk version is final. Thanks, Jing 2011-05-31 Jing Yu * configure.ac: Skip target-libiberty for arm*-*-linux-androideabi * configure: Regenerate Index: configure.ac === --- configure.ac(revision 174299) +++ configure.ac(working copy) @@ -682,6 +682,9 @@ noconfigdirs="$noconfigdirs target-libffi target-qthreads" libgloss_dir=arm ;; + arm*-*-linux-androideabi) +noconfigdirs="$noconfigdirs target-libiberty" +;; arm*-*-linux-gnueabi) noconfigdirs="$noconfigdirs target-qthreads" case ${with_newlib} in Index: configure === --- configure (revision 174299) +++ configure (working copy) @@ -3236,6 +3236,9 @@ noconfigdirs="$noconfigdirs target-libffi target-qthreads" libgloss_dir=arm ;; + arm*-*-linux-androideabi) +noconfigdirs="$noconfigdirs target-libiberty" +;; arm*-*-linux-gnueabi) noconfigdirs="$noconfigdirs target-qthreads" case ${with_newlib} in -- This patch is available for review at http://codereview.appspot.com/4564050
Re: New options to disable/enable any pass for any functions (issue4550056)
Honza, are you ok with the pass name change? David On Tue, May 31, 2011 at 2:07 AM, Richard Guenther wrote: > On Mon, May 30, 2011 at 11:44 PM, Xinliang David Li > wrote: >> This is the complete patch for pass name fixes (with test case changes). > > This is ok if Honza thinks the profile pass names make more sense this > way. > > Thanks, > Richard. > >> David >> >> >> On Mon, May 30, 2011 at 1:16 PM, Xinliang David Li >> wrote: >>> The attached are two simple follow up patches >>> >>> 1) the first patch does some refactorization on function header >>> dumping (with more information printed) >>> >>> 2) the second patch cleans up some pass names. Part of the cleanup >>> results from a previous discussion with Honza -- a) rename >>> 'tree_profile_ipa' into 'profile', and make 'ipa_profile' and >>> 'profile' into 'profile_estimate'. The rest of cleanups are needed to >>> make sure pass names are unique. >>> >>> Ok for trunk? >>> >>> Thanks, >>> >>> David >>> >>> On Fri, May 27, 2011 at 2:58 AM, Richard Guenther >>> wrote: On Fri, May 27, 2011 at 12:02 AM, Xinliang David Li wrote: > The latest version that only exports 2 functions from passes.c. Ok with ... @@ -637,4 +637,8 @@ extern bool first_pass_instance; /* Declare for plugins. */ extern void do_per_function_toporder (void (*) (void *), void *); +extern void disable_pass (const char *); +extern void enable_pass (const char *); +struct function; + struct function forward decl removed. + explicitly_enabled = is_pass_explicitly_enabled (pass, func); + explicitly_disabled = is_pass_explicitly_disabled (pass, func); both functions inlined here and removed. +#define MAX_PASS_ID 512 this removed and instead a VEC_safe_grow_cleared () or VEC_length () before the accesses. +-fenable-ipa-@var{pass} @gol +-fenable-rtl-@var{pass} @gol +-fenable-rtl-@var{pass}=@var{range-list} @gol +-fenable-tree-@var{pass} @gol +-fenable-tree-@var{pass}=@var{range-list} @gol -fenable-@var{kind}-@var{pass}, etc. +@item -fdisable-@var{ipa|tree|rtl}-@var{pass} +@itemx -fenable-@var{ipa|tree|rtl}-@var{pass} +@itemx -fdisable-@var{tree|rtl}-@var{pass}=@var{range-list} +@itemx -fenable-@var{tree|rtl}-@var{pass}=@var{range-list} likewise. Thanks, Richard. > David > > On Thu, May 26, 2011 at 2:04 PM, Xinliang David Li > wrote: >> On Thu, May 26, 2011 at 2:04 AM, Richard Guenther >> wrote: >>> On Wed, May 25, 2011 at 6:53 PM, Joseph S. Myers >>> wrote: On Wed, 25 May 2011, Xinliang David Li wrote: > Ping. The link to the message: > > http://gcc.gnu.org/ml/gcc-patches/2011-05/msg01303.html I don't consider this an option handling patch. Patches adding whole new features involving new options should be reviewed by maintainers for the part of the compiler relevant to those features (since there isn't a pass manager maintainer, I guess that means middle-end). >>> >>> Hmm, I suppose then you reviewed the option handling parts and they >>> are ok? Those globbing options always cause headache to me. >>> >>> +-fenable-ipa-@var{pass} @gol >>> +-fenable-rtl-@var{pass} @gol >>> +-fenable-rtl-@var{pass}=@var{range-list} @gol >>> +-fenable-tree-@var{pass} @gol >>> >>> so, no -fenable-tree-@var{pass}=@var{range-list}? >>> >> >> Missed. Will add. >> >> >>> Does the pass name match 1:1 with the dump file name? In which >>> case >> >> Yes. >> >>> >>> +Disable ipa pass @var{pass}. @var{pass} is the pass name. If the same >>> pass is statically invoked in the compiler multiple times, the pass >>> name should be appended with a sequential number starting from 1. >>> >>> isn't true as passes that are invoked only a single time lack the number >>> suffix (yes, I'd really like that to be changed ...) >> >> Yes, pass with single static instance does not need number suffix. >> >>> >>> Please break likes also in .texi files and stick to 80 columns. >> >> Done. >> >>> Please >>> document that these options are debugging options and regular >>> options for enabling/disabling passes should be used. I would suggest >>> to group documentation differently and document -fenable-* and >>> -fdisable-*, thus, >>> >>> + -fdisable-@var{kind}-@var{pass} >>> + -fenable-@var{kind}-@var{pass} >>> >>> Even in .texi files please two spaces after a full-stop. >> >> Done >> >>> >>> +extern void enable_disable_pass (const char *, bool); >>> >>> I'd rather have both enable_pass and disable_pass ;) >> >> Ok. >> >>> >>> +struct
Re: New options to disable/enable any pass for any functions (issue4550056)
The new patch is attached. The test (c,c++,fortran, java, ada) is on going. Thanks, David On Tue, May 31, 2011 at 9:06 AM, Xinliang David Li wrote: > On Tue, May 31, 2011 at 2:05 AM, Richard Guenther > wrote: >> On Mon, May 30, 2011 at 10:16 PM, Xinliang David Li >> wrote: >>> The attached are two simple follow up patches >>> >>> 1) the first patch does some refactorization on function header >>> dumping (with more information printed) >>> >>> 2) the second patch cleans up some pass names. Part of the cleanup >>> results from a previous discussion with Honza -- a) rename >>> 'tree_profile_ipa' into 'profile', and make 'ipa_profile' and >>> 'profile' into 'profile_estimate'. The rest of cleanups are needed to >>> make sure pass names are unique. >>> >>> Ok for trunk? >> >> + >> +void >> +pass_dump_function_header (FILE *dump_file, tree fdecl, struct function >> *fun) >> >> This function needs documentation, the ChangeLog entry misses >> the tree-pretty-print.h change. >> >> +struct function; >> >> instead of this please include coretypes.h from tree-pretty-print.h >> and add the struct function forward declaration there if it isn't already >> present. >> >> You change the output of the header, so please make sure you >> have bootstrapped and tested with _all_ languages included >> (and also watch for bugreports for target specific bugs). >> > > Ok. > >> + fprintf (dump_file, "\n;; Function %s (%s, funcdef_no=%d, uid=%d)", >> + dname, aname, fun->funcdef_no, node->uid); >> >> I see no point in dumping funcdef_no - it wasn't dumped before in >> any place. Instead I miss dumping of the DECL_UID and thus >> a more specific 'uid', like 'cgraph-uid'. > > Ok will add decl_uid. Funcdef_no is very useful for debugging FDO > coverage mismatch related problems as it is the id used in profile > hashing. > >> >> + aname = (IDENTIFIER_POINTER >> + (DECL_ASSEMBLER_NAME (fdecl))); >> >> using DECL_ASSEMBLER_NAME is bad - it might trigger computation >> of DECL_ASSEMBLER_NAME which certainly shouldn't be done >> only for dumping purposes. Instead do sth like >> >> if (DECL_ASSEMBLER_NAME_SET_P (fdecl)) >> aname = DECL_ASSEMBLER_NAME (fdecl); >> else >> aname = ''; > > Ok. > >> >> and please also watch for cgraph_get_node returning NULL. >> >> Also please call the function dump_function_header instead of >> pass_dump_function_header. >> > > Ok. > > Thanks, > > David >> Please re-post with appropriate changes. >> >> Thanks, >> Richard. >> >>> Thanks, >>> >>> David >>> >>> On Fri, May 27, 2011 at 2:58 AM, Richard Guenther >>> wrote: On Fri, May 27, 2011 at 12:02 AM, Xinliang David Li wrote: > The latest version that only exports 2 functions from passes.c. Ok with ... @@ -637,4 +637,8 @@ extern bool first_pass_instance; /* Declare for plugins. */ extern void do_per_function_toporder (void (*) (void *), void *); +extern void disable_pass (const char *); +extern void enable_pass (const char *); +struct function; + struct function forward decl removed. + explicitly_enabled = is_pass_explicitly_enabled (pass, func); + explicitly_disabled = is_pass_explicitly_disabled (pass, func); both functions inlined here and removed. +#define MAX_PASS_ID 512 this removed and instead a VEC_safe_grow_cleared () or VEC_length () before the accesses. +-fenable-ipa-@var{pass} @gol +-fenable-rtl-@var{pass} @gol +-fenable-rtl-@var{pass}=@var{range-list} @gol +-fenable-tree-@var{pass} @gol +-fenable-tree-@var{pass}=@var{range-list} @gol -fenable-@var{kind}-@var{pass}, etc. +@item -fdisable-@var{ipa|tree|rtl}-@var{pass} +@itemx -fenable-@var{ipa|tree|rtl}-@var{pass} +@itemx -fdisable-@var{tree|rtl}-@var{pass}=@var{range-list} +@itemx -fenable-@var{tree|rtl}-@var{pass}=@var{range-list} likewise. Thanks, Richard. > David > > On Thu, May 26, 2011 at 2:04 PM, Xinliang David Li > wrote: >> On Thu, May 26, 2011 at 2:04 AM, Richard Guenther >> wrote: >>> On Wed, May 25, 2011 at 6:53 PM, Joseph S. Myers >>> wrote: On Wed, 25 May 2011, Xinliang David Li wrote: > Ping. The link to the message: > > http://gcc.gnu.org/ml/gcc-patches/2011-05/msg01303.html I don't consider this an option handling patch. Patches adding whole new features involving new options should be reviewed by maintainers for the part of the compiler relevant to those features (since there isn't a pass manager maintainer, I guess that means middle-end). >>> >>> Hmm, I suppose then you reviewed the option handling parts and they >>> are ok? Those globbing options always cause headache to me. >>> >>> +-fenable-ipa-@var{pass} @gol >>> +-fenable-rtl-@var{p
-fdump-passes -fenable-xxx=func_name_list
The following patch implements the a new option that dumps gcc PASS configuration. The sample output is attached. There is one limitation: some placeholder passes that are named with '*xxx' are note registered thus they are not listed. They are not important as they can not be turned on/off anyway. The patch also enhanced -fenable-xxx and -fdisable-xx to allow a list of function assembler names to be specified. Ok for trunk? Thanks, David out Description: Binary data 2011-05-31 David Li * passes.c (passr_eq): (register_pass_name): Change pass name table name. (get_pass_by_name): Ditto. (enable_disable_pass): Handle function assembler name. (is_pass_explicitly_enabled_or_disabled): Ditto. (execute_one_pass): Add pass dumping. (execute_pass_list): Track pass identation. (execute_ipa_pass_list): Ditto. (dump_one_pass): New function. (pass_traverse): Ditto. (enter_pass_list): Ditto. (exit_pass_list): Ditto. (is_valid_assembler_name): Ditto. 2011-05-31 David Li * testsuite/gcc.dg/inline_2.c (revision 0): * testsuite/gcc.dg/inline_6.c (revision 0): * testsuite/gcc.dg/unroll_2.c (revision 0): * testsuite/gcc.dg/inline_3.c (revision 0): * testsuite/gcc.dg/unroll_3.c (revision 0): * testsuite/gcc.dg/inline_4.c (revision 0): * testsuite/gcc.dg/unroll_4.c (revision 0): * testsuite/gcc.dg/inline_1.c (revision 0): * testsuite/gcc.dg/inline_5.c (revision 0): * testsuite/gcc.dg/unroll_1.c (revision 0): Index: doc/invoke.texi === --- doc/invoke.texi (revision 174424) +++ doc/invoke.texi (working copy) @@ -291,6 +291,7 @@ Objective-C and Objective-C++ Dialects}. -fdump-translation-unit@r{[}-@var{n}@r{]} @gol -fdump-class-hierarchy@r{[}-@var{n}@r{]} @gol -fdump-ipa-all -fdump-ipa-cgraph -fdump-ipa-inline @gol +-fdump-passes @gol -fdump-statistics @gol -fdump-tree-all @gol -fdump-tree-original@r{[}-@var{n}@r{]} @gol @@ -5056,11 +5057,12 @@ appended with a sequential number starti Disable rtl pass @var{pass}. @var{pass} is the pass name. If the same pass is statically invoked in the compiler multiple times, the pass name should be appended with a sequential number starting from 1. @var{range-list} is a comma -seperated list of function ranges. Each range is a number pair seperated by a colon. -The range is inclusive in both ends. If the range is trivial, the number pair can be -simplified a a single number. If the function's cgraph node's @var{uid} is falling -within one of the specified ranges, the @var{pass} is disabled for that function. -The @var{uid} is shown in the function header of a dump file. +seperated list of function ranges or assembler names. Each range is a number +pair seperated by a colon. The range is inclusive in both ends. If the range +is trivial, the number pair can be simplified as a single number. If the +function's cgraph node's @var{uid} is falling within one of the specified ranges, +the @var{pass} is disabled for that function. The @var{uid} is shown in the +function header of a dump file. @item -fdisable-tree-@var{pass} @item -fdisable-tree-@var{pass}=@var{range-list} @@ -5090,7 +5092,8 @@ of option arguments. -fenable-tree-cunroll=1 # disable gcse2 for functions at the following ranges [1,1], # [300,400], and [400,1000] - -fdisable-rtl-gcse2=1:100,300,400:1000 +# disable gcse2 for functions foo and foo2 + -fdisable-rtl-gcse2=foo,foo2 # disable early inlining -fdisable-tree-einline # disable ipa inlining @@ -5483,6 +5486,11 @@ Dump after function inlining. @end table +@item -fdump-passes +@optindex fdump-passes +Dump the list of optimization passes that are turned on and off by +the current command line options. + @item -fdump-statistics-@var{option} @opindex fdump-statistics Enable and control dumping of pass statistics in a separate file. The Index: testsuite/gcc.dg/inline_2.c === --- testsuite/gcc.dg/inline_2.c (revision 0) +++ testsuite/gcc.dg/inline_2.c (revision 0) @@ -0,0 +1,23 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fdump-tree-optimized -fdisable-tree-einline=0:3 -fdisable-ipa-inline" } */ +int g; +__attribute__((always_inline)) void bar (void) +{ + g++; +} + +int foo (void) +{ + bar (); + return g; +} + +int foo2 (void) +{ + bar(); + return g + 1; +} + +/* { dg-final { scan-tree-dump-times "bar" 5 "optimized" } } */ +/* { dg-final { cleanup-tree-dump "optimized" } } */ +/* { dg-excess-errors "extra notes" } */ Index: testsuite/gcc.dg/inline_6.c === --- testsuite/gcc.dg/inline_6.c (revision 0) +++ testsuite/gcc.dg/inline_6.c (revision 0) @@ -0,0 +1,23 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fdump-tree-optimized -fdisable-tree-einline=foo2 -fdisable-ipa-inline" } */ +int g; +__attribute__((always_inline)) void bar (void) +{ + g++; +} + +int foo (void) +{ + bar (); + return g; +} + +in
Re: [PATCH][RFC] Init sizetypes based on target defs
> Which means that Ada must be seriously broken on m32c (well, I guess > nobody tried it there ;)). I usually only build C and C++.
Re: [PATCH][RFC] Init sizetypes based on target defs
On Tue, May 31, 2011 at 8:24 PM, DJ Delorie wrote: > >> hmm, yes. Again practically for most targets size_t will be >> following its SIZE_TYPE advice, but surely not for all. OTOH while >> the above clearly doesn't look "accidential", it certainly looks >> wrong. If not for sizetype then at least for size_type_node. The >> comment hints that the patch at most will no longer "get better >> code", but if Pmode gets better code when used for sizetype(!) then >> we should do so unconditionally and could get rid of the size_t >> reverse-engineering in initialize_sizetypes completely (m32c might >> disagree here). > > On m32c, Pmode is a 24-bit type, and the chip just doesn't have enough > math opcodes to to 24-bit pointer math with any degree of efficiency. > So, you either do 32-bit math (performance is horrible, since it's all > emulated) or 16-bit math on just the offset (sizeof size_t < Pmode). Which means that Ada must be seriously broken on m32c (well, I guess nobody tried it there ;)). Richard.
Re: RFA: another patch to solve PR49154
Hans-Peter Nilsson writes: > On Tue, 31 May 2011, Richard Sandiford wrote: >> Hans-Peter Nilsson writes: >> > Index: tm.texi.in >> > === >> > --- tm.texi.in (revision 174376) >> > +++ tm.texi.in (working copy) >> > @@ -2327,6 +2327,11 @@ constraints is through machine-dependent >> > You can define such letters to correspond to various classes, then use >> > them in operand constraints. >> > >> > +You must define the narrowest register class for a register so that >> > +class either has no subclasses, or that for some mode, the move cost >> > +between registers within the class are cheaper than moving a register >> > +in the class to or from memory (@pxref{Costs}). >> > + >> >> I fear this isn't true for some MIPS classes. > > I fear the assert will strike then, when there are allocatable > registers in such a class. :) > > You don't happen to have target and options to cc1? Gah, seems like I'd forgotten the "no subclasses" bit by the time I started looking at code. Sorry for the false alarm. Richard
Re: C6X port 5/11: Track predication conditions more accurately
On 05/31/2011 10:43 PM, Steve Ellcey wrote: > On Tue, 2011-05-31 at 23:59 +0400, Andrey Belevantsev wrote: >> On 31.05.2011 22:24, Steve Ellcey wrote: >>> Bernd, >>> >>> This patch (r174336) is causing me many testsuite failures on IA64. >>> Tests like gcc.c-torture/compile/20010408-1.c are dying with a >>> seg fault in vinsn_detach. >> I will look at it tomorrow. Bernd, Steve, please let us know about any >> issues with sel-sched code so we can help. >> >> Andrey > > Has anything been decided about how to fix PR 48673? I am still seeing > gfortran.dg/sms-1.f and sms-2.f fail on IA64. Something like this is one of the approaches I suggested in the PR. Untested. Bernd Index: gcc/config/ia64/ia64.c === --- gcc/config/ia64/ia64.c (revision 174430) +++ gcc/config/ia64/ia64.c (working copy) @@ -9404,9 +9404,15 @@ ia64_reorg (void) if (optimize && ia64_flag_schedule_insns2 && dbg_cnt (ia64_sched2)) { + basic_block bb; timevar_push (TV_SCHED2); ia64_final_schedule = 1; + /* We can't let modulo-sched prevent us from scheduling any bbs, +since we need the final schedule to produce bundle information. */ + FOR_EACH_BB (bb) + bb->flags &= ~BB_DISABLE_SCHEDULE; + initiate_bundle_states (); ia64_nop = make_insn_raw (gen_nop ()); PREV_INSN (ia64_nop) = NEXT_INSN (ia64_nop) = NULL_RTX;
Re: RFA: another patch to solve PR49154
On Tue, 31 May 2011, Richard Sandiford wrote: > Hans-Peter Nilsson writes: > > Index: tm.texi.in > > === > > --- tm.texi.in (revision 174376) > > +++ tm.texi.in (working copy) > > @@ -2327,6 +2327,11 @@ constraints is through machine-dependent > > You can define such letters to correspond to various classes, then use > > them in operand constraints. > > > > +You must define the narrowest register class for a register so that > > +class either has no subclasses, or that for some mode, the move cost > > +between registers within the class are cheaper than moving a register > > +in the class to or from memory (@pxref{Costs}). > > + > > I fear this isn't true for some MIPS classes. I fear the assert will strike then, when there are allocatable registers in such a class. :) You don't happen to have target and options to cc1? brgds, H-P
Re: C6X port 5/11: Track predication conditions more accurately
On Tue, 2011-05-31 at 23:59 +0400, Andrey Belevantsev wrote: > On 31.05.2011 22:24, Steve Ellcey wrote: > > Bernd, > > > > This patch (r174336) is causing me many testsuite failures on IA64. > > Tests like gcc.c-torture/compile/20010408-1.c are dying with a > > seg fault in vinsn_detach. > I will look at it tomorrow. Bernd, Steve, please let us know about any > issues with sel-sched code so we can help. > > Andrey Has anything been decided about how to fix PR 48673? I am still seeing gfortran.dg/sms-1.f and sms-2.f fail on IA64. Steve Ellcey s...@cup.hp.com
Re: Ping^2: PR target/45074: Check targets of multi-word operations
On 05/31/2011 07:51 PM, Richard Sandiford wrote: > Ping for: > > http://gcc.gnu.org/ml/gcc-patches/2011-04/msg01327.html > > It fixes the expansion of multiword operations in cases where the > suggested target is a hard register and where CANNOT_CHANGE_MODE_CLASS > forbids word-mode subparts. Can you call the new function valid_multiword_target_p? In a sense, we already know it's a multiword target, so the function name is a bit unfortunate. I see two copies of this code /* If TARGET is the same as one of the operands, the REG_EQUAL note won't be accurate, so use a new target. */ - if (target == 0 || target == op0 || target == op1) in expand_binop, and you seem to be changing only one? Also, there's xtarget = gen_reg_rtx (mode); if (target == 0 || !REG_P (target)) target = xtarget; /* Indicate for flow that the entire target reg is being set. */ if (REG_P (target)) emit_clobber (xtarget); Ok with these changes (or if there's a good reason not to touch the ones you left out). Bernd
Re: [Patch, Fortran] -fcoarray=single implementation of the atomic subroutines
Daniel Kraft wrote: Build and regtested on x86-64-linux. OK for the trunk? Ok. Just one thought: + if (!(atom->ts.type == BT_INTEGER&& atom->ts.kind == gfc_c_int_kind) +&& !(atom->ts.type == BT_LOGICAL&& atom->ts.kind == gfc_c_int_kind)) What about defining another constant for atomic_int_kind rather than gfc_c_int_kind, in case we want to change this in the future? I did so now - as discussed in #gfortran; cf. trans-types.c. Comitted as Rev. 174510; cf. attachment. Tobais Index: gcc/testsuite/gfortran.dg/coarray_atomic_1.f90 === --- gcc/testsuite/gfortran.dg/coarray_atomic_1.f90 (Revision 0) +++ gcc/testsuite/gfortran.dg/coarray_atomic_1.f90 (Revision 0) @@ -0,0 +1,21 @@ +! { dg-do compile } +! { dg-options "-fcoarray=single -std=f2008" } +! +! PR fortran/18918 +! +! Diagnostic for atomic subroutines +! +use iso_fortran_env, only: atomic_int_kind, atomic_logical_kind +implicit none +integer(atomic_int_kind) :: a(1)[*] +logical(1) :: c[*] +integer(atomic_int_kind) :: b +logical(atomic_logical_kind) :: d, e[*] + +call atomic_define(a, 7_2) ! { dg-error "must be a scalar" } +call atomic_ref(b, b) ! { dg-error "shall be a coarray" } + +call atomic_define(c, 7) ! { dg-error "an integer of ATOMIC_INT_KIND or a logical of ATOMIC_LOGICAL_KIND" } +call atomic_ref(d, a(1)) ! { dg-error "shall have the same type" } +call atomic_ref(.true., e) ! { dg-error "shall be definable" } +end Index: gcc/testsuite/gfortran.dg/coarray/atomic_1.f90 === --- gcc/testsuite/gfortran.dg/coarray/atomic_1.f90 (Revision 0) +++ gcc/testsuite/gfortran.dg/coarray/atomic_1.f90 (Revision 0) @@ -0,0 +1,27 @@ +! { dg-do run } +! +! PR fortran/18918 +! +! Basic atomic def/ref test +! + +use iso_fortran_env, only: atomic_int_kind, atomic_logical_kind +implicit none +integer(atomic_int_kind) :: a(1)[*] +logical(atomic_logical_kind) :: c[*] +intrinsic :: atomic_define +intrinsic :: atomic_ref +integer(8) :: b +logical(1) :: d + +call atomic_define(a(1), 7_2) +call atomic_ref(b, a(1)) +if (b /= a(1)) call abort() + +call atomic_define(c, .false.) +call atomic_ref(d, c[this_image()]) +if (d .neqv. .false.) call abort() +call atomic_define(c[this_image()], .true.) +call atomic_ref(d, c) +if (d .neqv. .true.) call abort() +end Index: gcc/testsuite/ChangeLog === --- gcc/testsuite/ChangeLog (Revision 174509) +++ gcc/testsuite/ChangeLog (Arbeitskopie) @@ -1,3 +1,9 @@ +2011-05-31 Tobias Burnus + + PR fortran/18918 + * gfortran.dg/coarray_atomic_1.f90: New. + * gfortran.dg/coarray/atomic_1.f90: New. + 2011-05-31 Jakub Jelinek * gcc.dg/guality/bswaptest.c: New test. Index: gcc/fortran/intrinsic.c === --- gcc/fortran/intrinsic.c (Revision 174509) +++ gcc/fortran/intrinsic.c (Arbeitskopie) @@ -51,7 +51,7 @@ enum klass { CLASS_IMPURE = 0, CLASS_PURE, CLASS_ELEMENTAL, - CLASS_INQUIRY, CLASS_TRANSFORMATIONAL }; + CLASS_INQUIRY, CLASS_TRANSFORMATIONAL, CLASS_ATOMIC }; #define ACTUAL_NO 0 #define ACTUAL_YES 1 @@ -2880,6 +2880,18 @@ make_noreturn(); + add_sym_2s ("atomic_define", GFC_ISYM_ATOMIC_DEF, CLASS_ATOMIC, + BT_UNKNOWN, 0, GFC_STD_F2008, + gfc_check_atomic_def, NULL, gfc_resolve_atomic_def, + "atom", BT_INTEGER, di, REQUIRED, INTENT_OUT, + "value", BT_INTEGER, di, REQUIRED, INTENT_IN); + + add_sym_2s ("atomic_ref", GFC_ISYM_ATOMIC_REF, CLASS_ATOMIC, + BT_UNKNOWN, 0, GFC_STD_F2008, + gfc_check_atomic_ref, NULL, gfc_resolve_atomic_ref, + "value", BT_INTEGER, di, REQUIRED, INTENT_OUT, + "atom", BT_INTEGER, di, REQUIRED, INTENT_IN); + add_sym_1s ("cpu_time", GFC_ISYM_CPU_TIME, CLASS_IMPURE, BT_UNKNOWN, 0, GFC_STD_F95, gfc_check_cpu_time, NULL, gfc_resolve_cpu_time, tm, BT_REAL, dr, REQUIRED, INTENT_OUT); Index: gcc/fortran/iso-fortran-env.def === --- gcc/fortran/iso-fortran-env.def (Revision 174509) +++ gcc/fortran/iso-fortran-env.def (Arbeitskopie) @@ -38,9 +38,9 @@ -- the standard that supports this type */ NAMED_INTCST (ISOFORTRANENV_FILE_ATOMIC_INT_KIND, "atomic_int_kind", \ - gfc_default_integer_kind, GFC_STD_F2008) + gfc_atomic_int_kind, GFC_STD_F2008) NAMED_INTCST (ISOFORTRANENV_FILE_ATOMIC_LOGICAL_KIND, "atomic_logical_kind", \ - gfc_default_logical_kind, GFC_STD_F2008) + gfc_atomic_logical_kind, GFC_STD_F2008) NAMED_INTCST (ISOFORTRANENV_CHARACTER_STORAGE_SIZE, "character_storage_size", \ gfc_character_storage_size, GFC_STD_F2003) NAMED_INTCST (ISOFORTRANENV_ERROR_UNIT, "error_unit", GFC_STDERR_UNIT_NUMBER, \ Index: gcc/fortran/intrinsic.h === --- gcc/fortran/intrinsic.h (Revision 174509
Re: C6X port 5/11: Track predication conditions more accurately
On 31.05.2011 22:24, Steve Ellcey wrote: Bernd, This patch (r174336) is causing me many testsuite failures on IA64. Tests like gcc.c-torture/compile/20010408-1.c are dying with a seg fault in vinsn_detach. I will look at it tomorrow. Bernd, Steve, please let us know about any issues with sel-sched code so we can help. Andrey #0 0x55b8760:0 in vinsn_detach (vi=0xf) #1 0x55bda30:0 in clear_expr (expr=0x7fffeef0) #2 0x562ea50:0 in schedule_expr_on_boundary (bnd=0x4040cf14, expr_vliw=0x4040c4ac, seqno=-1) #3 0x562fab0:0 in fill_insns (fence=0x4040bdec, seqno=-1, scheduled_insns_tailpp=0x7fffeff0) #4 0x563d3b0:0 in schedule_on_fences (fences=0x4040bde8, max_seqno=1, scheduled_insns_tailpp=0x7fffeff0) #5 0x563eb70:0 in sel_sched_region_2 (orig_max_seqno=22) #6 0x563f1b0:0 in sel_sched_region_1 () #7 0x56403f0:0 in sel_sched_region (rgn=9) #8 0x5640980:0 in run_selective_scheduling () #9 0x613d0f0:0 in ia64_reorg () I am still trying to get more information but I was wondering if you have already seen this problem or if anyone else has reported it? Steve Ellcey s...@cup.hp.com
Re: approved but not committed? - [PATCH, ARM] Testcases incorrectly run in Thumb/Xscale
Since this patch has been properly approved, if there is no objection in 24 hours, I will commit this patch to trunk. Thanks, Jing On Fri, May 27, 2011 at 3:55 PM, Jing Yu wrote: > Hi Sofiane, > > I find your following patch has been approved by Richard in Oct last > year, but it is not trunk. > Is there any problem with it? > http://gcc.gnu.org/ml/gcc-patches/2010-10/msg00266.html > > If you don't mind, I can help to commit the patch. > > Thanks, > Jing >
Re: [PATCH, rs6000] Tidy up dumping of register/memory move cost
On Wed, May 25, 2011 at 3:02 PM, Pat Haugen wrote: > The following fixes a problem when dumping register costs, where the > incorrect 'from' value was being written out because the code modified the > incoming parameter value. It also changes things so that register/memory > costs are only dumped on the outermost call, eliminating intermediate output > when a cost calculation requires going through memory or GPRs. > > Bootstrap/regtest on powerpc64-linux with no new regressions. Ok for trunk? > > -Pat > > > 2011-05-25 Pat Haugen > > * config/rs6000/rs6000.c (rs6000_register_move_cost): Preserve from > parameter value for dump. Dump cost on outermost call only. > (rs6000_memory_move_cost): Dump cost on outermost call only. Okay. Thanks, David
Re: [PATCH, rs6000] Fix REG_CLASS_CONTENTS
On Tue, May 31, 2011 at 12:08 PM, Pat Haugen wrote: > The following patch fixes an issue I noticed where vr0..vr2 were > inadvertently included in NON_FLOAT_REGS. > > Bootstrap/regtest on powerpc64-linux with no new regressions. Ok for trunk? > > -Pat > > > 2011-05-31 Pat Haugen > > * config/rs6000/rs6000.h (REG_CLASS_CONTENTS): Remove vr0..vr2 from > NON_FLOAT_REGS. Okay. Thanks, David
Re: [trans-mem] Add documentation
On 05/31/2011 11:55 AM, Torvald Riegel wrote: > Last majority opinion on this that I'm aware of said that only primitive > types can be used for exceptions thrown out of transactions. We should > be able to handle this with the EH ABI wrappers only. > > Okay to drop support of dropReferences()? Fine by me. >> The intention was always to drop the registration functions entirely, and >> create a new ELF Phdr describing the linker-sorted table. Much like what >> currently happens for PT_GNU_EH_FRAME. >> >> This work kept getting bogged down in how to represent the N different code >> generation variants. We clearly needed at least 2 -- sw and hw transactional >> clones -- but there was always a suggestion of more variants for different >> tm assumptions/invariants. > > Shall I just add your text to the document as an explanation? Sure. >> I thought the TM Language group, and Hans Bohem in particular, has been >> working on this. It post-dates the language spec we had available when >> the branch code was written. > > The only memory-model-related work that I'm aware of is what's contained > in the C++-level specification (embedding txns into the C++1x mem > model), but not on the ABI level. I'll try to get more information about > this. Ah, I equated these to the same thing. r~
Re: [trans-mem] Add documentation
On Tue, 2011-05-31 at 11:11 -0700, Richard Henderson wrote: > > +The intention behind @code{_ITM_dropReferences} is not entirely clear. The > > +specification suggests that this function is necessary because of certain > > +orderings of data transfer undos and the releasing of memory regions (i.e., > > +privatization). However, this ordering is never defined, nor is the > > ordering of > > +dropping references w.r.t. other events. > > I'm pretty sure dropReferences was supposed to be related to cancel+throw, > where we needed to keep modifications to the object thrown from the > transaction, while rolling back all other changes. That's what I thought as well. Alternatively, judging from the paragraphs position close to user undo/commit actions, it might also have been added to support some form of open nesting. > At the time I quit working on this, the TM Language group hadn't made up > it's mind what cancel+throw really meant. Frankly, it seems a bit > impossible to me. Last majority opinion on this that I'm aware of said that only primitive types can be used for exceptions thrown out of transactions. We should be able to handle this with the EH ABI wrappers only. Okay to drop support of dropReferences()? > > +Registered tables must be writable by the TM runtime, and must be live > > +throughout the life-time of the TM runtime. > > + > > +@strong{TODO} We might want to drop the ``writable'' requirement, and make > > a > > +copy instead. > > The intention was always to drop the registration functions entirely, and > create a new ELF Phdr describing the linker-sorted table. Much like what > currently happens for PT_GNU_EH_FRAME. > > This work kept getting bogged down in how to represent the N different code > generation variants. We clearly needed at least 2 -- sw and hw transactional > clones -- but there was always a suggestion of more variants for different > tm assumptions/invariants. Shall I just add your text to the document as an explanation? > > +The ABI should define a memory model and the ordering that is guaranteed > > for > > +data transfers and commit/undo actions, or at least refer to another memory > > +model that needs to be preserved. Without that, the compiler cannot ensure > > the > > +memory model specified on the level of the programming language (e.g., by > > the > > +C++ TM specification). > > I thought the TM Language group, and Hans Bohem in particular, has been > working on this. It post-dates the language spec we had available when > the branch code was written. The only memory-model-related work that I'm aware of is what's contained in the C++-level specification (embedding txns into the C++1x mem model), but not on the ABI level. I'll try to get more information about this. Torvald
Re: [Patch, Fortran] Coarray - fix assumed-shape cobounds
Tobias Burnus wrote: Simple patch, which requires the previous patch at http://gcc.gnu.org/ml/fortran/2011-05/msg00231.html OK for the trunk? Approved by Daniel Kraft on IRC, committed as Rev. 174504. Thanks for the reviews! Tobias
Re: [Patch, Fortran] Support static coarrays with "automatic" cobounds
Tobias Burnus wrote: gfortran currently rejects: subroutine foo(n) integer, SAVE :: foo(500)[n, *] claiming that as SAVE does not work with automatic arrays. The patch for this has been committed (Rev. 174503) after approval by Daniel Kraft on IRC (#gfortran). Thomas Koenig wrote: To me, that sounds a lot like a mistake in the standard. Maybe ask on c.l.f? It is not completely clear to me what you mean by mistake. Coarrays are kind of special in several respects. Except of deferred-shape coarrays (i.e. allocatable coarrays), the codimension is always assumed site, independently whether the dimension makes the coarray a scalar, explict-size, assumed-size, or assumed-shape array: integer :: A[*] integer :: B(:,:,:)[4,5:*] integer :: C(*)[1:*] integer :: D(5)[1:*] but integer, allocatable :: E(:)[:] This works as the codimension (coshape) does not influence the storage but just the index calculation between coindex and image index, when accessing remote images. Thus, there is no reason why one should pose restrictions on the coshape for static arrays. (By definition, all coarrays are either static ("SAVE") or allocatable.) I think in some cases, it can be really useful. As the size of the codimension is only known at run time (it's num_images()), it can make sense to define the coshape only at run time - even if the shape itself is a compile-time constant. Another peculiarity is that for a coarray "integer :: A(10,10)[*]" the actual argument remains a coarray in: call proc(A), call proc(A(1,1)), call proc(A(:,1)), call proc(2:3,4) etc. (I think the actual argument needs to be simply contiguous to make this work by preventing copy-in/copy-out.) Note: call proc(A[5]) is coindexed, but not a coarray. Tobias
Re: Skip building target libiberty for arm*-*-linux-androideabi
Based on discussion on another thread (http://www.mail-archive.com/gcc-patches@gcc.gnu.org/msg06627.html), what Joseph recommended was ripping out all support for building libiberty for the target side as it is not needed. Thus I doubt skipping target-libiberty for all targets is acceptable. I don't have the bandwidth to work on the ideal patch. Thus I am wondering if we can skip target-libiberty for androideabi target before the ideal patch is out. Thanks, Jing On Sun, May 29, 2011 at 9:23 PM, Ye Joey wrote: > On Sat, May 28, 2011 at 5:42 AM, Jing Yu wrote: >> >> Building gcc-4.6 arm android toolchain fails because of an >> incompatible function definition between libiberty and bionic. >> >> Thanking Joseph, I have learned that "there's no such thing as a >> target libiberty" and we should rip all the target-libiberty rules >> out. I don't know if someone is working on it. Before that patch comes >> out, can we add arm*-*-linux-androideabi to the list of targets where >> target-libiberty is skipped? >> > How about skip libiberty for all targets then? > > - Joey
Re: C6X port 5/11: Track predication conditions more accurately
Bernd, This patch (r174336) is causing me many testsuite failures on IA64. Tests like gcc.c-torture/compile/20010408-1.c are dying with a seg fault in vinsn_detach. #0 0x55b8760:0 in vinsn_detach (vi=0xf) #1 0x55bda30:0 in clear_expr (expr=0x7fffeef0) #2 0x562ea50:0 in schedule_expr_on_boundary (bnd=0x4040cf14, expr_vliw=0x4040c4ac, seqno=-1) #3 0x562fab0:0 in fill_insns (fence=0x4040bdec, seqno=-1, scheduled_insns_tailpp=0x7fffeff0) #4 0x563d3b0:0 in schedule_on_fences (fences=0x4040bde8, max_seqno=1, scheduled_insns_tailpp=0x7fffeff0) #5 0x563eb70:0 in sel_sched_region_2 (orig_max_seqno=22) #6 0x563f1b0:0 in sel_sched_region_1 () #7 0x56403f0:0 in sel_sched_region (rgn=9) #8 0x5640980:0 in run_selective_scheduling () #9 0x613d0f0:0 in ia64_reorg () I am still trying to get more information but I was wondering if you have already seen this problem or if anyone else has reported it? Steve Ellcey s...@cup.hp.com
Re: [PATCH][RFC] Init sizetypes based on target defs
> hmm, yes. Again practically for most targets size_t will be > following its SIZE_TYPE advice, but surely not for all. OTOH while > the above clearly doesn't look "accidential", it certainly looks > wrong. If not for sizetype then at least for size_type_node. The > comment hints that the patch at most will no longer "get better > code", but if Pmode gets better code when used for sizetype(!) then > we should do so unconditionally and could get rid of the size_t > reverse-engineering in initialize_sizetypes completely (m32c might > disagree here). On m32c, Pmode is a 24-bit type, and the chip just doesn't have enough math opcodes to to 24-bit pointer math with any degree of efficiency. So, you either do 32-bit math (performance is horrible, since it's all emulated) or 16-bit math on just the offset (sizeof size_t < Pmode).
Re: [build] Move MD_UNWIND_SUPPORT to toplevel libgcc
On May 30, 2011, at 8:43 AM, Rainer Orth wrote: > * The three users of MD_UNWIND_SUPPORT are modified to unconditionally > include a new md-unwind-support.h header which is created from the > info in config.host: if md_unwind_header exists, it is included in > md-unwind-support.h, otherwise the generated header is empty. > diff --git a/gcc/config/rs6000/darwin.h b/gcc/config/rs6000/darwin.h > --- a/gcc/config/rs6000/darwin.h > +++ b/gcc/config/rs6000/darwin.h > @@ -381,10 +381,6 @@ extern int darwin_emit_branch_islands; > #include > #endif > > -#if !defined(__LP64__) && !defined(DARWIN_LIBSYSTEM_HAS_UNWIND) > -#define MD_UNWIND_SUPPORT "config/rs6000/darwin-unwind.h" > -#endif > - So, I'm wondering, can we just roll this check into the header, so instead of: #if A #include file #endif file: bla we have: #include file file: #if A bla #endif The advantages, any wrapping code is handled the exact same way. Once this is done, then the transformation to port is identical to every other port. Also, this general rule would apply to the other corner cases as well, if I read them right. ? Oh, once this is done, I think: /* libSystem contains unwind information for signal frames. */ #define DARWIN_LIBSYSTEM_HAS_UNWIND is only used by libgcc. Does it have to move at the same time? If so, then it needs moving. If it doesn't have to move, you can leave it behind if you want, though my preference would be to move it. I think the darwin bits are Ok with this change.
Re: [RFC PATCH, go]: Port to ALPHA arch
Uros Bizjak writes: > On Sat, Apr 2, 2011 at 1:09 AM, Ian Lance Taylor wrote: >> On Wed, Mar 30, 2011 at 12:58 PM, Uros Bizjak wrote: >>> >>> Attached ports go to ALPHA architecture. >> >> Thanks! >> >> Committed. >> >> >>> b) alpha doesn't define "struct user_regs_struct" from which "type >>> PtraceRegs" is derived. I have manually created PtraceRegs from >>> pt_regs structure and patched generated libgo/sysinfo.go in build >>> directory after the build broke. However - the comment from sys/user.h >>> says that this file is for GDB and GDB only... >> >> libgo uses it to support ptrace, which in effect is the same as what >> gdb does. If mksysinfo.sh is unable to provide any definition for the >> structure, then the choices are either to patch up mksysinfo.sh so >> that it works, or to simply define the structure in >> libgo/syscalls/syscall_linux_alpha.go. it is unlikely to change so >> the latter seems acceptable if patching mksysinfo is too hard. > > Attached patch implements this suggestion and builds libgo without problems. Committed. Thanks. Ian
Re: [trans-mem] Add documentation
On 05/26/2011 12:19 PM, Torvald Riegel wrote: > +@subsection State manipulation functions > +There is no @code{getTransaction} function. Transaction identifiers for > +nested transactions will be ordered but not necessarily sequential (i.e., for > +a nested transaction's identifier @code{IN} and its enclosing transaction's > +identifier @code{IE}, it is guaranteed that @code{IN - IE < 1}). Err.. surely IN >= IE. Also, I think you want to use @var not @code here, and maybe @math for the relation expression. > +The intention behind @code{_ITM_dropReferences} is not entirely clear. The > +specification suggests that this function is necessary because of certain > +orderings of data transfer undos and the releasing of memory regions (i.e., > +privatization). However, this ordering is never defined, nor is the ordering > of > +dropping references w.r.t. other events. I'm pretty sure dropReferences was supposed to be related to cancel+throw, where we needed to keep modifications to the object thrown from the transaction, while rolling back all other changes. At the time I quit working on this, the TM Language group hadn't made up it's mind what cancel+throw really meant. Frankly, it seems a bit impossible to me. > +Registered tables must be writable by the TM runtime, and must be live > +throughout the life-time of the TM runtime. > + > +@strong{TODO} We might want to drop the ``writable'' requirement, and make a > +copy instead. The intention was always to drop the registration functions entirely, and create a new ELF Phdr describing the linker-sorted table. Much like what currently happens for PT_GNU_EH_FRAME. This work kept getting bogged down in how to represent the N different code generation variants. We clearly needed at least 2 -- sw and hw transactional clones -- but there was always a suggestion of more variants for different tm assumptions/invariants. > +The ABI should define a memory model and the ordering that is guaranteed for > +data transfers and commit/undo actions, or at least refer to another memory > +model that needs to be preserved. Without that, the compiler cannot ensure > the > +memory model specified on the level of the programming language (e.g., by the > +C++ TM specification). I thought the TM Language group, and Hans Bohem in particular, has been working on this. It post-dates the language spec we had available when the branch code was written. Most of the actual texinfo markup looks fine to my untrained eye. r~
Re: [RFC PATCH, go]: Port to ALPHA arch - sysinfo.go fixup
Uros Bizjak writes: > (BTW: Original calculation of Ctime_ns has a cut'n'paste error, > stat.Ctime.Nsec should be used instead of stat.Atime.Nsec). Thanks. Fixed like so. Bootstrapped and ran Go testsuite on x86_64-unknown-linux-gnu. Committed to mainline. Ian diff -r 23e3bdca9ee3 libgo/go/os/stat.go --- a/libgo/go/os/stat.go Tue May 31 11:05:02 2011 -0700 +++ b/libgo/go/os/stat.go Tue May 31 11:07:24 2011 -0700 @@ -25,7 +25,7 @@ fi.Blocks = int64(stat.Blocks) fi.Atime_ns = int64(stat.Atime.Sec)*1e9 + int64(stat.Atime.Nsec) fi.Mtime_ns = int64(stat.Mtime.Sec)*1e9 + int64(stat.Mtime.Nsec) - fi.Ctime_ns = int64(stat.Ctime.Sec)*1e9 + int64(stat.Atime.Nsec) + fi.Ctime_ns = int64(stat.Ctime.Sec)*1e9 + int64(stat.Ctime.Nsec) for i := len(name)-1; i >= 0; i-- { if name[i] == '/' { name = name[i+1:]
Re: [RFC PATCH, go]: Port to ALPHA arch - sysinfo.go fixup
Uros Bizjak writes: > This still doesn't fix the build for alpha due to extra struct. From > sysinfo.go: > > type Timespec struct { Sec Timespec_sec_t; Nsec Timespec_nsec_t; } > type Stat_t struct { Dev uint64; Ino uint64; Rdev uint64; Size int64; > Blocks uint64; Mode uint32; Uid uint32; Gid uint32; Blksize uint32; > Nlink uint32; __pad0 int32; Go0 struct { Atime Timespec; }; Go1 struct > { Mtime Timespec; }; Go2 struct { Ctime Timespec; }; __unused > [2+1]int64; } I finally got back to this. I think this patch should fix this problem. Bootstrapped and ran Go testsuite on x86_64-unknown-linux-gnu. Committed to mainline. Ian diff -r 0785cb0479d8 -r 23e3bdca9ee3 libgo/mksysinfo.sh --- a/libgo/mksysinfo.sh Fri May 27 15:31:49 2011 -0700 +++ b/libgo/mksysinfo.sh Tue May 31 11:05:02 2011 -0700 @@ -358,6 +358,7 @@ -e 's/\([^a-zA-Z0-9_]\)_timespec_t\([^a-zA-Z0-9_]\)/\1Timespec\2/g' \ -e 's/\([^a-zA-Z0-9_]\)_timespec\([^a-zA-Z0-9_]\)/\1Timespec\2/g' \ -e 's/\([^a-zA-Z0-9_]\)_timestruc_t\([^a-zA-Z0-9_]\)/\1Timestruc\2/g' \ + -e 's/Godump_[0-9] struct { \([^;]*;\) };/\1/g' \ >> ${OUT} # The directory searching types.
C++ PATCH for c++/44870 (wrong overload resolution error in template)
lvalue_kind has tried to give an approximate answer for value category in templates; in the past, it was OK to say that an arbitrary expression was an lvalue, as the only effect would be that errors we could have given at template definition time would be delayed until instantiation, which is still conforming. But now that we have rvalue references that can't bind to lvalues, it has become important to get the right answer. So this patch makes us look through NON_DEPENDENT_EXPR at the actual underlying tree structure. We need to add a couple more cases for lvalue expressions that only appear in templates, and handle overloaded functions/operators that return class type; I will not be surprised if there are other cases I didn't think of, but we are still in stage 1... :) Tested x86_64-pc-linux-gnu, applying to trunk. commit 319a7842876443fedd284888a5481f4aa58b7afe Author: Jason Merrill Date: Sat May 28 22:41:42 2011 -0400 PR c++/44870 * tree.c (lvalue_kind): Recurse on NON_DEPENDENT_EXPR. Handle ARROW_EXPR, TYPEID_EXPR, and arbitrary class-valued expressions. (build_min_non_dep): Preserve reference refs. (build_min_non_dep_call_vec): Likewise diff --git a/gcc/cp/tree.c b/gcc/cp/tree.c index c93110b..11e195e 100644 --- a/gcc/cp/tree.c +++ b/gcc/cp/tree.c @@ -139,6 +139,7 @@ lvalue_kind (const_tree ref) && DECL_IN_AGGR_P (ref)) return clk_none; case INDIRECT_REF: +case ARROW_EXPR: case ARRAY_REF: case PARM_DECL: case RESULT_DECL: @@ -170,6 +171,7 @@ lvalue_kind (const_tree ref) break; case MODIFY_EXPR: +case TYPEID_EXPR: return clk_ordinary; case COMPOUND_EXPR: @@ -182,7 +184,9 @@ lvalue_kind (const_tree ref) return (CLASS_TYPE_P (TREE_TYPE (ref)) ? clk_class : clk_none); case CALL_EXPR: - /* Any class-valued call would be wrapped in a TARGET_EXPR. */ + /* We can see calls outside of TARGET_EXPR in templates. */ + if (CLASS_TYPE_P (TREE_TYPE (ref))) + return clk_class; return clk_none; case FUNCTION_DECL: @@ -199,14 +203,16 @@ lvalue_kind (const_tree ref) return lvalue_kind (BASELINK_FUNCTIONS (CONST_CAST_TREE (ref))); case NON_DEPENDENT_EXPR: - /* We must consider NON_DEPENDENT_EXPRs to be lvalues so that - things like "&E" where "E" is an expression with a - non-dependent type work. It is safe to be lenient because an - error will be issued when the template is instantiated if "E" - is not an lvalue. */ - return clk_ordinary; + /* We used to just return clk_ordinary for NON_DEPENDENT_EXPR because + it was safe enough for C++98, but in C++0x lvalues don't bind to + rvalue references, so we get bogus errors (c++/44870). */ + return lvalue_kind (TREE_OPERAND (ref, 0)); default: + if (!TREE_TYPE (ref)) + return clk_none; + if (CLASS_TYPE_P (TREE_TYPE (ref))) + return clk_class; break; } @@ -1985,6 +1991,9 @@ build_min_non_dep (enum tree_code code, tree non_dep, ...) va_start (p, non_dep); + if (REFERENCE_REF_P (non_dep)) +non_dep = TREE_OPERAND (non_dep, 0); + t = make_node (code); length = TREE_CODE_LENGTH (code); TREE_TYPE (t) = TREE_TYPE (non_dep); @@ -2002,7 +2011,7 @@ build_min_non_dep (enum tree_code code, tree non_dep, ...) COMPOUND_EXPR_OVERLOADED (t) = 1; va_end (p); - return t; + return convert_from_reference (t); } /* Similar to `build_nt_call_vec', but for template definitions of @@ -2013,9 +2022,11 @@ tree build_min_non_dep_call_vec (tree non_dep, tree fn, VEC(tree,gc) *argvec) { tree t = build_nt_call_vec (fn, argvec); + if (REFERENCE_REF_P (non_dep)) +non_dep = TREE_OPERAND (non_dep, 0); TREE_TYPE (t) = TREE_TYPE (non_dep); TREE_SIDE_EFFECTS (t) = TREE_SIDE_EFFECTS (non_dep); - return t; + return convert_from_reference (t); } tree diff --git a/gcc/testsuite/g++.dg/cpp0x/rv-template1.C b/gcc/testsuite/g++.dg/cpp0x/rv-template1.C new file mode 100644 index 000..11f53bd --- /dev/null +++ b/gcc/testsuite/g++.dg/cpp0x/rv-template1.C @@ -0,0 +1,11 @@ +// PR c++/44870 +// { dg-options -std=c++0x } + +void foo(int&& data); + +template +void bar(T t) +{ foo(int()); } + +void baz() +{ bar(0); }
Re: [C++ Patch] fix PR c++/48010
OK. Jason
Ping^2: PR target/45074: Check targets of multi-word operations
Ping for: http://gcc.gnu.org/ml/gcc-patches/2011-04/msg01327.html It fixes the expansion of multiword operations in cases where the suggested target is a hard register and where CANNOT_CHANGE_MODE_CLASS forbids word-mode subparts. Richard
Re: RFA PR middle-end/48770
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 On 05/30/11 04:55, Bernd Schmidt wrote: > On 05/27/2011 07:49 PM, Jeff Law wrote: >> Updated based on some comments from Bernd; specifically the other use of >> delete_dead_insn has been removed. >> >> WRT the assembly differences on MIPS Bernd referred to; what ultimately >> caused this problem were two dead insns that had been previously >> eliminated by reload were still in the insn stream and inhibited an >> if-conversion which resulted in slightly different assembly code, but >> shouldn't have had any significant impact on the performance or size of >> the resulting code. The dead insns were deleted by the post-reload DCE >> pass (which obviously runs after post-reload if conversion). >> >> If we really wanted to get those insns out of the stream, we could flag >> when reload deleted insns which might result in dead code remaining in >> the stream, then conditionally run DCE immediately after reload. I >> didn't think this was worth doing right now, but if someone objects I >> can certainly look into it. > > Patch is OK (still no codegen changes on x86 as far as I can tell), but > I'd appreciate the DCE thing as a followup. I suspect it's going to be just as easy to go ahead and add the DCE cleanup now rather than waiting. It should be a relatively trivial tweak to delete_dead_insn to set a flag which we'd check at the end of reload. It's #2 in the queue :-) jeff -BEGIN PGP SIGNATURE- Version: GnuPG v1.4.11 (GNU/Linux) Comment: Using GnuPG with Fedora - http://enigmail.mozdev.org/ iQEcBAEBAgAGBQJN5SmlAAoJEBRtltQi2kC7RfwH/RV9OsQIwMeKOHUOgOkpvQZx DIjRjlGTld+ndDN7sjgt0bIVLBITdR2K1CcVyPMDCZ2qZB0yiuDbXDQOZJkdrq89 55BUKrK7h5KLI5Ad8z5cGsyeK5nY3TUXmuPsdl0Pk/Lbu5m+EfxxxuQw0EljBhY9 UOasXcQ8VAXut0k4banTNFrY07TCM3gHMN2qRKTGiabtq/gh17ez/DejTVt3+2Ru 1Qi9XUVpL1mLwWsrqhI4JIe/zhpJS4eJizvP4Dywp+0pdplSPJVV9tybFWiojdfP zt/4IQD0YHosTdxIEPyLWZBuszyG0TDmsQYyt29cXYzQeZSge+T1CF9TfEcX1wY= =tSMU -END PGP SIGNATURE-
Re: [trans-mem] Add documentation
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 On 05/26/11 13:19, Torvald Riegel wrote: > Patch adds libitm.texi with some documentation, including differences > between our currently targeted ABI and Intel's current ABI > specification. > > OK for branch? I'd really like to see Richard chime in on this stuff as he's a lot more familiar with the TM work than I. However, if you don't hear from him within a week, let me know and I'll do what I can. Thanks, jeff -BEGIN PGP SIGNATURE- Version: GnuPG v1.4.11 (GNU/Linux) Comment: Using GnuPG with Fedora - http://enigmail.mozdev.org/ iQEcBAEBAgAGBQJN5SiRAAoJEBRtltQi2kC7tHsIAI/JEAyLRC4Rgbt3EPzqgs15 boH6b53ziQx7VbzusgS+0nelVfpQWa0U7aAl4jekUqOmLXiuTVVyBgkazMNVfqvn YW/m3DiIhEYwo5F0opQXgsS7orQyUWa7IgO6szu093fSixKwfXbNHtnMReSpKRbS MMkQhkq4VzDR8HZzbGivowWSlfg+N1HIigMGZ6aQQ59WLtW46MMJV09No0dvZFm0 63B7oH3xqgcv7K8cE6FUERoZUXJwbNUMq+dktKOW5+APDOxJb767QoAuA/Xu7cjk V0Tfe3FQbPHGGTxFuPAoe9bJZ24mvQORKXBIrWOpB8oUOTd/IRoZdZDo39Qvt8c= =BQWy -END PGP SIGNATURE-
Re: Patch ping #2
On 05/31/2011 02:19 AM, Jakub Jelinek wrote: > Hi! > > - http://gcc.gnu.org/ml/gcc-patches/2011-05/msg01182.html > > various debug info improvements (typed DWARF stack etc.) > Ok. Although it might be good to break up mem_loc_descriptor. You've got some rather big implementations of operations now. > - http://gcc.gnu.org/ml/gcc-patches/2011-05/msg01246.html > > optimize away unneeded DW_OP_GNU_convert ops with typed DWARF stack > Ok. > - http://gcc.gnu.org/ml/gcc-patches/2011-04/msg01669.html > > PR48688 optimization, I know Richard asked for trying it during > > combine, but that attempt failed due to opposite optimization > Ok. r~
Re: introduce --param max-vartrack-expr-depth
On Tue, May 31, 2011 at 01:13:31PM -0300, Alexandre Oliva wrote: > On May 30, 2011, Bernd Schmidt wrote: > > > On 05/30/2011 12:35 PM, Alexandre Oliva wrote: > >> One of my patches for PR 48866 regressed guality/asm-1.c on > >> x86_64-linux-gnu because what used to be a single complex debug value > >> expression became a chain of debug temps holding simpler expressions, > >> and this chain exceeded the default recursion depth in resolving > >> location expressions. > > > What's the worst that can happen if you remove the limit altogether? > > Exponential behavior comes to mind. I will try a bootstrap with a very > high value, but for pathological cases we'd probably still need the > param anyway, so I'll check it in. Thanks for the reviews. Yeah, before introduction of cselib_dummy_expand_value_rtx_cb it used to be far more urgent, because with high depth values we'd create tons of garbage even when it wouldn't lead to anything reasonable, but still with very high depth values we could end up eating too much memory and compiler time. That said, perhaps default of say 20 instead of 10 wouldn't be bad... I'm not sure you want to bump the depth in reversible_op, because there it blindly creates new rtx and folds them (on the other side, it doesn't have a callback there, so it really doesn't matter how long chain of DEBUG_EXPR_DECLs is). I think 5 should be enough to get through LO_SUM etc., perhaps it could be bumped a tiny bit but not too much. Jakub
Re: [build] Move ENABLE_EXECUTE_STACK to toplevel libgcc
On May 30, 2011, at 8:59 AM, Rainer Orth wrote: > This is my hopefully last patch for toplevel libgcc moves: it moves > ENABLE_EXECUTE_STACK to $target-lib.h headers in libgcc/config. > Ok for mainline after a week if no problems occur in testing on the > other targets? Ok for the darwin bits.
Re: [build] Move MD_UNWIND_SUPPORT to toplevel libgcc
Paolo Bonzini writes: > On 05/30/2011 05:43 PM, Rainer Orth wrote: >> +md-unwind-support.h: config.status >> +if test -n "$(md_unwind_header)"; then \ >> + echo "#include \"config/$(md_unwind_header)\""> $@; \ >> +else \ >> + :> $@; \ >> +fi > > Can you add a default file md-unwind-none.h and use > > AC_CONFIG_LINKS([md-unwind-support.h:$md_unwind_header]) > > instead of this (and instead of AC_SUBST'ing the variable)? Sure, will do. >> -libgcc-eh-objects += $(addsuffix $(objext),$(basename $(notdir >> $(LIB2ADDEHSTATIC >> -libgcc-s-objects += $(addsuffix _s$(objext),$(basename $(notdir >> $(LIB2ADDEHSHARED >> +libgcc-eh-static-objects := $(addsuffix $(objext),$(basename $(notdir >> $(LIB2ADDEHSTATIC >> +libgcc-eh-shared-objects := $(addsuffix _s$(objext),$(basename $(notdir >> $(LIB2ADDEHSHARED >> + >> +$(libgcc-eh-static-objects) $(libgcc-eh-shared-objects): md-unwind-support.h >> + >> +libgcc-eh-objects += $(libgcc-eh-static-objects) >> +libgcc-s-objects += $(libgcc-eh-shared-objects) > > These changes to the dependencies should not be necessary, libgcc does > automatic dependency tracking. That's what I thought, but my bootstraps failed since there wasn't any dependency that triggered the creation of the header. That's when I introduced this explicit dependency. Rainer -- - Rainer Orth, Center for Biotechnology, Bielefeld University
Re: [build] Move MD_UNWIND_SUPPORT to toplevel libgcc
Kai Tietz writes: >>> mingw part is not ok, as it breaks 32-bit defaulted multilib version >>> compiler. >> >> Can you explain what is going on here? Could it be fixed by wrapping >> w32-unwind.h in a #ifdef __x86_64__? > > To wrap it into __x86_64__ won't help. The issue is that in > combination of 32-bit and 64-bit we need to default here to SjLj, as > 64-bit doesn't support dw2 unwinding stuff and uses here internally > instead SEH. So if target is 32-bit default, but a multilib version is > used, we can't use w32-unwind.h. Wouldn't it work to use DWARF-2 EH in the 32-bit multilib, but SjLj/SEH for 64-bit? If so, wrapping i386/w32-unwind.h in #ifdef __x86_64__ would have exactly this effect, irrespective of multilib use. > Well, wrapping header with __x64_64__ might helper partial. . But this > might be worth a try. Nevertheless I assume that then at least > produced DLL names for libgcc could get confused for their extensions. > Rainer: It would be helpful, if you could try this. Unfortunately, I cannot: I don't have any access to windows systems, so I'm relying on the target maintainers or other interested parties to test. Thanks. Rainer -- - Rainer Orth, Center for Biotechnology, Bielefeld University
Re: [Patch, Fortran] -fcoarray=single implementation of the atomic subroutines
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 Hi, On 05/31/11 18:24, Tobias Burnus wrote: > This patch adds the atomic_define and atomic_ref intrinsics. They are > currently implemented in form a simple assignment. For -fcoarray=lib > they will be replaced by a function call to libcaf_{single,mpi,*}. > > I was shortly thinking of using something more clever, but I concluded > that a simple assignment should be sufficiently atomic (for > -fcoarray=single). > > Build and regtested on x86-64-linux. > OK for the trunk? Ok. Just one thought: + if (!(atom->ts.type == BT_INTEGER && atom->ts.kind == gfc_c_int_kind) + && !(atom->ts.type == BT_LOGICAL && atom->ts.kind == gfc_c_int_kind)) What about defining another constant for atomic_int_kind rather than gfc_c_int_kind, in case we want to change this in the future? Or is it clear that we indeed *always* want atomic_int_kind to be c_int_kind? Yours, Daniel -BEGIN PGP SIGNATURE- Version: GnuPG v1.4.2.1 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iQIVAwUBTeUZOlJ+ebqjtTmYAQLJyw//YT4NNnyxw5xv0e5ytBe5ec2NSInQt1w8 iTrKxHVhtTI7mJGVPae0wYLhLMm95pZ2I6Mp0ExFDT2Mnugw9QdGG1z2U4Hdbns3 qFdMhc2ub5ZtqlpLevojqJOM+Cz8b7nUJtczc8qY9tNC8r1ybQBOFfJEfPeBytVA BzysrOSx7rkuSQScRrr8S5e1NY1e4Mf2nLDvLlgdWpzSgBmm9At3CvZPASnskIxB DNMcCwWKuoRHTLOLMZEgrBp65vr1tNf7IHe0Z0Kmi7UgEvsrs7NwDvvRczBoLa2v cueVRU4ZMU+qNhNqM2QL1pYED0QEMrcHzshq2xt7r3p1XCKNhQeJlNg/Cpucs8a3 e+E9uAtNDJLVwCE/v71Vy8zXt5kr+fNDr/N2vEM0iJaqfzeUz5PI7uxy5wvoI5Oq ktuHxM9b7Ev8DtFJym5opLzYBTLA36F+H8zFtMfBwGnmuaNRGPqUUU7j/XBIAb3E DjRXq7FbRwIObPz06SQ4F3peG8abegs8+guqNw4V4ZKeWDU4fZhRsutLCnSi/jj1 94n1hWp1sf4gwWk3aK/90JXuQxAnnOz4VIskymuFwr8ZlnnX5+S62t8sYLRRL/BC FyJofN/uDyZ/0cbrVggAJtXHK29EGli/lPX7PYPiGpDkEvAB0auWjNI+EvPZS/gc Nr7b+g16vUI= =fYF2 -END PGP SIGNATURE-
Re: [build] Move MD_UNWIND_SUPPORT to toplevel libgcc
Paolo Bonzini writes: > Rainer, the same solution that is found for Windows should be used for > darwin, too. I'm uncertain if anything is needed for Darwin, though: gcc/config/rs6000/darwin.h has #if !defined(__LP64__) && !defined(DARWIN_LIBSYSTEM_HAS_UNWIND) #define MD_UNWIND_SUPPORT "config/rs6000/darwin-unwind.h" #endif gcc/config/darwin9.h defines DARWIN_LIBSYSTEM_HAS_UNWIND, so this only applies to Darwin 8 and earlier. That's why I have this in libgcc/config.host: powerpc-*-darwin*) case ${host} in *-*-darwin9* | *-*-darwin[12][0-9]*) ;; *) md_unwind_header=rs6000/darwin-unwind.h ;; esac I've no idea if Darwin 8 had 64-bit support, so rs6000/darwin-unwind.h would still have to be disabled. Perhaps the Darwin maintainers could chime in? Thanks. Rainer -- - Rainer Orth, Center for Biotechnology, Bielefeld University
[Patch, Fortran] -fcoarray=single implementation of the atomic subroutines
This patch adds the atomic_define and atomic_ref intrinsics. They are currently implemented in form a simple assignment. For -fcoarray=lib they will be replaced by a function call to libcaf_{single,mpi,*}. I was shortly thinking of using something more clever, but I concluded that a simple assignment should be sufficiently atomic (for -fcoarray=single). Build and regtested on x86-64-linux. OK for the trunk? Tobias b/gcc/fortran/check.c | 65 ++ b/gcc/fortran/gfortran.h |2 b/gcc/fortran/intrinsic.c | 16 +++ b/gcc/fortran/intrinsic.h |4 b/gcc/fortran/intrinsic.texi | 96 + b/gcc/fortran/iresolve.c | 16 +++ b/gcc/fortran/iso-fortran-env.def |4 b/gcc/fortran/trans-intrinsic.c| 68 ++ b/gcc/fortran/trans.c | 17 ++- b/gcc/fortran/trans.h |5 - gcc/gcc/testsuite/gfortran.dg/coarray/atomic_1.f90 | 27 + gcc/gcc/testsuite/gfortran.dg/coarray_atomic_1.f90 | 21 12 files changed, 326 insertions(+), 15 deletions(-) 2011-05-31 Tobias Burnus PR fortran/18918 * intrinsic.c (klass): Add CLASS_ATOMIC. (add_subroutines): Add atomic_ref/atomic_define. * intrinsic.texi (ATOMIC_REF, ATOMIC_DEFINE): Document. * intrinsic.h (gfc_check_atomic_def, gfc_check_atomic_ref, gfc_resolve_atomic_def, gfc_resolve_atomic_ref): New prototypes. * gfortran.h (gfc_isym_id): Add GFC_ISYM_ATOMIC_DEF and GFC_ISYM_ATOMIC_REF. * iresolve.c (gfc_resolve_atomic_def, gfc_resolve_atomic_ref): New functions. * check.c (gfc_check_atomic, gfc_check_atomic_def, gfc_check_atomic_ref): New functions. * iso-fortran-env.def (ISOFORTRANENV_FILE_ATOMIC_INT_KIND, ISOFORTRANENV_FILE_ATOMIC_LOGICAL_KIND): Change kind value. * trans-intrinsic.c (conv_intrinsic_atomic_def, conv_intrinsic_atomic_ref, gfc_conv_intrinsic_subroutine): New functions. (conv_intrinsic_move_alloc) Renamed from gfc_conv_intrinsic_move_alloc - and made static. * trans.h (gfc_conv_intrinsic_move_alloc): Remove. (gfc_conv_intrinsic_subroutine) Add prototype. * trans.c (trans_code): Call gfc_conv_intrinsic_subroutine. 2011-05-31 Tobias Burnus PR fortran/18918 * gfortran.dg/coarray_atomic_1.f90: New. * gfortran.dg/coarray/atomic_1.f90: New. diff --git a/gcc/fortran/check.c b/gcc/fortran/check.c index 70c23e6..0e6b2d8 100644 --- a/gcc/fortran/check.c +++ b/gcc/fortran/check.c @@ -973,6 +973,71 @@ gfc_check_atan2 (gfc_expr *y, gfc_expr *x) } +static gfc_try +gfc_check_atomic (gfc_expr *atom, gfc_expr *value) +{ + if (!(atom->ts.type == BT_INTEGER && atom->ts.kind == gfc_c_int_kind) + && !(atom->ts.type == BT_LOGICAL && atom->ts.kind == gfc_c_int_kind)) +{ + gfc_error ("ATOM argument at %L to intrinsic function %s shall be an " + "integer of ATOMIC_INT_KIND or a logical of " + "ATOMIC_LOGICAL_KIND", &atom->where, gfc_current_intrinsic); + return FAILURE; +} + + if (!gfc_expr_attr (atom).codimension) +{ + gfc_error ("ATOM argument at %L of the %s intrinsic function shall be a " + "coarray or coindexed", &atom->where, gfc_current_intrinsic); + return FAILURE; +} + + if (atom->ts.type != value->ts.type) +{ + gfc_error ("ATOM and VALUE argument of the %s intrinsic function shall " + "have the same type at %L", gfc_current_intrinsic, + &value->where); + return FAILURE; +} + + return SUCCESS; +} + + +gfc_try +gfc_check_atomic_def (gfc_expr *atom, gfc_expr *value) +{ + if (scalar_check (atom, 0) == FAILURE || scalar_check (value, 1) == FAILURE) +return FAILURE; + + if (gfc_check_vardef_context (atom, false, NULL) == FAILURE) +{ + gfc_error ("ATOM argument of the %s intrinsic function at %L shall be " + "definable", gfc_current_intrinsic, &atom->where); + return FAILURE; +} + + return gfc_check_atomic (atom, value); +} + + +gfc_try +gfc_check_atomic_ref (gfc_expr *value, gfc_expr *atom) +{ + if (scalar_check (value, 0) == FAILURE || scalar_check (atom, 1) == FAILURE) +return FAILURE; + + if (gfc_check_vardef_context (value, false, NULL) == FAILURE) +{ + gfc_error ("VALUE argument of the %s intrinsic function at %L shall be " + "definable", gfc_current_intrinsic, &value->where); + return FAILURE; +} + + return gfc_check_atomic (atom, value); +} + + /* BESJN and BESYN functions. */ gfc_try diff --git a/gcc/fortran/gfortran.h b/gcc/fortran/gfortran.h index 72e412b..b2b2e84 100644 --- a/gcc/fortran/gfortran.h +++ b/gcc/fortran/gfortran.h @@ -306,6 +306,8 @@ enum gfc_isym_id GFC_ISYM_ATAN, GFC_ISYM_ATAN2, GFC_ISYM_ATANH, + GFC_ISYM_ATOMIC_DEF, + GFC_ISYM_ATOMIC_REF, GFC_ISYM_BGE, GFC_ISYM_BGT, GFC_ISYM_BIT_SIZE, diff --git a/gcc/fortran/intrinsic.c b/gcc/fortran/intrinsic.c index
Re: introduce --param max-vartrack-expr-depth
On May 30, 2011, Bernd Schmidt wrote: > On 05/30/2011 12:35 PM, Alexandre Oliva wrote: >> One of my patches for PR 48866 regressed guality/asm-1.c on >> x86_64-linux-gnu because what used to be a single complex debug value >> expression became a chain of debug temps holding simpler expressions, >> and this chain exceeded the default recursion depth in resolving >> location expressions. > What's the worst that can happen if you remove the limit altogether? Exponential behavior comes to mind. I will try a bootstrap with a very high value, but for pathological cases we'd probably still need the param anyway, so I'll check it in. Thanks for the reviews. -- Alexandre Oliva, freedom fighterhttp://FSFLA.org/~lxoliva/ You must be the change you wish to see in the world. -- Gandhi Be Free! -- http://FSFLA.org/ FSF Latin America board member Free Software Evangelist Red Hat Brazil Compiler Engineer
[PATCH] Fix PR debug/49130
Hello, In this PR an integer constant which type is a typedef based on an integer might be pretty-printed differently from the way the mangler would have represented it. As a result, the DW_AT_name representing the constant might be gratuitously be different from the linkage name. In the example accompanying the patch: The DW_AT_name of S is "S<2048u>", whereas the name of S.f (as demangled from the linkage name) is S<2048ul>::f(unsigned long), rightfully implying that the name of S should be "2048ul" instead of "2048u". The reason of this is that pp_c_integer_constant compares the type to pretty print with a set of integer constant types, using pointer comparison. When the type to pretty print is actually a typedef, that pointer comparison can fail even if the types are otherwise equivalent. Thus the patch considers the canonical type of the type to pretty-print, when it is present. Tested on x86_64-unknown-linux-gnu against trunk. gcc/c-family/ * c-pretty-print.c (pp_c_integer_constant): Consider the canonical type when using pointer comparison to compare types. gcc/testsuite/ * g++.dg/debug/dwarf2/integer-typedef.C: New test. --- gcc/c-family/c-pretty-print.c |7 - .../g++.dg/debug/dwarf2/integer-typedef.C | 28 2 files changed, 34 insertions(+), 1 deletions(-) create mode 100644 gcc/testsuite/g++.dg/debug/dwarf2/integer-typedef.C diff --git a/gcc/c-family/c-pretty-print.c b/gcc/c-family/c-pretty-print.c index e418903..518b373 100644 --- a/gcc/c-family/c-pretty-print.c +++ b/gcc/c-family/c-pretty-print.c @@ -901,7 +901,12 @@ pp_c_string_literal (c_pretty_printer *pp, tree s) static void pp_c_integer_constant (c_pretty_printer *pp, tree i) { - tree type = TREE_TYPE (i); + /* We are going to compare the type of I to other types using + pointer comparison so we need to use its canonical type. */ + tree type = +TYPE_CANONICAL (TREE_TYPE (i)) +? TYPE_CANONICAL (TREE_TYPE (i)) +: TREE_TYPE (i); if (TREE_INT_CST_HIGH (i) == 0) pp_wide_integer (pp, TREE_INT_CST_LOW (i)); diff --git a/gcc/testsuite/g++.dg/debug/dwarf2/integer-typedef.C b/gcc/testsuite/g++.dg/debug/dwarf2/integer-typedef.C new file mode 100644 index 000..42b3c99 --- /dev/null +++ b/gcc/testsuite/g++.dg/debug/dwarf2/integer-typedef.C @@ -0,0 +1,28 @@ +// Origin: PR debug/49130 +// { dg-options "-g -dA" } + +typedef long unsigned int size_t; +static const size_t foo = 2048; + +template +struct S +{ + void f(size_t); +}; + +template +inline void +S::f(size_t) +{ + size_t i = size; +} + +int +main() +{ + S s1; + s1.f(10); +} + +// { dg-final {scan-assembler-times "\[^\n\r\]*DW_AT_name: \"S<2048ul>\"" 1 } } +// { dg-final {scan-assembler-times "\[^\n\r\]*DW_AT_MIPS_linkage_name: \"_ZN1SILm2048EE1fEm\"" 1 } } -- Dodji
[PATCH, rs6000] Fix REG_CLASS_CONTENTS
The following patch fixes an issue I noticed where vr0..vr2 were inadvertently included in NON_FLOAT_REGS. Bootstrap/regtest on powerpc64-linux with no new regressions. Ok for trunk? -Pat 2011-05-31 Pat Haugen * config/rs6000/rs6000.h (REG_CLASS_CONTENTS): Remove vr0..vr2 from NON_FLOAT_REGS. Index: gcc/config/rs6000/rs6000.h === --- gcc/config/rs6000/rs6000.h (revision 174304) +++ gcc/config/rs6000/rs6000.h (working copy) @@ -1224,7 +1224,7 @@ enum reg_class { 0x, 0x, 0x000f, 0x00022000 }, /* SPEC_OR_GEN_REGS */ \ { 0x, 0x, 0x0010, 0x }, /* CR0_REGS */\ { 0x, 0x, 0x0ff0, 0x }, /* CR_REGS */ \ - { 0x, 0x, 0xefff, 0x0002 }, /* NON_FLOAT_REGS */ \ + { 0x, 0x, 0x0fff, 0x0002 }, /* NON_FLOAT_REGS */ \ { 0x, 0x, 0x1000, 0x }, /* CA_REGS */ \ { 0x, 0x, 0x, 0x0003 } /* ALL_REGS */\ }
Re: New options to disable/enable any pass for any functions (issue4550056)
On Tue, May 31, 2011 at 2:05 AM, Richard Guenther wrote: > On Mon, May 30, 2011 at 10:16 PM, Xinliang David Li > wrote: >> The attached are two simple follow up patches >> >> 1) the first patch does some refactorization on function header >> dumping (with more information printed) >> >> 2) the second patch cleans up some pass names. Part of the cleanup >> results from a previous discussion with Honza -- a) rename >> 'tree_profile_ipa' into 'profile', and make 'ipa_profile' and >> 'profile' into 'profile_estimate'. The rest of cleanups are needed to >> make sure pass names are unique. >> >> Ok for trunk? > > + > +void > +pass_dump_function_header (FILE *dump_file, tree fdecl, struct function *fun) > > This function needs documentation, the ChangeLog entry misses > the tree-pretty-print.h change. > > +struct function; > > instead of this please include coretypes.h from tree-pretty-print.h > and add the struct function forward declaration there if it isn't already > present. > > You change the output of the header, so please make sure you > have bootstrapped and tested with _all_ languages included > (and also watch for bugreports for target specific bugs). > Ok. > + fprintf (dump_file, "\n;; Function %s (%s, funcdef_no=%d, uid=%d)", > + dname, aname, fun->funcdef_no, node->uid); > > I see no point in dumping funcdef_no - it wasn't dumped before in > any place. Instead I miss dumping of the DECL_UID and thus > a more specific 'uid', like 'cgraph-uid'. Ok will add decl_uid. Funcdef_no is very useful for debugging FDO coverage mismatch related problems as it is the id used in profile hashing. > > + aname = (IDENTIFIER_POINTER > + (DECL_ASSEMBLER_NAME (fdecl))); > > using DECL_ASSEMBLER_NAME is bad - it might trigger computation > of DECL_ASSEMBLER_NAME which certainly shouldn't be done > only for dumping purposes. Instead do sth like > > if (DECL_ASSEMBLER_NAME_SET_P (fdecl)) > aname = DECL_ASSEMBLER_NAME (fdecl); > else > aname = ''; Ok. > > and please also watch for cgraph_get_node returning NULL. > > Also please call the function dump_function_header instead of > pass_dump_function_header. > Ok. Thanks, David > Please re-post with appropriate changes. > > Thanks, > Richard. > >> Thanks, >> >> David >> >> On Fri, May 27, 2011 at 2:58 AM, Richard Guenther >> wrote: >>> On Fri, May 27, 2011 at 12:02 AM, Xinliang David Li >>> wrote: The latest version that only exports 2 functions from passes.c. >>> >>> Ok with ... >>> >>> @@ -637,4 +637,8 @@ extern bool first_pass_instance; >>> /* Declare for plugins. */ >>> extern void do_per_function_toporder (void (*) (void *), void *); >>> >>> +extern void disable_pass (const char *); >>> +extern void enable_pass (const char *); >>> +struct function; >>> + >>> >>> struct function forward decl removed. >>> >>> + explicitly_enabled = is_pass_explicitly_enabled (pass, func); >>> + explicitly_disabled = is_pass_explicitly_disabled (pass, func); >>> >>> both functions inlined here and removed. >>> >>> +#define MAX_PASS_ID 512 >>> >>> this removed and instead a VEC_safe_grow_cleared () or VEC_length () >>> before the accesses. >>> >>> +-fenable-ipa-@var{pass} @gol >>> +-fenable-rtl-@var{pass} @gol >>> +-fenable-rtl-@var{pass}=@var{range-list} @gol >>> +-fenable-tree-@var{pass} @gol >>> +-fenable-tree-@var{pass}=@var{range-list} @gol >>> >>> -fenable-@var{kind}-@var{pass}, etc. >>> >>> +@item -fdisable-@var{ipa|tree|rtl}-@var{pass} >>> +@itemx -fenable-@var{ipa|tree|rtl}-@var{pass} >>> +@itemx -fdisable-@var{tree|rtl}-@var{pass}=@var{range-list} >>> +@itemx -fenable-@var{tree|rtl}-@var{pass}=@var{range-list} >>> >>> likewise. >>> >>> Thanks, >>> Richard. >>> David On Thu, May 26, 2011 at 2:04 PM, Xinliang David Li wrote: > On Thu, May 26, 2011 at 2:04 AM, Richard Guenther > wrote: >> On Wed, May 25, 2011 at 6:53 PM, Joseph S. Myers >> wrote: >>> On Wed, 25 May 2011, Xinliang David Li wrote: >>> Ping. The link to the message: http://gcc.gnu.org/ml/gcc-patches/2011-05/msg01303.html >>> >>> I don't consider this an option handling patch. Patches adding whole >>> new >>> features involving new options should be reviewed by maintainers for the >>> part of the compiler relevant to those features (since there isn't a >>> pass >>> manager maintainer, I guess that means middle-end). >> >> Hmm, I suppose then you reviewed the option handling parts and they >> are ok? Those globbing options always cause headache to me. >> >> +-fenable-ipa-@var{pass} @gol >> +-fenable-rtl-@var{pass} @gol >> +-fenable-rtl-@var{pass}=@var{range-list} @gol >> +-fenable-tree-@var{pass} @gol >> >> so, no -fenable-tree-@var{pass}=@var{range-list}? >> > > Missed. Will add. > > >> Does the pass name match 1:1 with the dump file name? In which >> case > > Yes. > >> >
Re: [PATH] PR/49139 fix always_inline failures diagnostics
On 05/31/2011 11:18 AM, Richard Guenther wrote: On Tue, May 31, 2011 at 9:54 AM, Christian Bruel wrote: Hello, The attached patch fixes a few diagnostic discrepancies for always_inline failures. Illustrated by the fail_always_inline[12].c attached cases, the current behavior is one of: - success (with and without -Winline), silently not honoring always_inline gcc fail_always_inline1.c -S -Winline -O0 -fpic gcc fail_always_inline1.c -S -O2 -fpic - error: with -Winline but not without gcc fail_always_inline1.c -S -Winline -O2 -fpic - error: without -Winline gcc fail_always_inline2.c -S -fno-early-inlining -O2 or the original c++ attachment in this defect note that -Winline never warns, as stated in the documentation This simple patch consistently emits a warning (changing the sorry unimplemented message) whenever the attribute is not honored. My first will was to generate and error instead of the warning, but since it is possible that inlines is only performed at LTO time, an error would be inapropriate (Note that today this is not possible with -Winline that would abort). Another alternative I considered would be to emit the warning under -Winline rather than unconditionally, but this more a user misuse of the attribute, so should always be warned anyway. Or maybe a new -Winline-always that would be activated under -Wall ? Other opinion welcomed. Tested with standard bootstrap and regression on x86. Comments, and/or OK for trunk ? The patch is not ok, we may not fail to inline an always_inline function. OK, I thought so that would be an error. but I was afraid to abort the inline of function with a missing body (provided by another file) by LTO, which would be a regression. rethinking about this and considering that a valid LTO program should be valid without LTO, and that the scope is the translation unit, that would be OK to always reject attribute_inline on functions without a body. To make this more consistent I proposed to warn whenever you take the address of an always_inline function (because then you can confuse GCC by indirectly calling such function which we might inline dependent on optimization setting and which we might discover we didn't inline only dependent on optimization setting).Honza proposed to move the sorry()ing to when we feel the need to output the always_inline function, thus when it was not optimized away, but that would require us not preserving the body (do we?) with -fpreserve-inline-functions. But we don't know if taking the address of the function will result by a call to it, or how the pointer will be used. So I think the check should be done at the caller site. But I code like: inline __attribute__((always_inline)) int foo() { return 0; } static int (*ptr)() = foo; int bar() { return ptr(); } is not be (legitimately) inlined, but without diagnostic, I don't know where to look at this that at the moment. For fail_always_inline1.c we should diagnose the appearant misuse of always_inline with a warning, drop the attribute but keep DECL_DISREGARD_INLINE_LIMITS set. > Same for fail_always_inline2.c. I agree that sorry()ing for those cases is odd. EIther we should reject the declarations upfront ("always-inline function will not be inlinable"), or we should emit a warning of that kind and make sure to not sorry later. In addition to the error at the caller site, I've added the specific warn about the missing inline keyword. Thanks for your comments, here is the new patch that I'm testing, (without the handling of indirect calls which can be treated separately). Cheers Christian 2010-05-25 Christian Bruel PR 49139 * cgraph.c (cgraph_function_body_availability): Check always_inline * ipa-inline-transform.c (inline_transform): Force optimize_inline_calls error checking if always_inline seen. * tree-inline.c (tree_inlinable_function_p): Use error instead of sorry. (expand_call_inline): Likewise. 2010-05-25 Christian Bruel * gcc.db/always_inline.c: Removed -Winline. Update checks * gcc.db/always_inline2.c: Likewise. * gcc.db/always_inline3.c: Likewise. * gcc.db/fail_always_inline.c: New. Index: gcc/cgraph.c === --- gcc/cgraph.c(revision 174264) +++ gcc/cgraph.c(working copy) @@ -2414,7 +2414,14 @@ bit. */ else if (decl_replaceable_p (node->decl) && !DECL_EXTERNAL (node->decl)) +{ + if (cgraph_global_info_ready) + if (lookup_attribute ("always_inline", DECL_ATTRIBUTES (node->decl))) + if (!DECL_DECLARED_INLINE_P (node->decl)) + warning (0, "always_inline function might not be inlinable"); + avail = AVAIL_OVERWRITABLE; +} else avail = AVAIL_AVAILABLE; return avail; Index: gcc/ipa-inline-transform.c =
Re: [build] Move ENABLE_EXECUTE_STACK to toplevel libgcc
Seems like a plan, but I didn't go in this direction since I couldn't test anything like this. As long as you test the general configury on 1-2 platforms, it's not any less tested than what you have now. The various __enable_execute_stack implementations differ in minor ways that would have to be autoconf'ed. Just select them by $host. One other caveat: I don't know if I like grouping the configure support for the enable-execute-stack.c variants together or would rather keep all configuration for a single platform (or OS) together. Probably a matter of taste. In case it wasn't clear, I was proposing the latter. Another question: should we keep the variants in libgcc/config (like your config/enable-execute-stack-mprotect.c) or at the toplevel (like enable-execute-stack-empty.c)? At the moment I see a mixture of files at the libgcc toplevel and others in config. I would put them under config/ unless they are platform-independent. I'd thought about it, but refrained since HAVE_ENABLE_EXECUTE_STACK affects only three cpus. Currently, our documentation of libgcc configury and macros used is close to non-existant. That's probably for someone who has implemented this stuff. True, OTOH HAVE_ENABLE_EXECUTE_STACK is a target macro, and those are well documented. Just say that it has to be defined if libgcc provides a non-trivial implementation of __enable_execute_stack; it doesn't need to delve into how to hack libgcc. you can make the above work, that would be great as an example of how to move stuff to the libgcc toplevel directory. I'll give it a try, but it might take over the weekend. Take your time, we're in stage1 now. Paolo
Re: [build] Move ENABLE_EXECUTE_STACK to toplevel libgcc
Paolo Bonzini writes: >> The new libgcc/config/$target-lib.h headers are added to libgcc_tm_file >> in gcc/config.gcc. I'd rather add them to libgcc/config.host instead so >> the information is kept local to libgcc. > > Did you have any problems doing so? There weren't any provisions in libgcc/configure.ac similar to libgcc_tm_file in gcc/configure.ac, so I took the easy way out. > Long term, it would be even better to do something like this in > libgcc/config.host: > > enable_execute_stack=enable-execute-stack-empty.c > case $host in > ...) enable_execute_stack=config/enable-execute-stack-mprotect.c ;; > ... > esac > > in libgcc/configure.ac: > > AC_CONFIG_LINKS(enable-execute-stack.c:$enable_execute_stack) > > in libgcc/Makefile.in: > > LIB2ADD += enable-execute-stack.c > > and drop this hunk altogether from gcc/libgcc2.c: > > #ifdef L_enable_execute_stack > /* Attempt to turn on execute permission for the stack. */ > > #ifdef ENABLE_EXECUTE_STACK > ENABLE_EXECUTE_STACK > #else > void > __enable_execute_stack (void *addr __attribute__((__unused__))) > {} > #endif /* ENABLE_EXECUTE_STACK */ > > #endif /* L_enable_execute_stack */ Seems like a plan, but I didn't go in this direction since I couldn't test anything like this. The various __enable_execute_stack implementations differ in minor ways that would have to be autoconf'ed. This should be done by someone with access to the affected platforms. One other caveat: I don't know if I like grouping the configure support for the enable-execute-stack.c variants together or would rather keep all configuration for a single platform (or OS) together. Probably a matter of taste. Another question: should we keep the variants in libgcc/config (like your config/enable-execute-stack-mprotect.c) or at the toplevel (like enable-execute-stack-empty.c)? At the moment I see a mixture of files at the libgcc toplevel and others in config. >> Bootstrapped without regressions on i386-pc-solaris2.11 and >> sparc-sun-solaris2.11. >> >> I've Cc'ed the OS port maintainers of the other affected targets and >> would appreciate testing/review. An OpenBSD maintainer isn't listed, >> unfortunately. Also included are the CPU port maintainers for the >> modified backends. >> >> Ok for mainline after a week if no problems occur in testing on the >> other targets? > > It's a good start, but at least you need changes to the documentation; if I'd thought about it, but refrained since HAVE_ENABLE_EXECUTE_STACK affects only three cpus. Currently, our documentation of libgcc configury and macros used is close to non-existant. That's probably for someone who has implemented this stuff. > you can make the above work, that would be great as an example of how to > move stuff to the libgcc toplevel directory. I'll give it a try, but it might take over the weekend. Thanks. Rainer -- - Rainer Orth, Center for Biotechnology, Bielefeld University
Re: [PATCH, ARM] Thumb-2 12-bit immediates in ADD and SUB instructions
On 05/20/2011 02:37 PM, Andrew Stubbs wrote: On 20/05/11 10:45, Dmitry Plotnikov wrote: This patch adds support for 12-bit immediate values for Thumb-2 in ADD and SUB instructions. We added two new alternatives for *arm_addsi3 which make use of two new constraints for 12-bit values. Also we modified costs of PLUS rtx expression. A already have a patch submitted for review that does all this. http://gcc.gnu.org/ml/gcc-patches/2011-04/msg01657.html Sorry, we missed this message. This patch reduces size of libevas by 1788 bytes (from 464916 to 463128), and sqlite by 54 bytes (from 266156 to 266052). Regtested with Cortex-A8 QEMU. I also have another patch that improves support for thumb2 replicated constants. I'd be interested whether the patch makes a difference to your benchmark? http://gcc.gnu.org/ml/gcc-patches/2011-04/msg01757.html (approved, but blocked on the addw/subw review). I tested the patches on libevas. Code size is reduced by 64 bytes with your addw/subw patch (this effect is actually similar to our patch; there was an error with numbers reported in the previous message). Adding your replicated constants patch reduces code size additionally by 808 bytes. These patches didn't affect the performance on libevas. +/* Return TRUE if int I is a valid immediate THUMB-2 constant in add/sub + * instructions in 12 bit encoding variants. */ + +int +const_ok_for_thumb2_12bit (HOST_WIDE_INT i) +{ + /* According to Thumb-2 instruction set manual such constants should be + * in range 0-4095. */ + if ((i& ~(unsigned HOST_WIDE_INT) 0xfff) == 0) +return TRUE; + + return FALSE; +} As Richard and Chung-Lin said, we have const_ok_for_op for this sort of thing. I already patched it for movw 16-bit constants, and my patch above covers addw and subw. @@ -7164,7 +7179,10 @@ arm_rtx_costs_1 (rtx x, enum rtx_code outer, int* total, bool speed) case CONST_INT: if (const_ok_for_arm (INTVAL (x)) - || const_ok_for_arm (~INTVAL (x))) + || const_ok_for_arm (~INTVAL (x)) + || (TARGET_THUMB2&& (outer == PLUS) + && (const_ok_for_thumb2_12bit (INTVAL (x)) + || const_ok_for_thumb2_12bit (~INTVAL (x) *total = COSTS_N_INSNS (1); else *total = COSTS_N_INSNS (arm_gen_constant (SET, mode, NULL_RTX, I think my patch may be missing this bit - good catch. It really should be recast in terms of const_ok_for_op, though. Would you include this in your patch? Or should we submit it as a separate patch? -- Best regards, Dmitry
[PATCH, i386]: Slightly penalize non-native FP moves.
Hello! Attached patch slightly penalizes FP moves that use non-native registers. Additionally, the patch merges FP push splitters. 2011-05-31 Uros Bizjak * config/i386/i386.md (*pushxf_nointeger): Merge alternatives 1 and 2. (FP push_operand splitters): Merge {TF,XF,DF} mode splitters. 2011-05-31 Uros Bizjak * config/i386/i386.md (*movtf_internal): Avoid allocating general registers. Penalize F*r->o alternative to prevent partial memory stalls. Slightly penalize *roF->*r alternative. Generate SSE CONST_DOUBLE immediates when optimizing function for size. Do not move CONST_DOUBLEs directly to memory for !TARGET_MEMORY_MISMATCH_STALL. (*movxf_internal): Slightly penalize Yx*roF->Yx*r alternative. (*movdf_internal): Slightly penalize Yd*roF->Yd*r alternative. (*movdf_internal_rex64): Slightly penalize rm->r, F->m and r->m alternatives. (*movsf_internal): Slightly penalize rmF->r and Fr->m alternatives. (fp_register_operand splitters): Use fp_register_operand constraint. Do not use FP_REG_P in insn condition. (any_fp_register_operand splitters): Use any_fp_register_operand constraint. Do not use ANY_FP_REG_P in insn condition. Patch was tested on x86_64-pc-linux-gnu {,-m32}, committed to mainline SVN. Uros. Index: i386.md === --- i386.md (revision 174465) +++ i386.md (working copy) @@ -2678,6 +2678,7 @@ (set_attr "unit" "sse,*,*") (set_attr "mode" "TF,SI,SI")]) +;; %%% Kill this when call knows how to work this out. (define_split [(set (match_operand:TF 0 "push_operand" "") (match_operand:TF 1 "sse_reg_operand" ""))] @@ -2685,14 +2686,6 @@ [(set (reg:P SP_REG) (plus:P (reg:P SP_REG) (const_int -16))) (set (mem:TF (reg:P SP_REG)) (match_dup 1))]) -(define_split - [(set (match_operand:TF 0 "push_operand" "") - (match_operand:TF 1 "general_operand" ""))] - "TARGET_SSE2 && reload_completed - && !SSE_REG_P (operands[1])" - [(const_int 0)] - "ix86_split_long_move (operands); DONE;") - (define_insn "*pushxf" [(set (match_operand:XF 0 "push_operand" "=<,<") (match_operand:XF 1 "general_no_elim_operand" "f,ro"))] @@ -2712,17 +2705,18 @@ ;; only once, but this ought to be handled elsewhere). (define_insn "*pushxf_nointeger" - [(set (match_operand:XF 0 "push_operand" "=X,X,X") - (match_operand:XF 1 "general_no_elim_operand" "f,Fo,*r"))] + [(set (match_operand:XF 0 "push_operand" "=X,X") + (match_operand:XF 1 "general_no_elim_operand" "f,*rFo"))] "optimize_function_for_size_p (cfun)" { /* This insn should be already split before reg-stack. */ gcc_unreachable (); } [(set_attr "type" "multi") - (set_attr "unit" "i387,*,*") - (set_attr "mode" "XF,SI,SI")]) + (set_attr "unit" "i387,*") + (set_attr "mode" "XF,SI")]) +;; %%% Kill this when call knows how to work this out. (define_split [(set (match_operand:XF 0 "push_operand" "") (match_operand:XF 1 "fp_register_operand" ""))] @@ -2731,14 +2725,6 @@ (set (mem:XF (reg:P SP_REG)) (match_dup 1))] "operands[2] = GEN_INT (-GET_MODE_SIZE (XFmode));") -(define_split - [(set (match_operand:XF 0 "push_operand" "") - (match_operand:XF 1 "general_operand" ""))] - "reload_completed - && !FP_REG_P (operands[1])" - [(const_int 0)] - "ix86_split_long_move (operands); DONE;") - ;; Size of pushdf is 3 (for sub) + 2 (for fstp) + memory operand size. ;; Size of pushdf using integer instructions is 2+2*memory operand size ;; On the average, pushdf using integers can be still shorter. @@ -2763,14 +2749,6 @@ [(set (reg:P SP_REG) (plus:P (reg:P SP_REG) (const_int -8))) (set (mem:DF (reg:P SP_REG)) (match_dup 1))]) -(define_split - [(set (match_operand:DF 0 "push_operand" "") - (match_operand:DF 1 "general_operand" ""))] - "reload_completed - && !ANY_FP_REG_P (operands[1])" - [(const_int 0)] - "ix86_split_long_move (operands); DONE;") - (define_insn "*pushsf_rex64" [(set (match_operand:SF 0 "push_operand" "=X,X,X") (match_operand:SF 1 "nonmemory_no_elim_operand" "f,rF,x"))] @@ -2797,15 +2775,6 @@ (set_attr "unit" "i387,*,*") (set_attr "mode" "SF,SI,SF")]) -(define_split - [(set (match_operand:SF 0 "push_operand" "") - (match_operand:SF 1 "memory_operand" ""))] - "reload_completed - && MEM_P (operands[1]) - && (operands[2] = find_constant_src (insn))" - [(set (match_dup 0) - (match_dup 2))]) - ;; %%% Kill this when call knows how to work this out. (define_split [(set (match_operand:SF 0 "push_operand" "") @@ -2813,7 +2782,25 @@ "reload_completed" [(set (reg:P SP_REG) (plus:P (reg:P SP_REG) (match_dup 2))) (set (mem:SF (reg:P SP_REG)) (match_dup 1))] - "operands[2] = GEN_INT (-GET_MODE_SIZE (mode));") + "operands[2] = GEN_INT (-GET_MODE_SIZE (mode));") + +(define_split + [(set
Re: [build] Move Solaris 2 startup files to toplevel libgcc, revised
Paolo Bonzini writes: > On 05/30/2011 04:29 PM, Rainer Orth wrote: * Non-Solaris SPARC changes: >> >> After I had moved sparc/sol2-c[in].asm to libgcc, I noticed that >> despite the name a few non-Solaris targets uses those files, too: >> >> sparc-*-elf*, sparc-*-rtems*, sparc64-*-elf*, sparc64-*-rtems* >>> > >>> > Yes, I tried to untangle that, but the RTEMS folks complained so I had to >>> > backpedal. Note that this is also the case on the i386 side. >> Drats, I hadn't expected anything like this ;-( >> >> Here's the updated patch which takes care of this. I've taken the >> liberty to rename gcc/config/i386/t-rtems-i386 to >> gcc/config/i386/t-rtems, following all other RTEMS makefile fragments. >> I'd be really grateful if the RTEMS maintainers could give it a whirl. >> >> Besides, I still need build and SPARC maintainer approval for the >> non-Solaris parts of the patch from as outlined in the previous >> submission: >> >> http://gcc.gnu.org/ml/gcc-patches/2011-05/msg02181.html >> >> Bootstrapped without regressions on i386-pc-solaris2.10, >> i386-pc-solaris2.11 and sparc-sun-solaris2.11. >> >> Ok for mainline after the RTEMS parts have been tested/approved? > > Looks good as far as I'm concerned. As a followup, please delete > t-slibgcc-darwin and use the new t-slibgcc-dummy instead (even better, you > could "svn mv" it). I think I just need RTEMS testing/approval now, everything else should be set? Rainer -- - Rainer Orth, Center for Biotechnology, Bielefeld University
Re: [build] Move Tru64 UNIX startup files to toplevel libgcc
Paolo Bonzini writes: > On 05/30/2011 05:12 PM, Rainer Orth wrote: >> Ok for mainline after a fresh bootstrap? > > Ok. This depends on the t-slibgcc-dummy file from the Solaris patch, so > feel free to move it to this patch (together with > s/t-slibgcc-darwin/t-slibgcc-dummy/). Ok, will do. It also depends on libgcc/config/t-slibgcc from the Solaris patch, so I'll apply that as well. gcc/config/i386/t-nwld also has SHLIB_LINK = dummy, so I'll make i[3456x]86-*-netware* use t-slibgcc-dummy, too. Thanks. Rainer -- - Rainer Orth, Center for Biotechnology, Bielefeld University
[PATCH] LTO canoical type merging disentangle
This disentangles canonical type merging from type merging. LTO bootstrapped and tested on x86_64-unknown-linux-gnu, SPEC 2k6 LTO tested, committed. Richard. 2011-05-31 Richard Guenther * gimple.c (gimple_register_canonical_type): Do not register any types via gimple_register_type. Index: gcc/gimple.c === --- gcc/gimple.c(revision 174478) +++ gcc/gimple.c(working copy) @@ -4749,21 +4749,12 @@ tree gimple_register_canonical_type (tree t) { void **slot; - tree orig_t = t; gcc_assert (TYPE_P (t)); if (TYPE_CANONICAL (t)) return TYPE_CANONICAL (t); - /* Use the leader of our main variant for determining our canonical - type. The main variant leader is a type that will always - prevail. */ - t = gimple_register_type (TYPE_MAIN_VARIANT (t)); - - if (TYPE_CANONICAL (t)) -return TYPE_CANONICAL (t); - if (gimple_canonical_types == NULL) gimple_canonical_types = htab_create_ggc (16381, gimple_canonical_type_hash, gimple_canonical_type_eq, 0); @@ -4783,9 +4774,6 @@ gimple_register_canonical_type (tree t) *slot = (void *) t; } - /* Also cache the canonical type in the non-leaders. */ - TYPE_CANONICAL (orig_t) = t; - return t; }
Fix lto decl merging of thunks
Hi, while updating thunks to not be same body alises I missed this spot that merges alias decl. Bootstrapped/regtested x86_64-linux, comitted. 2011-05-31 Jan Hubicka * lto-symtab.c (lto_symtab_merge_cgraph_nodes): Merge alias decl of thunks. Index: lto-symtab.c === --- lto-symtab.c(revision 174393) +++ lto-symtab.c(working copy) @@ -821,11 +821,15 @@ lto_symtab_merge_cgraph_nodes (void) htab_traverse (lto_symtab_identifiers, lto_symtab_merge_cgraph_nodes_1, NULL); for (node = cgraph_nodes; node; node = node->next) -for (alias = node->same_body; alias; alias = next) - { - next = alias->next; - alias->thunk.alias = lto_symtab_prevailing_decl (alias->thunk.alias); - } +{ + if (node->thunk.thunk_p) +node->thunk.alias = lto_symtab_prevailing_decl (node->thunk.alias); + for (alias = node->same_body; alias; alias = next) + { + next = alias->next; + alias->thunk.alias = lto_symtab_prevailing_decl (alias->thunk.alias); + } +} } /* Given the decl DECL, return the prevailing decl with the same name. */
Re: [PATCH] Remove pow/powi expanders (PR46728 patch 6)
On Tue, May 31, 2011 at 4:00 PM, William J. Schmidt wrote: > This patch removes the now-redundant support for pow and powi builtins > in the expand phase. My concerns about -O0 regressions were unfounded. > There are code gen differences for the unlikely combination of -O0 and > -ffast-math, but that's obviously nothing to be concerned about. > > Bootstrapped and regtested for powerpc64-linux with no regressions. OK > for trunk? > > 2011-05-31 Bill Schmidt > > PR tree-optimization/46728 > * builtins.c (powi_table): Remove. > (powi_lookup_cost): Remove. > (powi_cost): Remove. > (expand_powi_1): Remove. > (expand_powi): Remove. You are not really removing this one but only removing constant exponent handling. > (expand_builtin_pow_root): Remove. > (expand_builtin_pow): Remove. > (expand_builtin_powi): Eliminate handling of constant exponent. > (expand_builtin): Use expand_builtin_mathfn_2 for BUILT_IN_POW. While I can imagine we can construct testcases that regress with this patch (easiest cases would be extra constant propagations due to loop unrolling or jump threading) I don't think they will be common enough to warrant keeping all of the machinery also at RTL expansion phase. Just to note, we are now expanding the canonical form pow(x,2) of x*x to x*x again before loop optimizations take place and we won't reverse that unless somebody would go and re-build, fold and gimplify that expression as trees (which I think nobody does nor should). Still - any comments from other maintainers appreciated. Consider the patch approved if there are no objections until the end of the week (it doesn't look like it would block further development on your side). Thanks, Richard. > > Index: gcc/builtins.c > === > --- gcc/builtins.c (revision 174447) > +++ gcc/builtins.c (working copy) > @@ -2847,451 +2847,6 @@ expand_builtin_int_roundingfn_2 (tree exp, rtx tar > return target; > } > > -/* To evaluate powi(x,n), the floating point value x raised to the > - constant integer exponent n, we use a hybrid algorithm that > - combines the "window method" with look-up tables. For an > - introduction to exponentiation algorithms and "addition chains", > - see section 4.6.3, "Evaluation of Powers" of Donald E. Knuth, > - "Seminumerical Algorithms", Vol. 2, "The Art of Computer Programming", > - 3rd Edition, 1998, and Daniel M. Gordon, "A Survey of Fast Exponentiation > - Methods", Journal of Algorithms, Vol. 27, pp. 129-146, 1998. */ > - > -/* Provide a default value for POWI_MAX_MULTS, the maximum number of > - multiplications to inline before calling the system library's pow > - function. powi(x,n) requires at worst 2*bits(n)-2 multiplications, > - so this default never requires calling pow, powf or powl. */ > - > -#ifndef POWI_MAX_MULTS > -#define POWI_MAX_MULTS (2*HOST_BITS_PER_WIDE_INT-2) > -#endif > - > -/* The size of the "optimal power tree" lookup table. All > - exponents less than this value are simply looked up in the > - powi_table below. This threshold is also used to size the > - cache of pseudo registers that hold intermediate results. */ > -#define POWI_TABLE_SIZE 256 > - > -/* The size, in bits of the window, used in the "window method" > - exponentiation algorithm. This is equivalent to a radix of > - (1< -#define POWI_WINDOW_SIZE 3 > - > -/* The following table is an efficient representation of an > - "optimal power tree". For each value, i, the corresponding > - value, j, in the table states than an optimal evaluation > - sequence for calculating pow(x,i) can be found by evaluating > - pow(x,j)*pow(x,i-j). An optimal power tree for the first > - 100 integers is given in Knuth's "Seminumerical algorithms". */ > - > -static const unsigned char powi_table[POWI_TABLE_SIZE] = > - { > - 0, 1, 1, 2, 2, 3, 3, 4, /* 0 - 7 */ > - 4, 6, 5, 6, 6, 10, 7, 9, /* 8 - 15 */ > - 8, 16, 9, 16, 10, 12, 11, 13, /* 16 - 23 */ > - 12, 17, 13, 18, 14, 24, 15, 26, /* 24 - 31 */ > - 16, 17, 17, 19, 18, 33, 19, 26, /* 32 - 39 */ > - 20, 25, 21, 40, 22, 27, 23, 44, /* 40 - 47 */ > - 24, 32, 25, 34, 26, 29, 27, 44, /* 48 - 55 */ > - 28, 31, 29, 34, 30, 60, 31, 36, /* 56 - 63 */ > - 32, 64, 33, 34, 34, 46, 35, 37, /* 64 - 71 */ > - 36, 65, 37, 50, 38, 48, 39, 69, /* 72 - 79 */ > - 40, 49, 41, 43, 42, 51, 43, 58, /* 80 - 87 */ > - 44, 64, 45, 47, 46, 59, 47, 76, /* 88 - 95 */ > - 48, 65, 49, 66, 50, 67, 51, 66, /* 96 - 103 */ > - 52, 70, 53, 74, 54, 104, 55, 74, /* 104 - 111 */ > - 56, 64, 57, 69, 58, 78, 59, 68, /* 112 - 119 */ > - 60, 61, 61, 80, 62, 75, 63, 68, /* 120 - 127 */ > - 64, 65, 65, 1
Re: RFA: another patch to solve PR49154
Hans-Peter Nilsson writes: > Index: tm.texi.in > === > --- tm.texi.in(revision 174376) > +++ tm.texi.in(working copy) > @@ -2327,6 +2327,11 @@ constraints is through machine-dependent > You can define such letters to correspond to various classes, then use > them in operand constraints. > > +You must define the narrowest register class for a register so that > +class either has no subclasses, or that for some mode, the move cost > +between registers within the class are cheaper than moving a register > +in the class to or from memory (@pxref{Costs}). > + I fear this isn't true for some MIPS classes. Richard
[PATCH] Remove pow/powi expanders (PR46728 patch 6)
This patch removes the now-redundant support for pow and powi builtins in the expand phase. My concerns about -O0 regressions were unfounded. There are code gen differences for the unlikely combination of -O0 and -ffast-math, but that's obviously nothing to be concerned about. Bootstrapped and regtested for powerpc64-linux with no regressions. OK for trunk? 2011-05-31 Bill Schmidt PR tree-optimization/46728 * builtins.c (powi_table): Remove. (powi_lookup_cost): Remove. (powi_cost): Remove. (expand_powi_1): Remove. (expand_powi): Remove. (expand_builtin_pow_root): Remove. (expand_builtin_pow): Remove. (expand_builtin_powi): Eliminate handling of constant exponent. (expand_builtin): Use expand_builtin_mathfn_2 for BUILT_IN_POW. Index: gcc/builtins.c === --- gcc/builtins.c (revision 174447) +++ gcc/builtins.c (working copy) @@ -2847,451 +2847,6 @@ expand_builtin_int_roundingfn_2 (tree exp, rtx tar return target; } -/* To evaluate powi(x,n), the floating point value x raised to the - constant integer exponent n, we use a hybrid algorithm that - combines the "window method" with look-up tables. For an - introduction to exponentiation algorithms and "addition chains", - see section 4.6.3, "Evaluation of Powers" of Donald E. Knuth, - "Seminumerical Algorithms", Vol. 2, "The Art of Computer Programming", - 3rd Edition, 1998, and Daniel M. Gordon, "A Survey of Fast Exponentiation - Methods", Journal of Algorithms, Vol. 27, pp. 129-146, 1998. */ - -/* Provide a default value for POWI_MAX_MULTS, the maximum number of - multiplications to inline before calling the system library's pow - function. powi(x,n) requires at worst 2*bits(n)-2 multiplications, - so this default never requires calling pow, powf or powl. */ - -#ifndef POWI_MAX_MULTS -#define POWI_MAX_MULTS (2*HOST_BITS_PER_WIDE_INT-2) -#endif - -/* The size of the "optimal power tree" lookup table. All - exponents less than this value are simply looked up in the - powi_table below. This threshold is also used to size the - cache of pseudo registers that hold intermediate results. */ -#define POWI_TABLE_SIZE 256 - -/* The size, in bits of the window, used in the "window method" - exponentiation algorithm. This is equivalent to a radix of - (1<= POWI_TABLE_SIZE) -{ - if (val & 1) - { - digit = val & ((1 << POWI_WINDOW_SIZE) - 1); - result += powi_lookup_cost (digit, cache) - + POWI_WINDOW_SIZE + 1; - val >>= POWI_WINDOW_SIZE; - } - else - { - val >>= 1; - result++; - } -} - - return result + powi_lookup_cost (val, cache); -} - -/* Recursive subroutine of expand_powi. This function takes the array, - CACHE, of already calculated exponents and an exponent N and returns - an RTX that corresponds to CACHE[1]**N, as calculated in mode MODE. */ - -static rtx -expand_powi_1 (enum machine_mode mode, unsigned HOST_WIDE_INT n, rtx *cache) -{ - unsigned HOST_WIDE_INT digit; - rtx target, result; - rtx op0, op1; - - if (n < POWI_TABLE_SIZE) -{ - if (cache[n]) - return cache[n]; - - target = gen_reg_rtx (mode); - cache[n] = target; - - op0 = expand_powi_1 (mode, n - powi_table[n], cache); - op1 = expand_powi_1 (mode, powi_table[n], cache); -} - else if (n & 1) -{ - target = gen_reg_rtx (mode); - digit = n & ((1 << POWI_WINDOW_SIZE) - 1); - op0 = expand_powi_1 (mode, n - digit, cache); - op1 = expand_powi_1 (mode, digit, cache); -} - else -{ - target = gen_reg_rtx (mode); - op0 = expand_powi_1 (mode, n >> 1, cache); - op1 = op0; -} - - result = expand_mult (mode, op0, op1, target, 0); - if (result != target) -emit_move_insn (target, result); - return target; -} - -/* Expand the RTL to evaluate powi(x,n) in mode MODE. X is the - floating point operand in mode MODE, and N is the exponent. This - function needs to be kept in sync with powi_cost above. */ - -static rtx -expand_powi (rtx x, enum machine_mode mode, HOST_WIDE_INT n) -{ - rtx cache[POWI_TABLE_SIZE]; - rtx result; - - if (n == 0) -return CONST1_RTX (mode); - - memset (cache, 0, sizeof (cache)); - cache[1] = x; - - result = expand_powi_1 (mode, (n < 0) ? -n : n, cache); - - /* If the original exponent was negative, reciprocate the result. */ - if (n < 0) -result = expand_binop (mode, sdiv_optab, CONST1_RTX (mode), - result, NULL_RTX, 0, OPTAB_LIB_WIDEN); - - return result; -} - -/* Fold a builtin function call to pow, powf, or powl into a series of sqrts or - cbrts. Return NULL_RTX if no simplification can be made or expand the tree - if we can simplify it. */ -static rtx -expand_builtin_pow_root (location_t loc, tree arg0, tree arg1, tree type,
[PATCH, MELT] add pragma support in MELT plugin
Hello, The following patch allows to use pragma in a MELT plugin. For exemple we can recover the following pragmas: #pragma GCCPLUGIN melt op or #pragma GCCPLUGIN melt op (arg1, arg2, ...) with argX a name, a string, or a number. It is easy to change the pragma space ("GCCPLUGIN") and name ("melt"), so I am open to suggestion. This plugin works on the MELT heart and so It need to regenerate the source. The first .diff (addPragma-warmelt-first) contains the gcc/melt/warmelt-first.melt file: I have added a field in the MELT class_system_data class which allows the user to add a function handling pragma. After adding this diff, it is needed to use 'make upgrade-warmelt' in the gcc directory to regenerate gcc/melt/generated. After this, it is possible to add the second diff (addPragma-runtime): it contains change in gcc/melt-runtime.c and gcc/melt-runtime.h: it adds the melt_pragma_callback which register the pragma handler defined in the same file. The pragma handler calls the function defined by the user and gives it the trees corresponding to the pragma operator and argument. I have been obliged to use weak symbols for pragma_lex and c_register_pragma as they are not defined when using lto. This is a temporary solution that I commented. I am going to send a test in the Testsuite, but I have already tried it successfully with something like that: (defun my_simple_pragma_handler(val1 lstarg) (debug_msg val1 "debugging melt pragma") ) (put_fields initial_system_data :sysdata_meltpragma_definer my_simple_pragma_handler ) Pierre Vittet Index: gcc/melt/warmelt-first.melt === --- gcc/melt/warmelt-first.melt (revision 174379) +++ gcc/melt/warmelt-first.melt (working copy) @@ -436,6 +436,7 @@ don't instanciate this class!}# sysdata_pass_dict;stringmap for passes sysdata_exit_finalizer ;;closure to be called after the passes, at finalization sysdata_meltattr_definer ;;closure to be called for melt attributes + sysdata_meltpragma_definer ;;closure to be called for melt pragma sysdata_patmacro_exporter;closure to export a patmacro sysdata_debugmsg ;closure for debugmsg sysdata_stdout ;raw file for stdout Index: gcc/melt-runtime.c === --- gcc/melt-runtime.c (revision 174379) +++ gcc/melt-runtime.c (working copy) @@ -75,6 +75,7 @@ along with GCC; see the file COPYING3. If not se #include "plugin.h" #include "cppdefault.h" +#include "c-pragma.h" #if BUILDING_GCC_VERSION > 4005 /* GCC 4.6 has realmpfr.h which includes */ #include "realmpfr.h" @@ -8930,7 +8931,124 @@ melt_attribute_callback(void *gcc_data ATTRIBUTE_U register_attribute(&melt_attr_spec); } +/*We declare weak functions because they cannot be linked when we use lto (it +loses langage specific informations). +If you use one of those functions you must check them to be not NULL. +*/ +extern enum cpp_ttype __attribute__((weak)) pragma_lex (tree *); +extern void __attribute__((weak)) c_register_pragma (const char *, const char +*, pragma_handler); +#define GCC_BAD(gmsgid) \ + do { warning (OPT_Wpragmas, gmsgid); goto end; } while (0) + + +void melt_handle_melt_pragma (melt_ptr_t optreev, melt_ptr_t listargtreev); + +/* handle a melt pragma*/ +static void +handle_melt_pragma (cpp_reader *ARG_UNUSED(dummy)) +{ + enum cpp_ttype token; + /*list containing the pragma argument*/ + tree x; + MELT_ENTERFRAME (3, NULL); +#define seqv meltfram__.mcfr_varptr[0] +#define treev meltfram__.mcfr_varptr[1] +#define optreev meltfram__.mcfr_varptr[2] + if(! pragma_lex || ! c_register_pragma) +GCC_BAD("Cannot use pragma symbol at this level (maybe you use -flto which \ +is incompatible)."); + + token = pragma_lex (&x); + if(token != CPP_NAME) +GCC_BAD ("malformed #pragma melt, ignored"); + optreev = meltgc_new_tree((meltobject_ptr_t) MELT_PREDEF (DISCR_TREE), x); + /*If the pragma has the form #pragma PLUGIN melt id (...) then optreev is the + tree containing "id". + Next element should be a parenthese opening. */ + token = pragma_lex (&x); + if (token != CPP_OPEN_PAREN){ +if (token != CPP_EOF) + GCC_BAD ("malformed #pragma melt, ignored"); + +else{ /* we have a pragma of the type '#pragma PLUGIN melt instr' */ + melt_handle_melt_pragma ((melt_ptr_t ) optreev, (melt_ptr_t ) NULL); +} + } + else{/* opening parenthesis */ +seqv = meltgc_new_list ((meltobject_ptr_t) MELT_PREDEF (DISCR_LIST)); +do { + token = pragma_lex (&x); + if(token != CPP_NAME && token != CPP_STRING && token != CPP_NUMBER) +GCC_BAD ("malformed #pragma melt, ignored"); + /* convert gcc tree into a boxed tree */ + treev = meltgc_new_tree((meltobject_ptr_t) MELT_PREDEF (DISCR_TREE), x); + /* put
Re: [PATCH] LTO and cache-preloading of FE dependent nodes
On Tue, May 31, 2011 at 09:15, Richard Guenther wrote: > This patch, now that Micha made us less dependent on preloading > exactly the same from each FE, gets rid of two hacks regarding > to the frontend (and option) dependent char_type_node and > boolean_type_node. It does so by first decoupling canonical type > registering from cache-preloading and restrict it to lto1, and > second by exempting nodes from the preloading that are known > to cause problems because they differ semantically between > frontends. Excellent! > > Queued for testing (depends on one prerequesite I think which is > in testing currently). > > Diego, this was the patch I had in mind - does this look reasonable > to you? Yes. It only touches the gimple-specific parts, so it should not affect the C++ streamer. I'm thinking that it may make sense for me to merge the streamer hooks now, even if trunk will only have them defined for gimple. This will help us isolate changes that only affect gimple from those that are shared. It should simplify merges into the branch as well. > In general I'm working towards identifying what global trees are > initialized solely dependent on target info and which ones are > really frontend dependent. Great. Diego.
[PATCH] LTO and cache-preloading of FE dependent nodes
This patch, now that Micha made us less dependent on preloading exactly the same from each FE, gets rid of two hacks regarding to the frontend (and option) dependent char_type_node and boolean_type_node. It does so by first decoupling canonical type registering from cache-preloading and restrict it to lto1, and second by exempting nodes from the preloading that are known to cause problems because they differ semantically between frontends. Queued for testing (depends on one prerequesite I think which is in testing currently). Diego, this was the patch I had in mind - does this look reasonable to you? In general I'm working towards identifying what global trees are initialized solely dependent on target info and which ones are really frontend dependent. Thanks, Richard. 2011-05-31 Richard Guenther * tree.c (free_lang_data): Do not reset boolean_type_node nor char_type_node. * lto-streamer.c (lto_record_common_node): Take node pointer, do not register types. (lto_preload_common_nodes): Explicitly skip preloading nodes that differ between frontends. lto/ * lto-lang.c (lto_register_canonical_types): New function. (lto_init): Register common nodes with the canonical type machinery. Do not play tricks with char_type_node. Index: gcc/tree.c === *** gcc/tree.c.orig 2011-05-31 15:04:08.0 +0200 --- gcc/tree.c 2011-05-31 15:04:24.0 +0200 *** free_lang_data (void) *** 5184,5208 /* Create gimple variants for common types. */ ptrdiff_type_node = integer_type_node; fileptr_type_node = ptr_type_node; - if (TREE_CODE (boolean_type_node) != BOOLEAN_TYPE - || (TYPE_MODE (boolean_type_node) - != mode_for_size (BOOL_TYPE_SIZE, MODE_INT, 0)) - || TYPE_PRECISION (boolean_type_node) != 1 - || !TYPE_UNSIGNED (boolean_type_node)) - { - boolean_type_node = make_unsigned_type (BOOL_TYPE_SIZE); - TREE_SET_CODE (boolean_type_node, BOOLEAN_TYPE); - TYPE_MAX_VALUE (boolean_type_node) = build_int_cst (boolean_type_node, 1); - TYPE_PRECISION (boolean_type_node) = 1; - boolean_false_node = TYPE_MIN_VALUE (boolean_type_node); - boolean_true_node = TYPE_MAX_VALUE (boolean_type_node); - } - - /* Unify char_type_node with its properly signed variant. */ - if (TYPE_UNSIGNED (char_type_node)) - unsigned_char_type_node = char_type_node; - else - signed_char_type_node = char_type_node; /* Reset some langhooks. Do not reset types_compatible_p, it may still be used indirectly via the get_alias_set langhook. */ --- 5184,5189 Index: gcc/lto-streamer.c === *** gcc/lto-streamer.c.orig 2011-05-31 15:04:08.0 +0200 --- gcc/lto-streamer.c 2011-05-31 15:04:24.0 +0200 *** lto_streamer_cache_get (struct lto_strea *** 478,487 /* Record NODE in CACHE. */ static void ! lto_record_common_node (struct lto_streamer_cache_d *cache, tree *nodep) { - tree node = *nodep; - /* We have to make sure to fill exactly the same number of elements for all frontends. That can include NULL trees. As our hash table can't deal with zero entries we'll simply stream --- 478,485 /* Record NODE in CACHE. */ static void ! lto_record_common_node (struct lto_streamer_cache_d *cache, tree node) { /* We have to make sure to fill exactly the same number of elements for all frontends. That can include NULL trees. As our hash table can't deal with zero entries we'll simply stream *** lto_record_common_node (struct lto_strea *** 491,515 if (!node) node = error_mark_node; - if (TYPE_P (node)) - { - /* Type merging will get confused by the canonical types as they -are set by the middle-end. */ - if (in_lto_p) - TYPE_CANONICAL (node) = NULL_TREE; - node = gimple_register_type (node); - TYPE_CANONICAL (node) = gimple_register_canonical_type (node); - if (in_lto_p) - TYPE_CANONICAL (*nodep) = TYPE_CANONICAL (node); - *nodep = node; - } - lto_streamer_cache_append (cache, node); if (POINTER_TYPE_P (node) || TREE_CODE (node) == COMPLEX_TYPE || TREE_CODE (node) == ARRAY_TYPE) ! lto_record_common_node (cache, &TREE_TYPE (node)); else if (TREE_CODE (node) == RECORD_TYPE) { /* The FIELD_DECLs of structures should be shared, so that every --- 489,500 if (!node) node = error_mark_node; lto_streamer_cache_append (cache, node); if (POINTER_TYPE_P (node) || TREE_CODE (node) == COMPLEX_TYPE || TREE_CODE (node) == ARRAY_TYPE) ! lto_record_common_node (cache, TREE_TYPE (node)); else if (TREE_CODE (node) == RECORD_TYPE) { /* The
Re: __sync_swap* with acq/rel/full memory barrier semantics
On 05/31/2011 06:38 AM, Jakub Jelinek wrote: Aldy was just too excited about working on memory model I think :-) I've been looking at this, and I propose we go this way : http://gcc.gnu.org/wiki/Atomic/GCCMM/CodeGen Please feel free to criticize, comment on, or ask for clarification. I usually miss something I meant to get across. I think the addition of new __sync_* builtins for the different models is preferrable and would be generally more usable even for other users than C++ atomics. On some targets any atomic insn will act as a full barrier, while on others it could generate different insns or code sequences that way. For OpenMP atomics having a none (in addition to full/acq/rel) would be useful, I think #pragma omp atomic doesn't impose any ordering on memory accesses other than the memory being atomically read/written/changed. Haven't read the C++0x standard in detail why it has 6 memory order modes instead of just 4, but if really 6 are needed (even for 4 probably), having new builtins with just one constant extra argument which says the memory ordering mode would be best. I'm not sure if you are agreeing or not, or how much :-) There is still only the basics of relaxed, consume, release/acquire, and seq-cst. so there are 4 modes. C++ gives you two more by separating release and acquire for loads and stores, loads using 'acquire' mode, stores using 'release'. I guess It allows for a slightly finer control over instructions that can be loads and/or stores. It looks like the optimal powerpc sequence for cmpxchg is slightly more efficient when its just an acquire or just a release rather than an acquire/release for instance. (and all 3 sequences are slightly different) The table is more or less complete... ie, a store cant have an 'acquire' mode... and I presume that a consumer which doesn't break release-acquire down into component parts would use that 'release' version of the store as 'release/acquire' mode. I presume a single builtin with a parameter is the most efficient way to build them, but thats just an implementation detail. Presumable you have each builtin in the table with each of those possible modes as a valid parameter. The one thing I would care about is i would like to see the relaxed version be 'just an insn' rather than a builtin, if thats possible... My understanding is that relaxed (as far as C++) has no synchronization at all, so therefore you can treat it like a normal operation as far as optimization. That seems the same for openMP. Its just thats its atomic operation. So it would be preferable if we can avoid a builtin in the optimizers for that. Thats why I left it out of the table. If all the atomic operations are already builtins, well, then I guess it doesn't matter :-P It would be nice to say something like emit_atomic_fetch_add (memory_order) and if its relaxed, emit the atomic fetch_add insn (or builtin if thats what it is), and if its something else, emit the appropriate builtin. That would make bits/libstdc++v2/atomic_2.h even easier too I think maybe we are more or less saying the same thing? :-) Andrew
Re: [PATCH PR45098, 4/10] Iv init cost.
Hi Tom, Thanks for the reply, and sorry for responding so slowly. Tom de Vries writes: > On 05/25/2011 03:44 PM, Richard Sandiford wrote: >> Sorry for being so late. I was just curious... >> >> Tom de Vries writes: >>> The init cost of an iv will in general not be zero. It will be >>> exceptional that the iv register happens to be initialized with the >>> proper value at no cost. In general, there will at the very least be a >>> regcopy or a const set. >>> >>> 2011-05-05 Tom de Vries >>> >>> PR target/45098 >>> * tree-ssa-loop-ivopts.c (determine_iv_cost): Prevent >>> cost_base.cost == 0. >>> Index: gcc/tree-ssa-loop-ivopts.c >>> === >>> --- gcc/tree-ssa-loop-ivopts.c (revision 173380) >>> +++ gcc/tree-ssa-loop-ivopts.c (working copy) >>> @@ -4688,6 +4688,8 @@ determine_iv_cost (struct ivopts_data *d >>> >>>base = cand->iv->base; >>>cost_base = force_var_cost (data, base, NULL); >>> + if (cost_base.cost == 0) >>> + cost_base.cost = COSTS_N_INSNS (1); >>>cost_step = add_cost (TYPE_MODE (TREE_TYPE (base)), data->speed); >>> >>>cost = cost_step + adjust_setup_cost (data, cost_base.cost); >> >> ...why does this reasoning apply only to this call to force_var_cost? >> >> Richard > > force_var_cost is described as estimating the cost of forcing expression expr > into a variable. If expr is already a var, this definition is ambiguous. > If we can use the var directly, the cost is zero, but if we need a regcopy, it > should be the cost of a regcopy. > > What is special for an iv, is that we know that it is not only used but also > modified. If a var is used in or after the loop, we need a regcopy to init the > iv with that var. If that var is not used in or after the loop, we can use > that > var as iv. The patch above is a heuristic that estimates that the latter > situation is the less frequent one. > > In general, we don't have such specific information, and the the cost of zero > is > a good choice then. > > We could add a parameter to force_var_cost that indicates this choice, that > would perhaps be a better fix. > > As for the reasoning related to the const set, that is something that indeed > holds more general, and could be implemented in force_var_cost, which is what > you're suggesting if I understand you correctly. It was actually a genuine question. I honestly wasn't sure whether (and why) this was the only site at which a reg copy should be counted. However... > The tentative patch below explores these last 2 ideas. ...this makes things _much_ clearer to me FWIW. Thanks. Richard
[v3] Minimal noexcept changes to std::string, __vstring
Hi, tested x86_64-linux, committed to mainline. Paolo. // 2011-05-31 Paolo Carlini * include/bits/basic_string.h: Use noexcept per the FDIS (minus compare(const string&), which uses char_traits::compare, which isn't noexcept; also no noexcept in the move assignment operator and move assign, see c++std-lib-30855). * include/bits/basic_string.tcc: Likewise. * include/ext/vstring.h: Likewise. * include/ext/vstring.tcc: Likewise. * include/debug/string: Likewise. Index: include/debug/string === --- include/debug/string(revision 174470) +++ include/debug/string(working copy) @@ -1,6 +1,6 @@ // Debugging string implementation -*- C++ -*- -// Copyright (C) 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010 +// Copyright (C) 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011 // Free Software Foundation, Inc. // // This file is part of the GNU ISO C++ Library. This library is free @@ -114,7 +114,7 @@ { } #ifdef __GXX_EXPERIMENTAL_CXX0X__ -basic_string(basic_string&& __str) +basic_string(basic_string&& __str) noexcept : _Base(std::move(__str)) { } @@ -171,37 +171,55 @@ // 21.3.2 iterators: iterator -begin() +begin() _GLIBCXX_NOEXCEPT { return iterator(_Base::begin(), this); } const_iterator -begin() const +begin() const _GLIBCXX_NOEXCEPT { return const_iterator(_Base::begin(), this); } iterator -end() +end() _GLIBCXX_NOEXCEPT { return iterator(_Base::end(), this); } const_iterator -end() const +end() const _GLIBCXX_NOEXCEPT { return const_iterator(_Base::end(), this); } reverse_iterator -rbegin() +rbegin() _GLIBCXX_NOEXCEPT { return reverse_iterator(end()); } const_reverse_iterator -rbegin() const +rbegin() const _GLIBCXX_NOEXCEPT { return const_reverse_iterator(end()); } reverse_iterator -rend() +rend() _GLIBCXX_NOEXCEPT { return reverse_iterator(begin()); } const_reverse_iterator -rend() const +rend() const _GLIBCXX_NOEXCEPT { return const_reverse_iterator(begin()); } +#ifdef __GXX_EXPERIMENTAL_CXX0X__ +const_iterator +cbegin() const noexcept +{ return const_iterator(_Base::begin(), this); } + +const_iterator +cend() const noexcept +{ return const_iterator(_Base::end(), this); } + +const_reverse_iterator +crbegin() const noexcept +{ return const_reverse_iterator(end()); } + +const_reverse_iterator +crend() const noexcept +{ return const_reverse_iterator(begin()); } +#endif + // 21.3.3 capacity: using _Base::size; using _Base::length; @@ -226,7 +244,7 @@ using _Base::reserve; void -clear() +clear() _GLIBCXX_NOEXCEPT { _Base::clear(); this->_M_invalidate_all(); @@ -672,7 +690,7 @@ // 21.3.6 string operations: const _CharT* -c_str() const +c_str() const _GLIBCXX_NOEXCEPT { const _CharT* __res = _Base::c_str(); this->_M_invalidate_all(); @@ -680,7 +698,7 @@ } const _CharT* -data() const +data() const _GLIBCXX_NOEXCEPT { const _CharT* __res = _Base::data(); this->_M_invalidate_all(); @@ -691,6 +709,7 @@ size_type find(const basic_string& __str, size_type __pos = 0) const + _GLIBCXX_NOEXCEPT { return _Base::find(__str, __pos); } size_type @@ -708,11 +727,12 @@ } size_type -find(_CharT __c, size_type __pos = 0) const +find(_CharT __c, size_type __pos = 0) const _GLIBCXX_NOEXCEPT { return _Base::find(__c, __pos); } size_type rfind(const basic_string& __str, size_type __pos = _Base::npos) const + _GLIBCXX_NOEXCEPT { return _Base::rfind(__str, __pos); } size_type @@ -730,11 +750,12 @@ } size_type -rfind(_CharT __c, size_type __pos = _Base::npos) const +rfind(_CharT __c, size_type __pos = _Base::npos) const _GLIBCXX_NOEXCEPT { return _Base::rfind(__c, __pos); } size_type find_first_of(const basic_string& __str, size_type __pos = 0) const + _GLIBCXX_NOEXCEPT { return _Base::find_first_of(__str, __pos); } size_type @@ -752,12 +773,12 @@ } size_type -find_first_of(_CharT __c, size_type __pos = 0) const +find_first_of(_CharT __c, size_type __pos = 0) const _GLIBCXX_NOEXCEPT { return _Base::find_first_of(__c, __pos); } size_type find_last_of(const basic_string& __str, -size_type __pos = _Base::npos) const +size_type __pos = _Base::npos) const _GLIBCXX_NOEXCEPT { return _Base::find_last_of(__str, __pos); } size_type @@ -776,10 +797,12 @@ size_type find_last_of(_CharT __c, size_type __pos = _Base::npos) const + _GLIBCXX_NOEXCEPT { return _Base::find_la
Re: Use i386/crtfastmath.c on Solaris 2/x86
On Tue, May 31, 2011 at 2:25 PM, Rainer Orth wrote: > The only complication is that I need to make sure that SSE insns are only > used if the host supports them. > > Bootstrapped without regressions on i386-pc-solaris2.8, > i386-pc-solaris2.9, i386-pc-solaris2.11, and sparc-sun-solaris2.11. > > The libgcc part depends on the toplevel libgcc patch, so actually > applying this patch will have to wait until that one is in. > > Ok for mainline? > > Rainer > > > 2011-05-28 Rainer Orth > > gcc: > * config/i386/crtfastmath.c [!__x86_64__ && __sun__ && __svr4__]: > Include , . > (sigill_caught): Define. > (sigill_hdlr): New function. > (set_fast_math) [!__x86_64__ && __sun__ && __svr4__]: Check if SSE > insns can be executed. > * config/sol2.h (ENDFILE_SPEC): Use crtfastmath.o if -ffast-math > etc. > * config/sparc/sol2.h (ENDFILE_SPEC): Remove. > > libgcc: > * config.host (i[34567]86-*-solaris2*): Add i386/t-crtfm to > tmake_file. > Add crtfastmath.o to extra_parts. Please just put "if (edx & bit_SSE)" part inside existing check. You will need to split assignment of mxcsr from the declaration, though. OK with this change. Thanks, Uros.
Re: [patch] Fix PR tree-optimization/49093
On Tue, May 31, 2011 at 2:31 PM, Ira Rosen wrote: > Hi, > > This patch fails vectorization for volatile data references. > > Bootstrapped on powerpc64-suse-linux and tested on > powerpc64-suse-linux and x86_64-suse-linux. > Applied to trunk. > OK for 4.6 after testing? Ok. Thanks, Richard. > Thanks, > Ira > > ChangeLog: > > PR tree-optimization/49093 > * tree-vect-data-refs.c (vect_analyze_data_refs): Fail for volatile > data references. > > testsuite/ChangeLog: > > PR tree-optimization/49093 > * gcc.dg/vect/pr49093.c: New test. > > Index: testsuite/gcc.dg/vect/pr49093.c > === > --- testsuite/gcc.dg/vect/pr49093.c (revision 0) > +++ testsuite/gcc.dg/vect/pr49093.c (revision 0) > @@ -0,0 +1,22 @@ > +/* { dg-do compile } */ > +/* { dg-options "-O1 -ftree-vectorize -fdump-tree-vect-details > -fno-tree-fre" } */ > + > +volatile unsigned char g_324[4] = {0, 1, 0, 1}; > +void foo (int); > +int x, y; > +void func_81(void) > +{ > + int l_466, l_439[7] = {0}, g_97; > +lbl_473: > + if (x) { > + for (g_97 = 0; (g_97 < 4); ++g_97) { > + if (y) > + goto lbl_473; > + g_324[g_97]; > + l_466 = l_439[g_97]; > + } > + foo(l_466); > + } > +} > + > +/* { dg-final { cleanup-tree-dump "vect" } } */ > Index: tree-vect-data-refs.c > === > --- tree-vect-data-refs.c (revision 174471) > +++ tree-vect-data-refs.c (working copy) > @@ -2584,6 +2584,16 @@ vect_analyze_data_refs (loop_vec_info loop_vinfo, > return false; > } > > + if (TREE_THIS_VOLATILE (DR_REF (dr))) > + { > + if (vect_print_dump_info (REPORT_UNVECTORIZED_LOCATIONS)) > + { > + fprintf (vect_dump, "not vectorized: volatile type "); > + print_gimple_stmt (vect_dump, stmt, 0, TDF_SLIM); > + } > + return false; > + } > + > base = unshare_expr (DR_BASE_ADDRESS (dr)); > offset = unshare_expr (DR_OFFSET (dr)); > init = unshare_expr (DR_INIT (dr)); >
[patch] Fix PR tree-optimization/49093
Hi, This patch fails vectorization for volatile data references. Bootstrapped on powerpc64-suse-linux and tested on powerpc64-suse-linux and x86_64-suse-linux. Applied to trunk. OK for 4.6 after testing? Thanks, Ira ChangeLog: PR tree-optimization/49093 * tree-vect-data-refs.c (vect_analyze_data_refs): Fail for volatile data references. testsuite/ChangeLog: PR tree-optimization/49093 * gcc.dg/vect/pr49093.c: New test. Index: testsuite/gcc.dg/vect/pr49093.c === --- testsuite/gcc.dg/vect/pr49093.c (revision 0) +++ testsuite/gcc.dg/vect/pr49093.c (revision 0) @@ -0,0 +1,22 @@ +/* { dg-do compile } */ +/* { dg-options "-O1 -ftree-vectorize -fdump-tree-vect-details -fno-tree-fre" } */ + +volatile unsigned char g_324[4] = {0, 1, 0, 1}; +void foo (int); +int x, y; +void func_81(void) +{ +int l_466, l_439[7] = {0}, g_97; +lbl_473: +if (x) { +for (g_97 = 0; (g_97 < 4); ++g_97) { +if (y) + goto lbl_473; +g_324[g_97]; +l_466 = l_439[g_97]; +} +foo(l_466); +} +} + +/* { dg-final { cleanup-tree-dump "vect" } } */ Index: tree-vect-data-refs.c === --- tree-vect-data-refs.c (revision 174471) +++ tree-vect-data-refs.c (working copy) @@ -2584,6 +2584,16 @@ vect_analyze_data_refs (loop_vec_info loop_vinfo, return false; } + if (TREE_THIS_VOLATILE (DR_REF (dr))) +{ + if (vect_print_dump_info (REPORT_UNVECTORIZED_LOCATIONS)) +{ + fprintf (vect_dump, "not vectorized: volatile type "); + print_gimple_stmt (vect_dump, stmt, 0, TDF_SLIM); +} + return false; +} + base = unshare_expr (DR_BASE_ADDRESS (dr)); offset = unshare_expr (DR_OFFSET (dr)); init = unshare_expr (DR_INIT (dr));
Re: [PATCH] Fix ICE with TARGET_MEM_REF (PR rtl-optimization/49235)
On Tue, May 31, 2011 at 2:18 PM, Jakub Jelinek wrote: > Hi! > > Since richi's create_mem_ref_raw change to avoid creating TMR if base > is not ADDR_EXPR we can ICE, if base is NULL and offset some non-zero > constant, because (plus:DI (const_int 0) (const_int 16)) is created. > The second hunk fixes it by not adding the 0 in (the routine ends with > if (!*addr) *addr = const0_rtx; so it is fine not to add it), the third > hunk is an alternative fix, because it is fine to create MEM_REF in that > case too (both base and offset are constants). Either of the hunks > fixes this, but it doesn't hurt to put in both. > > Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? Ok. Thanks, Richard. > 2011-05-31 Jakub Jelinek > > PR rtl-optimization/49235 > * tree-ssa-address.c (gen_addr_rtx): Ignore base if it is const0_rtx. > (create_mem_ref_raw): Create MEM_REF even if base is INTEGER_CST. > > * gcc.dg/pr49235.c: New test. > > --- gcc/tree-ssa-address.c.jj 2011-05-31 08:03:10.0 +0200 > +++ gcc/tree-ssa-address.c 2011-05-31 09:34:21.0 +0200 > @@ -1,5 +1,5 @@ > /* Memory address lowering and addressing mode selection. > - Copyright (C) 2004, 2006, 2007, 2008, 2009, 2010 > + Copyright (C) 2004, 2006, 2007, 2008, 2009, 2010, 2011 > Free Software Foundation, Inc. > > This file is part of GCC. > @@ -129,7 +129,7 @@ gen_addr_rtx (enum machine_mode address_ > *addr = act_elem; > } > > - if (base) > + if (base && base != const0_rtx) > { > if (*addr) > *addr = simplify_gen_binary (PLUS, address_mode, base, *addr); > @@ -365,7 +365,7 @@ create_mem_ref_raw (tree type, tree alia > ??? As IVOPTs does not follow restrictions to where the base > pointer may point to create a MEM_REF only if we know that > base is valid. */ > - if (TREE_CODE (base) == ADDR_EXPR > + if ((TREE_CODE (base) == ADDR_EXPR || TREE_CODE (base) == INTEGER_CST) > && (!index2 || integer_zerop (index2)) > && (!addr->index || integer_zerop (addr->index))) > return fold_build2 (MEM_REF, type, base, addr->offset); > --- gcc/testsuite/gcc.dg/pr49235.c.jj 2011-05-31 09:42:50.0 +0200 > +++ gcc/testsuite/gcc.dg/pr49235.c 2011-05-31 09:40:02.0 +0200 > @@ -0,0 +1,25 @@ > +/* PR rtl-optimization/49235 */ > +/* { dg-do compile { target { int32plus } } } */ > +/* { dg-options "-O -fno-delete-null-pointer-checks -fno-tree-scev-cprop > -ftree-vectorize -fno-vect-cost-model -w" } */ > + > +void > +foo (void) > +{ > + unsigned i; > + unsigned *p = 0; > + for (i = 0; i < 4; ++i) > + *p++ = 0; > + for (i = 0; i < 4; ++i) > + *p++ = 0; > +} > + > +void > +bar (void) > +{ > + unsigned i; > + unsigned *p = (unsigned *) (__UINTPTR_TYPE__) 0x1234; > + for (i = 0; i < 4; ++i) > + *p++ = 0; > + for (i = 0; i < 4; ++i) > + *p++ = 0; > +} > > Jakub >
Use i386/crtfastmath.c on Solaris 2/x86
I had long meant to support -fast-math on Solaris 2/x86. While working on the Solaris toplevel libgcc move, I've done it with the following patch. The only complication is that I need to make sure that SSE insns are only used if the host supports them. Bootstrapped without regressions on i386-pc-solaris2.8, i386-pc-solaris2.9, i386-pc-solaris2.11, and sparc-sun-solaris2.11. The libgcc part depends on the toplevel libgcc patch, so actually applying this patch will have to wait until that one is in. Ok for mainline? Rainer 2011-05-28 Rainer Orth gcc: * config/i386/crtfastmath.c [!__x86_64__ && __sun__ && __svr4__]: Include , . (sigill_caught): Define. (sigill_hdlr): New function. (set_fast_math) [!__x86_64__ && __sun__ && __svr4__]: Check if SSE insns can be executed. * config/sol2.h (ENDFILE_SPEC): Use crtfastmath.o if -ffast-math etc. * config/sparc/sol2.h (ENDFILE_SPEC): Remove. libgcc: * config.host (i[34567]86-*-solaris2*): Add i386/t-crtfm to tmake_file. Add crtfastmath.o to extra_parts. diff --git a/gcc/config/i386/crtfastmath.c b/gcc/config/i386/crtfastmath.c --- a/gcc/config/i386/crtfastmath.c +++ b/gcc/config/i386/crtfastmath.c @@ -1,5 +1,5 @@ /* - * Copyright (C) 2005, 2007, 2009 Free Software Foundation, Inc. + * Copyright (C) 2005, 2007, 2009, 2011 Free Software Foundation, Inc. * * This file is free software; you can redistribute it and/or modify it * under the terms of the GNU General Public License as published by the @@ -30,6 +30,26 @@ #include "cpuid.h" #endif +#if !defined __x86_64 && defined __sun__ && defined __svr4__ +#include +#include + +static volatile sig_atomic_t sigill_caught; + +static void +sigill_hdlr (int sig __attribute((unused)), +siginfo_t *sip __attribute__((unused)), +ucontext_t *ucp) +{ + sigill_caught = 1; + /* Set PC to the instruction after the faulting one to skip over it, + otherwise we enter an infinite loop. 4 is the size of the stmxcsr + instruction. */ + ucp->uc_mcontext.gregs[EIP] += 4; + setcontext (ucp); +} +#endif + static void __attribute__((constructor)) #ifndef __x86_64__ /* The i386 ABI only requires 4-byte stack alignment, so this is necessary @@ -45,6 +65,32 @@ set_fast_math (void) if (!__get_cpuid (1, &eax, &ebx, &ecx, &edx)) return; +#if defined __sun__ && defined __svr4__ + /* Solaris 2 before Solaris 9 4/04 cannot execute SSE instructions even + if the CPU supports them. Programs receive SIGILL instead, so check + for that at runtime. */ + + if (edx & bit_SSE) +{ + struct sigaction act, oact; + + act.sa_handler = sigill_hdlr; + sigemptyset (&act.sa_mask); + /* Need to set SA_SIGINFO so a ucontext_t * is passed to the handler. */ + act.sa_flags = SA_SIGINFO; + sigaction (SIGILL, &act, &oact); + + /* We need a single SSE instruction here so the handler can safely skip +over it. */ + __asm__ volatile ("movss %xmm2,%xmm1"); + + sigaction (SIGILL, &oact, NULL); + + if (sigill_caught) + return; +} +#endif /* __sun__ && __svr4__ */ + if (edx & bit_SSE) { unsigned int mxcsr = __builtin_ia32_stmxcsr (); diff --git a/gcc/config/sol2.h b/gcc/config/sol2.h --- a/gcc/config/sol2.h +++ b/gcc/config/sol2.h @@ -141,7 +141,9 @@ along with GCC; see the file COPYING3. %{p|pg:-ldl} -lc}" #undef ENDFILE_SPEC -#define ENDFILE_SPEC "crtend.o%s crtn.o%s" +#define ENDFILE_SPEC \ + "%{Ofast|ffast-math|funsafe-math-optimizations:crtfastmath.o%s} \ + crtend.o%s crtn.o%s" /* We don't use the standard svr4 STARTFILE_SPEC because it's wrong for us. */ #undef STARTFILE_SPEC diff --git a/gcc/config/sparc/sol2.h b/gcc/config/sparc/sol2.h --- a/gcc/config/sparc/sol2.h +++ b/gcc/config/sparc/sol2.h @@ -117,11 +117,6 @@ along with GCC; see the file COPYING3. #define NO_DBX_BNSYM_ENSYM 1 -#undef ENDFILE_SPEC -#define ENDFILE_SPEC \ - "%{Ofast|ffast-math|funsafe-math-optimizations:crtfastmath.o%s} \ - crtend.o%s crtn.o%s" - /* Select a format to encode pointers in exception handling data. CODE is 0 for data, 1 for code labels, 2 for function pointers. GLOBAL is true if the symbol may be affected by dynamic relocations. diff --git a/libgcc/config.host b/libgcc/config.host --- a/libgcc/config.host +++ b/libgcc/config.host @@ -338,6 +338,8 @@ i[34567]86-*-rtems*) tmake_file="${tmake_file} t-crtin i386/t-softfp i386/t-crtstuff t-rtems" ;; i[34567]86-*-solaris2*) + tmake_file="$tmake_file i386/t-crtfm" + extra_parts="$extra_parts crtfastmath.o" ;; i[4567]86-wrs-vxworks|i[4567]86-wrs-vxworksae) ;; -- - Rainer Orth, Center for Biotechnology, Bielefeld University
Scheduler tweaks relating to speculation
I'm working on a patch to allow the scheduler to move more insns backwards across a jump by using predication. Currently I'm using a slightly extended form of the speculation support, adding another bit to TODO_SPEC. The following preliminary patch makes a few changes in that area in preparation, and also cleans up a few things: * Remove pointless use of "*ts", replace with computing a local variable and storing it in TODO_SPEC when done. * TODO_SPEC can have only three forms, either it is zero, or HARD_DEP, or it contains bits in SPECULATIVE (but not HARD_DEP). The current code sometimes goes through logical operations to compute a known value, HARD_DEP or zero. I found that hard to read and understand initially, so I've changed it. * New function dep_spec_p which decides which of the lists a dep should be on. * Allow insns to be scheduled before all their dependencies are resolved (i.e. don't crash when freeing the deps). * Using that, don't delete deps for debug insns, resolve them instead, and remove the big pointless comment about how we're not freeing dependencies during scheduling. This is intended for later improving backtracking in the presence of debug insns. * When not using USE_DEPS_LIST (why aren't we using it always?) or DO_SPECULATION, we should use zero to initialize DEP_STATUS, so that there aren't any bits in SPECULATIVE. Bootstrapped and tested on i686-linux. I've also built a cross-compiler to ia64-linux to verify that generated code is identical (using both -O2 and -O2 -fselective-scheduling -fselective-scheduling2). Ok? Bernd * haifa-sched.c (schedule_insns): Don't delete debug insn dependencies, resolve them normally. Remove outdated comment. (schedule_block): When computing a known value for TODO_SPEC, just set it rather than using logical operations. (try_ready): Likewise. Use a local variable rather than a pointer to TODO_SPEC. Reorder an if statement to move the easy case to the then block. * sched-deps.c (dep_spec_p): New static function. (update_dep): Use it to decide whether to call change_spec_dep_to_hard. (get_back_and_forw_lists): Use it. (sd_resolve_dep): Likewise. (init_dep): If !USE_DEPS_LIST, use zero to initialize status. (haifa_note_mem_dep): Likewise. (check_dep): Likewise. (sd_add_dep): Also clear SPECULATIVE bits if not DO_SPECULATION. (sched_free_deps): Free in two passes. Index: trunk/gcc/haifa-sched.c === --- trunk.orig/gcc/haifa-sched.c +++ trunk/gcc/haifa-sched.c @@ -1722,10 +1722,7 @@ schedule_insn (rtx insn) } INSN_REG_USE_LIST (dbg) = NULL; - /* We delete rather than resolve these deps, otherwise we - crash in sched_free_deps(), because forward deps are - expected to be released before backward deps. */ - sd_delete_dep (sd_it); + sd_resolve_dep (sd_it); } gcc_assert (QUEUE_INDEX (insn) == QUEUE_NOWHERE); @@ -1778,18 +1775,6 @@ schedule_insn (rtx insn) } } - /* This is the place where scheduler doesn't *basically* need backward and - forward dependencies for INSN anymore. Nevertheless they are used in - heuristics in rank_for_schedule (), early_queue_to_ready () and in - some targets (e.g. rs6000). Thus the earliest place where we *can* - remove dependencies is after targetm.sched.finish () call in - schedule_block (). But, on the other side, the safest place to remove - dependencies is when we are finishing scheduling entire region. As we - don't generate [many] dependencies during scheduling itself, we won't - need memory until beginning of next region. - Bottom line: Dependencies are removed for all insns in the end of - scheduling the region. */ - /* Annotate the instruction with issue information -- TImode indicates that the instruction is expected not to be able to issue on the same cycle as the previous insn. A machine @@ -3217,7 +3202,7 @@ schedule_block (basic_block *target_bb) /* We normally get here only if we don't want to move insn from the split block. */ { - TODO_SPEC (insn) = (TODO_SPEC (insn) & ~SPECULATIVE) | HARD_DEP; + TODO_SPEC (insn) = HARD_DEP; goto restart_choose_ready; } @@ -3292,7 +3277,7 @@ schedule_block (basic_block *target_bb) x = ready_element (&ready, i); QUEUE_INDEX (x) = QUEUE_NOWHERE; - TODO_SPEC (x) = (TODO_SPEC (x) & ~SPECULATIVE) | HARD_DEP; + TODO_SPEC (x) = HARD_DEP; } if (q_size) @@ -3305,7 +3290,7 @@ schedule_block (basic_block *target_bb) x = XEXP (link, 0); QUEUE_INDEX (x) = QUEUE_NOWHERE; - TODO_SPEC (x) = (TODO_SPEC (x) & ~SPECULATIVE) | HARD_
[PATCH] Fix ICE with TARGET_MEM_REF (PR rtl-optimization/49235)
Hi! Since richi's create_mem_ref_raw change to avoid creating TMR if base is not ADDR_EXPR we can ICE, if base is NULL and offset some non-zero constant, because (plus:DI (const_int 0) (const_int 16)) is created. The second hunk fixes it by not adding the 0 in (the routine ends with if (!*addr) *addr = const0_rtx; so it is fine not to add it), the third hunk is an alternative fix, because it is fine to create MEM_REF in that case too (both base and offset are constants). Either of the hunks fixes this, but it doesn't hurt to put in both. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2011-05-31 Jakub Jelinek PR rtl-optimization/49235 * tree-ssa-address.c (gen_addr_rtx): Ignore base if it is const0_rtx. (create_mem_ref_raw): Create MEM_REF even if base is INTEGER_CST. * gcc.dg/pr49235.c: New test. --- gcc/tree-ssa-address.c.jj 2011-05-31 08:03:10.0 +0200 +++ gcc/tree-ssa-address.c 2011-05-31 09:34:21.0 +0200 @@ -1,5 +1,5 @@ /* Memory address lowering and addressing mode selection. - Copyright (C) 2004, 2006, 2007, 2008, 2009, 2010 + Copyright (C) 2004, 2006, 2007, 2008, 2009, 2010, 2011 Free Software Foundation, Inc. This file is part of GCC. @@ -129,7 +129,7 @@ gen_addr_rtx (enum machine_mode address_ *addr = act_elem; } - if (base) + if (base && base != const0_rtx) { if (*addr) *addr = simplify_gen_binary (PLUS, address_mode, base, *addr); @@ -365,7 +365,7 @@ create_mem_ref_raw (tree type, tree alia ??? As IVOPTs does not follow restrictions to where the base pointer may point to create a MEM_REF only if we know that base is valid. */ - if (TREE_CODE (base) == ADDR_EXPR + if ((TREE_CODE (base) == ADDR_EXPR || TREE_CODE (base) == INTEGER_CST) && (!index2 || integer_zerop (index2)) && (!addr->index || integer_zerop (addr->index))) return fold_build2 (MEM_REF, type, base, addr->offset); --- gcc/testsuite/gcc.dg/pr49235.c.jj 2011-05-31 09:42:50.0 +0200 +++ gcc/testsuite/gcc.dg/pr49235.c 2011-05-31 09:40:02.0 +0200 @@ -0,0 +1,25 @@ +/* PR rtl-optimization/49235 */ +/* { dg-do compile { target { int32plus } } } */ +/* { dg-options "-O -fno-delete-null-pointer-checks -fno-tree-scev-cprop -ftree-vectorize -fno-vect-cost-model -w" } */ + +void +foo (void) +{ + unsigned i; + unsigned *p = 0; + for (i = 0; i < 4; ++i) +*p++ = 0; + for (i = 0; i < 4; ++i) +*p++ = 0; +} + +void +bar (void) +{ + unsigned i; + unsigned *p = (unsigned *) (__UINTPTR_TYPE__) 0x1234; + for (i = 0; i < 4; ++i) +*p++ = 0; + for (i = 0; i < 4; ++i) +*p++ = 0; +} Jakub
[PATCH][RFC] Init sizetypes based on target defs
This initializes sizetypes correctly from the start, using target definitions available. All Frontends initialize sizetypes from size_type_node for which there is a target macro SIZE_TYPE which tells what type to use for this (C runtime ABI) type. Now, there are two frontends who do not honor SIZE_TYPE but have an idea on its own. That's Java (probably by accident) and Ada (of course). Java does /* This is not a java type, however tree-dfa requires a definition for size_type_node. */ size_type_node = make_unsigned_type (POINTER_SIZE); set_sizetype (size_type_node); so the FE itself doesn't care and POINTER_SIZE for almost all targets yields the same result as following the SIZE_TYPE advice. Ada has its own idea and thinks it can choose size_t freely, /* In Ada, we use the unsigned type corresponding to the width of Pmode as SIZETYPE. In most cases when ptr_mode and Pmode differ, C will use the width of ptr_mode for SIZETYPE, but we get better code using the width of Pmode. Note that, although we manipulate negative offsets for some internal constructs and rely on compile time overflow detection in size computations, using unsigned types for SIZETYPEs is fine since they are treated specially by the middle-end, in particular sign-extended. */ size_type_node = gnat_type_for_mode (Pmode, 1); set_sizetype (size_type_node); TYPE_NAME (sizetype) = get_identifier ("size_type"); hmm, yes. Again practically for most targets size_t will be following its SIZE_TYPE advice, but surely not for all. OTOH while the above clearly doesn't look "accidential", it certainly looks wrong. If not for sizetype then at least for size_type_node. The comment hints that the patch at most will no longer "get better code", but if Pmode gets better code when used for sizetype(!) then we should do so unconditionally and could get rid of the size_t reverse-engineering in initialize_sizetypes completely (m32c might disagree here). Not yet bootstrapped or tested (but I don't expect any issues other than eventual typos on the targets I have access to). Now, any objections? (Patch to be adjusted to really remove all set_sizetype calls) Thanks, Richard. 2011-05-31 Richard Guenther * stor-layout.c (initialize_sizetypes): Initialize all sizetypes based on target definitions. (set_sizetype): Remove. Index: gcc/stor-layout.c === --- gcc/stor-layout.c (revision 174469) +++ gcc/stor-layout.c (working copy) @@ -2189,95 +2189,83 @@ make_accum_type (int precision, int unsi return type; } -/* Initialize sizetype and bitsizetype to a reasonable and temporary - value to enable integer types to be created. */ +/* Initialize sizetypes so layout_type can use them. */ void initialize_sizetypes (void) { - tree t = make_node (INTEGER_TYPE); - int precision = GET_MODE_BITSIZE (SImode); + tree t; + int precision; - SET_TYPE_MODE (t, SImode); - TYPE_ALIGN (t) = GET_MODE_ALIGNMENT (SImode); - TYPE_IS_SIZETYPE (t) = 1; + if (strcmp (SIZE_TYPE, "unsigned int") == 0) +precision = INT_TYPE_SIZE; + else if (strcmp (SIZE_TYPE, "long unsigned int") == 0) +precision = LONG_TYPE_SIZE; + else if (strcmp (SIZE_TYPE, "long long unsigned int") == 0) +precision = LONG_LONG_TYPE_SIZE; + else +gcc_unreachable (); + + t = make_node (INTEGER_TYPE); + TYPE_NAME (t) = get_identifier ("sizetype"); + TYPE_PRECISION (t) = precision; TYPE_UNSIGNED (t) = 1; + TYPE_IS_SIZETYPE (t) = 1; + /* Layout sizetype manually. */ + SET_TYPE_MODE (t, smallest_mode_for_size (TYPE_PRECISION (t), MODE_INT)); + TYPE_ALIGN (t) = GET_MODE_ALIGNMENT (TYPE_MODE (t)); TYPE_SIZE (t) = build_int_cst (t, precision); - TYPE_SIZE_UNIT (t) = build_int_cst (t, GET_MODE_SIZE (SImode)); - TYPE_PRECISION (t) = precision; - + TYPE_SIZE_UNIT (t) = build_int_cst (t, GET_MODE_SIZE (TYPE_MODE (t))); set_min_and_max_values_for_integral_type (t, precision, /*is_unsigned=*/true); - + /* sizetype is unsigned but we need to fix TYPE_MAX_VALUE so that it is + sign-extended in a way consistent with force_fit_type. */ + TYPE_MAX_VALUE (t) += double_int_to_tree (t, tree_to_double_int (TYPE_MAX_VALUE (t))); sizetype = t; - bitsizetype = build_distinct_type_copy (t); -} -/* Make sizetype a version of TYPE, and initialize *sizetype accordingly. - We do this by overwriting the stub sizetype and bitsizetype nodes created - by initialize_sizetypes. This makes sure that (a) anything stubby about - them no longer exists and (b) any INTEGER_CSTs created with such a type, - remain valid. */ -void -set_sizetype (tree type) -{ - tree t, max; - int oprecision = TYPE_PRECISION (type); - /* The *bitsizetype types use a precision that avoids overflows when - calculating signed sizes / offsets in bits. However, when - cross-compiling f
Re: __sync_swap* with acq/rel/full memory barrier semantics
On Mon, May 30, 2011 at 04:07:09PM -0400, Andrew MacLeod wrote: > On 05/23/2011 07:05 PM, Joseph S. Myers wrote: > >On Mon, 23 May 2011, Aldy Hernandez wrote: > > > >>This is a patch implementing builtins for an atomic exchange with full, > >>acquire, and release memory barrier semantics. It is similar to > >>__sync_lock_test_and_set(), but the target does not have the option of > >>implementing a reduced functionality of only implementing a store of 1. > >>Also, > >>unlike __sync_lock_test_and_set(), we have all three memory barrier > >>variants. > >What's the reason you've implemented three variants, rather than six (the > >C1X/C++0X atomics have six memory order values) or one built-in function > >taking a memory order parameter? More generally, what is the underlying > >design here for how built-in functions should cover the whole of the new > >atomics functionality in C1X and C++0X? > > Aldy was just too excited about working on memory model I think :-) > > I've been looking at this, and I propose we go this way : > > http://gcc.gnu.org/wiki/Atomic/GCCMM/CodeGen > > Please feel free to criticize, comment on, or ask for > clarification. I usually miss something I meant to get across. I think the addition of new __sync_* builtins for the different models is preferrable and would be generally more usable even for other users than C++ atomics. On some targets any atomic insn will act as a full barrier, while on others it could generate different insns or code sequences that way. For OpenMP atomics having a none (in addition to full/acq/rel) would be useful, I think #pragma omp atomic doesn't impose any ordering on memory accesses other than the memory being atomically read/written/changed. Haven't read the C++0x standard in detail why it has 6 memory order modes instead of just 4, but if really 6 are needed (even for 4 probably), having new builtins with just one constant extra argument which says the memory ordering mode would be best. Jakub
Re: [ARM] TLS Descriptor support
On 05/27/11 01:47, Ramana Radhakrishnan wrote: Could you consider adding a check in the configury to test if a binutils version of recent vintage is being used when --with-tls=gnu is in ? I thought about that and it didn't seem worth it. We're not autodetecting whether to default to gnu-style tls and you'll find out soon enough if your binutils is too old. Could you also use R0_REGNUM, R1_REGNUM instead of 0 and 1 in the "tlscall" pattern ? This patch has been tested for both default arm and default gnu tls schemes using the gcc and glibc testsuites for an arm-linux-gnueabi target. Presumably for v7-a and v5te and with this as default ? On hardware ? Hm, I see our testing for fsf build uses qemu -- this patch has been tested on hardware in our releases. Just not this exact version of the patch. nathan -- Nathan Sidwell
Re: lto-cgraph cleanups
On Tue, 31 May 2011, Jan Hubicka wrote: > Hi, > this patch makes lto-cgraph to use the new output_enum/var_len functions > in some obvious places where it fits. It also fixes a problem where > resolution was streamed in wrong order that is maked by our current bitpack > implementation. > > One problem is streaming resolution enums. I am not sure I want to add the > last argument into ld-plugin-api header, so instead I just keep constant about > last code known to GCC. As plugin API will envolve we will have more > resoltion > types, but GCC should always get only those it knows about. > > Bootstrapped/regtested x86_64-linux, OK? Ok. Thanks, Richard. > Honza > > * cgraph.h (cgraph_inline_failed_t): Give enum a name > * lto-cgraph.c (LDPR_NUM_KNOWN): New macro. > (LTO_cgraph_tags): Add LTO_cgraph_last_tag. > (lto_output_edge): Use output_enum and var_len_unsigned. > (lto_output_varpool_node): Likewise. > (input_overwrite_node): Do not take resolution parameter; > extract it from a bitpack. > (input_node): Do not read resolution; use input_enum and > var_len_unsigned. > (input_varpool_node): Likewise. > (input_edge): Likewise. > (input_cgraph_1): Likewise. > Index: cgraph.h > === > *** cgraph.h (revision 174393) > --- cgraph.h (working copy) > *** typedef struct > *** 304,310 > > #define DEFCIFCODE(code, string)CIF_ ## code, > /* Reasons for inlining failures. */ > ! typedef enum { > #include "cif-code.def" > CIF_N_REASONS > } cgraph_inline_failed_t; > --- 304,310 > > #define DEFCIFCODE(code, string)CIF_ ## code, > /* Reasons for inlining failures. */ > ! typedef enum cgraph_inline_failed_enum { > #include "cif-code.def" > CIF_N_REASONS > } cgraph_inline_failed_t; > Index: lto-cgraph.c > === > *** lto-cgraph.c (revision 174393) > --- lto-cgraph.c (working copy) > *** static void output_varpool (cgraph_node_ > *** 49,54 > --- 49,56 > static void output_cgraph_opt_summary (cgraph_node_set set); > static void input_cgraph_opt_summary (VEC (cgraph_node_ptr, heap) * nodes); > > + /* Number of LDPR values known to GCC. */ > + #define LDPR_NUM_KNOWN (LDPR_RESOLVED_DYN + 1) > > /* Cgraph streaming is organized as set of record whose type > is indicated by a tag. */ > *** enum LTO_cgraph_tags > *** 62,68 > LTO_cgraph_analyzed_node, > /* Cgraph edges. */ > LTO_cgraph_edge, > ! LTO_cgraph_indirect_edge > }; > > /* Create a new cgraph encoder. */ > --- 63,70 > LTO_cgraph_analyzed_node, > /* Cgraph edges. */ > LTO_cgraph_edge, > ! LTO_cgraph_indirect_edge, > ! LTO_cgraph_last_tag > }; > > /* Create a new cgraph encoder. */ > *** lto_output_edge (struct lto_simple_outpu > *** 262,270 > struct bitpack_d bp; > > if (edge->indirect_unknown_callee) > ! lto_output_uleb128_stream (ob->main_stream, LTO_cgraph_indirect_edge); > else > ! lto_output_uleb128_stream (ob->main_stream, LTO_cgraph_edge); > > ref = lto_cgraph_encoder_lookup (encoder, edge->caller); > gcc_assert (ref != LCC_NOT_FOUND); > --- 264,274 > struct bitpack_d bp; > > if (edge->indirect_unknown_callee) > ! lto_output_enum (ob->main_stream, LTO_cgraph_tags, LTO_cgraph_last_tag, > ! LTO_cgraph_indirect_edge); > else > ! lto_output_enum (ob->main_stream, LTO_cgraph_tags, LTO_cgraph_last_tag, > ! LTO_cgraph_edge); > > ref = lto_cgraph_encoder_lookup (encoder, edge->caller); > gcc_assert (ref != LCC_NOT_FOUND); > *** lto_output_edge (struct lto_simple_outpu > *** 282,290 > bp = bitpack_create (ob->main_stream); > uid = (!gimple_has_body_p (edge->caller->decl) >? edge->lto_stmt_uid : gimple_uid (edge->call_stmt)); > ! bp_pack_value (&bp, uid, HOST_BITS_PER_INT); > ! bp_pack_value (&bp, edge->inline_failed, HOST_BITS_PER_INT); > ! bp_pack_value (&bp, edge->frequency, HOST_BITS_PER_INT); > bp_pack_value (&bp, edge->indirect_inlining_edge, 1); > bp_pack_value (&bp, edge->call_stmt_cannot_inline_p, 1); > bp_pack_value (&bp, edge->can_throw_external, 1); > --- 286,295 > bp = bitpack_create (ob->main_stream); > uid = (!gimple_has_body_p (edge->caller->decl) >? edge->lto_stmt_uid : gimple_uid (edge->call_stmt)); > ! bp_pack_enum (&bp, cgraph_inline_failed_enum, > ! CIF_N_REASONS, edge->inline_failed); > ! bp_pack_var_len_unsigned (&bp, uid); > ! bp_pack_var_len_unsigned (&bp, edge->frequency); > bp_pack_value (&bp, edge->indirect_inlining_edge, 1); > bp_pack_value (&bp, edge->call_stmt_cannot_inline_p, 1); > bp_pack_value (&bp, edge->can_throw_external, 1); > *
lto-cgraph cleanups
Hi, this patch makes lto-cgraph to use the new output_enum/var_len functions in some obvious places where it fits. It also fixes a problem where resolution was streamed in wrong order that is maked by our current bitpack implementation. One problem is streaming resolution enums. I am not sure I want to add the last argument into ld-plugin-api header, so instead I just keep constant about last code known to GCC. As plugin API will envolve we will have more resoltion types, but GCC should always get only those it knows about. Bootstrapped/regtested x86_64-linux, OK? Honza * cgraph.h (cgraph_inline_failed_t): Give enum a name * lto-cgraph.c (LDPR_NUM_KNOWN): New macro. (LTO_cgraph_tags): Add LTO_cgraph_last_tag. (lto_output_edge): Use output_enum and var_len_unsigned. (lto_output_varpool_node): Likewise. (input_overwrite_node): Do not take resolution parameter; extract it from a bitpack. (input_node): Do not read resolution; use input_enum and var_len_unsigned. (input_varpool_node): Likewise. (input_edge): Likewise. (input_cgraph_1): Likewise. Index: cgraph.h === *** cgraph.h(revision 174393) --- cgraph.h(working copy) *** typedef struct *** 304,310 #define DEFCIFCODE(code, string) CIF_ ## code, /* Reasons for inlining failures. */ ! typedef enum { #include "cif-code.def" CIF_N_REASONS } cgraph_inline_failed_t; --- 304,310 #define DEFCIFCODE(code, string) CIF_ ## code, /* Reasons for inlining failures. */ ! typedef enum cgraph_inline_failed_enum { #include "cif-code.def" CIF_N_REASONS } cgraph_inline_failed_t; Index: lto-cgraph.c === *** lto-cgraph.c(revision 174393) --- lto-cgraph.c(working copy) *** static void output_varpool (cgraph_node_ *** 49,54 --- 49,56 static void output_cgraph_opt_summary (cgraph_node_set set); static void input_cgraph_opt_summary (VEC (cgraph_node_ptr, heap) * nodes); + /* Number of LDPR values known to GCC. */ + #define LDPR_NUM_KNOWN (LDPR_RESOLVED_DYN + 1) /* Cgraph streaming is organized as set of record whose type is indicated by a tag. */ *** enum LTO_cgraph_tags *** 62,68 LTO_cgraph_analyzed_node, /* Cgraph edges. */ LTO_cgraph_edge, ! LTO_cgraph_indirect_edge }; /* Create a new cgraph encoder. */ --- 63,70 LTO_cgraph_analyzed_node, /* Cgraph edges. */ LTO_cgraph_edge, ! LTO_cgraph_indirect_edge, ! LTO_cgraph_last_tag }; /* Create a new cgraph encoder. */ *** lto_output_edge (struct lto_simple_outpu *** 262,270 struct bitpack_d bp; if (edge->indirect_unknown_callee) ! lto_output_uleb128_stream (ob->main_stream, LTO_cgraph_indirect_edge); else ! lto_output_uleb128_stream (ob->main_stream, LTO_cgraph_edge); ref = lto_cgraph_encoder_lookup (encoder, edge->caller); gcc_assert (ref != LCC_NOT_FOUND); --- 264,274 struct bitpack_d bp; if (edge->indirect_unknown_callee) ! lto_output_enum (ob->main_stream, LTO_cgraph_tags, LTO_cgraph_last_tag, !LTO_cgraph_indirect_edge); else ! lto_output_enum (ob->main_stream, LTO_cgraph_tags, LTO_cgraph_last_tag, !LTO_cgraph_edge); ref = lto_cgraph_encoder_lookup (encoder, edge->caller); gcc_assert (ref != LCC_NOT_FOUND); *** lto_output_edge (struct lto_simple_outpu *** 282,290 bp = bitpack_create (ob->main_stream); uid = (!gimple_has_body_p (edge->caller->decl) ? edge->lto_stmt_uid : gimple_uid (edge->call_stmt)); ! bp_pack_value (&bp, uid, HOST_BITS_PER_INT); ! bp_pack_value (&bp, edge->inline_failed, HOST_BITS_PER_INT); ! bp_pack_value (&bp, edge->frequency, HOST_BITS_PER_INT); bp_pack_value (&bp, edge->indirect_inlining_edge, 1); bp_pack_value (&bp, edge->call_stmt_cannot_inline_p, 1); bp_pack_value (&bp, edge->can_throw_external, 1); --- 286,295 bp = bitpack_create (ob->main_stream); uid = (!gimple_has_body_p (edge->caller->decl) ? edge->lto_stmt_uid : gimple_uid (edge->call_stmt)); ! bp_pack_enum (&bp, cgraph_inline_failed_enum, ! CIF_N_REASONS, edge->inline_failed); ! bp_pack_var_len_unsigned (&bp, uid); ! bp_pack_var_len_unsigned (&bp, edge->frequency); bp_pack_value (&bp, edge->indirect_inlining_edge, 1); bp_pack_value (&bp, edge->call_stmt_cannot_inline_p, 1); bp_pack_value (&bp, edge->can_throw_external, 1); *** lto_output_node (struct lto_simple_outpu *** 415,421 else tag = LTO_cgraph_unavail_node; ! lto_output_uleb128_stream (ob->main_stream, tag); /* In WPA mode, we only output part of the call-graph. Also, we fake cgraph node attribute
Re: [build] Move MD_UNWIND_SUPPORT to toplevel libgcc
On 05/31/2011 11:30 AM, Kai Tietz wrote: The issue is that in combination of 32-bit and 64-bit we need to default here to SjLj, Ok, then what you're testing is actually whether you're using sjlj or dw2 unwinding. config/i386/cygming.h will ensure that this is the same as testing TARGET_BI_ARCH and TARGET_64BIT_DEFAULT. Then you can just pass EH_MODEL or SHLIB_SONAME down to libgcc via libgcc.mvars, and look at it in libgcc/config.host to pick the appropriate header. Paolo
Re: [build] Move MD_UNWIND_SUPPORT to toplevel libgcc
2011/5/31 Paolo Bonzini : > On 05/30/2011 07:54 PM, Kai Tietz wrote: >> >> > -/* For 64-bit Windows we can't use DW2 unwind info. Also for multilib >> > - builds we can't use it, too. */ >> > -#if !TARGET_64BIT_DEFAULT&& !defined (TARGET_BI_ARCH) >> > -#define MD_UNWIND_SUPPORT "config/i386/w32-unwind.h" >> > -#endif >> > - >> > /* This matches SHLIB_SONAME and SHLIB_SOVERSION in t-cygming. */ >> > /* This matches SHLIB_SONAME and SHLIB_SOVERSION in t-cygwin. */ >> > #if DWARF2_UNWIND_INFO >> >> mingw part is not ok, as it breaks 32-bit defaulted multilib version >> compiler. > > Can you explain what is going on here? Could it be fixed by wrapping > w32-unwind.h in a #ifdef __x86_64__? To wrap it into __x86_64__ won't help. The issue is that in combination of 32-bit and 64-bit we need to default here to SjLj, as 64-bit doesn't support dw2 unwinding stuff and uses here internally instead SEH. So if target is 32-bit default, but a multilib version is used, we can't use w32-unwind.h. The line of interest is "#if !TARGET_64BIT_DEFAULT && !defined (TARGET_BI_ARCH)", which says: if target is 64-bit then don't use w32-unwind.h. if we are building for multilib then don't use w32-unwind.h. Well, wrapping header with __x64_64__ might helper partial. . But this might be worth a try. Nevertheless I assume that then at least produced DLL names for libgcc could get confused for their extensions. Rainer: It would be helpful, if you could try this. > Rainer, the same solution that is found for Windows should be used for > darwin, too. > > Paolo
Re: [PATH] PR/49139 fix always_inline failures diagnostics
On Tue, May 31, 2011 at 11:18:18AM +0200, Richard Guenther wrote: > The patch is not ok, we may not fail to inline an always_inline > function. To make this more consistent I proposed to warn > whenever you take the address of an always_inline function > (because then you can confuse GCC by indirectly calling > such function which we might inline dependent on optimization > setting and which we might discover we didn't inline only > dependent on optimization setting). Honza proposed to move That would warn on a lot of valid programs. Even #include void *readptr = read; would warn, because read is both extern and extern __inline __attribute__ ((__always_inline__, __artificial__, __gnu_inline__, __warn_unused_result__)) ssize_t read (int __fd, void *__buf, size_t __nbytes) { ... } wrapper. Similarly dozens of other functions. glibc relies on extern inline gnu_inline behavior there, if you take address, the extern is used instead of the inline. Jakub
Patch ping #2
Hi! - http://gcc.gnu.org/ml/gcc-patches/2011-05/msg01182.html various debug info improvements (typed DWARF stack etc.) - http://gcc.gnu.org/ml/gcc-patches/2011-05/msg01246.html optimize away unneeded DW_OP_GNU_convert ops with typed DWARF stack - http://gcc.gnu.org/ml/gcc-patches/2011-04/msg01669.html PR48688 optimization, I know Richard asked for trying it during combine, but that attempt failed due to opposite optimization Jakub
Re: [PATH] PR/49139 fix always_inline failures diagnostics
On Tue, May 31, 2011 at 9:54 AM, Christian Bruel wrote: > Hello, > > The attached patch fixes a few diagnostic discrepancies for always_inline > failures. > > Illustrated by the fail_always_inline[12].c attached cases, the current > behavior is one of: > > - success (with and without -Winline), silently not honoring always_inline > gcc fail_always_inline1.c -S -Winline -O0 -fpic > gcc fail_always_inline1.c -S -O2 -fpic > > - error: with -Winline but not without > gcc fail_always_inline1.c -S -Winline -O2 -fpic > > - error: without -Winline > gcc fail_always_inline2.c -S -fno-early-inlining -O2 > or the original c++ attachment in this defect > > note that -Winline never warns, as stated in the documentation > > This simple patch consistently emits a warning (changing the sorry > unimplemented message) whenever the attribute is not honored. > > My first will was to generate and error instead of the warning, but since it > is possible that inlines is only performed at LTO time, an error would be > inapropriate (Note that today this is not possible with -Winline that would > abort). > > Another alternative I considered would be to emit the warning under -Winline > rather than unconditionally, but this more a user misuse of the attribute, > so should always be warned anyway. Or maybe a new -Winline-always that would > be activated under -Wall ? Other opinion welcomed. > > Tested with standard bootstrap and regression on x86. > > Comments, and/or OK for trunk ? The patch is not ok, we may not fail to inline an always_inline function. To make this more consistent I proposed to warn whenever you take the address of an always_inline function (because then you can confuse GCC by indirectly calling such function which we might inline dependent on optimization setting and which we might discover we didn't inline only dependent on optimization setting). Honza proposed to move the sorry()ing to when we feel the need to output the always_inline function, thus when it was not optimized away, but that would require us not preserving the body (do we?) with -fpreserve-inline-functions. For fail_always_inline1.c we should diagnose the appearant misuse of always_inline with a warning, drop the attribute but keep DECL_DISREGARD_INLINE_LIMITS set. Same for fail_always_inline2.c. I agree that sorry()ing for those cases is odd. EIther we should reject the declarations upfront ("always-inline function will not be inlinable"), or we should emit a warning of that kind and make sure to not sorry later. Richard. > Many thanks, > > Christian > > 2010-05-25 Christian Bruel > > PR 49139 > * ipa-inline-transform.c (inline_transform):force call to > optimize_inline_calls error if function is always_inline. > * tree-inline.c (tree_inlinable_function_p): always warn. > (expand_call_inline): Likewise. > > 2010-05-25 Christian Bruel > > * gcc.db/always_inline.c: Removed -Winline. Update checks > * gcc.db/always_inline2.c: Likewise. > * gcc.db/always_inline3.c: Likewise. > * gcc.db/fail_always_inline1.c: New test. > * gcc.db/fail_always_inline2.c: New test. > > > > >
Re: [PATCH] PR debug/49047 (linkage name missing for cdtors)
OK. Jason
Re: [build] Move Tru64 UNIX startup files to toplevel libgcc
On 05/30/2011 05:12 PM, Rainer Orth wrote: Ok for mainline after a fresh bootstrap? Ok. This depends on the t-slibgcc-dummy file from the Solaris patch, so feel free to move it to this patch (together with s/t-slibgcc-darwin/t-slibgcc-dummy/). Paolo
Re: [build] Move Solaris 2 startup files to toplevel libgcc, revised
On 05/30/2011 04:29 PM, Rainer Orth wrote: * Non-Solaris SPARC changes: >> >> After I had moved sparc/sol2-c[in].asm to libgcc, I noticed that >> despite the name a few non-Solaris targets uses those files, too: >> >> sparc-*-elf*, sparc-*-rtems*, sparc64-*-elf*, sparc64-*-rtems* > > Yes, I tried to untangle that, but the RTEMS folks complained so I had to > backpedal. Note that this is also the case on the i386 side. Drats, I hadn't expected anything like this ;-( Here's the updated patch which takes care of this. I've taken the liberty to rename gcc/config/i386/t-rtems-i386 to gcc/config/i386/t-rtems, following all other RTEMS makefile fragments. I'd be really grateful if the RTEMS maintainers could give it a whirl. Besides, I still need build and SPARC maintainer approval for the non-Solaris parts of the patch from as outlined in the previous submission: http://gcc.gnu.org/ml/gcc-patches/2011-05/msg02181.html Bootstrapped without regressions on i386-pc-solaris2.10, i386-pc-solaris2.11 and sparc-sun-solaris2.11. Ok for mainline after the RTEMS parts have been tested/approved? Looks good as far as I'm concerned. As a followup, please delete t-slibgcc-darwin and use the new t-slibgcc-dummy instead (even better, you could "svn mv" it). Paolo
Re: New options to disable/enable any pass for any functions (issue4550056)
On Mon, May 30, 2011 at 11:44 PM, Xinliang David Li wrote: > This is the complete patch for pass name fixes (with test case changes). This is ok if Honza thinks the profile pass names make more sense this way. Thanks, Richard. > David > > > On Mon, May 30, 2011 at 1:16 PM, Xinliang David Li wrote: >> The attached are two simple follow up patches >> >> 1) the first patch does some refactorization on function header >> dumping (with more information printed) >> >> 2) the second patch cleans up some pass names. Part of the cleanup >> results from a previous discussion with Honza -- a) rename >> 'tree_profile_ipa' into 'profile', and make 'ipa_profile' and >> 'profile' into 'profile_estimate'. The rest of cleanups are needed to >> make sure pass names are unique. >> >> Ok for trunk? >> >> Thanks, >> >> David >> >> On Fri, May 27, 2011 at 2:58 AM, Richard Guenther >> wrote: >>> On Fri, May 27, 2011 at 12:02 AM, Xinliang David Li >>> wrote: The latest version that only exports 2 functions from passes.c. >>> >>> Ok with ... >>> >>> @@ -637,4 +637,8 @@ extern bool first_pass_instance; >>> /* Declare for plugins. */ >>> extern void do_per_function_toporder (void (*) (void *), void *); >>> >>> +extern void disable_pass (const char *); >>> +extern void enable_pass (const char *); >>> +struct function; >>> + >>> >>> struct function forward decl removed. >>> >>> + explicitly_enabled = is_pass_explicitly_enabled (pass, func); >>> + explicitly_disabled = is_pass_explicitly_disabled (pass, func); >>> >>> both functions inlined here and removed. >>> >>> +#define MAX_PASS_ID 512 >>> >>> this removed and instead a VEC_safe_grow_cleared () or VEC_length () >>> before the accesses. >>> >>> +-fenable-ipa-@var{pass} @gol >>> +-fenable-rtl-@var{pass} @gol >>> +-fenable-rtl-@var{pass}=@var{range-list} @gol >>> +-fenable-tree-@var{pass} @gol >>> +-fenable-tree-@var{pass}=@var{range-list} @gol >>> >>> -fenable-@var{kind}-@var{pass}, etc. >>> >>> +@item -fdisable-@var{ipa|tree|rtl}-@var{pass} >>> +@itemx -fenable-@var{ipa|tree|rtl}-@var{pass} >>> +@itemx -fdisable-@var{tree|rtl}-@var{pass}=@var{range-list} >>> +@itemx -fenable-@var{tree|rtl}-@var{pass}=@var{range-list} >>> >>> likewise. >>> >>> Thanks, >>> Richard. >>> David On Thu, May 26, 2011 at 2:04 PM, Xinliang David Li wrote: > On Thu, May 26, 2011 at 2:04 AM, Richard Guenther > wrote: >> On Wed, May 25, 2011 at 6:53 PM, Joseph S. Myers >> wrote: >>> On Wed, 25 May 2011, Xinliang David Li wrote: >>> Ping. The link to the message: http://gcc.gnu.org/ml/gcc-patches/2011-05/msg01303.html >>> >>> I don't consider this an option handling patch. Patches adding whole >>> new >>> features involving new options should be reviewed by maintainers for the >>> part of the compiler relevant to those features (since there isn't a >>> pass >>> manager maintainer, I guess that means middle-end). >> >> Hmm, I suppose then you reviewed the option handling parts and they >> are ok? Those globbing options always cause headache to me. >> >> +-fenable-ipa-@var{pass} @gol >> +-fenable-rtl-@var{pass} @gol >> +-fenable-rtl-@var{pass}=@var{range-list} @gol >> +-fenable-tree-@var{pass} @gol >> >> so, no -fenable-tree-@var{pass}=@var{range-list}? >> > > Missed. Will add. > > >> Does the pass name match 1:1 with the dump file name? In which >> case > > Yes. > >> >> +Disable ipa pass @var{pass}. @var{pass} is the pass name. If the same >> pass is statically invoked in the compiler multiple times, the pass >> name should be appended with a sequential number starting from 1. >> >> isn't true as passes that are invoked only a single time lack the number >> suffix (yes, I'd really like that to be changed ...) > > Yes, pass with single static instance does not need number suffix. > >> >> Please break likes also in .texi files and stick to 80 columns. > > Done. > >> Please >> document that these options are debugging options and regular >> options for enabling/disabling passes should be used. I would suggest >> to group documentation differently and document -fenable-* and >> -fdisable-*, thus, >> >> + -fdisable-@var{kind}-@var{pass} >> + -fenable-@var{kind}-@var{pass} >> >> Even in .texi files please two spaces after a full-stop. > > Done > >> >> +extern void enable_disable_pass (const char *, bool); >> >> I'd rather have both enable_pass and disable_pass ;) > > Ok. > >> >> +struct function; >> +extern void pass_dump_function_header (FILE *, tree, struct function *); >> >> that's odd and probably should be split out, the function should >> maybe reside in tree-pretty-print.c. > > Ok. > >> >> Index: tree-ssa-loop-ivopts.c >>
Re: New options to disable/enable any pass for any functions (issue4550056)
On Mon, May 30, 2011 at 10:16 PM, Xinliang David Li wrote: > The attached are two simple follow up patches > > 1) the first patch does some refactorization on function header > dumping (with more information printed) > > 2) the second patch cleans up some pass names. Part of the cleanup > results from a previous discussion with Honza -- a) rename > 'tree_profile_ipa' into 'profile', and make 'ipa_profile' and > 'profile' into 'profile_estimate'. The rest of cleanups are needed to > make sure pass names are unique. > > Ok for trunk? + +void +pass_dump_function_header (FILE *dump_file, tree fdecl, struct function *fun) This function needs documentation, the ChangeLog entry misses the tree-pretty-print.h change. +struct function; instead of this please include coretypes.h from tree-pretty-print.h and add the struct function forward declaration there if it isn't already present. You change the output of the header, so please make sure you have bootstrapped and tested with _all_ languages included (and also watch for bugreports for target specific bugs). + fprintf (dump_file, "\n;; Function %s (%s, funcdef_no=%d, uid=%d)", + dname, aname, fun->funcdef_no, node->uid); I see no point in dumping funcdef_no - it wasn't dumped before in any place. Instead I miss dumping of the DECL_UID and thus a more specific 'uid', like 'cgraph-uid'. + aname = (IDENTIFIER_POINTER + (DECL_ASSEMBLER_NAME (fdecl))); using DECL_ASSEMBLER_NAME is bad - it might trigger computation of DECL_ASSEMBLER_NAME which certainly shouldn't be done only for dumping purposes. Instead do sth like if (DECL_ASSEMBLER_NAME_SET_P (fdecl)) aname = DECL_ASSEMBLER_NAME (fdecl); else aname = ''; and please also watch for cgraph_get_node returning NULL. Also please call the function dump_function_header instead of pass_dump_function_header. Please re-post with appropriate changes. Thanks, Richard. > Thanks, > > David > > On Fri, May 27, 2011 at 2:58 AM, Richard Guenther > wrote: >> On Fri, May 27, 2011 at 12:02 AM, Xinliang David Li >> wrote: >>> The latest version that only exports 2 functions from passes.c. >> >> Ok with ... >> >> @@ -637,4 +637,8 @@ extern bool first_pass_instance; >> /* Declare for plugins. */ >> extern void do_per_function_toporder (void (*) (void *), void *); >> >> +extern void disable_pass (const char *); >> +extern void enable_pass (const char *); >> +struct function; >> + >> >> struct function forward decl removed. >> >> + explicitly_enabled = is_pass_explicitly_enabled (pass, func); >> + explicitly_disabled = is_pass_explicitly_disabled (pass, func); >> >> both functions inlined here and removed. >> >> +#define MAX_PASS_ID 512 >> >> this removed and instead a VEC_safe_grow_cleared () or VEC_length () >> before the accesses. >> >> +-fenable-ipa-@var{pass} @gol >> +-fenable-rtl-@var{pass} @gol >> +-fenable-rtl-@var{pass}=@var{range-list} @gol >> +-fenable-tree-@var{pass} @gol >> +-fenable-tree-@var{pass}=@var{range-list} @gol >> >> -fenable-@var{kind}-@var{pass}, etc. >> >> +@item -fdisable-@var{ipa|tree|rtl}-@var{pass} >> +@itemx -fenable-@var{ipa|tree|rtl}-@var{pass} >> +@itemx -fdisable-@var{tree|rtl}-@var{pass}=@var{range-list} >> +@itemx -fenable-@var{tree|rtl}-@var{pass}=@var{range-list} >> >> likewise. >> >> Thanks, >> Richard. >> >>> David >>> >>> On Thu, May 26, 2011 at 2:04 PM, Xinliang David Li >>> wrote: On Thu, May 26, 2011 at 2:04 AM, Richard Guenther wrote: > On Wed, May 25, 2011 at 6:53 PM, Joseph S. Myers > wrote: >> On Wed, 25 May 2011, Xinliang David Li wrote: >> >>> Ping. The link to the message: >>> >>> http://gcc.gnu.org/ml/gcc-patches/2011-05/msg01303.html >> >> I don't consider this an option handling patch. Patches adding whole new >> features involving new options should be reviewed by maintainers for the >> part of the compiler relevant to those features (since there isn't a pass >> manager maintainer, I guess that means middle-end). > > Hmm, I suppose then you reviewed the option handling parts and they > are ok? Those globbing options always cause headache to me. > > +-fenable-ipa-@var{pass} @gol > +-fenable-rtl-@var{pass} @gol > +-fenable-rtl-@var{pass}=@var{range-list} @gol > +-fenable-tree-@var{pass} @gol > > so, no -fenable-tree-@var{pass}=@var{range-list}? > Missed. Will add. > Does the pass name match 1:1 with the dump file name? In which > case Yes. > > +Disable ipa pass @var{pass}. @var{pass} is the pass name. If the same > pass is statically invoked in the compiler multiple times, the pass > name should be appended with a sequential number starting from 1. > > isn't true as passes that are invoked only a single time lack the number > suffix (yes, I'd really like that to be changed ...) Yes, pass with single static instance does not need number suff
Re: [build] Move ENABLE_EXECUTE_STACK to toplevel libgcc
On 05/30/2011 05:59 PM, Rainer Orth wrote: This is my hopefully last patch for toplevel libgcc moves: it moves ENABLE_EXECUTE_STACK to $target-lib.h headers in libgcc/config. Since gcc/config/sol2.h is only used on Solaris targets anymore and Solaris 8 is the minimal supported version, I've removed both a Solaris 2.6 workaround and include directly. Same thing in osf5-lib.h. Since the existance of the macro is used in alpha/alpha.c, i386/i386.c, and sparc/sparc.c to enable calls to __enable_execute_stack, I had to define HAVE_ENABLE_EXECUTE_STACK in the gcc/config headers to convey the necessary information. The new libgcc/config/$target-lib.h headers are added to libgcc_tm_file in gcc/config.gcc. I'd rather add them to libgcc/config.host instead so the information is kept local to libgcc. Did you have any problems doing so? Long term, it would be even better to do something like this in libgcc/config.host: enable_execute_stack=enable-execute-stack-empty.c case $host in ...) enable_execute_stack=config/enable-execute-stack-mprotect.c ;; ... esac in libgcc/configure.ac: AC_CONFIG_LINKS(enable-execute-stack.c:$enable_execute_stack) in libgcc/Makefile.in: LIB2ADD += enable-execute-stack.c and drop this hunk altogether from gcc/libgcc2.c: #ifdef L_enable_execute_stack /* Attempt to turn on execute permission for the stack. */ #ifdef ENABLE_EXECUTE_STACK ENABLE_EXECUTE_STACK #else void __enable_execute_stack (void *addr __attribute__((__unused__))) {} #endif /* ENABLE_EXECUTE_STACK */ #endif /* L_enable_execute_stack */ Bootstrapped without regressions on i386-pc-solaris2.11 and sparc-sun-solaris2.11. I've Cc'ed the OS port maintainers of the other affected targets and would appreciate testing/review. An OpenBSD maintainer isn't listed, unfortunately. Also included are the CPU port maintainers for the modified backends. Ok for mainline after a week if no problems occur in testing on the other targets? It's a good start, but at least you need changes to the documentation; if you can make the above work, that would be great as an example of how to move stuff to the libgcc toplevel directory. Paolo
Re: [build] Move MD_UNWIND_SUPPORT to toplevel libgcc
On 05/30/2011 05:43 PM, Rainer Orth wrote: +md-unwind-support.h: config.status + if test -n "$(md_unwind_header)"; then \ + echo "#include \"config/$(md_unwind_header)\""> $@; \ + else \ + :> $@; \ + fi Can you add a default file md-unwind-none.h and use AC_CONFIG_LINKS([md-unwind-support.h:$md_unwind_header]) instead of this (and instead of AC_SUBST'ing the variable)? -libgcc-eh-objects += $(addsuffix $(objext),$(basename $(notdir $(LIB2ADDEHSTATIC -libgcc-s-objects += $(addsuffix _s$(objext),$(basename $(notdir $(LIB2ADDEHSHARED +libgcc-eh-static-objects := $(addsuffix $(objext),$(basename $(notdir $(LIB2ADDEHSTATIC +libgcc-eh-shared-objects := $(addsuffix _s$(objext),$(basename $(notdir $(LIB2ADDEHSHARED + +$(libgcc-eh-static-objects) $(libgcc-eh-shared-objects): md-unwind-support.h + +libgcc-eh-objects += $(libgcc-eh-static-objects) +libgcc-s-objects += $(libgcc-eh-shared-objects) These changes to the dependencies should not be necessary, libgcc does automatic dependency tracking. Also a good start, though. Thanks for this work. Paolo
Re: [build] Move MD_UNWIND_SUPPORT to toplevel libgcc
On 05/30/2011 07:54 PM, Kai Tietz wrote: > -/* For 64-bit Windows we can't use DW2 unwind info. Also for multilib > - builds we can't use it, too. */ > -#if !TARGET_64BIT_DEFAULT&& !defined (TARGET_BI_ARCH) > -#define MD_UNWIND_SUPPORT "config/i386/w32-unwind.h" > -#endif > - > /* This matches SHLIB_SONAME and SHLIB_SOVERSION in t-cygming. */ > /* This matches SHLIB_SONAME and SHLIB_SOVERSION in t-cygwin. */ > #if DWARF2_UNWIND_INFO mingw part is not ok, as it breaks 32-bit defaulted multilib version compiler. Can you explain what is going on here? Could it be fixed by wrapping w32-unwind.h in a #ifdef __x86_64__? Rainer, the same solution that is found for Windows should be used for darwin, too. Paolo
Re: [PATCH] Make ipa_reduced_postorder number SCCs
> Hi, > > for an IPA-CP rewrite that I now work on, it is often useful to look > at a call graph edge and see whether both ends are in the same > strongly connected component. At the moment ipa_reduced_postorder > does not offer such capabilities but it is easy to add. It is enough > to save the lowest DFS number of the component when constructing the > linked list of nodes in it. And this is exactly what the patch below > does. > > Bootstrapped and tested on x86_64-linux. OK for trunk? > > Thanks, > > Martin > > > 2011-05-19 Martin Jambor > > * ipa-utils.c (ipa_dfs_info): New field scc_no; > * ipa-utils.c (searchc): Set scc_no; OK. thanks, Honza
Re: [PATCH] Fix ipa_reduced_postorder with respect to overwritable functions
> Hi, > > I ran into issues caused by ipa_reduced_postorder not really > topologically sorting functions while compiling some part of the java > library (yes, this is the first time java has caught a bug for me). > The problem was that all the functions are apparently > AVAIL_OVERWRITABLE and even though I have instructed > ipa_reduced_postorder by its allow_overwritable parameter to include > such functions, it still blatantly ignores edges to such functions > which leads to all sorts of unexpectedly broken assumptions and weird > bugs. > > Fixed thusly, bootstrapped and tested on x86_64-linux (and verified > that my problems caused by this went away too). > > OK for trunk? OK, Honza
[patch, testsuite] Fix PR 49239
Hi, gcc.dg/vect/vect-strided-u8-i8-gap4-unknown.c randomly fails on Linux/ia32. I think it's because I forgot to initialize the output array. Tested on x86_64-suse-linux. Committed as obvious. Ira testsuite/ChangeLog: PR testsuite/49239 * gcc.dg/vect/vect-strided-u8-i8-gap4-unknown.c: Initialize the output array. Index: testsuite/gcc.dg/vect/vect-strided-u8-i8-gap4-unknown.c === --- testsuite/gcc.dg/vect/vect-strided-u8-i8-gap4-unknown.c (revision 174467) +++ testsuite/gcc.dg/vect/vect-strided-u8-i8-gap4-unknown.c (working copy) @@ -25,6 +25,19 @@ s res[N]; unsigned char x; + for (i = 0; i < N; i++) +{ + res[i].a = 0; + res[i].b = 0; + res[i].c = 0; + res[i].d = 0; + res[i].e = 0; + res[i].f = 0; + res[i].g = 0; + res[i].h = 0; + __asm__ volatile (""); +} + /* Check peeling for gaps for unknown loop bound. */ for (i = 0; i < n; i++) {
[v3] update C++0x status table
This corrects the status of the "Initializer list range access" and "Range access" clauses, which are implemented, and updates the status of tuple/pointer/allocator traits to reflect my changes over the last few days. 2011-05-31 Jonathan Wakely * doc/xml/manual/status_cxx200x.xml: Update. * doc/html/*: Regenerate. Tested with 'make doc-xml-validate-docbook' and 'make doc-html-docbook' to regenerate the HTML, committed to trunk. Index: doc/xml/manual/status_cxx200x.xml === --- doc/xml/manual/status_cxx200x.xml (revision 174358) +++ doc/xml/manual/status_cxx200x.xml (working copy) @@ -247,10 +247,9 @@ - 18.9.3 Initializer list range access - N + Y @@ -469,10 +468,9 @@ - 20.4.2.8 Tuple traits - N + Y @@ -530,11 +528,11 @@ - + 20.6.3 Pointer traits - N - + Partial + Missing rebind @@ -563,11 +561,11 @@ - + 20.6.8 Allocator traits - N - + Partial + Missing rebind_alloc and rebind_traits 20.6.9 @@ -1619,10 +1617,9 @@ - 24.6.5 range access - N + Y @@ -2523,14 +2520,14 @@ 30.6.6 Class template future Partial - Missing future_status and future::share() + Timed waiting functions do not return future_status 30.6.7 Class template shared_future Partial - Missing future_status + Timed waiting functions do not return future_status 30.6.8
Re: [PATCH PR45098, 7/10] Nowrap limits iterations
Hi, > As far as I can tell, what is current calculated in i_bound (and assigned to > nb_iterations_upper_bound), is the maximum amount of times any statement in > the > loop is executed, where any includes exit tests. Differently put, the maximum > amount of times the loop header is executed. hmm... this is rather confusing, I don't really recall why I gave nb_iterations_upper_bound a different semantics from any other instance of what # of iterations of a loop means. > This is confirmed by this comment in tree-vrp.c: > > /* Try to use estimated number of iterations for the loop to constrain the > final value in the evolution. > We are interested in the number of executions of the latch, while > nb_iterations_upper_bound includes the last execution of the exit test. > */ > > I modified the patch to improved the comment. I think a better fix would be to make the nb_iterations_upper_bound semantics consistent with that of nb_iterations. Let me try to do it, hopefully this should be mostly mechanical, Zdenek
[PATH] PR/49139 fix always_inline failures diagnostics
Hello, The attached patch fixes a few diagnostic discrepancies for always_inline failures. Illustrated by the fail_always_inline[12].c attached cases, the current behavior is one of: - success (with and without -Winline), silently not honoring always_inline gcc fail_always_inline1.c -S -Winline -O0 -fpic gcc fail_always_inline1.c -S -O2 -fpic - error: with -Winline but not without gcc fail_always_inline1.c -S -Winline -O2 -fpic - error: without -Winline gcc fail_always_inline2.c -S -fno-early-inlining -O2 or the original c++ attachment in this defect note that -Winline never warns, as stated in the documentation This simple patch consistently emits a warning (changing the sorry unimplemented message) whenever the attribute is not honored. My first will was to generate and error instead of the warning, but since it is possible that inlines is only performed at LTO time, an error would be inapropriate (Note that today this is not possible with -Winline that would abort). Another alternative I considered would be to emit the warning under -Winline rather than unconditionally, but this more a user misuse of the attribute, so should always be warned anyway. Or maybe a new -Winline-always that would be activated under -Wall ? Other opinion welcomed. Tested with standard bootstrap and regression on x86. Comments, and/or OK for trunk ? Many thanks, Christian 2010-05-25 Christian Bruel PR 49139 * ipa-inline-transform.c (inline_transform):force call to optimize_inline_calls error if function is always_inline. * tree-inline.c (tree_inlinable_function_p): always warn. (expand_call_inline): Likewise. 2010-05-25 Christian Bruel * gcc.db/always_inline.c: Removed -Winline. Update checks * gcc.db/always_inline2.c: Likewise. * gcc.db/always_inline3.c: Likewise. * gcc.db/fail_always_inline1.c: New test. * gcc.db/fail_always_inline2.c: New test. /* { dg-do compile { target fpic } } */ /* { dg-options "-fpic" } */ __attribute((always_inline)) void bar() { } /* { dg-warning "can be overwriten at linktime" } */ void f() { bar(); /* { dg-warning "called from here" "" } */ } /* { dg-do compile { target fpic } } */ /* { dg-options "-fpic -fno-early-inlining" } */ inline void foo() { foo2(); } __attribute((always_inline)) void bar() { } /* { dg-warning "can be overwriten at linktime" } */ void f() { foo(); bar(); /* { dg-warning "called from here" "" } */ } Index: gcc/ipa-inline-transform.c === --- gcc/ipa-inline-transform.c (revision 174264) +++ gcc/ipa-inline-transform.c (working copy) @@ -302,9 +302,20 @@ for (e = node->callees; e; e = e->next_callee) { - cgraph_redirect_edge_call_stmt_to_callee (e); + gimple call = cgraph_redirect_edge_call_stmt_to_callee (e); + + if (!inline_p) + { if (!e->inline_failed || warn_inline) inline_p = true; + else + { + tree fn = gimple_call_fndecl (call); + + if (lookup_attribute ("always_inline", DECL_ATTRIBUTES (fn))) + inline_p = true; + } + } } if (inline_p) Index: gcc/ChangeLog === --- gcc/ChangeLog (revision 174264) +++ gcc/ChangeLog (working copy) @@ -1,3 +1,11 @@ +2010-05-25 Christian Bruel + + PR 49139 + * ipa-inline-transform.c (inline_transform): force optimize_inline_calls + error checking if always_inline seen. + * tree-inline.c (tree_inlinable_function_p): use error instead of sorry. + (expand_call_inline): Likewise. + 2011-05-25 Ian Lance Taylor * godump.c (go_format_type): Output the first field with a usable Index: gcc/testsuite/gcc.dg/always_inline.c === --- gcc/testsuite/gcc.dg/always_inline.c(revision 174264) +++ gcc/testsuite/gcc.dg/always_inline.c(working copy) @@ -1,8 +1,8 @@ /* { dg-do compile } */ -/* { dg-options "-Winline -O2" } */ +/* { dg-options "-O2" } */ #include inline __attribute__ ((always_inline)) void -e(int t, ...) /* { dg-message "sorry\[^\n\]*variable argument" "" } */ +e(int t, ...) /* { dg-warning "variable argument" } */ { va_list q; va_start (q, t); Index: gcc/testsuite/gcc.dg/always_inline2.c === --- gcc/testsuite/gcc.dg/always_inline2.c (revision 174264) +++ gcc/testsuite/gcc.dg/always_inline2.c (working copy) @@ -1,8 +1,8 @@ /* { dg-do compile } */ -/* { dg-options "-Winline -O2" } */ -inline __attribute__ ((always_inline)) void t(void); /* { dg-message "sorry\[^\n\]*body not available" "" } */ +/* { dg-options "-O2" } */ +inline __attribute__ ((always_inline)) void t(vo
Re: [PATCH PR45098, 7/10] Nowrap limits iterations
On 05/30/2011 02:38 PM, Zdenek Dvorak wrote: > Hi, > >>> The header block of the loop is bb 4, the latch block is bb 3: >>> ... >>> (gdb) p loop.header.index >>> $4 = 4 >>> (gdb) p loop.latch.index >>> $5 = 3 >>> ... >>> >>> The number of times the latch edge is executed, is 10. >>> >>> But loop->nb_iterations_upper_bound, or max_niter is 11: > > this is a bit strange, it looks like the # of iterations estimation is setting > nb_iterations_upper_bound too conservatively (or I gave > nb_iterations_upper_bound > a different semantics than I remember -- but both my memory and the comment > in cfgloop.h > suggest that nb_iterations_upper_bound >= nb_iterations, i.e., that it should > be 10 in your > example), > The actual values of nb_iterations_upper_bound are determined by this code fragment in record_estimate. /* Records that AT_STMT is executed at most BOUND + 1 times in LOOP. IS_EXIT is true if the loop is exited immediately after STMT, and this exit is taken at last when the STMT is executed BOUND + 1 times. REALISTIC is true if BOUND is expected to be close to the real number of iterations. UPPER is true if we are sure the loop iterates at most BOUND times. I_BOUND is an unsigned double_int upper estimate on BOUND. */ static void record_estimate (struct loop *loop, tree bound, double_int i_bound, gimple at_stmt, bool is_exit, bool realistic, bool upper) { ... /* Update the number of iteration estimates according to the bound. If at_stmt is an exit, then every statement in the loop is executed at most BOUND + 1 times. If it is not an exit, then some of the statements before it could be executed BOUND + 2 times, if an exit of LOOP is before stmt. */ exit = single_exit (loop); if (is_exit || (exit != NULL && dominated_by_p (CDI_DOMINATORS, exit->src, gimple_bb (at_stmt delta = double_int_one; else delta = double_int_two; i_bound = double_int_add (i_bound, delta); As far as I can tell, what is current calculated in i_bound (and assigned to nb_iterations_upper_bound), is the maximum amount of times any statement in the loop is executed, where any includes exit tests. Differently put, the maximum amount of times the loop header is executed. This is confirmed by this comment in tree-vrp.c: /* Try to use estimated number of iterations for the loop to constrain the final value in the evolution. We are interested in the number of executions of the latch, while nb_iterations_upper_bound includes the last execution of the exit test. */ I modified the patch to improved the comment. Ok for trunk? Thanks, - Tom Index: gcc/tree-ssa-loop-ivopts.c === --- gcc/tree-ssa-loop-ivopts.c (revision 173734) +++ gcc/tree-ssa-loop-ivopts.c (working copy) @@ -4391,8 +4391,14 @@ may_eliminate_iv (struct ivopts_data *da { if (!estimated_loop_iterations (loop, true, &max_niter)) return false; - /* The loop bound is already adjusted by adding 1. */ - if (double_int_ucmp (max_niter, period_value) > 0) + /* (a) loop->nb_iterations_upper_bound (assigned to max_niter) + includes the last execution of the exit test. + (b) The number of distinct values of the cand is + period_value + 1. + So, the transformation is allowed if + max_niter <= period_value + 1. */ + period_value = double_int_add (period_value, double_int_one); + if (double_int_ucmp (max_niter, period_value) > 0) return false; } else