Re: Fix PR 65177: diamonds are not valid execution threads for jump threading
On 03/19/2015 01:54 PM, Sebastian Pop wrote:
> Richard Biener wrote:
>> please instead fixup after copy_bbs in duplicate_seme_region.
>
> Thanks for the review. Attached patch that does not modify copy_bbs.
> Fixes make check in hmmer and make check RUNTESTFLAGS=tree-ssa.exp
> Full bootstrap and regtest in progress on x86_64-linux. Ok for trunk?

I'm going to have to find time tomorrow to wrap this up. I've got a morning full of meetings, then I'm on a plane for the east coast for the rest of the day. I'm in an exit row seat for the 2nd leg, so there ought to be enough room to open my laptop and poke at this.

jeff
Re: Fix PR 65177: diamonds are not valid execution threads for jump threading
On 03/19/2015 03:16 AM, Richard Biener wrote:
> On Wed, Mar 18, 2015 at 11:35 PM, Sebastian Pop wrote:
>> Hi,
>> the attached patch fixes PR 65177 in which the code generator of FSM jump thread would create a diamond on the copied path: see https://gcc.gnu.org/PR65177#c18 for a detailed description.
>>
>> The patch is renaming SEME into jump_thread as the notion of SEME is more general than the special case that we are interested in, in the case of jump-threading: an execution thread to be jump-threaded has the property that each node on the path has exactly one predecessor, disallowing any other control flow like diamonds or back-edge loops within the SEME region.
>>
>> The patch passes regstrap on x86-64-linux, and fixes the make check of hmmer. Ok to commit?
>
> I don't like the special-casing in copy_bbs (with bb_in_bbs doing a linear walk anyway...). Is the first test
>
> + /* When creating a jump-thread, we only redirect edges to
> +    consecutive basic blocks.  */
> + if (i + 1 < n)
> +   {
> +     if (e->dest != bbs[i + 1])
> +       continue;
>
> not really always the case for jump threads? copy_bbs doesn't impose any restriction on ordering on bbs[], so it seems to be a speciality of the caller.

Right. The assumption of orderings was the initial thing that jumped out at me. While it may be the case that in practice we're going to be presented with blocks in that kind of order, depending on it seems unwise.

Jeff
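The property discussed in this thread (the blocks on the path form a straight line, with no diamond or back edge among them) can be checked explicitly instead of being assumed from the order of bbs[]. The following is only a minimal sketch on a toy CFG, not GCC code: the Cfg type and is_linear_thread are hypothetical names invented for illustration, and block ids are assumed to be valid indices.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical toy CFG: succs[b] lists the successor block ids of block b.
using Cfg = std::vector<std::vector<int>>;

// Returns true when PATH is a straight-line thread in the toy CFG: every
// edge that stays inside the path goes from path[i] to path[i + 1], and
// each block except the last has such an edge.  Diamonds and back edges
// among the path's blocks make it return false.
bool is_linear_thread(const Cfg &succs, const std::vector<int> &path)
{
  // pos[b] is the position of block b on the path, or -1 if b is not on it.
  std::vector<int> pos(succs.size(), -1);
  for (std::size_t i = 0; i < path.size(); i++) {
    int b = path[i];
    assert(b >= 0 && static_cast<std::size_t>(b) < succs.size());
    if (pos[b] != -1)
      return false;  // a block appears twice on the path
    pos[b] = static_cast<int>(i);
  }

  for (std::size_t i = 0; i < path.size(); i++) {
    bool has_next = false;
    for (int d : succs[path[i]]) {
      if (pos[d] == -1)
        continue;  // edge leaves the path; not our concern here
      if (static_cast<std::size_t>(pos[d]) != i + 1)
        return false;  // edge inside the path that skips or goes back
      has_next = true;
    }
    if (i + 1 < path.size() && !has_next)
      return false;  // consecutive blocks are not connected
  }
  return true;
}
```

On a diamond 0 -> {1, 2} -> 3, the path {0, 1, 3} is accepted while {0, 1, 2, 3} is rejected, because the edge 0 -> 2 stays inside the path without going to the next block.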
Re: [PATCH][RFA] [PR rtl-optimization/64317] Enhance postreload-gcse.c to eliminate more redundant loads
On 03/18/2015 12:19 AM, Andrew Pinski wrote: On Tue, Mar 17, 2015 at 11:27 AM, Jeff Law wrote: On 03/17/2015 04:35 AM, Richard Biener wrote: I'll test both. In the common case, the cost is going to be the basic bookkeeping so that we can compute the transparent property. The actual computation of transparency and everything else is guarded on having something in the hash tables -- and the overwhelming majority of the time there's nothing in the hash tables. Regardless, I'll pin down boxes and do some testing. I'm slightly leaning towards trying it even in stage4, but if e.g. richi disagrees, we could defer it to stage1 too. I'd be OK either way. I just want us to make a decision one way or the If it fixes a regression then OK for trunk, otherwise please wait for stage 1 to open. It fixes 64317 which is a P2 regression. I had a pass which I inherited from my predecessor which was basically doing the same as your patch: https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=gcc/postreload-load.c;h=2652e51f8eca310d51db3a30e5f6c8847be436ce;hb=refs/heads/apinski/thunderx-cost But I like your patch much better than this one. I saw many loads being removed for AARCH64 also at -Ofast -funroll-loops on SPEC 2006 with this pass. But it seemed to due to subregs which I had mentioned at https://gcc.gnu.org/ml/gcc/2015-01/msg00125.html . When I get a chance I can test your patch out on AARCH64 and disable "my" pass to see if "my" pass catches more than your patch. Well, they're doing different things, so they ought to be complementary to some degree. Jeff
Re: [PATCH][RFA] [PR rtl-optimization/64317] Enhance postreload-gcse.c to eliminate more redundant loads
On 03/16/2015 01:27 PM, Jakub Jelinek wrote: On Wed, Mar 11, 2015 at 03:30:36PM -0600, Jeff Law wrote: +#ifndef GCC_GCSE__COMMONH +#define GCC_GCSE__COMMONH GCC_GCSE_COMMON_H instead? @@ -1308,8 +1396,19 @@ gcse_after_reload_main (rtx f ATTRIBUTE_UNUSED) if (expr_table->elements () > 0) { + /* Knowing which MEMs are transparent through a block can signifiantly +increase the number of reundant loads found. So compute transparency +information for for each memory expression in the hash table. */ s/for for/for/ ? + df_analyze (); + /* This can not be part of the normal allocation routine because +we have to know the number of elements in the hash table. */ + transp = sbitmap_vector_alloc (last_basic_block_for_fn (cfun), +expr_table->elements ()); + bitmap_vector_ones (transp, last_basic_block_for_fn (cfun)); + expr_table->traverse (dump_file); eliminate_partially_redundant_loads (); delete_redundant_insns (); + sbitmap_vector_free (transp); if (dump_file) { What effect does the patch have on compile time on say x86_64 or ppc64? I'm slightly leaning towards trying it even in stage4, but if e.g. richi disagrees, we could defer it to stage1 too. Oh yea, forgot to mention, for PPC the compile-time testing results were essentially the same -- there's significantly more variation in the timings, but the before/after comparisons were in the noise. Jeff
Re: [PATCH][RFA] [PR rtl-optimization/64317] Enhance postreload-gcse.c to eliminate more redundant loads
On 03/16/2015 01:27 PM, Jakub Jelinek wrote: On Wed, Mar 11, 2015 at 03:30:36PM -0600, Jeff Law wrote: +#ifndef GCC_GCSE__COMMONH +#define GCC_GCSE__COMMONH GCC_GCSE_COMMON_H instead? @@ -1308,8 +1396,19 @@ gcse_after_reload_main (rtx f ATTRIBUTE_UNUSED) if (expr_table->elements () > 0) { + /* Knowing which MEMs are transparent through a block can signifiantly +increase the number of reundant loads found. So compute transparency +information for for each memory expression in the hash table. */ s/for for/for/ ? + df_analyze (); + /* This can not be part of the normal allocation routine because +we have to know the number of elements in the hash table. */ + transp = sbitmap_vector_alloc (last_basic_block_for_fn (cfun), +expr_table->elements ()); + bitmap_vector_ones (transp, last_basic_block_for_fn (cfun)); + expr_table->traverse (dump_file); eliminate_partially_redundant_loads (); delete_redundant_insns (); + sbitmap_vector_free (transp); if (dump_file) { What effect does the patch have on compile time on say x86_64 or ppc64? I'm slightly leaning towards trying it even in stage4, but if e.g. richi disagrees, we could defer it to stage1 too. I made the requested fixes as well as fixed another comment typo and fixed one real bug that showed up during some additional testing. Basically the bitmap indices need to start counting from 0, not 1. Starting at 1 can result in an out-of-range write for the last element. That's usually not a problem as there's space in the word, but I happened to have a case where we had 128 entries in the table and thus the out-of-bounds write hit a new word in memory clobbering heap metadata. I'm glad this was caught prior to installation. I've re-bootstrapped and regression tested on x86_64-linux-gnu and powerpc64-linux-gnu. Installed on the trunk. Actual patch committed is attached for archival purposes. 
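The off-by-one described above can be illustrated with a toy bitmap. This is a minimal sketch and not GCC's sbitmap implementation: ToyBitmap, in_range and in_storage are hypothetical names, and a 64-bit word size is assumed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy fixed-size bitmap (hypothetical, for illustration only).
struct ToyBitmap {
  static constexpr std::size_t kWordBits = 64;  // assumed word size
  std::size_t n_bits;
  std::vector<std::uint64_t> words;

  explicit ToyBitmap(std::size_t n)
      : n_bits(n), words((n + kWordBits - 1) / kWordBits, 0) {}

  // True when INDEX names a bit the bitmap was sized for (0 .. n_bits - 1).
  bool in_range(std::size_t index) const { return index < n_bits; }

  // True when INDEX at least lands in an allocated word, even if it is
  // past the logical end of the bitmap.
  bool in_storage(std::size_t index) const {
    return index / kWordBits < words.size();
  }

  void set(std::size_t index) {
    assert(in_range(index));  // catches indices that started at 1
    words[index / kWordBits] |= std::uint64_t{1} << (index % kWordBits);
  }
};
```

With 100 entries, index 100 is out of range but still falls in the slack of the second word, so a write there goes unnoticed. With exactly 128 entries there is no slack: index 128 would need a third word that was never allocated.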
Jeff commit f82a93981ea0f2ffbda93511a51c71910e10e4dd Author: law Date: Mon Mar 23 05:21:04 2015 + PR rtl-optimization/64317 * Makefile.in (OBJS): Add gcse-common.c * gcse.c: Include gcse-common.h (struct modify_pair_s): Move structure definition to gcse-common.h (compute_transp): Move function to gcse-common.c. (canon_list_insert): Similarly. (record_last_mem_set_info): Break out some code and put it into gcse-common.c. Call into the new common code. (compute_local_properties): Pass additional arguments to compute_transp. * postreload-gcse.c: Include gcse-common.h and df.h (modify_mem_list_set, blocks_with_calls): New variables. (modify_mem_list, canon_modify_mem_list, transp): Likewise. (get_bb_avail_insn): Pass in the expression index too. (alloc_mem): Allocate memory for the new bitmaps and lists. (free_mem): Free memory for the new bitmaps and lists. (insert_expr_in_table): Record a bitmap index for each entry we add to the table. (record_last_mem_set_info): Call into common code in gcse-common.c. (get_bb_avail_insn): If no available insn was found in the requested BB. If BB has a single predecessor, see if the expression is transparent in BB and available in that single predecessor. (compute_expr_transp): New wrapper for compute_transp. (eliminate_partially_redundant_load): Pass expression's bitmap_index to get_bb_avail_insn. Compute next_pred_bb_end a bit later. (gcse_after_reload_main): If there are elements in the hash table, then compute transparency for all the elements in the hash table. * gcse-common.h: New file. * gcse-common.c: New file. 
git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@221585 138bc75d-0d04-0410-961f-82ee72b054a4 diff --git a/gcc/ChangeLog b/gcc/ChangeLog index a42d22a..b615c1f 100644 --- a/gcc/ChangeLog +++ b/gcc/ChangeLog @@ -1,3 +1,34 @@ +2015-03-22 Jeff Law + + PR rtl-optimization/64317 + * Makefile.in (OBJS): Add gcse-common.c + * gcse.c: Include gcse-common.h + (struct modify_pair_s): Move structure definition to gcse-common.h + (compute_transp): Move function to gcse-common.c. + (canon_list_insert): Similarly. + (record_last_mem_set_info): Break out some code and put it into + gcse-common.c. Call into the new common code. + (compute_local_properties): Pass additional arguments to compute_transp. + * postreload-gcse.c: Include gcse-common.h and df.h + (modify_mem_list_set, blocks_with_calls): New variables. + (modify_mem_list, canon_modify_mem_list, transp): Likewise. + (get_bb_avail_insn): Pass in the expression index too. + (alloc_me
Re: [wwwdocs] Describe the new way command line options are handled with LTO
> How about this for a copy-edited version of the new text? Sounds good to me. Thanks, Sandra. Honza > >Command-line optimization and target options are now streamed on > a per-function basis and honored by the link-time optimizer. > This change makes the link-time optimization a more transparent > replacement of per-file optimizations. > It is now possible to build projects that require > different optimization > settings for different translation units (such as > -ffast-math, -mavx, or > -finline). > Contrary to the earlier GCC releases, the optimization and target > options passed on the link command line are ignored. > Note that this applies only to those command-line options > that can be passed to optimize and > target attributes. > Command-line options affecting global code generation > (such as -fpic), warnings > (such as -Wodr), > optimizations affecting the way static variables > are optimized (such as -fcommon), debug output (such as > -g), > and --param parameters can be applied only > to the whole link-time optimization unit. > In these cases, it is recommended to consistently use the same > options at both compile time and link time. > > -Sandra
Re: [wwwdocs] Describe the new way command line options are handled with LTO
On 03/22/2015 08:26 PM, Jan Hubicka wrote: > Hi, > this is my honest attempt to explain how command line options works with LTO > in current compiler and how to deal with these sanely. > > It is bit of mess, but an improvement over past releases. We finished the > transition to per-function attributes. In GCC 6 I plan to add per-variable > counterpart. I wonder if we want to have something in the texinfo too? > > Honza > > Index: changes.html > === > RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-5/changes.html,v > retrieving revision 1.90 > diff -u -r1.90 changes.html > --- changes.html 23 Mar 2015 02:00:12 - 1.90 > +++ changes.html 23 Mar 2015 02:22:51 - > @@ -78,6 +78,25 @@ > Streaming extra information needed to merge types adds about 2-6% of > memory size and object size increase. This can be controlled by > -flto-odr-type-merging. > + Command line optimization and target options are now streamed on > + per-function basis and honored by the link-time optimizer. > + This change makes the link-time optimization more tranpsarent > + replacement of the per-file optimization. > + It is now possible to build projects that require different > optimization > + settings for different translation units (such as > + -ffast-math, -mavx, or > -finline). > + Contrary to the earlier GCC releases, the optimization and target > + options passed to the command line invoking linker are ignored. > + Note that this apply only to those command > + line options that can be passed to optimize and > + target attributes. Command line options affecting global > + code generation (such as -fpic), warnings > + (such as -Wodr), optimization affecting way static > variables > + are optimized (such as -fcommon), debug output (such as > + -g) and --param parameters can be applied > only > + to the whole link-time optimization unit. > + It these cases it is recommended to consistently use the same setting > + both at compile-time and link-time. > GCC bootstrap now uses slim LTO object files. 
> Memory usage and link times were improved. Tree merging was sped > up, > memory usage of GIMPLE declarations and types was reduced, and, > How about this for a copy-edited version of the new text? Command-line optimization and target options are now streamed on a per-function basis and honored by the link-time optimizer. This change makes the link-time optimization a more transparent replacement of per-file optimizations. It is now possible to build projects that require different optimization settings for different translation units (such as -ffast-math, -mavx, or -finline). Contrary to the earlier GCC releases, the optimization and target options passed on the link command line are ignored. Note that this applies only to those command-line options that can be passed to optimize and target attributes. Command-line options affecting global code generation (such as -fpic), warnings (such as -Wodr), optimizations affecting the way static variables are optimized (such as -fcommon), debug output (such as -g), and --param parameters can be applied only to the whole link-time optimization unit. In these cases, it is recommended to consistently use the same options at both compile time and link time. -Sandra
New French PO file for 'cpplib' (version 5.1-b20150208)
Hello, gentle maintainer. This is a message from the Translation Project robot. A revised PO file for textual domain 'cpplib' has been submitted by the French team of translators. The file is available at: http://translationproject.org/latest/cpplib/fr.po (This file, 'cpplib-5.1-b20150208.fr.po', has just now been sent to you in a separate email.) All other PO files for your package are available in: http://translationproject.org/latest/cpplib/ Please consider including all of these in your next release, whether official or a pretest. Whenever you have a new distribution with a new version number ready, containing a newer POT file, please send the URL of that distribution tarball to the address below. The tarball may be just a pretest or a snapshot, it does not even have to compile. It is just used by the translators when they need some extra translation context. The following HTML page has been updated: http://translationproject.org/domain/cpplib.html If any question arises, please contact the translation coordinator. Thank you for all your work, The Translation Project robot, in the name of your translation coordinator.
[wwwdocs] Describe the new way command line options are handled with LTO
Hi, this is my honest attempt to explain how command line options works with LTO in current compiler and how to deal with these sanely. It is bit of mess, but an improvement over past releases. We finished the transition to per-function attributes. In GCC 6 I plan to add per-variable counterpart. I wonder if we want to have something in the texinfo too? Honza Index: changes.html === RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-5/changes.html,v retrieving revision 1.90 diff -u -r1.90 changes.html --- changes.html23 Mar 2015 02:00:12 - 1.90 +++ changes.html23 Mar 2015 02:22:51 - @@ -78,6 +78,25 @@ Streaming extra information needed to merge types adds about 2-6% of memory size and object size increase. This can be controlled by -flto-odr-type-merging. + Command line optimization and target options are now streamed on + per-function basis and honored by the link-time optimizer. + This change makes the link-time optimization more tranpsarent + replacement of the per-file optimization. + It is now possible to build projects that require different optimization + settings for different translation units (such as + -ffast-math, -mavx, or -finline). + Contrary to the earlier GCC releases, the optimization and target + options passed to the command line invoking linker are ignored. + Note that this apply only to those command + line options that can be passed to optimize and + target attributes. Command line options affecting global + code generation (such as -fpic), warnings + (such as -Wodr), optimization affecting way static variables + are optimized (such as -fcommon), debug output (such as + -g) and --param parameters can be applied only + to the whole link-time optimization unit. + It these cases it is recommended to consistently use the same setting + both at compile-time and link-time. GCC bootstrap now uses slim LTO object files. Memory usage and link times were improved. Tree merging was sped up, memory usage of GIMPLE declarations and types was reduced, and,
[patch, doc] hyphenate "command-line" when used as an adjective
I've checked in this patch to take care of another documentation glitch from my backlog I spotted while working on something else. "Command line", unhyphenated, is a noun, but when used as an adjective phrase immediately before the noun it modifies, e.g. "command-line option", it should be hyphenated. -Sandra 2015-03-22 Sandra Loosemore gcc/ * doc/cpp.texi (Search Path): Hyphenate "command-line" when used as an adjective. (System Headers): Likewise. (Ifdef): Likewise. (Traditional macros): Likewise. (Invocation): Likewise. (Option Index): Likewise. * doc/cppopts.texi (-M): Likewise. (-finput-charset): Likewise. (--help): Likewise. * doc.invoke.texi (AVR Options): Likewise. (V850 Options): Likewise. Index: gcc/doc/cpp.texi === --- gcc/doc/cpp.texi (revision 221563) +++ gcc/doc/cpp.texi (working copy) @@ -858,7 +858,7 @@ GCC was configured to compile code for; the canonical name of the system it runs on. @var{version} is the version of GCC in use. -You can add to this list with the @option{-I@var{dir}} command line +You can add to this list with the @option{-I@var{dir}} command-line option. All the directories named by @option{-I} are searched, in left-to-right order, @emph{before} the default directories. The only exception is when @file{dir} is already searched by default. In @@ -1147,7 +1147,7 @@ Normally, only the headers found in spec system headers. These directories are determined when GCC is compiled. There are, however, two ways to make normal headers into system headers. -The @option{-isystem} command line option adds its argument to the list of +The @option{-isystem} command-line option adds its argument to the list of directories to search for headers, just like @option{-I}. Any headers found in that directory will be considered system headers. @@ -3217,11 +3217,11 @@ using a system feature on a machine wher @item Macros can be defined or undefined with the @option{-D} and @option{-U} -command line options when you compile the program. 
You can arrange to +command-line options when you compile the program. You can arrange to compile the same source file into two different programs by choosing a macro name to specify which program you want, writing conditionals to test whether or how this macro is defined, and then controlling the -state of the macro with command line options, perhaps set in the +state of the macro with command-line options, perhaps set in the Makefile. @xref{Invocation}. @item @@ -3904,7 +3904,7 @@ produce a single token. Normally comments are removed from the replacement text after the macro is expanded, but if the @option{-CC} option is passed on the -command line comments are preserved. (In fact, the current +command-line comments are preserved. (In fact, the current implementation removes comments even before saving the macro replacement text, but it careful to do it in such a way that the observed effect is identical even in the function-like macro case.) @@ -4325,7 +4325,7 @@ leaving out the answer: In either form, if no such assertion has been made, @samp{#unassert} has no effect. -You can also make or cancel assertions using command line options. +You can also make or cancel assertions using command-line options. @xref{Invocation}. @node Differences from previous versions @@ -4433,7 +4433,7 @@ file. @emph{Note:} Whether you use the preprocessor by way of @command{gcc} or @command{cpp}, the @dfn{compiler driver} is run first. This program's purpose is to translate your command into invocations of the -programs that do the actual work. Their command line interfaces are +programs that do the actual work. Their command-line interfaces are similar but not identical to the documented interface, and may change without notice. @@ -4513,7 +4513,7 @@ configuration of GCC@. 
@node Option Index @unnumbered Option Index @noindent -CPP's command line options and environment variables are indexed here +CPP's command-line options and environment variables are indexed here without any initial @samp{-} or @samp{--}. @printindex op Index: gcc/doc/cppopts.texi === --- gcc/doc/cppopts.texi (revision 221563) +++ gcc/doc/cppopts.texi (working copy) @@ -194,7 +194,7 @@ suitable for @command{make} describing t source file. The preprocessor outputs one @command{make} rule containing the object file name for that source file, a colon, and the names of all the included files, including those coming from @option{-include} or -@option{-imacros} command line options. +@option{-imacros} command-line options. Unless specified explicitly (with @option{-MT} or @option{-MQ}), the object file name consists of the name of the source file with any @@ -645,7 +645,7 @@ Set the input character set, used for tr set of the input file to the source character set used by GCC@. If the loc
Fix ICEs on ODR violating programs and improve ODR mismatch diagnostic
Hi,
this patch fixes ODR violation warnings on Chromium and Firefox. I went through all of them and found two false positives. It also fixes two ICEs that are triggered by testcases Martin found while delta reducing. I tested the patch on Firefox, Chromium and LibreOffice. Martin gave it a try on Boost and tried it with delta to reproduce more ICEs.

The patch is a bit bloated by code reordering - I moved more readable checks before less readable ones and silenced duplicated warnings about a single type. There is still quite some new code in odr_subtypes_equivalent_p (which was sitting in my tree for a while) and I hope it is the last larger change I needed in this area. Optimally the last fix for GCC 5 :)

The false positives are the following:

- The code in compare_virtual_tables had a typo in skipping RTTI (when mixing -fno-rtti and -frtti code) and it also incorrectly assumed that all functions are DECL_VIRTUAL. This resulted in a bogus warning about RTTI info mismatch on Chromium.
- odr_subtypes_equivalent_p matched types by name, which seems too restrictive. This resulted in a mismatch on the array "uint8_t a[8]". In one unit uint8_t was typedefed and in the other #defined. Because unsigned char and uint8_t have the same mangling and other properties, I do not think this counts as an ODR violation.

While evaluating errors I however found I can not tell what the bug is without actually looking into dumps, so I improved the diagnostics. warn_types_mismatch previously did nothing for types without TYPE_NAME, which covers most pointer, array and method types. I added code that actually compares them and is able to tell which subtype mismatches and why. Another common case was putting types in different namespaces or a #define mismatch.
I added code comparing mangled names and outputting warnings:

../../third_party/zlib/zlib.h:1630:0: note: type name 'MOZ_Z_internal_state' should match type name 'internal_state'
 struct internal_state {int dummy;};
 ^
../../third_party/pdfium/core/src/fxcodec/codec/../.././fxcodec/fx_zlib/zlib_v128/zlib.h:1811:0: note: the incompatible type is defined here
 struct internal_state {int dummy;};
 ^

which hopefully makes it clear that someone got an idea to #define internal_state to MOZ_Z_internal_state and not be consistent. I also added code outputting a sane diagnostic in the case of component types and anonymous types.

There are two annoying problems. The first is that types are often output as struct instead of class, and the other is that location info tends to be wrong. For example in:

gen/blink/core/CSSPropertyNames.cpp:2330:0: warning: type 'struct stringpool_t' violates one definition rule [-Wodr]
 static const short lookup[] =
 ^
gen/blink/core/CSSValueKeywords.cpp:4309:0: note: a different type is defined in another translation unit
 static const short lookup[] =
 ^
gen/blink/core/CSSPropertyNames.cpp:2330:0: note: the first difference of corresponding definitions is field 'stringpool_str0'
 static const short lookup[] =
 ^
gen/blink/core/CSSValueKeywords.cpp:4309:0: note: a field of same name but different type is defined in another translation unit
 static const short lookup[] =
 ^

This points after the actual definition to the next statement. I will try to produce a small testcase and work out where the location gets garbled - it is wrong at the FIELD_DECL itself, which I do not think can be mismerged by tree merging, so it seems to be a bug somewhere earlier. If the location info were correct, there would be a clear mismatch visible in the array bounds of the stringpool_str0 array.

For Chromium the errors output are now as follows. Comments are welcome (though I would like to avoid too many changes now).
../../third_party/ffmpeg/libavcodec/avcodec.h:1134:0: warning: type 'struct AVPacket' violates one definition rule [-Wodr] typedef struct AVPacket { ^ ../../third_party/ffmpeg/libavcodec/avcodec.h:1134:0: note: a different type is defined in another translation unit typedef struct AVPacket { ^ ../../third_party/ffmpeg/libavcodec/avcodec.h:1182:0: note: the first difference of corresponding definitions is field 'pos' int64_t pos;///< byte position in stream, -1 if unknown ^ ../../third_party/ffmpeg/libavcodec/avcodec.h:1178:0: note: a field with different name is defined in another translation unit void (*destruct)(struct AVPacket *); ^ ../../third_party/ffmpeg/libavcodec/avcodec.h:1236:0: warning: type 'struct AVCodecContext' violates one definition rule [-Wodr] typedef struct AVCodecContext { ^ ../../third_party/ffmpeg/libavcodec/avcodec.h:1236:0: note: a different type is defined in another translation unit typedef struct AVCodecContext { ^ ../../third_party/ffmpeg/libavcodec/avcodec.h:2241:0:
Re: [Patch, Fortran, 4.8/4.9/5 Regression] PR59513 READ or WRITE not allowed after EOF
On 03/22/2015 08:47 AM, Janne Blomqvist wrote: On Sat, Mar 21, 2015 at 12:24 AM, Jerry DeLisle wrote: The attached patch allows the attempt to READ or WRITE after an EOF for legacy code. The runtime error is suppressed for -std=legacy and -std=gnu. For standard conformance the error is retained as is now. Since it's a standard violation rather than a GNU extension, I'd prefer if it were enabled only with -std=legacy. Ok with this change. The attached patch adds documentation under 'extensions' in gfortran.texi. Tested with make html. I will commit soon with a ChangeLog entry Regards, Jerry Index: gfortran.texi === --- gfortran.texi (revision 221544) +++ gfortran.texi (working copy) @@ -1404,6 +1404,7 @@ without warning. * OpenMP:: * OpenACC:: * Argument list functions:: +* Read/Write after EOF marker:: @end menu @node Old-style kind specifications @@ -2049,7 +2050,19 @@ For details refer to the g77 manual Also, @code{c_by_val.f} and its partner @code{c_by_val.c} of the GNU Fortran testsuite are worth a look. +@node Read/Write after EOF marker +@subsection Read/Write after EOF marker +@cindex @code{EOF} +@cindex @code{BACKSPACE} +@cindex @code{REWIND} +Some legacy codes rely on allowing @code{READ} or @code{WRITE} after the +EOF file marker in order to find the end of a file. GNU Fortran normally +rejects these codes with a run-time error message and suggests the user +consider @code{BACKSPACE} or @code{REWIND} to properly position +the file before the EOF marker. As an extension, the run-time error may +be disabled using -std=legacy. + @node Extensions not implemented in GNU Fortran @section Extensions not implemented in GNU Fortran @cindex extensions, not implemented
Re: [PATCH] Speed-up IPA ICF by enhanced hash values
On 03/19/2015 09:42 PM, Jan Hubicka wrote:
>
>> diff --git a/gcc/ipa-icf.c b/gcc/ipa-icf.c
>> index f68d23c..b8e3aa4 100644
>> --- a/gcc/ipa-icf.c
>> +++ b/gcc/ipa-icf.c
>> @@ -557,6 +557,69 @@ sem_function::equals_wpa (sem_item *item,
>>    return true;
>>  }
>>
>> +/* Update hash by address sensitive references.  */
>> +
>> +void
>> +sem_item::update_hash_by_addr_refs (hash_map <symtab_node *,
>> +                                    sem_item *> &m_symtab_node_map)
>> +{
>> +  if (is_a <varpool_node *> (node) && DECL_VIRTUAL_P (node->decl))
>
> Do not return early here, if reference goes to external symbol, we still want
> to record it as sensitive.  ref->address_matters_p () should behave well here.
>
>> +    return;
>> +
>> +  ipa_ref* ref;
>> +  inchash::hash hstate (hash);
>> +  for (unsigned i = 0; i < node->num_references (); i++)
>> +    {
>> +      ref = node->iterate_reference (i, ref);
>> +      if (ref->address_matters_p ())
>
> ref->address_matters_p () || !m_symtab_node_map.get (ref->referred)
>
> if reference goes to external symbol, it behaves like sensitive.
>
> You probably want to update toplevel comment explaining what is sensitive and
> local reference and how the hashing is handled.
>
>> +        hstate.add_ptr (ref->referred->ultimate_alias_target ());
>> +    }
>> +
>> +  if (is_a <cgraph_node *> (node))
>> +    {
>> +      for (cgraph_edge *e = dyn_cast <cgraph_node *> (node)->callers; e;
>> +           e = e->next_caller)
>> +        {
>> +          sem_item **result = m_symtab_node_map.get (e->callee);
>> +          if (!result)
>> +            hstate.add_ptr (e->callee->ultimate_alias_target ());
>> +        }
>> +    }
>> +
>> +  hash = hstate.end ();
>> +}
>> +
>> +/* Update hash by computed local hash values taken from different
>> +   semantic items.  */
>
> Please add TODO that stronger SCC based hashing would be desirable here.
>
>> @@ -2301,6 +2364,19 @@ sem_item_optimizer::add_item_to_class (congruence_class *cls, sem_item *item)
>>    item->cls = cls;
>>  }
>>
>> +void
>> +sem_item_optimizer::update_hash_by_addr_refs ()
>
> Add block comment and explain why the addr and local updates can not be
> performed at once or someone gets an idea to merge the loops.
>
>> +{
>> +  for (unsigned i = 0; i < m_items.length (); i++)
>> +    m_items[i]->update_hash_by_addr_refs (m_symtab_node_map);
>> +
>> +  for (unsigned i = 0; i < m_items.length (); i++)
>> +    m_items[i]->update_hash_by_local_refs (m_symtab_node_map);
>> +
>> +  for (unsigned i = 0; i < m_items.length (); i++)
>> +    m_items[i]->hash = m_items[i]->global_hash;
>> +}
>> +
>
> OK with these changes.
> Honza
>

Hello.

Enhanced version of the patch I'm going to install, if no additional notes.

Thanks,
Martin

From 2823fe3eec49b341afacb57ecb99aeae7c35bd07 Mon Sep 17 00:00:00 2001
From: mliska
Date: Fri, 20 Mar 2015 18:00:40 +0100
Subject: [PATCH] IPA ICF: include hash values of references.

gcc/ChangeLog:

2015-03-15  Martin Liska

        * ipa-icf.c (sem_item::update_hash_by_addr_refs): New function.
        (sem_item::update_hash_by_local_refs): Likewise.
        (sem_variable::get_hash): Empty line is fixed.
        (sem_item_optimizer::execute): Include adding of hash references.
        (sem_item_optimizer::update_hash_by_addr_refs): New function.
        (sem_item_optimizer::build_hash_based_classes): Use local hash.
        * ipa-icf.h (sem_item::update_hash_by_addr_refs): New function.
        (sem_item::update_hash_by_local_refs): Likewise.
---
 gcc/ipa-icf.c | 93 ---
 gcc/ipa-icf.h | 18 +++-
 2 files changed, 106 insertions(+), 5 deletions(-)

diff --git a/gcc/ipa-icf.c b/gcc/ipa-icf.c
index 360cf17..3cf1261 100644
--- a/gcc/ipa-icf.c
+++ b/gcc/ipa-icf.c
@@ -557,6 +557,72 @@ sem_function::equals_wpa (sem_item *item,
   return true;
 }

+/* Update hash by address sensitive references. We iterate over all
+   sensitive references (address_matters_p) and we hash ultime alias
+   target of these nodes, which can improve a semantic item hash.
+   TODO: stronger SCC based hashing would be desirable here.  */
+
+void
+sem_item::update_hash_by_addr_refs (hash_map <symtab_node *,
+                                    sem_item *> &m_symtab_node_map)
+{
+  if (is_a <varpool_node *> (node) && DECL_VIRTUAL_P (node->decl))
+    return;
+
+  ipa_ref* ref;
+  inchash::hash hstate (hash);
+  for (unsigned i = 0; i < node->num_references (); i++)
+    {
+      ref = node->iterate_reference (i, ref);
+      if (ref->address_matters_p () || !m_symtab_node_map.get (ref->referred))
+        hstate.add_ptr (ref->referred->ultimate_alias_target ());
+    }
+
+  if (is_a <cgraph_node *> (node))
+    {
+      for (cgraph_edge *e = dyn_cast <cgraph_node *> (node)->callers; e;
+           e = e->next_caller)
+        {
+          sem_item **result = m_symtab_node_map.get (e->callee);
+          if (!result)
+            hstate.add_ptr (e->callee->ultimate_alias_target ());
+        }
+    }
+
+  hash = hstate.end ();
+}
+
+/* Update hash by computed local hash values taken from different
+   semantic items.  */
+
+void
+sem_item::update_hash_by_local_refs (hash_map <symtab_node *,
+                                     sem_item *> &m_symtab_node_map)
+{
+  inchash::hash state (hash);
+  for (unsigned j = 0; j < node->num_r
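The review asks for a comment explaining why the address-reference update and the local-reference update cannot be merged into one loop. A minimal sketch of the reason, assuming a toy `Item` type and a toy `mix` function (both hypothetical, not GCC's `sem_item` or `inchash::hash`): the second phase reads the hashes of other items, so every item must have finished the first phase, and the result is written to a scratch field so that no item sees a half-updated neighbour.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical, simplified stand-in for a semantic item; not GCC's sem_item.
struct Item
{
  std::uint32_t hash;                  // local hash, replaced in phase 3
  std::uint32_t global_hash;           // scratch value written by phase 2
  std::vector<std::uint32_t> addr_refs;  // ids of address-sensitive targets
  std::vector<std::size_t> local_refs;   // indices of items referenced locally
};

// Toy order-sensitive mixing step (an assumption for illustration only).
static std::uint32_t
mix (std::uint32_t h, std::uint32_t v)
{
  return (h * 16777619u) ^ v;
}

// Phase 1: fold address-sensitive references into each item's own hash.
static void
update_by_addr_refs (std::vector<Item> &items)
{
  for (Item &it : items)
    for (std::uint32_t id : it.addr_refs)
      it.hash = mix (it.hash, id);
}

// Phase 2: read the phase-1 hashes of referenced items, but write only to
// global_hash, so the outcome does not depend on iteration order.
static void
update_by_local_refs (std::vector<Item> &items)
{
  for (Item &it : items)
    {
      std::uint32_t h = it.hash;
      for (std::size_t idx : it.local_refs)
        {
          assert (idx < items.size ());
          h = mix (h, items[idx].hash);
        }
      it.global_hash = h;
    }
}

// Phase 3: commit the scratch values.
static void
commit (std::vector<Item> &items)
{
  for (Item &it : items)
    it.hash = it.global_hash;
}

// Run all three phases on a copy and return the final per-item hashes.
static std::vector<std::uint32_t>
run (std::vector<Item> items)
{
  update_by_addr_refs (items);
  update_by_local_refs (items);
  commit (items);
  std::vector<std::uint32_t> out;
  for (const Item &it : items)
    out.push_back (it.hash);
  return out;
}
```

With two items that reference each other, calling `run` on the items in either order should give each item the same final hash; a merged single loop that updated `hash` in place would not have that property.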
Re: [Patch, Fortran, 4.8/4.9/5 Regression] PR59513 READ or WRITE not allowed after EOF
On 03/22/2015 08:47 AM, Janne Blomqvist wrote:
> On Sat, Mar 21, 2015 at 12:24 AM, Jerry DeLisle wrote:
>> The attached patch allows the attempt to READ or WRITE after an EOF for
>> legacy code. The runtime error is suppressed for -std=legacy and -std=gnu.
>> For standard conformance the error is retained as is now.
>
> Since it's a standard violation rather than a GNU extension, I'd
> prefer if it were enabled only with -std=legacy. Ok with this change.

Done. No need for new test case, it is covered by endfile_3.f90 as is.

Sending        ChangeLog
Sending        io/transfer.c
Transmitting file data ..
Committed revision 221572.

Thanks and best regards,

Jerry
Fix ipa-comdats WRT thunks
Hi,
this patch fixes an ICE when building chromium with LTO and with the
ipa-pure-const patch applied. The ipa-pure-const patch exposes a semi-latent
bug in ipa-comdats, which tries to separate a thunk from its target into
different sections. This leads to unrecognized insns on PPC and other targets
that do not support tail calls across comdat sections. Fixed thus.

Bootstrapped/regtested x86_64-linux, will commit it later today.

        PR ipa/65502
        * ipa-comdats.c (enqueue_references): Walk through thunks.
        (ipa_comdats): Likewise.
        (set_comdat_group_1): New function.

Index: ipa-comdats.c
===================================================================
--- ipa-comdats.c       (revision 221568)
+++ ipa-comdats.c       (working copy)
@@ -182,6 +182,10 @@ enqueue_references (symtab_node **first,
   for (i = 0; symbol->iterate_reference (i, ref); i++)
     {
       symtab_node *node = ref->referred->ultimate_alias_target ();
+
+      /* Always keep thunks in same sections as target function.  */
+      if (is_a <cgraph_node *> (node))
+        node = dyn_cast <cgraph_node *> (node)->function_symbol ();
       if (!node->aux && node->definition)
         {
           node->aux = *first;
@@ -199,6 +203,10 @@ enqueue_references (symtab_node **first,
       else
         {
           symtab_node *node = edge->callee->ultimate_alias_target ();
+
+          /* Always keep thunks in same sections as target function.  */
+          if (is_a <cgraph_node *> (node))
+            node = dyn_cast <cgraph_node *> (node)->function_symbol ();
           if (!node->aux && node->definition)
             {
               node->aux = *first;
@@ -209,7 +217,7 @@ enqueue_references (symtab_node **first,
 }

 /* Set comdat group of SYMBOL to GROUP.
-   Callback for symtab_for_node_and_aliases.  */
+   Callback for for_node_and_aliases.  */

 bool
 set_comdat_group (symtab_node *symbol,
@@ -223,6 +231,16 @@ set_comdat_group (symtab_node *symbol,
   return false;
 }

+/* Set comdat group of SYMBOL to GROUP.
+   Callback for for_node_thunks_and_aliases.  */
+
+bool
+set_comdat_group_1 (cgraph_node *symbol,
+                    void *head_p)
+{
+  return set_comdat_group (symbol, head_p);
+}
+
 /* The actual pass with the main dataflow loop.  */

 static unsigned int
@@ -263,7 +281,12 @@ ipa_comdats (void)
                && (DECL_STATIC_CONSTRUCTOR (symbol->decl)
                    || DECL_STATIC_DESTRUCTOR (symbol->decl
          {
-           map.put (symbol->ultimate_alias_target (), error_mark_node);
+           symtab_node *target = symbol->ultimate_alias_target ();
+
+           /* Always keep thunks in same sections as target function.  */
+           if (is_a <cgraph_node *> (target))
+             target = dyn_cast <cgraph_node *> (target)->function_symbol ();
+           map.put (target, error_mark_node);

            /* Mark the symbol so we won't waste time visiting it for dataflow.  */
            symbol->aux = (symtab_node *) (void *) 1;
@@ -332,10 +355,8 @@ ipa_comdats (void)
       symbol->aux = NULL;
       if (!symbol->get_comdat_group ()
           && !symbol->alias
-          /* Thunks to external functions do not need to be categorized.  */
           && (!(fun = dyn_cast <cgraph_node *> (symbol))
-              || !fun->thunk.thunk_p
-              || fun->function_symbol ()->definition)
+              || !fun->thunk.thunk_p)
           && symbol->real_symbol_p ())
         {
           tree *val = map.get (symbol);
@@ -355,9 +376,16 @@ ipa_comdats (void)
                   symbol->dump (dump_file);
                   fprintf (dump_file, "To group: %s\n", IDENTIFIER_POINTER (group));
                 }
-              symbol->call_for_symbol_and_aliases (set_comdat_group,
-                                                   *comdat_head_map.get (group),
-                                                   true);
+              if (is_a <cgraph_node *> (symbol))
+                dyn_cast <cgraph_node *> (symbol)->call_for_symbol_and_aliases
+                  (set_comdat_group_1,
+                   *comdat_head_map.get (group),
+                   true);
+              else
+                symbol->call_for_symbol_and_aliases
+                  (set_comdat_group,
+                   *comdat_head_map.get (group),
+                   true);
             }
         }
   return 0;
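The idea of keeping a thunk in the same section as its target can be illustrated outside GCC. The following is only a sketch under stated assumptions: `Sym`, `function_symbol` and `set_group_with_thunks` are hypothetical simplified names, not the `cgraph_node` API, and the comdat group is modelled as a plain string.

```cpp
#include <cassert>
#include <string>

// Hypothetical, simplified symbol node; not GCC's cgraph_node.
struct Sym
{
  std::string name;
  bool is_thunk = false;
  Sym *thunk_target = nullptr;  // set only when is_thunk is true
  std::string comdat_group;     // empty means "no group assigned yet"
};

// Follow a chain of thunks down to the real function, similar in spirit
// to resolving a thunk to its function symbol before enqueueing it.
static Sym *
function_symbol (Sym *s)
{
  while (s->is_thunk)
    {
      assert (s->thunk_target != nullptr);
      s = s->thunk_target;
    }
  return s;
}

// Assign GROUP to the function that S resolves to and to every thunk on
// the way, so a thunk never lands in a different group than its target.
static void
set_group_with_thunks (Sym *s, const std::string &group)
{
  for (Sym *t = s; t->is_thunk; t = t->thunk_target)
    t->comdat_group = group;
  function_symbol (s)->comdat_group = group;
}
```

For a thunk `t` whose target is `f`, calling `set_group_with_thunks (&t, "g1")` leaves both symbols in group `g1`, which is the invariant the patch is after for targets that cannot tail call across comdat sections.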
Update Danny Smith's entry in contrib.texi
This is something I meant to do back in 2010. Better late than never, I guess.

Gerald

2015-03-22  Dave Korn
            Gerald Pfeifer

        * doc/contrib.texi (Contributors): Update entry for Danny Smith.

Index: doc/contrib.texi
===================================================================
--- doc/contrib.texi    (revision 221569)
+++ doc/contrib.texi    (working copy)
@@ -881,6 +881,8 @@
 @item
 Danny Smith for his major efforts on the Mingw (and Cygwin) ports.
+Retired from GCC maintainership August 2010, having mentored two
+new maintainers into the role.

 @item
 Randy Smith finished the Sun FPA support.
Re: [Patch, Fortran, 4.8/4.9/5 Regression] PR59513 READ or WRITE not allowed after EOF
On Sat, Mar 21, 2015 at 12:24 AM, Jerry DeLisle wrote:
> The attached patch allows the attempt to READ or WRITE after an EOF for
> legacy code. The runtime error is suppressed for -std=legacy and -std=gnu.
> For standard conformance the error is retained as is now.

Since it's a standard violation rather than a GNU extension, I'd
prefer if it were enabled only with -std=legacy. Ok with this change.

--
Janne Blomqvist
Re: [Patch, Fortran, 4.8/4.9/5 Regression] PR59513 READ or WRITE not allowed after EOF
Dear Jerry,

IMO the patch is in the obvious range, but it needs to document the extension and maybe add a test case.

Cheers,

Dominique
Re: [Ping, Patch 2/2, v3, Fortran, pr60322 a.o.] [OOP] Incorrect bounds on polymorphic dummy array
Dear Andre,

If I am not mistaken, this patch makes the following test (pr57305, second attachment):

subroutine add_element_poly(a,e)
  use iso_c_binding
  class(*),allocatable,intent(inout),target :: a(:)
  class(*),intent(in),target :: e
  class(*),allocatable,target :: tmp(:)
  type(c_ptr) :: dummy

  interface
    function memcpy(dest,src,n) bind(C,name="memcpy") result(res)
      import
      type(c_ptr) :: res
      integer(c_intptr_t),value :: dest
      integer(c_intptr_t),value :: src
      integer(c_size_t),value :: n
    end function
  end interface

  if (.not.allocated(a)) then
    allocate(a(1), source=e)
  else
    allocate(tmp(size(a)),source=a)
    deallocate(a)
    allocate(a(size(tmp)+1),mold=e)
    dummy = memcpy(loc(a(1)),loc(tmp),sizeof(tmp))
    dummy = memcpy(loc(a(size(tmp)+1)),loc(e),sizeof(e))
  end if
end subroutine

give an ICE (a regression):

pr57305_1.f90:24:0:

     dummy = memcpy(loc(a(1)),loc(tmp),sizeof(tmp))
 1
internal compiler error: Segmentation fault: 11

The ICE is caused by sizeof(tmp).

TIA

Dominique
Re: [patch, nios2] implement TARGET_ASM_OUTPUT_MI_THUNK
> The nios2 back end didn't previously implement TARGET_ASM_OUTPUT_MI_THUNK.

Then backends.html needs to be adjusted, done thusly, installed.

--
Eric Botcazou

Index: backends.html
===================================================================
RCS file: /cvs/gcc/wwwdocs/htdocs/backends.html,v
retrieving revision 1.65
diff -u -r1.65 backends.html
--- backends.html       21 Jan 2015 09:13:49 -0000      1.65
+++ backends.html       22 Mar 2015 10:51:58 -0000
@@ -95,7 +95,7 @@
 moxie | F g t s
 msp430 |L FIlb gs
 nds32 | F Cia s
-nios2 | S C
+nios2 | S Ci
 nvptx | S Q Cq mg e
 pa | ? Q CBD qr b i e
 pdp11 |L ICqrc b e
Re: [Patch, Fortran] Reject unsupported coarray communication
Dear Tobias, The test gfortran.dg/coarray/coindexed_3.f90 compiles without error, see https://gcc.gnu.org/ml/gcc-testresults/2015-03/msg02446.html. TIA Dominique