[google] New fdo summary-based icache sensitive unrolling (issue6282045)
This patch adds new program summary information to the gcov profile files
that indicates how many profiled counts compose the majority of the
program's execution time. This is used to provide an indication of the
overall code size of the program.

The new profile summary information is then used to guide codesize based
unroll and peel decisions, to prevent those optimizations from increasing
code size too much when the program may be sensitive to icache effects.

Previously this was done via code size estimates computed during tree
optimization from dynamic IPA (LIPO) partial call graphs, and thus only
kicked in for LIPO builds in large module groupings. The new method allows
the heuristic to be applied for regular FDO compiles as well, and is
globally applied to all FDO compiled objects. The old LIPO-specific portion
of the previous approach is reverted.

Bootstrapped and tested on x86_64-unknown-linux-gnu. Ok for google branches?

Thanks,
Teresa

2012-06-01  Teresa Johnson  <tejohn...@google.com>

	* libgcc/libgcov.c (sort_by_reverse_gcov_value): New function.
	(gcov_compute_cutoff_values): Ditto.
	(gcov_merge_gcda_file): Merge new summary information.
	(gcov_exit_init): Call gcov_compute_cutoff_values.
	* gcc/doc/invoke.texi: Rename -fripa-peel-size-limit and
	-fripa-unroll-size-limit to -fpeel-codesize-limit and
	-funroll-codesize-limit, respectively. Remove
	codesize-hotness-threshold param and add
	unrollpeel-hotness-threshold param.
	* gcc/gcov-io.c (gcov_write_summary): Write new summary info.
	(gcov_read_summary): Read new summary info.
	* gcc/gcov-io.h (GCOV_TAG_SUMMARY_LENGTH): Update for new summary info.
	(struct gcov_ctr_summary): Add new summary info: sum_cutoff_percent
	and num_to_cutoff.
	* gcc/loop-unroll.c (limit_code_size): Update to use new summary info
	and refine for very hot loops.
	(decide_unrolling_and_peeling, decide_unroll_runtime_iterations): Ditto.
	(decide_peel_simple, decide_unroll_stupid): Ditto.
	* gcc/coverage.c (read_counts_file): Propagate new summary info.
	* gcc/common.opt: Rename -fripa-peel-size-limit and
	-fripa-unroll-size-limit to -fpeel-codesize-limit and
	-funroll-codesize-limit, respectively.
	* gcc/tree-optimize.c (cgraph_codesize_estimate): Remove.
	(compute_codesize_estimate): Remove.
	(execute_cleanup_cfg_post_optimizing): Remove call to
	compute_codesize_estimate.
	* gcc/params.def (PARAM_UNROLLPEEL_CODESIZE_THRESHOLD): Remove.
	(PARAM_UNROLLPEEL_HOTNESS_THRESHOLD): Add.
	* gcc/gcov-dump.c (tag_summary): Dump new summary info.

Index: libgcc/libgcov.c
===================================================================
--- libgcc/libgcov.c	(revision 188119)
+++ libgcc/libgcov.c	(working copy)
@@ -795,6 +795,104 @@ gcov_sort_topn_counter_arrays (const struct gcov_i
     }
 }
 
+/* Used by qsort to sort gcov values in descending order.  */
+
+static int
+sort_by_reverse_gcov_value (const void *pa, const void *pb)
+{
+  const gcov_type a = *(gcov_type const *)pa;
+  const gcov_type b = *(gcov_type const *)pb;
+
+  if (b > a)
+    return 1;
+  else if (b == a)
+    return 0;
+  else
+    return -1;
+}
+
+/* Determines the number of counters required to cover a given percentage
+   of the total sum of execution counts in the summary, which is then also
+   recorded in SUM.  */
+
+static void
+gcov_compute_cutoff_values (struct gcov_summary *sum)
+{
+  struct gcov_info *gi_ptr;
+  const struct gcov_fn_info *gfi_ptr;
+  const struct gcov_ctr_info *ci_ptr;
+  struct gcov_ctr_summary *cs_ptr;
+  unsigned t_ix, f_ix, i, ctr_info_ix, index;
+  gcov_unsigned_t c_num;
+  gcov_type *value_array;
+  gcov_type cum, cum_cutoff;
+  char *cutoff_str;
+  unsigned cutoff_perc;
+
+#define CUM_CUTOFF_PERCENT_TIMES_10 999
+  cutoff_str = getenv ("GCOV_HOTCODE_CUTOFF_TIMES_10");
+  if (cutoff_str && strlen (cutoff_str))
+    cutoff_perc = atoi (cutoff_str);
+  else
+    cutoff_perc = CUM_CUTOFF_PERCENT_TIMES_10;
+
+  for (t_ix = 0; t_ix < GCOV_COUNTERS_SUMMABLE; t_ix++)
+    {
+      /* First check if there are any counts recorded for this counter.  */
+      cs_ptr = &(sum->ctrs[t_ix]);
+      if (!cs_ptr->num)
+        continue;
+
+      /* Determine the cumulative counter value at the specified cutoff
+         percentage and record the percentage for use by gcov consumers.  */
+      cum_cutoff = (cs_ptr->sum_all * cutoff_perc)/1000;
+      cs_ptr->sum_cutoff_percent = cutoff_perc;
+
+      /* Next, walk through all the per-object structures and save each of
+         the count values in value_array.  */
+      index = 0;
+      value_array = (gcov_type *) malloc (sizeof(gcov_type)*cs_ptr->num);
+      for (gi_ptr = __gcov_list; gi_ptr; gi_ptr = gi_ptr->next)
+        {
+          /* Find the appropriate index into the gcov_ctr_info array
+             for the counter we are currently
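
The patch text is cut off before the loop that actually derives the cutoff count, so the following is only a minimal sketch of the idea described in the comment on gcov_compute_cutoff_values: sort the counter values in descending order and count how many are needed before their running sum reaches the cutoff fraction of the total. The names counter_t, by_descending_value and num_to_cutoff are hypothetical stand-ins, and the ">=" comparison is an assumption rather than something taken from the patch.

```c
#include <stdlib.h>

typedef long long counter_t;	/* hypothetical stand-in for gcov_type */

/* qsort comparator: larger values sort first.  */
static int
by_descending_value (const void *pa, const void *pb)
{
  const counter_t a = *(const counter_t *) pa;
  const counter_t b = *(const counter_t *) pb;

  return (b > a) - (b < a);
}

/* Return how many of the largest entries of VALUES are needed for their
   running sum to reach CUTOFF_TIMES_10 / 1000 of the total.  VALUES is
   sorted in place.  */
static unsigned
num_to_cutoff (counter_t *values, unsigned n, unsigned cutoff_times_10)
{
  counter_t sum_all = 0, cum = 0, cum_cutoff;
  unsigned i;

  for (i = 0; i < n; i++)
    sum_all += values[i];
  cum_cutoff = (sum_all * cutoff_times_10) / 1000;

  qsort (values, n, sizeof (counter_t), by_descending_value);
  for (i = 0; i < n; i++)
    {
      cum += values[i];
      if (cum >= cum_cutoff)	/* assumed comparison, see lead-in */
	return i + 1;
    }
  return n;
}
```

A small result relative to the number of counters suggests that a few hot counters dominate execution; a large result suggests a flat profile, which is the case where extra code size from unrolling is more likely to hurt the icache.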
Re: [PATCH 2/2] mips: Add R4700 scheduling support
Matt Turner <matts...@gmail.com> writes:
> The R4700 is identical to the R4600 except for the integer and
> floating-point multiplication costs.
>
> See page 4 of http://datasheets.chipdb.org/IDT/MIPS/79RV4700.pdf
>
> 2012-03-24  Matt Turner  <matts...@gmail.com>
>
> 	gcc/
> 	* config/mips/4600.md (r4700_imul_si): New.
> 	(r4700_imul_di): New.
> 	(r4700_fmul_single): New.
> 	(r4700_fmul_double): New.
> 	* config/mips/mips-cpus.def: Add r4700.
> 	* config/mips/mips.c: Likewise.
> 	* config/mips/mips.md: Likewise.
> 	* config/mips/mips-tables.opt: Regenerate.

Applied, thanks.

Richard
RFA: Alternative iterator implementation
As discussed in the context of the AARCH64 submission, this patch rewrites
the iterator handling in read-rtl.c so that we record iterator positions
using an on-the-side VEC rather than placeholder modes and codes. We then
substitute in-place for each sequence of iterator values and take a deep
copy of the result. We do any string substitutions during the copy.

The patch also generalises the current use of attributes for rtx modes
((zero_extend:WIDER_MODE ...), etc.) so that the same kind of thing can be
done with future iterator types, including the int iterators. Not sure
whether that'll be useful or not, but it made the patch easier to write.

Tested by making sure that the insn-*.c output didn't change for
x86_64-linux-gnu, except that we now do a better job of retaining #line
information (not seen as a good thing by everybody, I realise). Also made
sure that things like insn-output.c are still generated in the blink of
an eye.

Bootstrapped & regression-tested on x86_64-linux-gnu and i686-pc-linux-gnu.
OK to install?

Richard

gcc/
	* read-rtl.c (mapping): Remove index field. Add current_value field.
	Define heap vectors.
	(iterator_group): Fix long line. Remove num_builtins field and
	uses_iterator fields. Make apply_iterator take a void * parameter.
	(iterator_use, attribute_use): New structures.
	(iterator_traverse_data, BELLWETHER_CODE, bellwether_codes): Delete.
	(current_iterators, iterator_uses, attribute_uses): New variables.
	(uses_mode_iterator_p, uses_code_iterator_p): Delete.
	(apply_mode_iterator, apply_code_iterator): Take a void * parameter.
	(map_attr_string, apply_iterator_to_string): Remove iterator
	and value parameters. Look through all current iterator values
	for a matching attribute.
	(mode_attr_index, apply_mode_maps): Delete.
	(apply_iterator_to_rtx): Replace with...
	(copy_rtx_for_iterators): ...this new function.
	(uses_iterator_p, apply_iterator_traverse): Delete.
	(apply_attribute_uses, add_current_iterators, apply_iterators): New
	functions.
	(add_mapping): Remove index field. Set current_value field.
	(initialize_iterators): Don't set num_builtins and uses_iterator_p
	fields.
	(find_iterator): Delete.
	(record_iterator_use, record_attribute_use): New functions.
	(record_potential_iterator_use): New function.
	(check_code_iterator): Remove handling of bellwether codes.
	(read_rtx): Remove mode maps. Truncate iterator and attribute uses.
	(read_rtx_code, read_nested_rtx, read_rtx_variadic): Remove mode_maps
	parameter. Use the first code iterator value instead of the
	bellwether_codes array. Use record_potential_iterator_use
	for modes.

Index: gcc/read-rtl.c
===================================================================
--- gcc/read-rtl.c	2012-06-03 08:58:32.251211521 +0100
+++ gcc/read-rtl.c	2012-06-03 09:20:47.633208254 +0100
@@ -41,7 +41,7 @@ struct map_value {
 };
 
 /* Maps an iterator or attribute name to a list of (integer, string) pairs.
-   The integers are mode or code values; the strings are either C conditions
+   The integers are iterator values; the strings are either C conditions
    or attribute values.  */
 struct mapping {
   /* The name of the iterator or attribute.  */
@@ -50,82 +50,80 @@ struct mapping {
   /* The group (modes or codes) to which the iterator or attribute belongs.  */
   struct iterator_group *group;
 
-  /* Gives a unique number to the attribute or iterator.  Numbers are
-     allocated consecutively, starting at 0.  */
-  int index;
-
   /* The list of (integer, string) pairs.  */
   struct map_value *values;
+
+  /* For iterators, records the current value of the iterator.  */
+  struct map_value *current_value;
 };
 
-/* A structure for abstracting the common parts of code and mode iterators.  */
+/* Vector definitions for the above.  */
+typedef struct mapping *mapping_ptr;
+DEF_VEC_P (mapping_ptr);
+DEF_VEC_ALLOC_P (mapping_ptr, heap);
+
+/* A structure for abstracting the common parts of iterators.  */
 struct iterator_group {
-  /* Tables of mapping structures, one for attributes and one for iterators.  */
+  /* Tables of mapping structures, one for attributes and one for
+     iterators.  */
   htab_t attrs, iterators;
 
-  /* The number of real modes or codes (and by extension, the first
-     number available for use as an iterator placeholder).  */
-  int num_builtins;
-
-  /* Treat the given string as the name of a standard mode or code and
+  /* Treat the given string as the name of a standard mode, etc., and
      return its integer value.  */
   int (*find_builtin) (const char *);
 
-  /* Return true if the given rtx uses the given mode or code.  */
-  bool (*uses_iterator_p) (rtx, int);
+  /* Make the given pointer use the given iterator value.  */
+  void (*apply_iterator) (void *,
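
The approach described in the message (record where each iterator is used on the side, substitute in place for every combination of values, then copy the result) can be illustrated with a toy sketch. This is not the read-rtl.c code: struct pattern, struct iterator, struct iterator_use and expand are hypothetical names, a two-field struct stands in for an rtx, a plain struct assignment stands in for the deep copy, and every iterator is expanded whether or not it is used.

```c
#define MAX_VALUES 4

/* Hypothetical stand-in for an rtx: just two integer fields.  */
struct pattern { int mode; int code; };

/* Hypothetical iterator: a list of values and the one currently selected.  */
struct iterator
{
  int values[MAX_VALUES];
  int num_values;
  int current;		/* index into VALUES */
};

/* One recorded use: the location to overwrite and the iterator to apply.  */
struct iterator_use
{
  struct iterator *iter;
  int *field;
};

/* Expand TMPL for every combination of ITERS[LEVEL .. NUM_ITERS - 1].
   For each combination, substitute the current values in place through
   USES and store a copy in OUT.  Return the updated entry count.  */
static int
expand (struct pattern *tmpl, struct iterator **iters, int num_iters,
	int level, const struct iterator_use *uses, int num_uses,
	struct pattern *out, int count)
{
  int i;

  if (level == num_iters)
    {
      for (i = 0; i < num_uses; i++)
	*uses[i].field = uses[i].iter->values[uses[i].iter->current];
      out[count] = *tmpl;	/* struct copy stands in for a deep copy */
      return count + 1;
    }

  for (i = 0; i < iters[level]->num_values; i++)
    {
      iters[level]->current = i;
      count = expand (tmpl, iters, num_iters, level + 1,
		      uses, num_uses, out, count);
    }
  return count;
}
```

Because the template is overwritten in place and copied afterwards, no placeholder values ever need to be stored in the template itself, which is the point of keeping the uses in a separate vector.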
Re: [Fortran, patch] PR 48831 - Constant expression (PARAMETER array element) rejected as nonconstant
Hi all,

attached is the patch, which includes the review comments provided by
Tobias. The patch is bootstrapped and tested on x86_64-unknown-linux-gnu.

Regards.

Alessandro

2012/5/20 Tobias Burnus <bur...@net-b.de>:
> Hi Alessandro,
>
> Alessandro Fanfarillo wrote:
>> in attachment there's a patch for PR 48831, it also includes a new test
>> case suggested by Tobias Burnus.
>> The patch is bootstrapped and tested on x86_64-unknown-linux-gnu.
>
> Please try to ensure that your patch has a text mime type - it shows up
> as Content-Type: application/octet-stream; which makes reading, reviewing
> and quoting your patch more difficult.
>
>> PR fortran/48831
>> * gfortran.h: Add non-static prototype declaration of check_init_expr
>> function.
>> * check.c (kind_check): Change if condition related to check_init_expr.
>> * expr.c: Remove prototype declaration of check_init_expr function and
>> static keyword.
>
> You should add the name of the function you change in parentheses, e.g.
>
> * gfortran.h (check_init_expr): Add prototype declaration of function.
>
> (The "non-static" is superfluous as static functions shouldn't be in
> header files.)
>
> For check_init_expr I'd use "Remove forward declaration" instead of
> "Remove prototype declaration" but that's personal style. But again, you
> should include the function name in parentheses. The reason is that one
> can more quickly find it, if it is always at the same spot.
>
> As mentioned before, the gfortran convention is to prefix functions
> (gfc_) - at least those which are nonstatic. Please change the function
> name.
>
>> - if (k->expr_type != EXPR_CONSTANT)
>> + if (check_init_expr(k) != SUCCESS)
>
> GNU style: Add a space before the "(" of the function argument:
> "check_init_expr (k)".
>
>> +/* Check an intrinsic arithmetic operation to see if it is consistent
>> + with some type of expression. */
>> +gfc_try check_init_expr (gfc_expr *);
>
> I have to admit that after reading only the comment, I had no idea what
> the function does - especially the "some type" is not really helpful.
> How about a simple "Check whether an expression is an
> initialization/constant expression."? Initialization and constant
> expressions are well defined in the Fortran standard. (Actually, I find
> the function name speaks already for itself, thus, I do not see the need
> for a comment, but I also do not mind a comment.)
>
> (One problem with the name "constant expression" vs. "initialization
> expression" is that Fortran 90/95 distinguish between them while Fortran
> 2003/2008 have merged them to a single type of expression; Fortran 2003
> calls the merged expression type "initialization expression" while
> Fortran 2008 calls them "constant expressions". In principle, gfortran
> should make the distinction with -std=f95 and reject expressions which
> are nonconstant and only initexpressions when the standard demands it,
> but I am not sure whether gfortran does. That part of gfortran is a bit
> unclean and the distinction between init/const expr is nowadays largely
> ignored by the gfortran developers.)
>
> Otherwise, the patch looks OK.
>
> Tobias

2012-06-03  Alessandro Fanfarillo  <fanfarillo@gmail.com>
	    Tobias Burnus  <bur...@net-b.de>

	PR fortran/48831
	* gfortran.h (check_init_expr): Add prototype declaration of function.
	* check.c (kind_check): Change if condition related to check_init_expr.
	* expr.c (check_init_expr): Remove forward declaration and static
	keyword. Change name in gfc_check_init_expr.

2012-06-03  Alessandro Fanfarillo  <fanfarillo@gmail.com>

	PR fortran/48831
	* gfortran.dg/parameter_array_element_2.f90: New.

Index: gcc/fortran/expr.c
===================================================================
--- gcc/fortran/expr.c	(revisione 188147)
+++ gcc/fortran/expr.c	(copia locale)
@@ -1943,12 +1943,6 @@
 }
 
 
-/* Check an intrinsic arithmetic operation to see if it is consistent
-   with some type of expression.  */
-
-static gfc_try check_init_expr (gfc_expr *);
-
-
 /* Scalarize an expression for an elemental intrinsic call.  */
 
 static gfc_try
@@ -1994,7 +1988,7 @@
   for (; a; a = a->next)
     {
       /* Check that this is OK for an initialization expression.  */
-      if (a->expr && check_init_expr (a->expr) == FAILURE)
+      if (a->expr && gfc_check_init_expr (a->expr) == FAILURE)
         goto cleanup;
 
       rank[n] = 0;
@@ -2231,7 +2225,7 @@
   gfc_actual_arglist *ap;
 
   for (ap = e->value.function.actual; ap; ap = ap->next)
-    if (check_init_expr (ap->expr) == FAILURE)
+    if (gfc_check_init_expr (ap->expr) == FAILURE)
       return MATCH_ERROR;
 
   return MATCH_YES;
@@ -2319,7 +2313,7 @@
                      ap->expr->where);
           return MATCH_ERROR;
         }
-      else if (not_restricted && check_init_expr (ap->expr) == FAILURE)
+      else if (not_restricted && gfc_check_init_expr (ap->expr) == FAILURE)
         return MATCH_ERROR;
 
       if (not_restricted == 0
@@
Re: [Fortran, DRAFT patch] PR 46321 - [OOP] Polymorphic deallocation
> Right, the problem is that the _free component is missing. Just as the
> _copy component, _free should be present for *every* vtype, no matter if
> there are allocatable components or not. If the _free component is not
> needed, it should be initialized to EXPR_NULL.

With an empty _free function for every type which does not have allocatable
components the problem with dynamic_dispatch_4.f03 disappears :), thank you
very much.

In the afternoon I'll reorganize the code.

Bye.

Alessandro
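
As a loose C analogy of the point discussed above (not gfortran code), the sketch below gives every type a deallocation slot in its per-type table, so that polymorphic deallocation can always call through the table without first checking whether the slot exists. All names (struct vtype, free_components, point, buffer, deallocate) are hypothetical. The sketch uses the empty-function variant Alessandro describes; the alternative mentioned in the quoted text, a null slot that the caller tests, is not shown.

```c
#include <stdlib.h>

/* Hypothetical per-type table, loosely analogous to a vtype.  */
struct vtype
{
  void (*free_components) (void *obj);
};

struct point { int x, y; };	/* no allocatable components */
struct buffer { int *data; };	/* one allocatable component */

/* Empty routine for a type with nothing to release.  */
static void
point_free (void *obj)
{
  (void) obj;
}

/* Release the allocatable component and mark it as deallocated.  */
static void
buffer_free (void *obj)
{
  struct buffer *b = obj;

  free (b->data);
  b->data = NULL;
}

static const struct vtype point_vtype = { point_free };
static const struct vtype buffer_vtype = { buffer_free };

/* Polymorphic deallocation: the slot is present for every type, so the
   call needs no special case.  */
static void
deallocate (const struct vtype *vt, void *obj)
{
  vt->free_components (obj);
}
```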
Document sincos standard pattern name
Hi,

The attached patch adds some documentation for the sincos standard
pattern name.
Tested with 'make info dvi pdf'.
Is the text correct and OK to apply?
Maybe it would also make sense to apply it to the 4.7 branch?

Cheers,
Oleg

ChangeLog:

	* gcc/doc/md.texi (Standard Pattern Names For Generation): Document
	sincos pattern.

Index: gcc/doc/md.texi
===================================================================
--- gcc/doc/md.texi	(revision 188148)
+++ gcc/doc/md.texi	(working copy)
@@ -4795,6 +4795,22 @@
 built-in function uses the mode which corresponds to the C data
 type @code{float}.
 
+@cindex @code{sincos@var{m}3} instruction pattern
+@item @samp{sincos@var{m}3}
+Store the sine of operand 2 into operand 0 and the cosine of
+operand 2 into operand 1.
+
+The @code{sin} and @code{cos} built-in functions of C always use the
+mode which corresponds to the C data type @code{double} and the
+@code{sinf} and @code{cosf} built-in function use the mode which
+corresponds to the C data type @code{float}.
+Targets that can calculate the sine and cosine simultaneously can
+implement this pattern as opposed to implementing individual
+@code{sin@var{m}2} and @code{cos@var{m}2} patterns.  The @code{sin}
+and @code{cos} built-in functions will then be expanded to the
+@code{sincos@var{m}3} pattern, with one of the output values
+left unused.
+
 @cindex @code{exp@var{m}2} instruction pattern
 @item @samp{exp@var{m}2}
 Store the exponential of operand 1 into operand 0.
Re: [gcov] a few improvements
On 06/03/12 05:51, Xinliang David Li wrote:
> On Fri, Dec 30, 2011 at 10:25 AM, Nathan Sidwell <nat...@acm.org> wrote:
>> I've committed this patch to fix and improve coverage reporting:
>>
>> 1) the time stamp local_tick will be -1 if the user overrides the random
>> seed. In such cases the gcov data file should be deleted, just as it
>> would if the time cannot be determined.
>
> The end result is that after the profile data is used in the compilation
> (with option -fprofile-use -frandom-seed=), it is then deleted. Is this
> intended? This basically makes FDO very hard to use when -frandom-seed
> is used.

IIUC the file is deleted after being used. What is the problem you are
encountering? (if you are having a problem, that suggests others on systems
without local_tick are also having problems)

nathan
Re: [google] Add options to pattern match function name for hotness attributes
Thank you guys for the comments, I'll update the patch to:

1. generalize the flag to enable other annotations such as always_inline.
2. change to use deferred option.

Thanks,
Dehao

On Sun, Jun 3, 2012 at 12:40 PM, Xinliang David Li <davi...@google.com> wrote:
> On Sat, Jun 2, 2012 at 11:11 AM, Jan Hubicka <hubi...@ucw.cz> wrote:
>
> Actually Dehao also plans to teach the static predictor to understand
> standard library functions more (e.g IO functions) and add more naming
>
> How this differ from annotating the library?
>
> I find them more suitable to be compiler heuristic than being function's
> attribute -- attribute is a much stronger assertion.
>
> There are indeed quite some possibilities to do about library calls
> One thing I always wondered about is how to tell predictor that paths
> containing heavy IO functions don't need to be really optimized to death,
> since their execution time is dominated by the IO...
>
> Yes -- if branch predictor does the right thing and if function splitter
> is powerful enough, the IO code can be outlined and optimized for size :)
>
> based heuristics such as 'error, success, failure, fatal etc).
>
> yeah, this is also mentioned by some branch prediction papers. It is bit
> kludgy in nature (i.e. it won't understand programs written in Czech
> language) but it is an heuristics after all.
>
> right.
>
> thanks, David Honza thanks, David Honza thanks, David Honza
[committed] Reduce maximum PCREL17F branch offsets for PIC code
Sometimes when generating PIC code, a call would exceed its maximum
branch offset and the link would fail. This occurs because PIC stubs are
larger than non-PIC stubs. Most recently this occurred building the Debian
kde4libs package.

This change adjusts the maximum branch offsets for PIC code so that the
maximum call density is the same as for non-PIC code.

Tested on hppa-unknown-linux-gnu, hppa2.0w-hp-hpux11.11 and
hppa64-hp-hpux11.11. Committed to trunk.

Dave
-- 
J. David Anglin                                  dave.ang...@nrc-cnrc.gc.ca
National Research Council of Canada              (613) 990-0752 (FAX: 952-6602)

2012-06-03  John David Anglin  <dave.ang...@nrc-cnrc.gc.ca>

	* config/pa/pa.h (MAX_PCREL17F_OFFSET): Define.
	* config/pa/pa.c (pa_attr_length_millicode_call): Use
	MAX_PCREL17F_OFFSET instead of fixed offset.
	(pa_attr_length_call): Likewise.
	(pa_attr_length_indirect_call): Likewise.

Index: config/pa/pa.c
===================================================================
--- config/pa/pa.c	(revision 188142)
+++ config/pa/pa.c	(working copy)
@@ -7457,7 +7457,7 @@
     return 24;
   else
     {
-      if (!TARGET_LONG_CALLS && distance < 240000)
+      if (!TARGET_LONG_CALLS && distance < MAX_PCREL17F_OFFSET)
         return 8;
 
       if (TARGET_LONG_ABS_CALL && !flag_pic)
@@ -7670,7 +7670,7 @@
       /* pc-relative branch.  */
       if (!TARGET_LONG_CALLS
           && ((TARGET_PA_20 && !sibcall && distance < 7600000)
-              || distance < 240000))
+              || distance < MAX_PCREL17F_OFFSET))
         length += 8;
 
       /* 64-bit plabel sequence.  */
@@ -8029,7 +8029,7 @@
   if (TARGET_FAST_INDIRECT_CALLS
       || (!TARGET_PORTABLE_RUNTIME
           && ((TARGET_PA_20 && !TARGET_SOM && distance < 7600000)
-              || distance < 240000)))
+              || distance < MAX_PCREL17F_OFFSET)))
     return 8;
 
   if (flag_pic)
Index: config/pa/pa.h
===================================================================
--- config/pa/pa.h	(revision 188142)
+++ config/pa/pa.h	(working copy)
@@ -1519,3 +1519,12 @@
 #undef TARGET_HAVE_TLS
 #define TARGET_HAVE_TLS true
 #endif
+
+/* The maximum offset in bytes for a PA 1.X pc-relative call to the
+   head of the preceding stub table.  The selected offsets have been
+   chosen so that approximately one call stub is allocated for every
+   86.7 instructions.  A long branch stub is two instructions when
+   not generating PIC code.  For HP-UX and ELF targets, PIC stubs are
+   seven and four instructions, respectively.  */
+#define MAX_PCREL17F_OFFSET \
+  (flag_pic ? (TARGET_HPUX ? 198164 : 221312) : 240000)
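
One way to see why these particular limits give the same call density is the small calculation below. It is only a sketch under assumptions that are not stated in the patch: that a 17-bit word displacement reaches 2^18 bytes, that instructions are 4 bytes, that the non-PIC limit is 240000 bytes, and that the stub table may fill whatever lies between the offset limit and the reach. The names PCREL17F_REACH, INSN_BYTES and code_bytes_per_stub are hypothetical. Under this model the figure that comes out near 86.7 is bytes of code per stub, while the comment in the patch speaks of instructions, so the author's model may well differ.

```c
/* Assumed reach of a 17-bit word displacement, in bytes.  */
#define PCREL17F_REACH (1L << 18)
/* Assumed instruction size in bytes.  */
#define INSN_BYTES 4

/* Bytes of code served per stub if stubs of STUB_INSNS instructions fill
   the space between MAX_OFFSET and the assumed reach.  */
static double
code_bytes_per_stub (long max_offset, int stub_insns)
{
  double stubs = (double) (PCREL17F_REACH - max_offset)
		 / (stub_insns * INSN_BYTES);

  return max_offset / stubs;
}
```

With these assumptions, code_bytes_per_stub (240000, 2), code_bytes_per_stub (198164, 7) and code_bytes_per_stub (221312, 4) all land between 86.6 and 86.8, which is consistent with the three limits having been chosen to keep the density equal.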
Re: [Fortran, patch] PR 48831 - Constant expression (PARAMETER array element) rejected as nonconstant
Thank you Tobias, I thought that "Change name in gfc_check_init_expr" was
sufficient.

2012/6/3 Tobias Burnus <bur...@net-b.de>:
> Hi Alessandro, hi all,
>
> Alessandro Fanfarillo wrote:
>> in attachment the patch which includes the review comments provided by
>> Tobias.
>
> Thanks for the patch, which I committed as Rev. 188152. Congratulations
> on your second committed patch.
>
> Nit: You forgot twice to add the prefix gfc_ in the ChangeLog; I
> corrected it before committal.
>
> * * *
>
> If possible, use -p when you do a diff. With svn, simply pass -x -p
> (or --diff-cmd=diff -x '-p -u'); git does this already by default. [Some
> prefer -c to -u, which is also fine.]
>
> Without the -p flag, the result is:
>
> --- gcc/fortran/check.c (revisione 188147)
> +++ gcc/fortran/check.c
> @@ -163,7 +163,7 @@
>    if (scalar_check (k, n) == FAILURE)
>
> While with -p flag, one gets:
>
> --- gcc/fortran/check.c (Revision 188123)
> +++ gcc/fortran/check.c
> @@ -163,7 +163,7 @@ kind_check (gfc_expr *k, int n, bt type)
>    if (scalar_check (k, n) == FAILURE)
>
> The difference is that the @@ line shows the function name (here:
> kind_check). That information makes it easier to review a patch as one
> then knows more about the context.
>
> Tobias
[wwwdocs] Buildstat update for 4.7
Latest results for 4.7.x

-tgc

Testresults for 4.7.0:
  hppa64-hp-hpux11.11
  i386-pc-solaris2.8
  x86_64-apple-darwin11.3.0

Index: buildstat.html
===================================================================
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-4.7/buildstat.html,v
retrieving revision 1.4
diff -u -r1.4 buildstat.html
--- buildstat.html	4 Apr 2012 20:08:45 -0000	1.4
+++ buildstat.html	3 Jun 2012 14:54:32 -0000
@@ -39,6 +39,14 @@
 </tr>
 
 <tr>
+<td>hppa64-hp-hpux11.11</td>
+<td>&nbsp;</td>
+<td>Test results:
+<a href="http://gcc.gnu.org/ml/gcc-testresults/2012-04/msg00408.html">4.7.0</a>
+</td>
+</tr>
+
+<tr>
 <td>i386-apple-darwin10.8.0</td>
 <td>&nbsp;</td>
 <td>Test results:
@@ -50,6 +58,7 @@
 <td>i386-pc-solaris2.8</td>
 <td>&nbsp;</td>
 <td>Test results:
+<a href="http://gcc.gnu.org/ml/gcc-testresults/2012-05/msg00527.html">4.7.0</a>,
 <a href="http://gcc.gnu.org/ml/gcc-testresults/2012-03/msg02998.html">4.7.0</a>,
 <a href="http://gcc.gnu.org/ml/gcc-testresults/2012-03/msg02619.html">4.7.0</a>
 </td>
@@ -124,6 +133,7 @@
 <td>x86_64-apple-darwin11.3.0</td>
 <td>&nbsp;</td>
 <td>Test results:
+<a href="http://gcc.gnu.org/ml/gcc-testresults/2012-05/msg02122.html">4.7.0</a>,
 <a href="http://gcc.gnu.org/ml/gcc-testresults/2012-03/msg02630.html">4.7.0</a>
 </td>
 </tr>
RE: [Patch,AVR]: Fix PR46261
> -----Original Message-----
> From: Georg-Johann Lay [mailto:a...@gjlay.de]
> Sent: Thursday, May 31, 2012 8:56 AM
> To: gcc-patches@gcc.gnu.org
> Cc: Denis Chertykov; Weddington, Eric; Richard Henderson
> Subject: [Patch,AVR]: Fix PR46261
>
> This fixes ICE on any source compiled with -mint8.
> Missing definition of UINT16_TYPE (defined to 0) crashed the compiler
> when it tries to build wchar stuff.
>
> As mentioned in the PR, I chose to add the stdint stuff as a new file.
>
> The -mint8 part is only lightly tested because there is no test suite for
> the non-C compliant code it generates.
>
> Without -mint8 avr-stdint.h mimics as many as possible definitions from
> newlib-stdint.h but for 2 cases where it deviates:
>
> - SIG_ATOMIC_TYPE is 8-bit, AVR cannot access 16 bits atomically
>
> - [U]INT_FAST8_TYPE is 8-bit because AVR is an 8-bit machine
>
> Ok to install?

Ok. Please commit. If it tests with no regressions, you can backport to 4.7
(as mentioned off-list).

Eric Weddington
[wwwdocs] Buildstat update for 4.6
Latest results for 4.6.x

-tgc

Testresults for 4.6.3:
  i386-pc-solaris2.8
  i386-pc-solaris2.10
  x86_64-unknown-linux-gnu

Index: buildstat.html
===================================================================
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-4.6/buildstat.html,v
retrieving revision 1.11
diff -u -r1.11 buildstat.html
--- buildstat.html	4 Apr 2012 20:01:33 -0000	1.11
+++ buildstat.html	3 Jun 2012 14:58:28 -0000
@@ -108,6 +108,7 @@
 <td>i386-pc-solaris2.8</td>
 <td>&nbsp;</td>
 <td>Test results:
+<a href="http://gcc.gnu.org/ml/gcc-testresults/2012-04/msg02817.html">4.6.3</a>,
 <a href="http://gcc.gnu.org/ml/gcc-testresults/2012-03/msg02401.html">4.6.3</a>,
 <a href="http://gcc.gnu.org/ml/gcc-testresults/2012-03/msg02155.html">4.6.3</a>,
 <a href="http://gcc.gnu.org/ml/gcc-testresults/2012-03/msg01757.html">4.6.3</a>,
@@ -136,6 +137,7 @@
 <td>i386-pc-solaris2.10</td>
 <td>&nbsp;</td>
 <td>Test results:
+<a href="http://gcc.gnu.org/ml/gcc-testresults/2012-05/msg00087.html">4.6.3</a>,
 <a href="http://gcc.gnu.org/ml/gcc-testresults/2012-03/msg01817.html">4.6.3</a>,
 <a href="http://gcc.gnu.org/ml/gcc-testresults/2011-11/msg02468.html">4.6.2</a>,
 <a href="http://gcc.gnu.org/ml/gcc-testresults/2011-07/msg00327.html">4.6.1</a>,
@@ -368,6 +370,7 @@
 <td>x86_64-unknown-linux-gnu</td>
 <td>&nbsp;</td>
 <td>Test results:
+<a href="http://gcc.gnu.org/ml/gcc-testresults/2012-05/msg02020.html">4.6.3</a>,
 <a href="http://gcc.gnu.org/ml/gcc-testresults/2012-03/msg01688.html">4.6.3</a>,
 <a href="http://gcc.gnu.org/ml/gcc-testresults/2011-12/msg00968.html">4.6.2</a>,
 <a href="http://gcc.gnu.org/ml/gcc-testresults/2011-06/msg03135.html">4.6.1</a>,
Re: [C++ Patch] for c++/51214
2012/6/3 Jason Merrill <ja...@redhat.com>:
> On 05/24/2012 09:18 AM, Jason Merrill wrote:
>> OK, thanks.
>
> I notice you haven't checked the patch in yet, is there a problem?

Not at all, just lack of time, so many problems/holidays to tackle at the
moment... That is the month of May in France ;-)
I'll be checking it in in the next few days.

-- 
Fabien
Re: [google] Add options to pattern match function name for hotness attributes
Where is the patch -- the code review link is missing (in the original post).

David

On Sun, Jun 3, 2012 at 6:14 AM, Dehao Chen <de...@google.com> wrote:
> Thank you guys for the comments, I'll update the patch to:
>
> 1. generalize the flag to enable other annotations such as always_inline.
> 2. change to use deferred option.
>
> Thanks,
> Dehao
>
> On Sun, Jun 3, 2012 at 12:40 PM, Xinliang David Li <davi...@google.com> wrote:
>> On Sat, Jun 2, 2012 at 11:11 AM, Jan Hubicka <hubi...@ucw.cz> wrote:
>>
>> Actually Dehao also plans to teach the static predictor to understand
>> standard library functions more (e.g IO functions) and add more naming
>>
>> How this differ from annotating the library?
>>
>> I find them more suitable to be compiler heuristic than being function's
>> attribute -- attribute is a much stronger assertion.
>>
>> There are indeed quite some possibilities to do about library calls
>> One thing I always wondered about is how to tell predictor that paths
>> containing heavy IO functions don't need to be really optimized to death,
>> since their execution time is dominated by the IO...
>>
>> Yes -- if branch predictor does the right thing and if function splitter
>> is powerful enough, the IO code can be outlined and optimized for size :)
>>
>> based heuristics such as 'error, success, failure, fatal etc).
>>
>> yeah, this is also mentioned by some branch prediction papers. It is bit
>> kludgy in nature (i.e. it won't understand programs written in Czech
>> language) but it is an heuristics after all.
>>
>> right.
>>
>> thanks, David Honza thanks, David Honza thanks, David Honza
Re: [gcov] a few improvements
On Sun, Jun 3, 2012 at 6:10 AM, Nathan Sidwell <nat...@acm.org> wrote:
> On 06/03/12 05:51, Xinliang David Li wrote:
>> On Fri, Dec 30, 2011 at 10:25 AM, Nathan Sidwell <nat...@acm.org> wrote:
>>> I've committed this patch to fix and improve coverage reporting:
>>>
>>> 1) the time stamp local_tick will be -1 if the user overrides the random
>>> seed. In such cases the gcov data file should be deleted, just as it
>>> would if the time cannot be determined.
>>
>> The end result is that after the profile data is used in the compilation
>> (with option -fprofile-use -frandom-seed=), it is then deleted. Is this
>> intended? This basically makes FDO very hard to use when -frandom-seed
>> is used.
>
> IIUC the file is deleted after being used. What is the problem you are
> encountering? (if you are having a problem, that suggests others on
> systems without local_tick are also having problems)

Basically it makes it very difficult to rebuild the file with the profile
data, which makes problem triaging impossible. What is worse is that when
profile data is missing, gcc silently accepts that, which gives you no clue
that you are chasing a completely different path.

David

> nathan
[PATCH] Option to build bare-metal ARM cross-compiler for arm-none-eabi target without libunwind
This is an update to Fredrik Hederstierna's mail and patch from 12 Apr 2012.
I think he missed one place where -fexceptions needs to be changed to
-fno-exceptions. With the attached patch, two of us (a friend and I, one on
Linux and one on Mac) can build Rockbox with gcc-4.7.0.

I'll admit I haven't checked the repository for changes in this area, this
patch is based on stock released gcc-4.7.0.

I have no opinion on Sebastian Huber's idea of making functionality of this
kind implicit with target.

 - Larry

diff -ur libgcc-orig/configure libgcc-baremetal/configure
--- libgcc-orig/configure	2012-01-22 22:25:28.000000000 -0800
+++ libgcc-baremetal/configure	2012-05-31 07:30:36.000000000 -0700
@@ -600,6 +600,7 @@
 set_use_emutls
 set_have_cc_tls
 vis_hide
+enable_libunwind_exceptions
 fixed_point
 enable_decimal_float
 decimal_float
@@ -697,6 +698,7 @@
 with_build_libsubdir
 enable_decimal_float
 with_system_libunwind
+enable_libunwind_exceptions
 enable_sjlj_exceptions
 enable_tls
 '
@@ -1332,6 +1334,8 @@
                         enable decimal float extension to C.  Selecting 'bid'
                         or 'dpd' choses which decimal floating point format
                         to use
+  --disable-libunwind-exceptions
+                        disable use of libunwind when build libgcc
   --enable-sjlj-exceptions
                           force use of builtin_setjmp for exceptions
   --enable-tls            Use thread-local storage [default=yes]
@@ -4493,6 +4497,15 @@
 fi
 
 
+# For bare metal toolchain libundwind exceptions can be disabled
+# Check whether --enable-libunwind-exceptions was given.
+if test "${enable_libunwind_exceptions+set}" = set; then :
+  enableval=$enable_libunwind_exceptions; enable_libunwind_exceptions=$enableval
+else
+  enable_libunwind_exceptions=yes
+fi
+
+
 # The sjlj test is almost duplicated here and in libgo/configure.ac (for C),
 # libstdc++-v3/acinclude.m4 and libjava/configure.ac (for C++), and
 # libobjc/configure.ac (for Objective-C).
diff -ur libgcc-orig/configure.ac libgcc-baremetal/configure.ac
--- libgcc-orig/configure.ac	2011-11-07 08:34:31.000000000 -0800
+++ libgcc-baremetal/configure.ac	2012-05-27 11:49:22.000000000 -0700
@@ -184,6 +184,14 @@
 # config.gcc also contains tests of with_system_libunwind.
 GCC_CHECK_UNWIND_GETIPINFO
 
+# For bare metal toolchain libundwind exceptions can be disabled
+AC_ARG_ENABLE(libunwind-exceptions,
+  [AC_HELP_STRING([--disable-libunwind-exceptions],
+                  [disable use of libunwind when build libgcc])],
+  [enable_libunwind_exceptions=$enableval],
+  [enable_libunwind_exceptions=yes])
+AC_SUBST(enable_libunwind_exceptions)
+
 # The sjlj test is almost duplicated here and in libgo/configure.ac (for C),
 # libstdc++-v3/acinclude.m4 and libjava/configure.ac (for C++), and
 # libobjc/configure.ac (for Objective-C).
diff -ur libgcc-orig/Makefile.in libgcc-baremetal/Makefile.in
--- libgcc-orig/Makefile.in	2011-11-21 19:01:02.000000000 -0800
+++ libgcc-baremetal/Makefile.in	2012-05-29 18:45:03.000000000 -0700
@@ -41,6 +41,7 @@
 decimal_float = @decimal_float@
 enable_decimal_float = @enable_decimal_float@
 fixed_point = @fixed_point@
+enable_libunwind_exceptions = @enable_libunwind_exceptions@
 
 host_noncanonical = @host_noncanonical@
 target_noncanonical = @target_noncanonical@
@@ -497,17 +498,23 @@
 endif
 
 # Build LIB2_DIVMOD_FUNCS.
+ifeq ($(enable_libunwind_exceptions),yes)
+LIB2_DIVMOD_CFLAGS := -fexceptions -fnon-call-exceptions
+else
+LIB2_DIVMOD_CFLAGS := -fno-exceptions -fno-non-call-exceptions
+endif
+
 lib2-divmod-o = $(patsubst %,%$(objext),$(LIB2_DIVMOD_FUNCS))
 $(lib2-divmod-o): %$(objext): $(srcdir)/libgcc2.c
 	$(gcc_compile) -DL$* -c $< \
-	  -fexceptions -fnon-call-exceptions $(vis_hide)
+	  $(LIB2_DIVMOD_CFLAGS) $(vis_hide)
 libgcc-objects += $(lib2-divmod-o)
 
 ifeq ($(enable_shared),yes)
 lib2-divmod-s-o = $(patsubst %,%_s$(objext),$(LIB2_DIVMOD_FUNCS))
 $(lib2-divmod-s-o): %_s$(objext): $(srcdir)/libgcc2.c
 	$(gcc_s_compile) -DL$* -c $< \
-	  -fexceptions -fnon-call-exceptions
+	  $(LIB2_DIVMOD_CFLAGS) $(vis_hide)
 libgcc-s-objects += $(lib2-divmod-s-o)
 endif
 
@@ -810,7 +817,11 @@
 # libgcc_eh.a, only LIB2ADDEH matters.  If we do, only LIB2ADDEHSTATIC and
 # LIB2ADDEHSHARED matter.  (Usually all three are identical.)
 
+ifeq ($(enable_libunwind_exceptions),yes)
 c_flags := -fexceptions
+else
+c_flags := -fno-exceptions
+endif
 
 ifeq ($(enable_shared),yes)
Re: [C++ Patch] for c++/51214
On Sun, Jun 3, 2012 at 10:56 AM, Fabien Chêne fabien.ch...@gmail.com wrote:
> 2012/6/3 Jason Merrill ja...@redhat.com:
>> On 05/24/2012 09:18 AM, Jason Merrill wrote:
>>> OK, thanks.
>>
>> I notice you haven't checked the patch in yet, is there a problem?
>
> Not at all, just lack of time, so many problems/holidays to tackle at the moment... That is May month in France ;-)

It must be distressing to make up for those 35 hours ;-p

-- Gaby
Re: [gcov] a few improvements
On 06/03/12 17:16, Xinliang David Li wrote:
> Basically it makes it very difficult to rebuild the file with the profile data --- which makes problem triaging impossible. What is

Can you explain this more -- what exactly are you trying to do? Are you trying to rebuild multiple times with the same coverage data, or something different?
Re: [gcov] a few improvements
On Sun, Jun 3, 2012 at 10:24 AM, Nathan Sidwell nat...@acm.org wrote:
> On 06/03/12 17:16, Xinliang David Li wrote:
>> Basically it makes it very difficult to rebuild the file with the profile data --- which makes problem triaging impossible. What is
>
> Can you explain this more -- what exactly are you trying to do? Are you trying to rebuild multiple times with the same coverage data,

yes -- for instance, in the context of debugging a compiler problem, you will need to compile the same file multiple times with the same coverage data.

David

> or something different?
Re: [PATCH 2/2] Better system header location detection for built-in macro tokens
On 06/03/2012 05:27 AM, Jason Merrill wrote:
> On 06/02/2012 12:40 PM, Paolo Carlini wrote:
>> That said, the tricks we are playing with the global input_location vs the loc we are passing around still confuse me quite a lot. Actually any *assignment* to input_location makes me a bit more nervous than I was already ;) Do you have any idea whether just passing down to build_x_modify_expr a different value for loc instead of assigning to input_location would also work for you? Maybe together with more thoroughly forwarding the loc from build_x_modify_expr itself to the build_min* functions (ie the project I mentioned above)??
>
> We already pass to build_x_modify_expr the location that he is assigning to input_location. I would guess that the issue in this case is with the in_system_header macro, which uses input_location. I think the input_location hack here is OK until we improve our use of explicit locations to make it unnecessary.

Good. In any case, as far as I can see, the assignment Dodji is adding just before calling build_x_modify_expr doesn't change anything for my issue, which actually has to do with build_x_binary_op: if I apply the below, thus passing an actual loc to build_min_non_dep_loc, there is a diagnostic regression for the locations of libstdc++-v3/testsuite/20_util/bind_ref_neg.cc. If you see something obviously wrong somewhere, just let me know...

Paolo.
///
Index: typeck.c
===================================================================
--- typeck.c	(revision 188155)
+++ typeck.c	(working copy)
@@ -2696,9 +2696,8 @@ finish_class_member_access_expr (tree object, tree
 	  BASELINK_QUALIFIED_P (member) = 1;
 	  orig_name = member;
 	}
-      return build_min_non_dep (COMPONENT_REF, expr,
-				orig_object, orig_name,
-				NULL_TREE);
+      return build_min_non_dep_loc (UNKNOWN_LOCATION, COMPONENT_REF, expr,
+				    orig_object, orig_name, NULL_TREE);
     }
 
   return expr;
@@ -2763,7 +2762,7 @@ build_x_indirect_ref (location_t loc, tree expr, r
     rval = cp_build_indirect_ref (expr, errorstring, complain);
 
   if (processing_template_decl && rval != error_mark_node)
-    return build_min_non_dep (INDIRECT_REF, rval, orig_expr);
+    return build_min_non_dep_loc (loc, INDIRECT_REF, rval, orig_expr);
   else
     return rval;
 }
@@ -3631,7 +3630,7 @@ build_x_binary_op (location_t loc, enum tree_code
     warn_about_parentheses (code, arg1_code, orig_arg1, arg2_code, orig_arg2);
 
   if (processing_template_decl && expr != error_mark_node)
-    return build_min_non_dep (code, expr, orig_arg1, orig_arg2);
+    return build_min_non_dep_loc (loc, code, expr, orig_arg1, orig_arg2);
 
   return expr;
 }
@@ -3660,8 +3659,8 @@ build_x_array_ref (location_t loc, tree arg1, tree
 			 NULL_TREE, /*overload=*/NULL, complain);
 
   if (processing_template_decl && expr != error_mark_node)
-    return build_min_non_dep (ARRAY_REF, expr, orig_arg1, orig_arg2,
-			      NULL_TREE, NULL_TREE);
+    return build_min_non_dep_loc (loc, ARRAY_REF, expr, orig_arg1, orig_arg2,
+				  NULL_TREE, NULL_TREE);
 
   return expr;
 }
@@ -4764,7 +4763,7 @@ build_x_unary_op (location_t loc, enum tree_code c
     }
 
   if (processing_template_decl && exp != error_mark_node)
-    exp = build_min_non_dep (code, exp, orig_expr,
+    exp = build_min_non_dep_loc (loc, code, exp, orig_expr,
 			     /*For {PRE,POST}{INC,DEC}REMENT_EXPR*/NULL_TREE);
   if (TREE_CODE (exp) == ADDR_EXPR)
     PTRMEM_OK_P (exp) = ptrmem;
@@ -5624,8 +5623,8 @@ build_x_conditional_expr (location_t loc, tree ife
   expr = build_conditional_expr (ifexp, op1, op2, complain);
   if (processing_template_decl && expr != error_mark_node)
     {
-      tree min = build_min_non_dep (COND_EXPR, expr,
-				    orig_ifexp, orig_op1, orig_op2);
+      tree min = build_min_non_dep_loc (loc, COND_EXPR, expr,
+					orig_ifexp, orig_op1, orig_op2);
       /* In C++11, remember that the result is an lvalue or xvalue.
          In C++98, lvalue_kind can just assume lvalue in a template.  */
       if (cxx_dialect >= cxx0x
@@ -5742,7 +5741,8 @@ build_x_compound_expr (location_t loc, tree op1, t
   result = cp_build_compound_expr (op1, op2, complain);
 
   if (processing_template_decl && result != error_mark_node)
-    return build_min_non_dep (COMPOUND_EXPR, result, orig_op1, orig_op2);
+    return build_min_non_dep_loc (loc, COMPOUND_EXPR, result,
+				  orig_op1, orig_op2);
 
   return result;
 }
Index: tree.c
===================================================================
--- tree.c	(revision 188155)
+++ tree.c	(working copy)
@@ -2086,7 +2086,7 @@ build_min (enum tree_code code, tree tt, ...)
    built.  */
 
 tree
-build_min_non_dep (enum
Re: [PATCH 2/2] Better system header location detection for built-in macro tokens
On 06/03/2012 06:04 PM, Paolo Carlini wrote:
> if I apply the below, thus passing an actual loc to build_min_non_dep_loc, there is a diagnostic regression for the locations of libstdc++-v3/testsuite/20_util/bind_ref_neg.cc. If you see something obviously wrong somewhere, just let me know...

Nope, it looks fine to me. I guess the change is due to having an explicit location on a tree that used to have none.

Jason
Re: [google] Add options to pattern match function name for hotness attributes
Dehao Chen de...@google.com writes:

> Hi,
>
> This patch adds 4 flags that let the user supply a list of name patterns. The compiler will match each function name against the given patterns and add hot, cold, likely_hot, or likely_cold attributes to the function declaration. The static branch prediction checks whether a basic block contains a call to an annotated function, and sets the branch probability accordingly.

I like the idea (and would have some uses for it), but I don't like the command line options. That will lead to longer and longer command lines. Could this be made a pragma instead? You could still specify it from the Makefile by using -include

-Andi

-- 
a...@linux.intel.com -- Speaking for myself only
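For comparison, the effect discussed above can already be approximated by hand with GCC's existing hot and cold function attributes, kept in a header that the Makefile pulls in with -include. The following is only a minimal sketch in C: the header name (hotness.h) and all function names are hypothetical, and it does not use the pattern-matching flags from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* These two declarations could live in a force-included header
   (hotness.h is a hypothetical name), e.g.:
     gcc -include hotness.h -O2 -c packet.c
   Function names here are made up for illustration. */
unsigned sum_bytes(const unsigned char *buf, size_t len) __attribute__((hot));
int report_bad_length(size_t len) __attribute__((cold));

/* Hot path: expected to run for every packet. */
unsigned sum_bytes(const unsigned char *buf, size_t len)
{
    unsigned sum = 0;
    assert(buf != NULL);
    for (size_t i = 0; i < len; i++)
        sum += buf[i];
    return sum;
}

/* Cold path: only reached on malformed input. */
int report_bad_length(size_t len)
{
    fprintf(stderr, "bad packet length: %zu\n", len);
    return -1;
}

/* GCC documents that paths leading to calls of cold functions are
   marked unlikely by branch prediction, so the len == 0 branch below
   is treated as the unlikely one without any explicit hint. */
int process(const unsigned char *buf, size_t len, unsigned *out)
{
    if (len == 0)
        return report_bad_length(len);
    *out = sum_bytes(buf, len);
    return 0;
}
```

The drawback, and presumably the motivation for name patterns, is that every function has to be declared individually in such a header.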
Re: [google] Add options to pattern match function name for hotness attributes
On Mon, Jun 4, 2012 at 9:34 AM, Andi Kleen a...@firstfloor.org wrote:
> Dehao Chen de...@google.com writes:
>
>> Hi,
>>
>> This patch adds 4 flags that let the user supply a list of name patterns. The compiler will match each function name against the given patterns and add hot, cold, likely_hot, or likely_cold attributes to the function declaration. The static branch prediction checks whether a basic block contains a call to an annotated function, and sets the branch probability accordingly.
>
> I like the idea (and would have some uses for it), but I don't like the command line options. That will lead to longer and longer command lines. Could this be made a pragma instead? You could still specify it from the Makefile by using -include

Thanks for the suggestions. How about we add an option (-ffunction-attribute-list=), and we can also specify this option using #pragma GCC optimize in a separate -include file?

Thanks,
Dehao

> -Andi
>
> --
> a...@linux.intel.com -- Speaking for myself only
[line-map] simple oneliner that speeds up track-macro-expansion
Hello list,

the attached simple patch is a small part of the fix in Bug #53525, but it is independent and offers a noticeable performance improvement for track-macro-expansion. Tested on x86 with no regressions, besides memory usage going up by some KB. I don't get why the max_column_hint was zeroed in every macro. Maybe there are pathological cases that I don't see?

2012-06-04  Dimitrios Apostolou  ji...@gmx.net

	* line-map.c (linemap_enter_macro): Don't zero max_column_hint in every macro. This improves performance by reducing the number of reallocations when track-macro-expansion is on.

Thanks
Dimitris

=== modified file 'libcpp/line-map.c'
--- libcpp/line-map.c	2012-04-30 11:42:12 +0000
+++ libcpp/line-map.c	2012-05-27 06:52:08 +0000
@@ -331,7 +331,6 @@ linemap_enter_macro (struct line_maps *s
 	  num_tokens * sizeof (source_location));
 
   LINEMAPS_MACRO_CACHE (set) = LINEMAPS_MACRO_USED (set) - 1;
-  set->max_column_hint = 0;
 
   return map;
 }
libgo patch committed: Better SWIG interface for allocating memory
This patch to libgo makes a better interface available to SWIG when C/C++ code wants to allocate memory that is garbage collected by Go. Previously if the only pointer to the memory were held in the C/C++ code, it was possible for the garbage collector to discard the memory before the C/C++ function returned. This patch fixes this in a straightforward way: it keeps a list of allocated memory, and drops the list when returning to Go. Bootstrapped and ran Go testsuite, and also SWIG testsuite, on x86_64-unknown-linux-gnu. Committed to mainline and 4.7 branch.

Ian

diff -r 9b3e0b116fe4 libgo/go/syscall/libcall_support.go
--- a/libgo/go/syscall/libcall_support.go	Wed May 30 16:04:09 2012 -0700
+++ b/libgo/go/syscall/libcall_support.go	Sun Jun 03 22:30:03 2012 -0700
@@ -10,3 +10,9 @@
 func Exitsyscall()
 func GetErrno() Errno
 func SetErrno(Errno)
+
+// These functions are used by CGO and SWIG.
+func Cgocall()
+func CgocallDone()
+func CgocallBack()
+func CgocallBackDone()
diff -r 9b3e0b116fe4 libgo/runtime/go-cgo.c
--- a/libgo/runtime/go-cgo.c	Wed May 30 16:04:09 2012 -0700
+++ b/libgo/runtime/go-cgo.c	Sun Jun 03 22:30:03 2012 -0700
@@ -4,11 +4,116 @@
    Use of this source code is governed by a BSD-style
    license that can be found in the LICENSE file.  */
 
+#include "runtime.h"
 #include "go-alloc.h"
 #include "interface.h"
 #include "go-panic.h"
 #include "go-string.h"
 
+/* Go memory allocated by code not written in Go.  We keep a linked
+   list of these allocations so that the garbage collector can see
+   them.  */
+
+struct cgoalloc
+{
+  struct cgoalloc *next;
+  void *alloc;
+};
+
+/* Prepare to call from code written in Go to code written in C or
+   C++.  This takes the current goroutine out of the Go scheduler, as
+   though it were making a system call.  Otherwise the program can
+   lock up if the C code goes to sleep on a mutex or for some other
+   reason.  This idea is to call this function, then immediately call
+   the C/C++ function.  After the C/C++ function returns, call
+   syscall_cgocalldone.  The usual Go code would look like
+
+       syscall.Cgocall()
+       defer syscall.Cgocalldone()
+       cfunction()
+
+   */
+
+/* We let Go code call these via the syscall package.  */
+void syscall_cgocall(void) __asm__ ("syscall.Cgocall");
+void syscall_cgocalldone(void) __asm__ ("syscall.CgocallDone");
+void syscall_cgocallback(void) __asm__ ("syscall.CgocallBack");
+void syscall_cgocallbackdone(void) __asm__ ("syscall.CgocallBackDone");
+
+void
+syscall_cgocall ()
+{
+  M* m;
+  G* g;
+
+  m = runtime_m ();
+  ++m->ncgocall;
+  g = runtime_g ();
+  ++g->ncgo;
+  runtime_entersyscall ();
+}
+
+/* Prepare to return to Go code from C/C++ code.  */
+
+void
+syscall_cgocalldone ()
+{
+  G* g;
+
+  g = runtime_g ();
+  __go_assert (g != NULL);
+  --g->ncgo;
+  if (g->ncgo == 0)
+    {
+      /* We are going back to Go, and we are not in a recursive call.
+	 Let the garbage collector clean up any unreferenced
+	 memory.  */
+      g->cgoalloc = NULL;
+    }
+
+  /* If we are invoked because the C function called _cgo_panic, then
+     _cgo_panic will already have exited syscall mode.  */
+  if (g->status == Gsyscall)
+    runtime_exitsyscall ();
+}
+
+/* Call back from C/C++ code to Go code.  */
+
+void
+syscall_cgocallback ()
+{
+  runtime_exitsyscall ();
+}
+
+/* Prepare to return to C/C++ code from a callback to Go code.  */
+
+void
+syscall_cgocallbackdone ()
+{
+  runtime_entersyscall ();
+}
+
+/* Allocate memory and save it in a list visible to the Go garbage
+   collector.  */
+
+void *
+alloc_saved (size_t n)
+{
+  void *ret;
+  G *g;
+  struct cgoalloc *c;
+
+  ret = __go_alloc (n);
+
+  g = runtime_g ();
+  c = (struct cgoalloc *) __go_alloc (sizeof (struct cgoalloc));
+  c->next = g->cgoalloc;
+  c->alloc = ret;
+  g->cgoalloc = c;
+
+  return ret;
+}
+
 /* These are routines used by SWIG.  The gc runtime library provides
    the same routines under the same name, though in that case the code
    is required to import runtime/cgo.  */
@@ -16,7 +121,12 @@
 void *
 _cgo_allocate (size_t n)
 {
-  return __go_alloc (n);
+  void *ret;
+
+  runtime_exitsyscall ();
+  ret = alloc_saved (n);
+  runtime_entersyscall ();
+  return ret;
 }
 
 extern const struct __go_type_descriptor string_type_descriptor
@@ -30,13 +140,39 @@
   struct __go_string *ps;
   struct __go_empty_interface e;
 
+  runtime_exitsyscall ();
   len = __builtin_strlen (p);
-  data = __go_alloc (len);
+  data = alloc_saved (len);
   __builtin_memcpy (data, p, len);
-  ps = __go_alloc (sizeof *ps);
+  ps = alloc_saved (sizeof *ps);
   ps->__data = data;
   ps->__length = len;
   e.__type_descriptor = &string_type_descriptor;
   e.__object = ps;
+
+  /* We don't call runtime_entersyscall here, because normally what
+     will happen is that we will walk up the stack to a Go deferred
+     function that calls recover.  However, this will do the wrong
+     thing if this panic is recovered and the stack unwinding is
+     caught by a C++ exception