Re: [google gcc-4_9] Fix bad LIPO profile produced by gcov-tool
This is only needed for gcov-tool as we need to rewrite the moduel-info to the profile (this is only used in decompress) The transitional compiler path does not need it because the string is already decompressed. It only needs to use the strings. gcov-dump in theory does not need it either if it only dumps the strings. But now I added the printing of both lengths. So now it is also needed. On Tue, Dec 1, 2015 at 4:46 PM, Xinliang David Li wrote: > Not sure about this line: > > mod_info->combined_strlen = saved_compressed_len; > > This did not exist for the compiler path before. > > On Tue, Dec 1, 2015 at 4:34 PM, Rong Xu wrote: >> >> Hi, >> >> This patch fixes the issue when using gcov-tool to merge LIPO profiles >> after we compressing the module infomration . We should not decompress >> the string as the compressed string should be written directly to the >> profile later. Tested with some LIPO profiles. >> >> Thanks, >> >> -Rong > >
[google gcc-4_9] Fix bad LIPO profile produced by gcov-tool
Hi, This patch fixes the issue when using gcov-tool to merge LIPO profiles after we compressing the module infomration . We should not decompress the string as the compressed string should be written directly to the profile later. Tested with some LIPO profiles. Thanks, -Rong 2015-12-01 Rong Xu * gcov-dump.c (tag_module_info): Dump string information. * gcov-io.c (gcov_read_module_info): record combined_len and don't uncompress in gcov-tool. Index: gcov-dump.c === --- gcov-dump.c (revision 231134) +++ gcov-dump.c (working copy) @@ -588,6 +588,11 @@ tag_module_info (const char *filename ATTRIBUTE_UN { if (!mod_info->is_primary) printf ("%s\n", mod_info->source_filename); + unsigned short compressed_size = mod_info->combined_strlen; + unsigned short uncompressed_size = mod_info->combined_strlen>>16; + printf ("compressed_ strlen=%d uncompressed_strlen=%d String:\n", + compressed_size,uncompressed_size); + printf ("%s\n", mod_info->saved_cc1_strings); } else { Index: gcov-io.c === --- gcov-io.c (revision 231134) +++ gcov-io.c (working copy) @@ -835,16 +835,18 @@ gcov_read_module_info (struct gcov_module_info *mo len -= (src_filename_len + 1); saved_compressed_len = (unsigned long) gcov_read_unsigned (); - saved_uncompressed_len = saved_compressed_len >> 16; - saved_compressed_len &= 0x; + mod_info->combined_strlen = saved_compressed_len; tag_len = gcov_read_unsigned (); len -= (tag_len + 2); gcc_assert (!len); compressed_array = (char *) xmalloc (tag_len * sizeof (gcov_unsigned_t)); - uncompressed_array = (char *) xmalloc (saved_uncompressed_len); for (i = 0; i < tag_len; i++) ((gcov_unsigned_t *) compressed_array)[i] = gcov_read_unsigned (); +#if !defined (IN_GCOV_TOOL) + saved_uncompressed_len = saved_compressed_len >> 16; + saved_compressed_len &= 0x; + uncompressed_array = (char *) xmalloc (saved_uncompressed_len); result_len = saved_uncompressed_len; uncompress ((Bytef *)uncompressed_array, &result_len, (const Bytef *)compressed_array, saved_compressed_len); @@ -851,6 +853,9 @@ gcov_read_module_info (struct gcov_module_info *mo gcc_assert (result_len == saved_uncompressed_len); mod_info->saved_cc1_strings = uncompressed_array; free (compressed_array); +#else /* IN_GCOV_TOOL: we don't need to uncompress. It's a pass through. */ + mod_info->saved_cc1_strings = compressed_array; +#endif } #endif
Re: [google][gcc-4_9] encode and compress cc1 option strings in gcov_module_info
Here is the patch set 2 that integrates David's comments. Note that this uses the combined strlen (i.e. encoding compressed and uncompressed strlen into one gcov_unsigned_t). Testing is ongoing. -Rong On Tue, Oct 6, 2015 at 11:30 AM, Rong Xu wrote: > It's 1:3 to 1:4 in the programs I tested. But it really depends on how > the options are used. I think your idea of using combined strlen works > better. > I just make the code a little clumsy but it does not cause any > performance issue. > > On Tue, Oct 6, 2015 at 10:21 AM, Xinliang David Li wrote: >> On Tue, Oct 6, 2015 at 9:26 AM, Rong Xu wrote: >>> On Mon, Oct 5, 2015 at 5:33 PM, Xinliang David Li >>> wrote: >>>>unsigned ggc_memory = gcov_read_unsigned (); >>>> + unsigned marker = 0, len = 0, k; >>>> + char **string_array, *saved_cc1_strings; >>>> + >>>>for (unsigned j = 0; j < 7; j++) >>>> >>>> >>>> Do not use hard coded number. Use the enum defined in coverage.c. >>> >>> OK. >>> >>>> >>>> >>>> +string_array[j] = xstrdup (gcov_read_string ()); >>>> + >>>> + k = 0; >>>> + for (unsigned j = 1; j < 7; j++) >>>> >>>> Do not use hard coded number. >>> >>> OK. >>> >>>> >>>> >>>> +{ >>>> + if (num_array[j] == 0) >>>> +continue; >>>> + marker += num_array[j]; >>>> >>>> It is better to read if the name of variable 'marker' is changed to >>>> 'j_end' or something similar >>>> >>>> For all the substrings of 'j' kind, there should be just one marker, >>>> right? It looks like here you introduce one marker per string, not one >>>> marker per string kind. >>> >>> I don't understand what you meant here. "marker" is fixed for each j >>> substring (one option kind) -- it the end index of the sub-string >>> array. k-loop is for each string. >>> >> >> That was a wrong comment from me. Discard it. >> >>>> >>>> + len += 3; /* [[ */ >>>> >>>> Same here for hard coded value. >>>> >>>> + for (; k < marker; k++) >>>> +len += strlen (string_array[k]) + 1; /* 1 for delimter of ']' >>>> */ >>>> >>>> Why do we need one ']' per string? >>> >>> This is because the options strings can contain space ' '. I cannot >>> use space as the delimiter, neither is \0 as it is the end of the >>> string of the encoded string. >> >> Ok -- this allows you to avoid string copy during parsing. >>> >>>> >>>> >>>> +} >>>> + saved_cc1_strings = (char *) xmalloc (len + 1); >>>> + saved_cc1_strings[0] = 0; >>>> + >>>> + marker = 0; >>>> + k = 0; >>>> + for (unsigned j = 1; j < 7; j++) >>>> >>>> Same here for 7. >>> >>> will fix in the new patch. >>> >>>> >>>> +{ >>>> + static const char lipo_string_flags[6] = {'Q', 'B', 'S', >>>> 'D','I', 'C'}; >>>> + if (num_array[j] == 0) >>>> +continue; >>>> + marker += num_array[j]; >>>> >>>> Suggest changing marker to j_end >>> OK. >>> >>>> >>>> + sprintf (saved_cc1_strings, "%s[[%c", saved_cc1_strings, >>>> + lipo_string_flags[j - 1]); >>>> + for (; k < marker; k++) >>>> +{ >>>> + sprintf (saved_cc1_strings, "%s%s]", saved_cc1_strings, >>>> + string_array[k]); >>>> >>>> +#define DELIMTER"[[" >>>> >>>> Why double '[' ? >>> I will change to single '['. >>> >>>> >>>> +#define DELIMTER2 "]" >>>> +#define QUOTE_PATH_FLAG 'Q' >>>> +#define BRACKET_PATH_FLAG 'B' >>>> +#define SYSTEM_PATH_FLAG'S' >>>> +#define D_U_OPTION_FLAG 'D
Re: [google][gcc-4_9] encode and compress cc1 option strings in gcov_module_info
It's 1:3 to 1:4 in the programs I tested. But it really depends on how the options are used. I think your idea of using combined strlen works better. I just make the code a little clumsy but it does not cause any performance issue. On Tue, Oct 6, 2015 at 10:21 AM, Xinliang David Li wrote: > On Tue, Oct 6, 2015 at 9:26 AM, Rong Xu wrote: >> On Mon, Oct 5, 2015 at 5:33 PM, Xinliang David Li wrote: >>>unsigned ggc_memory = gcov_read_unsigned (); >>> + unsigned marker = 0, len = 0, k; >>> + char **string_array, *saved_cc1_strings; >>> + >>>for (unsigned j = 0; j < 7; j++) >>> >>> >>> Do not use hard coded number. Use the enum defined in coverage.c. >> >> OK. >> >>> >>> >>> +string_array[j] = xstrdup (gcov_read_string ()); >>> + >>> + k = 0; >>> + for (unsigned j = 1; j < 7; j++) >>> >>> Do not use hard coded number. >> >> OK. >> >>> >>> >>> +{ >>> + if (num_array[j] == 0) >>> +continue; >>> + marker += num_array[j]; >>> >>> It is better to read if the name of variable 'marker' is changed to >>> 'j_end' or something similar >>> >>> For all the substrings of 'j' kind, there should be just one marker, >>> right? It looks like here you introduce one marker per string, not one >>> marker per string kind. >> >> I don't understand what you meant here. "marker" is fixed for each j >> substring (one option kind) -- it the end index of the sub-string >> array. k-loop is for each string. >> > > That was a wrong comment from me. Discard it. > >>> >>> + len += 3; /* [[ */ >>> >>> Same here for hard coded value. >>> >>> + for (; k < marker; k++) >>> +len += strlen (string_array[k]) + 1; /* 1 for delimter of ']' >>> */ >>> >>> Why do we need one ']' per string? >> >> This is because the options strings can contain space ' '. I cannot >> use space as the delimiter, neither is \0 as it is the end of the >> string of the encoded string. > > Ok -- this allows you to avoid string copy during parsing. >> >>> >>> >>> +} >>> + saved_cc1_strings = (char *) xmalloc (len + 1); >>> + saved_cc1_strings[0] = 0; >>> + >>> + marker = 0; >>> + k = 0; >>> + for (unsigned j = 1; j < 7; j++) >>> >>> Same here for 7. >> >> will fix in the new patch. >> >>> >>> +{ >>> + static const char lipo_string_flags[6] = {'Q', 'B', 'S', >>> 'D','I', 'C'}; >>> + if (num_array[j] == 0) >>> +continue; >>> + marker += num_array[j]; >>> >>> Suggest changing marker to j_end >> OK. >> >>> >>> + sprintf (saved_cc1_strings, "%s[[%c", saved_cc1_strings, >>> + lipo_string_flags[j - 1]); >>> + for (; k < marker; k++) >>> +{ >>> + sprintf (saved_cc1_strings, "%s%s]", saved_cc1_strings, >>> + string_array[k]); >>> >>> +#define DELIMTER"[[" >>> >>> Why double '[' ? >> I will change to single '['. >> >>> >>> +#define DELIMTER2 "]" >>> +#define QUOTE_PATH_FLAG 'Q' >>> +#define BRACKET_PATH_FLAG 'B' >>> +#define SYSTEM_PATH_FLAG'S' >>> +#define D_U_OPTION_FLAG 'D' >>> +#define INCLUDE_OPTION_FLAG 'I' >>> +#define COMMAND_ARG_FLAG'C' >>> + >>> +enum lipo_cc1_string_kind { >>> + k_quote_paths = 0, >>> + k_bracket_paths, >>> + k_system_paths, >>> + k_cpp_defines, >>> + k_cpp_includes, >>> + k_lipo_cl_args, >>> + num_lipo_cc1_string_kind >>> +}; >>> + >>> +struct lipo_parsed_cc1_string { >>> + const char* source_filename; >>> + unsigned num[num_lipo_cc1_string_kind]; >>> + char **strings[num_lipo_cc1_string_kind]; >>> +}; >>> + >>> +struct lipo_parsed_cc1_string * >>&g
Re: [google][gcc-4_9] encode and compress cc1 option strings in gcov_module_info
On Mon, Oct 5, 2015 at 5:33 PM, Xinliang David Li wrote: >unsigned ggc_memory = gcov_read_unsigned (); > + unsigned marker = 0, len = 0, k; > + char **string_array, *saved_cc1_strings; > + >for (unsigned j = 0; j < 7; j++) > > > Do not use hard coded number. Use the enum defined in coverage.c. OK. > > > +string_array[j] = xstrdup (gcov_read_string ()); > + > + k = 0; > + for (unsigned j = 1; j < 7; j++) > > Do not use hard coded number. OK. > > > +{ > + if (num_array[j] == 0) > +continue; > + marker += num_array[j]; > > It is better to read if the name of variable 'marker' is changed to > 'j_end' or something similar > > For all the substrings of 'j' kind, there should be just one marker, > right? It looks like here you introduce one marker per string, not one > marker per string kind. I don't understand what you meant here. "marker" is fixed for each j substring (one option kind) -- it the end index of the sub-string array. k-loop is for each string. > > + len += 3; /* [[ */ > > Same here for hard coded value. > > + for (; k < marker; k++) > +len += strlen (string_array[k]) + 1; /* 1 for delimter of ']' */ > > Why do we need one ']' per string? This is because the options strings can contain space ' '. I cannot use space as the delimiter, neither is \0 as it is the end of the string of the encoded string. > > > +} > + saved_cc1_strings = (char *) xmalloc (len + 1); > + saved_cc1_strings[0] = 0; > + > + marker = 0; > + k = 0; > + for (unsigned j = 1; j < 7; j++) > > Same here for 7. will fix in the new patch. > > +{ > + static const char lipo_string_flags[6] = {'Q', 'B', 'S', > 'D','I', 'C'}; > + if (num_array[j] == 0) > +continue; > + marker += num_array[j]; > > Suggest changing marker to j_end OK. > > + sprintf (saved_cc1_strings, "%s[[%c", saved_cc1_strings, > + lipo_string_flags[j - 1]); > + for (; k < marker; k++) > +{ > + sprintf (saved_cc1_strings, "%s%s]", saved_cc1_strings, > + string_array[k]); > > +#define DELIMTER"[[" > > Why double '[' ? I will change to single '['. > > +#define DELIMTER2 "]" > +#define QUOTE_PATH_FLAG 'Q' > +#define BRACKET_PATH_FLAG 'B' > +#define SYSTEM_PATH_FLAG'S' > +#define D_U_OPTION_FLAG 'D' > +#define INCLUDE_OPTION_FLAG 'I' > +#define COMMAND_ARG_FLAG'C' > + > +enum lipo_cc1_string_kind { > + k_quote_paths = 0, > + k_bracket_paths, > + k_system_paths, > + k_cpp_defines, > + k_cpp_includes, > + k_lipo_cl_args, > + num_lipo_cc1_string_kind > +}; > + > +struct lipo_parsed_cc1_string { > + const char* source_filename; > + unsigned num[num_lipo_cc1_string_kind]; > + char **strings[num_lipo_cc1_string_kind]; > +}; > + > +struct lipo_parsed_cc1_string * > +lipo_parse_saved_cc1_string (const char *src, char *str, > +bool parse_cl_args_only); > +void free_parsed_string (struct lipo_parsed_cc1_string *string); > + > > Declare above in a header file. OK. > > > /* Returns true if the command-line arguments stored in the given > module-infos > are incompatible. */ > bool > -incompatible_cl_args (struct gcov_module_info* mod_info1, > - struct gcov_module_info* mod_info2) > +incompatible_cl_args (struct lipo_parsed_cc1_string* mod_info1, > + struct lipo_parsed_cc1_string* mod_info2) > > Fix formating. OK. > > { > { > @@ -1647,7 +1679,7 @@ build_var (tree fn_decl, tree type, int counter) > /* Creates the gcov_fn_info RECORD_TYPE. */ > >NULL_TREE, get_gcov_unsigned_t ()); >DECL_CHAIN (field) = fields; >fields = field; > > - /* Num bracket paths */ > + /* cc1_uncompressed_strlen field */ >field = build_decl (BUILTINS_LOCATION, FIELD_DECL, >NULL_TREE, get_gcov_unsigned_t ()); >DECL_CHAIN (field) = fields; >fields = field; > > > Why do we need to store uncompressed string length? If there is need > to do that, I suggest combine uncompressed length and compressed > length into one 32bit integer. (16/16, or 17/15 split) In theory, I don't need the uncompressed
[google][gcc-4_9] encode and compress cc1 option strings in gcov_module_info
Hi, This patch is for google branch only. It encodes and compresses various cc1 option strings in gcov_module_info to reduce the lipo instrumented object size. The savings are from string compression and the reduced number of relocations. More specifically, we replace the following fields in gcov_module_info gcov_unsigned_t num_quote_paths; gcov_unsigned_t num_bracket_paths; gcov_unsigned_t num_system_paths; gcov_unsigned_t num_cpp_defines; gcov_unsigned_t num_cpp_includes; gcov_unsigned_t num_cl_args; char *string_array[1]; with gcov_unsigned_t cc1_strlen; gcov_unsigned_t cc1_uncompressed_strlen; char *saved_cc1_strings; The new saved_cc1_strings are zlib compressed string. Tested with google internal benchmarks. Thanks, -Rong 2015-10-05 Rong Xu * gcc/Makefile.in (gcov-dump): link with zlib (gcov-tool): Ditto. * gcc/auto-profile.c (autofdo_module_profile::read): convert old gcov_module_info to the new format. (read_aux_modules): using new incompatible_cl_args interface. * gcc/coverage.c (zlib.h): new include. (enum lipo_cc1_string_kind): new define. (struct lipo_parsed_cc1_string): new define. (incompatible_cl_args): change the paramter. (read_counts_file): using new gcov_module_info. (build_fn_info_type): remove used parameter. (build_gcov_module_info_type): using new gcov_module_info. (free_parsed_string): new function. (find_substr): ditto. (lipo_parse_saved_cc1_string): ditto. (lipo_append_tag): ditto. (build_gcov_module_info_value): using new gcov_module_info. (coverage_obj_init): remove used parameter. (add_module_info): using new gcov_module_info. (process_include): ditto. (process_include_paths_1): ditto. (process_include_paths): ditto. (set_lipo_c_parsing_context): ditto. * gcc/coverage.h: add new decls. * gcc/gcov-io.c (zlib.h) new include. (gcov_read_module_info): using new gcov_module_info. * gcc/gcov-io.h (struct gcov_module_info): new gcov_module_info. * libgcc/dyn-ipa.c: delete unused code. * libgcc/libgcov-driver.c (gcov_write_module_info): using new gcov_module_info. Index: gcc/Makefile.in === --- gcc/Makefile.in (revision 228354) +++ gcc/Makefile.in (working copy) @@ -2583,7 +2583,7 @@ gcov$(exeext): $(GCOV_OBJS) $(LIBDEPS) GCOV_DUMP_OBJS = gcov-dump.o vec.o ggc-none.o gcov-dump$(exeext): $(GCOV_DUMP_OBJS) $(LIBDEPS) +$(LINKER) $(ALL_LINKERFLAGS) $(LDFLAGS) $(GCOV_DUMP_OBJS) \ - $(LIBS) -o $@ + $(LIBS) $(ZLIB) -o $@ GCOV_TOOL_DEP_FILES = $(srcdir)/../libgcc/libgcov-util.c gcov-io.c $(GCOV_IO_H) \ $(srcdir)/../libgcc/libgcov-driver.c $(srcdir)/../libgcc/libgcov-driver-system.c \ @@ -2607,7 +2607,7 @@ gcov-tool-params.o: params.c $(CONFIG_H) $(SYSTEM_ GCOV_TOOL_OBJS = gcov-tool.o libgcov-util.o libgcov-driver-tool.o \ libgcov-merge-tool.o gcov-tool-dyn-ipa.o gcov-tool-params.o gcov-tool$(exeext): $(GCOV_TOOL_OBJS) $(LIBDEPS) - +$(LINKER) $(ALL_LINKERFLAGS) $(LDFLAGS) $(GCOV_TOOL_OBJS) $(LIBS) -o $@ + +$(LINKER) $(ALL_LINKERFLAGS) $(LDFLAGS) $(GCOV_TOOL_OBJS) $(LIBS) $(ZLIB) -o $@ # # Build the include directories. The stamp files are stmp-* rather than # s-* so that mostlyclean does not force the include directory to Index: gcc/auto-profile.c === --- gcc/auto-profile.c (revision 228354) +++ gcc/auto-profile.c (working copy) @@ -947,7 +947,6 @@ autofdo_source_profile::get_function_instance_by_i return s; } - /* Member functions for autofdo_module_profile. */ bool @@ -972,6 +971,9 @@ autofdo_module_profile::read () unsigned exported = gcov_read_unsigned (); unsigned lang = gcov_read_unsigned (); unsigned ggc_memory = gcov_read_unsigned (); + unsigned marker = 0, len = 0, k; + char **string_array, *saved_cc1_strings; + for (unsigned j = 0; j < 7; j++) { num_array[j] = gcov_read_unsigned (); @@ -988,19 +990,48 @@ autofdo_module_profile::read () module->ident = i + 1; module->lang = lang; module->ggc_memory = ggc_memory; - module->num_quote_paths = num_array[1]; - module->num_bracket_paths = num_array[2]; - module->num_system_paths = num_array[3]; - module->num_cpp_defines = num_array[4]; - module->num_cpp_includes = num_array[5]; - module->num_cl_args = num_array[6]; module->source_filename = name; module->is_primary = strcmp (name, in_fnames[0]) == 0; module->flags = module->is_primary ? exported : 1; for (unsigned j = 0; j < num_array[0]; j++) ret.first->second.first.safe_push (xstrdup (gcov_read_strin
Re: [google][gcc-4_9] Remove unused key field in gcov_fn_info
You are right. I attached the updated patch to this email. On Tue, Sep 29, 2015 at 3:10 PM, Xinliang David Li wrote: >else > { >gfi_ptr = gi_ptr->functions[f_ix]; > - if (gfi_ptr && gfi_ptr->key == gi_ptr) > + if (gfi_ptr) > length = GCOV_TAG_FUNCTION_LENGTH; > - else > -length = 0; > } > > The removal of 'else' path seems wrong. > > David > > > On Tue, Sep 29, 2015 at 1:46 PM, Rong Xu wrote: >> Hi, >> >> This patch is for google/gcc-4_9 branch. >> >> The 'key' field in gcov_fn_info is designed to allow gcov function >> data to be COMDATTed, but the comdat elimination never works. This >> patch removes this field to reduce the instrumented object size. >> >> Thanks, >> >> -Rong Removed the unused 'key' field in gcov_fn_info to reduce the instrumented objects size. 2015-09-29 Rong Xu * gcc/coverage.c (build_fn_info_type): Remove 'key' field. (build_fn_info): Ditto. (coverage_obj_fn): Ditto. * libgcc/libgcov.h (struct gcov_fn_info): Ditto. * libgcc/libgcov-driver.c (gcov_compute_histogram): Ditto. (gcov_exit_compute_summary): Ditto. (gcov_exit_merge_gcda): Ditto. (gcov_write_func_counters): Ditto. (gcov_clear): Ditto. * libgcc/libgcov-util.c (tag_function): Ditto. (gcov_merge): Ditto. (gcov_profile_scale): Ditto. (gcov_profile_normalize): Ditto. (compute_one_gcov): Ditto. (gcov_info_count_all_cold): Ditto. Index: gcc/coverage.c === --- gcc/coverage.c (revision 228223) +++ gcc/coverage.c (working copy) @@ -189,7 +189,7 @@ static void read_counts_file (const char *, unsign static tree build_var (tree, tree, int); static void build_fn_info_type (tree, unsigned, tree); static void build_info_type (tree, tree); -static tree build_fn_info (const struct coverage_data *, tree, tree); +static tree build_fn_info (const struct coverage_data *, tree); static tree build_info (tree, tree); static bool coverage_obj_init (void); static vec *coverage_obj_fn @@ -1668,16 +1668,9 @@ build_fn_info_type (tree type, unsigned counters, finish_builtin_struct (ctr_info, "__gcov_ctr_info", fields, NULL_TREE); - /* key */ - field = build_decl (BUILTINS_LOCATION, FIELD_DECL, NULL_TREE, - build_pointer_type (build_qualified_type - (gcov_info_type, TYPE_QUAL_CONST))); - fields = field; - /* ident */ field = build_decl (BUILTINS_LOCATION, FIELD_DECL, NULL_TREE, get_gcov_unsigned_t ()); - DECL_CHAIN (field) = fields; fields = field; /* lineno_checksum */ @@ -1705,10 +1698,10 @@ build_fn_info_type (tree type, unsigned counters, /* Returns a CONSTRUCTOR for a gcov_fn_info. DATA is the coverage data for the function and TYPE is the gcov_fn_info - RECORD_TYPE. KEY is the object file key. */ + RECORD_TYPE. */ static tree -build_fn_info (const struct coverage_data *data, tree type, tree key) +build_fn_info (const struct coverage_data *data, tree type) { tree fields = TYPE_FIELDS (type); tree ctr_type; @@ -1716,11 +1709,6 @@ static tree vec *v1 = NULL; vec *v2 = NULL; - /* key */ - CONSTRUCTOR_APPEND_ELT (v1, fields, - build1 (ADDR_EXPR, TREE_TYPE (fields), key)); - fields = DECL_CHAIN (fields); - /* ident */ CONSTRUCTOR_APPEND_ELT (v1, fields, build_int_cstu (get_gcov_unsigned_t (), @@ -2556,7 +2544,7 @@ static vec * coverage_obj_fn (vec *ctor, tree fn, struct coverage_data const *data) { - tree init = build_fn_info (data, gcov_fn_info_type, gcov_info_var); + tree init = build_fn_info (data, gcov_fn_info_type); tree var = build_var (fn, gcov_fn_info_type, -1); DECL_INITIAL (var) = init; Index: libgcc/libgcov-driver.c === --- libgcc/libgcov-driver.c (revision 227984) +++ libgcc/libgcov-driver.c (working copy) @@ -380,7 +380,7 @@ gcov_compute_histogram (struct gcov_summary *sum) { gfi_ptr = gi_ptr->functions[f_ix]; - if (!gfi_ptr || gfi_ptr->key != gi_ptr) + if (!gfi_ptr) continue; ci_ptr = &gfi_ptr->ctrs[ctr_info_ix]; @@ -430,9 +430,6 @@ gcov_exit_compute_summary (struct gcov_summary *th { gfi_ptr = gi_ptr->functions[f_ix]; - if (gfi_ptr && gfi_ptr->key != gi_ptr) -gfi_ptr = 0; - crc32 = crc32_unsigned (crc32, gfi_ptr ? gfi_ptr->cfg_checksum : 0); crc32 = crc32_unsigned (crc32, gfi_ptr ?
[google][gcc-4_9] Remove unused key field in gcov_fn_info
Hi, This patch is for google/gcc-4_9 branch. The 'key' field in gcov_fn_info is designed to allow gcov function data to be COMDATTed, but the comdat elimination never works. This patch removes this field to reduce the instrumented object size. Thanks, -Rong Removed the unused 'key' field in gcov_fn_info to reduce the instrumented objects size. 2015-09-29 Rong Xu * gcc/coverage.c (build_fn_info_type): Remove 'key' field. (build_fn_info): Ditto. (coverage_obj_fn): Ditto. * libgcc/libgcov.h (struct gcov_fn_info): Ditto. * libgcc/libgcov-driver.c (gcov_compute_histogram): Ditto. (gcov_exit_compute_summary): Ditto. (gcov_exit_merge_gcda): Ditto. (gcov_write_func_counters): Ditto. (gcov_clear): Ditto. * libgcc/libgcov-util.c (tag_function): Ditto. (gcov_merge): Ditto. (gcov_profile_scale): Ditto. (gcov_profile_normalize): Ditto. (compute_one_gcov): Ditto. (gcov_info_count_all_cold): Ditto. Index: gcc/coverage.c === --- gcc/coverage.c (revision 228223) +++ gcc/coverage.c (working copy) @@ -189,7 +189,7 @@ static void read_counts_file (const char *, unsign static tree build_var (tree, tree, int); static void build_fn_info_type (tree, unsigned, tree); static void build_info_type (tree, tree); -static tree build_fn_info (const struct coverage_data *, tree, tree); +static tree build_fn_info (const struct coverage_data *, tree); static tree build_info (tree, tree); static bool coverage_obj_init (void); static vec *coverage_obj_fn @@ -1668,16 +1668,9 @@ build_fn_info_type (tree type, unsigned counters, finish_builtin_struct (ctr_info, "__gcov_ctr_info", fields, NULL_TREE); - /* key */ - field = build_decl (BUILTINS_LOCATION, FIELD_DECL, NULL_TREE, - build_pointer_type (build_qualified_type - (gcov_info_type, TYPE_QUAL_CONST))); - fields = field; - /* ident */ field = build_decl (BUILTINS_LOCATION, FIELD_DECL, NULL_TREE, get_gcov_unsigned_t ()); - DECL_CHAIN (field) = fields; fields = field; /* lineno_checksum */ @@ -1705,10 +1698,10 @@ build_fn_info_type (tree type, unsigned counters, /* Returns a CONSTRUCTOR for a gcov_fn_info. DATA is the coverage data for the function and TYPE is the gcov_fn_info - RECORD_TYPE. KEY is the object file key. */ + RECORD_TYPE. */ static tree -build_fn_info (const struct coverage_data *data, tree type, tree key) +build_fn_info (const struct coverage_data *data, tree type) { tree fields = TYPE_FIELDS (type); tree ctr_type; @@ -1716,11 +1709,6 @@ static tree vec *v1 = NULL; vec *v2 = NULL; - /* key */ - CONSTRUCTOR_APPEND_ELT (v1, fields, - build1 (ADDR_EXPR, TREE_TYPE (fields), key)); - fields = DECL_CHAIN (fields); - /* ident */ CONSTRUCTOR_APPEND_ELT (v1, fields, build_int_cstu (get_gcov_unsigned_t (), @@ -2556,7 +2544,7 @@ static vec * coverage_obj_fn (vec *ctor, tree fn, struct coverage_data const *data) { - tree init = build_fn_info (data, gcov_fn_info_type, gcov_info_var); + tree init = build_fn_info (data, gcov_fn_info_type); tree var = build_var (fn, gcov_fn_info_type, -1); DECL_INITIAL (var) = init; Index: libgcc/libgcov-driver.c === --- libgcc/libgcov-driver.c (revision 227984) +++ libgcc/libgcov-driver.c (working copy) @@ -380,7 +380,7 @@ gcov_compute_histogram (struct gcov_summary *sum) { gfi_ptr = gi_ptr->functions[f_ix]; - if (!gfi_ptr || gfi_ptr->key != gi_ptr) + if (!gfi_ptr) continue; ci_ptr = &gfi_ptr->ctrs[ctr_info_ix]; @@ -430,9 +430,6 @@ gcov_exit_compute_summary (struct gcov_summary *th { gfi_ptr = gi_ptr->functions[f_ix]; - if (gfi_ptr && gfi_ptr->key != gi_ptr) -gfi_ptr = 0; - crc32 = crc32_unsigned (crc32, gfi_ptr ? gfi_ptr->cfg_checksum : 0); crc32 = crc32_unsigned (crc32, gfi_ptr ? gfi_ptr->lineno_checksum : 0); @@ -688,7 +685,7 @@ gcov_exit_merge_gcda (struct gcov_info *gi_ptr, if (length != GCOV_TAG_FUNCTION_LENGTH) goto read_mismatch; - if (!gfi_ptr || gfi_ptr->key != gi_ptr) + if (!gfi_ptr) { /* This function appears in the other program. We need to buffer the information in order to write @@ -832,10 +829,8 @@ gcov_write_func_counters (struct gcov_info *gi_ptr else { gfi_ptr = gi_ptr->functions[f_ix]; - if (gfi_ptr && gfi_ptr->key == gi_ptr) + if (gfi_pt
[PATCH] multi-target indirect-call promotion
Hi, This patch implements the use phase of indirect-call-topn-profile that promotes multiple targets of indirect-calls. It addresses pr/45631. The main idea is to use current speculation framework, with a vector of direct edges (sorted by the probability). The trick part is the we have multiple edges associating withe same call_stmt. Resolve_speculation and edirect_call_stmt_to_callee are more complex now. This patch still has some issues. It runs into an assertion verify_cgraph_node assertion() when ruing 483.xalancbmk in spec2006. I think Honza should be able to quickly spot the issue and give valuable suggestions. Thanks, -Rong indirect_call_topn_profile_use_patch Description: Binary data
[google gcc-4.9] FDO build for linux kernel
Here is patch that ports our work on FDO linux kernel build support to gcc-4_9. With this patch, kernel will use the libgcov functions to dump the gcda files. This patch also enables LIPO build. But the module grouping is not computed online. We will use gcov-tool to do this offline. Tested with kernel FDO build, regression and google internal benchmarks. Thanks, -Rong kfdo_patch Description: Binary data
[google gcc-4_9] make ifunc support available for BIONIC
ifunc support is hard-coded as false for BIONIC. This patch removes this check and let configure decide whether it should have ifunc support. Thanks, -Rong ifunc_diff Description: Binary data
Re: [PATCH PR63581] Fix undefined references in debug_info
Here is the diff for ChangeLog Index: ChangeLog === --- ChangeLog (revision 216415) +++ ChangeLog (working copy) @@ -1,3 +1,8 @@ +2014-10-17 + + * cfgrtl.c (emit_barrier_after_bb): Append the barrier to the + footer, instead of unconditionally overwritten. + 2014-10-17 Eric Botcazou * ipa-inline-transform.c (master_clone_with_noninline_clones_p): New. Index: testsuite/ChangeLog === --- testsuite/ChangeLog (revision 216415) +++ testsuite/ChangeLog (working copy) @@ -1,3 +1,6 @@ +2014-10-17 + * g++.dg/tree-prof/pr63581.C: New test. + 2014-10-17 Marek Polacek PR c/63543 On Fri, Oct 17, 2014 at 2:54 PM, Rong Xu wrote: > Hi, > > The attached patch fixes PR63581. The diagnosis is in the bug report. > Google ref b/17759776. > > Tested with bootstrap and regression. > > Thanks, > > -Rong
[PATCH PR63581] Fix undefined references in debug_info
Hi, The attached patch fixes PR63581. The diagnosis is in the bug report. Google ref b/17759776. Tested with bootstrap and regression. Thanks, -Rong 63581_patch Description: Binary data
[PATCH] move gcov-dump functionality to gcov-tool
Hi This patch moves the gcov-dump functionality to gcov-tool (which is installed by default). The options are exactly the same as before. The difference is instead of calling "gcov-dump ...", we now call "gcov-tool dump ..." gcov-dump is useful in debugging fdo issues. I think it would be very helpful to make it available with the compiler. gcov-dump.c and the build construct are deleted with this patch. Tested with bootstrap and some spec2006 profiles. Thanks, -Rong gcov_tool_dump_patch Description: Binary data
Re: [PATCH] add overlap function to gcov-tool
On Tue, Oct 7, 2014 at 9:31 PM, Jan Hubicka wrote: >> Hi, >> >> This patch adds overlap functionality to gcov-tool. The overlap score >> estimates the similarity of two profiles. Currently it only computes >> overlap for arc counters. >> >> The overlap score is defined as >> \sum minimum (p1-counter[i] / p1-sum-all, p2-counter[i] / p2-sum-all) >> where p1-counter[i] and p2-counter[2] are two matched counter from >> profile1 and profiler2. >> p1-sum-all and p2-sum-all are the sum-all counters in profiler1 and >> profile2, repetitively. > > The patch looks fine in general. My statistics is all rusty, but can't we use > one of the established techniques like Kullback-Leibler to compare the > probabilitis distributions? Interesting. I never thought of using Kullback-Leibler divergence. It's very easy to switch to KL using this overlap framework -- only a few lines of change. The problem is KL divergence assumes absolute continuity (i.e. q(i) ==0 --> p(i) =0, which is not true in our distribution, I'm not sure how to work around this.). I did try earth-mover-distance (EMD) in our earlier internal version. But since I used uniform distance, the problem can be simplified to distribution diffs. > It would be also nice to have ability to compare > branch probabilities in btween train runs. Do you mean to do the comparison in CFG rather on the raw counters? We need gcno file to reconstruct the CFG. That needs some work. -Rong > > Honza >> >> The resulting score is a value ranging from 0.0 to 1.0 where 0.0 means >> no match and 1.0 mean a perfect match. >> >> This tool can be used in performance triaging and reducing the fdo >> training set size (where similar inputs can be pruned). >> >> Tested with spec2006 profiles. >> >> Thanks, >> >> -Rong > >> 2014-10-07 Rong Xu >> >> * gcc/gcov-tool.c (profile_overlap): New driver function >> to compute profile overlap. >> (print_overlap_usage_message): New. >> (overlap_usage): New. >> (do_overlap): New. >> (print_usage): Add calls to overlap function. >> (main): Ditto. >> * libgcc/libgcov-util.c (read_gcda_file): Fix format. >> (find_match_gcov_info): Ditto. >> (calculate_2_entries): New. >> (compute_one_gcov): Ditto. >> (gcov_info_count_all_cold): Ditto. >> (gcov_info_count_all_zero): Ditto. >> (extract_file_basename): Ditto. >> (get_file_basename): Ditto. >> (set_flag): Ditto. >> (matched_gcov_info): Ditto. >> (calculate_overlap): Ditto. >> (gcov_profile_overlap): Ditto. >> * libgcc/libgcov-driver.c (compute_summary): Make >> it avavilable for external calls. >> * gcc/doc/gcov-tool.texi: Add documentation. >> >> Index: gcc/gcov-tool.c >> === >> --- gcc/gcov-tool.c (revision 215981) >> +++ gcc/gcov-tool.c (working copy) >> @@ -39,6 +39,7 @@ see the files COPYING3 and COPYING.RUNTIME respect >> #include >> >> extern int gcov_profile_merge (struct gcov_info*, struct gcov_info*, int, >> int); >> +extern int gcov_profile_overlap (struct gcov_info*, struct gcov_info*); >> extern int gcov_profile_normalize (struct gcov_info*, gcov_type); >> extern int gcov_profile_scale (struct gcov_info*, float, int, int); >> extern struct gcov_info* gcov_read_profile_dir (const char*, int); >> @@ -368,6 +369,121 @@ do_rewrite (int argc, char **argv) >>return ret; >> } >> >> +/* Driver function to computer the overlap score b/w profile D1 and D2. >> + Return 1 on error and 0 if OK. */ >> + >> +static int >> +profile_overlap (const char *d1, const char *d2) >> +{ >> + struct gcov_info *d1_profile; >> + struct gcov_info *d2_profile; >> + >> + d1_profile = gcov_read_profile_dir (d1, 0); >> + if (!d1_profile) >> +return 1; >> + >> + if (d2) >> +{ >> + d2_profile = gcov_read_profile_dir (d2, 0); >> + if (!d2_profile) >> +return 1; >> + >> + return gcov_profile_overlap (d1_profile, d2_profile); >> +} >> + >> + return 1; >> +} >> + >> +/* Usage message for profile overlap. */ >> + >> +static void >> +print_overlap_usage_message (int error_p) >> +{ >> + FILE *file = error_p ? stderr : stdout; >> + >> + fnotice (file, " overlap [options] Compute the >> overlap of two profiles\n"
[PATCH] add overlap function to gcov-tool
Hi, This patch adds overlap functionality to gcov-tool. The overlap score estimates the similarity of two profiles. Currently it only computes overlap for arc counters. The overlap score is defined as \sum minimum (p1-counter[i] / p1-sum-all, p2-counter[i] / p2-sum-all) where p1-counter[i] and p2-counter[2] are two matched counter from profile1 and profiler2. p1-sum-all and p2-sum-all are the sum-all counters in profiler1 and profile2, repetitively. The resulting score is a value ranging from 0.0 to 1.0 where 0.0 means no match and 1.0 mean a perfect match. This tool can be used in performance triaging and reducing the fdo training set size (where similar inputs can be pruned). Tested with spec2006 profiles. Thanks, -Rong 2014-10-07 Rong Xu * gcc/gcov-tool.c (profile_overlap): New driver function to compute profile overlap. (print_overlap_usage_message): New. (overlap_usage): New. (do_overlap): New. (print_usage): Add calls to overlap function. (main): Ditto. * libgcc/libgcov-util.c (read_gcda_file): Fix format. (find_match_gcov_info): Ditto. (calculate_2_entries): New. (compute_one_gcov): Ditto. (gcov_info_count_all_cold): Ditto. (gcov_info_count_all_zero): Ditto. (extract_file_basename): Ditto. (get_file_basename): Ditto. (set_flag): Ditto. (matched_gcov_info): Ditto. (calculate_overlap): Ditto. (gcov_profile_overlap): Ditto. * libgcc/libgcov-driver.c (compute_summary): Make it avavilable for external calls. * gcc/doc/gcov-tool.texi: Add documentation. Index: gcc/gcov-tool.c === --- gcc/gcov-tool.c (revision 215981) +++ gcc/gcov-tool.c (working copy) @@ -39,6 +39,7 @@ see the files COPYING3 and COPYING.RUNTIME respect #include extern int gcov_profile_merge (struct gcov_info*, struct gcov_info*, int, int); +extern int gcov_profile_overlap (struct gcov_info*, struct gcov_info*); extern int gcov_profile_normalize (struct gcov_info*, gcov_type); extern int gcov_profile_scale (struct gcov_info*, float, int, int); extern struct gcov_info* gcov_read_profile_dir (const char*, int); @@ -368,6 +369,121 @@ do_rewrite (int argc, char **argv) return ret; } +/* Driver function to computer the overlap score b/w profile D1 and D2. + Return 1 on error and 0 if OK. */ + +static int +profile_overlap (const char *d1, const char *d2) +{ + struct gcov_info *d1_profile; + struct gcov_info *d2_profile; + + d1_profile = gcov_read_profile_dir (d1, 0); + if (!d1_profile) +return 1; + + if (d2) +{ + d2_profile = gcov_read_profile_dir (d2, 0); + if (!d2_profile) +return 1; + + return gcov_profile_overlap (d1_profile, d2_profile); +} + + return 1; +} + +/* Usage message for profile overlap. */ + +static void +print_overlap_usage_message (int error_p) +{ + FILE *file = error_p ? stderr : stdout; + + fnotice (file, " overlap [options] Compute the overlap of two profiles\n"); + fnotice (file, "-v, --verbose Verbose mode\n"); + fnotice (file, "-h, --hotonly Only print info for hot objects/functions\n"); + fnotice (file, "-f, --function Print function level info\n"); + fnotice (file, "-F, --fullname Print full filename\n"); + fnotice (file, "-o, --objectPrint object level info\n"); + fnotice (file, "-t , --hot_threshold Set the threshold for hotness\n"); + +} + +static const struct option overlap_options[] = +{ + { "verbose",no_argument, NULL, 'v' }, + { "function", no_argument, NULL, 'f' }, + { "fullname", no_argument, NULL, 'F' }, + { "object", no_argument, NULL, 'o' }, + { "hotonly",no_argument, NULL, 'h' }, + { "hot_threshold", required_argument, NULL, 't' }, + { 0, 0, 0, 0 } +}; + +/* Print overlap usage and exit. */ + +static void +overlap_usage (void) +{ + fnotice (stderr, "Overlap subcomand usage:"); + print_overlap_usage_message (true); + exit (FATAL_EXIT_CODE); +} + +int overlap_func_level; +int overlap_obj_level; +int overlap_hot_only; +int overlap_use_fullname; +double overlap_hot_threshold = 0.005; + +/* Driver for profile overlap sub-command. */ + +static int +do_overlap (int argc, char **argv) +{ + int opt; + int ret; + + optind = 0; + while ((opt = getopt_long (argc, argv, "vfFoht:", overlap_options, NULL)) != -1) +{ + switch (opt) +{ +case 'v': + verbose
Re: [PATCH] Indirect-call topn targets profiler (instrumentation)
On Mon, Oct 6, 2014 at 12:21 PM, Jan Hubicka wrote: >> On Mon, Oct 6, 2014 at 11:22 AM, Jan Hubicka wrote: >> > Rong, >> > Would be possible to use topn profiler to get resonale histograms for >> > switch >> > expansion, too? In that case it may make sense to have value version of >> > it as well. >> Yes. That's doable. I'm just not sure if top 2 entries are enough. I can see from your example that it will be useful for BBs having multiple case stmt. >> The underlying topn_profiler can be used the same way as the current >> one_value profiler for different scenarios. However, I wonder if it is >> useful for switch value profiling -- the arc profiling can do the >> same. > > Well, for switch expansion one really wants a histogram of values across the > switch > values for the decision tree balancing. But I guess this profiler does not > fit > that very well. > > Consider cases like > > switch (a) > { >case 3: >case 5: >case 7: > prime(); >case 2: >case 4: >case 6: > even(); > default: > bogus (); > } > > Arc profiling does tell you how often a is prime or even, but it does not > help you > to balance the tree, because the value ranges are iinterlacing >> >> David >> >> > >> > Otherwise the patch seems OK. I would implement it myself by introducing >> > separate >> > variables holding the topn profiler declaration, but can not think of >> > downsides of >> > doing it your way. >> > >> > The topn profiler should be better on determining the dominating target, >> > so perhaps >> > you could look into hooking it into ipa-profile.c. Extending speculative >> > edges for >> > multiple targets ought to be also quite easy - we need to update >> > speculative_call_info >> > to collect edges annotated to a given call stmt into a vector instead of >> > taking the >> > two references it does now. >> > >> > It would be also very nice to update it to handle case where all the edges >> > are direct >> > so we can use it tospeculatively inlinine functions that can be interposed >> > at runtime. Yes. I'm looking at this too. I actually ported the fdo-use part of the patch and it passes spec2006. But it performs the transformation using the old way (in gimple_ic_transform()) -- so I did not send out that patch. I'm trying to do this in the new way (in ipa-profile()). In any case, that will be a separated patch. Also, the switch expansion profile should be also a separated patch. Is it ok to commit these two patches now? Thanks, -Rong >> > >> > Thanks, >> > Honza >> > >> > Index: gcc/tree-profile.c >> > === >> > --- gcc/tree-profile.c (revision 215886) >> > +++ gcc/tree-profile.c (working copy) >> > @@ -56,6 +56,7 @@ along with GCC; see the file COPYING3. If not see >> > #include "target.h" >> > #include "tree-cfgcleanup.h" >> > #include "tree-nested.h" >> > +#include "params.h" >> > >> > static GTY(()) tree gcov_type_node; >> > static GTY(()) tree tree_interval_profiler_fn; >> > @@ -101,7 +102,10 @@ init_ic_make_global_vars (void) >> > { >> >ic_void_ptr_var >> > = build_decl (UNKNOWN_LOCATION, VAR_DECL, >> > - get_identifier ("__gcov_indirect_call_callee"), >> > + get_identifier ( >> > + (PARAM_VALUE (PARAM_INDIR_CALL_TOPN_PROFILE) >> > ? >> > + "__gcov_indirect_call_topn_callee" : >> > + "__gcov_indirect_call_callee")), >> > ptr_void); >> >TREE_PUBLIC (ic_void_ptr_var) = 1; >> >DECL_EXTERNAL (ic_void_ptr_var) = 1; >> > @@ -131,7 +135,10 @@ init_ic_make_global_vars (void) >> > { >> >ic_gcov_type_ptr_var >> > = build_decl (UNKNOWN_LOCATION, VAR_DECL, >> > - get_identifier ("__gcov_indirect_call_counters"), >> > + get_identifier ( >> > + (PARAM_VALUE (PARAM_INDIR_CALL_TOPN_PROFILE) >> > ? >> > + "__gcov_indirect_call_topn_counters" : >> > + "__gcov_indirect_call_counters")), >> > gcov_type_ptr); >> >TREE_PUBLIC (ic_gcov_type_ptr_var) = 1; >> >DECL_EXTERNAL (ic_gcov_type_ptr_var) = 1; >> > @@ -226,8 +233,10 @@ gimple_init_edge_profiler (void) >> > ptr_void, >> > NULL_TREE); >> > tree_indirect_call_profiler_fn >> > - = build_fn_decl ("__gcov_indirect_call_profiler_v2", >> > -ic_profiler_fn_type); >> > + = build_fn_decl ( (PARAM_VALUE >> > (PARAM_INDIR_CALL_TOPN_PROFILE) ? >> > +"__gcov_indirect_call_topn_profiler": >> > +"__gcov_indirect_call_profiler_v2"), >> > +
[PATCH] Indirect-call topn targets profiler (runtime)
Hi, I ported the indirect-call topn target profile from google branch. It implements the algorithm described in "Value profiling and optimization" by Brad Calder and Peter Feller. This patch is the runtime support. Tested with gcc regression tests and the instrumentation patch (in a separated patch). Thanks, -Rong patch1_diff Description: Binary data
[PATCH] Indirect-call topn targets profiler (instrumentation)
Hi, I ported the indirect-call topn target profile from google branch. It implements the algorithm described in "Value profiling and optimization" by Brad Calder and Peter Feller. This patch is about the instrumentation. When --param=indir-call-topn-profile=1 is specified, we will use indirection-call-topn profiler in this patch, instead of indirect_call_profiler_v2. Tested with gcc regression tests and the runtime patch (in a separated patch). Thanks, -Rong patch2_diff Description: Binary data
[google gcc-4_9] fix undefined references in debug_info
Hi, This patch fixed a bug exposed in build kernel with fdo. We cannot simply overwrite the bb footer in emit_barrier_after_bb as the bb may already have a footer (in this case, a deleted label stmt). We need to output this label because it's a user label and debug_info has a reference to it. Tested with problematic file and regression test. Trunk may also have the same issue, but I need to work on a testcase. Thanks, -Rong 2014-10-02 Rong Xu * gcc/cfgrtl.c (emit_barrier_after_bb): Append footer instead of overwriting. Index: gcc/cfgrtl.c === --- gcc/cfgrtl.c(revision 215823) +++ gcc/cfgrtl.c(working copy) @@ -1453,7 +1453,20 @@ emit_barrier_after_bb (basic_block bb) gcc_assert (current_ir_type () == IR_RTL_CFGRTL || current_ir_type () == IR_RTL_CFGLAYOUT); if (current_ir_type () == IR_RTL_CFGLAYOUT) -BB_FOOTER (bb) = unlink_insn_chain (barrier, barrier); +{ + rtx insn = unlink_insn_chain (barrier, barrier); + + if (BB_FOOTER (bb)) + { + rtx footer_tail = BB_FOOTER (bb); + while (NEXT_INSN(footer_tail)) +footer_tail = NEXT_INSN (insn); + NEXT_INSN (footer_tail) = insn; + PREV_INSN (insn) = footer_tail; + } + else +BB_FOOTER (bb) = insn; +} } /* Like force_nonfallthru below, but additionally performs redirection
[PATCH] PR61889
This patch makes the build of gcov-tool configurable. It checks if ftw.h is available. For mingw build, it provides ftw functionality by using FindFirstFile/FindNextFile/FindClose API. Tested with and without --disable-gcov-tool. Thanks, -Rong 2014-09-02 Rong Xu * gcc/Makefile.in: Make the build gcov-tool configurable. * gcc/configure.ac: Ditto. * gcc/configure: Ditto. * gcc/config.in: Ditto. * gcc/gcov-tool.c (unlink_gcda_file): Support win32 build. (unlink_profile_dir): Ditto. * libgcc/libgcov-util.c (read_gcda_file): Ditto. (read_file_handler): Ditto. (ftw_read_file): Ditto. (myftw): Ditto. (gcov_read_profile_dir): Ditto. (gcov_profile_normalize): Ditto. Index: gcc/Makefile.in === --- gcc/Makefile.in (revision 214831) +++ gcc/Makefile.in (working copy) @@ -123,9 +123,13 @@ SUBDIRS =@subdirs@ build # Selection of languages to be made. CONFIG_LANGUAGES = @all_selected_languages@ -LANGUAGES = c gcov$(exeext) gcov-dump$(exeext) gcov-tool$(exeext) \ -$(CONFIG_LANGUAGES) +LANGUAGES = c gcov$(exeext) gcov-dump$(exeext) $(CONFIG_LANGUAGES) +disable_gcov_tool = @disable_gcov_tool@ +ifneq ($(disable_gcov_tool),yes) +LANGUAGES += gcov-tool$(exeext) +endif + # Default values for variables overridden in Makefile fragments. # CFLAGS is for the user to override to, e.g., do a cross build with -O2. # TCFLAGS is used for compilations with the GCC just built. Index: gcc/configure.ac === --- gcc/configure.ac(revision 214831) +++ gcc/configure.ac(working copy) @@ -5650,6 +5650,26 @@ if test "${ENABLE_LIBQUADMATH_SUPPORT}" != "no" ; fi +# Check if gcov-tool can be built. +AC_ARG_ENABLE(gcov-tool, +[AS_HELP_STRING([--disable-gcov-tool], +[disable the build of gcov-tool])]) +if test x"$enable_gcov_tool" = x"no"; then + disable_gcov_tool=yes +else + AC_CHECK_HEADERS(ftw.h, [disable_gcov_tool=no], + [case $host_os in + win32 | cygwin* | mingw32*) +disable_gcov_tool=no +;; + *) +disable_gcov_tool=yes +;; +esac]) +fi +AC_SUBST(disable_gcov_tool) + + # Specify what hash style to use by default. AC_ARG_WITH([linker-hash-style], [AC_HELP_STRING([--with-linker-hash-style={sysv,gnu,both}], Index: gcc/configure === --- gcc/configure (revision 214831) +++ gcc/configure (working copy) @@ -600,6 +600,7 @@ ac_includes_default="\ ac_subst_vars='LTLIBOBJS LIBOBJS +disable_gcov_tool PICFLAG enable_host_shared enable_plugin @@ -932,6 +933,7 @@ enable_version_specific_runtime_libs enable_plugin enable_host_shared enable_libquadmath_support +enable_gcov_tool with_linker_hash_style ' ac_precious_vars='build_alias @@ -1655,6 +1657,7 @@ Optional Features: --enable-host-sharedbuild host code as shared libraries --disable-libquadmath-support disable libquadmath support for Fortran + --disable-gcov-tool disable the build of gcov-tool Optional Packages: --with-PACKAGE[=ARG]use PACKAGE [ARG=yes] @@ -8353,7 +8356,7 @@ fi for ac_header in limits.h stddef.h string.h strings.h stdlib.h time.h iconv.h \ fcntl.h unistd.h sys/file.h sys/time.h sys/mman.h \ sys/resource.h sys/param.h sys/times.h sys/stat.h \ -direct.h malloc.h langinfo.h ldfcn.h locale.h wchar.h +direct.h malloc.h langinfo.h ldfcn.h locale.h wchar.h ftw.h do : as_ac_Header=`$as_echo "ac_cv_header_$ac_header" | $as_tr_sh` ac_fn_c_check_header_preproc "$LINENO" "$ac_header" "$as_ac_Header" @@ -18033,7 +18036,7 @@ else lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2 lt_status=$lt_dlunknown cat > conftest.$ac_ext <<_LT_EOF -#line 18036 "configure" +#line 18039 "configure" #include "confdefs.h" #if HAVE_DLFCN_H @@ -18139,7 +18142,7 @@ else lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2 lt_status=$lt_dlunknown cat > conftest.$ac_ext <<_LT_EOF -#line 18142 "configure" +#line 18145 "configure" #include "confdefs.h" #if HAVE_DLFCN_H @@ -28116,6 +28119,33 @@ $as_echo "#define ENABLE_LIBQUADMATH_SUPPORT 1" >> fi +# Check if gcov-tool can be built. +# Check whether --enable-gcov-tool was given. +if test "${enable_gcov_tool+set}" = set; then : + enableval=$enable_gcov_tool; +fi + +if test x"$enable_gcov_tool" = x"no"; then + disable_gcov_tool=yes
[patch] make gcov-tool build configurable and fix mingw build issue
Hi, This patch makes gcov-tool build configurable. A new configure option --disable-gcov-tool is added to disable the build. It also fixes some build issues for mingw. Tested with regression and bootstrap. mingw test is ongoing. OK to check in if mingw build passes? Thanks, -Rong gcov_tool_patch_diff Description: Binary data
Re: [PATCH] offline gcda profile processing tool
On Fri, Jul 11, 2014 at 11:46 AM, Jan Hubicka wrote: >> Richard, >> >> I looked at my patch again. I already add -Wno-error to libgcov-util.o >> compilation: >> In line 200 of gcc/Makefile.in >> libgcov-util.o-warn = -Wno-error >> >> In my test, I used gcc-4.6 as the host compiler. I got warning like this: >> >> In file included from ../../gcc/gcc/../libgcc/libgcov-util.c:30:0: >> ../../gcc/gcc/../libgcc/libgcov.h:184:30: warning: ISO C++ forbids >> zero-size array ???ctrs??? [-pedantic] >> >> Can you check your buildlog to see if -Wno-error is added to the command >> line? > > I would preffer the libgcov.h (that is interface header to libgcov users) to > be > valid C++, so we still ought to fix it. > > Honza OK. I will send out a patch for review. -Rong >> >> Thanks, >> >> -Rong >> >> On Fri, Jul 11, 2014 at 9:47 AM, Rong Xu wrote: >> > I did see the warning in the bootstrap, but it did not exit the build. >> > I thought it was ok. >> > >> > I'll have a patch for this and send for review. >> > >> > -Rong >> > >> > On Fri, Jul 11, 2014 at 9:13 AM, Xinliang David Li >> > wrote: >> >> right. >> >> >> >> Rong, the fix would be just change ctr array size to 1. For each >> >> function, there should be at least one kind of counters -- see the >> >> assertion in build_fn_info_type. There are some code that do >> >> 'sizeof(gcov_fn_info)' when computing heap size -- they can be >> >> adjusted or leave it as it is (if not doing memcpy for the whole >> >> array). >> >> >> >> David >> >> >> >> On Fri, Jul 11, 2014 at 8:44 AM, Jakub Jelinek wrote: >> >>> On Fri, Jul 11, 2014 at 08:42:27AM -0700, Xinliang David Li wrote: >> >>>> I wonder why. The struct definition for gcov_fn_info has not changed >> >>>> in this patch. >> >>> >> >>> Perhaps it has been used only in C until now? >> >>> >> >>> Jakub
Re: [PATCH] offline gcda profile processing tool
Richard, I looked at my patch again. I already add -Wno-error to libgcov-util.o compilation: In line 200 of gcc/Makefile.in libgcov-util.o-warn = -Wno-error In my test, I used gcc-4.6 as the host compiler. I got warning like this: In file included from ../../gcc/gcc/../libgcc/libgcov-util.c:30:0: ../../gcc/gcc/../libgcc/libgcov.h:184:30: warning: ISO C++ forbids zero-size array ‘ctrs’ [-pedantic] Can you check your buildlog to see if -Wno-error is added to the command line? Thanks, -Rong On Fri, Jul 11, 2014 at 9:47 AM, Rong Xu wrote: > I did see the warning in the bootstrap, but it did not exit the build. > I thought it was ok. > > I'll have a patch for this and send for review. > > -Rong > > On Fri, Jul 11, 2014 at 9:13 AM, Xinliang David Li wrote: >> right. >> >> Rong, the fix would be just change ctr array size to 1. For each >> function, there should be at least one kind of counters -- see the >> assertion in build_fn_info_type. There are some code that do >> 'sizeof(gcov_fn_info)' when computing heap size -- they can be >> adjusted or leave it as it is (if not doing memcpy for the whole >> array). >> >> David >> >> On Fri, Jul 11, 2014 at 8:44 AM, Jakub Jelinek wrote: >>> On Fri, Jul 11, 2014 at 08:42:27AM -0700, Xinliang David Li wrote: >>>> I wonder why. The struct definition for gcov_fn_info has not changed >>>> in this patch. >>> >>> Perhaps it has been used only in C until now? >>> >>> Jakub
Re: [PATCH] offline gcda profile processing tool
I did see the warning in the bootstrap, but it did not exit the build. I thought it was ok. I'll have a patch for this and send for review. -Rong On Fri, Jul 11, 2014 at 9:13 AM, Xinliang David Li wrote: > right. > > Rong, the fix would be just change ctr array size to 1. For each > function, there should be at least one kind of counters -- see the > assertion in build_fn_info_type. There are some code that do > 'sizeof(gcov_fn_info)' when computing heap size -- they can be > adjusted or leave it as it is (if not doing memcpy for the whole > array). > > David > > On Fri, Jul 11, 2014 at 8:44 AM, Jakub Jelinek wrote: >> On Fri, Jul 11, 2014 at 08:42:27AM -0700, Xinliang David Li wrote: >>> I wonder why. The struct definition for gcov_fn_info has not changed >>> in this patch. >> >> Perhaps it has been used only in C until now? >> >> Jakub
Re: [PATCH] offline gcda profile processing tool
I should use _WIN32 macro. This code is for windows mkdir api. I'll commit a patch for this shortly (approved by honza). -Rong On Fri, Jul 11, 2014 at 8:49 AM, Xinliang David Li wrote: > What is the macro to test POSIX IO on host? The guard uses > TARGET_POSIX_IO which is not right. > > David > > On Fri, Jul 11, 2014 at 1:12 AM, Christophe Lyon > wrote: >> On 11 July 2014 10:07, Richard Biener wrote: >>> On Mon, May 5, 2014 at 7:17 PM, Rong Xu wrote: >>>> Here is the updated patch. The difference with patch set 3 is >>>> (1) use the gcov-counter.def for scaling operation. >>>> (2) fix wrong scaling function for time-profiler. >>>> >>>> passed with bootstrap, profiledboostrap and SPEC2006. >>> >>> One of the patches breaks bootstrap for me: >>> >>> /space/rguenther/src/svn/trunk/gcc/../libgcc/libgcov.h:184: error: ISO >>> C++ forbids zero-size array ‘ctrs’ >>> make[3]: *** [libgcov-util.o] Error 1 >>> >>> host compiler is gcc 4.3.4 >>> >>> Richard. >>> >> >> On my side, commit 212448 breaks the cross-build of GCC for targets >> using newlib: >> * arm-none-eabi >> * aarch64-none-elf >> /usr/include/sys/stat.h: In function <80><98>void >> gcov_output_files(const char*, gcov_info*)<80><99>: >> /usr/include/sys/stat.h:317: error: too few arguments to function >> <80><98>int mkdir(const char*, __mode_t)<80><99> >> /tmp/1392958_1.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/gcov-tool.c:96: >> error: at this point in file >> make[2]: *** [gcov-tool.o] Error 1 >> >> Christophe. >> >>>> Thanks, >>>> >>>> -Rong >>>> >>>> On Thu, May 1, 2014 at 3:37 PM, Rong Xu wrote: >>>>> Hi, Honza, >>>>> >>>>> Please find the new patch in the attachment. This patch integrated >>>>> your earlier suggestions. The noticeable changes are: >>>>> (1) build specialized object for libgcov-driver.c and libgcov-merge.c >>>>> and link into gcov-tool, rather including the source file. >>>>> (2) split some gcov counter definition code to gcov-counter.def to >>>>> avoid code duplication. >>>>> (3) use a macro for gcov_read_counter(), so gcov-tool can use existing >>>>> code in libgcov-merge.c with minimal change. >>>>> >>>>> This patch does not address the suggestion of creating a new directory >>>>> for libgcov. I agree with you that's a much better >>>>> and cleaner structure we should go for. We can do that in follow-up >>>>> patches. >>>>> >>>>> I tested this patch with boostrap and profiledbootstrap. Other tests >>>>> are on-going. >>>>> >>>>> Thanks, >>>>> >>>>> -Rong >>>>> >>>>> On Wed, Apr 16, 2014 at 8:34 PM, Jan Hubicka wrote: >>>>>>> GCOT_TOOL needs to use this function to read the string in gcda file >>>>>>> to memory to construct gcov_info objects. >>>>>>> As you noticed, gcov runtime does not need this interface. But >>>>>>> gcov-tool links with gcov runtime and it also uses the function. >>>>>>> We could make it available in gcov_runtime, but that will slightly >>>>>>> increase the memory footprint. >>>>>> >>>>>> Yep, it is not really pretty. I wrote bellow some plan how things may be >>>>>> structured in more convenient way. Any work in that direction would be >>>>>> welcome. >>>>>>> >>>>>>> The planned new functions for trunk version is: (1) overlap score b/w >>>>>>> two profiles (2) better dumping (top hot objects/function/counters) >>>>>>> and statistics. >>>>>>> Once this basic version is in, we can start to add the new >>>>>>> functionality. >>>>>> >>>>>> Sounds good. I assume the autoFDO does not go via gcov tool but rather >>>>>> uses >>>>>> custom reader of profile data at GCC side? >>>>>> I wonder, are there any recent overview papers/slides/text of design of >>>>>> AutoFDO >>>>>> and other features to be merged? >>>>>> I remember the talk from GCC Summit and I did read some of pre-LTO LIPO >
Re: [BUILDROBOT] gcov patch
On Fri, Jul 11, 2014 at 8:06 AM, Jan Hubicka wrote: >> Sorry. This code meant to work with the different mkdir api in >> windows. I used wrong ifdef. >> >> Here is the patch. OK for checkin? > OK. I also see the following with LTO bootstrap: > ../../gcc/../libgcc/libgcov-util.c:41:24: error: type of �gcov_max_filename� > does not match original declaration [-Werror] > extern gcov_unsigned_t gcov_max_filename; > ^ > ../../gcc/../libgcc/libgcov-driver.c:88:8: note: previously declared here > size_t gcov_max_filename = 0; > Probably both can be size_t? OK. I will change this too and submit. Thanks for the quick review. -Rong > > Honza > >> >> Thanks, >> >> -Rong >> >> 2014-07-11 Rong Xu >> >> * gcov-tool.c (gcov_output_files): Fix build error. >> >> Index: gcov-tool.c >> === >> --- gcov-tool.c (revision 212448) >> +++ gcov-tool.c (working copy) >> @@ -90,8 +90,8 @@ gcov_output_files (const char *out, struct gcov_in >>/* Try to make directory if it doesn't already exist. */ >>if (access (out, F_OK) == -1) >> { >> -#ifdef TARGET_POSIX_IO >> - if (mkdir (out, 0755) == -1 && errno != EEXIST) >> +#if !defined(_WIN32) >> + if (mkdir (out, S_IRWXU | S_IRWXG | S_IRWXO) == -1 && errno != EEXIST) > > Sounds almost like something we could have libiberty glue for... > > >> #else >>if (mkdir (out) == -1 && errno != EEXIST) >> #endif >>
Re: [BUILDROBOT] gcov patch
Sorry. This code meant to work with the different mkdir api in windows. I used wrong ifdef. Here is the patch. OK for checkin? Thanks, -Rong 2014-07-11 Rong Xu * gcov-tool.c (gcov_output_files): Fix build error. Index: gcov-tool.c === --- gcov-tool.c (revision 212448) +++ gcov-tool.c (working copy) @@ -90,8 +90,8 @@ gcov_output_files (const char *out, struct gcov_in /* Try to make directory if it doesn't already exist. */ if (access (out, F_OK) == -1) { -#ifdef TARGET_POSIX_IO - if (mkdir (out, 0755) == -1 && errno != EEXIST) +#if !defined(_WIN32) + if (mkdir (out, S_IRWXU | S_IRWXG | S_IRWXO) == -1 && errno != EEXIST) #else if (mkdir (out) == -1 && errno != EEXIST) #endif On Fri, Jul 11, 2014 at 6:08 AM, Jan-Benedict Glaw wrote: > On Fri, 2014-07-11 15:03:06 +0200, Jan-Benedict Glaw > wrote: >> See eg. >> http://toolchain.lug-owl.de/buildbot/show_build_details.php?id=289639, >> it breaks like this: >> >> [...] >> g++ -c -g -O2 -DIN_GCC -DCROSS_DIRECTORY_STRUCTURE -fno-exceptions >> -fno-rtti -fasynchronous-unwind-tables -W -Wall -Wwrite-strings -Wcast-qual >> -Wmissing-format-attribute -Woverloaded-virtual -pedantic -Wno-long-long >> -Wno-variadic-macros -Wno-overlength-strings -fno-common -DHAVE_CONFIG_H >> -I. -I. -I/home/jbglaw/repos/gcc/gcc -I/home/jbglaw/repos/gcc/gcc/. >> -I/home/jbglaw/repos/gcc/gcc/../include >> -I/home/jbglaw/repos/gcc/gcc/../libcpp/include >> -I/home/jbglaw/repos/gcc/gcc/../libdecnumber >> -I/home/jbglaw/repos/gcc/gcc/../libdecnumber/dpd -I../libdecnumber >> -I/home/jbglaw/repos/gcc/gcc/../libbacktrace-o gcov-tool.o -MT >> gcov-tool.o -MMD -MP -MF ./.deps/gcov-tool.TPo >> /home/jbglaw/repos/gcc/gcc/gcov-tool.c >> /usr/include/sys/stat.h: In function ‘void gcov_output_files(const char*, >> gcov_info*)’: >> /usr/include/sys/stat.h:323: error: too few arguments to function ‘int >> mkdir(const char*, __mode_t)’ >> /home/jbglaw/repos/gcc/gcc/gcov-tool.c:96: error: at this point in file >> make[1]: *** [gcov-tool.o] Error 1 >> make[1]: Leaving directory `/home/jbglaw/build/rl78-elf/build-gcc/gcc' >> make: *** [all-gcc] Error 2 > > [...] > > +#ifdef TARGET_POSIX_IO > + if (mkdir (out, 0755) == -1 && errno != EEXIST) > +#else > + if (mkdir (out) == -1 && errno != EEXIST) > +#endif > > > And with POSIX I/O, this should probably use the S_I* macros for the > mode bits? > > MfG, JBG > > -- > Jan-Benedict Glaw jbg...@lug-owl.de +49-172-7608481 > Signature of:Arroganz verkürzt fruchtlose Gespräche. > the second : -- Jan-Benedict Glaw
Re: [Google] Fix AFDO early inline ICEs due to DFE
This looks fine to me. -Rong On Thu, Jun 12, 2014 at 11:23 AM, Teresa Johnson wrote: > These two patches fix multiple ICE that occurred due to DFE being > recently enabled after AutoFDO LIPO linking. > > Passes regression and internal testing. Ok for Google/4_8? > > Teresa > > 2014-06-12 Teresa Johnson > Dehao Chen > > Google ref b/15521327. > > * cgraphclones.c (cgraph_clone_edge): Use resolved node. > * l-ipo.c (resolve_cgraph_node): Resolve to non-removable node. > > Index: cgraphclones.c > === > --- cgraphclones.c (revision 211386) > +++ cgraphclones.c (working copy) > @@ -94,6 +94,7 @@ along with GCC; see the file COPYING3. If not see > #include "ipa-utils.h" > #include "lto-streamer.h" > #include "except.h" > +#include "l-ipo.h" > > /* Create clone of E in the node N represented by CALL_EXPR the callgraph. > */ > struct cgraph_edge * > @@ -118,7 +119,11 @@ cgraph_clone_edge (struct cgraph_edge *e, struct c > >if (call_stmt && (decl = gimple_call_fndecl (call_stmt))) > { > - struct cgraph_node *callee = cgraph_get_node (decl); > + struct cgraph_node *callee; > + if (L_IPO_COMP_MODE && cgraph_pre_profiling_inlining_done) > +callee = cgraph_lipo_get_resolved_node (decl); > + else > +callee = cgraph_get_node (decl); > gcc_checking_assert (callee); > new_edge = cgraph_create_edge (n, callee, call_stmt, count, freq); > } > Index: l-ipo.c > === > --- l-ipo.c (revision 211386) > +++ l-ipo.c (working copy) > @@ -1542,6 +1542,18 @@ resolve_cgraph_node (struct cgraph_sym **slot, str >gcc_assert (decl1_defined); >add_define_module (*slot, decl2); > > + /* Pick the node that cannot be removed, to avoid a situation > + where we remove the resolved node and later try to access > + it for the remaining non-removable copy. E.g. one may be > + extern and the other weak, only the extern copy can be removed. */ > + if (cgraph_can_remove_if_no_direct_calls_and_refs_p ((*slot)->rep_node) > + && !cgraph_can_remove_if_no_direct_calls_and_refs_p (node)) > +{ > + (*slot)->rep_node = node; > + (*slot)->rep_decl = decl2; > + return; > +} > + >has_prof1 = has_profile_info (decl1); >bool is_aux1 = cgraph_is_auxiliary (decl1); >bool is_aux2 = cgraph_is_auxiliary (decl2); > > > -- > Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413
Re: [Google/4_8] Reduce memory overhead of LIPO COMDAT fixups
This patch looks good to me. -Rong On Thu, Jun 5, 2014 at 12:45 PM, Teresa Johnson wrote: > (cc'ing a few additional people to help with review as David is out > and I'm not sure Rong is available) > > This patch greatly reduces the memory overhead of the new COMDAT fixup > analysis, by changing the second level hash tables to linked lists. > I found that almost none of the second level hash tables contained more > than one entry, but each hash table required a lot of memory overhead. > > I also now query the fixup type before adding the checksums to the > pointer sets during callgraph building, which would have enabled a workaround > for the memory issue. > > Tested with regression tests and internal tests (see ref below for details). > > Google ref b/15415042. > > 2014-06-05 Teresa Johnson > > * dyn-ipa.c (struct lineno_checksum_alias): Replaced pointer set. > (struct checksum_alias_info): Enabled linked list. > (cfg_checksum_get_key): Removed. > (find_cfg_checksum): New function. > (cfg_checksum_insert): Operate on linked list. > (checksum_set_insert): Ditto. > (gcov_build_callgraph): Allow disabling checksum insertion. > (gcov_find_new_ic_target): Operate on linked list. > (gcov_fixup_counters_checksum): Ditto. > (gcov_fixup_counters_lineno): Ditto. > (__gcov_compute_module_groups): Compute fixup type earlier. > > Index: dyn-ipa.c > === > --- dyn-ipa.c (revision 211288) > +++ dyn-ipa.c (working copy) > @@ -79,16 +79,15 @@ struct dyn_cgraph >unsigned num_nodes_executed; >/* used by new algorithm */ >struct modu_node *modu_nodes; > - /* Set indexed by lineno_checksum, returns another dyn_pointer_set*, > - indexed by cfg_checksum. That returns a checksum_alias_info struct. */ > + /* Set indexed by lineno_checksum, returns a linked list of > + checksum_alias_info structs. */ >struct dyn_pointer_set *lineno_pointer_sets; > }; > > /* Struct holding information for functions with the same lineno_checksum. > */ > struct lineno_checksum_alias > { > - /* Set indexed by cfg_checksum, holding a checksum_alias_info struct. */ > - struct dyn_pointer_set *cfg_pointer_set; > + struct checksum_alias_info *cfg_checksum_list; >unsigned lineno_checksum; > }; > > @@ -96,6 +95,7 @@ struct lineno_checksum_alias > checksums. */ > struct checksum_alias_info > { > + struct checksum_alias_info *next_cfg_checksum; >struct checksum_alias *alias_list; >unsigned cfg_checksum; > }; > @@ -205,6 +205,7 @@ pointer_set_create (unsigned (*get_key) (const voi > static struct dyn_cgraph the_dyn_call_graph; > static int total_zero_count = 0; > static int total_insane_count = 0; > +static int fixup_type = 0; > > enum GROUPING_ALGORITHM > { > @@ -374,14 +375,6 @@ lineno_checksum_get_key (const void *p) >return ((const struct lineno_checksum_alias *) p)->lineno_checksum; > } > > -/* The cfg_checksum value in P is the key for a cfg_pointer_set. */ > - > -static inline unsigned > -cfg_checksum_get_key (const void *p) > -{ > - return ((const struct checksum_alias_info *) p)->cfg_checksum; > -} > - > /* Create a new checksum_alias struct for function with GUID, FI_PTR, > and ZERO_COUNTS flag. Prepends to list NEXT and returns new struct. */ > > @@ -398,28 +391,44 @@ new_checksum_alias (gcov_type guid, const struct g >return alias; > } > > -/* Insert a new checksum_alias struct into pointer set P for function with > +/* Locate the checksum_alias_info in LIST that matches CFG_CHECKSUM. */ > + > +static struct checksum_alias_info * > +find_cfg_checksum (struct checksum_alias_info *list, unsigned cfg_checksum) > +{ > + for (; list; list = list->next_cfg_checksum) > +{ > + if (list->cfg_checksum == cfg_checksum) > +return list; > +} > + return NULL; > +} > + > +/* Insert a new checksum_alias struct into LIST for function with > CFG_CHECKSUM and associated GUID, FI_PTR, and ZERO_COUNTS flag. */ > > -static void > -cfg_checksum_set_insert (struct dyn_pointer_set *p, unsigned cfg_checksum, > - gcov_type guid, const struct gcov_fn_info *fi_ptr, > - int zero_counts) > +static struct checksum_alias_info * > +cfg_checksum_insert (unsigned cfg_checksum, gcov_type guid, > + const struct gcov_fn_info *fi_ptr, int zero_counts, > + struct checksum_alias_info *list) > { > - struct checksum_alias_info **m = (struct checksum_alias_info **) > -pointer_set_find_or_insert (p, cfg_checksum); > - if (*m) > + struct checksum_alias_info *alias_info; > + alias_info = find_cfg_checksum (list, cfg_checksum); > + if (alias_info) > { > - gcc_assert ((*m)->alias_list); > - (*m)->alias_list = new_checksum_alias (guid, fi_ptr, zero_counts, > - (*m)->alias_l
Re: [Patch] Avoid gcc_assert in libgcov
I think these asserts will be used by gcov-tool. So I prefer to change them to gcov_nonruntime_assert(). I'll merge them in my new gcov-tool patch before submitting (waiting for honaz's ok). Thanks, -Rong On Thu, May 22, 2014 at 7:09 AM, Teresa Johnson wrote: > On Thu, Jan 16, 2014 at 2:41 PM, Jan Hubicka wrote: >>> On Wed, Jan 15, 2014 at 8:39 PM, Jan Hubicka wrote: >>> >> >>> >> In that case should we call gcov_error when IN_LIBGCOV? One >>> >> possibility would be to simply make gcov_nonruntime_assert be defined >>> >> as if (!EXPR) gcov_error in the IN_LIBGCOV case. But I think what you >>> >> wanted here was to reduce libgcov bloat by removing calls altogether, >>> >> which this wouldn't solve. But if we want to call gcov_error in some >>> >> cases, I think I need to add another macro that will either do >>> >> gcc_assert when !IN_LIBGCOV and "if (!EXPR) gcov_error" when >>> >> IN_LIBGCOV. Is that what you had in mind? >>> > >>> > I think for errors that can be triggered by data corruption, we ought to >>> > produce resonable error messages in both IN_LIBGCOV or for offline tools. >>> > Just unwound sequence if(...) gcov_error seems fine to me in this case, >>> > but we may also have assert like wrapper. >>> > I see we do not provide gcov_error outside libgcov, we probably ought to >>> > map >>> > it to fatal_error in GCC binary. >>> > >>> > thanks, >>> > Honza >>> >>> Ok, here is the new patch. Bootstrapped and tested on >>> x86_64-unknown-linux-gnu. Ok for trunk? >>> >>> Thanks, Teresa >>> >>> 2014-01-16 Teresa Johnson >>> >>> * gcov-io.c (gcov_position): Use gcov_nonruntime_assert. >>> (gcov_is_error): Remove gcc_assert from IN_LIBGCOV code. >>> (gcov_rewrite): Use gcov_nonruntime_assert. >>> (gcov_open): Ditto. >>> (gcov_write_words): Ditto. >>> (gcov_write_length): Ditto. >>> (gcov_read_words): Use gcov_nonruntime_assert, and remove >>> gcc_assert from IN_LIBGCOV code. >>> (gcov_read_summary): Use gcov_error to flag profile corruption. >>> (gcov_sync): Use gcov_nonruntime_assert. >>> (gcov_seek): Remove gcc_assert from IN_LIBGCOV code. >>> (gcov_histo_index): Use gcov_nonruntime_assert. >>> (static void gcov_histogram_merge): Ditto. >>> (compute_working_sets): Ditto. >>> * gcov-io.h (gcov_nonruntime_assert): Define. >>> (gcov_error): Define for !IN_LIBGCOV >> >> OK, thanks! >> Honza > > I just found this old patch sitting in a client! Committed as r210805. > > I also discovered that a couple uses of gcc_assert have snuck into > libgcc/libgcov* files in the meantime. Looks like this got added > during some of Rong's refactoring, cc'ing Rong. They are in > libgcc/libgcov-driver-system.c (allocate_filename_struct) and > libgcov-merge.c (__gcov_merge_single and __gcov_merge_delta). Should I > remove those or change to gcov_error? > > Teresa > > > > -- > Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413
[google gcc-4_8] LIPO: check the incompatibility of use and saved command line
Hi, This patch adds an extra check of the incompatibility of use and saved command line for the primary module. If the check files, we skip the inclusion of all aux modules. Maybe we should check the aux module command line with the use-compilation command line directly. But I keep the old way that checks b/w saved command lines. Tested with the program exposed this issue and internal benchmarks. This check never failed for instrumentation based lipo in the test. Thanks, -Rong 2014-05-12 Rong Xu * coverage.c (struct string_hasher): Use const char. (has_incompatible_cg_opts): Add new incompatible options. (incompatible_cl_arg_strs): Refactor from incompatible_cl_args. (incompatible_cl_args): Refactor. (read_counts_file): check incompatibility of use command line and the saved command line for the primary module. * auto-profile.c (read_aux_modules): Ditto. Index: coverage.c === --- coverage.c (revision 209291) +++ coverage.c (working copy) @@ -237,8 +237,8 @@ is_last_module (unsigned mod_id) struct string_hasher { /* hash_table support. */ - typedef char value_type; - typedef char compare_type; + typedef const char value_type; + typedef char compare_type; static inline hashval_t hash (const value_type *); static int equal (const value_type *, const compare_type *); static void remove (value_type *) {}; @@ -272,6 +272,10 @@ static struct opt_desc force_matching_cg_opts[] = { "-fexceptions", "-fno-exceptions", true }, { "-fsized-delete", "-fno-sized-delete", false }, { "-frtti", "-fno-rtti", true }, +{ "-fPIC", "-fno-PIC", false }, +{ "-fpic", "-fno-pic", false }, +{ "-fPIE", "-fno-PIE", false }, +{ "-fpie", "-fno-pie", false }, { "-fstrict-aliasing", "-fno-strict-aliasing", true }, { "-fsigned-char", "-funsigned-char", true }, /* { "-fsigned-char", "-fno-signed-char", true }, @@ -315,28 +319,25 @@ has_incompatible_cg_opts (bool *cg_opts1, bool *cg return false; } -/* Returns true if the command-line arguments stored in the given module-infos - are incompatible. */ -bool -incompatible_cl_args (struct gcov_module_info* mod_info1, - struct gcov_module_info* mod_info2) +/* Returns true if the command-line arguments stored ARGS_1 and + ARGS_2 are incompatible. N_ARGS_1 and N_ARGS_2 are the number + of string in ARGS_1 and ARGS_2, respectively. */ + +static bool +incompatible_cl_arg_strs (const char* const* args_1, const unsigned int n_args_1, + const char *filename1, const char* const* args_2, + const unsigned int n_args_2, const char *filename2) { - char **warning_opts1 = XNEWVEC (char *, mod_info1->num_cl_args); - char **warning_opts2 = XNEWVEC (char *, mod_info2->num_cl_args); - char **non_warning_opts1 = XNEWVEC (char *, mod_info1->num_cl_args); - char **non_warning_opts2 = XNEWVEC (char *, mod_info2->num_cl_args); - char *std_opts1 = NULL, *std_opts2 = NULL; + const char **warning_opts1 = XNEWVEC (const char *, n_args_1); + const char **warning_opts2 = XNEWVEC (const char *, n_args_2); + const char **non_warning_opts1 = XNEWVEC (const char *, n_args_1); + const char **non_warning_opts2 = XNEWVEC (const char *, n_args_2); + const char *std_opts1 = NULL, *std_opts2 = NULL; unsigned int i, num_warning_opts1 = 0, num_warning_opts2 = 0; unsigned int num_non_warning_opts1 = 0, num_non_warning_opts2 = 0; bool warning_mismatch = false; bool non_warning_mismatch = false; hash_table option_tab1, option_tab2; - unsigned int start_index1 = mod_info1->num_quote_paths -+ mod_info1->num_bracket_paths + mod_info1->num_system_paths -+ mod_info1->num_cpp_defines + mod_info1->num_cpp_includes; - unsigned int start_index2 = mod_info2->num_quote_paths -+ mod_info2->num_bracket_paths + mod_info2->num_system_paths -+ mod_info2->num_cpp_defines + mod_info2->num_cpp_includes; bool *cg_opts1, *cg_opts2, has_any_incompatible_cg_opts, has_incompatible_std; unsigned int num_cg_opts = 0; @@ -358,14 +359,13 @@ has_incompatible_cg_opts (bool *cg_opts1, bool *cg option_tab2.create (10); /* First, separate the warning and non-warning options. */ - for (i = 0; i < mod_info1->num_cl_args; i++) -if (mod_info1->string_array[start_index1 + i][1] == 'W') - warning_opts1[num_warning_opts1++] = - mod_info1->string_array[start_index1 + i]; + for (i = 0; i < n_args_1; i++) +if (args_1[i][1] == 'W') + warning_opts1[num_warning_opts1++] = args_1[i]; else { -char **slot; -char *option_
[google gcc-4_8] fix lipo ICE
Hi, This patch fixed lipo ICE triggered by an out-of-bound access. This is google specific patch and tested with bootstrap and the program exposed the issue. Thanks, -Rong 2014-05-08 Rong Xu * tree-inline.c (add_local_variables): Check if the debug_expr is a decl_node before calling is_global_var. Index: tree-inline.c === --- tree-inline.c (revision 209291) +++ tree-inline.c (working copy) @@ -3842,7 +3842,7 @@ add_local_variables (struct function *callee, stru of varpool node does not check the reference from debug expressions. Set it to 0 for all global vars. */ -if (L_IPO_COMP_MODE && tem && is_global_var (tem)) +if (L_IPO_COMP_MODE && tem && DECL_P (tem) && is_global_var (tem)) tem = NULL; id->remapping_type_depth++;
Re: [PATCH] offline gcda profile processing tool
On Tue, Apr 15, 2014 at 2:38 PM, Jan Hubicka wrote: > Rong, David, Dehao, Teresa > I would like to have some rought idea of what we could merge this stage1. > There is > certainly a lot of interesting stuff on the google branch including AutoFDO, > LIPO, > the multivalue profile counters that may be used by the new devirtualization > bits > and more. I also think we should switch counts into floating point > representation > so Teresa's splitting patch works. > > Can we get plans to make this effective? My personal schedule is quite free > until > April 29 when I go to Czech Republic for wedding and I will be back in Calgary > at 14th. > >> 2014-03-03 Rong Xu >> >> * gcc/gcov-io.c (gcov_read_string): Make this routine available >> to gcov-tool. >> (gcov_sync): Ditto. >> * gcc/Makefile.in: Build and install gcov-tool. >> * gcc/gcov-tool.c (unlink_gcda_file): Remove one gcda file. >> (unlink_profile_dir): Remove gcda files from the profile path. >> (profile_merge): Merge two profiles in directory. >> (print_merge_usage_message): Print merge usage. >> (merge_usage): Print merge usage and exit. >> (do_merge): Driver for profile merge sub-command. >> (profile_rewrite): Rewrite profile. >> (print_rewrite_usage_message): Print rewrite usage. >> (rewrite_usage): Print rewrite usage and exit. >> (do_rewrite): Driver for profile rewrite sub-command. >> (print_usage): Print gcov-info usage and exit. >> (print_version): Print gcov-info version. >> (process_args): Process arguments. >> (main): Main routine for gcov-tool. >> * libgcc/libgcov.h : Include the set of base-type headers for >> gcov-tool. >> (struct gcov_info): Make the functions field mutable in gcov-tool >> compilation. >> * libgcc/libgcov-merge.c (gcov_get_counter): New wrapper function >> to get the profile counter. >> (gcov_get_counter_target): New wrapper function to get the profile >> values that should not be scaled. >> (__gcov_merge_add): Replace gcov_read_counter() with the wrapper >> functions. >> (__gcov_merge_ior): Ditto. >> (__gcov_merge_time_profile): Ditto. >> (__gcov_merge_single): Ditto. >> (__gcov_merge_delta): Ditto. >> * libgcc/libgcov-util.c (void gcov_set_verbose): Set the verbose flag >> in the utility functions. >> (set_fn_ctrs): Utility function for reading gcda files to in-memory >> gcov_list object link lists. >> (tag_function): Ditto. >> (tag_blocks): Ditto. >> (tag_arcs): Ditto. >> (tag_lines): Ditto. >> (tag_counters): Ditto. >> (tag_summary): Ditto. >> (read_gcda_finalize): Ditto. >> (read_gcda_file): Ditto. >> (ftw_read_file): Ditto. >> (read_profile_dir_init): Ditto. >> (gcov_read_profile_dir): Ditto. >> (gcov_read_counter_mem): Ditto. >> (gcov_get_merge_weight): Ditto. >> (merge_wrapper): A wrapper function that calls merging handler. >> (gcov_merge): Merge two gcov_info objects with weights. >> (find_match_gcov_info): Find the matched gcov_info in the list. >> (gcov_profile_merge): Merge two gcov_info object lists. >> (__gcov_add_counter_op): Process edge profile counter values. >> (__gcov_ior_counter_op): Process IOR profile counter values. >> (__gcov_delta_counter_op): Process delta profile counter values. >> (__gcov_single_counter_op): Process single profile counter values. >> (fp_scale): Callback function for float-point scaling. >> (int_scale): Callback function for integer fraction scaling. >> (gcov_profile_scale): Scaling profile counters. >> (gcov_profile_normalize): Normalize profile counters. >> >> Index: gcc/gcov-io.c >> === >> --- gcc/gcov-io.c (revision 208237) >> +++ gcc/gcov-io.c (working copy) >> @@ -564,7 +564,7 @@ gcov_read_counter (void) >> buffer, or NULL on empty string. You must copy the string before >> calling another gcov function. */ >> >> -#if !IN_LIBGCOV >> +#if !IN_LIBGCOV || defined (IN_GCOV_TOOL) >> GCOV_LINKAGE const char * >> gcov_read_string (void) >> { >> @@ -641,7 +641,7 @@ gcov_read_summary (struct gcov_summary *summary) >> } >> } >> >> -#if !IN_LIBGCOV >> +#if !IN_LIBG
Re: [google gcc-4_8] unify int and fp scaling in gcov-tool
Yes. They were the bug I mentioned. This patch fixed that. -Rong On Mon, Feb 10, 2014 at 10:40 AM, Teresa Johnson wrote: > Looks good to me. > > A couple questions: > >> static void >> -__gcov_scale_icall_topn (gcov_type *counters, unsigned n_counters, double f) >> +__gcov_icall_topn_op (gcov_type *counters, unsigned n_counters, >> + counter_op_fn fn, void *data1, void *data2) >> { >>unsigned i; >> >> @@ -950,41 +949,65 @@ static void >>gcov_type *value_array = &counters[i + 1]; >> >>for (j = 0; j < GCOV_ICALL_TOPN_NCOUNTS - 1; j += 2) >> -value_array[1] *= f; > > Was the above a bug (always updating index 1)? > >> +value_array[j + 1] = fn (value_array[j + 1], data1, data2); >> } >> } >> >> static void >> -__gcov_scale_dc (gcov_type *counters, unsigned n_counters, double f) >> +__gcov_dc_op (gcov_type *counters, unsigned n_counters, >> + counter_op_fn fn, void *data1, void *data2) >> { >>unsigned i; >> >>gcc_assert (!(n_counters % 2)); >>for (i = 0; i < n_counters; i += 2) >> -counters[1] *= f; > > Same question here > >> +counters[i + 1] = fn (counters[i + 1], data1, data2); >> } > > Teresa > > On Tue, Feb 4, 2014 at 3:54 PM, Rong Xu wrote: >> Hi, >> >> The attached patch uses a callback function to unify the integer and >> floating-point scaling in gcov-tool. (Also fix a bug in fp scaling of >> ic and dc counters in earlier code). >> >> Tested with spec2006 profiles. >> >> OK for checking in? >> >> Thanks, >> >> -Rong > > > > -- > Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413
[google gcc-4_8] unify int and fp scaling in gcov-tool
Hi, The attached patch uses a callback function to unify the integer and floating-point scaling in gcov-tool. (Also fix a bug in fp scaling of ic and dc counters in earlier code). Tested with spec2006 profiles. OK for checking in? Thanks, -Rong 2014-02-04 Rong Xu * gcc/gcov-tool.c (profile_rewrite2): Remove. (profile_rewrite): Merge int and fp scaling. (do_rewrite): Ditto. * libgcc/libgcov-util.c (typedef gcov_type) New. (__gcov_add_counter_op): Merge int and fp scaling. (__gcov_ior_counter_op): Ditto. (__gcov_delta_counter_op): Ditto. (__gcov_single_counter_op): Ditto. (__gcov_icall_topn_op): Ditto. (__gcov_dc_op): Ditto. (fp_scale): Ditto. (int_scale): Ditto. (gcov_profile_scale): Ditto. (gcov_profile_normalize): Ditto. (__gcov_scale2_add): Remove. (__gcov_scale2_ior): Remove. (__gcov_scale2_delta): Remove. (__gcov_scale2_single): Remove. (__gcov_scale2_icall_topn): Remove. (__gcov_scale2_dc): Remove. (gcov_profile_scale2): Remove. Index: gcc/gcov-tool.c === --- gcc/gcov-tool.c (revision 207488) +++ gcc/gcov-tool.c (working copy) @@ -43,8 +43,7 @@ see the files COPYING3 and COPYING.RUNTIME respect extern int gcov_profile_merge (struct gcov_info*, struct gcov_info*, int, int); extern int gcov_profile_normalize (struct gcov_info*, gcov_type); -extern int gcov_profile_scale (struct gcov_info*, float); -extern int gcov_profile_scale2 (struct gcov_info*, int, int); +extern int gcov_profile_scale (struct gcov_info*, float, int, int); extern struct gcov_info* gcov_read_profile_dir (const char*, int); extern void gcov_exit (void); extern void set_gcov_list (struct gcov_info *); @@ -227,45 +226,13 @@ do_merge (int argc, char **argv) return ret; } -/* Scale the counters in profile DIR by a factor of N/D. - Result is written to dir OUT. Return 0 on success. */ - -static int -profile_rewrite2 (const char *dir, const char *out, int n, int d) -{ - char *pwd; - int ret; - struct gcov_info * profile; - - - profile = gcov_read_profile_dir (dir, 0); - if (!profile) -return 1; - - /* Output new profile. */ - unlink_profile_dir (out); - mkdir (out, 0755); - pwd = getcwd (NULL, 0); - gcc_assert (pwd); - ret = chdir (out); - gcc_assert (ret == 0); - - gcov_profile_scale2 (profile, n, d); - - set_gcov_list (profile); - gcov_exit (); - - ret = chdir (pwd); - free (pwd); - return 0; -} - /* If N_VAL is no-zero, normalize the profile by setting the largest counter counter value to N_VAL and scale others counters proportionally. Otherwise, multiply the all counters by SCALE. */ static int -profile_rewrite (const char *d1, const char *out, long long n_val, float scale) +profile_rewrite (const char *d1, const char *out, long long n_val, + float scale, int n, int d) { char *pwd; int ret; @@ -287,7 +254,7 @@ static int if (n_val) gcov_profile_normalize (d1_profile, (gcov_type) n_val); else -gcov_profile_scale (d1_profile, scale); +gcov_profile_scale (d1_profile, scale, n, d); set_gcov_list (d1_profile); gcov_exit (); @@ -604,9 +571,9 @@ do_rewrite (int argc, char **argv) if (argc - optind == 1) { if (denominator > 0) -ret = profile_rewrite2 (argv[optind], output_dir, numerator, denominator); +ret = profile_rewrite (argv[optind], output_dir, 0, 0.0, numerator, denominator); else -ret = profile_rewrite (argv[optind], output_dir, normalize_val, scale); +ret = profile_rewrite (argv[optind], output_dir, normalize_val, scale, 0, 0); } else rewrite_usage (); Index: libgcc/libgcov-util.c === --- libgcc/libgcov-util.c (revision 207488) +++ libgcc/libgcov-util.c (working copy) @@ -873,39 +873,38 @@ gcov_profile_merge (struct gcov_info *tgt_profile, return 0; } -/* This part of code is to scale profile counters. */ +typedef gcov_type (*counter_op_fn) (gcov_type, void*, void*); -/* Type of function used to normalize counters. */ -typedef void (*gcov_scale_fn) (gcov_type *, gcov_unsigned_t, double); +/* Performing FN upon arc counters. */ -/* Scale arc counters. N_COUNTERS of counter value in COUNTERS array are - multiplied by a factor F. */ - static void -__gcov_scale_add (gcov_type *counters, unsigned n_counters, double f) +__gcov_add_counter_op (gcov_type *counters, unsigned n_counters, + counter_op_fn fn, void *data1, void *data2) { for (; n_counters; counters++, n_counters--) { gcov_type val = *counters; - *counters = val * f; + *counters = fn(val, data1, data2); } } -/* Scale ior counters. */ +/* Performing FN upon ior counters. */ static void -__gcov_scale_
[google gcc-4_8] fix profiledbootstrap
Hi, The attached patch fixes the duplicated definition error in gcov-tool.c in profiledbootstrap. Google branch only and tested with profiledbootstrap. Ok for checking in? -Rong Index: gcov-tool.c === --- gcov-tool.c (revision 207435) +++ gcov-tool.c (working copy) @@ -50,16 +50,23 @@ static int verbose; -gcov_unsigned_t __gcov_lipo_grouping_algorithm; -gcov_unsigned_t __gcov_lipo_merge_modu_edges; -gcov_unsigned_t __gcov_lipo_weak_inclusion; -gcov_unsigned_t __gcov_lipo_max_mem; -gcov_unsigned_t __gcov_lipo_random_group_size; -gcov_unsigned_t __gcov_lipo_cutoff; -gcov_unsigned_t __gcov_lipo_random_seed; -gcov_unsigned_t __gcov_lipo_dump_cgraph; -gcov_unsigned_t __gcov_lipo_propagate_scale; +/* The following defines are needed by dyn-ipa.c. + They will also be emitted by the compiler with -fprofile-generate, + which means this file cannot be compiled with -fprofile-generate + -- otherwise we get duplicated defintions. + Make the defines weak to link with other objects/libraries + that potentially compiled with -fprofile-generate. */ +__attribute__ ((weak)) gcov_unsigned_t __gcov_lipo_grouping_algorithm; +__attribute__ ((weak)) gcov_unsigned_t __gcov_lipo_merge_modu_edges; +__attribute__ ((weak)) gcov_unsigned_t __gcov_lipo_weak_inclusion; +__attribute__ ((weak)) gcov_unsigned_t __gcov_lipo_max_mem; +__attribute__ ((weak)) gcov_unsigned_t __gcov_lipo_random_group_size; +__attribute__ ((weak)) gcov_unsigned_t __gcov_lipo_cutoff; +__attribute__ ((weak)) gcov_unsigned_t __gcov_lipo_random_seed; +__attribute__ ((weak)) gcov_unsigned_t __gcov_lipo_dump_cgraph; +__attribute__ ((weak)) gcov_unsigned_t __gcov_lipo_propagate_scale; + /* Remove file NAME if it has a gcda suffix. */ static int Index: Makefile.in === --- Makefile.in (revision 207435) +++ Makefile.in (working copy) @@ -4090,6 +4090,9 @@ GCOV_TOOL_OBJS = gcov-tool.o libgcov-util.o dyn-ipa.o gcov-params.o gcov-tool.o: gcov-tool.c $(GCOV_IO_H) intl.h $(SYSTEM_H) coretypes.h $(TM_H) \ $(CONFIG_H) version.h $(DIAGNOSTIC_H) + +$(COMPILER) -c $(ALL_CPPFLAGS) \ + $(filter-out -fprofile-generate -fprofile-use,$(ALL_COMPILERFLAGS)) \ + $(INCLUDES) -o $@ $< gcov-params.o: params.c $(PARAMS_H) +$(COMPILER) -DIN_GCOV_TOOL -c $(ALL_COMPILERFLAGS) $(ALL_CPPFLAGS) \ $(INCLUDES) -o $@ $<
Re: [google gcc-4_8] gcov-tool: some new LIPO supports.
Here is the revised patch that integrates Teresa' comments. Ok for checking in? -Rong On Thu, Jan 30, 2014 at 1:20 PM, Rong Xu wrote: > Comments are inlined. New patch attached to this email. > > -Rong > > On Thu, Jan 30, 2014 at 12:39 PM, Xinliang David Li > wrote: >> On Wed, Jan 29, 2014 at 4:24 PM, Rong Xu wrote: >>> Hi, >>> >>> The attached patch adds some new features to gcov-tool. It aims to >>> rewrite LIPO profile for reuse, debug or performance tuning purpose. >>> >>> -l, --modu_list Only use the modules in this file >> >> I think in verbose mode, the tool should emit warnings when a module >> is trimmed due to this option. This can be used in regression test. >> > > In previous patch, this warning is emitted unconventionally (not > guarded by verbose flag). > I changed it to under verbose mode in this patch. > >>> -r, --string Replace string in path >> >> --path_substr_replace or something in that line. >> > > Done. > >>> -u, --use_imports_file Use the grouping in import files. >>> >> >> Is there a path in code to auto test this? > > As we discussed offline. This can be verified by a separated script. > >> >>> -l uses only the modules in specified file to compute module grouping >>> (and subsequent dumping). >>> -r replaces the pattern specified in the argument. The format is: >>> old_str1:new_str1[,old_str2:new_str2]*, only the first occurrence is >>> replaced. >>> -u skips the run-time module grouping computation and reuses the one >>> comes with the profiles (which is user editable). >>> >>> Tested with profiles from google internal benchmarks. >>> >> >> Also use strcasestr for case insenstive operation. > > Done. > >> >> David >> >>> -Rong Index: gcc/gcov-tool.c === --- gcc/gcov-tool.c (revision 207435) +++ gcc/gcov-tool.c (working copy) @@ -29,6 +29,7 @@ see the files COPYING3 and COPYING.RUNTIME respect #include "coretypes.h" #include "tm.h" #include "intl.h" +#include "hashtab.h" #include "diagnostic.h" #include "version.h" #include "gcov-io.h" @@ -38,6 +39,7 @@ see the files COPYING3 and COPYING.RUNTIME respect #include #include #include "params.h" +#include extern int gcov_profile_merge (struct gcov_info*, struct gcov_info*, int, int); extern int gcov_profile_normalize (struct gcov_info*, gcov_type); @@ -47,9 +49,11 @@ extern struct gcov_info* gcov_read_profile_dir (co extern void gcov_exit (void); extern void set_gcov_list (struct gcov_info *); extern void gcov_set_verbose (void); +extern void set_use_existing_grouping (void); +extern void set_use_modu_list (void); +extern void lipo_set_substitute_string (const char *); -static int verbose; - +/* The following defines are needed by dyn-ipa.c. */ gcov_unsigned_t __gcov_lipo_grouping_algorithm; gcov_unsigned_t __gcov_lipo_merge_modu_edges; gcov_unsigned_t __gcov_lipo_weak_inclusion; @@ -60,6 +64,8 @@ gcov_unsigned_t __gcov_lipo_random_seed; gcov_unsigned_t __gcov_lipo_dump_cgraph; gcov_unsigned_t __gcov_lipo_propagate_scale; +static int verbose; + /* Remove file NAME if it has a gcda suffix. */ static int @@ -285,6 +291,189 @@ profile_rewrite (const char *d1, const char *out, return 0; } +/* This is the hashtab entry to store a name and mod_id pair. */ +typedef struct { + const char *name; + unsigned id; +} mod_name_id; + +/* Hash and comparison functions for strings. */ + +static unsigned +mod_name_id_htab_hash (const void *s_p) +{ + const char *s = ((const mod_name_id *) s_p)->name; + return (*htab_hash_string) (s); +} + +static int +mod_name_id_hash_eq (const void *s1_p, const void *s2_p) +{ + return strcmp (((const mod_name_id *) s1_p)->name, + ((const mod_name_id *) s2_p)->name) == 0; +} + +static htab_t mod_name_id_hash_table; + +/* Look up an entry in the hash table. STRING is the module name. + CREATE controls to insert to htab or not. + If (*ID_P != 0), we write (*ID_P) to htab. + If (*ID_P == 0), we write module_id to (*ID_P). + return 1 if an entry is found and otherwise 0. */ + +static int +module_name_hash_lookup (const char *string, unsigned *id_p, int create) +{ + void **e; + mod_name_id t; + + t.name = string; + e = htab_find_slot (mod_name_id_hash_table, &t, + create ? INSERT : NO_INSERT); + if (e == NULL) +return 0; + if (*e == NULL) +{ + *e = XNEW (mod_name_id *); + (*(mod_name_id **)e)->name = xstrdup (string); +} + if (id_p) +
Re: [google gcc-4_8] gcov-tools: minor fix for broken build for arm
Thanks for catching this, and the fix. OK for google branch. -Rong On Fri, Jan 31, 2014 at 2:46 PM, Hán Shěn (沈涵) wrote: > Hi Rong, while building for arm toolchain on chromeos, GCOV_LOCKED is > not defined, which leads to redefinition of cs_all, this is observed > on google/gcc-4_8 branch. > > Patch below, tested on chromeos for arm and x86_64 arch. > > Ok for google/gcc-4_8 branch? > > diff --git a/libgcc/libgcov-driver.c b/libgcc/libgcov-driver.c > index 76f6104..bbce3a6 100644 > --- a/libgcc/libgcov-driver.c > +++ b/libgcc/libgcov-driver.c > @@ -559,10 +559,6 @@ gcov_exit_merge_summary (const struct gcov_info > *gi_ptr, struct gcov_summary *pr > { >struct gcov_ctr_summary *cs_prg, *cs_tprg, *cs_all; >unsigned t_ix; > -#if !GCOV_LOCKED > - /* summary for all instances of program. */ > - struct gcov_ctr_summary *cs_all; > -#endif > >/* Merge the summaries. */ >for (t_ix = 0; t_ix < GCOV_COUNTERS_SUMMABLE; t_ix++) > > > Han
Re: [google gcc-4_8] gcov-tool: some new LIPO supports.
Comments are inlined. New patch attached to this email. -Rong On Thu, Jan 30, 2014 at 12:39 PM, Xinliang David Li wrote: > On Wed, Jan 29, 2014 at 4:24 PM, Rong Xu wrote: >> Hi, >> >> The attached patch adds some new features to gcov-tool. It aims to >> rewrite LIPO profile for reuse, debug or performance tuning purpose. >> >> -l, --modu_list Only use the modules in this file > > I think in verbose mode, the tool should emit warnings when a module > is trimmed due to this option. This can be used in regression test. > In previous patch, this warning is emitted unconventionally (not guarded by verbose flag). I changed it to under verbose mode in this patch. >> -r, --string Replace string in path > > --path_substr_replace or something in that line. > Done. >> -u, --use_imports_file Use the grouping in import files. >> > > Is there a path in code to auto test this? As we discussed offline. This can be verified by a separated script. > >> -l uses only the modules in specified file to compute module grouping >> (and subsequent dumping). >> -r replaces the pattern specified in the argument. The format is: >> old_str1:new_str1[,old_str2:new_str2]*, only the first occurrence is >> replaced. >> -u skips the run-time module grouping computation and reuses the one >> comes with the profiles (which is user editable). >> >> Tested with profiles from google internal benchmarks. >> > > Also use strcasestr for case insenstive operation. Done. > > David > >> -Rong Index: gcc/gcov-tool.c === --- gcc/gcov-tool.c (revision 207316) +++ gcc/gcov-tool.c (working copy) @@ -29,6 +29,7 @@ see the files COPYING3 and COPYING.RUNTIME respect #include "coretypes.h" #include "tm.h" #include "intl.h" +#include "hashtab.h" #include "diagnostic.h" #include "version.h" #include "gcov-io.h" @@ -38,6 +39,7 @@ see the files COPYING3 and COPYING.RUNTIME respect #include #include #include "params.h" +#include extern int gcov_profile_merge (struct gcov_info*, struct gcov_info*, int, int); extern int gcov_profile_normalize (struct gcov_info*, gcov_type); @@ -47,9 +49,11 @@ extern struct gcov_info* gcov_read_profile_dir (co extern void gcov_exit (void); extern void set_gcov_list (struct gcov_info *); extern void gcov_set_verbose (void); +extern void set_use_existing_grouping (void); +extern void set_use_modu_list (void); +extern void lipo_set_substitute_string (const char *); -static int verbose; - +/* The following defines are needed by dyn-ipa.c. */ gcov_unsigned_t __gcov_lipo_grouping_algorithm; gcov_unsigned_t __gcov_lipo_merge_modu_edges; gcov_unsigned_t __gcov_lipo_weak_inclusion; @@ -60,6 +64,8 @@ gcov_unsigned_t __gcov_lipo_random_seed; gcov_unsigned_t __gcov_lipo_dump_cgraph; gcov_unsigned_t __gcov_lipo_propagate_scale; +static int verbose; + /* Remove file NAME if it has a gcda suffix. */ static int @@ -285,6 +291,187 @@ profile_rewrite (const char *d1, const char *out, return 0; } +/* This is the hashtab entry to store a name and mod_id pair. */ +typedef struct { + const char *name; + unsigned id; +} mod_name_id; + +/* Hash and comparison functions for strings. */ + +static unsigned +mod_name_id_htab_hash (const void *s_p) +{ + const char *s = ((const mod_name_id *) s_p)->name; + return (*htab_hash_string) (s); +} + +static int +mod_name_id_hash_eq (const void *s1_p, const void *s2_p) +{ + return strcmp (((const mod_name_id *) s1_p)->name, + ((const mod_name_id *) s2_p)->name) == 0; +} + +static htab_t mod_name_id_hash_table; + +/* Look up an entry in the hash table. STRING is the module name. + CREATE controls to insert to htab or not. + If (*ID_P != 0), we write (*ID_P) to htab. + If (*ID_P == 0), we write module_id to (*ID_P). + return 1 if an entry is found and otherwise 0. */ + +static int +module_name_hash_lookup (const char *string, unsigned *id_p, int create) +{ + void **e; + mod_name_id t; + + t.name = string; + e = htab_find_slot (mod_name_id_hash_table, &t, + create ? INSERT : NO_INSERT); + if (e == NULL) +return 0; + if (*e == NULL) +{ + *e = XNEW (mod_name_id *); + (*(mod_name_id **)e)->name = xstrdup (string); +} + if (id_p) +{ + if (*id_p != 0) +(*(mod_name_id **)e)->id = *id_p; + else +*id_p = (*(mod_name_id **)e)->id; +} + return 1; +} + +/* Return 1 if NAME is of a source type that LIPO targets. + Return 0 otherwise. */ + +static int +is_lipo_source_type (char *name) +{ + char *p; + + if (strcasestr (name, ".c") || +
[google gcc-4_8] gcov-tool: some new LIPO supports.
Hi, The attached patch adds some new features to gcov-tool. It aims to rewrite LIPO profile for reuse, debug or performance tuning purpose. -l, --modu_list Only use the modules in this file -r, --string Replace string in path -u, --use_imports_file Use the grouping in import files. -l uses only the modules in specified file to compute module grouping (and subsequent dumping). -r replaces the pattern specified in the argument. The format is: old_str1:new_str1[,old_str2:new_str2]*, only the first occurrence is replaced. -u skips the run-time module grouping computation and reuses the one comes with the profiles (which is user editable). Tested with profiles from google internal benchmarks. -Rong 2014-01-29 Rong Xu * gcc/gcov-tool.c (mod_name_id): New type of the hashtable entry. (mod_name_id_htab_hash): Hash function for module-name and module-id hashtab. (mod_name_id_hash_eq): Comparison function for the hashtab. (module_name_hash_lookup): Lookup function of the hashtab. (is_lipo_source_type): Find if a file is of LIPO target source. (lipo_process_name_string): Support of -l. (lipo_process_modu_list): Ditto. (is_module_available): Support of -u. (get_module_id_from_name): Ditto. (print_rewrite_usage_message): Add -u -l and -r support. (do_rewrite): Ditto. (print_version): Ditto. (process_args): Ditto. * libgcc/dyn-ipa.c (flag_use_existing_grouping): New flag for -u. (init_dyn_call_graph): Support of -l. (__gcov_finalize_dyn_callgraph): Ditto. (gcov_build_callgraph): Ditto. (gcov_compute_cutoff_count): Ditto. (gcov_process_cgraph_node): Ditto. (build_modu_graph): Ditto. (gcov_compute_module_groups_eager_propagation): Ditto. (gcov_compute_random_module_groups): Ditto. (gcov_write_module_infos): Ditto. (gcov_dump_callgraph): Ditto. (set_use_existing_grouping): Support of -u. (open_imports_file): Ditto. (read_modu_groups_from_imports_files): Ditto. (__gcov_compute_module_groups): Ditto. * libgcc/libgcov-util.c (substitute_string): Support -r. (lipo_set_substitute_string): Ditto. (lipo_process_substitute_string_1): Ditto. (lipo_process_substitute_string): Ditto. (read_gcda_file): Ditto. (flag_use_modu_list): Support of -l (set_use_modu_list): Ditto. (ftw_read_file): Ditto. (source_profile_dir): Support of -u. (get_source_profile_dir): Ditto. (gcov_read_profile_dir): Ditto. Index: gcc/gcov-tool.c === --- gcc/gcov-tool.c (revision 207055) +++ gcc/gcov-tool.c (working copy) @@ -29,6 +29,7 @@ see the files COPYING3 and COPYING.RUNTIME respect #include "coretypes.h" #include "tm.h" #include "intl.h" +#include "hashtab.h" #include "diagnostic.h" #include "version.h" #include "gcov-io.h" @@ -38,6 +39,7 @@ see the files COPYING3 and COPYING.RUNTIME respect #include #include #include "params.h" +#include extern int gcov_profile_merge (struct gcov_info*, struct gcov_info*, int, int); extern int gcov_profile_normalize (struct gcov_info*, gcov_type); @@ -47,9 +49,11 @@ extern struct gcov_info* gcov_read_profile_dir (co extern void gcov_exit (void); extern void set_gcov_list (struct gcov_info *); extern void gcov_set_verbose (void); +extern void set_use_existing_grouping (void); +extern void set_use_modu_list (void); +extern void lipo_set_substitute_string (const char *); -static int verbose; - +/* The following defines are needed by dyn-ipa.c. */ gcov_unsigned_t __gcov_lipo_grouping_algorithm; gcov_unsigned_t __gcov_lipo_merge_modu_edges; gcov_unsigned_t __gcov_lipo_weak_inclusion; @@ -60,6 +64,8 @@ gcov_unsigned_t __gcov_lipo_random_seed; gcov_unsigned_t __gcov_lipo_dump_cgraph; gcov_unsigned_t __gcov_lipo_propagate_scale; +static int verbose; + /* Remove file NAME if it has a gcda suffix. */ static int @@ -285,6 +291,189 @@ profile_rewrite (const char *d1, const char *out, return 0; } +/* This is the hashtab entry to store a name and mod_id pair. */ +typedef struct { + const char *name; + unsigned id; +} mod_name_id; + +/* Hash and comparison functions for strings. */ + +static unsigned +mod_name_id_htab_hash (const void *s_p) +{ + const char *s = ((const mod_name_id *) s_p)->name; + return (*htab_hash_string) (s); +} + +static int +mod_name_id_hash_eq (const void *s1_p, const void *s2_p) +{ + return strcmp (((const mod_name_id *) s1_p)->name, + ((const mod_name_id *) s2_p)->name) == 0; +} + +static htab_t mod_name_id_hash_table; + +/* Look up an entry in the hash table. STRING is the module name. + CREATE controls to inse
[Google gcc-4_8] always emit __gcov_get_profile_prefix when linking libgcov.a
Hi, This patch is for google/gcc-4_8 branch. It fixes a regression in earlier libgcov refactoring. Thanks, -Rong 2014-01-23 Rong Xu * libgcov-driver.c (__gcov_get_profile_prefix): Always emit this function. Index: libgcov-driver.c === --- libgcov-driver.c(revision 206736) +++ libgcov-driver.c(working copy) @@ -68,6 +68,8 @@ extern struct gcov_info *get_gcov_list (void) ATTR these symbols will always need to be resolved. */ void (*__gcov_dummy_ref1)(void) = &__gcov_reset; void (*__gcov_dummy_ref2)(void) = &__gcov_dump; +extern char *__gcov_get_profile_prefix (void); +char *(*__gcov_dummy_ref3)(void) = &__gcov_get_profile_prefix; /* Default callback function for profile instrumentation callback. */ extern void __coverage_callback (gcov_type, int);
Re: [google gcc-4_8] port gcov-tool to gcc-4_8
Do we have to split params.def? I can include params.h and link in params.o (a special version as we don't have some global vars). As for lipo_cutoff, I think we don't need a special handling -- we should use the default value of 100 and let logic in dyn-ipa.c takes care of the rest. Please find the update patch which reads the default values from params.def. -Rong On Fri, Jan 17, 2014 at 12:05 PM, Xinliang David Li wrote: > For LIPO parameters, you can do this > 1) isolate all LIPO specific parameters into lipo_params.def, and > include it from params.def. > 2) include lipo_params.def (after proper definition of DEFPARAM) in > the profile tool source dir. > > By so doing, the compiler setting and profile tool will be in sync. > (Unfortuately, for historic reason, the lipo_cutoff default val is > still set in dyn-ipa.c even with the parameter, so some comments needs > to be added in dyn-ipa.c to make sure the value should be doubly > updated). > > David > > On Fri, Jan 17, 2014 at 11:15 AM, Rong Xu wrote: >> Hi, >> >> This patch port the gcov-tool work to google/gcc-4_8 branches. >> >> Tested with spec2006, profiledbootstrap and google internal benchmarks. >> >> -Rong Index: Makefile.in === --- Makefile.in (revision 206671) +++ Makefile.in (working copy) @@ -123,7 +123,8 @@ SUBDIRS =@subdirs@ build # Selection of languages to be made. CONFIG_LANGUAGES = @all_selected_languages@ -LANGUAGES = c gcov$(exeext) gcov-dump$(exeext) $(CONFIG_LANGUAGES) +LANGUAGES = c gcov$(exeext) gcov-dump$(exeext) gcov-tool$(exeext) \ +$(CONFIG_LANGUAGES) # Default values for variables overridden in Makefile fragments. # CFLAGS is for the user to override to, e.g., do a cross build with -O2. @@ -194,6 +195,7 @@ GCC_WARN_CXXFLAGS = $(LOOSE_WARN) $($(@D)-warn) $( build/gengtype-lex.o-warn = -Wno-error gengtype-lex.o-warn = -Wno-error expmed.o-warn = -Wno-error +libgcov-util.o-warn = -Wno-error # All warnings have to be shut off in stage1 if the compiler used then # isn't gcc; configure determines that. WARN_CFLAGS will be either @@ -755,6 +757,7 @@ GCC_TARGET_INSTALL_NAME := $(target_noncanonical)- CPP_INSTALL_NAME := $(shell echo cpp|sed '$(program_transform_name)') GCOV_INSTALL_NAME := $(shell echo gcov|sed '$(program_transform_name)') PROFILE_TOOL_INSTALL_NAME := $(shell echo profile_tool|sed '$(program_transform_name)') +GCOV_TOOL_INSTALL_NAME := $(shell echo gcov-tool|sed '$(program_transform_name)') # Setup the testing framework, if you have one EXPECT = `if [ -f $${rootme}/../expect/expect ] ; then \ @@ -1480,7 +1483,8 @@ ALL_HOST_FRONTEND_OBJS = $(foreach v,$(CONFIG_LANG ALL_HOST_BACKEND_OBJS = $(GCC_OBJS) $(OBJS) $(OBJS-libcommon) \ $(OBJS-libcommon-target) @TREEBROWSER@ main.o c-family/cppspec.o \ - $(COLLECT2_OBJS) $(EXTRA_GCC_OBJS) $(GCOV_OBJS) $(GCOV_DUMP_OBJS) + $(COLLECT2_OBJS) $(EXTRA_GCC_OBJS) $(GCOV_OBJS) $(GCOV_DUMP_OBJS) \ + $(GCOV_TOOL_OBJS) # This lists all host object files, whether they are included in this # compilation or not. @@ -1505,6 +1509,7 @@ MOSTLYCLEANFILES = insn-flags.h insn-config.h insn $(SPECS) collect2$(exeext) gcc-ar$(exeext) gcc-nm$(exeext) \ gcc-ranlib$(exeext) \ gcov-iov$(build_exeext) gcov$(exeext) gcov-dump$(exeext) \ + gcov-tool$(exeect) \ gengtype$(exeext) *.[0-9][0-9].* *.[si] *-checksum.c libbackend.a \ libcommon-target.a libcommon.a libgcc.mk @@ -4070,6 +4075,24 @@ GCOV_DUMP_OBJS = gcov-dump.o vec.o ggc-none.o gcov-dump$(exeext): $(GCOV_DUMP_OBJS) $(LIBDEPS) +$(LINKER) $(ALL_LINKERFLAGS) $(LDFLAGS) $(GCOV_DUMP_OBJS) \ $(LIBS) -o $@ + +libgcov-util.o: $(srcdir)/../libgcc/libgcov-util.c gcov-io.c $(GCOV_IO_H) \ + $(srcdir)/../libgcc/libgcov-driver.c $(srcdir)/../libgcc/libgcov-driver-system.c \ + $(srcdir)/../libgcc/libgcov-merge.c $(srcdir)/../libgcc/libgcov.h \ + $(SYSTEM_H) coretypes.h $(TM_H) $(CONFIG_H) version.h intl.h $(DIAGNOSTIC_H) + +$(COMPILER) -c $(ALL_COMPILERFLAGS) $(ALL_CPPFLAGS) $(INCLUDES) -o $@ $< +dyn-ipa.o: $(srcdir)/../libgcc/dyn-ipa.c gcov-io.c $(srcdir)/../libgcc/libgcov.h \ + $(GCOV_IO_H) $(SYSTEM_H) coretypes.h \ + $(TM_H) $(CONFIG_H) version.h intl.h $(DIAGNOSTIC_H) + +$(COMPILER) -DIN_GCOV_TOOL -c $(ALL_COMPILERFLAGS) $(ALL_CPPFLAGS) \ + $(INCLUDES) -o $@ $< + +GCOV_TOOL_OBJS = gcov-tool.o libgcov-util.o dyn-ipa.o params.o +gcov-tool.o: gcov-tool.c $(GCOV_IO_H) intl.h $(SYSTEM_H) coretypes.h $(TM_H) \ + $(CONFIG_H) version.h $(DIAGNOSTIC_H) +gcov-tool$(exeext): $(GCOV_TOOL_OBJS) $(LIBDEPS) + +$(LINKER) $(ALL_LINKERFLAGS) $(LDFLAGS) $(GCOV_TOOL_OBJS) \ + $(LIBS) -o $@ # # Build the include directories. The stamp files are stmp-* rather than # s-* so that mostlyclean does not force the include dir
Re: [PATCH] offline gcda profile processing tool
Ping. -Rong On Mon, Jan 13, 2014 at 12:43 PM, Rong Xu wrote: > Hi, > > This patch implements gcov-tool, a offline profile processing tool. > This version supports merging two profiles with weights, and scaling > the profile with a floating-point / fraction weight. > > Earlier discussion can be found > http://gcc.gnu.org/ml/gcc-patches/2013-11/msg02631.html > > In the discussion of the earlier patch, Honza wanted a separated > patch for interface change for gcov_merge_*() functions. In this patch, > the changes for gcov_merge_*() are minimal by using a wrapper function. > So I include everything in this patch. > > Tested with bootstrap profiledboostrap and SPEC2006 profiles. > > Thanks, > > -Rong
Re: [google gcc-4_8] backport libgcov re-factoring patches from trunk
On Wed, Jan 15, 2014 at 10:40 AM, Xinliang David Li wrote: > In libgcov-driver.c, > > 1) there are couple of places with trailing white spaces (e.g, in > gcov_sort_n_vals body), please remove They are from the existing code. But I'll fix them. > 2) gcov_exit_write_gcda in trunk takes eof_pos as an arg, and check it > before writing the header. I think this is more correct than in your > patch That's true. But this logic is from the newer code in trunk. Current 4_8 code does not check this. I deliberately did this because I thought the backporting patch should not change this. I'll change to the trunk verison then. > 3) It would be better to keep the function order the same in trunk > (e.g, compute summary related, merge gcda and write gcda etc); it is > also helpful to keep LIPO related functions order in the same way as > in google/main so that a better diff can be done > 4) libdriver-profiler.c -- make the function ordering the same as in > google/main would be helpful. sure. I'll do this two items. Will send an updated patch soon. > > thanks, > > David > > On Wed, Jan 15, 2014 at 10:03 AM, Rong Xu wrote: >> The attached patch backports libgcov re-factoring patches from trunk. >> >> Tested with google internal benchmarks, SPEC2006, bootstrap and >> profiledbootstrap. >> >> OK for google/gcc-4_8 branch? >> >> -Rong
Re: [RFC] libgcov.c re-factoring and offline profile-tool
My bad. Thanks for the fix! -Rong On Thu, Jan 9, 2014 at 11:47 AM, H.J. Lu wrote: > On Wed, Jan 8, 2014 at 2:33 PM, Rong Xu wrote: >> On Fri, Dec 6, 2013 at 6:23 AM, Jan Hubicka wrote: >>>> @@ -325,6 +311,9 @@ static struct gcov_summary all_prg; >>>> #endif >>>> /* crc32 for this program. */ >>>> static gcov_unsigned_t crc32; >>>> +/* Use this summary checksum rather the computed one if the value is >>>> + *non-zero. */ >>>> +static gcov_unsigned_t saved_summary_checksum; >>> >>> Why do you need to save the checksum? Won't it reset summary back with >>> multiple streaming? >> >> This was for the gcov_tool. checksum will be recomputed in gcov_exit >> and the value will depend on >> the order of gcov_info list. (the order will be different after >> reading from gcda files to memory). The purpose was >> to have the same summary_checksum so that I can get identical gcov-dump >> output. >> >>> >>> I would really like to avoid introducing those static vars that are used >>> exclusively >>> by gcov_exit. What about putting them into an gcov_context structure that >>> is passed around the functions that was broken out? >> >> With my recently patch the localizes this_prg, we only use 64 more >> bytes in bss. Do you still we have to remove >> all these statics? >> >>> > > libgcc ChangeLog entries should be in libgcc/ChangeLog, > not gcc/ChangeLog. I checked in a patch to move them > to libgcc/ChangeLog. > > -- > H.J.
Re: [RFC] libgcov.c re-factoring and offline profile-tool
On Fri, Dec 6, 2013 at 6:23 AM, Jan Hubicka wrote: >> @@ -325,6 +311,9 @@ static struct gcov_summary all_prg; >> #endif >> /* crc32 for this program. */ >> static gcov_unsigned_t crc32; >> +/* Use this summary checksum rather the computed one if the value is >> + *non-zero. */ >> +static gcov_unsigned_t saved_summary_checksum; > > Why do you need to save the checksum? Won't it reset summary back with > multiple streaming? This was for the gcov_tool. checksum will be recomputed in gcov_exit and the value will depend on the order of gcov_info list. (the order will be different after reading from gcda files to memory). The purpose was to have the same summary_checksum so that I can get identical gcov-dump output. > > I would really like to avoid introducing those static vars that are used > exclusively > by gcov_exit. What about putting them into an gcov_context structure that > is passed around the functions that was broken out? With my recently patch the localizes this_prg, we only use 64 more bytes in bss. Do you still we have to remove all these statics? >
Re: [RFC] libgcov.c re-factoring and offline profile-tool
On Wed, Dec 18, 2013 at 9:28 AM, Xinliang David Li wrote: >>> >>> #ifdef L_gcov_merge_ior >>> /* The profile merging function that just adds the counters. It is given >>> - an array COUNTERS of N_COUNTERS old counters and it reads the same >>> number >>> - of counters from the gcov file. */ >>> + an array COUNTERS of N_COUNTERS old counters. >>> + When SRC==NULL, it reads the same number of counters from the gcov file. >>> + Otherwise, it reads from SRC array. */ >>> void >>> -__gcov_merge_ior (gcov_type *counters, unsigned n_counters) >>> +__gcov_merge_ior (gcov_type *counters, unsigned n_counters, >>> + gcov_type *src, unsigned w __attribute__ ((unused))) >> >> So the new in-memory variants are introduced for merging tool, while libgcc >> use gcov_read_counter >> interface? >> Perhaps we can actually just duplicate the functions to avoid runtime to do >> all the scalling >> and in_mem tests it won't need? > > > I thought about this one a little. How about making the interface > change conditionally, but still share the implementation? The merge > function bodies mostly remain unchanged and there is no runtime > penalty for libgcov. The new macros can be shared across most of the > mergers. > > #ifdef IN_PREOFILE_TOOL > #define GCOV_MERGE_EXTRA_ARGS gcov_type *src, unsigned w > #define GCOV_READ_COUNTER *(src++) * w > #else > #define GCOV_MERGE_EXTRA_ARGS > #define GCOV_READ_COUNTER gcov_read_counter () > #endif > > __gcov_merge_add (gcov_type *counters, unsigned n_counters, > GCOV_MERGE_EXTRA_ARGS) > { > > for (; n_counters; counters++, n_counters--) > { > *counters += GCOV_READ_COUNTER ; >} > > } > > thanks, Personally I don't think the run time test of in_mem will cause any issue. This is in profile dumping, why don't we care a few more cycle heres? it won't pollute the profile. If you really don't like that, we can use the above approach, or I can hide the logic in gcov_read_counter(), i.e. overload gcov_read_counter() in profile_tool. For that, I will need a new global variable SRC and set it before calling the merge function. I would prefer to keep weight in _gcov_merge_* argument list. What do you think? -Rong > > David > >> >> I would suggest going with libgcov.h changes and clenaups first, with >> interface changes next >> and the gcov-tool is probably quite obvious at the end? >> Do you think you can split the patch this way? >> >> Thanks and sorry for taking long to review. I should have more time again >> now. >> Honza
Re: [RFC] libgcov.c re-factoring and offline profile-tool
; 108 0 0 108 6c _gcov_execve.o (ex > ./lib64/gcc/x86_64-suse-linux/4.8/libgcov.a) > 72 0 0 72 48 _gcov_reset.o (ex > ./lib64/gcc/x86_64-suse-linux/4.8/libgcov.a) > 72 0 0 72 48 _gcov_dump.o (ex > ./lib64/gcc/x86_64-suse-linux/4.8/libgcov.a) > 89 0 0 89 59 _gcov_interval_profiler.o (ex > ./lib64/gcc/x86_64-suse-linux/4.8/libgcov.a) > 69 0 0 69 45 _gcov_pow2_profiler.o (ex > ./lib64/gcc/x86_64-suse-linux/4.8/libgcov.a) > 115 0 0 115 73 _gcov_one_value_profiler.o (ex > ./lib64/gcc/x86_64-suse-linux/4.8/libgcov.a) > 122 0 0 122 7a _gcov_indirect_call_profiler.o (ex > ./lib64/gcc/x86_64-suse-linux/4.8/libgcov.a) > 57 0 0 57 39 _gcov_average_profiler.o (ex > ./lib64/gcc/x86_64-suse-linux/4.8/libgcov.a) > 52 0 0 52 34 _gcov_ior_profiler.o (ex > ./lib64/gcc/x86_64-suse-linux/4.8/libgcov.a) > 125 0 0 125 7d _gcov_merge_ior.o (ex > ./lib64/gcc/x86_64-suse-linux/4.8/libgcov.a) > 2014-01-08 Rong Xu * libgcc/libgcov-driver.c (this_prg): make it local to save bss space. (gcov_exit_compute_summary): Ditto. (gcov_exit_merge_gcda): Ditto. (gcov_exit_merge_summary): Ditto. (gcov_exit_dump_gcov): Ditto. Index: libgcc/libgcov-driver.c === --- libgcc/libgcov-driver.c (revision 206437) +++ libgcc/libgcov-driver.c (working copy) @@ -303,8 +303,6 @@ gcov_compute_histogram (struct gcov_summary *sum) } } -/* summary for program. */ -static struct gcov_summary this_prg; /* gcda filename. */ static char *gi_filename; /* buffer for the fn_data from another program. */ @@ -317,10 +315,10 @@ static struct gcov_summary_buffer *sum_buffer; static int run_accounted = 0; /* This funtions computes the program level summary and the histo-gram. - It computes and returns CRC32 and stored summari in THIS_PRG. */ + It computes and returns CRC32 and stored summary in THIS_PRG. */ static gcov_unsigned_t -gcov_exit_compute_summary (void) +gcov_exit_compute_summary (struct gcov_summary *this_prg) { struct gcov_info *gi_ptr; const struct gcov_fn_info *gfi_ptr; @@ -332,7 +330,7 @@ static gcov_unsigned_t gcov_unsigned_t crc32 = 0; /* Find the totals for this execution. */ - memset (&this_prg, 0, sizeof (this_prg)); + memset (this_prg, 0, sizeof (*this_prg)); for (gi_ptr = gcov_list; gi_ptr; gi_ptr = gi_ptr->next) { crc32 = crc32_unsigned (crc32, gi_ptr->stamp); @@ -357,7 +355,7 @@ static gcov_unsigned_t if (!gi_ptr->merge[t_ix]) continue; - cs_ptr = &this_prg.ctrs[t_ix]; + cs_ptr = &(this_prg->ctrs[t_ix]); cs_ptr->num += ci_ptr->num; crc32 = crc32_unsigned (crc32, ci_ptr->num); @@ -371,7 +369,7 @@ static gcov_unsigned_t } } } - gcov_compute_histogram (&this_prg); + gcov_compute_histogram (this_prg); return crc32; } @@ -393,6 +391,7 @@ struct gcov_filename_aux{ static int gcov_exit_merge_gcda (struct gcov_info *gi_ptr, struct gcov_summary *prg_p, + struct gcov_summary *this_prg, gcov_position_t *summary_pos_p, gcov_position_t *eof_pos_p, gcov_unsigned_t crc32) @@ -446,7 +445,7 @@ gcov_exit_merge_gcda (struct gcov_info *gi_ptr, goto next_summary; for (t_ix = 0; t_ix != GCOV_COUNTERS_SUMMABLE; t_ix++) -if (tmp.ctrs[t_ix].num != this_prg.ctrs[t_ix].num) +if (tmp.ctrs[t_ix].num != this_prg->ctrs[t_ix].num) goto next_summary; *prg_p = tmp; *summary_pos_p = *eof_pos_p; @@ -636,7 +635,7 @@ gcov_exit_write_gcda (const struct gcov_info *gi_p static int gcov_exit_merge_summary (const struct gcov_info *gi_ptr, struct gcov_summary *prg, -gcov_unsigned_t crc32, + struct gcov_summary *this_prg, gcov_unsigned_t crc32, struct gcov_summary *all_prg __attribute__ ((unused))) { struct gcov_ctr_summary *cs_prg, *cs_tprg; @@ -650,7 +649,7 @@ gcov_exit_merge_summary (const struct gcov_info *g for (t_ix = 0; t_ix < GCOV_COUNTERS_SUMMABLE; t_ix++) { cs_prg = &(prg->ctrs[t_ix]); - cs_tprg = &this_prg.ctrs[t_ix]; + cs_tprg = &(this_prg->ctrs[t_ix]); if (gi_ptr->merge[t_ix]) { @@ -719,7 +718,8 @@ gcov_exit_merge_summary (const struct gcov_info *g static void gcov_exit_dump_gcov (struct gcov_info *gi_ptr, struct gcov_filename_aux *gf, -gcov_unsigned_t crc32, struct gcov_summary *
Re: [PATCH] Conditional count update for fast coverage test in multi-threaded programs
-3.01% +2.87%-75.50% benchmark_9 81.6+9.91% +14.10%-69.81% benchmark_10 69.8 +23.60% -1.53% -- benchmark_11 89.9+0.34% +4.32%-59.62% benchmark_12 98.7+0.96% +0.88%-10.52% benchmark_13 80.5-8.31% -0.55%-75.90% benchmark_14 17.1 +429.47% -5.21% -4.87% benchmark_15 22.0 +295.70% -1.29%-46.36% benchmark_16 80.0-6.31% +0.54%-62.61% benchmark_17 83.5+4.84% +10.71%-70.48% benchmark_18 90.0-1.27% +3.18%-72.30% geometric mean +24.51% +4.81% -62.44% (incomplete) benchmark_13 77.1 -3.34%+5.75% -- On Mon, Nov 25, 2013 at 11:19 AM, Rong Xu wrote: > On Mon, Nov 25, 2013 at 2:11 AM, Richard Biener > wrote: >> On Fri, Nov 22, 2013 at 10:49 PM, Rong Xu wrote: >>> On Fri, Nov 22, 2013 at 4:03 AM, Richard Biener >>> wrote: >>>> On Fri, Nov 22, 2013 at 4:51 AM, Rong Xu wrote: >>>>> Hi, >>>>> >>>>> This patch injects a condition into the instrumented code for edge >>>>> counter update. The counter value will not be updated after reaching >>>>> value 1. >>>>> >>>>> The feature is under a new parameter --param=coverage-exec_once. >>>>> Default is disabled and setting to 1 to enable. >>>>> >>>>> This extra check usually slows the program down. For SPEC 2006 >>>>> benchmarks (all single thread programs), we usually see around 20%-35% >>>>> slow down in -O2 coverage build. This feature, however, is expected to >>>>> improve the coverage run speed for multi-threaded programs, because >>>>> there virtually no data race and false sharing in updating counters. >>>>> The improvement can be significant for highly threaded programs -- we >>>>> are seeing 7x speedup in coverage test run for some non-trivial google >>>>> applications. >>>>> >>>>> Tested with bootstrap. >>>> >>>> Err - why not simply emit >>>> >>>> counter = 1 >>>> >>>> for the counter update itself with that --param (I don't like a --param >>>> for this either). >>>> >>>> I assume that CPUs can avoid data-races and false sharing for >>>> non-changing accesses? >>>> >>> >>> I'm not aware of any CPU having this feature. I think a write to the >>> shared cache line to invalidate all the shared copies. I cannot find >>> any reference on checking the value of the write. Do you have any >>> pointer to the feature? >> >> I don't have any pointer - but I remember seeing this in the context >> of atomics thus it may be only in the context of using a xchg >> or cmpxchg instruction. Which would make it non-portable to >> some extent (if you don't want to use atomic builtins here). >> > > cmpxchg should work here -- it's a conditional write so the data race > /false sharing can be avoided. > I'm comparing the performance b/w explicit branch vs cmpxchg and will > report back. > > -Rong > > >> Richard. >> >>> I just tested this implementation vs. simply setting to 1, using >>> google search as the benchmark. >>> This one is 4.5x faster. The test was done on Intel Westmere systems. >>> >>> I can change the parameter to an option. >>> >>> -Rong >>> >>>> Richard. >>>> >>>>> Thanks, >>>>> >>>>> -Rong
Re: [PATCH] Conditional count update for fast coverage test in multi-threaded programs
On Mon, Nov 25, 2013 at 2:11 AM, Richard Biener wrote: > On Fri, Nov 22, 2013 at 10:49 PM, Rong Xu wrote: >> On Fri, Nov 22, 2013 at 4:03 AM, Richard Biener >> wrote: >>> On Fri, Nov 22, 2013 at 4:51 AM, Rong Xu wrote: >>>> Hi, >>>> >>>> This patch injects a condition into the instrumented code for edge >>>> counter update. The counter value will not be updated after reaching >>>> value 1. >>>> >>>> The feature is under a new parameter --param=coverage-exec_once. >>>> Default is disabled and setting to 1 to enable. >>>> >>>> This extra check usually slows the program down. For SPEC 2006 >>>> benchmarks (all single thread programs), we usually see around 20%-35% >>>> slow down in -O2 coverage build. This feature, however, is expected to >>>> improve the coverage run speed for multi-threaded programs, because >>>> there virtually no data race and false sharing in updating counters. >>>> The improvement can be significant for highly threaded programs -- we >>>> are seeing 7x speedup in coverage test run for some non-trivial google >>>> applications. >>>> >>>> Tested with bootstrap. >>> >>> Err - why not simply emit >>> >>> counter = 1 >>> >>> for the counter update itself with that --param (I don't like a --param >>> for this either). >>> >>> I assume that CPUs can avoid data-races and false sharing for >>> non-changing accesses? >>> >> >> I'm not aware of any CPU having this feature. I think a write to the >> shared cache line to invalidate all the shared copies. I cannot find >> any reference on checking the value of the write. Do you have any >> pointer to the feature? > > I don't have any pointer - but I remember seeing this in the context > of atomics thus it may be only in the context of using a xchg > or cmpxchg instruction. Which would make it non-portable to > some extent (if you don't want to use atomic builtins here). > cmpxchg should work here -- it's a conditional write so the data race /false sharing can be avoided. I'm comparing the performance b/w explicit branch vs cmpxchg and will report back. -Rong > Richard. > >> I just tested this implementation vs. simply setting to 1, using >> google search as the benchmark. >> This one is 4.5x faster. The test was done on Intel Westmere systems. >> >> I can change the parameter to an option. >> >> -Rong >> >>> Richard. >>> >>>> Thanks, >>>> >>>> -Rong
Re: [PATCH] Conditional count update for fast coverage test in multi-threaded programs
On Fri, Nov 22, 2013 at 4:03 AM, Richard Biener wrote: > On Fri, Nov 22, 2013 at 4:51 AM, Rong Xu wrote: >> Hi, >> >> This patch injects a condition into the instrumented code for edge >> counter update. The counter value will not be updated after reaching >> value 1. >> >> The feature is under a new parameter --param=coverage-exec_once. >> Default is disabled and setting to 1 to enable. >> >> This extra check usually slows the program down. For SPEC 2006 >> benchmarks (all single thread programs), we usually see around 20%-35% >> slow down in -O2 coverage build. This feature, however, is expected to >> improve the coverage run speed for multi-threaded programs, because >> there virtually no data race and false sharing in updating counters. >> The improvement can be significant for highly threaded programs -- we >> are seeing 7x speedup in coverage test run for some non-trivial google >> applications. >> >> Tested with bootstrap. > > Err - why not simply emit > > counter = 1 > > for the counter update itself with that --param (I don't like a --param > for this either). > > I assume that CPUs can avoid data-races and false sharing for > non-changing accesses? > I'm not aware of any CPU having this feature. I think a write to the shared cache line to invalidate all the shared copies. I cannot find any reference on checking the value of the write. Do you have any pointer to the feature? I just tested this implementation vs. simply setting to 1, using google search as the benchmark. This one is 4.5x faster. The test was done on Intel Westmere systems. I can change the parameter to an option. -Rong > Richard. > >> Thanks, >> >> -Rong
[PATCH] Conditional count update for fast coverage test in multi-threaded programs
Hi, This patch injects a condition into the instrumented code for edge counter update. The counter value will not be updated after reaching value 1. The feature is under a new parameter --param=coverage-exec_once. Default is disabled and setting to 1 to enable. This extra check usually slows the program down. For SPEC 2006 benchmarks (all single thread programs), we usually see around 20%-35% slow down in -O2 coverage build. This feature, however, is expected to improve the coverage run speed for multi-threaded programs, because there virtually no data race and false sharing in updating counters. The improvement can be significant for highly threaded programs -- we are seeing 7x speedup in coverage test run for some non-trivial google applications. Tested with bootstrap. Thanks, -Rong 2013-11-21 Rong Xu * gcc/doc/invoke.texi (coverage-exec_once): Add document. * gcc/params.def (coverage-exec_once): New param. * gcc/profile.h (gcov_type sum_edge_counts): Add extern decl. * gcc/profile.c (branch_prob): Add support for coverage-exec_once. * gcc/tree-profile.c (gimple_gen_edge_profiler): Ditto. (gimple_gen_ior_profiler): Ditto. (insert_if_then): Ditto. (add_execonce_wrapper): Ditto. (is_pred_instrumentation_candidate): Ditto. (add_predicate_to_edge_counters): Ditto. (cleanup_pred_instrumentation): Ditto. (init_pred_instrumentation): Ditto. (tree_profiling): Ditto. Index: gcc/doc/invoke.texi === --- gcc/doc/invoke.texi (revision 205226) +++ gcc/doc/invoke.texi (working copy) @@ -9937,6 +9937,10 @@ The default choice depends on the target. Set the maximum number of existing candidates that will be considered when seeking a basis for a new straight-line strength reduction candidate. +@item coverage-exec_once +Set to 1 to update each arc counter only once. This avoids false sharing +and speeds up profile-generate run for multi-threaded programs. + @end table @end table Index: gcc/params.def === --- gcc/params.def (revision 205226) +++ gcc/params.def (working copy) @@ -441,6 +441,14 @@ DEFPARAM(TRACER_MIN_BRANCH_PROBABILITY, "Stop forward growth if the probability of best edge is less than this threshold (in percent). Used when profile feedback is not available", 50, 0, 100) +/* This parameter enables fast coverage test. Once the counter flips from 0 + to 1, we no longer updating the value . This avoids false sharing and speeds + up profile-generate run for multi-threaded programs. */ +DEFPARAM (PARAM_COVERAGE_EXEC_ONCE, + "coverage-exec_once", + "Stop incrementing edge counts once they become 1.", + 0, 0, 1) + /* The maximum number of incoming edges to consider for crossjumping. */ DEFPARAM(PARAM_MAX_CROSSJUMP_EDGES, "max-crossjump-edges", Index: gcc/profile.c === --- gcc/profile.c (revision 205226) +++ gcc/profile.c (working copy) @@ -67,6 +67,7 @@ along with GCC; see the file COPYING3. If not see #include "cfgloop.h" #include "dumpfile.h" #include "cgraph.h" +#include "params.h" #include "profile.h" @@ -1327,6 +1328,9 @@ branch_prob (void) /* Commit changes done by instrumentation. */ gsi_commit_edge_inserts (); + + if (PARAM_VALUE (PARAM_COVERAGE_EXEC_ONCE)) +add_predicate_to_edge_counters (); } free_aux_for_edges (); Index: gcc/profile.h === --- gcc/profile.h (revision 205226) +++ gcc/profile.h (working copy) @@ -46,6 +46,9 @@ extern gcov_type sum_edge_counts (vec extern void init_node_map (bool); extern void del_node_map (void); +/* Insert a predicate to edge counter update stmt. */ +extern void add_predicate_to_edge_counters (void); + extern void get_working_sets (void); /* In predict.c. */ Index: gcc/tree-profile.c === --- gcc/tree-profile.c (revision 205226) +++ gcc/tree-profile.c (working copy) @@ -52,6 +52,7 @@ along with GCC; see the file COPYING3. If not see #include "target.h" #include "tree-cfgcleanup.h" #include "tree-nested.h" +#include "params.h" static GTY(()) tree gcov_type_node; static GTY(()) tree tree_interval_profiler_fn; @@ -67,6 +68,11 @@ static GTY(()) tree ic_void_ptr_var; static GTY(()) tree ic_gcov_type_ptr_var; static GTY(()) tree ptr_void; +/* A pointer-set of the last statement in each block of statements that need to + be applied a predicate . */ +static struct pointer_set_t *pred_instrumentation_candidate = NULL; + + /* Do initiali
Re: [RFC] libgcov.c re-factoring and offline profile-tool
Ping. On Mon, Nov 18, 2013 at 12:31 PM, Xinliang David Li wrote: > gcov-dump tool does raw dump of profile data. In the long run, we > should have only one gcov profile manipulation tool, so it might be > better to incorporate gcov-dump into gcov-tool and get rid of > 'gcov-dump'. > > David > > On Mon, Nov 18, 2013 at 12:24 PM, Rong Xu wrote: >> Hi, all >> >> This is the new patch for gcov-tool (previously profile-tool). >> >> Honza: can you comment on the new merge interface? David posted some >> comments in an earlier email and we want to know what's your opinion. >> >> Test patch has been tested with boostrap, regresssion, >> profiledbootstrap and SPEC2006. >> >> Noticeable changes from the earlier version: >> >> 1. create a new file libgcov.h and move libgcov-*.h headers to libgcov.h >> So we can included multiple libgcov-*.c without adding new macros. >> >> 2. split libgcov.h specific code in gcvo-io.h to libcc/libgcov.h >> Avoid multiple-page of code under IN_LIBGCOV macro -- this >> improves the readability. >> >> 3. make gcov_var static, and move the definition from gcov-io.h to >> gcov-io.c. Also >>move some static functions accessing gcov_var to gcvo-io.c >> Current code rely on GCOV_LINKAGE tricks to avoid multi-definition. I don't >> see >> a reason that gcov_var needs to exposed as a global. >> >> 4. expose gcov_write_strings() and gcov_sync() to gcov_tool usage >> >> 5. rename profile-tool to gcov-tool per Honza's suggestion. >> >> Thanks, >> >> -Rong >> >> On Tue, Nov 12, 2013 at 4:27 PM, Rong Xu wrote: >>> The patch was checked in as r204730. >>> >>> Tested with >>> configure with --enable-languages=all,obj-c++ >>> (ada is currently broken, so I was not be able test) >>> bootstrap and profiledbootstrap >>> regression for -m32 and -m64. >>> spec2006 >>> >>> The only semantic change is __gcov_flush() used to be under L_gcov. >>> I put the function to libgcov-interface.c and made it under >>> L_gcov_flush (newly added). >>> So if you need this function, you have to enable L_gcov_flush in the >>> libgcc/Makefile.in. >>> >>> Thanks, >>> >>> -Rong >>> >>> On Mon, Nov 11, 2013 at 10:19 AM, Rong Xu wrote: >>>> On Mon, Nov 11, 2013 at 7:02 AM, Jan Hubicka wrote: >>>>>> 2013-11-04 Rong Xu >>>>>> >>>>>> * libgcc/libgcov.c: Delete as part of re-factoring. >>>>>> * libgcc/libgcov-profiler.c (__gcov_interval_profiler): Moved from >>>>>> libgcov.c >>>>>> (__gcov_pow2_profiler): Ditto. >>>>>> (__gcov_one_value_profiler_body): Ditto. >>>>>> (__gcov_one_value_profiler): Ditto. >>>>>> (__gcov_indirect_call_profiler): Ditto. >>>>>> (__gcov_indirect_call_profiler_v2): Ditto. >>>>>> (__gcov_average_profiler): Ditto. >>>>>> (__gcov_ior_profiler): Ditto. >>>>>> * libgcc/libgcov-driver.c (this_prg): Make it file scope >>>>>> static variable. >>>>>> (all_prg): Ditto. >>>>>> (crc32): Ditto. >>>>>> (gi_filename): Ditto. >>>>>> (fn_buffer): Ditto. >>>>>> (sum_buffer): Ditto. >>>>>> (struct gcov_filename_aux): New types to store auxiliary >>>>>> information >>>>>> for gi_filename. >>>>>> (gcov_version): Moved from libgcov.c. >>>>>> (crc32_unsigned): Ditto. >>>>>> (gcov_histogram_insert): Ditto. >>>>>> (gcov_compute_histogram): Ditto. >>>>>> (gcov_exit): Ditto. >>>>>> (gcov_clear): Ditto. >>>>>> (__gcov_init): Ditto. >>>>>> (gcov_exit_compute_summary): New function split from gcov_exit(). >>>>>> (gcov_exit_merge_gcda): Ditto. >>>>>> (gcov_exit_write_gcda): Ditto. >>>>>> (gcov_exit_dump_gcov): Ditto. >>>>>> * libgcc/libgcov-interface.c (init_mx): Moved from libgcov.c. >>>>>> (init_mx_once): Ditto. >>>>>> (__gcov_flush): Ditto. >>>>>> (__gcov_reset): Ditto. >>>>>> (__gcov_dump): Di
Re: atomic update of profile counters (issue7000044)
OK. Sorry for miss-reading the message. In that case, linking in libatomic becomes a separate issue. We don't need to touch gcc.c in this patch. Thanks, -Rong On Wed, Nov 20, 2013 at 2:19 PM, Andrew Pinski wrote: > On Wed, Nov 20, 2013 at 2:07 PM, Rong Xu wrote: >> Joseph and Andrew, thanks for the suggestion. That's really helpful. >> >> Here is the new patch for gcc.c. >> Basically, it's just what you have suggested: enclosing -latomic with >> --as-needed, and using macros. >> For the case of no --as-needed support, I use static link. (just found >> that some code already using this in the SPEC). >> I'm flexible on this part -- if you think this is unnecessary, I can remove. > > > I think Joseph's suggestion was also to include -latomic even when not > generating atomic profiling due to the C11 code requiring it. > > Thanks, > Andrew > >> >> Thanks, >> >> -Rong >> >> Index: gcc.c >> === >> --- gcc.c (revision 205053) >> +++ gcc.c (working copy) >> @@ -748,6 +748,23 @@ proper position among the other output files. */ >> %{fvtable-verify=preinit: -lvtv -u_vtable_map_vars_start >> -u_vtable_map_vars_end}}" >> #endif >> >> +/* This spec is for linking in libatomic in gcov atomic counter update. >> + We will use the atomic functions defined in libatomic, only when the >> builtin >> + versions are not available. In the case of no LD_AS_NEEDED support, we >> + link libatomic statically. */ >> + >> +#ifndef GCOV_ATOMIC_SPEC >> +#if USE_LD_AS_NEEDED >> +#define GCOV_ATOMIC_SPEC "%{fprofile-generate-atomic=*:" >> LD_AS_NEEDED_OPTION \ >> + " -latomic} " LD_NO_AS_NEEDED_OPTION >> +#elif defined(HAVE_LD_STATIC_DYNAMIC) >> +#define GCOV_ATOMIC_SPEC "%{fprofile-generate-atomic=*:" LD_STATIC_OPTION \ >> +" -latomic " LD_DYNAMIC_OPTION "}" >> +#else /* !USE_LD_AS_NEEDED && !HAVE_LD_STATIC_DYNAMIC */ >> +#define GCOV_ATOMIC_SPEC "%{fprofile-generate-atomic=*:-latomic}" >> +#endif >> +#endif >> + >> /* -u* was put back because both BSD and SysV seem to support it. */ >> /* %{static:} simply prevents an error message if the target machine >> doesn't handle -static. */ >> @@ -771,7 +788,8 @@ proper position among the other output files. */ >> >> %{fopenmp|ftree-parallelize-loops=*:%:include(libgomp.spec)%(link_gomp)}\ >> %{fgnu-tm:%:include(libitm.spec)%(link_itm)}\ >> %(mflib) " STACK_SPLIT_SPEC "\ >> -%{fprofile-arcs|fprofile-generate*|coverage:-lgcov} " SANITIZER_SPEC " \ >> +%{fprofile-arcs|fprofile-generate*|coverage:-lgcov\ >> + " GCOV_ATOMIC_SPEC "} " SANITIZER_SPEC " \ >> %{!nostdlib:%{!nodefaultlibs:%(link_ssp) %(link_gcc_c_sequence)}}\ >> %{!nostdlib:%{!nostartfiles:%E}} %{T*} }}" >> >> >> >> On Wed, Nov 20, 2013 at 1:04 PM, Joseph S. Myers >> wrote: >>> On Wed, 20 Nov 2013, Rong Xu wrote: >>> >>>> I could do this in the SPEC >>>> -Wl,-Bstatic -latomic -Wl,-Bdynamic >>>> which would link libatomic statically. >>>> I works for me. But it looks a little weird in gcc driver. >>> >>> I think we should generally link libatomic with --as-needed by default on >>> platforms supporting --as-needed, in line with the general principle that >>> C code just using language not library facilities (_Atomic in this case) >>> shouldn't need any special options to link it (libatomic is like libgcc, >>> which is linked in automatically); the trickier question is what to do >>> with it on any systems supporting shared libraries but not --as-needed. >>> >>> -- >>> Joseph S. Myers >>> jos...@codesourcery.com
Re: atomic update of profile counters (issue7000044)
Joseph and Andrew, thanks for the suggestion. That's really helpful. Here is the new patch for gcc.c. Basically, it's just what you have suggested: enclosing -latomic with --as-needed, and using macros. For the case of no --as-needed support, I use static link. (just found that some code already using this in the SPEC). I'm flexible on this part -- if you think this is unnecessary, I can remove. Thanks, -Rong Index: gcc.c === --- gcc.c (revision 205053) +++ gcc.c (working copy) @@ -748,6 +748,23 @@ proper position among the other output files. */ %{fvtable-verify=preinit: -lvtv -u_vtable_map_vars_start -u_vtable_map_vars_end}}" #endif +/* This spec is for linking in libatomic in gcov atomic counter update. + We will use the atomic functions defined in libatomic, only when the builtin + versions are not available. In the case of no LD_AS_NEEDED support, we + link libatomic statically. */ + +#ifndef GCOV_ATOMIC_SPEC +#if USE_LD_AS_NEEDED +#define GCOV_ATOMIC_SPEC "%{fprofile-generate-atomic=*:" LD_AS_NEEDED_OPTION \ + " -latomic} " LD_NO_AS_NEEDED_OPTION +#elif defined(HAVE_LD_STATIC_DYNAMIC) +#define GCOV_ATOMIC_SPEC "%{fprofile-generate-atomic=*:" LD_STATIC_OPTION \ +" -latomic " LD_DYNAMIC_OPTION "}" +#else /* !USE_LD_AS_NEEDED && !HAVE_LD_STATIC_DYNAMIC */ +#define GCOV_ATOMIC_SPEC "%{fprofile-generate-atomic=*:-latomic}" +#endif +#endif + /* -u* was put back because both BSD and SysV seem to support it. */ /* %{static:} simply prevents an error message if the target machine doesn't handle -static. */ @@ -771,7 +788,8 @@ proper position among the other output files. */ %{fopenmp|ftree-parallelize-loops=*:%:include(libgomp.spec)%(link_gomp)}\ %{fgnu-tm:%:include(libitm.spec)%(link_itm)}\ %(mflib) " STACK_SPLIT_SPEC "\ -%{fprofile-arcs|fprofile-generate*|coverage:-lgcov} " SANITIZER_SPEC " \ +%{fprofile-arcs|fprofile-generate*|coverage:-lgcov\ + " GCOV_ATOMIC_SPEC "} " SANITIZER_SPEC " \ %{!nostdlib:%{!nodefaultlibs:%(link_ssp) %(link_gcc_c_sequence)}}\ %{!nostdlib:%{!nostartfiles:%E}} %{T*} }}" On Wed, Nov 20, 2013 at 1:04 PM, Joseph S. Myers wrote: > On Wed, 20 Nov 2013, Rong Xu wrote: > >> I could do this in the SPEC >> -Wl,-Bstatic -latomic -Wl,-Bdynamic >> which would link libatomic statically. >> I works for me. But it looks a little weird in gcc driver. > > I think we should generally link libatomic with --as-needed by default on > platforms supporting --as-needed, in line with the general principle that > C code just using language not library facilities (_Atomic in this case) > shouldn't need any special options to link it (libatomic is like libgcc, > which is linked in automatically); the trickier question is what to do > with it on any systems supporting shared libraries but not --as-needed. > > -- > Joseph S. Myers > jos...@codesourcery.com
Re: atomic update of profile counters (issue7000044)
On Tue, Nov 19, 2013 at 5:11 PM, Andrew Pinski wrote: > On Tue, Nov 19, 2013 at 5:02 PM, Rong Xu wrote: >> Hi all, >> >> I merged this old patch with current trunk. I also make the following changes >> (1) not using weak references. Now every *profile_atomic() has it's >> own .o so that none of them will be in the final binary if >> -fprofile-generate-atomic is not specified. >> (2) more value profilers have the atomic version. >> (3) not link to libatomic. I used to link the libatomic in the >> presence of -fprofile-generate-atomic, per Andrew's suggestion. It >> used to work. But now if I can add -latomic in the SPEC, it cannot >> find the libatomic.so.1 (unless I specify the PATH). I did not find an >> easy way to statically link libatomic.a. Andrew: Do you have any >> suggestion? Or should we let the user link to libatomic.a if the >> builtins are not expanded? > > It should work for an installed GCC. For testing you might need > something that is included inside testsuite/lib/atomic-dg.exp which > sets the library path to include libatomic build directory. When I change the SPEC to include libatomic, the compiler can find libatomic. I.e. using >> gcc -O2 -fprofile-generate -fprofile-generate-atomic=3 hello_world.c generates a.out without any problem. But since there are both shared and static libatomic in lib64, it chooses to use the dynamic one. >> ldd a.out linux-vdso.so.1 => (0x7fff56bff000) libatomic.so.1 => not found libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x2b0720261000) /lib64/ld-linux-x86-64.so.2 (0x2b072003c000) >> ./a.out ./a.out: error while loading shared libraries: libatomic.so.1: cannot open shared object file: No such file or directory while >> LD_LIBRARY_PATH=/lib64 ./a.out works fine. I think that's the same reason we set the library path in testsuite/lib/atomic-dg.exp, because /lib64 is not in the dynamic library search list. I could do this in the SPEC -Wl,-Bstatic -latomic -Wl,-Bdynamic which would link libatomic statically. I works for me. But it looks a little weird in gcc driver. Index: gcc.c === --- gcc.c (revision 205053) +++ gcc.c (working copy) @@ -771,7 +771,8 @@ %{fopenmp|ftree-parallelize-loops=*:%:include(libgomp.spec)%(link_gomp)}\ %{fgnu-tm:%:include(libitm.spec)%(link_itm)}\ %(mflib) " STACK_SPLIT_SPEC "\ -%{fprofile-arcs|fprofile-generate*|coverage:-lgcov} " SANITIZER_SPEC " \ +%{fprofile-arcs|fprofile-generate*|coverage:-lgcov\ + %{fprofile-generate-atomic=*:-Wl,-Bstatic -latomic -Wl,-Bdynamic}} " SANITIZER_SPEC " \ %{!nostdlib:%{!nodefaultlibs:%(link_ssp) %(link_gcc_c_sequence)}}\ %{!nostdlib:%{!nostartfiles:%E}} %{T*} }}" #endif > I think now we require libatomic in more cases (C11 atomic support for > an example). > > Thanks, > Andrew Pinski > >> >> Is this OK for trunk? >> >> Thanks, >> >> -Rong >> >> On Mon, Jan 7, 2013 at 12:55 PM, Rong Xu wrote: >>> Function __gcov_indirect_call_profiler_atomic (which contains call to >>> the atomic function) is always emitted in libgcov. >>> Since we only link libatomic when -fprofile-gen-atomic is specified, >>> we have to make the atomic function weak -- otherwise, there is a >>> unsat for regular FDO gen build (of course, when the builtin is not >>> expanded). >>> >>> An alternative it to always link libatomic together with libgcov. Then >>> we don't need the weak stuff. I'm not sure when one is better. >>> >>> -Rong >>> >>> On Mon, Jan 7, 2013 at 12:36 PM, Richard Henderson wrote: >>>> On 01/03/2013 04:42 PM, Rong Xu wrote: >>>>> It links libatomic when -fprofile-gen-atomic is specified for FDO >>>>> instrumentation build. Here I assume libatomic is always installed. >>>>> Andrew: do you think if this is reasonable? >>>>> >>>>> It also disables the functionality if target does not support weak >>>>> (ie. TARGET_SUPPORTS_WEAK == 0). >>>> >>>> Since you're linking libatomic, you don't need weak references. >>>> >>>> I think its ok to assume libatomic is installed, given that the >>>> user has had to explicitly use the command-line option. >>>> >>>> >>>> r~
Re: atomic update of profile counters (issue7000044)
Hi all, I merged this old patch with current trunk. I also make the following changes (1) not using weak references. Now every *profile_atomic() has it's own .o so that none of them will be in the final binary if -fprofile-generate-atomic is not specified. (2) more value profilers have the atomic version. (3) not link to libatomic. I used to link the libatomic in the presence of -fprofile-generate-atomic, per Andrew's suggestion. It used to work. But now if I can add -latomic in the SPEC, it cannot find the libatomic.so.1 (unless I specify the PATH). I did not find an easy way to statically link libatomic.a. Andrew: Do you have any suggestion? Or should we let the user link to libatomic.a if the builtins are not expanded? Is this OK for trunk? Thanks, -Rong On Mon, Jan 7, 2013 at 12:55 PM, Rong Xu wrote: > Function __gcov_indirect_call_profiler_atomic (which contains call to > the atomic function) is always emitted in libgcov. > Since we only link libatomic when -fprofile-gen-atomic is specified, > we have to make the atomic function weak -- otherwise, there is a > unsat for regular FDO gen build (of course, when the builtin is not > expanded). > > An alternative it to always link libatomic together with libgcov. Then > we don't need the weak stuff. I'm not sure when one is better. > > -Rong > > On Mon, Jan 7, 2013 at 12:36 PM, Richard Henderson wrote: >> On 01/03/2013 04:42 PM, Rong Xu wrote: >>> It links libatomic when -fprofile-gen-atomic is specified for FDO >>> instrumentation build. Here I assume libatomic is always installed. >>> Andrew: do you think if this is reasonable? >>> >>> It also disables the functionality if target does not support weak >>> (ie. TARGET_SUPPORTS_WEAK == 0). >> >> Since you're linking libatomic, you don't need weak references. >> >> I think its ok to assume libatomic is installed, given that the >> user has had to explicitly use the command-line option. >> >> >> r~ 2013-11-19 Rong Xu * gcc/gcov-io.h: Add atomic function macros for compiler use. * gcc/common.opt (fprofile-generate-atomic): New option. * gcc/tree-profile.c (gimple_init_edge_profiler): Support for atomic counter update. (gimple_gen_edge_profiler): Ditto. * libgcc/libgcov-profiler.c (__gcov_interval_profiler_atomic): Ditto. (__gcov_pow2_profiler_atomic): Ditto. (__gcov_one_value_profiler_body_atomic): Ditto. (__gcov_one_value_profiler_atomic): Ditto. (__gcov_indirect_call_profiler_atomic): Ditto. (__gcov_indirect_call_profiler_v2_atomic): Ditto. (__gcov_time_profiler_atomic): Ditto. (__gcov_average_profiler_atomic): Ditto. * gcc/gcc.c: Link libatomic when -fprofile-generate-atomic used. * libgcc/Makefile.in: Add atomic objects. Index: gcc/common.opt === --- gcc/common.opt (revision 205053) +++ gcc/common.opt (working copy) @@ -1684,6 +1684,15 @@ fprofile-correction Common Report Var(flag_profile_correction) Enable correction of flow inconsistent profile data input +; fprofile-generate-atomic=0: default and disable atomical update. +; fprofile-generate-atomic=1: atomically update edge profile counters. +; fprofile-generate-atomic=2: atomically update value profile counters. +; fprofile-generate-atomic=3: atomically update edge and value profile counters. +; other values will be ignored (fall back to the default of 0). +fprofile-generate-atomic= +Common Joined UInteger Report Var(flag_profile_generate_atomic) Init(3) Optimization +fprofile-generate-atomic=[0..3] Atomical increments of profile counters. + fprofile-generate Common Enable common options for generating profile info for profile feedback directed optimizations Index: gcc/tree-profile.c === --- gcc/tree-profile.c (revision 205053) +++ gcc/tree-profile.c (working copy) @@ -155,6 +155,9 @@ gimple_init_edge_profiler (void) tree ic_profiler_fn_type; tree average_profiler_fn_type; tree time_profiler_fn_type; + const char *fn_name; + bool profile_gen_value_atomic = (flag_profile_generate_atomic == 2 || + flag_profile_generate_atomic == 3); if (!gcov_type_node) { @@ -167,9 +170,10 @@ gimple_init_edge_profiler (void) gcov_type_ptr, gcov_type_node, integer_type_node, unsigned_type_node, NULL_TREE); + fn_name = profile_gen_value_atomic ? "__gcov_interval_profiler_atomic" : + "__gcov_interval_profiler"; tree_interval_profiler_fn - = build_fn_decl ("__gcov_in
Re: Fix summary generation with fork
On Mon, Nov 18, 2013 at 2:15 PM, Jan Hubicka wrote: > Hi, > this patch fixes problem we noticed with Martin Liska where gcov_dump is > called > several times per execution of firefox (on each fork and exec). This causes > runs to be large and makes functions executed once per program to be > considered > cold. > > This patch makes us to update runs just once per execution and not on each > streaming of profile data. While testing it I also noticed that program > summary is now broken, since crc32 is accumulated per each dumping instead > just > once. You are right. I forgot that gcov_exit() can be called multiple times. Should have initialized crc32 per gcov_exit() call. Thanks for the fix. > > I believe the newly introduced static vars should go - there is nothing really > preventing us from doing two concurent updates and also it just unnecesary > increases to footprint of libgcov. I converted thus all_prg and crc32 back to > local vars. > > Bootstrapped/regtested x86_64-linux, comitted. > > * libgcov-driver.c (get_gcov_dump_complete): Update comments. > (all_prg, crc32): Remove static vars. > (gcov_exit_compute_summary): Rewrite to return crc32; do not clear > all_prg. > (gcov_exit_merge_gcda): Add crc32 parameter. > (gcov_exit_merge_summary): Add crc32 and all_prg parameter; > do not account run if it was already accounted. > (gcov_exit_dump_gcov): Add crc32 and all_prg parameters. > (gcov_exit): Initialize all_prg; update. > Index: libgcov-driver.c > === > --- libgcov-driver.c(revision 204945) > +++ libgcov-driver.c(working copy) > @@ -96,7 +96,7 @@ static size_t gcov_max_filename = 0; > /* Flag when the profile has already been dumped via __gcov_dump(). */ > static int gcov_dump_complete; > > -/* A global functino that get the vaule of gcov_dump_complete. */ > +/* A global function that get the vaule of gcov_dump_complete. */ > > int > get_gcov_dump_complete (void) > @@ -319,12 +319,6 @@ gcov_compute_histogram (struct gcov_summ > > /* summary for program. */ > static struct gcov_summary this_prg; > -#if !GCOV_LOCKED > -/* summary for all instances of program. */ > -static struct gcov_summary all_prg; > -#endif > -/* crc32 for this program. */ > -static gcov_unsigned_t crc32; > /* gcda filename. */ > static char *gi_filename; > /* buffer for the fn_data from another program. */ > @@ -333,10 +327,9 @@ static struct gcov_fn_buffer *fn_buffer; > static struct gcov_summary_buffer *sum_buffer; > > /* This funtions computes the program level summary and the histo-gram. > - It initializes ALL_PRG, computes CRC32, and stores the summary in > - THIS_PRG. All these three variables are file statics. */ > + It computes and returns CRC32 and stored summari in THIS_PRG. */ > > -static void > +static gcov_unsigned_t > gcov_exit_compute_summary (void) > { >struct gcov_info *gi_ptr; > @@ -346,10 +339,8 @@ gcov_exit_compute_summary (void) >int f_ix; >unsigned t_ix; >gcov_unsigned_t c_num; > + gcov_unsigned_t crc32 = 0; > > -#if !GCOV_LOCKED > - memset (&all_prg, 0, sizeof (all_prg)); > -#endif >/* Find the totals for this execution. */ >memset (&this_prg, 0, sizeof (this_prg)); >for (gi_ptr = gcov_list; gi_ptr; gi_ptr = gi_ptr->next) > @@ -391,6 +382,7 @@ gcov_exit_compute_summary (void) > } > } >gcov_compute_histogram (&this_prg); > + return crc32; > } > > /* A struct that bundles all the related information about the > @@ -412,7 +404,8 @@ static int > gcov_exit_merge_gcda (struct gcov_info *gi_ptr, >struct gcov_summary *prg_p, >gcov_position_t *summary_pos_p, > - gcov_position_t *eof_pos_p) > + gcov_position_t *eof_pos_p, > + gcov_unsigned_t crc32) > { >gcov_unsigned_t tag, length; >unsigned t_ix; > @@ -652,13 +645,19 @@ gcov_exit_write_gcda (const struct gcov_ > Return -1 on error. Return 0 on success. */ > > static int > -gcov_exit_merge_summary (const struct gcov_info *gi_ptr, struct gcov_summary > *prg) > +gcov_exit_merge_summary (const struct gcov_info *gi_ptr, struct gcov_summary > *prg, > +gcov_unsigned_t crc32, struct gcov_summary *all_prg) > { >struct gcov_ctr_summary *cs_prg, *cs_tprg; > -#if !GCOV_LOCKED > - struct gcov_ctr_summary *cs_all; > -#endif >unsigned t_ix; > + /* If application calls fork or exec multiple times, we end up storing > + profile repeadely. We should not account this as multiple runs or > + functions executed once may mistakely become cold. */ > + static int run_accounted = 0; > +#if !GCOV_LOCKED > + /* summary for all instances of program. */ > + struct gcov_ctr_summary *cs_all; > +#endif > >/* Merge the summaries. */ >for (t_ix = 0; t_ix < GCOV_COUNTERS_SUMMABLE; t_ix++
Re: [RFC] libgcov.c re-factoring and offline profile-tool
The patch was checked in as r204730. Tested with configure with --enable-languages=all,obj-c++ (ada is currently broken, so I was not be able test) bootstrap and profiledbootstrap regression for -m32 and -m64. spec2006 The only semantic change is __gcov_flush() used to be under L_gcov. I put the function to libgcov-interface.c and made it under L_gcov_flush (newly added). So if you need this function, you have to enable L_gcov_flush in the libgcc/Makefile.in. Thanks, -Rong On Mon, Nov 11, 2013 at 10:19 AM, Rong Xu wrote: > On Mon, Nov 11, 2013 at 7:02 AM, Jan Hubicka wrote: >>> 2013-11-04 Rong Xu >>> >>> * libgcc/libgcov.c: Delete as part of re-factoring. >>> * libgcc/libgcov-profiler.c (__gcov_interval_profiler): Moved from >>> libgcov.c >>> (__gcov_pow2_profiler): Ditto. >>> (__gcov_one_value_profiler_body): Ditto. >>> (__gcov_one_value_profiler): Ditto. >>> (__gcov_indirect_call_profiler): Ditto. >>> (__gcov_indirect_call_profiler_v2): Ditto. >>> (__gcov_average_profiler): Ditto. >>> (__gcov_ior_profiler): Ditto. >>> * libgcc/libgcov-driver.c (this_prg): Make it file scope >>> static variable. >>> (all_prg): Ditto. >>> (crc32): Ditto. >>> (gi_filename): Ditto. >>> (fn_buffer): Ditto. >>> (sum_buffer): Ditto. >>> (struct gcov_filename_aux): New types to store auxiliary information >>> for gi_filename. >>> (gcov_version): Moved from libgcov.c. >>> (crc32_unsigned): Ditto. >>> (gcov_histogram_insert): Ditto. >>> (gcov_compute_histogram): Ditto. >>> (gcov_exit): Ditto. >>> (gcov_clear): Ditto. >>> (__gcov_init): Ditto. >>> (gcov_exit_compute_summary): New function split from gcov_exit(). >>> (gcov_exit_merge_gcda): Ditto. >>> (gcov_exit_write_gcda): Ditto. >>> (gcov_exit_dump_gcov): Ditto. >>> * libgcc/libgcov-interface.c (init_mx): Moved from libgcov.c. >>> (init_mx_once): Ditto. >>> (__gcov_flush): Ditto. >>> (__gcov_reset): Ditto. >>> (__gcov_dump): Ditto. >>> (__gcov_fork): Ditto. >>> (__gcov_execl): Ditto. >>> (__gcov_execlp): Ditto. >>> (__gcov_execle): Ditto. >>> (__gcov_execv): Ditto. >>> (__gcov_execvp): Ditto. >>> (__gcov_execve): Ditto. >>> * libgcc/libgcov-driver-system.c (gcov_error): New utility function. >>> (allocate_filename_struct): New function split from gcov_exit(). >>> (gcov_exit_open_gcda_file): Ditto. >>> (create_file_directory): Moved from libgcov.c. >>> * libgcc/libgcov-merge.c: >>> (__gcov_merge_add): Moved from libgcov.c. >>> (__gcov_merge_ior): Ditto. >>> (__gcov_merge_single): Ditto. >>> (__gcov_merge_delta): Ditto. >>> * libgcc/Makefile.in: Change to build newly added files. >>> * gcc/gcov-io.h (__gcov_merge_ior): Add the decl to avoid warning. >>> >> >> Hi, >> the patch looks resonable. I take your word on the fact that there are no >> changes >> in the code, I did not read it all ;))) > > We did split gcov_exit() into more functions. > But, semantically the code is the same. > >>> Index: libgcc/Makefile.in >>> === >>> --- libgcc/Makefile.in(revision 204285) >>> +++ libgcc/Makefile.in(working copy) >>> @@ -853,17 +853,37 @@ include $(iterator) >>> # Build libgcov components. >>> >>> # Defined in libgcov.c, included only in gcov library >>> -LIBGCOV = _gcov _gcov_merge_add _gcov_merge_single _gcov_merge_delta \ >>> -_gcov_fork _gcov_execl _gcov_execlp _gcov_execle \ >>> -_gcov_execv _gcov_execvp _gcov_execve _gcov_reset _gcov_dump \ >>> -_gcov_interval_profiler _gcov_pow2_profiler _gcov_one_value_profiler \ >>> +##LIBGCOV = _gcov _gcov_merge_add _gcov_merge_single _gcov_merge_delta \ >>> +##_gcov_fork _gcov_execl _gcov_execlp _gcov_execle \ >>> +##_gcov_execv _gcov_execvp _gcov_execve _gcov_reset _gcov_dump \ >>> +##_gcov_interval_profiler _gcov_pow2_profiler _gcov_one_value_profiler >>> \ >>> +##_gcov_indirect_call_profiler _gcov_average_profiler >>> _gcov_ior_profiler \ >>> +##_gcov_merge_ior _gcov_indirect_cal
Re: [PATCH] Time profiler - phase 1
Thanks. I'll have this change included in my libgcov.c re-re-factoring patch. -Rong On Tue, Nov 12, 2013 at 10:18 AM, Martin Liška wrote: > Hi Rong, > you are right, that's the only usage and so that you can declare it > static. > > Thank you, > Martin > > On Nov 12, 2013 7:10 PM, "Rong Xu" wrote: >> >> A question about the newly added global variable function_counter. >> Does it have to be a global? Can I make it a file static, within macro >> L_gcov_time_profiler? I don't find a use other than in >> __gcov_time_profiler(). >> >> thanks, >> >> -Rong >> >> >> >> On Mon, Nov 11, 2013 at 9:52 AM, Martin Liška >> wrote: >> > On 11 November 2013 15:57, Jan Hubicka wrote: >> >>> +2013-10-29 Martin Liska >> >>> + Jan Hubicka >> >>> >> >>> + >> >>> + * cgraph.c (dump_cgraph_node): Profile dump added. >> >>> + * cgraph.h (struct cgraph_node): New time profile variable >> >>> added. >> >>> + * cgraphclones.c (cgraph_clone_node): Time profile is cloned. >> >>> + * gcov-io.h (gcov_type): New profiler type introduced. >> >>> + * ipa-profile.c (lto_output_node): Streaming for time profile >> >>> added. >> >>> + (input_node): Time profiler is read from LTO stream. >> >>> + * predict.c (maybe_hot_count_p): Hot prediction changed. >> >>> + * profile.c (instrument_values): New case for time profiler >> >>> added. >> >>> + (compute_value_histograms): Read of time profile. >> >>> + * tree-pretty-print.c (dump_function_header): Time profiler is >> >>> dumped. >> >>> + * tree-profile.c (init_ic_make_global_vars): Time profiler >> >>> function added. >> >>> + (gimple_init_edge_profiler): TP function instrumentation. >> >>> + (gimple_gen_time_profiler): New. >> >>> + * value-prof.c (gimple_add_histogram_value): Support for time >> >>> profiler >> >>> + added. >> >>> + (dump_histogram_value): TP type added to dumps. >> >>> + (visit_hist): More sensitive check that takes TP into account. >> >>> + (gimple_find_values_to_profile): TP instrumentation. >> >>> + * value-prof.h (hist_type): New histogram type added. >> >>> + (struct histogram_value_t): Pointer to struct function added. >> >>> + * libgcc/Makefile.in: New GCOV merge function for TP added. >> >>> + * libgcov.c: function_counter variable introduced. >> >>> + (_gcov_merge_time_profile): New. >> >>> + (_gcov_time_profiler): New. >> >>> + >> >>> 2013-11-05 Steven Bosscher >> >>> >> >>> >> >>> diff --git a/gcc/ipa-profile.c b/gcc/ipa-profile.c >> >>> index 1260069..eff4b51 100644 >> >>> --- a/gcc/ipa-profile.c >> >>> +++ b/gcc/ipa-profile.c >> >>> @@ -465,6 +465,7 @@ ipa_propagate_frequency (struct cgraph_node *node) >> >>>if (d.maybe_unlikely_executed) >> >>> { >> >>>node->frequency = NODE_FREQUENCY_UNLIKELY_EXECUTED; >> >>> + node->tp_first_run = 0; >> >>>if (dump_file) >> >>> fprintf (dump_file, "Node %s promoted to unlikely executed.\n", >> >>>cgraph_node_name (node)); >> >> >> >> Why do you put tp_first_run to 0? The function may be unlikely but it >> >> still may have >> >> average possition in the program execution. I.e. when it is executed >> >> once per many >> >> invocations during the train run) >> > >> > That makes no sense, I've removed this assignment. >> > >> >>> @@ -895,9 +907,22 @@ compute_value_histograms (histogram_values >> >>> values, unsigned cfg_checksum, >> >>>hist->hvalue.counters = XNEWVEC (gcov_type, hist->n_counters); >> >>>for (j = 0; j < hist->n_counters; j++) >> >>> if (aact_count) >> >>> - hist->hvalue.counters[j] = aact_count[j]; >> >>> - else >> >>> - hist->hvalue.counters[j] = 0; >> >>> + hist-&
Re: [PATCH] Time profiler - phase 1
A question about the newly added global variable function_counter. Does it have to be a global? Can I make it a file static, within macro L_gcov_time_profiler? I don't find a use other than in __gcov_time_profiler(). thanks, -Rong On Mon, Nov 11, 2013 at 9:52 AM, Martin Liška wrote: > On 11 November 2013 15:57, Jan Hubicka wrote: >>> +2013-10-29 Martin Liska >>> + Jan Hubicka >>> + >>> + * cgraph.c (dump_cgraph_node): Profile dump added. >>> + * cgraph.h (struct cgraph_node): New time profile variable added. >>> + * cgraphclones.c (cgraph_clone_node): Time profile is cloned. >>> + * gcov-io.h (gcov_type): New profiler type introduced. >>> + * ipa-profile.c (lto_output_node): Streaming for time profile added. >>> + (input_node): Time profiler is read from LTO stream. >>> + * predict.c (maybe_hot_count_p): Hot prediction changed. >>> + * profile.c (instrument_values): New case for time profiler added. >>> + (compute_value_histograms): Read of time profile. >>> + * tree-pretty-print.c (dump_function_header): Time profiler is dumped. >>> + * tree-profile.c (init_ic_make_global_vars): Time profiler function >>> added. >>> + (gimple_init_edge_profiler): TP function instrumentation. >>> + (gimple_gen_time_profiler): New. >>> + * value-prof.c (gimple_add_histogram_value): Support for time profiler >>> + added. >>> + (dump_histogram_value): TP type added to dumps. >>> + (visit_hist): More sensitive check that takes TP into account. >>> + (gimple_find_values_to_profile): TP instrumentation. >>> + * value-prof.h (hist_type): New histogram type added. >>> + (struct histogram_value_t): Pointer to struct function added. >>> + * libgcc/Makefile.in: New GCOV merge function for TP added. >>> + * libgcov.c: function_counter variable introduced. >>> + (_gcov_merge_time_profile): New. >>> + (_gcov_time_profiler): New. >>> + >>> 2013-11-05 Steven Bosscher >>> >>> >>> diff --git a/gcc/ipa-profile.c b/gcc/ipa-profile.c >>> index 1260069..eff4b51 100644 >>> --- a/gcc/ipa-profile.c >>> +++ b/gcc/ipa-profile.c >>> @@ -465,6 +465,7 @@ ipa_propagate_frequency (struct cgraph_node *node) >>>if (d.maybe_unlikely_executed) >>> { >>>node->frequency = NODE_FREQUENCY_UNLIKELY_EXECUTED; >>> + node->tp_first_run = 0; >>>if (dump_file) >>> fprintf (dump_file, "Node %s promoted to unlikely executed.\n", >>>cgraph_node_name (node)); >> >> Why do you put tp_first_run to 0? The function may be unlikely but it still >> may have >> average possition in the program execution. I.e. when it is executed once >> per many >> invocations during the train run) > > That makes no sense, I've removed this assignment. > >>> @@ -895,9 +907,22 @@ compute_value_histograms (histogram_values values, >>> unsigned cfg_checksum, >>>hist->hvalue.counters = XNEWVEC (gcov_type, hist->n_counters); >>>for (j = 0; j < hist->n_counters; j++) >>> if (aact_count) >>> - hist->hvalue.counters[j] = aact_count[j]; >>> - else >>> - hist->hvalue.counters[j] = 0; >>> + hist->hvalue.counters[j] = aact_count[j]; >>> +else >>> + hist->hvalue.counters[j] = 0; >>> + >>> + /* Time profiler counter is not related to any statement, >>> + * so that we have to read the counter and set the value to >>> + * the corresponding call graph node. */ >> >> Reformat comment to GNU style. > > Yes. > >>> +2013-10-28 Martin Liska >>> + >>> + * gcc.dg/time-profiler-1.c: New test. >>> + * gcc.dg/time-profiler-2.c: Ditto. >>> + >> It is custom to put all changelogs to the beggining of your patch (just >> nitpicking ;)) >>> @@ -222,6 +225,18 @@ gimple_init_edge_profiler (void) >>> = tree_cons (get_identifier ("leaf"), NULL, >>>DECL_ATTRIBUTES (tree_indirect_call_profiler_fn)); >>> >>> + /* void (*) (gcov_type *, gcov_type, void *) */ >>> + time_profiler_fn_type >>> += build_function_type_list (void_type_node, >>> + gcov_type_ptr, NULL_TREE); >>> + tree_time_profiler_fn >>> + = build_fn_decl ("__gcov_time_profiler", >>> + time_profiler_fn_type); >>> + TREE_NOTHROW (tree_time_profiler_fn) = 1; >>> + DECL_ATTRIBUTES (tree_time_profiler_fn) >>> + = tree_cons (get_identifier ("leaf"), NULL, >>> + DECL_ATTRIBUTES (tree_time_profiler_fn)); >> >> We should udpate this code to use set_call_expr_flags but by incremental >> patch. > > OK, I will include this change to the phase 2. > >> The patch is OK with changes above. Do you have write permissin to commit >> it? >> If not, just send updated version and I will commit it + follow the write >> access >> instructions on gcc.gnu.org homepage. > > Yes, I do have commit right. I will bootstrap the pa
Re: [RFC] libgcov.c re-factoring and offline profile-tool
On Mon, Nov 11, 2013 at 7:17 AM, Jan Hubicka wrote: >> Here is the patch that includes profile-tool. >> Profile-tool now has two functions: merge and rewrite. I'll add diff later. >> >> Compiler is tested with spec2006 and profiledbootstrap. >> profile-tool is tested with spec2006 profiles. > > Hi, > it would be nice if you could elaborate bit more on the tool and what it does. > I plan to implement relink only instrumentatino for LTO and for that I will > need > profile reading/handling that is independent on functio nbodies, so perhaps > we can unify the infrastructure. My plans always was to merge the gcov and > profile.c profile handling logic and make it bit more generic. This tool is designed to manipulate raw profile counters. We plan to have the following functionality: (1) merge the gcda file with weights. We have fairly complicated build system. Sometime online merging in current libgcov.a not feasible (or not desirable). We need offline tool the merge the FDO profiles. The program may have multiple training inputs with different importance -- this is where weighted merging comes from. (2) diff the profile. We want to know how different of two profile data. Some tolerance of mismatch is required here. This is crucial for performance triaging. (3) rewrite the gcda file. We can do some (user-direct) fix-up work in the rewrite. This is may not be important for FDO. But this will be very useful for LIPO (like fix-up the indirect-target due to mismatch, rewrite module-grouping) (4) better dumping: like dump top-n-hottest functions/objects/indirectly-targets etc. This tool reconstructs the gcov_info list from gcda files and then uses significant part of libgcov.c code: merge, compute summary/histo-gram, and dump gcda files etc. The merging into current libgcov can also use this piece of code to do the in-memory merge. The advantage here is we will have precise histro-gram. > > Also I think profile-tool is a bit generic name. I.e. it does not realy say it > is related to GCC's coverage profiling. Maybe gcc-profile-tool or > gcov-profile-tool may be better? Yes. gcov-profile-tool seems to be better, or simply gcov-tool? > Perhaps an resonable option would also be to bundle it into gcov binary... Bundling into gcov is an interesting idea. I do plan to read gcno file to extract more information (at least the function name). Bungling into gcda means the tool can work on bb level, rather on the raw counters. Let me think a bit more on this. >> Index: libgcc/libgcov-tool.c >> === >> --- libgcc/libgcov-tool.c (revision 0) >> +++ libgcc/libgcov-tool.c (revision 0) >> @@ -0,0 +1,800 @@ >> +/* GCC instrumentation plugin for ThreadSanitizer. >> + Copyright (C) 2011-2013 Free Software Foundation, Inc. >> + Contributed by Dmitry Vyukov > > You need to update this... > Why there needs to be libgcov changes? David pointed out the copyright issue in an earlier email. I fixed it. The changes in libgcov are (1) ifdef the target headers. I need all the libgcov functionality to read/write the gcda files, as well as summary and histo-gram. But the binary is for host machines. (2) change the merge function so that I can take an in-memory gcov-info as input. > > Honza
Re: [RFC] libgcov.c re-factoring and offline profile-tool
On Mon, Nov 11, 2013 at 7:02 AM, Jan Hubicka wrote: >> 2013-11-04 Rong Xu >> >> * libgcc/libgcov.c: Delete as part of re-factoring. >> * libgcc/libgcov-profiler.c (__gcov_interval_profiler): Moved from >> libgcov.c >> (__gcov_pow2_profiler): Ditto. >> (__gcov_one_value_profiler_body): Ditto. >> (__gcov_one_value_profiler): Ditto. >> (__gcov_indirect_call_profiler): Ditto. >> (__gcov_indirect_call_profiler_v2): Ditto. >> (__gcov_average_profiler): Ditto. >> (__gcov_ior_profiler): Ditto. >> * libgcc/libgcov-driver.c (this_prg): Make it file scope >> static variable. >> (all_prg): Ditto. >> (crc32): Ditto. >> (gi_filename): Ditto. >> (fn_buffer): Ditto. >> (sum_buffer): Ditto. >> (struct gcov_filename_aux): New types to store auxiliary information >> for gi_filename. >> (gcov_version): Moved from libgcov.c. >> (crc32_unsigned): Ditto. >> (gcov_histogram_insert): Ditto. >> (gcov_compute_histogram): Ditto. >> (gcov_exit): Ditto. >> (gcov_clear): Ditto. >> (__gcov_init): Ditto. >> (gcov_exit_compute_summary): New function split from gcov_exit(). >> (gcov_exit_merge_gcda): Ditto. >> (gcov_exit_write_gcda): Ditto. >> (gcov_exit_dump_gcov): Ditto. >> * libgcc/libgcov-interface.c (init_mx): Moved from libgcov.c. >> (init_mx_once): Ditto. >> (__gcov_flush): Ditto. >> (__gcov_reset): Ditto. >> (__gcov_dump): Ditto. >> (__gcov_fork): Ditto. >> (__gcov_execl): Ditto. >> (__gcov_execlp): Ditto. >> (__gcov_execle): Ditto. >> (__gcov_execv): Ditto. >> (__gcov_execvp): Ditto. >> (__gcov_execve): Ditto. >> * libgcc/libgcov-driver-system.c (gcov_error): New utility function. >> (allocate_filename_struct): New function split from gcov_exit(). >> (gcov_exit_open_gcda_file): Ditto. >> (create_file_directory): Moved from libgcov.c. >> * libgcc/libgcov-merge.c: >> (__gcov_merge_add): Moved from libgcov.c. >> (__gcov_merge_ior): Ditto. >> (__gcov_merge_single): Ditto. >> (__gcov_merge_delta): Ditto. >> * libgcc/Makefile.in: Change to build newly added files. >> * gcc/gcov-io.h (__gcov_merge_ior): Add the decl to avoid warning. >> > > Hi, > the patch looks resonable. I take your word on the fact that there are no > changes > in the code, I did not read it all ;))) We did split gcov_exit() into more functions. But, semantically the code is the same. >> Index: libgcc/Makefile.in >> === >> --- libgcc/Makefile.in(revision 204285) >> +++ libgcc/Makefile.in(working copy) >> @@ -853,17 +853,37 @@ include $(iterator) >> # Build libgcov components. >> >> # Defined in libgcov.c, included only in gcov library >> -LIBGCOV = _gcov _gcov_merge_add _gcov_merge_single _gcov_merge_delta \ >> -_gcov_fork _gcov_execl _gcov_execlp _gcov_execle \ >> -_gcov_execv _gcov_execvp _gcov_execve _gcov_reset _gcov_dump \ >> -_gcov_interval_profiler _gcov_pow2_profiler _gcov_one_value_profiler \ >> +##LIBGCOV = _gcov _gcov_merge_add _gcov_merge_single _gcov_merge_delta \ >> +##_gcov_fork _gcov_execl _gcov_execlp _gcov_execle \ >> +##_gcov_execv _gcov_execvp _gcov_execve _gcov_reset _gcov_dump \ >> +##_gcov_interval_profiler _gcov_pow2_profiler _gcov_one_value_profiler \ >> +##_gcov_indirect_call_profiler _gcov_average_profiler >> _gcov_ior_profiler \ >> +##_gcov_merge_ior _gcov_indirect_call_profiler_v2 > > Probably no need for this commnet block. Yes. I ready remove in the most recently patch. >> + >> +LIBGCOV_MERGE = _gcov_merge_add _gcov_merge_single _gcov_merge_delta >> _gcov_merge_ior >> +LIBGCOV_PROFILER = _gcov_interval_profiler _gcov_pow2_profiler >> _gcov_one_value_profiler \ >> _gcov_indirect_call_profiler _gcov_average_profiler _gcov_ior_profiler \ >> -_gcov_merge_ior _gcov_indirect_call_profiler_v2 >> +_gcov_indirect_call_profiler_v2 >> +LIBGCOV_INTERFACE = _gcov_flush _gcov_fork _gcov_execl _gcov_execlp >> _gcov_execle \ >> +_gcov_execv _gcov_execvp _gcov_execve _gcov_reset _gcov_dump >> +LIBGCOV_DRIVER = _gcov >> >> -libgcov-objects = $(patsubst %,%$(objext),$(LIBGCOV)) >> +libgcov-merge-objects = $(patsubst %,%$(objext),$(LIBGCOV_MERGE
Re: [RFC] libgcov.c re-factoring and offline profile-tool
On Fri, Nov 8, 2013 at 11:30 AM, Xinliang David Li wrote: > On Fri, Nov 8, 2013 at 11:21 AM, Rong Xu wrote: >> On Fri, Nov 8, 2013 at 11:02 AM, Xinliang David Li >> wrote: >>> On Fri, Nov 8, 2013 at 10:48 AM, Rong Xu wrote: >>>> On Fri, Nov 8, 2013 at 10:22 AM, Xinliang David Li >>>> wrote: >>>>> in libgcov-driver.c >>>>> >>>>>> /* Flag when the profile has already been dumped via __gcov_dump(). */ >>>>>> static int gcov_dump_complete; >>>>> >>>>>> inline void >>>>>> set_gcov_dump_complete (void) >>>>>>{ >>>>> > gcov_dump_complete = 1; >>>>>>} >>>>> >>>>>>inline void >>>>>>reset_gcov_dump_complete (void) >>>>>>{ >>>>>> gcov_dump_complete = 0; >>>>>>} >>>>> >>>>> >>>>> These should be moved to libgcov-interface.c. Is there reason to not >>>>> mark them as static? >>>> >>>> gcov_dump_complete is used in gcov_exit() which is in >>>> libgcov-driver.c, but it's set in __gcov_dump and __gcov_reset, both >>>> in libgcov-interface.c. >>>> I want gcov_dump_complete a static. So I have to add to global >>>> functions that accessible from libgcov-interface.c. >>>> Another choice is to move __gcov_dump and __gcov_reset to >>>> libgcov-driver.c and that will simplify the code. >>>> >>> >>> Ok then you should remove the 'inline' keyword for the set and reset >>> functions and keep them in libgcov-driver.c. >> >> Will remove 'inline' keyword. >> >>> >>>>> >>>>> in libgcov-profiler.c, line 142 >>>>> >>>>>> #if defined(HAVE_CC_TLS) && !defined (USE_EMUTLS) >>>>>> __thread >>>>> >>>>> trailing whilte space. >>>>> >>>> >>>> Will fix. >>>> >>>>> >>>>>> || (VTABLE_USES_DESCRIPTORS && __gcov_indirect_call_callee >>>>>> && *(void **) cur_func == *(void **) >>>>>> __gcov_indirect_call_callee)) >>>>> >>>>> trailing white space. >>>>> >>>> >>>> Will fix. >>>> >>>>> >>>>> in libgcov-merge.c >>>>> >>>>> 1) I don't think you need in_mem flag. For in memory merge, the source != >>>>> NULL. >>>>> 2) document the new source and weight parameter -- especially the weight. >>>> >>>> Will do. >>>> >>>>> 3) How do you deal with integer overflow ? >>>> >>>> This is good question. gcvo_type is (typically) long long. I thought >>>> the count value will not be nearly close to the max of long long. >>>> (We did see overflow in compiler, but that's because of the scaling -- >>>> some of the scaling factors are really large.) >>>> >>> >>> Why not passing weight as a scaled PERCENT (as branch prob) for the >>> source? THis way, the merge tool does not need to do scaling twice. >>> >> >> there are two weights, so unless you have two weights in the merge_add >> functions, you have to >> traverse the list twice. Do you mean pass addition parameters? >> >> Currently, merge tools is done this way: >> merge (p1, p2, w1, w2) >> first pass: merge_add (p1, p1, w1) >> second pass: merge_add (p1, p2, w2) >> > > Only need to pass the normalized the weight (in percent) for one > profile source -- say norm_w2 : w2/(w1+w2), and the merge function can > do this in one pass as norm_w1 = 1-norm_w2. > > >> I initially had both weights in merge_add function. but later I found that >> to deal with mis-match profile, I need to find out the gcov_info in p1 >> but not p2, (they need to multiply by w1 only). >> So I choose the above structure. >> >> Another thing about the PERCENT is the inaccuracy for floating points. > > This is how profile update work in profile-use compilation -- so that > should not be a big problem. > > Before you fix this, I'd like to hear what Honza and other reviewers think. > > thanks, > > David > OK. Let's get more comments on this before I re-work on this. I also want to point out that the traversal the
Re: [RFC] libgcov.c re-factoring and offline profile-tool
On Fri, Nov 8, 2013 at 11:02 AM, Xinliang David Li wrote: > On Fri, Nov 8, 2013 at 10:48 AM, Rong Xu wrote: >> On Fri, Nov 8, 2013 at 10:22 AM, Xinliang David Li >> wrote: >>> in libgcov-driver.c >>> >>>> /* Flag when the profile has already been dumped via __gcov_dump(). */ >>>> static int gcov_dump_complete; >>> >>>> inline void >>>> set_gcov_dump_complete (void) >>>>{ >>> > gcov_dump_complete = 1; >>>>} >>> >>>>inline void >>>>reset_gcov_dump_complete (void) >>>>{ >>>> gcov_dump_complete = 0; >>>>} >>> >>> >>> These should be moved to libgcov-interface.c. Is there reason to not >>> mark them as static? >> >> gcov_dump_complete is used in gcov_exit() which is in >> libgcov-driver.c, but it's set in __gcov_dump and __gcov_reset, both >> in libgcov-interface.c. >> I want gcov_dump_complete a static. So I have to add to global >> functions that accessible from libgcov-interface.c. >> Another choice is to move __gcov_dump and __gcov_reset to >> libgcov-driver.c and that will simplify the code. >> > > Ok then you should remove the 'inline' keyword for the set and reset > functions and keep them in libgcov-driver.c. Will remove 'inline' keyword. > >>> >>> in libgcov-profiler.c, line 142 >>> >>>> #if defined(HAVE_CC_TLS) && !defined (USE_EMUTLS) >>>> __thread >>> >>> trailing whilte space. >>> >> >> Will fix. >> >>> >>>> || (VTABLE_USES_DESCRIPTORS && __gcov_indirect_call_callee >>>> && *(void **) cur_func == *(void **) __gcov_indirect_call_callee)) >>> >>> trailing white space. >>> >> >> Will fix. >> >>> >>> in libgcov-merge.c >>> >>> 1) I don't think you need in_mem flag. For in memory merge, the source != >>> NULL. >>> 2) document the new source and weight parameter -- especially the weight. >> >> Will do. >> >>> 3) How do you deal with integer overflow ? >> >> This is good question. gcvo_type is (typically) long long. I thought >> the count value will not be nearly close to the max of long long. >> (We did see overflow in compiler, but that's because of the scaling -- >> some of the scaling factors are really large.) >> > > Why not passing weight as a scaled PERCENT (as branch prob) for the > source? THis way, the merge tool does not need to do scaling twice. > there are two weights, so unless you have two weights in the merge_add functions, you have to traverse the list twice. Do you mean pass addition parameters? Currently, merge tools is done this way: merge (p1, p2, w1, w2) first pass: merge_add (p1, p1, w1) second pass: merge_add (p1, p2, w2) I initially had both weights in merge_add function. but later I found that to deal with mis-match profile, I need to find out the gcov_info in p1 but not p2, (they need to multiply by w1 only). So I choose the above structure. Another thing about the PERCENT is the inaccuracy for floating points. I have the scaling function in rewrite sub-command mainly for debug purpose: I merge the same profile with a weight 1:9. Then I scale the result profile by 0.1. I was expecting identical profile as the source. But due to floating point inaccuracy, some counters are off slightly. > David > > > >>> >>> David >>> >>> >>> >>> >>> >>> >>> On Thu, Nov 7, 2013 at 3:34 PM, Rong Xu wrote: >>>> On Thu, Nov 7, 2013 at 9:40 AM, Joseph S. Myers >>>> wrote: >>>>> On Thu, 7 Nov 2013, Rong Xu wrote: >>>>> >>>>>> Thanks Joseph for these detailed comments/suggestions. >>>>>> The fixed patch is attached to this email. >>>>>> The only thing left out is the Texinfo manual. Do you mean this tool >>>>>> should have its it's own texi file in gcc/doc? >>>>> >>>>> Its own texi file, probably included as a chapter by gcc.texi (like >>>>> gcov.texi is). >>>> >>>> OK. will add this in. >>>> >>>> My last patch that changes the command handling is actually broken. >>>> Please ignore that patch and use the one in this email. >>>> >>>> Also did some minor adjust of the code in libgcov. >>>> >>>> Thanks, >>>> >>>> -Rong >>>> >>>> >>>>> >>>>> -- >>>>> Joseph S. Myers >>>>> jos...@codesourcery.com
Re: [RFC] libgcov.c re-factoring and offline profile-tool
A question about inhibit_libc. When inhibit_libc is defined, we provide dummy functions for all the __gcov_* methods. Is it purposely to minimize the footprint? I'm thinking to allow some codes that are independent of libc (and its headers) under this. Is it OK? -Rong On Fri, Nov 8, 2013 at 10:48 AM, Rong Xu wrote: > On Fri, Nov 8, 2013 at 10:22 AM, Xinliang David Li wrote: >> in libgcov-driver.c >> >>> /* Flag when the profile has already been dumped via __gcov_dump(). */ >>> static int gcov_dump_complete; >> >>> inline void >>> set_gcov_dump_complete (void) >>>{ >> > gcov_dump_complete = 1; >>>} >> >>>inline void >>>reset_gcov_dump_complete (void) >>>{ >>> gcov_dump_complete = 0; >>>} >> >> >> These should be moved to libgcov-interface.c. Is there reason to not >> mark them as static? > > gcov_dump_complete is used in gcov_exit() which is in > libgcov-driver.c, but it's set in __gcov_dump and __gcov_reset, both > in libgcov-interface.c. > I want gcov_dump_complete a static. So I have to add to global > functions that accessible from libgcov-interface.c. > Another choice is to move __gcov_dump and __gcov_reset to > libgcov-driver.c and that will simplify the code. > >> >> in libgcov-profiler.c, line 142 >> >>> #if defined(HAVE_CC_TLS) && !defined (USE_EMUTLS) >>> __thread >> >> trailing whilte space. >> > > Will fix. > >> >>> || (VTABLE_USES_DESCRIPTORS && __gcov_indirect_call_callee >>> && *(void **) cur_func == *(void **) __gcov_indirect_call_callee)) >> >> trailing white space. >> > > Will fix. > >> >> in libgcov-merge.c >> >> 1) I don't think you need in_mem flag. For in memory merge, the source != >> NULL. >> 2) document the new source and weight parameter -- especially the weight. > > Will do. > >> 3) How do you deal with integer overflow ? > > This is good question. gcvo_type is (typically) long long. I thought > the count value will not be nearly close to the max of long long. > (We did see overflow in compiler, but that's because of the scaling -- > some of the scaling factors are really large.) > >> >> David >> >> >> >> >> >> >> On Thu, Nov 7, 2013 at 3:34 PM, Rong Xu wrote: >>> On Thu, Nov 7, 2013 at 9:40 AM, Joseph S. Myers >>> wrote: >>>> On Thu, 7 Nov 2013, Rong Xu wrote: >>>> >>>>> Thanks Joseph for these detailed comments/suggestions. >>>>> The fixed patch is attached to this email. >>>>> The only thing left out is the Texinfo manual. Do you mean this tool >>>>> should have its it's own texi file in gcc/doc? >>>> >>>> Its own texi file, probably included as a chapter by gcc.texi (like >>>> gcov.texi is). >>> >>> OK. will add this in. >>> >>> My last patch that changes the command handling is actually broken. >>> Please ignore that patch and use the one in this email. >>> >>> Also did some minor adjust of the code in libgcov. >>> >>> Thanks, >>> >>> -Rong >>> >>> >>>> >>>> -- >>>> Joseph S. Myers >>>> jos...@codesourcery.com
Re: [RFC] libgcov.c re-factoring and offline profile-tool
On Fri, Nov 8, 2013 at 10:22 AM, Xinliang David Li wrote: > in libgcov-driver.c > >> /* Flag when the profile has already been dumped via __gcov_dump(). */ >> static int gcov_dump_complete; > >> inline void >> set_gcov_dump_complete (void) >>{ > > gcov_dump_complete = 1; >>} > >>inline void >>reset_gcov_dump_complete (void) >>{ >> gcov_dump_complete = 0; >>} > > > These should be moved to libgcov-interface.c. Is there reason to not > mark them as static? gcov_dump_complete is used in gcov_exit() which is in libgcov-driver.c, but it's set in __gcov_dump and __gcov_reset, both in libgcov-interface.c. I want gcov_dump_complete a static. So I have to add to global functions that accessible from libgcov-interface.c. Another choice is to move __gcov_dump and __gcov_reset to libgcov-driver.c and that will simplify the code. > > in libgcov-profiler.c, line 142 > >> #if defined(HAVE_CC_TLS) && !defined (USE_EMUTLS) >> __thread > > trailing whilte space. > Will fix. > >> || (VTABLE_USES_DESCRIPTORS && __gcov_indirect_call_callee >> && *(void **) cur_func == *(void **) __gcov_indirect_call_callee)) > > trailing white space. > Will fix. > > in libgcov-merge.c > > 1) I don't think you need in_mem flag. For in memory merge, the source != > NULL. > 2) document the new source and weight parameter -- especially the weight. Will do. > 3) How do you deal with integer overflow ? This is good question. gcvo_type is (typically) long long. I thought the count value will not be nearly close to the max of long long. (We did see overflow in compiler, but that's because of the scaling -- some of the scaling factors are really large.) > > David > > > > > > > On Thu, Nov 7, 2013 at 3:34 PM, Rong Xu wrote: >> On Thu, Nov 7, 2013 at 9:40 AM, Joseph S. Myers >> wrote: >>> On Thu, 7 Nov 2013, Rong Xu wrote: >>> >>>> Thanks Joseph for these detailed comments/suggestions. >>>> The fixed patch is attached to this email. >>>> The only thing left out is the Texinfo manual. Do you mean this tool >>>> should have its it's own texi file in gcc/doc? >>> >>> Its own texi file, probably included as a chapter by gcc.texi (like >>> gcov.texi is). >> >> OK. will add this in. >> >> My last patch that changes the command handling is actually broken. >> Please ignore that patch and use the one in this email. >> >> Also did some minor adjust of the code in libgcov. >> >> Thanks, >> >> -Rong >> >> >>> >>> -- >>> Joseph S. Myers >>> jos...@codesourcery.com
Re: [RFC] libgcov.c re-factoring and offline profile-tool
BTW, this part of patch only does the re-factoring work. I haven't finished the porting of profile-tool part of patch to the trunk. They should be in a separate patch anyway. -Rong On Mon, Nov 4, 2013 at 2:43 PM, Rong Xu wrote: > Honza, Thanks for the comments! > Attached is the patch. > Passed spec2006 fdo build and profiledbootstrap in gcc. > I'm doing some other tests. > > -Rong > > On Mon, Nov 4, 2013 at 1:55 AM, Jan Hubicka wrote: >>> Thanks! Can you attach the patch file for the proposed change? >>> >>> David >>> >>> On Fri, Nov 1, 2013 at 2:08 PM, Rong Xu wrote: >>> > libgcov.c is the main file to generate libgcov.a. This single source >>> > generates >>> > 21 object files and then archives to a library. These objects are of >>> > very different purposes but they are in the same file guarded by various >>> > macros. >>> > The source file becomes quite large and its readability becomes very >>> > low. In its current state: >>> > 1) The code is very hard to understand by new developers; >> >> Agreed, libgcov has became hard to maintain since several parts are written >> in very low-level way. >>> > 2) It makes regressions more likely when making simple changes; >>> > More importantly, >>> > 3) it makes code reuse/sharing very difficult. For instance, Using >>> > libgcov to implement an offline profile tool (see below); kernel FDO >>> > support etc. >> >> Yep, it was my longer term plan to do this, too. >>> > >>> > We came to this issue when working on an offline tool to handle >>> > profile counters. >>> > Our plan is to have a profile-tool can merge, diff, collect statistics, >>> > and >>> > better dump raw profile counters. This is one of the most requested >>> > features >>> > from out internal users. This tool can also be very useful for any >>> > FDO/coverage >>> > users. In the first part of tool, we want to have the weight >>> > merge. Again, to reuse the code, we have to change libgcov.c. But we >>> > don't want >>> > to further diverge libgcov.a in our google branches from the trunk. >>> > >>> > Another issue in libgcov.c is function gcov_exit(). It is a huge function >>> > with about 467 lines of code. This function traverses gcov_list and dumps >>> > all >>> > the gcda files. The code for merging the gcda files also sits in the same >>> > function. The structure makes the code-reuse and extension really >>> > difficult >>> > (like in kernel FDO, we have to break this function to reuse some of the >>> > code). >>> > >>> > We propose to break gcov_exit into smaller functions and split libgcov.c >>> > into >>> > several files. Our plan for profile-tool is listed in (3). >>> > >>> > 1. break gcov_exit() into the following structure: >>> >gcov_exit() >>> > --> gcov_exit_compute_summary() >>> > --> allocate_filename_struct() >>> > for gi_ptr in gcov_list >>> > --> gcov_exit_dump_gcov() >>> >--> gcov_exit_open_gcda_file() >>> >--> gcov_exit_merge_gcda () >>> >--> gcov_exit_write_gcda () >> >> Just a short comment that summaries are also produced during the merging - >> i.e. they are >> done on per-file basis. But otherwise I agree. >> We should be cureful to not increase footprint of libgcov much for embedded >> users, but >> I believe changes like this can be done w/o additional overhead in the >> optimized library. > > The merge of summary has two parts. One is to find the summary to > merge to. That is in gcov_exit_merge_gcda(). > The other part of do the actual merging of summary is in > gcov_exit_dump_gcov(). > > I tried to keep the current work flow. So except for a few more calls, > I don't think there is much overhead. > >>> > >>> > 2. split libgcov.c into the following files: >>> > libgcov-profiler.c >>> > libgcov-merge.c >>> > libgcov-interface.c >>> > libgcov-driver.c >>> >libgcov-driver-system.c (source included into libgcov-driver.c) >> >> Seems resonable. Splitting i/o stuff into separate module (driver) is >> something I p
[RFC] libgcov.c re-factoring and offline profile-tool
libgcov.c is the main file to generate libgcov.a. This single source generates 21 object files and then archives to a library. These objects are of very different purposes but they are in the same file guarded by various macros. The source file becomes quite large and its readability becomes very low. In its current state: 1) The code is very hard to understand by new developers; 2) It makes regressions more likely when making simple changes; More importantly, 3) it makes code reuse/sharing very difficult. For instance, Using libgcov to implement an offline profile tool (see below); kernel FDO support etc. We came to this issue when working on an offline tool to handle profile counters. Our plan is to have a profile-tool can merge, diff, collect statistics, and better dump raw profile counters. This is one of the most requested features from out internal users. This tool can also be very useful for any FDO/coverage users. In the first part of tool, we want to have the weight merge. Again, to reuse the code, we have to change libgcov.c. But we don't want to further diverge libgcov.a in our google branches from the trunk. Another issue in libgcov.c is function gcov_exit(). It is a huge function with about 467 lines of code. This function traverses gcov_list and dumps all the gcda files. The code for merging the gcda files also sits in the same function. The structure makes the code-reuse and extension really difficult (like in kernel FDO, we have to break this function to reuse some of the code). We propose to break gcov_exit into smaller functions and split libgcov.c into several files. Our plan for profile-tool is listed in (3). 1. break gcov_exit() into the following structure: gcov_exit() --> gcov_exit_compute_summary() --> allocate_filename_struct() for gi_ptr in gcov_list --> gcov_exit_dump_gcov() --> gcov_exit_open_gcda_file() --> gcov_exit_merge_gcda () --> gcov_exit_write_gcda () 2. split libgcov.c into the following files: libgcov-profiler.c libgcov-merge.c libgcov-interface.c libgcov-driver.c libgcov-driver-system.c (source included into libgcov-driver.c) The details of the splitting: libgcov-interface.c /* This file contains API functions to other programs and the supporting functions. */ __gcov_dump() __gcov_execl() __gcov_execle() __gcov_execlp() __gcov_execv() __gcov_execve() __gcov_execvp() __gcov_flush() __gcov_fork() __gcov_reset() init_mx() init_mx_once() libgcov-profiler.c /* This file contains runtime profile handlers. */ variables: __gcov_indirect_call_callee __gcov_indirect_call_counters functions: __gcov_average_profiler() __gcov_indirect_call_profiler() __gcov_indirect_call_profiler_v2() __gcov_interval_profiler() __gcov_ior_profiler() __gcov_one_value_profiler() __gcov_one_value_profiler_body() __gcov_pow2_profiler() libgcov-merge.c /* This file contains the merge functions for various counters. */ functions: __gcov_merge_add() __gcov_merge_delta() __gcov_merge_ior() __gcov_merge_single() libcov-driver.c /* This file contains the gcov dumping functions. We separate the system dependent part to libgcov-driver-system.c. */ variables: gcov_list gcov_max_filename gcov_dump_complete -- newly added file static variables -- this_prg all_prg crc32 gi_filename fn_buffer fn_tail sum_buffer next_sum_buffer sum_tail - end - functions: free_fn_data() buffer_fn_data() crc32_unsigned() gcov_version() gcov_histogram_insert() gcov_compute_histogram() gcov_clear() __gcov_init() gcov_exit() --- newly added static functions -- gcov_exit_compute_summary () gcov_exit_merge_gcda() gcov_exit_write_gcda() gcov_exit_dump_gcov() - end - libgcov-driver-system.c /* system dependent part of ligcov-driver.c. */ functions: create_file_directory() --- newly added static functions -- gcov_error() /* This replaces all fprintf(stderr, ...) */ allocate_filename_struct() gcov_exit_open_gcda_file() 3. add offline profile-tool support. We will change the interface of merge functions to make it take in-memory gcov_info list, and a weight as the arguments. We will add libgcov-tool.c. It has two APIs: (1) read_profile_dir(): read a profile directory and reconstruct the gcov_info link list in-memory. (2) merge_profiles(): merge two in-memory gcov_info link list. We also add profile-tool.c in gcc directory. It will source-include libgcov-tool.c and build a host binary. (We don't do library style because that will need a hard dependence from gcc to libgcc). profile-tool.c will mainly handle user options and the write-out of gcov-info link list. Some changes in gcov-io.[ch] will be also needed. Th
[google gcc-4_8] write ggc_memory to gcda file
Hi, This patch writes out ggc_memory to gcda files. Google branches only. Test is ongoing. OK to check-in after test passes? -Rong 2013-10-25 Rong Xu * contrib/profile_tool (ModuleInfo): write out ggc_memory to gcda file. * gcc/gcov-io.c (gcov_read_module_info): Ditto. * libgcc/dyn-ipa.c (gcov_write_module_info): Ditto. Index: contrib/profile_tool === --- contrib/profile_tool(revision 204080) +++ contrib/profile_tool(working copy) @@ -674,6 +674,7 @@ class ModuleInfo(DataObject): self.module_id = reader.ReadWord() self.is_primary = reader.ReadWord() self.flags = reader.ReadWord() +self.ggc_memory = reader.ReadWord() self.language = reader.ReadWord() self.num_quote_paths = reader.ReadWord() self.num_bracket_paths = reader.ReadWord() @@ -710,6 +711,7 @@ class ModuleInfo(DataObject): writer.WriteWord(self.is_primary) writer.WriteWord(self.flags) writer.WriteWord(self.language) +writer.WriteWord(self.ggc_memory) writer.WriteWord(self.num_quote_paths) writer.WriteWord(self.num_bracket_paths) writer.WriteWord(self.num_system_paths) Index: gcc/gcov-io.c === --- gcc/gcov-io.c (revision 204080) +++ gcc/gcov-io.c (working copy) @@ -591,14 +591,15 @@ gcov_read_module_info (struct gcov_module_info *mo mod_info->ident = gcov_read_unsigned (); mod_info->is_primary = gcov_read_unsigned (); mod_info->flags = gcov_read_unsigned (); - mod_info->lang = gcov_read_unsigned (); + mod_info->lang = gcov_read_unsigned (); + mod_info->ggc_memory = gcov_read_unsigned (); mod_info->num_quote_paths = gcov_read_unsigned (); mod_info->num_bracket_paths = gcov_read_unsigned (); mod_info->num_system_paths = gcov_read_unsigned (); mod_info->num_cpp_defines = gcov_read_unsigned (); mod_info->num_cpp_includes = gcov_read_unsigned (); mod_info->num_cl_args = gcov_read_unsigned (); - len -= 10; + len -= 11; filename_len = gcov_read_unsigned (); mod_info->da_filename = (char *) xmalloc (filename_len * Index: libgcc/dyn-ipa.c === --- libgcc/dyn-ipa.c(revision 204080) +++ libgcc/dyn-ipa.c(working copy) @@ -2095,7 +2095,7 @@ gcov_write_module_info (const struct gcov_info *mo len += 1; /* Each string is lead by a length. */ } - len += 10; /* 9 more fields */ + len += 11; /* 11 more fields */ gcov_write_tag_length (GCOV_TAG_MODULE_INFO, len); gcov_write_unsigned (module_info->ident); @@ -2104,6 +2104,7 @@ gcov_write_module_info (const struct gcov_info *mo SET_MODULE_INCLUDE_ALL_AUX (module_info); gcov_write_unsigned (module_info->flags); gcov_write_unsigned (module_info->lang); + gcov_write_unsigned (module_info->ggc_memory); gcov_write_unsigned (module_info->num_quote_paths); gcov_write_unsigned (module_info->num_bracket_paths); gcov_write_unsigned (module_info->num_system_paths);
Re: [GOOGLE] Check if varpool node exist for decl before checking if it's from auxiliary module.
seems fine to me for google branches. -Rong On Tue, Oct 22, 2013 at 10:51 AM, Dehao Chen wrote: > This is fixing a LIPO bug when there -fexception is on. > > When compilation is finished, compile_file calls > dw2_output_indirect_constants, which may generate decls like > DW.ref.__gxx_personality_v0 (generated in > dw2_output_indirect_constant_1). This function is a global function, > but does not have associated varpool node. So original code will > segfault when checking if the varpool node is from auxiliary module. > > Verified that the segfault is fixed with the patch. Regression test on-going. > > OK for google-4_8 branch if reg test is green? > > Thanks, > Dehao > > Index: gcc/varasm.c > === > --- gcc/varasm.c (revision 203910) > +++ gcc/varasm.c (working copy) > @@ -1484,7 +1484,7 @@ notice_global_symbol (tree decl) >if (L_IPO_COMP_MODE >&& ((TREE_CODE (decl) == FUNCTION_DECL > && cgraph_is_auxiliary (decl)) > - || (TREE_CODE (decl) == VAR_DECL > + || (TREE_CODE (decl) == VAR_DECL && varpool_get_node (decl) >&& varpool_is_auxiliary (varpool_get_node (decl) > return;
[google gcc-4_8] LIPO: insert static vars to varpool
Hi, For google gcc-4_8 branch only. This is fixes the NULL pointer dereference in copy_tree_r due to empty varpool_node returned. Passed the ICE compilation. Other tests are ongoing. Thanks, -Rong 2013-10-14 Rong Xu * cp/semantics.c (finish_compound_literal): Put static var to varpool. This is for LIPO only. * cp/call.c (make_temporary_var_for_ref_to_temp): Ditto. Index: cp/semantics.c === --- cp/semantics.c (revision 203471) +++ cp/semantics.c (working copy) @@ -2486,6 +2486,10 @@ finish_compound_literal (tree type, tree compound_ decl = pushdecl_top_level (decl); DECL_NAME (decl) = make_anon_name (); SET_DECL_ASSEMBLER_NAME (decl, DECL_NAME (decl)); + /* Capture the current module info for statics. */ + if (L_IPO_COMP_MODE) +varpool_node_for_decl (decl); + return decl; } else Index: cp/call.c === --- cp/call.c (revision 203471) +++ cp/call.c (working copy) @@ -9047,6 +9047,9 @@ make_temporary_var_for_ref_to_temp (tree decl, tre tree name; TREE_STATIC (var) = TREE_STATIC (decl); + /* Capture the current module info for statics. */ + if (L_IPO_COMP_MODE && TREE_STATIC (var)) +varpool_node_for_decl (var); DECL_TLS_MODEL (var) = DECL_TLS_MODEL (decl); name = mangle_ref_init_variable (decl); DECL_NAME (var) = name;
Re: [PATCH] increase builtin_expert probability in loop exit test
Attached is the patchset 2. It takes the max to two hitrates then does the incremental. On Fri, Oct 11, 2013 at 2:01 PM, Rong Xu wrote: > Hi, > > An earlier patch (r203167) changed the default probability for > builtin_expect to 90%. > It does not work properly for the following case: >while (__builin_expect (expr, 1)) { } > > W/o builtin_expect, the exit probability is 9% while w/ > builtin_expect, the exit probability is 10%. > It seems to be wrong as we should estimate this loop has more > iterations than w/o the annotation. > > This attached patch bump the probability for this particular case. > > Tested with bootstrap. > > Thanks, > > -Rong 2013-10-11 Rong Xu * predict.c (tree_predict_by_opcode): Bump the estimiated probability if builtin_expect expression is in loop exit test. Index: predict.c === --- predict.c (revision 203463) +++ predict.c (working copy) @@ -2001,11 +2001,39 @@ tree_predict_by_opcode (basic_block bb) if (val) { int percent = PARAM_VALUE (BUILTIN_EXPECT_PROBABILITY); + void **preds; + int hitrate; gcc_assert (percent >= 0 && percent <= 100); + /* This handles the cases like + while (__builtin_expect (exp, 1)) { ... } + W/o builtin_expect, the default HITRATE is 91%. + It does not make sense to estimate a lower probability of 90% + (current default for builtin_expect) with the annotation. + So here, we bump the probability by a small amount. */ + preds = pointer_map_contains (bb_predictions, bb); + hitrate = HITRATE (percent); + if (preds) +{ + struct edge_prediction *pred; + int exit_hitrate = predictor_info [(int) PRED_LOOP_EXIT].hitrate; + + for (pred = (struct edge_prediction *) *preds; pred; + pred = pred->ep_next) +{ + if (pred->ep_predictor == PRED_LOOP_EXIT + && exit_hitrate > hitrate) +{ + hitrate = exit_hitrate + HITRATE (4); + if (hitrate > REG_BR_PROB_BASE) +hitrate = REG_BR_PROB_BASE; + break; +} +} +} if (integer_zerop (val)) -percent = 100 - percent; - predict_edge (then_edge, PRED_BUILTIN_EXPECT, HITRATE (percent)); +hitrate = REG_BR_PROB_BASE - hitrate; + predict_edge (then_edge, PRED_BUILTIN_EXPECT, hitrate); } /* Try "pointer heuristic." A comparison ptr == 0 is predicted as false.
Re: [google gcc-4_8] increase builtin_expect probability in loop exit test
here is the new patch. Note that the hitrate won't change by this patch for the case of while (__builtin_expect (exp, 0)) Index: predict.c === --- predict.c (revision 203462) +++ predict.c (working copy) @@ -1951,11 +1951,42 @@ tree_predict_by_opcode (basic_block bb) if (val) { int percent = PARAM_VALUE (BUILTIN_EXPECT_PROBABILITY); + int hitrate; gcc_assert (percent >= 0 && percent <= 100); if (integer_zerop (val)) -percent = 100 - percent; - predict_edge (then_edge, PRED_BUILTIN_EXPECT, HITRATE (percent)); +hitrate = HITRATE (100 - percent); + else +{ + /* This handles the cases like + while (__builtin_expect (exp, 1)) { ... } + W/o builtin_expect, the default HITRATE is 91%. + It does not make sense to estimate a lower probability of 90% + (current default for builtin_expect) with the annotation. + So here, we bump the probability by a small amount. */ + void **preds = pointer_map_contains (bb_predictions, bb); + + hitrate = HITRATE (percent); + if (preds) +{ + struct edge_prediction *pred; + int exit_hitrate = predictor_info [(int) PRED_LOOP_EXIT].hitrate; + + for (pred = (struct edge_prediction *) *preds; pred; + pred = pred->ep_next) +{ + if (pred->ep_predictor == PRED_LOOP_EXIT + && exit_hitrate > hitrate) +{ + hitrate = exit_hitrate + HITRATE (4); + if (hitrate > REG_BR_PROB_BASE) +hitrate = REG_BR_PROB_BASE; + break; +} +} +} +} + predict_edge (then_edge, PRED_BUILTIN_EXPECT, hitrate); } /* Try "pointer heuristic." A comparison ptr == 0 is predicted as false. On Fri, Oct 11, 2013 at 2:45 PM, Rong Xu wrote: > ok. that makes sense. > > > On Fri, Oct 11, 2013 at 2:41 PM, Xinliang David Li wrote: >> Should it be max + some_delta then? >> >> David >> >> On Fri, Oct 11, 2013 at 2:32 PM, Rong Xu wrote: >>> I want to differentiate the cases w/o and w/ builtin. >>> If I take the max, they will be the same (91%). >>> >>> -Rong >>> >>> On Fri, Oct 11, 2013 at 2:26 PM, Xinliang David Li >>> wrote: >>>> Why this 'percent += 4' instead of taking the max? >>>> >>>> David >>>> >>>> On Fri, Oct 11, 2013 at 2:10 PM, Rong Xu wrote: >>>>> The trunk version of this patch is submitted for review. >>>>> >>>>> David: can we have this patch for google/gcc-4_8 branch first? >>>>> It tested with regression and google internal benchmarks. >>>>> >>>>> Thanks, >>>>> >>>>> -Rong
Re: [google gcc-4_8] increase builtin_expect probability in loop exit test
ok. that makes sense. On Fri, Oct 11, 2013 at 2:41 PM, Xinliang David Li wrote: > Should it be max + some_delta then? > > David > > On Fri, Oct 11, 2013 at 2:32 PM, Rong Xu wrote: >> I want to differentiate the cases w/o and w/ builtin. >> If I take the max, they will be the same (91%). >> >> -Rong >> >> On Fri, Oct 11, 2013 at 2:26 PM, Xinliang David Li >> wrote: >>> Why this 'percent += 4' instead of taking the max? >>> >>> David >>> >>> On Fri, Oct 11, 2013 at 2:10 PM, Rong Xu wrote: >>>> The trunk version of this patch is submitted for review. >>>> >>>> David: can we have this patch for google/gcc-4_8 branch first? >>>> It tested with regression and google internal benchmarks. >>>> >>>> Thanks, >>>> >>>> -Rong
Re: [google gcc-4_8] increase builtin_expect probability in loop exit test
I want to differentiate the cases w/o and w/ builtin. If I take the max, they will be the same (91%). -Rong On Fri, Oct 11, 2013 at 2:26 PM, Xinliang David Li wrote: > Why this 'percent += 4' instead of taking the max? > > David > > On Fri, Oct 11, 2013 at 2:10 PM, Rong Xu wrote: >> The trunk version of this patch is submitted for review. >> >> David: can we have this patch for google/gcc-4_8 branch first? >> It tested with regression and google internal benchmarks. >> >> Thanks, >> >> -Rong
[google gcc-4_8] increase builtin_expect probability in loop exit test
The trunk version of this patch is submitted for review. David: can we have this patch for google/gcc-4_8 branch first? It tested with regression and google internal benchmarks. Thanks, -Rong 2013-10-11 Rong Xu * predict.c (tree_predict_by_opcode): Bump the estimiated probability if builtin_expect expression is in loop exit test. Index: predict.c === --- predict.c (revision 203462) +++ predict.c (working copy) @@ -1951,7 +1951,31 @@ tree_predict_by_opcode (basic_block bb) if (val) { int percent = PARAM_VALUE (BUILTIN_EXPECT_PROBABILITY); + void **preds; + /* This handles the cases like + while (__builtin_expect (exp, 1)) { ... } + W/o builtin_expect, the default HITRATE is 91%. +It does not make sense to estimate a lower probability of 90% +(current default for builtin_expect) with the annotation. +So here, we bump the probability by a small amount. */ + preds = pointer_map_contains (bb_predictions, bb); + if (preds) +{ + struct edge_prediction *pred; + + for (pred = (struct edge_prediction *) *preds; pred; + pred = pred->ep_next) + { + if (pred->ep_predictor == PRED_LOOP_EXIT + && predictor_info [(int) PRED_LOOP_EXIT].hitrate +> HITRATE (percent)) + percent += 4; + if (percent > 100) + percent = 100; + } + } + gcc_assert (percent >= 0 && percent <= 100); if (integer_zerop (val)) percent = 100 - percent;
[PATCH] increase builtin_expert probability in loop exit test
Hi, An earlier patch (r203167) changed the default probability for builtin_expect to 90%. It does not work properly for the following case: while (__builin_expect (expr, 1)) { } W/o builtin_expect, the exit probability is 9% while w/ builtin_expect, the exit probability is 10%. It seems to be wrong as we should estimate this loop has more iterations than w/o the annotation. This attached patch bump the probability for this particular case. Tested with bootstrap. Thanks, -Rong patch_diff Description: Binary data
Re: [PATCH] alternative hirate for builtin_expert
My change on the probability of builtin_expect does have an impact on the partial inline (more outlined functions will get inline back to the original function). I think this triggers an existing issue. Dehao will explain this in his coming email. -Rong On Fri, Oct 4, 2013 at 6:05 AM, Ramana Radhakrishnan wrote: > On 10/02/13 23:49, Rong Xu wrote: >> >> Here is the new patch. Honaz: Could you take a look? >> >> Thanks, >> >> -Rong >> >> On Wed, Oct 2, 2013 at 2:31 PM, Jan Hubicka wrote: >>>> >>>> Thanks for the suggestion. This is much cleaner than to use binary >>>> parameter. >>>> >>>> Just want to make sure I understand it correctly about the orginal >>>> hitrate: >>>> you want to retire the hitrate in PRED_BUILTIN_EXPECT and always use >>>> the one specified in the biniltin-expect-probability parameter. >>> >>> >>> Yes. >>>> >>>> >>>> Should I use 90% as the default? It's hard to fit current value 0.9996 >>>> in percent form. >>> >>> >>> Yes, 90% seems fine. The original value was set quite arbitrarily and no >>> real >>> performance study was made as far as I know except yours. I think users >>> that >>> are sure they use expect to gueard completely cold edges may just use >>> 100% >>> instead of 0.9996, so I would not worry much about the precision. >>> >>> Honza >>>> >>>> >>>> -Rong >>>>> >>>>> >>>>> OK with that change. >>>>> >>>>> Honza > > > > This broke arm-linux-gnueabihf building libstdc++-v3. I haven't dug further > yet but still reducing the testcase. > > See > > http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58619 > > for details. > > Can you please deal with this appropriately ? > > regards > Ramana > >
Re: [PATCH] alternative hirate for builtin_expert
Here is the new patch. Honaz: Could you take a look? Thanks, -Rong On Wed, Oct 2, 2013 at 2:31 PM, Jan Hubicka wrote: >> Thanks for the suggestion. This is much cleaner than to use binary parameter. >> >> Just want to make sure I understand it correctly about the orginal hitrate: >> you want to retire the hitrate in PRED_BUILTIN_EXPECT and always use >> the one specified in the biniltin-expect-probability parameter. > > Yes. >> >> Should I use 90% as the default? It's hard to fit current value 0.9996 >> in percent form. > > Yes, 90% seems fine. The original value was set quite arbitrarily and no real > performance study was made as far as I know except yours. I think users that > are sure they use expect to gueard completely cold edges may just use 100% > instead of 0.9996, so I would not worry much about the precision. > > Honza >> >> -Rong >> > >> > OK with that change. >> > >> > Honza p2_patch_v2 Description: Binary data
Re: [PATCH] fix size_estimation for builtin_expect
attached the new patch. OK for check in? On Wed, Oct 2, 2013 at 9:12 AM, Jan Hubicka wrote: >> > Hi, >> > >> > builtin_expect should be a NOP in size_estimation. Indeed, the call >> > stmt itself is 0 weight in size and time. But it may introduce >> > an extra relation expr which has non-zero size/time. The end result >> > is: for w/ and w/o builtin_expect, we have different size/time estimation >> > for inlining. >> > >> > This patch fixes this problem. >> > >> > An earlier discussion of this patch is >> > https://mail.google.com/mail/u/0/?pli=1#label/gcc-paches/1415c590ad8c5315 >> > >> > This new patch address Honza's comments. >> > It passes the bootstrap and regression. >> > >> > Richard: I looked at your tree-ssa.c:walk_use_def_chains() code. I think >> > that's an overkill for this simple problem. Your code is mostly dealing >> > with the recursively walk the PHI node to find the real def stmts. >> > Here the traversal is within one BB and I may need to continue on multiple >> > real assignment. Calling walk_use_def_chains probably only uses >> > the SSA_NAME_DEF_STMT() part of the code. >> > >> > Thanks, >> > >> > -Rong > > This patch is OK. Add white space after > + bool match = false; > + bool done = false; > > and fix > + if (match && single_imm_use (var, &use_p, &use_stmt) && > + (gimple_code (use_stmt) == GIMPLE_COND)) > > && should be at beggining of new line.. > > Thanks, > Honza p1_patch_v2 Description: Binary data
Re: [PATCH] alternative hirate for builtin_expert
On Wed, Oct 2, 2013 at 9:08 AM, Jan Hubicka wrote: >> > Hi, >> > >> > >> > >> > Current default probability for builtin_expect is 0.9996. >> > This makes the freq of unlikely bb very low (4), which >> > suppresses the inlining of any calls within those bb. >> > >> > We used FDO data to measure the branch probably for >> > the branch annotated with builtin_expert. >> > For google internal benchmarks, the weight average >> > (the profile count value as the weight) is 0.9081. >> > >> > Linux kernel is another program that is heavily annotated >> > with builtin-expert. We measured its weight average as 0.8717, >> > using google search as the workload. >> > >> > This patch sets the alternate hirate probability for builtin_expert >> > to 90%. With the alternate hirate, we measured performance >> > improvement for google benchmarks and Linux kernel. >> > >> > An earlier discussion is >> > https://mail.google.com/mail/u/0/?pli=1#label/gcc-paches/1415c5910054630b >> > >> > This new patch is for the trunk and addresses Honza's comments. >> > >> > Honza: this new probability is off by default. When we backport to google >> > branch we will make it the default. Let me know if you want to do the same >> > here. > > I do not like much the binary parameter for > builtin-expect-probability-relaxed. > > I would just add bulitin-expect-probability taking value in percents and then > make predict.c to use it. Just use predict_edge instead of predict_edge_def > and document hitrate value as unused in predict.def. Thanks for the suggestion. This is much cleaner than to use binary parameter. Just want to make sure I understand it correctly about the orginal hitrate: you want to retire the hitrate in PRED_BUILTIN_EXPECT and always use the one specified in the biniltin-expect-probability parameter. Should I use 90% as the default? It's hard to fit current value 0.9996 in percent form. -Rong > > OK with that change. > > Honza
Re: [PATCH] alternative hirate for builtin_expert
ping. On Fri, Sep 27, 2013 at 3:53 PM, Rong Xu wrote: > Hi, > > > > Current default probability for builtin_expect is 0.9996. > This makes the freq of unlikely bb very low (4), which > suppresses the inlining of any calls within those bb. > > We used FDO data to measure the branch probably for > the branch annotated with builtin_expert. > For google internal benchmarks, the weight average > (the profile count value as the weight) is 0.9081. > > Linux kernel is another program that is heavily annotated > with builtin-expert. We measured its weight average as 0.8717, > using google search as the workload. > > This patch sets the alternate hirate probability for builtin_expert > to 90%. With the alternate hirate, we measured performance > improvement for google benchmarks and Linux kernel. > > An earlier discussion is > https://mail.google.com/mail/u/0/?pli=1#label/gcc-paches/1415c5910054630b > > This new patch is for the trunk and addresses Honza's comments. > > Honza: this new probability is off by default. When we backport to google > branch we will make it the default. Let me know if you want to do the same > here. > > Thanks, > > -Rong
Re: [PATCH] fix size_estimation for builtin_expect
ping. On Fri, Sep 27, 2013 at 3:56 PM, Rong Xu wrote: > Hi, > > builtin_expect should be a NOP in size_estimation. Indeed, the call > stmt itself is 0 weight in size and time. But it may introduce > an extra relation expr which has non-zero size/time. The end result > is: for w/ and w/o builtin_expect, we have different size/time estimation > for inlining. > > This patch fixes this problem. > > An earlier discussion of this patch is > https://mail.google.com/mail/u/0/?pli=1#label/gcc-paches/1415c590ad8c5315 > > This new patch address Honza's comments. > It passes the bootstrap and regression. > > Richard: I looked at your tree-ssa.c:walk_use_def_chains() code. I think > that's an overkill for this simple problem. Your code is mostly dealing > with the recursively walk the PHI node to find the real def stmts. > Here the traversal is within one BB and I may need to continue on multiple > real assignment. Calling walk_use_def_chains probably only uses > the SSA_NAME_DEF_STMT() part of the code. > > Thanks, > > -Rong
[PATCH] fix size_estimation for builtin_expect
Hi, builtin_expect should be a NOP in size_estimation. Indeed, the call stmt itself is 0 weight in size and time. But it may introduce an extra relation expr which has non-zero size/time. The end result is: for w/ and w/o builtin_expect, we have different size/time estimation for inlining. This patch fixes this problem. An earlier discussion of this patch is https://mail.google.com/mail/u/0/?pli=1#label/gcc-paches/1415c590ad8c5315 This new patch address Honza's comments. It passes the bootstrap and regression. Richard: I looked at your tree-ssa.c:walk_use_def_chains() code. I think that's an overkill for this simple problem. Your code is mostly dealing with the recursively walk the PHI node to find the real def stmts. Here the traversal is within one BB and I may need to continue on multiple real assignment. Calling walk_use_def_chains probably only uses the SSA_NAME_DEF_STMT() part of the code. Thanks, -Rong p1_patch Description: Binary data
[PATCH] alternative hirate for builtin_expert
Hi, Current default probability for builtin_expect is 0.9996. This makes the freq of unlikely bb very low (4), which suppresses the inlining of any calls within those bb. We used FDO data to measure the branch probably for the branch annotated with builtin_expert. For google internal benchmarks, the weight average (the profile count value as the weight) is 0.9081. Linux kernel is another program that is heavily annotated with builtin-expert. We measured its weight average as 0.8717, using google search as the workload. This patch sets the alternate hirate probability for builtin_expert to 90%. With the alternate hirate, we measured performance improvement for google benchmarks and Linux kernel. An earlier discussion is https://mail.google.com/mail/u/0/?pli=1#label/gcc-paches/1415c5910054630b This new patch is for the trunk and addresses Honza's comments. Honza: this new probability is off by default. When we backport to google branch we will make it the default. Let me know if you want to do the same here. Thanks, -Rong p2_patch Description: Binary data
Re: [google gcc-4_8] alternate hirate for builtin_expert
On Thu, Sep 26, 2013 at 3:27 PM, Jan Hubicka wrote: >> Hi, >> >> Current default probably for builtin_expect is 0.9996. >> This makes the freq of unlikely bb very low (4), which >> suppresses the inlining of any calls within those bb. >> >> We used FDO data to measure the branch probably for >> the branch annotated with builtin_expert. >> For google >> internal benchmarks, the weight average >> (the profile count value as the weight) is 0.9081. >> >> Linux kernel is another program that is heavily annotated >> with builtin-expert. We measured its weight average as 0.8717, >> using google search as >> the workload. > > This is interesting. I was always unsure if programmers use builtin_expect > more often to mark an impossible paths (as those leading to crash) or just > unlikely paths. Your data seems to suggest the second. > I don't find an official guideline on how this should be used. Maybe some documentation in gcc user-guide helps. > We probably also ought to get analyze_brprob working again. It was always > useful to get such a data. >> >> >> This patch sets the alternate hirate probability for >> builtin_expert >> to 90%. With the alternate hirate, we measured performance >> improvement for google >> benchmarks and Linux kernel. >> >> >> -Rong >> 2013-09-26 Rong Xu >> >> * params.def (DEFPARAM): New. >> * params.def: New. >> * predict.c (tree_predict_by_opcode): Alternate >> probablity hirate for builtin_expect. > > This also seems resonable for mainline. Please add a comment > to the parameter explaining how the value was determined. > Please also add invoke.texi documentation. > Will do. > For patches that seems resonable for mainline in FDO/IPA area, > i would apprechiate if you just added me to CC, so I do not lose > track of them. Will do. > Honza
Re: [google/4_8] Disable -g/-gmlt during LIPO instrumentation
I don't quite understand here. We use the profile-generate memory consumption to estimate the profile use memory consumption. we still have -g/-gmlt in profile-use compilation. Will this change effectively under estimate the memory use in the use phrase? -Rong On Fri, Sep 27, 2013 at 11:50 AM, Teresa Johnson wrote: > David and Rong, > > The following patch will disable -g/-gmlt when instrumenting for LIPO > since they will affect the recorded ggc_memory used in the module > grouping decision. Added -fripa-allow-debug to override this behavior. > > Passes regression tests and simple tests on the new functionality. > > Ok for google/4_8? > > Teresa > > 2013-09-27 Teresa Johnson > > * opts.c (finish_options): Suppress -g/-gmlt when we are > building for LIPO instrumention. > * common.opt (fripa-allow-debug): New option. > > Index: opts.c > === > --- opts.c (revision 202976) > +++ opts.c (working copy) > @@ -799,7 +799,7 @@ finish_options (struct gcc_options *opts, struct g > #endif >if (!opts->x_flag_fat_lto_objects && !HAVE_LTO_PLUGIN) > error_at (loc, "-fno-fat-lto-objects are supported only with > linker plugin."); > -} > +} >if ((opts->x_flag_lto_partition_balanced != 0) + > (opts->x_flag_lto_partition_1to1 != 0) > + (opts->x_flag_lto_partition_none != 0) >= 1) > { > @@ -852,6 +852,26 @@ finish_options (struct gcc_options *opts, struct g >/* Turn on -ffunction-sections when -freorder-functions=* is used. */ >if (opts->x_flag_reorder_functions > 1) > opts->x_flag_function_sections = 1; > + > + /* LIPO module grouping depends on the memory consumed by the profile-gen > + parsing phase, recorded in a per-module ggc_memory field of the module > + info struct. This will be higher when debug generation is on via > + -g/-gmlt, which causes the FE to generate debug structures that will > + increase the ggc_total_memory. This could in theory cause the module > + groups to be slightly more conservative. Disable -g/-gmlt under > + -fripa -fprofile-generate, but provide an option to override this > + in case we actually need to debug an instrumented binary. */ > + if (opts->x_profile_arc_flag > + && opts->x_flag_dyn_ipa > + && opts->x_debug_info_level != DINFO_LEVEL_NONE > + && !opts->x_flag_dyn_ipa_allow_debug) > +{ > + inform (loc, > + "Debug generation via -g option disabled under -fripa " > + "-fprofile-generate (use -fripa-allow-debug to override)"); > + set_debug_level (NO_DEBUG, DEFAULT_GDB_EXTENSIONS, "0", opts, opts_set, > + loc); > +} > } > > #define LEFT_COLUMN27 > Index: common.opt > === > --- common.opt (revision 202976) > +++ common.opt (working copy) > @@ -1155,6 +1155,10 @@ fripa > Common Report Var(flag_dyn_ipa) > Perform Dynamic Inter-Procedural Analysis. > > +fripa-allow-debug > +Common Report Var(flag_dyn_ipa_allow_debug) Init(0) > +Allow -g enablement for -fripa -fprofile-generate compiles. > + > fripa-disallow-asm-modules > Common Report Var(flag_ripa_disallow_asm_modules) > Don't import an auxiliary module if it contains asm statements > > -- > Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413
Re: [google gcc-4_8] fix size_estimation for builtin_expect
On Thu, Sep 26, 2013 at 3:23 PM, Jan Hubicka wrote: >> Hi, >> >> builtin_expect should be a NOP in size_estimation. Indeed, the call >> stmt itself is 0 weight in size and time. But it may introduce >> an extra relation expr which has non-zero size/time. The end result >> is: for w/ and w/o builtin_expect, we have different size/time estimation >> for early inlining. >> >> This patch fixes this problem. >> >> -Rong > >> 2013-09-26 Rong Xu >> >> * ipa-inline-analysis.c (estimate_function_body_sizes): fix >> the size estimation for builtin_expect. > > This seems fine with an comment in the code what it is about. > I also think we want to support mutiple builtin_expects in a BB so perhaps > we want to have pointer set of statements to ignore? > Thanks for feedback. I'll add the comment and split the code into a new function. You have a good pointer about multiple builtin_expect. I think I need to remove the early exit (the break stmt). But I would argue that using pointer_set is probably not necessary. Here is the reasoning: The typical usage of builtin_expect is like: if (__builtin_expect (a<=b, 1)) { ... } The IR is: t1 = a <= b; t2 = (long int) t1; t3 = __builtin_expect (t2, 1); if (t3 != 0) goto ... Without the builtin, the IR is if (a<=b) goto... The code in the patch check the use of the builtin_expect return value, to see if it's used in a COND stmt. So even there are multiple builtins, only one can match. -Rong > To avoid spagetti code, please just move the new logic into separate > functions. > > Honza >> >> Index: ipa-inline-analysis.c >> === >> --- ipa-inline-analysis.c (revision 202638) >> +++ ipa-inline-analysis.c (working copy) >> @@ -2266,6 +2266,8 @@ estimate_function_body_sizes (struct cgraph_node * >>/* Estimate static overhead for function prologue/epilogue and alignment. >> */ >>int overhead = PARAM_VALUE (PARAM_INLINE_FUNCTION_OVERHEAD_SIZE); >>int size = overhead; >> + gimple fix_expect_builtin; >> + >>/* Benefits are scaled by probability of elimination that is in range >> <0,2>. */ >>basic_block bb; >> @@ -2359,14 +2361,73 @@ estimate_function_body_sizes (struct cgraph_node * >> } >> } >> >> + fix_expect_builtin = NULL; >>for (bsi = gsi_start_bb (bb); !gsi_end_p (bsi); gsi_next (&bsi)) >> { >> gimple stmt = gsi_stmt (bsi); >> + if (gimple_call_builtin_p (stmt, BUILT_IN_EXPECT)) >> +{ >> + tree var = gimple_call_lhs (stmt); >> + tree arg = gimple_call_arg (stmt, 0); >> + use_operand_p use_p; >> + gimple use_stmt; >> + bool match = false; >> + bool done = false; >> + gcc_assert (var && arg); >> + gcc_assert (TREE_CODE (var) == SSA_NAME); >> + >> + while (TREE_CODE (arg) == SSA_NAME) >> +{ >> + gimple stmt_tmp = SSA_NAME_DEF_STMT (arg); >> + switch (gimple_assign_rhs_code (stmt_tmp)) >> +{ >> + case LT_EXPR: >> + case LE_EXPR: >> + case GT_EXPR: >> + case GE_EXPR: >> + case EQ_EXPR: >> + case NE_EXPR: >> +match = true; >> +done = true; >> +break; >> + case NOP_EXPR: >> +break; >> + default: >> +done = true; >> +break; >> +} >> + if (done) >> +break; >> + arg = gimple_assign_rhs1 (stmt_tmp); >> +} >> + >> + if (match && single_imm_use (var, &use_p, &use_stmt)) >> +{ >> + if (gimple_code (use_stmt) == GIMPLE_COND) >> +{ >> + fix_expect_builtin = use_stmt; >> +} >> +} >> + >> + /* we should see one builtin_expert call in one bb. */ >> + break; >> +} >> +} >> + >> + for (bsi = gsi_start_bb (bb); !gsi_end_p (bsi); gsi_next (&bsi)) >> + { >> + gimple stmt = gsi_stmt (bsi); >> int this_size = estimate_num_insns (stmt, &eni_size_weights); >> int this_time = estimate_num_insns (stmt, &eni_time_weights); >> int prob; >> struct predicate will_be_nonconstant; >> >> + if (stmt == fix_expect_builtin) >> +{ >> + this_size--; >> + this_time--; >> +} >> + >> if (dump_file && (dump_flags & TDF_DETAILS)) >> { >> fprintf (dump_file, " "); >
Re: [google gcc-4_8] fix size_estimation for builtin_expect
On Fri, Sep 27, 2013 at 1:20 AM, Richard Biener wrote: > On Fri, Sep 27, 2013 at 12:23 AM, Jan Hubicka wrote: >>> Hi, >>> >>> builtin_expect should be a NOP in size_estimation. Indeed, the call >>> stmt itself is 0 weight in size and time. But it may introduce >>> an extra relation expr which has non-zero size/time. The end result >>> is: for w/ and w/o builtin_expect, we have different size/time estimation >>> for early inlining. >>> >>> This patch fixes this problem. >>> >>> -Rong >> >>> 2013-09-26 Rong Xu >>> >>> * ipa-inline-analysis.c (estimate_function_body_sizes): fix >>> the size estimation for builtin_expect. >> >> This seems fine with an comment in the code what it is about. >> I also think we want to support mutiple builtin_expects in a BB so perhaps >> we want to have pointer set of statements to ignore? >> >> To avoid spagetti code, please just move the new logic into separate >> functions. > > Looks like this could use tree-ssa.c:walk_use_def_chains (please > change its implementation as necessary, make it C++, etc. - you will > be the first user again). > Thanks for the suggestion. But it seems walk_use_def_chains() was removed by r201951. -Rong > Richard. > >> Honza >>> >>> Index: ipa-inline-analysis.c >>> === >>> --- ipa-inline-analysis.c (revision 202638) >>> +++ ipa-inline-analysis.c (working copy) >>> @@ -2266,6 +2266,8 @@ estimate_function_body_sizes (struct cgraph_node * >>>/* Estimate static overhead for function prologue/epilogue and >>> alignment. */ >>>int overhead = PARAM_VALUE (PARAM_INLINE_FUNCTION_OVERHEAD_SIZE); >>>int size = overhead; >>> + gimple fix_expect_builtin; >>> + >>>/* Benefits are scaled by probability of elimination that is in range >>> <0,2>. */ >>>basic_block bb; >>> @@ -2359,14 +2361,73 @@ estimate_function_body_sizes (struct cgraph_node * >>> } >>> } >>> >>> + fix_expect_builtin = NULL; >>>for (bsi = gsi_start_bb (bb); !gsi_end_p (bsi); gsi_next (&bsi)) >>> { >>> gimple stmt = gsi_stmt (bsi); >>> + if (gimple_call_builtin_p (stmt, BUILT_IN_EXPECT)) >>> +{ >>> + tree var = gimple_call_lhs (stmt); >>> + tree arg = gimple_call_arg (stmt, 0); >>> + use_operand_p use_p; >>> + gimple use_stmt; >>> + bool match = false; >>> + bool done = false; >>> + gcc_assert (var && arg); >>> + gcc_assert (TREE_CODE (var) == SSA_NAME); >>> + >>> + while (TREE_CODE (arg) == SSA_NAME) >>> +{ >>> + gimple stmt_tmp = SSA_NAME_DEF_STMT (arg); >>> + switch (gimple_assign_rhs_code (stmt_tmp)) >>> +{ >>> + case LT_EXPR: >>> + case LE_EXPR: >>> + case GT_EXPR: >>> + case GE_EXPR: >>> + case EQ_EXPR: >>> + case NE_EXPR: >>> +match = true; >>> +done = true; >>> +break; >>> + case NOP_EXPR: >>> +break; >>> + default: >>> +done = true; >>> +break; >>> +} >>> + if (done) >>> +break; >>> + arg = gimple_assign_rhs1 (stmt_tmp); >>> +} >>> + >>> + if (match && single_imm_use (var, &use_p, &use_stmt)) >>> +{ >>> + if (gimple_code (use_stmt) == GIMPLE_COND) >>> +{ >>> + fix_expect_builtin = use_stmt; >>> +} >>> +} >>> + >>> + /* we should see one builtin_expert call in one bb. */ >>> + break; >>> +} >>> +} >>> + >>> + for (bsi = gsi_start_bb (bb); !gsi_end_p (bsi); gsi_next (&bsi)) >>> + { >>> + gimple stmt = gsi_stmt (bsi); >>> int this_size = estimate_num_insns (stmt, &eni_size_weights); >>> int this_time = estimate_num_insns (stmt, &eni_time_weights); >>> int prob; >>> struct predicate will_be_nonconstant; >>> >>> + if (stmt == fix_expect_builtin) >>> +{ >>> + this_size--; >>> + this_time--; >>> +} >>> + >>> if (dump_file && (dump_flags & TDF_DETAILS)) >>> { >>> fprintf (dump_file, " "); >>
Re: [PATCH] Sanitize block partitioning under -freorder-blocks-and-partition
On Thu, Sep 26, 2013 at 2:54 PM, Jan Hubicka wrote: >> > As for COMDAT merging, i would like to see the patch. I am experimenting >> > now with a patch to also privatize COMDATs during -fprofile-generate to >> > avoid problems with lost profiles mentioned above. >> > >> >> Do you mean you privatize every COMDAT function in the profile-generate? >> We discussed this idea internally and we thought it would not work for >> large applications (like in google) due to size. > > Yes, Martin and I plan to test this on firefox. In a way you already have all > the COMDAT functions unshared in the object files, so the resulting binary > should not be completely off the limits. But I do not have any quantitative > data, yet, since we hit bug in constant folding and devirtualization I fixed > in > meantime but we did not re-run the tests yet. LInker removes a great numbers of duplicated copies, esp for those template functions. We don't have a quantitative numbers either. But I'll collect some soon. > >> >> > As for context sensitivity, one approach would be to have two sets of >> > counters for every comdat - one merged globally and one counting local >> > instances. We can then privatize always and at profile read in stage >> > just clone every comdat and have two instances - one for offline copy >> > and one for inlining. >> > >> >> In my implementation, I also allow multiple sets of COMDAT profile >> co-existing in one compilation. >> Due to the auxiliary modules in LIPO, I actually have more than two. > > How does auxiliary modules work? It pulls in multiple profiles from other compilation. So there might be multiple inlined profiles. >> >> But I'm wondering how do you determine which profile to use for each >> call-site -- the inline decision may not >> be the same for profile-generate and profile-use compilation. > > My suggestion was to simply use the module local profile for all inline sites > within the given module and the global profile for the offline copy of the > function (that one will, in the case it survives linking, be shared across > all the modules anyway). For simple example like: callsite1 --> comcat_function_foo callsite2 --> comdat_function_foo callsite1 is inlined in profile-generate, it has its own inlined profile counter. callsite2 is not inlined and the profile goes to the offline copies. let's callsite 1 is cold (0 counter) and callsite 2 is hot. Using local profile (the cold one) for callsite2 will not be correct. > > I think this may work in the cases where i.e. use of hash templates in one > module is very different (in average size) from other module. > I did not really put much effort into it - I currently worry primarily about > the cases where profile is lost completely since it gets attached to a > function > not surviving final linking (or because we inline something we did not inlined > at profile time). > > As for context sensitivity, we may try to consider developing more consistent > solution for this. COMDAT functions are definitely not only that may exhibit > context sensitive behaviour. > One approach would be to always have multiple counters for each function and > hash based on cbacktraces collected by indirect call profiling > instrumentation. > In a way this is same path profiling, but that would definitely add quite some > overhead + we will need to think of resonable way to represent this within > compiler. > > How do you decide what functions you want to have multiple profiles for? I do the instrumentation after ipa-inline for comdat function. I know if a callsite is inlined or not. In profile-use phrase, I also need to provide to the context (which module this is from) to pick the right profile. > > Honza
[google gcc-4_8] alternate hirate for builtin_expert
Hi, Current default probably for builtin_expect is 0.9996. This makes the freq of unlikely bb very low (4), which suppresses the inlining of any calls within those bb. We used FDO data to measure the branch probably for the branch annotated with builtin_expert. For google internal benchmarks, the weight average (the profile count value as the weight) is 0.9081. Linux kernel is another program that is heavily annotated with builtin-expert. We measured its weight average as 0.8717, using google search as the workload. This patch sets the alternate hirate probability for builtin_expert to 90%. With the alternate hirate, we measured performance improvement for google benchmarks and Linux kernel. -Rong 2013-09-26 Rong Xu * params.def (DEFPARAM): New. * params.def: New. * predict.c (tree_predict_by_opcode): Alternate probablity hirate for builtin_expect. Index: params.def === --- params.def (revision 202638) +++ params.def (working copy) @@ -483,6 +483,10 @@ DEFPARAM(TRACER_MIN_BRANCH_PROBABILITY, "tracer-min-branch-probability", "Stop forward growth if the probability of best edge is less than this threshold (in percent). Used when profile feedback is not available", 50, 0, 100) +DEFPARAM(BUILTIN_EXPECT_PROBABILITY_RELAXED, +"builtin-expect-probability-relaxed", +"Set the estimated probability for builtin expect. By default using PROB_VERY_LIKELY, while value of 1 using HIRATE(90) probability.", +0, 0, 1) /* The maximum number of incoming edges to consider for crossjumping. */ DEFPARAM(PARAM_MAX_CROSSJUMP_EDGES, Index: params.def === --- params.def (revision 202638) +++ params.def (working copy) @@ -483,6 +483,10 @@ DEFPARAM(TRACER_MIN_BRANCH_PROBABILITY, "tracer-min-branch-probability", "Stop forward growth if the probability of best edge is less than this threshold (in percent). Used when profile feedback is not available", 50, 0, 100) +DEFPARAM(BUILTIN_EXPECT_PROBABILITY_RELAXED, +"builtin-expect-probability-relaxed", +"Set the estimated probability for builtin expect. By default using PROB_VERY_LIKELY, while value of 1 using HIRATE(90) probability.", +0, 0, 1) /* The maximum number of incoming edges to consider for crossjumping. */ DEFPARAM(PARAM_MAX_CROSSJUMP_EDGES, Index: predict.c === --- predict.c (revision 202638) +++ predict.c (working copy) @@ -1950,10 +1950,17 @@ tree_predict_by_opcode (basic_block bb) BITMAP_FREE (visited); if (val) { + enum br_predictor predictor; + + if (PARAM_VALUE (BUILTIN_EXPECT_PROBABILITY_RELAXED) == 0) +predictor = PRED_BUILTIN_EXPECT; + else +predictor = PRED_BUILTIN_EXPECT_RELAXED; + if (integer_zerop (val)) - predict_edge_def (then_edge, PRED_BUILTIN_EXPECT, NOT_TAKEN); + predict_edge_def (then_edge, predictor, NOT_TAKEN); else - predict_edge_def (then_edge, PRED_BUILTIN_EXPECT, TAKEN); + predict_edge_def (then_edge, predictor, TAKEN); return; } /* Try "pointer heuristic."
[google gcc-4_8] fix size_estimation for builtin_expect
Hi, builtin_expect should be a NOP in size_estimation. Indeed, the call stmt itself is 0 weight in size and time. But it may introduce an extra relation expr which has non-zero size/time. The end result is: for w/ and w/o builtin_expect, we have different size/time estimation for early inlining. This patch fixes this problem. -Rong 2013-09-26 Rong Xu * ipa-inline-analysis.c (estimate_function_body_sizes): fix the size estimation for builtin_expect. Index: ipa-inline-analysis.c === --- ipa-inline-analysis.c (revision 202638) +++ ipa-inline-analysis.c (working copy) @@ -2266,6 +2266,8 @@ estimate_function_body_sizes (struct cgraph_node * /* Estimate static overhead for function prologue/epilogue and alignment. */ int overhead = PARAM_VALUE (PARAM_INLINE_FUNCTION_OVERHEAD_SIZE); int size = overhead; + gimple fix_expect_builtin; + /* Benefits are scaled by probability of elimination that is in range <0,2>. */ basic_block bb; @@ -2359,14 +2361,73 @@ estimate_function_body_sizes (struct cgraph_node * } } + fix_expect_builtin = NULL; for (bsi = gsi_start_bb (bb); !gsi_end_p (bsi); gsi_next (&bsi)) { gimple stmt = gsi_stmt (bsi); + if (gimple_call_builtin_p (stmt, BUILT_IN_EXPECT)) +{ + tree var = gimple_call_lhs (stmt); + tree arg = gimple_call_arg (stmt, 0); + use_operand_p use_p; + gimple use_stmt; + bool match = false; + bool done = false; + gcc_assert (var && arg); + gcc_assert (TREE_CODE (var) == SSA_NAME); + + while (TREE_CODE (arg) == SSA_NAME) +{ + gimple stmt_tmp = SSA_NAME_DEF_STMT (arg); + switch (gimple_assign_rhs_code (stmt_tmp)) +{ + case LT_EXPR: + case LE_EXPR: + case GT_EXPR: + case GE_EXPR: + case EQ_EXPR: + case NE_EXPR: +match = true; +done = true; +break; + case NOP_EXPR: +break; + default: +done = true; +break; +} + if (done) +break; + arg = gimple_assign_rhs1 (stmt_tmp); +} + + if (match && single_imm_use (var, &use_p, &use_stmt)) +{ + if (gimple_code (use_stmt) == GIMPLE_COND) +{ + fix_expect_builtin = use_stmt; +} +} + + /* we should see one builtin_expert call in one bb. */ + break; +} +} + + for (bsi = gsi_start_bb (bb); !gsi_end_p (bsi); gsi_next (&bsi)) + { + gimple stmt = gsi_stmt (bsi); int this_size = estimate_num_insns (stmt, &eni_size_weights); int this_time = estimate_num_insns (stmt, &eni_time_weights); int prob; struct predicate will_be_nonconstant; + if (stmt == fix_expect_builtin) +{ + this_size--; + this_time--; +} + if (dump_file && (dump_flags & TDF_DETAILS)) { fprintf (dump_file, " ");
Re: [PATCH] Sanitize block partitioning under -freorder-blocks-and-partition
On Tue, Sep 24, 2013 at 5:31 AM, Jan Hubicka wrote: >> Hi Honza, >> >> I am finally getting back to working on this after a few weeks of >> working on some other priorities. > > I am also trying to return to this, so good timming ;) > Martin has got smaller C++ programs (Inkscape) to not touch cold segment > during the startup with FDO (w/o partitioning). Firefox still does, I think > the problem are lost samples due to different linker decisions even with LTO. > (i.e. linker pick an object from .a libraryat profile-generate time that i > never > passes later.). > > I plan to look into that today. >> >> Did you mean to commit the above change? I see that it went in as part >> of r202258 but doesn't show up in the ChangeLog entry for that >> revision. > > Yes, I meant to check it in, but did not mean to do so w/o Changelog. I wil > fix that. >> >> > >> > In other cases it was mostly loop unrolling in combination with jump >> > threading. So >> > I modified my script to separately report when failure happens for test >> > trained >> > once and test trained hundred times. >> >> Thanks for the linker script. I reproduced your results. I looked at a >> couple cases. The first was one that failed after 1 training run only >> (2910-2.c). It was due to jump threading, which you noted was a >> problem. For this one I think we can handle it in the partitioning, >> since there is an FDO insanity that we could probably treat more >> conservatively when splitting. > > We should fix the roundoff issues - when I was introducing the > frequency/probability/count system I made an assumptions that parts of > programs > with very low counts do not matter, since they are not part of hot spot (and I > cared only about the hot spot). Now we care about identifying unlikely > executed spots and we need to fix this. >> >> I looked at one that failed after 100 as well (20031204-1.c). In this >> case, it was due to expansion which was creating multiple branches/bbs >> from a logical OR and guessing incorrectly on how to assign the >> counts: >> >> if (octets == 4 && (*cp == ':' || *cp == '\0')) { >> >> The (*cp == ':' || *cp == '\0') part looked like the following going >> into RTL expansion: >> >> [20031204-1.c : 31:33] _29 = _28 == 58; >> [20031204-1.c : 31:33] _30 = _28 == 0; >> [20031204-1.c : 31:33] _31 = _29 | _30; >> [20031204-1.c : 31:18] if (_31 != 0) >> goto ; >> else >> goto ; >> >> where the result of the OR was always true, so bb 16 had a count of >> 100 and bb 19 a count of 0. When it was expanded, the expanded version >> of the above turned into 2 bbs with a branch in between. Both >> comparisons were done in the first bb, but the first bb checked >> whether the result of the *cp == '\0' compare was true, and if not >> branched to the check for whether the *cp == ':' compare was true. It >> gave the branch to the second check against ':' a count of 0, so that >> bb got a count of 0 and was split out, and put the count of 100 on the >> fall through assuming the compare with '\0' always evaluated to true. >> In reality, this OR condition was always true because *cp was ':', not >> '\0'. Therefore, the count of 0 on the second block with the check for >> ':' was incorrect, we ended up trying to execute it, and failed. >> >> Presumably we had the correct profile data for both blocks, but the >> accuracy was reduced when the OR was represented as a logical >> computation with a single branch. We could change the expansion code >> to do something different, e.g. treat as a 50-50 branch. But we would >> still end up with integer truncation issues when there was a single >> training run. But that could be dealt with conservatively in the >> bbpart code as I suggested for the jump threading issue above. I.e. a >> cold block with incoming non-cold edges conservatively not marked cold >> for splitting. >> >> > >> > FAIL1 /home/jh/trunk/gcc/testsuite/gcc.c-torture/execute/2422-1.c >> > FAIL1 /home/jh/trunk/gcc/testsuite/gcc.c-torture/execute/2910-2.c >> > FAIL1 /home/jh/trunk/gcc/testsuite/gcc.c-torture/execute/20020413-1.c >> > FAIL1 /home/jh/trunk/gcc/testsuite/gcc.c-torture/execute/20030903-1.c >> > FAIL1 /home/jh/trunk/gcc/testsuite/gcc.c-torture/execute/20031204-1.c >> > FAIL /home/jh/trunk/gcc/testsuite/gcc.c-torture/execute/20031204-1.c >> > FAIL1 /home/jh/trunk/gcc/testsuite/gcc.c-torture/execute/20060420-1.c >> > FAIL1 /home/jh/trunk/gcc/testsuite/gcc.c-torture/execute/20060905-1.c >> > FAIL1 /home/jh/trunk/gcc/testsuite/gcc.c-torture/execute/20120427-1.c >> > FAIL1 /home/jh/trunk/gcc/testsuite/gcc.c-torture/execute/20120427-2.c >> > FAIL1 /home/jh/trunk/gcc/testsuite/gcc.c-torture/execute/20120808-1.c >> > FAIL1 /home/jh/trunk/gcc/testsuite/gcc.c-torture/execute/20121108-1.c >> > FAIL /home/jh/trunk/gcc/testsuite/gcc.c-torture/execute/20121108-1.c >> > FAIL1 /home/jh/trunk/gcc/testsuite/gcc.c-torture/execute/920501-6.c >> > FAIL /home/jh/trunk/gcc/testsuite/gcc.c-torture/execute/920
Re: [PATCH] Sanitize block partitioning under -freorder-blocks-and-partition
On Fri, Aug 30, 2013 at 7:50 AM, Teresa Johnson wrote: > Can someone review and ok the attached patch for trunk? It has been > bootstrapped and tested on x86-64-unknown-linux-gnu, and tested by > enabling -freorder-blocks-and-partition enabled for a > profiledbootstrap as well. > > (Honza, see more responses inlined below. Rong, please see note below as > well). > > Thanks, > Teresa > > On Fri, Aug 30, 2013 at 2:14 AM, Jan Hubicka wrote: >>> Great! Is this the LTO merging you were talking about in an earlier >>> message, or the gcov runtime fix (that would presumably not be >>> lto-specific)? >> >> It is the LTO path - we need to merge profiles there anyway for his code >> unification >> work. > > Rong - can you send a summary of the approach you are working on? Is > it LIPO-specific? I'm also working to improve COMDAT handling in FDO/LIPO. It's applicable to regular FDO. Our motivation case is different from the case talked here: COMDAT function F is defined in module A and is picked by linker as the out-of-line copy in profile-gen phrase. It gets all the counters. In profile-use compilation F may not be emitted in A (like all the callsites are ipa-inlined in A), We choose instance of F in module B as the out-of-line copy, and it's not optimized. We are seeing more of this kind of cases in LIPO due to multiple COMDAT copies brought by the auxiliary modules. Since COMDAT function may be inlined after instrumentation, multiple copies of the counters make be co-exists We want to differentiate inlined-copy counters or out-of-line-copy counters. (In LIPO, we actually encourage the inline of COMDAT in IPA-inline to get more context sensitive counters.) Our current solution is to have another instrumentation only for COMDAT functions. * For each comdat_key, we create a global var pointing to the gcov_fn_info of the out-of-line copy. * This global var is initialized by the instrumentation code placed in the function entry, to the value of gcov_fn_info of current module. This is post IPA-inline instrumentation. So at most one instrumentation (the one picked by linker) is executed. * Expand gcov_fn_info to points to the global var. In the run time, we can differentiate if the counter out-of-line copy or inlined copy. We set a tag in gcda file accordingly. (in the case of both out-of-line copy and inlined copy, we treat it as out-of-line copy). In the profile-use phrase, we use this info to resolve the decls of COMDAT functions, and also make sure we only emit the out-of-line copy of the COMDAT function (even if it's not referenced in the module). This has been done and we are testing it now. I talked to Teresa about the her case some time back. It seems that merging the counters is an easy change with the above patch because we have the address of the out-of-line gcov_fn_info. We can do a simple in-memory merge if the checksum matches. The concern is that this may reduce the context sensitive information. -Rong > >> >>> > I have patch to track this. Moreover vforks seems to produce repeated >>> > merging of results. >>> >>> Aha, does this explain the gimp issue as well? >> >> Not really - we still need to debug why we hit cold section so many times >> with >> partitioning. I sitll think easier approach will be to lock the cold >> section and >> then start probably with testsuite (i.e. write script to compile the small >> testcases >> with FDO + partitioning and see what crash by hitting cold section). > > Ok, that is on my todo list. > >> >>> > >>> > Is it really necessary to run this from internal loop of the cfgcleanup? >>> >>> The reason I added it here is that just below there is a call to >>> verify_flow_info, and that will fail with the new verification. >> >> Hmm, OK, I suppose we run the cleanup after partitioning just once or twice, >> right? >> We can track this incrementally - I am not sure if we put it from the >> internal iteration >> loop we would get anything substantial either. >> Removing unreachable blocks twice is however ugly. > > When I was debugging the issue that led to this change I seemed to see > 1-2 iterations typically. Although I haven't measured it > scientifically. It would be good to revisit that and see if we can > pull both parts out of the loop, but as a separate patch. > >> >>> +/* Ensure that all hot bbs are included in a hot path through the >>> + procedure. This is done by calling this function twice, once >>> + with WALK_UP true (to look for paths from the entry to hot bbs) and >>> + once with WALK_UP false (to look for paths from hot bbs to the exit). >>> + Returns the updated value of COLD_BB_COUNT and adds newly-hot bbs >>> + to BBS_IN_HOT_PARTITION. */ >>> + >>> +static unsigned int >>> +sanitize_hot_paths (bool walk_up, unsigned int cold_bb_count, >>> +vec *bbs_in_hot_partition) >>> +{ >>> + /* Callers check this. */ >>> + gcc_checking_assert (cold_bb_count); >>> + >>> + /* Keep examining hot bbs while we stil
[Google] fix a bug in lipo varpool node linking
This patch fixed a bug in lipo varpool node linking. C++ FE drops the initializer if it's not used in this TU. For current varpool linking may resolve the varpool node to the one with null initializer. -Rong Index: l-ipo.c === --- l-ipo.c (revision 201800) +++ l-ipo.c (working copy) @@ -2151,6 +2151,19 @@ resolve_varpool_node (struct varpool_node **slot, return; } + if (DECL_INITIAL (decl1) && !DECL_INITIAL (decl2)) +{ + merge_addressable_attr (decl1, decl2); + return; +} + + if (!DECL_INITIAL (decl1) && DECL_INITIAL (decl2)) +{ + *slot = node; + merge_addressable_attr (decl2, decl1); + return; +} + /* Either all complete or neither's type is complete. Just pick the primary module's decl. */ if (!varpool_is_auxiliary (*slot))
[google gcc-4_8] Force cmd-line match for option -ansi in LIPO use
The following patch forces the command line match for -ansi option in LIPO use build. Otherwise, it gets various undefined symbol errors. This is exposed in LIPO random grouping test. Tested with google internal benchmarks and gcc regression test. 2013-07-30 Rong Xu * gcc/coverage.c (force_matching_cg_opts): require cmd line match on -ansi option in LIPO use build. Index: gcc/coverage.c === --- gcc/coverage.c (revision 201219) +++ gcc/coverage.c (working copy) @@ -263,6 +263,7 @@ static struct opt_desc force_matching_cg_opts[] = { "-fsized-delete", "-fno-sized-delete", false }, { "-frtti", "-fno-rtti", true }, { "-fstrict-aliasing", "-fno-strict-aliasing", true }, +{ "-ansi", "", false }, { NULL, NULL, false } };
Re: [Google] Fix profiledbootstrap failure
Will do. The patch was in gcc-4_7 by Dehao. r194713 | dehao | 2012-12-24 16:49:06 -0800 (Mon, 24 Dec 2012) | 5 lines Fix the profiled bootstrap: 1. Set the default value of gcov-debug to be 0. 2. Merge profile summaries from different instrumented binaries. On Tue, Jul 30, 2013 at 12:58 PM, Xinliang David Li wrote: > Ok. Rong, can you help commit the parameter default setting patch? > > thanks, > > David > > On Tue, Jul 30, 2013 at 12:48 PM, Rong Xu wrote: >> We have seen the issue before. It does fail the profile boostrap as it >> reads a wrong gcda file. >> I thought it had been fixed. (The fix was as David mentioned, setting >> the default value of the parameter to 0). >> >> -Rong >> >> On Tue, Jul 30, 2013 at 12:02 PM, Xinliang David Li >> wrote: >>> I need to understand why this affects profile bootstrap -- is this due >>> to file name conflict? >>> >>> The fix is wrong -- please do not remove the parameter. If it is a >>> problem, a better fix is to change the default parameter value to 0. >>> >>> David >>> >>> >>> On Tue, Jul 30, 2013 at 11:56 AM, Teresa Johnson >>> wrote: >>>> cc'ing Rong and David since this came from LIPO support. >>>> >>>> The patch as-is removes the one use of PARAM_GCOV_DEBUG (which is on >>>> by default) without removing the parameter itself. What is the failure >>>> mode you see from this code? >>>> >>>> Thanks, >>>> Teresa >>>> >>>> On Tue, Jul 30, 2013 at 11:50 AM, Dinar Temirbulatov >>>> wrote: >>>>> Hello >>>>> >>>>> This change allows to complete profiledbootstrap on the google gcc-4.8 >>>>> branch, tested with make bootstrap with no new regressions. OK for >>>>> google 4.8? >>>>>thanks, Dinar. >>>> >>>> >>>> >>>> -- >>>> Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413
Re: [Google] Fix profiledbootstrap failure
We have seen the issue before. It does fail the profile boostrap as it reads a wrong gcda file. I thought it had been fixed. (The fix was as David mentioned, setting the default value of the parameter to 0). -Rong On Tue, Jul 30, 2013 at 12:02 PM, Xinliang David Li wrote: > I need to understand why this affects profile bootstrap -- is this due > to file name conflict? > > The fix is wrong -- please do not remove the parameter. If it is a > problem, a better fix is to change the default parameter value to 0. > > David > > > On Tue, Jul 30, 2013 at 11:56 AM, Teresa Johnson wrote: >> cc'ing Rong and David since this came from LIPO support. >> >> The patch as-is removes the one use of PARAM_GCOV_DEBUG (which is on >> by default) without removing the parameter itself. What is the failure >> mode you see from this code? >> >> Thanks, >> Teresa >> >> On Tue, Jul 30, 2013 at 11:50 AM, Dinar Temirbulatov >> wrote: >>> Hello >>> >>> This change allows to complete profiledbootstrap on the google gcc-4.8 >>> branch, tested with make bootstrap with no new regressions. OK for >>> google 4.8? >>>thanks, Dinar. >> >> >> >> -- >> Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413