Re: [google gcc-4_9] Fix bad LIPO profile produced by gcov-tool

2015-12-01 Thread Rong Xu
This is only needed for gcov-tool as we need to rewrite the
moduel-info to the profile (this is only used in decompress)

The transitional compiler path does not need it because the string is
already decompressed. It only needs to use the strings.

gcov-dump in theory does not need it either if it only dumps the
strings. But now I added the printing of both lengths. So now it is
also needed.

On Tue, Dec 1, 2015 at 4:46 PM, Xinliang David Li  wrote:
> Not sure about this line:
>
> mod_info->combined_strlen = saved_compressed_len;
>
> This did not exist for the compiler path before.
>
> On Tue, Dec 1, 2015 at 4:34 PM, Rong Xu  wrote:
>>
>> Hi,
>>
>> This patch fixes the issue when using gcov-tool to merge LIPO profiles
>> after we compressing the module infomration . We should not decompress
>> the string as the compressed string should be written directly to the
>> profile later. Tested with some LIPO profiles.
>>
>> Thanks,
>>
>> -Rong
>
>


[google gcc-4_9] Fix bad LIPO profile produced by gcov-tool

2015-12-01 Thread Rong Xu
Hi,

This patch fixes the issue when using gcov-tool to merge LIPO profiles
after we compressing the module infomration . We should not decompress
the string as the compressed string should be written directly to the
profile later. Tested with some LIPO profiles.

Thanks,

-Rong
2015-12-01  Rong Xu  

* gcov-dump.c (tag_module_info): Dump string information.
* gcov-io.c (gcov_read_module_info): record combined_len
  and don't uncompress in gcov-tool.

Index: gcov-dump.c
===
--- gcov-dump.c (revision 231134)
+++ gcov-dump.c (working copy)
@@ -588,6 +588,11 @@ tag_module_info (const char *filename ATTRIBUTE_UN
 {
   if (!mod_info->is_primary)
printf ("%s\n", mod_info->source_filename);
+  unsigned short compressed_size = mod_info->combined_strlen;
+  unsigned short uncompressed_size = mod_info->combined_strlen>>16;
+  printf ("compressed_ strlen=%d uncompressed_strlen=%d String:\n",
+  compressed_size,uncompressed_size);
+  printf ("%s\n", mod_info->saved_cc1_strings);
 }
   else
 {
Index: gcov-io.c
===
--- gcov-io.c   (revision 231134)
+++ gcov-io.c   (working copy)
@@ -835,16 +835,18 @@ gcov_read_module_info (struct gcov_module_info *mo
   len -= (src_filename_len + 1);
 
   saved_compressed_len = (unsigned long) gcov_read_unsigned ();
-  saved_uncompressed_len  = saved_compressed_len >> 16;
-  saved_compressed_len &= 0x;
+  mod_info->combined_strlen = saved_compressed_len;
   tag_len = gcov_read_unsigned ();
   len -= (tag_len + 2);
   gcc_assert (!len);
   compressed_array = (char *) xmalloc (tag_len * sizeof (gcov_unsigned_t));
-  uncompressed_array = (char *) xmalloc (saved_uncompressed_len);
   for (i = 0; i < tag_len; i++)
 ((gcov_unsigned_t *) compressed_array)[i] = gcov_read_unsigned ();
 
+#if !defined (IN_GCOV_TOOL)
+  saved_uncompressed_len  = saved_compressed_len >> 16;
+  saved_compressed_len &= 0x;
+  uncompressed_array = (char *) xmalloc (saved_uncompressed_len);
   result_len = saved_uncompressed_len;
   uncompress ((Bytef *)uncompressed_array, &result_len,
   (const Bytef *)compressed_array, saved_compressed_len);
@@ -851,6 +853,9 @@ gcov_read_module_info (struct gcov_module_info *mo
   gcc_assert (result_len == saved_uncompressed_len);
   mod_info->saved_cc1_strings = uncompressed_array;
   free (compressed_array);
+#else /* IN_GCOV_TOOL: we don't need to uncompress. It's a pass through.  */
+  mod_info->saved_cc1_strings = compressed_array;
+#endif
 }
 #endif
 


Re: [google][gcc-4_9] encode and compress cc1 option strings in gcov_module_info

2015-10-06 Thread Rong Xu
Here is the patch set 2 that integrates David's comments. Note that
this uses the combined strlen (i.e. encoding compressed and
uncompressed strlen into one gcov_unsigned_t).

Testing is ongoing.

-Rong

On Tue, Oct 6, 2015 at 11:30 AM, Rong Xu  wrote:
> It's 1:3 to 1:4 in the programs I tested. But it really depends on how
> the options are used. I think your idea of using combined strlen works
> better.
> I just make the code a little clumsy but it does not cause any
> performance issue.
>
> On Tue, Oct 6, 2015 at 10:21 AM, Xinliang David Li  wrote:
>> On Tue, Oct 6, 2015 at 9:26 AM, Rong Xu  wrote:
>>> On Mon, Oct 5, 2015 at 5:33 PM, Xinliang David Li  
>>> wrote:
>>>>unsigned ggc_memory = gcov_read_unsigned ();
>>>> +  unsigned marker = 0, len = 0, k;
>>>> +  char **string_array, *saved_cc1_strings;
>>>> +
>>>>for (unsigned j = 0; j < 7; j++)
>>>>
>>>>
>>>> Do not use hard coded number. Use the enum defined in coverage.c.
>>>
>>> OK.
>>>
>>>>
>>>>
>>>> +string_array[j] = xstrdup (gcov_read_string ());
>>>> +
>>>> +  k = 0;
>>>> +  for (unsigned j = 1; j < 7; j++)
>>>>
>>>> Do not use hard coded number.
>>>
>>> OK.
>>>
>>>>
>>>>
>>>> +{
>>>> +  if (num_array[j] == 0)
>>>> +continue;
>>>> +  marker += num_array[j];
>>>>
>>>> It is better to read if the name of variable 'marker' is changed to
>>>> 'j_end' or something similar
>>>>
>>>> For all the substrings of 'j' kind, there should be just one marker,
>>>> right? It looks like here you introduce one marker per string, not one
>>>> marker per string kind.
>>>
>>> I don't understand what you meant here. "marker" is fixed for each j
>>> substring (one option kind) -- it the end index of the sub-string
>>> array. k-loop is for each string.
>>>
>>
>> That was a wrong comment from me. Discard it.
>>
>>>>
>>>> +  len += 3; /* [[  */
>>>>
>>>> Same here for hard coded value.
>>>>
>>>> +  for (; k < marker; k++)
>>>> +len += strlen (string_array[k]) + 1; /* 1 for delimter of ']' 
>>>>  */
>>>>
>>>> Why do we need one ']' per string?
>>>
>>> This is because the options strings can contain space '  '. I cannot
>>> use space as the delimiter, neither is \0 as it is the end of the
>>> string of the encoded string.
>>
>> Ok -- this allows you to avoid string copy during parsing.
>>>
>>>>
>>>>
>>>> +}
>>>> +  saved_cc1_strings = (char *) xmalloc (len + 1);
>>>> +  saved_cc1_strings[0] = 0;
>>>> +
>>>> +  marker = 0;
>>>> +  k = 0;
>>>> +  for (unsigned j = 1; j < 7; j++)
>>>>
>>>> Same here for 7.
>>>
>>> will fix in the new patch.
>>>
>>>>
>>>> +{
>>>> +  static const char lipo_string_flags[6] = {'Q', 'B', 'S',
>>>> 'D','I', 'C'};
>>>> +  if (num_array[j] == 0)
>>>> +continue;
>>>> +  marker += num_array[j];
>>>>
>>>> Suggest changing marker to j_end
>>> OK.
>>>
>>>>
>>>> +  sprintf (saved_cc1_strings, "%s[[%c", saved_cc1_strings,
>>>> +   lipo_string_flags[j - 1]);
>>>> +  for (; k < marker; k++)
>>>> +{
>>>> +  sprintf (saved_cc1_strings, "%s%s]", saved_cc1_strings,
>>>> +   string_array[k]);
>>>>
>>>> +#define DELIMTER"[["
>>>>
>>>> Why double '[' ?
>>> I will change to single '['.
>>>
>>>>
>>>> +#define DELIMTER2   "]"
>>>> +#define QUOTE_PATH_FLAG 'Q'
>>>> +#define BRACKET_PATH_FLAG   'B'
>>>> +#define SYSTEM_PATH_FLAG'S'
>>>> +#define D_U_OPTION_FLAG 'D&#

Re: [google][gcc-4_9] encode and compress cc1 option strings in gcov_module_info

2015-10-06 Thread Rong Xu
It's 1:3 to 1:4 in the programs I tested. But it really depends on how
the options are used. I think your idea of using combined strlen works
better.
I just make the code a little clumsy but it does not cause any
performance issue.

On Tue, Oct 6, 2015 at 10:21 AM, Xinliang David Li  wrote:
> On Tue, Oct 6, 2015 at 9:26 AM, Rong Xu  wrote:
>> On Mon, Oct 5, 2015 at 5:33 PM, Xinliang David Li  wrote:
>>>unsigned ggc_memory = gcov_read_unsigned ();
>>> +  unsigned marker = 0, len = 0, k;
>>> +  char **string_array, *saved_cc1_strings;
>>> +
>>>for (unsigned j = 0; j < 7; j++)
>>>
>>>
>>> Do not use hard coded number. Use the enum defined in coverage.c.
>>
>> OK.
>>
>>>
>>>
>>> +string_array[j] = xstrdup (gcov_read_string ());
>>> +
>>> +  k = 0;
>>> +  for (unsigned j = 1; j < 7; j++)
>>>
>>> Do not use hard coded number.
>>
>> OK.
>>
>>>
>>>
>>> +{
>>> +  if (num_array[j] == 0)
>>> +continue;
>>> +  marker += num_array[j];
>>>
>>> It is better to read if the name of variable 'marker' is changed to
>>> 'j_end' or something similar
>>>
>>> For all the substrings of 'j' kind, there should be just one marker,
>>> right? It looks like here you introduce one marker per string, not one
>>> marker per string kind.
>>
>> I don't understand what you meant here. "marker" is fixed for each j
>> substring (one option kind) -- it the end index of the sub-string
>> array. k-loop is for each string.
>>
>
> That was a wrong comment from me. Discard it.
>
>>>
>>> +  len += 3; /* [[  */
>>>
>>> Same here for hard coded value.
>>>
>>> +  for (; k < marker; k++)
>>> +len += strlen (string_array[k]) + 1; /* 1 for delimter of ']'  
>>> */
>>>
>>> Why do we need one ']' per string?
>>
>> This is because the options strings can contain space '  '. I cannot
>> use space as the delimiter, neither is \0 as it is the end of the
>> string of the encoded string.
>
> Ok -- this allows you to avoid string copy during parsing.
>>
>>>
>>>
>>> +}
>>> +  saved_cc1_strings = (char *) xmalloc (len + 1);
>>> +  saved_cc1_strings[0] = 0;
>>> +
>>> +  marker = 0;
>>> +  k = 0;
>>> +  for (unsigned j = 1; j < 7; j++)
>>>
>>> Same here for 7.
>>
>> will fix in the new patch.
>>
>>>
>>> +{
>>> +  static const char lipo_string_flags[6] = {'Q', 'B', 'S',
>>> 'D','I', 'C'};
>>> +  if (num_array[j] == 0)
>>> +continue;
>>> +  marker += num_array[j];
>>>
>>> Suggest changing marker to j_end
>> OK.
>>
>>>
>>> +  sprintf (saved_cc1_strings, "%s[[%c", saved_cc1_strings,
>>> +   lipo_string_flags[j - 1]);
>>> +  for (; k < marker; k++)
>>> +{
>>> +  sprintf (saved_cc1_strings, "%s%s]", saved_cc1_strings,
>>> +   string_array[k]);
>>>
>>> +#define DELIMTER"[["
>>>
>>> Why double '[' ?
>> I will change to single '['.
>>
>>>
>>> +#define DELIMTER2   "]"
>>> +#define QUOTE_PATH_FLAG 'Q'
>>> +#define BRACKET_PATH_FLAG   'B'
>>> +#define SYSTEM_PATH_FLAG'S'
>>> +#define D_U_OPTION_FLAG 'D'
>>> +#define INCLUDE_OPTION_FLAG 'I'
>>> +#define COMMAND_ARG_FLAG'C'
>>> +
>>> +enum lipo_cc1_string_kind {
>>> +  k_quote_paths = 0,
>>> +  k_bracket_paths,
>>> +  k_system_paths,
>>> +  k_cpp_defines,
>>> +  k_cpp_includes,
>>> +  k_lipo_cl_args,
>>> +  num_lipo_cc1_string_kind
>>> +};
>>> +
>>> +struct lipo_parsed_cc1_string {
>>> +  const char* source_filename;
>>> +  unsigned num[num_lipo_cc1_string_kind];
>>> +  char **strings[num_lipo_cc1_string_kind];
>>> +};
>>> +
>>> +struct lipo_parsed_cc1_string *
>>&g

Re: [google][gcc-4_9] encode and compress cc1 option strings in gcov_module_info

2015-10-06 Thread Rong Xu
On Mon, Oct 5, 2015 at 5:33 PM, Xinliang David Li  wrote:
>unsigned ggc_memory = gcov_read_unsigned ();
> +  unsigned marker = 0, len = 0, k;
> +  char **string_array, *saved_cc1_strings;
> +
>for (unsigned j = 0; j < 7; j++)
>
>
> Do not use hard coded number. Use the enum defined in coverage.c.

OK.

>
>
> +string_array[j] = xstrdup (gcov_read_string ());
> +
> +  k = 0;
> +  for (unsigned j = 1; j < 7; j++)
>
> Do not use hard coded number.

OK.

>
>
> +{
> +  if (num_array[j] == 0)
> +continue;
> +  marker += num_array[j];
>
> It is better to read if the name of variable 'marker' is changed to
> 'j_end' or something similar
>
> For all the substrings of 'j' kind, there should be just one marker,
> right? It looks like here you introduce one marker per string, not one
> marker per string kind.

I don't understand what you meant here. "marker" is fixed for each j
substring (one option kind) -- it the end index of the sub-string
array. k-loop is for each string.

>
> +  len += 3; /* [[  */
>
> Same here for hard coded value.
>
> +  for (; k < marker; k++)
> +len += strlen (string_array[k]) + 1; /* 1 for delimter of ']'  */
>
> Why do we need one ']' per string?

This is because the options strings can contain space '  '. I cannot
use space as the delimiter, neither is \0 as it is the end of the
string of the encoded string.

>
>
> +}
> +  saved_cc1_strings = (char *) xmalloc (len + 1);
> +  saved_cc1_strings[0] = 0;
> +
> +  marker = 0;
> +  k = 0;
> +  for (unsigned j = 1; j < 7; j++)
>
> Same here for 7.

will fix in the new patch.

>
> +{
> +  static const char lipo_string_flags[6] = {'Q', 'B', 'S',
> 'D','I', 'C'};
> +  if (num_array[j] == 0)
> +continue;
> +  marker += num_array[j];
>
> Suggest changing marker to j_end
OK.

>
> +  sprintf (saved_cc1_strings, "%s[[%c", saved_cc1_strings,
> +   lipo_string_flags[j - 1]);
> +  for (; k < marker; k++)
> +{
> +  sprintf (saved_cc1_strings, "%s%s]", saved_cc1_strings,
> +   string_array[k]);
>
> +#define DELIMTER"[["
>
> Why double '[' ?
I will change to single '['.

>
> +#define DELIMTER2   "]"
> +#define QUOTE_PATH_FLAG 'Q'
> +#define BRACKET_PATH_FLAG   'B'
> +#define SYSTEM_PATH_FLAG'S'
> +#define D_U_OPTION_FLAG 'D'
> +#define INCLUDE_OPTION_FLAG 'I'
> +#define COMMAND_ARG_FLAG'C'
> +
> +enum lipo_cc1_string_kind {
> +  k_quote_paths = 0,
> +  k_bracket_paths,
> +  k_system_paths,
> +  k_cpp_defines,
> +  k_cpp_includes,
> +  k_lipo_cl_args,
> +  num_lipo_cc1_string_kind
> +};
> +
> +struct lipo_parsed_cc1_string {
> +  const char* source_filename;
> +  unsigned num[num_lipo_cc1_string_kind];
> +  char **strings[num_lipo_cc1_string_kind];
> +};
> +
> +struct lipo_parsed_cc1_string *
> +lipo_parse_saved_cc1_string (const char *src, char *str,
> +bool parse_cl_args_only);
> +void free_parsed_string (struct lipo_parsed_cc1_string *string);
> +
>
> Declare above in a header file.

OK.

>
>
>  /* Returns true if the command-line arguments stored in the given 
> module-infos
> are incompatible.  */
>  bool
> -incompatible_cl_args (struct gcov_module_info* mod_info1,
> -  struct gcov_module_info* mod_info2)
> +incompatible_cl_args (struct lipo_parsed_cc1_string* mod_info1,
> +  struct lipo_parsed_cc1_string* mod_info2)
>
> Fix formating.
OK.
>
>  {
>  {
> @@ -1647,7 +1679,7 @@ build_var (tree fn_decl, tree type, int counter)
>  /* Creates the gcov_fn_info RECORD_TYPE.  */
>
>NULL_TREE, get_gcov_unsigned_t ());
>DECL_CHAIN (field) = fields;
>fields = field;
>
> -  /* Num bracket paths  */
> +  /* cc1_uncompressed_strlen field */
>field = build_decl (BUILTINS_LOCATION, FIELD_DECL,
>NULL_TREE, get_gcov_unsigned_t ());
>DECL_CHAIN (field) = fields;
>fields = field;
>
>
> Why do we need to store uncompressed string length? If there is need
> to do that, I suggest combine uncompressed length and compressed
> length into one 32bit integer. (16/16, or 17/15 split)

In theory, I don't need the uncompressed

[google][gcc-4_9] encode and compress cc1 option strings in gcov_module_info

2015-10-05 Thread Rong Xu
Hi,

This patch is for google branch only.

It encodes and compresses various cc1 option strings in
gcov_module_info to reduce the lipo instrumented object size. The
savings are from string compression and the reduced number of
relocations.

More specifically, we replace the following fields in gcov_module_info
  gcov_unsigned_t num_quote_paths;
  gcov_unsigned_t num_bracket_paths;
  gcov_unsigned_t num_system_paths;
  gcov_unsigned_t num_cpp_defines;
  gcov_unsigned_t num_cpp_includes;
  gcov_unsigned_t num_cl_args;
  char *string_array[1];
with
  gcov_unsigned_t cc1_strlen;
  gcov_unsigned_t cc1_uncompressed_strlen;
  char *saved_cc1_strings;

The new saved_cc1_strings are zlib compressed string.

Tested with google internal benchmarks.

Thanks,

-Rong
2015-10-05  Rong Xu  

* gcc/Makefile.in (gcov-dump): link with zlib
(gcov-tool): Ditto.
* gcc/auto-profile.c (autofdo_module_profile::read): convert old
gcov_module_info to the new format.
(read_aux_modules): using new incompatible_cl_args interface.
* gcc/coverage.c (zlib.h): new include.
(enum lipo_cc1_string_kind): new define.
(struct lipo_parsed_cc1_string): new define.
(incompatible_cl_args): change the paramter.
(read_counts_file): using new gcov_module_info.
(build_fn_info_type): remove used parameter.
(build_gcov_module_info_type): using new gcov_module_info.
(free_parsed_string): new function.
(find_substr): ditto.
(lipo_parse_saved_cc1_string): ditto.
(lipo_append_tag): ditto.
(build_gcov_module_info_value): using new gcov_module_info.
(coverage_obj_init): remove used parameter.
(add_module_info): using new gcov_module_info.
(process_include): ditto.
(process_include_paths_1): ditto.
(process_include_paths): ditto.
(set_lipo_c_parsing_context): ditto.
* gcc/coverage.h: add new decls.
* gcc/gcov-io.c (zlib.h) new include.
(gcov_read_module_info): using new gcov_module_info.
* gcc/gcov-io.h (struct gcov_module_info): new gcov_module_info.
* libgcc/dyn-ipa.c: delete unused code.
* libgcc/libgcov-driver.c (gcov_write_module_info): using new
gcov_module_info.

Index: gcc/Makefile.in
===
--- gcc/Makefile.in (revision 228354)
+++ gcc/Makefile.in (working copy)
@@ -2583,7 +2583,7 @@ gcov$(exeext): $(GCOV_OBJS) $(LIBDEPS)
 GCOV_DUMP_OBJS = gcov-dump.o vec.o ggc-none.o
 gcov-dump$(exeext): $(GCOV_DUMP_OBJS) $(LIBDEPS)
+$(LINKER) $(ALL_LINKERFLAGS) $(LDFLAGS) $(GCOV_DUMP_OBJS) \
-   $(LIBS) -o $@
+   $(LIBS)  $(ZLIB) -o $@
 
 GCOV_TOOL_DEP_FILES = $(srcdir)/../libgcc/libgcov-util.c gcov-io.c 
$(GCOV_IO_H) \
   $(srcdir)/../libgcc/libgcov-driver.c 
$(srcdir)/../libgcc/libgcov-driver-system.c \
@@ -2607,7 +2607,7 @@ gcov-tool-params.o: params.c $(CONFIG_H) $(SYSTEM_
 GCOV_TOOL_OBJS = gcov-tool.o libgcov-util.o libgcov-driver-tool.o \
 libgcov-merge-tool.o gcov-tool-dyn-ipa.o gcov-tool-params.o
 gcov-tool$(exeext): $(GCOV_TOOL_OBJS) $(LIBDEPS)
-   +$(LINKER) $(ALL_LINKERFLAGS) $(LDFLAGS) $(GCOV_TOOL_OBJS) $(LIBS) -o $@
+   +$(LINKER) $(ALL_LINKERFLAGS) $(LDFLAGS) $(GCOV_TOOL_OBJS) $(LIBS)  
$(ZLIB) -o $@
 #
 # Build the include directories.  The stamp files are stmp-* rather than
 # s-* so that mostlyclean does not force the include directory to
Index: gcc/auto-profile.c
===
--- gcc/auto-profile.c  (revision 228354)
+++ gcc/auto-profile.c  (working copy)
@@ -947,7 +947,6 @@ autofdo_source_profile::get_function_instance_by_i
   return s;
 }
 
-
 /* Member functions for autofdo_module_profile.  */
 
 bool
@@ -972,6 +971,9 @@ autofdo_module_profile::read ()
   unsigned exported = gcov_read_unsigned ();
   unsigned lang = gcov_read_unsigned ();
   unsigned ggc_memory = gcov_read_unsigned ();
+  unsigned marker = 0, len = 0, k;
+  char **string_array, *saved_cc1_strings;
+
   for (unsigned j = 0; j < 7; j++)
 {
   num_array[j] = gcov_read_unsigned ();
@@ -988,19 +990,48 @@ autofdo_module_profile::read ()
   module->ident = i + 1;
   module->lang = lang;
   module->ggc_memory = ggc_memory;
-  module->num_quote_paths = num_array[1];
-  module->num_bracket_paths = num_array[2];
-  module->num_system_paths = num_array[3];
-  module->num_cpp_defines = num_array[4];
-  module->num_cpp_includes = num_array[5];
-  module->num_cl_args = num_array[6];
   module->source_filename = name;
   module->is_primary = strcmp (name, in_fnames[0]) == 0;
   module->flags = module->is_primary ? exported : 1;
   for (unsigned j = 0; j < num_array[0]; j++)
 ret.first->second.first.safe_push (xstrdup (gcov_read_strin

Re: [google][gcc-4_9] Remove unused key field in gcov_fn_info

2015-09-29 Thread Rong Xu
You are right. I attached the updated patch to this email.

On Tue, Sep 29, 2015 at 3:10 PM, Xinliang David Li  wrote:
>else
>  {
>gfi_ptr = gi_ptr->functions[f_ix];
> -  if (gfi_ptr && gfi_ptr->key == gi_ptr)
> +  if (gfi_ptr)
>  length = GCOV_TAG_FUNCTION_LENGTH;
> -  else
> -length = 0;
>  }
>
> The removal of 'else' path seems wrong.
>
> David
>
>
> On Tue, Sep 29, 2015 at 1:46 PM, Rong Xu  wrote:
>> Hi,
>>
>> This patch is for google/gcc-4_9 branch.
>>
>> The 'key' field in gcov_fn_info is designed to allow gcov function
>> data to be COMDATTed, but the comdat elimination never works. This
>> patch removes this field to reduce the instrumented object size.
>>
>> Thanks,
>>
>> -Rong
Removed the unused 'key' field in gcov_fn_info to reduce the 
instrumented objects size.

2015-09-29  Rong Xu  

* gcc/coverage.c (build_fn_info_type): Remove 'key'
field. (build_fn_info): Ditto.
(coverage_obj_fn): Ditto.
* libgcc/libgcov.h (struct gcov_fn_info): Ditto.
* libgcc/libgcov-driver.c (gcov_compute_histogram): Ditto.
(gcov_exit_compute_summary): Ditto.
(gcov_exit_merge_gcda): Ditto.
(gcov_write_func_counters): Ditto.
(gcov_clear): Ditto.
* libgcc/libgcov-util.c (tag_function): Ditto.
(gcov_merge): Ditto.
(gcov_profile_scale): Ditto.
(gcov_profile_normalize): Ditto.
(compute_one_gcov): Ditto.
(gcov_info_count_all_cold): Ditto.

Index: gcc/coverage.c
===
--- gcc/coverage.c  (revision 228223)
+++ gcc/coverage.c  (working copy)
@@ -189,7 +189,7 @@ static void read_counts_file (const char *, unsign
 static tree build_var (tree, tree, int);
 static void build_fn_info_type (tree, unsigned, tree);
 static void build_info_type (tree, tree);
-static tree build_fn_info (const struct coverage_data *, tree, tree);
+static tree build_fn_info (const struct coverage_data *, tree);
 static tree build_info (tree, tree);
 static bool coverage_obj_init (void);
 static vec *coverage_obj_fn
@@ -1668,16 +1668,9 @@ build_fn_info_type (tree type, unsigned counters,
 
   finish_builtin_struct (ctr_info, "__gcov_ctr_info", fields, NULL_TREE);
 
-  /* key */
-  field = build_decl (BUILTINS_LOCATION, FIELD_DECL, NULL_TREE,
- build_pointer_type (build_qualified_type
- (gcov_info_type, TYPE_QUAL_CONST)));
-  fields = field;
-
   /* ident */
   field = build_decl (BUILTINS_LOCATION, FIELD_DECL, NULL_TREE,
  get_gcov_unsigned_t ());
-  DECL_CHAIN (field) = fields;
   fields = field;
 
   /* lineno_checksum */
@@ -1705,10 +1698,10 @@ build_fn_info_type (tree type, unsigned counters,
 
 /* Returns a CONSTRUCTOR for a gcov_fn_info.  DATA is
the coverage data for the function and TYPE is the gcov_fn_info
-   RECORD_TYPE.  KEY is the object file key.  */
+   RECORD_TYPE.  */
 
 static tree
-build_fn_info (const struct coverage_data *data, tree type, tree key)
+build_fn_info (const struct coverage_data *data, tree type)
 {
   tree fields = TYPE_FIELDS (type);
   tree ctr_type;
@@ -1716,11 +1709,6 @@ static tree
   vec *v1 = NULL;
   vec *v2 = NULL;
 
-  /* key */
-  CONSTRUCTOR_APPEND_ELT (v1, fields,
- build1 (ADDR_EXPR, TREE_TYPE (fields), key));
-  fields = DECL_CHAIN (fields);
-  
   /* ident */
   CONSTRUCTOR_APPEND_ELT (v1, fields,
  build_int_cstu (get_gcov_unsigned_t (),
@@ -2556,7 +2544,7 @@ static vec *
 coverage_obj_fn (vec *ctor, tree fn,
 struct coverage_data const *data)
 {
-  tree init = build_fn_info (data, gcov_fn_info_type, gcov_info_var);
+  tree init = build_fn_info (data, gcov_fn_info_type);
   tree var = build_var (fn, gcov_fn_info_type, -1);
   
   DECL_INITIAL (var) = init;
Index: libgcc/libgcov-driver.c
===
--- libgcc/libgcov-driver.c (revision 227984)
+++ libgcc/libgcov-driver.c (working copy)
@@ -380,7 +380,7 @@ gcov_compute_histogram (struct gcov_summary *sum)
 {
   gfi_ptr = gi_ptr->functions[f_ix];
 
-  if (!gfi_ptr || gfi_ptr->key != gi_ptr)
+  if (!gfi_ptr)
 continue;
 
   ci_ptr = &gfi_ptr->ctrs[ctr_info_ix];
@@ -430,9 +430,6 @@ gcov_exit_compute_summary (struct gcov_summary *th
 {
   gfi_ptr = gi_ptr->functions[f_ix];
 
-  if (gfi_ptr && gfi_ptr->key != gi_ptr)
-gfi_ptr = 0;
-
   crc32 = crc32_unsigned (crc32, gfi_ptr ? gfi_ptr->cfg_checksum : 0);
   crc32 = crc32_unsigned (crc32,
   gfi_ptr ?

[google][gcc-4_9] Remove unused key field in gcov_fn_info

2015-09-29 Thread Rong Xu
Hi,

This patch is for google/gcc-4_9 branch.

The 'key' field in gcov_fn_info is designed to allow gcov function
data to be COMDATTed, but the comdat elimination never works. This
patch removes this field to reduce the instrumented object size.

Thanks,

-Rong
Removed the unused 'key' field in gcov_fn_info to reduce the 
instrumented objects size.

2015-09-29  Rong Xu  

* gcc/coverage.c (build_fn_info_type): Remove 'key'
field. (build_fn_info): Ditto.
(coverage_obj_fn): Ditto.
* libgcc/libgcov.h (struct gcov_fn_info): Ditto.
* libgcc/libgcov-driver.c (gcov_compute_histogram): Ditto.
(gcov_exit_compute_summary): Ditto.
(gcov_exit_merge_gcda): Ditto.
(gcov_write_func_counters): Ditto.
(gcov_clear): Ditto.
* libgcc/libgcov-util.c (tag_function): Ditto.
(gcov_merge): Ditto.
(gcov_profile_scale): Ditto.
(gcov_profile_normalize): Ditto.
(compute_one_gcov): Ditto.
(gcov_info_count_all_cold): Ditto.

Index: gcc/coverage.c
===
--- gcc/coverage.c  (revision 228223)
+++ gcc/coverage.c  (working copy)
@@ -189,7 +189,7 @@ static void read_counts_file (const char *, unsign
 static tree build_var (tree, tree, int);
 static void build_fn_info_type (tree, unsigned, tree);
 static void build_info_type (tree, tree);
-static tree build_fn_info (const struct coverage_data *, tree, tree);
+static tree build_fn_info (const struct coverage_data *, tree);
 static tree build_info (tree, tree);
 static bool coverage_obj_init (void);
 static vec *coverage_obj_fn
@@ -1668,16 +1668,9 @@ build_fn_info_type (tree type, unsigned counters,
 
   finish_builtin_struct (ctr_info, "__gcov_ctr_info", fields, NULL_TREE);
 
-  /* key */
-  field = build_decl (BUILTINS_LOCATION, FIELD_DECL, NULL_TREE,
- build_pointer_type (build_qualified_type
- (gcov_info_type, TYPE_QUAL_CONST)));
-  fields = field;
-
   /* ident */
   field = build_decl (BUILTINS_LOCATION, FIELD_DECL, NULL_TREE,
  get_gcov_unsigned_t ());
-  DECL_CHAIN (field) = fields;
   fields = field;
 
   /* lineno_checksum */
@@ -1705,10 +1698,10 @@ build_fn_info_type (tree type, unsigned counters,
 
 /* Returns a CONSTRUCTOR for a gcov_fn_info.  DATA is
the coverage data for the function and TYPE is the gcov_fn_info
-   RECORD_TYPE.  KEY is the object file key.  */
+   RECORD_TYPE.  */
 
 static tree
-build_fn_info (const struct coverage_data *data, tree type, tree key)
+build_fn_info (const struct coverage_data *data, tree type)
 {
   tree fields = TYPE_FIELDS (type);
   tree ctr_type;
@@ -1716,11 +1709,6 @@ static tree
   vec *v1 = NULL;
   vec *v2 = NULL;
 
-  /* key */
-  CONSTRUCTOR_APPEND_ELT (v1, fields,
- build1 (ADDR_EXPR, TREE_TYPE (fields), key));
-  fields = DECL_CHAIN (fields);
-  
   /* ident */
   CONSTRUCTOR_APPEND_ELT (v1, fields,
  build_int_cstu (get_gcov_unsigned_t (),
@@ -2556,7 +2544,7 @@ static vec *
 coverage_obj_fn (vec *ctor, tree fn,
 struct coverage_data const *data)
 {
-  tree init = build_fn_info (data, gcov_fn_info_type, gcov_info_var);
+  tree init = build_fn_info (data, gcov_fn_info_type);
   tree var = build_var (fn, gcov_fn_info_type, -1);
   
   DECL_INITIAL (var) = init;
Index: libgcc/libgcov-driver.c
===
--- libgcc/libgcov-driver.c (revision 227984)
+++ libgcc/libgcov-driver.c (working copy)
@@ -380,7 +380,7 @@ gcov_compute_histogram (struct gcov_summary *sum)
 {
   gfi_ptr = gi_ptr->functions[f_ix];
 
-  if (!gfi_ptr || gfi_ptr->key != gi_ptr)
+  if (!gfi_ptr)
 continue;
 
   ci_ptr = &gfi_ptr->ctrs[ctr_info_ix];
@@ -430,9 +430,6 @@ gcov_exit_compute_summary (struct gcov_summary *th
 {
   gfi_ptr = gi_ptr->functions[f_ix];
 
-  if (gfi_ptr && gfi_ptr->key != gi_ptr)
-gfi_ptr = 0;
-
   crc32 = crc32_unsigned (crc32, gfi_ptr ? gfi_ptr->cfg_checksum : 0);
   crc32 = crc32_unsigned (crc32,
   gfi_ptr ? gfi_ptr->lineno_checksum : 0);
@@ -688,7 +685,7 @@ gcov_exit_merge_gcda (struct gcov_info *gi_ptr,
   if (length != GCOV_TAG_FUNCTION_LENGTH)
 goto read_mismatch;
 
-  if (!gfi_ptr || gfi_ptr->key != gi_ptr)
+  if (!gfi_ptr)
 {
   /* This function appears in the other program.  We
  need to buffer the information in order to write
@@ -832,10 +829,8 @@ gcov_write_func_counters (struct gcov_info *gi_ptr
   else
 {
   gfi_ptr = gi_ptr->functions[f_ix];
-  if (gfi_ptr && gfi_ptr->key == gi_ptr)
+  if (gfi_pt

[PATCH] multi-target indirect-call promotion

2014-11-17 Thread Rong Xu
Hi,

This patch implements the use phase of indirect-call-topn-profile that
promotes multiple targets of indirect-calls. It addresses pr/45631.

The main idea is to use current speculation framework, with a vector
of direct edges (sorted by the probability).  The trick part is the we
have multiple edges associating withe same call_stmt.
Resolve_speculation and edirect_call_stmt_to_callee are more complex
now.
This patch still has some issues. It runs into an assertion
verify_cgraph_node assertion() when ruing 483.xalancbmk in spec2006.

I think Honza should be able to quickly spot the issue and give
valuable suggestions.

Thanks,

-Rong


indirect_call_topn_profile_use_patch
Description: Binary data


[google gcc-4.9] FDO build for linux kernel

2014-11-13 Thread Rong Xu
Here is patch that ports our work on FDO linux kernel build support to
gcc-4_9. With this patch, kernel will use the libgcov functions to
dump the gcda files.

This patch also enables LIPO build. But the module grouping is not
computed online. We will use gcov-tool to do this offline.

Tested with kernel FDO build, regression and google internal benchmarks.

Thanks,

-Rong


kfdo_patch
Description: Binary data


[google gcc-4_9] make ifunc support available for BIONIC

2014-10-27 Thread Rong Xu
ifunc support is hard-coded as false for BIONIC. This patch removes
this check and let
configure decide whether it should have ifunc support.

Thanks,

-Rong


ifunc_diff
Description: Binary data


Re: [PATCH PR63581] Fix undefined references in debug_info

2014-10-17 Thread Rong Xu
Here is the diff for ChangeLog

Index: ChangeLog
===
--- ChangeLog (revision 216415)
+++ ChangeLog (working copy)
@@ -1,3 +1,8 @@
+2014-10-17
+
+ * cfgrtl.c (emit_barrier_after_bb): Append the barrier to the
+ footer, instead of unconditionally overwritten.
+
 2014-10-17  Eric Botcazou  

  * ipa-inline-transform.c (master_clone_with_noninline_clones_p): New.
Index: testsuite/ChangeLog
===
--- testsuite/ChangeLog (revision 216415)
+++ testsuite/ChangeLog (working copy)
@@ -1,3 +1,6 @@
+2014-10-17
+ * g++.dg/tree-prof/pr63581.C: New test.
+
 2014-10-17  Marek Polacek  

  PR c/63543



On Fri, Oct 17, 2014 at 2:54 PM, Rong Xu  wrote:
> Hi,
>
> The attached patch fixes PR63581. The diagnosis is in the bug report.
> Google ref b/17759776.
>
> Tested with bootstrap and regression.
>
> Thanks,
>
> -Rong


[PATCH PR63581] Fix undefined references in debug_info

2014-10-17 Thread Rong Xu
Hi,

The attached patch fixes PR63581. The diagnosis is in the bug report.
Google ref b/17759776.

Tested with bootstrap and regression.

Thanks,

-Rong


63581_patch
Description: Binary data


[PATCH] move gcov-dump functionality to gcov-tool

2014-10-08 Thread Rong Xu
Hi

This patch moves the gcov-dump functionality to gcov-tool (which is
installed by default).
The options are exactly the same as before. The difference is instead of calling
"gcov-dump ...", we now call "gcov-tool dump ..."

gcov-dump is useful in debugging fdo issues. I think it would be very
helpful to make it available with the compiler.

gcov-dump.c and the build construct are deleted with this patch.

Tested with bootstrap and some spec2006 profiles.

Thanks,

-Rong


gcov_tool_dump_patch
Description: Binary data


Re: [PATCH] add overlap function to gcov-tool

2014-10-08 Thread Rong Xu
On Tue, Oct 7, 2014 at 9:31 PM, Jan Hubicka  wrote:
>> Hi,
>>
>> This patch adds overlap functionality to gcov-tool. The overlap score
>> estimates the similarity of two profiles. Currently it only computes
>> overlap for arc counters.
>>
>> The overlap score is defined as
>> \sum minimum (p1-counter[i] / p1-sum-all, p2-counter[i] / p2-sum-all)
>> where p1-counter[i] and p2-counter[2] are two matched counter from
>> profile1 and profiler2.
>> p1-sum-all and p2-sum-all are the sum-all counters in profiler1 and
>> profile2, repetitively.
>
> The patch looks fine in general.  My statistics is all rusty, but can't we use
> one of the established techniques like Kullback-Leibler to compare the
> probabilitis distributions?

Interesting. I never thought of using Kullback-Leibler divergence.
It's very easy to switch to KL using this overlap framework -- only a
few lines of change.
The problem is KL divergence assumes absolute continuity (i.e. q(i)
==0 --> p(i) =0, which is
not true in our distribution, I'm not sure how to work around this.).

I did try earth-mover-distance (EMD) in our earlier internal version.
But since I used
uniform distance, the problem can be simplified to distribution diffs.

> It would be also nice to have ability to compare
> branch probabilities in btween train runs.

Do you mean to do the comparison in CFG rather on the raw counters?
We need gcno file to reconstruct the CFG. That needs some work.

-Rong

>
> Honza
>>
>> The resulting score is a value ranging from 0.0 to 1.0 where 0.0 means
>> no match and 1.0 mean a perfect match.
>>
>> This tool can be used in performance triaging and reducing the fdo
>> training set size (where similar inputs can be pruned).
>>
>> Tested with spec2006 profiles.
>>
>> Thanks,
>>
>> -Rong
>
>> 2014-10-07  Rong Xu  
>>
>>   * gcc/gcov-tool.c (profile_overlap): New driver function
>> to compute profile overlap.
>>   (print_overlap_usage_message): New.
>>   (overlap_usage): New.
>>   (do_overlap): New.
>>   (print_usage): Add calls to overlap function.
>>   (main): Ditto.
>>   * libgcc/libgcov-util.c (read_gcda_file): Fix format.
>>   (find_match_gcov_info): Ditto.
>>   (calculate_2_entries): New.
>>   (compute_one_gcov): Ditto.
>>   (gcov_info_count_all_cold): Ditto.
>>   (gcov_info_count_all_zero): Ditto.
>>   (extract_file_basename): Ditto.
>>   (get_file_basename): Ditto.
>>   (set_flag): Ditto.
>>   (matched_gcov_info): Ditto.
>>   (calculate_overlap): Ditto.
>>   (gcov_profile_overlap): Ditto.
>>   * libgcc/libgcov-driver.c (compute_summary): Make
>> it avavilable for external calls.
>>   * gcc/doc/gcov-tool.texi: Add documentation.
>>
>> Index: gcc/gcov-tool.c
>> ===
>> --- gcc/gcov-tool.c   (revision 215981)
>> +++ gcc/gcov-tool.c   (working copy)
>> @@ -39,6 +39,7 @@ see the files COPYING3 and COPYING.RUNTIME respect
>>  #include 
>>
>>  extern int gcov_profile_merge (struct gcov_info*, struct gcov_info*, int, 
>> int);
>> +extern int gcov_profile_overlap (struct gcov_info*, struct gcov_info*);
>>  extern int gcov_profile_normalize (struct gcov_info*, gcov_type);
>>  extern int gcov_profile_scale (struct gcov_info*, float, int, int);
>>  extern struct gcov_info* gcov_read_profile_dir (const char*, int);
>> @@ -368,6 +369,121 @@ do_rewrite (int argc, char **argv)
>>return ret;
>>  }
>>
>> +/* Driver function to computer the overlap score b/w profile D1 and D2.
>> +   Return 1 on error and 0 if OK.  */
>> +
>> +static int
>> +profile_overlap (const char *d1, const char *d2)
>> +{
>> +  struct gcov_info *d1_profile;
>> +  struct gcov_info *d2_profile;
>> +
>> +  d1_profile = gcov_read_profile_dir (d1, 0);
>> +  if (!d1_profile)
>> +return 1;
>> +
>> +  if (d2)
>> +{
>> +  d2_profile = gcov_read_profile_dir (d2, 0);
>> +  if (!d2_profile)
>> +return 1;
>> +
>> +  return gcov_profile_overlap (d1_profile, d2_profile);
>> +}
>> +
>> +  return 1;
>> +}
>> +
>> +/* Usage message for profile overlap.  */
>> +
>> +static void
>> +print_overlap_usage_message (int error_p)
>> +{
>> +  FILE *file = error_p ? stderr : stdout;
>> +
>> +  fnotice (file, "  overlap [options] Compute the 
>> overlap of two profiles\n"

[PATCH] add overlap function to gcov-tool

2014-10-07 Thread Rong Xu
Hi,

This patch adds overlap functionality to gcov-tool. The overlap score
estimates the similarity of two profiles. Currently it only computes
overlap for arc counters.

The overlap score is defined as
\sum minimum (p1-counter[i] / p1-sum-all, p2-counter[i] / p2-sum-all)
where p1-counter[i] and p2-counter[2] are two matched counter from
profile1 and profiler2.
p1-sum-all and p2-sum-all are the sum-all counters in profiler1 and
profile2, repetitively.

The resulting score is a value ranging from 0.0 to 1.0 where 0.0 means
no match and 1.0 mean a perfect match.

This tool can be used in performance triaging and reducing the fdo
training set size (where similar inputs can be pruned).

Tested with spec2006 profiles.

Thanks,

-Rong
2014-10-07  Rong Xu  

* gcc/gcov-tool.c (profile_overlap): New driver function
to compute profile overlap. 
(print_overlap_usage_message): New.
(overlap_usage): New.
(do_overlap): New.
(print_usage): Add calls to overlap function.
(main): Ditto.
* libgcc/libgcov-util.c (read_gcda_file): Fix format.
(find_match_gcov_info): Ditto.
(calculate_2_entries): New.
(compute_one_gcov): Ditto.
(gcov_info_count_all_cold): Ditto.
(gcov_info_count_all_zero): Ditto.
(extract_file_basename): Ditto.
(get_file_basename): Ditto.
(set_flag): Ditto.
(matched_gcov_info): Ditto.
(calculate_overlap): Ditto.
(gcov_profile_overlap): Ditto.
* libgcc/libgcov-driver.c (compute_summary): Make
it avavilable for external calls.
* gcc/doc/gcov-tool.texi: Add documentation.

Index: gcc/gcov-tool.c
===
--- gcc/gcov-tool.c (revision 215981)
+++ gcc/gcov-tool.c (working copy)
@@ -39,6 +39,7 @@ see the files COPYING3 and COPYING.RUNTIME respect
 #include 
 
 extern int gcov_profile_merge (struct gcov_info*, struct gcov_info*, int, int);
+extern int gcov_profile_overlap (struct gcov_info*, struct gcov_info*);
 extern int gcov_profile_normalize (struct gcov_info*, gcov_type);
 extern int gcov_profile_scale (struct gcov_info*, float, int, int);
 extern struct gcov_info* gcov_read_profile_dir (const char*, int);
@@ -368,6 +369,121 @@ do_rewrite (int argc, char **argv)
   return ret;
 }
 
+/* Driver function to computer the overlap score b/w profile D1 and D2.
+   Return 1 on error and 0 if OK.  */
+
+static int
+profile_overlap (const char *d1, const char *d2)
+{
+  struct gcov_info *d1_profile;
+  struct gcov_info *d2_profile;
+
+  d1_profile = gcov_read_profile_dir (d1, 0);
+  if (!d1_profile)
+return 1;
+
+  if (d2)
+{
+  d2_profile = gcov_read_profile_dir (d2, 0);
+  if (!d2_profile)
+return 1;
+
+  return gcov_profile_overlap (d1_profile, d2_profile);
+}
+
+  return 1;
+}
+
+/* Usage message for profile overlap.  */
+
+static void
+print_overlap_usage_message (int error_p)
+{
+  FILE *file = error_p ? stderr : stdout;
+
+  fnotice (file, "  overlap [options] Compute the overlap 
of two profiles\n");
+  fnotice (file, "-v, --verbose   Verbose mode\n");
+  fnotice (file, "-h, --hotonly   Only print info for 
hot objects/functions\n");
+  fnotice (file, "-f, --function  Print function level 
info\n");
+  fnotice (file, "-F, --fullname  Print full 
filename\n");
+  fnotice (file, "-o, --objectPrint object level 
info\n");
+  fnotice (file, "-t , --hot_threshold  Set the threshold 
for hotness\n");
+
+}
+
+static const struct option overlap_options[] =
+{
+  { "verbose",no_argument,   NULL, 'v' },
+  { "function",   no_argument,   NULL, 'f' },
+  { "fullname",   no_argument,   NULL, 'F' },
+  { "object", no_argument,   NULL, 'o' },
+  { "hotonly",no_argument,   NULL, 'h' },
+  { "hot_threshold",  required_argument, NULL, 't' },
+  { 0, 0, 0, 0 }
+};
+
+/* Print overlap usage and exit.  */
+
+static void
+overlap_usage (void)
+{
+  fnotice (stderr, "Overlap subcomand usage:");
+  print_overlap_usage_message (true);
+  exit (FATAL_EXIT_CODE);
+}
+
+int overlap_func_level;
+int overlap_obj_level;
+int overlap_hot_only;
+int overlap_use_fullname;
+double overlap_hot_threshold = 0.005;
+
+/* Driver for profile overlap sub-command.  */
+
+static int
+do_overlap (int argc, char **argv)
+{
+  int opt;
+  int ret;
+
+  optind = 0;
+  while ((opt = getopt_long (argc, argv, "vfFoht:", overlap_options, NULL)) != 
-1)
+{
+  switch (opt)
+{
+case 'v':
+  verbose 

Re: [PATCH] Indirect-call topn targets profiler (instrumentation)

2014-10-06 Thread Rong Xu
On Mon, Oct 6, 2014 at 12:21 PM, Jan Hubicka  wrote:
>> On Mon, Oct 6, 2014 at 11:22 AM, Jan Hubicka  wrote:
>> > Rong,
>> > Would be possible to use topn profiler to get resonale histograms for 
>> > switch
>> > expansion, too?  In that case it may make sense to have value version of 
>> > it as well.
>>

Yes. That's doable. I'm just not sure if top 2 entries are enough. I
can see from your example that it will be useful for BBs having
multiple case stmt.

>> The underlying topn_profiler can be used the same way as the current
>> one_value profiler for different scenarios. However, I wonder if it is
>> useful for switch value profiling -- the arc profiling can do the
>> same.
>
> Well, for switch expansion one really wants a histogram of values across the 
> switch
> values for the decision tree balancing.  But I guess this profiler does not 
> fit
> that very well.
>
> Consider cases like
>
> switch (a)
> {
>case 3:
>case 5:
>case 7:
> prime();
>case 2:
>case 4:
>case 6:
> even();
>   default:
> bogus ();
> }
>
> Arc profiling does tell you how often a is prime or even, but it does not 
> help you
> to balance the tree, because the value ranges are iinterlacing
>>
>> David
>>
>> >
>> > Otherwise the patch seems OK.  I would implement it myself by introducing 
>> > separate
>> > variables holding the topn profiler declaration, but can not think of 
>> > downsides of
>> > doing it your way.
>> >
>> > The topn profiler should be better on determining the dominating target, 
>> > so perhaps
>> > you could look into hooking it into ipa-profile.c.  Extending speculative 
>> > edges for
>> > multiple targets ought to be also quite easy - we need to update 
>> > speculative_call_info
>> > to collect edges annotated to a given call stmt into a vector instead of 
>> > taking the
>> > two references it does now.
>> >
>> > It would be also very nice to update it to handle case where all the edges 
>> > are direct
>> > so we can use it tospeculatively inlinine functions that can be interposed 
>> > at runtime.

Yes. I'm looking at this too. I actually ported the fdo-use part of
the patch and it passes spec2006. But it performs the transformation
using the old way (in gimple_ic_transform()) -- so I did not send out
that patch.
I'm trying to do this in the new way (in ipa-profile()). In any case,
that will be a separated patch.

Also, the switch expansion profile should be also a separated patch.

Is it ok to commit these two patches now?

Thanks,

-Rong

>> >
>> > Thanks,
>> > Honza
>> >
>> > Index: gcc/tree-profile.c
>> > ===
>> > --- gcc/tree-profile.c  (revision 215886)
>> > +++ gcc/tree-profile.c  (working copy)
>> > @@ -56,6 +56,7 @@ along with GCC; see the file COPYING3.  If not see
>> >  #include "target.h"
>> >  #include "tree-cfgcleanup.h"
>> >  #include "tree-nested.h"
>> > +#include "params.h"
>> >
>> >  static GTY(()) tree gcov_type_node;
>> >  static GTY(()) tree tree_interval_profiler_fn;
>> > @@ -101,7 +102,10 @@ init_ic_make_global_vars (void)
>> >  {
>> >ic_void_ptr_var
>> > = build_decl (UNKNOWN_LOCATION, VAR_DECL,
>> > - get_identifier ("__gcov_indirect_call_callee"),
>> > + get_identifier (
>> > + (PARAM_VALUE (PARAM_INDIR_CALL_TOPN_PROFILE) 
>> > ?
>> > +  "__gcov_indirect_call_topn_callee" :
>> > +  "__gcov_indirect_call_callee")),
>> >   ptr_void);
>> >TREE_PUBLIC (ic_void_ptr_var) = 1;
>> >DECL_EXTERNAL (ic_void_ptr_var) = 1;
>> > @@ -131,7 +135,10 @@ init_ic_make_global_vars (void)
>> >  {
>> >ic_gcov_type_ptr_var
>> > = build_decl (UNKNOWN_LOCATION, VAR_DECL,
>> > - get_identifier ("__gcov_indirect_call_counters"),
>> > + get_identifier (
>> > + (PARAM_VALUE (PARAM_INDIR_CALL_TOPN_PROFILE) 
>> > ?
>> > +  "__gcov_indirect_call_topn_counters" :
>> > +  "__gcov_indirect_call_counters")),
>> >   gcov_type_ptr);
>> >TREE_PUBLIC (ic_gcov_type_ptr_var) = 1;
>> >DECL_EXTERNAL (ic_gcov_type_ptr_var) = 1;
>> > @@ -226,8 +233,10 @@ gimple_init_edge_profiler (void)
>> >   ptr_void,
>> >   NULL_TREE);
>> >   tree_indirect_call_profiler_fn
>> > - = build_fn_decl ("__gcov_indirect_call_profiler_v2",
>> > -ic_profiler_fn_type);
>> > + = build_fn_decl ( (PARAM_VALUE 
>> > (PARAM_INDIR_CALL_TOPN_PROFILE) ?
>> > +"__gcov_indirect_call_topn_profiler":
>> > +"__gcov_indirect_call_profiler_v2"),
>> > + 

[PATCH] Indirect-call topn targets profiler (runtime)

2014-10-06 Thread Rong Xu
Hi,

I ported the indirect-call topn target profile from google branch. It
implements the algorithm described in "Value profiling and
optimization" by Brad Calder and Peter Feller. This patch is the
runtime support.

Tested with gcc regression tests and the instrumentation patch (in a
separated patch).

Thanks,

-Rong


patch1_diff
Description: Binary data


[PATCH] Indirect-call topn targets profiler (instrumentation)

2014-10-06 Thread Rong Xu
Hi,

I ported the indirect-call topn target profile from google branch. It
implements the algorithm described in "Value profiling and
optimization" by Brad Calder and Peter Feller. This patch is about the
instrumentation. When --param=indir-call-topn-profile=1 is specified,
we will use indirection-call-topn profiler in this patch, instead of
indirect_call_profiler_v2.

Tested with gcc regression tests and the runtime patch (in a separated patch).

Thanks,

-Rong


patch2_diff
Description: Binary data


[google gcc-4_9] fix undefined references in debug_info

2014-10-03 Thread Rong Xu
Hi,

This patch fixed a bug exposed in build kernel with fdo.

We cannot simply overwrite the bb footer in  emit_barrier_after_bb as
the bb may already have a footer (in this case, a deleted label stmt).
We need to output this label because it's a user label and debug_info
has a reference to it.

Tested with problematic file and regression test.
Trunk may also have the same issue, but I need to work on a testcase.

Thanks,

-Rong
2014-10-02  Rong Xu  

* gcc/cfgrtl.c (emit_barrier_after_bb): Append footer instead of
overwriting.

Index: gcc/cfgrtl.c
===
--- gcc/cfgrtl.c(revision 215823)
+++ gcc/cfgrtl.c(working copy)
@@ -1453,7 +1453,20 @@ emit_barrier_after_bb (basic_block bb)
   gcc_assert (current_ir_type () == IR_RTL_CFGRTL
   || current_ir_type () == IR_RTL_CFGLAYOUT);
   if (current_ir_type () == IR_RTL_CFGLAYOUT)
-BB_FOOTER (bb) = unlink_insn_chain (barrier, barrier);
+{
+  rtx insn = unlink_insn_chain (barrier, barrier);
+
+  if (BB_FOOTER (bb))
+   {
+  rtx footer_tail = BB_FOOTER (bb);
+  while (NEXT_INSN(footer_tail))
+footer_tail = NEXT_INSN (insn);
+ NEXT_INSN (footer_tail) = insn;
+ PREV_INSN (insn) = footer_tail;
+   }
+  else
+BB_FOOTER (bb) = insn;
+}
 }
 
 /* Like force_nonfallthru below, but additionally performs redirection


[PATCH] PR61889

2014-09-02 Thread Rong Xu
This patch makes the build of gcov-tool configurable. It checks if
ftw.h is available. For mingw build, it provides ftw functionality by
using FindFirstFile/FindNextFile/FindClose API.

Tested with and without --disable-gcov-tool.

Thanks,

-Rong
2014-09-02  Rong Xu  

* gcc/Makefile.in: Make the build gcov-tool configurable.
* gcc/configure.ac: Ditto.
* gcc/configure: Ditto.
* gcc/config.in: Ditto.
* gcc/gcov-tool.c (unlink_gcda_file): Support win32 build.
(unlink_profile_dir): Ditto.
* libgcc/libgcov-util.c (read_gcda_file): Ditto.
(read_file_handler): Ditto.
(ftw_read_file): Ditto.
(myftw): Ditto.
(gcov_read_profile_dir): Ditto.
(gcov_profile_normalize): Ditto.

Index: gcc/Makefile.in
===
--- gcc/Makefile.in (revision 214831)
+++ gcc/Makefile.in (working copy)
@@ -123,9 +123,13 @@ SUBDIRS =@subdirs@ build
 
 # Selection of languages to be made.
 CONFIG_LANGUAGES = @all_selected_languages@
-LANGUAGES = c gcov$(exeext) gcov-dump$(exeext) gcov-tool$(exeext) \
-$(CONFIG_LANGUAGES)
+LANGUAGES = c gcov$(exeext) gcov-dump$(exeext) $(CONFIG_LANGUAGES)
 
+disable_gcov_tool = @disable_gcov_tool@
+ifneq ($(disable_gcov_tool),yes)
+LANGUAGES += gcov-tool$(exeext)
+endif
+
 # Default values for variables overridden in Makefile fragments.
 # CFLAGS is for the user to override to, e.g., do a cross build with -O2.
 # TCFLAGS is used for compilations with the GCC just built.
Index: gcc/configure.ac
===
--- gcc/configure.ac(revision 214831)
+++ gcc/configure.ac(working copy)
@@ -5650,6 +5650,26 @@ if test "${ENABLE_LIBQUADMATH_SUPPORT}" != "no" ;
 fi
 
 
+# Check if gcov-tool can be built.
+AC_ARG_ENABLE(gcov-tool,
+[AS_HELP_STRING([--disable-gcov-tool],
+[disable the build of gcov-tool])])
+if test x"$enable_gcov_tool" = x"no"; then
+  disable_gcov_tool=yes
+else
+  AC_CHECK_HEADERS(ftw.h, [disable_gcov_tool=no],
+   [case $host_os in
+  win32 | cygwin* | mingw32*)
+disable_gcov_tool=no
+;;
+  *)
+disable_gcov_tool=yes
+;;
+esac])
+fi
+AC_SUBST(disable_gcov_tool)
+
+
 # Specify what hash style to use by default.
 AC_ARG_WITH([linker-hash-style],
 [AC_HELP_STRING([--with-linker-hash-style={sysv,gnu,both}],
Index: gcc/configure
===
--- gcc/configure   (revision 214831)
+++ gcc/configure   (working copy)
@@ -600,6 +600,7 @@ ac_includes_default="\
 
 ac_subst_vars='LTLIBOBJS
 LIBOBJS
+disable_gcov_tool
 PICFLAG
 enable_host_shared
 enable_plugin
@@ -932,6 +933,7 @@ enable_version_specific_runtime_libs
 enable_plugin
 enable_host_shared
 enable_libquadmath_support
+enable_gcov_tool
 with_linker_hash_style
 '
   ac_precious_vars='build_alias
@@ -1655,6 +1657,7 @@ Optional Features:
   --enable-host-sharedbuild host code as shared libraries
   --disable-libquadmath-support
   disable libquadmath support for Fortran
+  --disable-gcov-tool disable the build of gcov-tool
 
 Optional Packages:
   --with-PACKAGE[=ARG]use PACKAGE [ARG=yes]
@@ -8353,7 +8356,7 @@ fi
 for ac_header in limits.h stddef.h string.h strings.h stdlib.h time.h iconv.h \
 fcntl.h unistd.h sys/file.h sys/time.h sys/mman.h \
 sys/resource.h sys/param.h sys/times.h sys/stat.h \
-direct.h malloc.h langinfo.h ldfcn.h locale.h wchar.h
+direct.h malloc.h langinfo.h ldfcn.h locale.h wchar.h ftw.h
 do :
   as_ac_Header=`$as_echo "ac_cv_header_$ac_header" | $as_tr_sh`
 ac_fn_c_check_header_preproc "$LINENO" "$ac_header" "$as_ac_Header"
@@ -18033,7 +18036,7 @@ else
   lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
   lt_status=$lt_dlunknown
   cat > conftest.$ac_ext <<_LT_EOF
-#line 18036 "configure"
+#line 18039 "configure"
 #include "confdefs.h"
 
 #if HAVE_DLFCN_H
@@ -18139,7 +18142,7 @@ else
   lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
   lt_status=$lt_dlunknown
   cat > conftest.$ac_ext <<_LT_EOF
-#line 18142 "configure"
+#line 18145 "configure"
 #include "confdefs.h"
 
 #if HAVE_DLFCN_H
@@ -28116,6 +28119,33 @@ $as_echo "#define ENABLE_LIBQUADMATH_SUPPORT 1" >>
 fi
 
 
+# Check if gcov-tool can be built.
+# Check whether --enable-gcov-tool was given.
+if test "${enable_gcov_tool+set}" = set; then :
+  enableval=$enable_gcov_tool;
+fi
+
+if test x"$enable_gcov_tool" = x"no"; then
+  disable_gcov_tool=yes

[patch] make gcov-tool build configurable and fix mingw build issue

2014-08-13 Thread Rong Xu
Hi,

This patch makes gcov-tool build configurable. A new configure option
--disable-gcov-tool is added to disable the build.

It also fixes some build issues for mingw.

Tested with regression and bootstrap. mingw test is ongoing.

OK to check in if mingw build passes?

Thanks,

-Rong


gcov_tool_patch_diff
Description: Binary data


Re: [PATCH] offline gcda profile processing tool

2014-07-11 Thread Rong Xu
On Fri, Jul 11, 2014 at 11:46 AM, Jan Hubicka  wrote:
>> Richard,
>>
>> I looked at my patch again. I already add -Wno-error to libgcov-util.o
>> compilation:
>> In line 200 of gcc/Makefile.in
>> libgcov-util.o-warn = -Wno-error
>>
>> In my test, I used gcc-4.6 as the host compiler. I got warning like this:
>>
>> In file included from ../../gcc/gcc/../libgcc/libgcov-util.c:30:0:
>> ../../gcc/gcc/../libgcc/libgcov.h:184:30: warning: ISO C++ forbids
>> zero-size array ???ctrs??? [-pedantic]
>>
>> Can you check your buildlog to see if -Wno-error is added to the command 
>> line?
>
> I would preffer the libgcov.h (that is interface header to libgcov users) to 
> be
> valid C++, so we still ought to fix it.
>
> Honza

OK. I will send out a patch for review.

-Rong

>>
>> Thanks,
>>
>> -Rong
>>
>> On Fri, Jul 11, 2014 at 9:47 AM, Rong Xu  wrote:
>> > I did see the warning in the bootstrap, but it did not exit the build.
>> > I thought it was ok.
>> >
>> > I'll have a patch for this and send for review.
>> >
>> > -Rong
>> >
>> > On Fri, Jul 11, 2014 at 9:13 AM, Xinliang David Li  
>> > wrote:
>> >> right.
>> >>
>> >> Rong, the fix would be just change ctr array size to 1. For each
>> >> function, there should be at least one kind of counters -- see the
>> >> assertion in build_fn_info_type.  There are some code that do
>> >> 'sizeof(gcov_fn_info)' when computing heap size -- they can be
>> >> adjusted or leave it as it is (if not doing memcpy for the whole
>> >> array).
>> >>
>> >> David
>> >>
>> >> On Fri, Jul 11, 2014 at 8:44 AM, Jakub Jelinek  wrote:
>> >>> On Fri, Jul 11, 2014 at 08:42:27AM -0700, Xinliang David Li wrote:
>> >>>> I wonder why. The struct definition for gcov_fn_info has not changed
>> >>>> in this patch.
>> >>>
>> >>> Perhaps it has been used only in C until now?
>> >>>
>> >>> Jakub


Re: [PATCH] offline gcda profile processing tool

2014-07-11 Thread Rong Xu
Richard,

I looked at my patch again. I already add -Wno-error to libgcov-util.o
compilation:
In line 200 of gcc/Makefile.in
libgcov-util.o-warn = -Wno-error

In my test, I used gcc-4.6 as the host compiler. I got warning like this:

In file included from ../../gcc/gcc/../libgcc/libgcov-util.c:30:0:
../../gcc/gcc/../libgcc/libgcov.h:184:30: warning: ISO C++ forbids
zero-size array ‘ctrs’ [-pedantic]

Can you check your buildlog to see if -Wno-error is added to the command line?

Thanks,

-Rong

On Fri, Jul 11, 2014 at 9:47 AM, Rong Xu  wrote:
> I did see the warning in the bootstrap, but it did not exit the build.
> I thought it was ok.
>
> I'll have a patch for this and send for review.
>
> -Rong
>
> On Fri, Jul 11, 2014 at 9:13 AM, Xinliang David Li  wrote:
>> right.
>>
>> Rong, the fix would be just change ctr array size to 1. For each
>> function, there should be at least one kind of counters -- see the
>> assertion in build_fn_info_type.  There are some code that do
>> 'sizeof(gcov_fn_info)' when computing heap size -- they can be
>> adjusted or leave it as it is (if not doing memcpy for the whole
>> array).
>>
>> David
>>
>> On Fri, Jul 11, 2014 at 8:44 AM, Jakub Jelinek  wrote:
>>> On Fri, Jul 11, 2014 at 08:42:27AM -0700, Xinliang David Li wrote:
>>>> I wonder why. The struct definition for gcov_fn_info has not changed
>>>> in this patch.
>>>
>>> Perhaps it has been used only in C until now?
>>>
>>> Jakub


Re: [PATCH] offline gcda profile processing tool

2014-07-11 Thread Rong Xu
I did see the warning in the bootstrap, but it did not exit the build.
I thought it was ok.

I'll have a patch for this and send for review.

-Rong

On Fri, Jul 11, 2014 at 9:13 AM, Xinliang David Li  wrote:
> right.
>
> Rong, the fix would be just change ctr array size to 1. For each
> function, there should be at least one kind of counters -- see the
> assertion in build_fn_info_type.  There are some code that do
> 'sizeof(gcov_fn_info)' when computing heap size -- they can be
> adjusted or leave it as it is (if not doing memcpy for the whole
> array).
>
> David
>
> On Fri, Jul 11, 2014 at 8:44 AM, Jakub Jelinek  wrote:
>> On Fri, Jul 11, 2014 at 08:42:27AM -0700, Xinliang David Li wrote:
>>> I wonder why. The struct definition for gcov_fn_info has not changed
>>> in this patch.
>>
>> Perhaps it has been used only in C until now?
>>
>> Jakub


Re: [PATCH] offline gcda profile processing tool

2014-07-11 Thread Rong Xu
I should use _WIN32 macro. This code is for windows mkdir api.
I'll commit a patch for this shortly (approved by honza).

-Rong

On Fri, Jul 11, 2014 at 8:49 AM, Xinliang David Li  wrote:
> What is the macro to test POSIX IO on host? The guard uses
> TARGET_POSIX_IO which is not right.
>
> David
>
> On Fri, Jul 11, 2014 at 1:12 AM, Christophe Lyon
>  wrote:
>> On 11 July 2014 10:07, Richard Biener  wrote:
>>> On Mon, May 5, 2014 at 7:17 PM, Rong Xu  wrote:
>>>> Here is the updated patch. The difference with patch set 3 is
>>>> (1) use the gcov-counter.def for scaling operation.
>>>> (2) fix wrong scaling function for time-profiler.
>>>>
>>>> passed with bootstrap, profiledboostrap and SPEC2006.
>>>
>>> One of the patches breaks bootstrap for me:
>>>
>>> /space/rguenther/src/svn/trunk/gcc/../libgcc/libgcov.h:184: error: ISO
>>> C++ forbids zero-size array ‘ctrs’
>>> make[3]: *** [libgcov-util.o] Error 1
>>>
>>> host compiler is gcc 4.3.4
>>>
>>> Richard.
>>>
>>
>> On my side, commit 212448 breaks the cross-build of GCC for targets
>> using newlib:
>> * arm-none-eabi
>> * aarch64-none-elf
>> /usr/include/sys/stat.h: In function <80><98>void
>> gcov_output_files(const char*, gcov_info*)<80><99>:
>> /usr/include/sys/stat.h:317: error: too few arguments to function
>> <80><98>int mkdir(const char*, __mode_t)<80><99>
>> /tmp/1392958_1.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/gcov-tool.c:96:
>> error: at this point in file
>> make[2]: *** [gcov-tool.o] Error 1
>>
>> Christophe.
>>
>>>> Thanks,
>>>>
>>>> -Rong
>>>>
>>>> On Thu, May 1, 2014 at 3:37 PM, Rong Xu  wrote:
>>>>> Hi, Honza,
>>>>>
>>>>> Please find the new patch in the attachment. This patch integrated
>>>>> your earlier suggestions. The noticeable changes are:
>>>>> (1) build specialized object for libgcov-driver.c and libgcov-merge.c
>>>>> and link into gcov-tool, rather including the source file.
>>>>> (2) split some gcov counter definition code to gcov-counter.def to
>>>>> avoid code duplication.
>>>>> (3) use a macro for gcov_read_counter(), so gcov-tool can use existing
>>>>> code in libgcov-merge.c with minimal change.
>>>>>
>>>>> This patch does not address the suggestion of creating a new directory
>>>>> for libgcov. I agree with you that's a much better
>>>>> and cleaner structure we should go for. We can do that in follow-up 
>>>>> patches.
>>>>>
>>>>> I tested this patch with boostrap and profiledbootstrap. Other tests
>>>>> are on-going.
>>>>>
>>>>> Thanks,
>>>>>
>>>>> -Rong
>>>>>
>>>>> On Wed, Apr 16, 2014 at 8:34 PM, Jan Hubicka  wrote:
>>>>>>> GCOT_TOOL needs to use this function to read the string in gcda file
>>>>>>> to memory to construct gcov_info objects.
>>>>>>> As you noticed, gcov runtime does not need this interface. But
>>>>>>> gcov-tool links with gcov runtime and it also uses the function.
>>>>>>> We could make it available in gcov_runtime, but that will slightly
>>>>>>> increase the memory footprint.
>>>>>>
>>>>>> Yep, it is not really pretty. I wrote bellow some plan how things may be
>>>>>> structured in more convenient way.  Any work in that direction would be 
>>>>>> welcome.
>>>>>>>
>>>>>>> The planned new functions for trunk version is: (1) overlap score b/w
>>>>>>> two profiles (2) better dumping (top hot objects/function/counters)
>>>>>>> and statistics.
>>>>>>> Once this basic version is in, we can start to add the new 
>>>>>>> functionality.
>>>>>>
>>>>>> Sounds good. I assume the autoFDO does not go via gcov tool but rather 
>>>>>> uses
>>>>>> custom reader of profile data at GCC side?
>>>>>> I wonder, are there any recent overview papers/slides/text of design of 
>>>>>> AutoFDO
>>>>>> and other features to be merged?
>>>>>> I remember the talk from GCC Summit and I did read some of pre-LTO LIPO
>

Re: [BUILDROBOT] gcov patch

2014-07-11 Thread Rong Xu
On Fri, Jul 11, 2014 at 8:06 AM, Jan Hubicka  wrote:
>> Sorry. This code meant to work with the different mkdir api in
>> windows. I used wrong ifdef.
>>
>> Here is the patch. OK for checkin?
> OK. I also see the following with LTO bootstrap:
> ../../gcc/../libgcc/libgcov-util.c:41:24: error: type of �gcov_max_filename� 
> does not match original declaration [-Werror]
>  extern gcov_unsigned_t gcov_max_filename;
> ^
> ../../gcc/../libgcc/libgcov-driver.c:88:8: note: previously declared here
>  size_t gcov_max_filename = 0;
> Probably both can be size_t?

OK. I will change this too and submit.

Thanks for the quick review.

-Rong

>
> Honza
>
>>
>> Thanks,
>>
>> -Rong
>>
>> 2014-07-11  Rong Xu  
>>
>> * gcov-tool.c (gcov_output_files): Fix build error.
>>
>> Index: gcov-tool.c
>> ===
>> --- gcov-tool.c (revision 212448)
>> +++ gcov-tool.c (working copy)
>> @@ -90,8 +90,8 @@ gcov_output_files (const char *out, struct gcov_in
>>/* Try to make directory if it doesn't already exist.  */
>>if (access (out, F_OK) == -1)
>>  {
>> -#ifdef TARGET_POSIX_IO
>> -  if (mkdir (out, 0755) == -1 && errno != EEXIST)
>> +#if !defined(_WIN32)
>> +  if (mkdir (out, S_IRWXU | S_IRWXG | S_IRWXO) == -1 && errno != EEXIST)
>
> Sounds almost like something we could have libiberty glue for...
>
>
>>  #else
>>if (mkdir (out) == -1 && errno != EEXIST)
>>  #endif
>>


Re: [BUILDROBOT] gcov patch

2014-07-11 Thread Rong Xu
Sorry. This code meant to work with the different mkdir api in
windows. I used wrong ifdef.

Here is the patch. OK for checkin?

Thanks,

-Rong

2014-07-11  Rong Xu  

* gcov-tool.c (gcov_output_files): Fix build error.

Index: gcov-tool.c
===
--- gcov-tool.c (revision 212448)
+++ gcov-tool.c (working copy)
@@ -90,8 +90,8 @@ gcov_output_files (const char *out, struct gcov_in
   /* Try to make directory if it doesn't already exist.  */
   if (access (out, F_OK) == -1)
 {
-#ifdef TARGET_POSIX_IO
-  if (mkdir (out, 0755) == -1 && errno != EEXIST)
+#if !defined(_WIN32)
+  if (mkdir (out, S_IRWXU | S_IRWXG | S_IRWXO) == -1 && errno != EEXIST)
 #else
   if (mkdir (out) == -1 && errno != EEXIST)
 #endif

On Fri, Jul 11, 2014 at 6:08 AM, Jan-Benedict Glaw  wrote:
> On Fri, 2014-07-11 15:03:06 +0200, Jan-Benedict Glaw  
> wrote:
>> See eg. 
>> http://toolchain.lug-owl.de/buildbot/show_build_details.php?id=289639,
>> it breaks like this:
>>
>> [...]
>> g++ -c   -g -O2 -DIN_GCC  -DCROSS_DIRECTORY_STRUCTURE  -fno-exceptions 
>> -fno-rtti -fasynchronous-unwind-tables -W -Wall -Wwrite-strings -Wcast-qual 
>> -Wmissing-format-attribute -Woverloaded-virtual -pedantic -Wno-long-long 
>> -Wno-variadic-macros -Wno-overlength-strings -fno-common  -DHAVE_CONFIG_H 
>> -I. -I. -I/home/jbglaw/repos/gcc/gcc -I/home/jbglaw/repos/gcc/gcc/. 
>> -I/home/jbglaw/repos/gcc/gcc/../include 
>> -I/home/jbglaw/repos/gcc/gcc/../libcpp/include  
>> -I/home/jbglaw/repos/gcc/gcc/../libdecnumber 
>> -I/home/jbglaw/repos/gcc/gcc/../libdecnumber/dpd -I../libdecnumber 
>> -I/home/jbglaw/repos/gcc/gcc/../libbacktrace-o gcov-tool.o -MT 
>> gcov-tool.o -MMD -MP -MF ./.deps/gcov-tool.TPo 
>> /home/jbglaw/repos/gcc/gcc/gcov-tool.c
>> /usr/include/sys/stat.h: In function ‘void gcov_output_files(const char*, 
>> gcov_info*)’:
>> /usr/include/sys/stat.h:323: error: too few arguments to function ‘int 
>> mkdir(const char*, __mode_t)’
>> /home/jbglaw/repos/gcc/gcc/gcov-tool.c:96: error: at this point in file
>> make[1]: *** [gcov-tool.o] Error 1
>> make[1]: Leaving directory `/home/jbglaw/build/rl78-elf/build-gcc/gcc'
>> make: *** [all-gcc] Error 2
>
> [...]
>
> +#ifdef TARGET_POSIX_IO
> +  if (mkdir (out, 0755) == -1 && errno != EEXIST)
> +#else
> +  if (mkdir (out) == -1 && errno != EEXIST)
> +#endif
>
>
> And with POSIX I/O, this should probably use the S_I* macros for the
> mode bits?
>
> MfG, JBG
>
> --
>   Jan-Benedict Glaw  jbg...@lug-owl.de  +49-172-7608481
>  Signature of:Arroganz verkürzt fruchtlose Gespräche.
>  the second  :   -- Jan-Benedict Glaw


Re: [Google] Fix AFDO early inline ICEs due to DFE

2014-06-12 Thread Rong Xu
This looks fine to me.

-Rong

On Thu, Jun 12, 2014 at 11:23 AM, Teresa Johnson  wrote:
> These two patches fix multiple ICE that occurred due to DFE being
> recently enabled after AutoFDO LIPO linking.
>
> Passes regression and internal testing. Ok for Google/4_8?
>
> Teresa
>
> 2014-06-12  Teresa Johnson  
> Dehao Chen  
>
> Google ref b/15521327.
>
> * cgraphclones.c (cgraph_clone_edge): Use resolved node.
> * l-ipo.c (resolve_cgraph_node): Resolve to non-removable node.
>
> Index: cgraphclones.c
> ===
> --- cgraphclones.c  (revision 211386)
> +++ cgraphclones.c  (working copy)
> @@ -94,6 +94,7 @@ along with GCC; see the file COPYING3.  If not see
>  #include "ipa-utils.h"
>  #include "lto-streamer.h"
>  #include "except.h"
> +#include "l-ipo.h"
>
>  /* Create clone of E in the node N represented by CALL_EXPR the callgraph.  
> */
>  struct cgraph_edge *
> @@ -118,7 +119,11 @@ cgraph_clone_edge (struct cgraph_edge *e, struct c
>
>if (call_stmt && (decl = gimple_call_fndecl (call_stmt)))
> {
> - struct cgraph_node *callee = cgraph_get_node (decl);
> +  struct cgraph_node *callee;
> +  if (L_IPO_COMP_MODE && cgraph_pre_profiling_inlining_done)
> +callee = cgraph_lipo_get_resolved_node (decl);
> +  else
> +callee = cgraph_get_node (decl);
>   gcc_checking_assert (callee);
>   new_edge = cgraph_create_edge (n, callee, call_stmt, count, freq);
> }
> Index: l-ipo.c
> ===
> --- l-ipo.c (revision 211386)
> +++ l-ipo.c (working copy)
> @@ -1542,6 +1542,18 @@ resolve_cgraph_node (struct cgraph_sym **slot, str
>gcc_assert (decl1_defined);
>add_define_module (*slot, decl2);
>
> +  /* Pick the node that cannot be removed, to avoid a situation
> + where we remove the resolved node and later try to access
> + it for the remaining non-removable copy.  E.g. one may be
> + extern and the other weak, only the extern copy can be removed.  */
> +  if (cgraph_can_remove_if_no_direct_calls_and_refs_p ((*slot)->rep_node)
> +  && !cgraph_can_remove_if_no_direct_calls_and_refs_p (node))
> +{
> +  (*slot)->rep_node = node;
> +  (*slot)->rep_decl = decl2;
> +  return;
> +}
> +
>has_prof1 = has_profile_info (decl1);
>bool is_aux1 = cgraph_is_auxiliary (decl1);
>bool is_aux2 = cgraph_is_auxiliary (decl2);
>
>
> --
> Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413


Re: [Google/4_8] Reduce memory overhead of LIPO COMDAT fixups

2014-06-05 Thread Rong Xu
This patch looks good to me.

-Rong

On Thu, Jun 5, 2014 at 12:45 PM, Teresa Johnson  wrote:
> (cc'ing a few additional people to help with review as David is out
> and I'm not sure Rong is available)
>
> This patch greatly reduces the memory overhead of the new COMDAT fixup
> analysis, by changing the second level hash tables to linked lists.
> I found that almost none of the second level hash tables contained more
> than one entry, but each hash table required a lot of memory overhead.
>
> I also now query the fixup type before adding the checksums to the
> pointer sets during callgraph building, which would have enabled a workaround
> for the memory issue.
>
> Tested with regression tests and internal tests (see ref below for details).
>
> Google ref b/15415042.
>
> 2014-06-05  Teresa Johnson  
>
> * dyn-ipa.c (struct lineno_checksum_alias): Replaced pointer set.
> (struct checksum_alias_info): Enabled linked list.
> (cfg_checksum_get_key): Removed.
> (find_cfg_checksum): New function.
> (cfg_checksum_insert): Operate on linked list.
> (checksum_set_insert): Ditto.
> (gcov_build_callgraph): Allow disabling checksum insertion.
> (gcov_find_new_ic_target): Operate on linked list.
> (gcov_fixup_counters_checksum): Ditto.
> (gcov_fixup_counters_lineno): Ditto.
> (__gcov_compute_module_groups): Compute fixup type earlier.
>
> Index: dyn-ipa.c
> ===
> --- dyn-ipa.c   (revision 211288)
> +++ dyn-ipa.c   (working copy)
> @@ -79,16 +79,15 @@ struct dyn_cgraph
>unsigned num_nodes_executed;
>/* used by new algorithm  */
>struct modu_node *modu_nodes;
> -  /* Set indexed by lineno_checksum, returns another dyn_pointer_set*,
> - indexed by cfg_checksum.  That returns a checksum_alias_info struct.  */
> +  /* Set indexed by lineno_checksum, returns a linked list of
> + checksum_alias_info structs.  */
>struct dyn_pointer_set *lineno_pointer_sets;
>  };
>
>  /* Struct holding information for functions with the same lineno_checksum.  
> */
>  struct lineno_checksum_alias
>  {
> -  /* Set indexed by cfg_checksum, holding a checksum_alias_info struct.  */
> -  struct dyn_pointer_set *cfg_pointer_set;
> +  struct checksum_alias_info *cfg_checksum_list;
>unsigned lineno_checksum;
>  };
>
> @@ -96,6 +95,7 @@ struct lineno_checksum_alias
> checksums.  */
>  struct checksum_alias_info
>  {
> +  struct checksum_alias_info *next_cfg_checksum;
>struct checksum_alias *alias_list;
>unsigned cfg_checksum;
>  };
> @@ -205,6 +205,7 @@ pointer_set_create (unsigned (*get_key) (const voi
>  static struct dyn_cgraph the_dyn_call_graph;
>  static int total_zero_count = 0;
>  static int total_insane_count = 0;
> +static int fixup_type = 0;
>
>  enum GROUPING_ALGORITHM
>  {
> @@ -374,14 +375,6 @@ lineno_checksum_get_key (const void *p)
>return ((const struct lineno_checksum_alias *) p)->lineno_checksum;
>  }
>
> -/* The cfg_checksum value in P is the key for a cfg_pointer_set.  */
> -
> -static inline unsigned
> -cfg_checksum_get_key (const void *p)
> -{
> -  return ((const struct checksum_alias_info *) p)->cfg_checksum;
> -}
> -
>  /* Create a new checksum_alias struct for function with GUID, FI_PTR,
> and ZERO_COUNTS flag.  Prepends to list NEXT and returns new struct.  */
>
> @@ -398,28 +391,44 @@ new_checksum_alias (gcov_type guid, const struct g
>return alias;
>  }
>
> -/* Insert a new checksum_alias struct into pointer set P for function with
> +/* Locate the checksum_alias_info in LIST that matches CFG_CHECKSUM.  */
> +
> +static struct checksum_alias_info *
> +find_cfg_checksum (struct checksum_alias_info *list, unsigned cfg_checksum)
> +{
> +  for (; list; list = list->next_cfg_checksum)
> +{
> +  if (list->cfg_checksum == cfg_checksum)
> +return list;
> +}
> +  return NULL;
> +}
> +
> +/* Insert a new checksum_alias struct into LIST for function with
> CFG_CHECKSUM and associated GUID, FI_PTR, and ZERO_COUNTS flag.  */
>
> -static void
> -cfg_checksum_set_insert (struct dyn_pointer_set *p, unsigned cfg_checksum,
> - gcov_type guid, const struct gcov_fn_info *fi_ptr,
> - int zero_counts)
> +static struct checksum_alias_info *
> +cfg_checksum_insert (unsigned cfg_checksum, gcov_type guid,
> + const struct gcov_fn_info *fi_ptr, int zero_counts,
> + struct checksum_alias_info *list)
>  {
> -  struct checksum_alias_info **m = (struct checksum_alias_info **)
> -pointer_set_find_or_insert (p, cfg_checksum);
> -  if (*m)
> +  struct checksum_alias_info *alias_info;
> +  alias_info = find_cfg_checksum (list, cfg_checksum);
> +  if (alias_info)
>  {
> -  gcc_assert ((*m)->alias_list);
> -  (*m)->alias_list = new_checksum_alias (guid, fi_ptr, zero_counts,
> - (*m)->alias_l

Re: [Patch] Avoid gcc_assert in libgcov

2014-05-22 Thread Rong Xu
I think these asserts will be used by gcov-tool. So I prefer to change
them to  gcov_nonruntime_assert(). I'll merge them in my new gcov-tool
patch before submitting (waiting for honaz's ok).

Thanks,

-Rong

On Thu, May 22, 2014 at 7:09 AM, Teresa Johnson  wrote:
> On Thu, Jan 16, 2014 at 2:41 PM, Jan Hubicka  wrote:
>>> On Wed, Jan 15, 2014 at 8:39 PM, Jan Hubicka  wrote:
>>> >>
>>> >> In that case should we call gcov_error when IN_LIBGCOV? One
>>> >> possibility would be to simply make gcov_nonruntime_assert be defined
>>> >> as if (!EXPR) gcov_error in the IN_LIBGCOV case. But I think what you
>>> >> wanted here was to reduce libgcov bloat by removing calls altogether,
>>> >> which this wouldn't solve. But if we want to call gcov_error in some
>>> >> cases, I think I need to add another macro that will either do
>>> >> gcc_assert when !IN_LIBGCOV and "if (!EXPR) gcov_error" when
>>> >> IN_LIBGCOV. Is that what you had in mind?
>>> >
>>> > I think for errors that can be triggered by data corruption, we ought to
>>> > produce resonable error messages in both IN_LIBGCOV or for offline tools.
>>> > Just unwound sequence if(...) gcov_error seems fine to me in this case,
>>> > but we may also have assert like wrapper.
>>> > I see we do not provide gcov_error outside libgcov, we probably ought to 
>>> > map
>>> > it to fatal_error in GCC binary.
>>> >
>>> > thanks,
>>> > Honza
>>>
>>> Ok, here is the new patch. Bootstrapped and tested on
>>> x86_64-unknown-linux-gnu. Ok for trunk?
>>>
>>> Thanks, Teresa
>>>
>>> 2014-01-16  Teresa Johnson  
>>>
>>> * gcov-io.c (gcov_position): Use gcov_nonruntime_assert.
>>> (gcov_is_error): Remove gcc_assert from IN_LIBGCOV code.
>>> (gcov_rewrite): Use gcov_nonruntime_assert.
>>> (gcov_open): Ditto.
>>> (gcov_write_words): Ditto.
>>> (gcov_write_length): Ditto.
>>> (gcov_read_words): Use gcov_nonruntime_assert, and remove
>>> gcc_assert from IN_LIBGCOV code.
>>> (gcov_read_summary): Use gcov_error to flag profile corruption.
>>> (gcov_sync): Use gcov_nonruntime_assert.
>>> (gcov_seek): Remove gcc_assert from IN_LIBGCOV code.
>>> (gcov_histo_index): Use gcov_nonruntime_assert.
>>> (static void gcov_histogram_merge): Ditto.
>>> (compute_working_sets): Ditto.
>>> * gcov-io.h (gcov_nonruntime_assert): Define.
>>> (gcov_error): Define for !IN_LIBGCOV
>>
>> OK, thanks!
>> Honza
>
> I just found this old patch sitting in a client! Committed as r210805.
>
> I also discovered that a couple uses of gcc_assert have snuck into
> libgcc/libgcov* files in the meantime. Looks like this got added
> during some of Rong's refactoring, cc'ing Rong. They are in
> libgcc/libgcov-driver-system.c (allocate_filename_struct) and
> libgcov-merge.c (__gcov_merge_single and __gcov_merge_delta). Should I
> remove those or change to gcov_error?
>
> Teresa
>
>
>
> --
> Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413


[google gcc-4_8] LIPO: check the incompatibility of use and saved command line

2014-05-12 Thread Rong Xu
Hi,

This patch adds an extra check of the incompatibility of use and saved
command line for the primary module. If the check files, we skip the
inclusion of all aux modules.
Maybe we should check the aux module command line with the
use-compilation command line directly. But I keep the old way that
checks b/w saved command lines.

Tested with the program exposed this issue and internal benchmarks.
This check never failed for instrumentation based lipo in the test.

Thanks,

-Rong
2014-05-12  Rong Xu  

* coverage.c (struct string_hasher): Use const char.
(has_incompatible_cg_opts): Add new incompatible options.
(incompatible_cl_arg_strs): Refactor from incompatible_cl_args.
(incompatible_cl_args): Refactor.
(read_counts_file): check incompatibility of use command line and the
saved command line for the primary module.
* auto-profile.c (read_aux_modules): Ditto.

Index: coverage.c
===
--- coverage.c  (revision 209291)
+++ coverage.c  (working copy)
@@ -237,8 +237,8 @@ is_last_module (unsigned mod_id)
 struct string_hasher
 {
   /* hash_table support.  */
-  typedef  char value_type;
-  typedef  char compare_type;
+  typedef const char value_type;
+  typedef char compare_type;
   static inline hashval_t hash (const value_type *);
   static int equal (const value_type *, const compare_type *);
   static void remove (value_type *) {};
@@ -272,6 +272,10 @@ static struct opt_desc force_matching_cg_opts[] =
 { "-fexceptions", "-fno-exceptions", true },
 { "-fsized-delete", "-fno-sized-delete", false },
 { "-frtti", "-fno-rtti", true },
+{ "-fPIC", "-fno-PIC", false },
+{ "-fpic", "-fno-pic", false },
+{ "-fPIE", "-fno-PIE", false },
+{ "-fpie", "-fno-pie", false },
 { "-fstrict-aliasing", "-fno-strict-aliasing", true },
 { "-fsigned-char", "-funsigned-char", true },
 /* { "-fsigned-char", "-fno-signed-char", true },
@@ -315,28 +319,25 @@ has_incompatible_cg_opts (bool *cg_opts1, bool *cg
   return false;
 }
 
-/* Returns true if the command-line arguments stored in the given module-infos
-   are incompatible.  */
-bool
-incompatible_cl_args (struct gcov_module_info* mod_info1,
- struct gcov_module_info* mod_info2)
+/* Returns true if the command-line arguments stored ARGS_1 and
+   ARGS_2 are incompatible. N_ARGS_1 and N_ARGS_2 are the number
+   of string in ARGS_1 and ARGS_2, respectively.  */
+
+static bool
+incompatible_cl_arg_strs (const char* const* args_1, const unsigned int 
n_args_1,
+  const char *filename1, const char* const* args_2,
+  const unsigned int n_args_2, const char *filename2)
 {
-  char **warning_opts1 = XNEWVEC (char *, mod_info1->num_cl_args);
-  char **warning_opts2 = XNEWVEC (char *, mod_info2->num_cl_args);
-  char **non_warning_opts1 = XNEWVEC (char *, mod_info1->num_cl_args);
-  char **non_warning_opts2 = XNEWVEC (char *, mod_info2->num_cl_args);
-  char *std_opts1 = NULL, *std_opts2 = NULL;
+  const char **warning_opts1 = XNEWVEC (const char *, n_args_1);
+  const char **warning_opts2 = XNEWVEC (const char *, n_args_2);
+  const char **non_warning_opts1 = XNEWVEC (const char *, n_args_1);
+  const char **non_warning_opts2 = XNEWVEC (const char *, n_args_2);
+  const char *std_opts1 = NULL, *std_opts2 = NULL;
   unsigned int i, num_warning_opts1 = 0, num_warning_opts2 = 0;
   unsigned int num_non_warning_opts1 = 0, num_non_warning_opts2 = 0;
   bool warning_mismatch = false;
   bool non_warning_mismatch = false;
   hash_table  option_tab1, option_tab2;
-  unsigned int start_index1 = mod_info1->num_quote_paths 
-+ mod_info1->num_bracket_paths + mod_info1->num_system_paths 
-+ mod_info1->num_cpp_defines + mod_info1->num_cpp_includes;
-  unsigned int start_index2 = mod_info2->num_quote_paths
-+ mod_info2->num_bracket_paths + mod_info2->num_system_paths
-+ mod_info2->num_cpp_defines + mod_info2->num_cpp_includes;
 
   bool *cg_opts1, *cg_opts2, has_any_incompatible_cg_opts, 
has_incompatible_std;
   unsigned int num_cg_opts = 0;
@@ -358,14 +359,13 @@ has_incompatible_cg_opts (bool *cg_opts1, bool *cg
   option_tab2.create (10);
 
   /* First, separate the warning and non-warning options.  */
-  for (i = 0; i < mod_info1->num_cl_args; i++)
-if (mod_info1->string_array[start_index1 + i][1] == 'W')
-  warning_opts1[num_warning_opts1++] =
-   mod_info1->string_array[start_index1 + i];
+  for (i = 0; i < n_args_1; i++)
+if (args_1[i][1] == 'W')
+  warning_opts1[num_warning_opts1++] = args_1[i];
 else
   {
-char **slot;
-char *option_

[google gcc-4_8] fix lipo ICE

2014-05-08 Thread Rong Xu
Hi,

This patch fixed lipo ICE triggered by an out-of-bound access.

This is google specific patch and tested with bootstrap and the
program exposed the issue.

Thanks,

-Rong
2014-05-08  Rong Xu  

* tree-inline.c (add_local_variables): Check if the debug_expr
is a decl_node before calling is_global_var.

Index: tree-inline.c
===
--- tree-inline.c   (revision 209291)
+++ tree-inline.c   (working copy)
@@ -3842,7 +3842,7 @@ add_local_variables (struct function *callee, stru
of varpool node does not check the reference
from debug expressions.
Set it to 0 for all global vars.  */
-if (L_IPO_COMP_MODE && tem && is_global_var (tem))
+if (L_IPO_COMP_MODE && tem && DECL_P (tem) && is_global_var (tem))
   tem = NULL;
 
id->remapping_type_depth++;


Re: [PATCH] offline gcda profile processing tool

2014-04-16 Thread Rong Xu
On Tue, Apr 15, 2014 at 2:38 PM, Jan Hubicka  wrote:
> Rong, David, Dehao, Teresa
> I would like to have some rought idea of what we could merge this stage1. 
> There is
> certainly a lot of interesting stuff on the google branch including AutoFDO, 
> LIPO,
> the multivalue profile counters that may be used by the new devirtualization 
> bits
> and more. I also think we should switch counts into floating point 
> representation
> so Teresa's splitting patch works.
>
> Can we get plans to make this effective? My personal schedule is quite free 
> until
> April 29 when I go to Czech Republic for wedding and I will be back in Calgary
> at 14th.
>
>> 2014-03-03  Rong Xu  
>>
>>   * gcc/gcov-io.c (gcov_read_string): Make this routine available
>> to gcov-tool.
>>   (gcov_sync): Ditto.
>>   * gcc/Makefile.in: Build and install gcov-tool.
>>   * gcc/gcov-tool.c (unlink_gcda_file): Remove one gcda file.
>>   (unlink_profile_dir): Remove gcda files from the profile path.
>>   (profile_merge): Merge two profiles in directory.
>>   (print_merge_usage_message): Print merge usage.
>>   (merge_usage): Print merge usage and exit.
>>   (do_merge): Driver for profile merge sub-command.
>>   (profile_rewrite): Rewrite profile.
>>   (print_rewrite_usage_message): Print rewrite usage.
>>   (rewrite_usage): Print rewrite usage and exit.
>>   (do_rewrite): Driver for profile rewrite sub-command.
>>   (print_usage): Print gcov-info usage and exit.
>>   (print_version): Print gcov-info version.
>>   (process_args): Process arguments.
>>   (main): Main routine for gcov-tool.
>>   * libgcc/libgcov.h : Include the set of base-type headers for
>> gcov-tool.
>> (struct gcov_info): Make the functions field mutable in gcov-tool
>> compilation.
>>   * libgcc/libgcov-merge.c (gcov_get_counter): New wrapper function
>> to get the profile counter.
>>   (gcov_get_counter_target): New wrapper function to get the profile
>> values that should not be scaled.
>>   (__gcov_merge_add): Replace gcov_read_counter() with the wrapper
>> functions.
>>   (__gcov_merge_ior): Ditto.
>>   (__gcov_merge_time_profile): Ditto.
>>   (__gcov_merge_single): Ditto.
>>   (__gcov_merge_delta): Ditto.
>>   * libgcc/libgcov-util.c (void gcov_set_verbose): Set the verbose flag
>> in the utility functions.
>>   (set_fn_ctrs): Utility function for reading gcda files to in-memory
>> gcov_list object link lists.
>>   (tag_function): Ditto.
>>   (tag_blocks): Ditto.
>>   (tag_arcs): Ditto.
>>   (tag_lines): Ditto.
>>   (tag_counters): Ditto.
>>   (tag_summary): Ditto.
>>   (read_gcda_finalize): Ditto.
>>   (read_gcda_file): Ditto.
>>   (ftw_read_file): Ditto.
>>   (read_profile_dir_init): Ditto.
>>   (gcov_read_profile_dir): Ditto.
>>   (gcov_read_counter_mem): Ditto.
>>   (gcov_get_merge_weight): Ditto.
>>   (merge_wrapper): A wrapper function that calls merging handler.
>>   (gcov_merge): Merge two gcov_info objects with weights.
>>   (find_match_gcov_info): Find the matched gcov_info in the list.
>>   (gcov_profile_merge): Merge two gcov_info object lists.
>>   (__gcov_add_counter_op): Process edge profile counter values.
>>   (__gcov_ior_counter_op): Process IOR profile counter values.
>>   (__gcov_delta_counter_op): Process delta profile counter values.
>>   (__gcov_single_counter_op): Process single  profile counter values.
>>   (fp_scale): Callback function for float-point scaling.
>>   (int_scale): Callback function for integer fraction scaling.
>>   (gcov_profile_scale): Scaling profile counters.
>>   (gcov_profile_normalize): Normalize profile counters.
>>
>> Index: gcc/gcov-io.c
>> ===
>> --- gcc/gcov-io.c (revision 208237)
>> +++ gcc/gcov-io.c (working copy)
>> @@ -564,7 +564,7 @@ gcov_read_counter (void)
>> buffer, or NULL on empty string. You must copy the string before
>> calling another gcov function.  */
>>
>> -#if !IN_LIBGCOV
>> +#if !IN_LIBGCOV || defined (IN_GCOV_TOOL)
>>  GCOV_LINKAGE const char *
>>  gcov_read_string (void)
>>  {
>> @@ -641,7 +641,7 @@ gcov_read_summary (struct gcov_summary *summary)
>>  }
>>  }
>>
>> -#if !IN_LIBGCOV
>> +#if !IN_LIBG

Re: [google gcc-4_8] unify int and fp scaling in gcov-tool

2014-02-10 Thread Rong Xu
Yes. They were the bug I mentioned. This patch fixed that.

-Rong

On Mon, Feb 10, 2014 at 10:40 AM, Teresa Johnson  wrote:
> Looks good to me.
>
> A couple questions:
>
>>  static void
>> -__gcov_scale_icall_topn (gcov_type *counters, unsigned n_counters, double f)
>> +__gcov_icall_topn_op (gcov_type *counters, unsigned n_counters,
>> +  counter_op_fn fn, void *data1, void *data2)
>>  {
>>unsigned i;
>>
>> @@ -950,41 +949,65 @@ static void
>>gcov_type *value_array = &counters[i + 1];
>>
>>for (j = 0; j < GCOV_ICALL_TOPN_NCOUNTS - 1; j += 2)
>> -value_array[1] *= f;
>
> Was the above a bug (always updating index 1)?
>
>> +value_array[j + 1] = fn (value_array[j + 1], data1, data2);
>>  }
>>  }
>>
>>  static void
>> -__gcov_scale_dc (gcov_type *counters, unsigned n_counters, double f)
>> +__gcov_dc_op (gcov_type *counters, unsigned n_counters,
>> +  counter_op_fn fn, void *data1, void *data2)
>>  {
>>unsigned i;
>>
>>gcc_assert (!(n_counters % 2));
>>for (i = 0; i < n_counters; i += 2)
>> -counters[1] *= f;
>
> Same question here
>
>> +counters[i + 1] = fn (counters[i + 1], data1, data2);
>>  }
>
> Teresa
>
> On Tue, Feb 4, 2014 at 3:54 PM, Rong Xu  wrote:
>> Hi,
>>
>> The attached patch uses a callback function to unify the integer and
>> floating-point scaling in gcov-tool. (Also fix a bug in fp scaling of
>> ic and dc counters in earlier code).
>>
>> Tested with spec2006 profiles.
>>
>> OK for checking in?
>>
>> Thanks,
>>
>> -Rong
>
>
>
> --
> Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413


[google gcc-4_8] unify int and fp scaling in gcov-tool

2014-02-04 Thread Rong Xu
Hi,

The attached patch uses a callback function to unify the integer and
floating-point scaling in gcov-tool. (Also fix a bug in fp scaling of
ic and dc counters in earlier code).

Tested with spec2006 profiles.

OK for checking in?

Thanks,

-Rong
2014-02-04  Rong Xu  

* gcc/gcov-tool.c (profile_rewrite2): Remove.
(profile_rewrite): Merge int and fp scaling.
(do_rewrite): Ditto.
* libgcc/libgcov-util.c (typedef gcov_type) New.
(__gcov_add_counter_op): Merge int and fp scaling.
(__gcov_ior_counter_op): Ditto.
(__gcov_delta_counter_op): Ditto.
(__gcov_single_counter_op): Ditto.
(__gcov_icall_topn_op): Ditto.
(__gcov_dc_op): Ditto.
(fp_scale): Ditto.
(int_scale): Ditto.
(gcov_profile_scale): Ditto.
(gcov_profile_normalize): Ditto.
(__gcov_scale2_add): Remove.
(__gcov_scale2_ior): Remove.
(__gcov_scale2_delta): Remove.
(__gcov_scale2_single): Remove.
(__gcov_scale2_icall_topn): Remove.
(__gcov_scale2_dc): Remove.
(gcov_profile_scale2): Remove.

Index: gcc/gcov-tool.c
===
--- gcc/gcov-tool.c (revision 207488)
+++ gcc/gcov-tool.c (working copy)
@@ -43,8 +43,7 @@ see the files COPYING3 and COPYING.RUNTIME respect
 
 extern int gcov_profile_merge (struct gcov_info*, struct gcov_info*, int, int);
 extern int gcov_profile_normalize (struct gcov_info*, gcov_type);
-extern int gcov_profile_scale (struct gcov_info*, float);
-extern int gcov_profile_scale2 (struct gcov_info*, int, int);
+extern int gcov_profile_scale (struct gcov_info*, float, int, int);
 extern struct gcov_info* gcov_read_profile_dir (const char*, int);
 extern void gcov_exit (void);
 extern void set_gcov_list (struct gcov_info *);
@@ -227,45 +226,13 @@ do_merge (int argc, char **argv)
   return ret;
 }
 
-/* Scale the counters in profile DIR by a factor of N/D.
-   Result is written to dir OUT. Return 0 on success.  */
-
-static int
-profile_rewrite2 (const char *dir, const char *out, int n, int d)
-{
-  char *pwd;
-  int ret;
-  struct gcov_info * profile;
-
-
-  profile = gcov_read_profile_dir (dir, 0);
-  if (!profile)
-return 1;
-
-  /* Output new profile.  */
-  unlink_profile_dir (out);
-  mkdir (out, 0755);
-  pwd = getcwd (NULL, 0);
-  gcc_assert (pwd);
-  ret = chdir (out);
-  gcc_assert (ret == 0);
-
-  gcov_profile_scale2 (profile, n, d);
-
-  set_gcov_list (profile);
-  gcov_exit ();
-
-  ret = chdir (pwd);
-  free (pwd);
-  return 0;
-}
-
 /* If N_VAL is no-zero, normalize the profile by setting the largest counter
counter value to N_VAL and scale others counters proportionally.
Otherwise, multiply the all counters by SCALE.  */
 
 static int
-profile_rewrite (const char *d1, const char *out, long long n_val, float scale)
+profile_rewrite (const char *d1, const char *out, long long n_val,
+ float scale, int n, int d)
 {
   char *pwd;
   int ret;
@@ -287,7 +254,7 @@ static int
   if (n_val)
 gcov_profile_normalize (d1_profile, (gcov_type) n_val);
   else
-gcov_profile_scale (d1_profile, scale);
+gcov_profile_scale (d1_profile, scale, n, d);
 
   set_gcov_list (d1_profile);
   gcov_exit ();
@@ -604,9 +571,9 @@ do_rewrite (int argc, char **argv)
   if (argc - optind == 1)
 {
   if (denominator > 0)
-ret = profile_rewrite2 (argv[optind],  output_dir, numerator, 
denominator);
+ret = profile_rewrite (argv[optind],  output_dir, 0, 0.0, numerator, 
denominator);
   else
-ret = profile_rewrite (argv[optind],  output_dir, normalize_val, 
scale);
+ret = profile_rewrite (argv[optind],  output_dir, normalize_val, 
scale, 0, 0);
 }
   else
 rewrite_usage ();
Index: libgcc/libgcov-util.c
===
--- libgcc/libgcov-util.c   (revision 207488)
+++ libgcc/libgcov-util.c   (working copy)
@@ -873,39 +873,38 @@ gcov_profile_merge (struct gcov_info *tgt_profile,
   return 0;
 }
 
-/* This part of code is to scale profile counters.  */
+typedef gcov_type (*counter_op_fn) (gcov_type, void*, void*);
 
-/* Type of function used to normalize counters.  */
-typedef void (*gcov_scale_fn) (gcov_type *, gcov_unsigned_t, double);
+/* Performing FN upon arc counters.  */
 
-/* Scale arc counters. N_COUNTERS of counter value in COUNTERS array are
-   multiplied by a factor F.  */
-
 static void
-__gcov_scale_add (gcov_type *counters, unsigned n_counters, double f)
+__gcov_add_counter_op (gcov_type *counters, unsigned n_counters,
+   counter_op_fn fn, void *data1, void *data2)
 {
   for (; n_counters; counters++, n_counters--)
 {
   gcov_type val = *counters;
-  *counters = val * f;
+  *counters = fn(val, data1, data2);
 }
 }
 
-/* Scale ior counters.  */
+/* Performing FN upon ior counters.  */
 
 static void
-__gcov_scale_

[google gcc-4_8] fix profiledbootstrap

2014-02-04 Thread Rong Xu
Hi,

The attached patch fixes the duplicated definition error in
gcov-tool.c in profiledbootstrap.

Google branch only and tested with profiledbootstrap.

Ok for checking in?

-Rong
Index: gcov-tool.c
===
--- gcov-tool.c (revision 207435)
+++ gcov-tool.c (working copy)
@@ -50,16 +50,23 @@
 
 static int verbose;
 
-gcov_unsigned_t __gcov_lipo_grouping_algorithm;
-gcov_unsigned_t __gcov_lipo_merge_modu_edges;
-gcov_unsigned_t __gcov_lipo_weak_inclusion;
-gcov_unsigned_t __gcov_lipo_max_mem;
-gcov_unsigned_t __gcov_lipo_random_group_size;
-gcov_unsigned_t __gcov_lipo_cutoff;
-gcov_unsigned_t __gcov_lipo_random_seed;
-gcov_unsigned_t __gcov_lipo_dump_cgraph;
-gcov_unsigned_t __gcov_lipo_propagate_scale;
+/* The following defines are needed by dyn-ipa.c.
+   They will also be emitted by the compiler with -fprofile-generate,
+   which means this file cannot be compiled with -fprofile-generate
+   -- otherwise we get duplicated defintions.
+   Make the defines weak to link with other objects/libraries
+   that potentially compiled with -fprofile-generate.  */
 
+__attribute__ ((weak)) gcov_unsigned_t __gcov_lipo_grouping_algorithm;
+__attribute__ ((weak)) gcov_unsigned_t __gcov_lipo_merge_modu_edges;
+__attribute__ ((weak)) gcov_unsigned_t __gcov_lipo_weak_inclusion;
+__attribute__ ((weak)) gcov_unsigned_t __gcov_lipo_max_mem;
+__attribute__ ((weak)) gcov_unsigned_t __gcov_lipo_random_group_size;
+__attribute__ ((weak)) gcov_unsigned_t __gcov_lipo_cutoff;
+__attribute__ ((weak)) gcov_unsigned_t __gcov_lipo_random_seed;
+__attribute__ ((weak)) gcov_unsigned_t __gcov_lipo_dump_cgraph;
+__attribute__ ((weak)) gcov_unsigned_t __gcov_lipo_propagate_scale;
+
 /* Remove file NAME if it has a gcda suffix. */
 
 static int
Index: Makefile.in
===
--- Makefile.in (revision 207435)
+++ Makefile.in (working copy)
@@ -4090,6 +4090,9 @@
 GCOV_TOOL_OBJS = gcov-tool.o libgcov-util.o dyn-ipa.o gcov-params.o
 gcov-tool.o: gcov-tool.c $(GCOV_IO_H) intl.h $(SYSTEM_H) coretypes.h $(TM_H) \
$(CONFIG_H) version.h $(DIAGNOSTIC_H)
+   +$(COMPILER) -c $(ALL_CPPFLAGS) \
+ $(filter-out -fprofile-generate -fprofile-use,$(ALL_COMPILERFLAGS)) \
+ $(INCLUDES) -o $@ $<
 gcov-params.o: params.c $(PARAMS_H)
+$(COMPILER) -DIN_GCOV_TOOL -c $(ALL_COMPILERFLAGS) $(ALL_CPPFLAGS) \
  $(INCLUDES) -o $@ $<


Re: [google gcc-4_8] gcov-tool: some new LIPO supports.

2014-02-04 Thread Rong Xu
Here is the revised patch that integrates Teresa' comments.
Ok for checking in?

-Rong

On Thu, Jan 30, 2014 at 1:20 PM, Rong Xu  wrote:
> Comments are inlined. New patch attached to this email.
>
> -Rong
>
> On Thu, Jan 30, 2014 at 12:39 PM, Xinliang David Li  
> wrote:
>> On Wed, Jan 29, 2014 at 4:24 PM, Rong Xu  wrote:
>>> Hi,
>>>
>>> The attached patch adds some new features to gcov-tool. It aims to
>>> rewrite LIPO profile for reuse, debug or performance tuning purpose.
>>>
>>> -l, --modu_list   Only use the modules in this file
>>
>> I think in verbose mode, the tool should emit warnings when a module
>> is trimmed due to this option. This can be used in regression test.
>>
>
> In previous patch, this warning is emitted unconventionally (not
> guarded by verbose flag).
> I changed it to under verbose mode in this patch.
>
>>> -r, --string Replace string in path
>>
>> --path_substr_replace or something in that line.
>>
>
> Done.
>
>>> -u, --use_imports_file Use the grouping in import files.
>>>
>>
>> Is there a path in code to auto test this?
>
> As we discussed offline. This can be verified by a separated script.
>
>>
>>> -l uses only the modules in specified file to compute module grouping
>>> (and subsequent dumping).
>>> -r replaces the pattern specified in the argument. The format is:
>>> old_str1:new_str1[,old_str2:new_str2]*, only the first occurrence is
>>> replaced.
>>> -u skips the run-time module grouping computation and reuses the one
>>> comes with the profiles (which is user editable).
>>>
>>> Tested with profiles from google internal benchmarks.
>>>
>>
>> Also use strcasestr for case insenstive operation.
>
> Done.
>
>>
>> David
>>
>>> -Rong
Index: gcc/gcov-tool.c
===
--- gcc/gcov-tool.c (revision 207435)
+++ gcc/gcov-tool.c (working copy)
@@ -29,6 +29,7 @@ see the files COPYING3 and COPYING.RUNTIME respect
 #include "coretypes.h"
 #include "tm.h"
 #include "intl.h"
+#include "hashtab.h"
 #include "diagnostic.h"
 #include "version.h"
 #include "gcov-io.h"
@@ -38,6 +39,7 @@ see the files COPYING3 and COPYING.RUNTIME respect
 #include 
 #include 
 #include "params.h"
+#include 
 
 extern int gcov_profile_merge (struct gcov_info*, struct gcov_info*, int, int);
 extern int gcov_profile_normalize (struct gcov_info*, gcov_type);
@@ -47,9 +49,11 @@ extern struct gcov_info* gcov_read_profile_dir (co
 extern void gcov_exit (void);
 extern void set_gcov_list (struct gcov_info *);
 extern void gcov_set_verbose (void);
+extern void set_use_existing_grouping (void);
+extern void set_use_modu_list (void);
+extern void lipo_set_substitute_string (const char *);
 
-static int verbose;
-
+/* The following defines are needed by dyn-ipa.c.  */
 gcov_unsigned_t __gcov_lipo_grouping_algorithm;
 gcov_unsigned_t __gcov_lipo_merge_modu_edges;
 gcov_unsigned_t __gcov_lipo_weak_inclusion;
@@ -60,6 +64,8 @@ gcov_unsigned_t __gcov_lipo_random_seed;
 gcov_unsigned_t __gcov_lipo_dump_cgraph;
 gcov_unsigned_t __gcov_lipo_propagate_scale;
 
+static int verbose;
+
 /* Remove file NAME if it has a gcda suffix. */
 
 static int
@@ -285,6 +291,189 @@ profile_rewrite (const char *d1, const char *out,
   return 0;
 }
 
+/* This is the hashtab entry to store a name and mod_id pair. */
+typedef struct {
+  const char *name;
+  unsigned id;
+} mod_name_id;
+
+/* Hash and comparison functions for strings.  */
+
+static unsigned
+mod_name_id_htab_hash (const void *s_p)
+{
+  const char *s = ((const mod_name_id *) s_p)->name;
+  return (*htab_hash_string) (s);
+}
+
+static int
+mod_name_id_hash_eq (const void *s1_p, const void *s2_p)
+{
+  return strcmp (((const mod_name_id *) s1_p)->name,
+ ((const mod_name_id *) s2_p)->name) == 0;
+}
+
+static htab_t mod_name_id_hash_table;
+
+/* Look up an entry in the hash table. STRING is the module name.
+   CREATE controls to insert to htab or not.
+   If (*ID_P != 0), we write (*ID_P) to htab.
+   If (*ID_P == 0), we write module_id to (*ID_P).
+   return 1 if an entry is found and otherwise 0.  */
+
+static int
+module_name_hash_lookup (const char *string, unsigned *id_p, int create)
+{
+  void **e;
+  mod_name_id t;
+
+  t.name = string;
+  e = htab_find_slot (mod_name_id_hash_table, &t,
+  create ? INSERT : NO_INSERT);
+  if (e == NULL)
+return 0;
+  if (*e == NULL)
+{
+  *e = XNEW (mod_name_id *);
+  (*(mod_name_id **)e)->name = xstrdup (string);
+}
+  if (id_p)
+ 

Re: [google gcc-4_8] gcov-tools: minor fix for broken build for arm

2014-01-31 Thread Rong Xu
Thanks for catching this, and the fix.
OK for google branch.

-Rong

On Fri, Jan 31, 2014 at 2:46 PM, Hán Shěn (沈涵)  wrote:
> Hi Rong, while building for arm toolchain on chromeos, GCOV_LOCKED is
> not defined, which leads to redefinition of cs_all, this is observed
> on google/gcc-4_8 branch.
>
> Patch below, tested on chromeos for arm and x86_64 arch.
>
> Ok for google/gcc-4_8 branch?
>
> diff --git a/libgcc/libgcov-driver.c b/libgcc/libgcov-driver.c
> index 76f6104..bbce3a6 100644
> --- a/libgcc/libgcov-driver.c
> +++ b/libgcc/libgcov-driver.c
> @@ -559,10 +559,6 @@ gcov_exit_merge_summary (const struct gcov_info
> *gi_ptr, struct gcov_summary *pr
>  {
>struct gcov_ctr_summary *cs_prg, *cs_tprg, *cs_all;
>unsigned t_ix;
> -#if !GCOV_LOCKED
> -  /* summary for all instances of program.  */
> -  struct gcov_ctr_summary *cs_all;
> -#endif
>
>/* Merge the summaries.  */
>for (t_ix = 0; t_ix < GCOV_COUNTERS_SUMMABLE; t_ix++)
>
>
> Han


Re: [google gcc-4_8] gcov-tool: some new LIPO supports.

2014-01-30 Thread Rong Xu
Comments are inlined. New patch attached to this email.

-Rong

On Thu, Jan 30, 2014 at 12:39 PM, Xinliang David Li  wrote:
> On Wed, Jan 29, 2014 at 4:24 PM, Rong Xu  wrote:
>> Hi,
>>
>> The attached patch adds some new features to gcov-tool. It aims to
>> rewrite LIPO profile for reuse, debug or performance tuning purpose.
>>
>> -l, --modu_list   Only use the modules in this file
>
> I think in verbose mode, the tool should emit warnings when a module
> is trimmed due to this option. This can be used in regression test.
>

In previous patch, this warning is emitted unconventionally (not
guarded by verbose flag).
I changed it to under verbose mode in this patch.

>> -r, --string Replace string in path
>
> --path_substr_replace or something in that line.
>

Done.

>> -u, --use_imports_file Use the grouping in import files.
>>
>
> Is there a path in code to auto test this?

As we discussed offline. This can be verified by a separated script.

>
>> -l uses only the modules in specified file to compute module grouping
>> (and subsequent dumping).
>> -r replaces the pattern specified in the argument. The format is:
>> old_str1:new_str1[,old_str2:new_str2]*, only the first occurrence is
>> replaced.
>> -u skips the run-time module grouping computation and reuses the one
>> comes with the profiles (which is user editable).
>>
>> Tested with profiles from google internal benchmarks.
>>
>
> Also use strcasestr for case insenstive operation.

Done.

>
> David
>
>> -Rong
Index: gcc/gcov-tool.c
===
--- gcc/gcov-tool.c (revision 207316)
+++ gcc/gcov-tool.c (working copy)
@@ -29,6 +29,7 @@ see the files COPYING3 and COPYING.RUNTIME respect
 #include "coretypes.h"
 #include "tm.h"
 #include "intl.h"
+#include "hashtab.h"
 #include "diagnostic.h"
 #include "version.h"
 #include "gcov-io.h"
@@ -38,6 +39,7 @@ see the files COPYING3 and COPYING.RUNTIME respect
 #include 
 #include 
 #include "params.h"
+#include 
 
 extern int gcov_profile_merge (struct gcov_info*, struct gcov_info*, int, int);
 extern int gcov_profile_normalize (struct gcov_info*, gcov_type);
@@ -47,9 +49,11 @@ extern struct gcov_info* gcov_read_profile_dir (co
 extern void gcov_exit (void);
 extern void set_gcov_list (struct gcov_info *);
 extern void gcov_set_verbose (void);
+extern void set_use_existing_grouping (void);
+extern void set_use_modu_list (void);
+extern void lipo_set_substitute_string (const char *);
 
-static int verbose;
-
+/* The following defines are needed by dyn-ipa.c.  */
 gcov_unsigned_t __gcov_lipo_grouping_algorithm;
 gcov_unsigned_t __gcov_lipo_merge_modu_edges;
 gcov_unsigned_t __gcov_lipo_weak_inclusion;
@@ -60,6 +64,8 @@ gcov_unsigned_t __gcov_lipo_random_seed;
 gcov_unsigned_t __gcov_lipo_dump_cgraph;
 gcov_unsigned_t __gcov_lipo_propagate_scale;
 
+static int verbose;
+
 /* Remove file NAME if it has a gcda suffix. */
 
 static int
@@ -285,6 +291,187 @@ profile_rewrite (const char *d1, const char *out,
   return 0;
 }
 
+/* This is the hashtab entry to store a name and mod_id pair. */
+typedef struct {
+  const char *name;
+  unsigned id;
+} mod_name_id;
+
+/* Hash and comparison functions for strings.  */
+
+static unsigned
+mod_name_id_htab_hash (const void *s_p)
+{
+  const char *s = ((const mod_name_id *) s_p)->name;
+  return (*htab_hash_string) (s);
+}
+
+static int
+mod_name_id_hash_eq (const void *s1_p, const void *s2_p)
+{
+  return strcmp (((const mod_name_id *) s1_p)->name,
+ ((const mod_name_id *) s2_p)->name) == 0;
+}
+
+static htab_t mod_name_id_hash_table;
+
+/* Look up an entry in the hash table. STRING is the module name.
+   CREATE controls to insert to htab or not.
+   If (*ID_P != 0), we write (*ID_P) to htab.
+   If (*ID_P == 0), we write module_id to (*ID_P).
+   return 1 if an entry is found and otherwise 0.  */
+
+static int
+module_name_hash_lookup (const char *string, unsigned *id_p, int create)
+{
+  void **e;
+  mod_name_id t;
+
+  t.name = string;
+  e = htab_find_slot (mod_name_id_hash_table, &t,
+  create ? INSERT : NO_INSERT);
+  if (e == NULL)
+return 0;
+  if (*e == NULL)
+{
+  *e = XNEW (mod_name_id *);
+  (*(mod_name_id **)e)->name = xstrdup (string);
+}
+  if (id_p)
+{
+  if (*id_p != 0)
+(*(mod_name_id **)e)->id = *id_p;
+  else
+*id_p = (*(mod_name_id **)e)->id;
+}
+  return 1;
+}
+
+/* Return 1 if NAME is of a source type that LIPO targets.
+   Return 0 otherwise.  */
+
+static int
+is_lipo_source_type (char *name)
+{
+  char *p;
+
+  if (strcasestr (name, ".c") ||
+ 

[google gcc-4_8] gcov-tool: some new LIPO supports.

2014-01-29 Thread Rong Xu
Hi,

The attached patch adds some new features to gcov-tool. It aims to
rewrite LIPO profile for reuse, debug or performance tuning purpose.

-l, --modu_list   Only use the modules in this file
-r, --string Replace string in path
-u, --use_imports_file Use the grouping in import files.

-l uses only the modules in specified file to compute module grouping
(and subsequent dumping).
-r replaces the pattern specified in the argument. The format is:
old_str1:new_str1[,old_str2:new_str2]*, only the first occurrence is
replaced.
-u skips the run-time module grouping computation and reuses the one
comes with the profiles (which is user editable).

Tested with profiles from google internal benchmarks.

-Rong
2014-01-29  Rong Xu  

* gcc/gcov-tool.c (mod_name_id): New type of the hashtable entry.
(mod_name_id_htab_hash): Hash function for module-name and module-id
hashtab.
(mod_name_id_hash_eq): Comparison function for the hashtab.
(module_name_hash_lookup): Lookup function of the hashtab.
(is_lipo_source_type): Find if a file is of LIPO target source.
(lipo_process_name_string): Support of -l.
(lipo_process_modu_list): Ditto.
(is_module_available): Support of -u.
(get_module_id_from_name): Ditto.
(print_rewrite_usage_message): Add -u -l and -r support.
(do_rewrite): Ditto.
(print_version): Ditto.
(process_args): Ditto.
* libgcc/dyn-ipa.c (flag_use_existing_grouping): New flag for -u.
(init_dyn_call_graph): Support of -l.
(__gcov_finalize_dyn_callgraph): Ditto.
(gcov_build_callgraph): Ditto.
(gcov_compute_cutoff_count): Ditto.
(gcov_process_cgraph_node): Ditto.
(build_modu_graph): Ditto.
(gcov_compute_module_groups_eager_propagation): Ditto.
(gcov_compute_random_module_groups): Ditto.
(gcov_write_module_infos): Ditto.
(gcov_dump_callgraph): Ditto.
(set_use_existing_grouping): Support of -u.
(open_imports_file): Ditto.
(read_modu_groups_from_imports_files): Ditto.
(__gcov_compute_module_groups): Ditto.
* libgcc/libgcov-util.c (substitute_string): Support -r.
(lipo_set_substitute_string): Ditto.
(lipo_process_substitute_string_1): Ditto.
(lipo_process_substitute_string): Ditto.
(read_gcda_file): Ditto.
(flag_use_modu_list): Support of -l
(set_use_modu_list): Ditto.
(ftw_read_file): Ditto.
(source_profile_dir): Support of -u.
(get_source_profile_dir): Ditto.
(gcov_read_profile_dir): Ditto.

Index: gcc/gcov-tool.c
===
--- gcc/gcov-tool.c (revision 207055)
+++ gcc/gcov-tool.c (working copy)
@@ -29,6 +29,7 @@ see the files COPYING3 and COPYING.RUNTIME respect
 #include "coretypes.h"
 #include "tm.h"
 #include "intl.h"
+#include "hashtab.h"
 #include "diagnostic.h"
 #include "version.h"
 #include "gcov-io.h"
@@ -38,6 +39,7 @@ see the files COPYING3 and COPYING.RUNTIME respect
 #include 
 #include 
 #include "params.h"
+#include 
 
 extern int gcov_profile_merge (struct gcov_info*, struct gcov_info*, int, int);
 extern int gcov_profile_normalize (struct gcov_info*, gcov_type);
@@ -47,9 +49,11 @@ extern struct gcov_info* gcov_read_profile_dir (co
 extern void gcov_exit (void);
 extern void set_gcov_list (struct gcov_info *);
 extern void gcov_set_verbose (void);
+extern void set_use_existing_grouping (void);
+extern void set_use_modu_list (void);
+extern void lipo_set_substitute_string (const char *);
 
-static int verbose;
-
+/* The following defines are needed by dyn-ipa.c.  */
 gcov_unsigned_t __gcov_lipo_grouping_algorithm;
 gcov_unsigned_t __gcov_lipo_merge_modu_edges;
 gcov_unsigned_t __gcov_lipo_weak_inclusion;
@@ -60,6 +64,8 @@ gcov_unsigned_t __gcov_lipo_random_seed;
 gcov_unsigned_t __gcov_lipo_dump_cgraph;
 gcov_unsigned_t __gcov_lipo_propagate_scale;
 
+static int verbose;
+
 /* Remove file NAME if it has a gcda suffix. */
 
 static int
@@ -285,6 +291,189 @@ profile_rewrite (const char *d1, const char *out,
   return 0;
 }
 
+/* This is the hashtab entry to store a name and mod_id pair. */
+typedef struct {
+  const char *name;
+  unsigned id;
+} mod_name_id;
+
+/* Hash and comparison functions for strings.  */
+
+static unsigned
+mod_name_id_htab_hash (const void *s_p)
+{
+  const char *s = ((const mod_name_id *) s_p)->name;
+  return (*htab_hash_string) (s);
+}
+
+static int
+mod_name_id_hash_eq (const void *s1_p, const void *s2_p)
+{
+  return strcmp (((const mod_name_id *) s1_p)->name,
+ ((const mod_name_id *) s2_p)->name) == 0;
+}
+
+static htab_t mod_name_id_hash_table;
+
+/* Look up an entry in the hash table. STRING is the module name.
+   CREATE controls to inse

[Google gcc-4_8] always emit __gcov_get_profile_prefix when linking libgcov.a

2014-01-23 Thread Rong Xu
Hi,

This patch is for google/gcc-4_8 branch. It fixes a regression in
earlier libgcov refactoring.

Thanks,

-Rong

2014-01-23  Rong Xu  

* libgcov-driver.c (__gcov_get_profile_prefix): Always emit
this function.

Index: libgcov-driver.c
===
--- libgcov-driver.c(revision 206736)
+++ libgcov-driver.c(working copy)
@@ -68,6 +68,8 @@ extern struct gcov_info *get_gcov_list (void) ATTR
these symbols will always need to be resolved.  */
 void (*__gcov_dummy_ref1)(void) = &__gcov_reset;
 void (*__gcov_dummy_ref2)(void) = &__gcov_dump;
+extern char *__gcov_get_profile_prefix (void);
+char *(*__gcov_dummy_ref3)(void) = &__gcov_get_profile_prefix;

 /* Default callback function for profile instrumentation callback.  */
 extern void __coverage_callback (gcov_type, int);


Re: [google gcc-4_8] port gcov-tool to gcc-4_8

2014-01-17 Thread Rong Xu
Do we have to split params.def?

I can include params.h and link in params.o (a special version as we
don't have some global vars).

As for lipo_cutoff, I think we don't need a special handling -- we
should use the default value of 100 and let logic in dyn-ipa.c takes
care of the rest.

Please find the update patch which reads the default values from params.def.

-Rong

On Fri, Jan 17, 2014 at 12:05 PM, Xinliang David Li  wrote:
> For LIPO parameters, you can do this
> 1) isolate all LIPO specific parameters into lipo_params.def, and
> include it from params.def.
> 2) include lipo_params.def (after proper definition of DEFPARAM) in
> the profile tool source dir.
>
> By so doing, the compiler setting and profile tool will be in sync.
> (Unfortuately, for historic reason, the lipo_cutoff default val is
> still set in dyn-ipa.c even with the parameter, so some comments needs
> to be added in dyn-ipa.c to make sure the value should be doubly
> updated).
>
> David
>
> On Fri, Jan 17, 2014 at 11:15 AM, Rong Xu  wrote:
>> Hi,
>>
>> This patch port the gcov-tool work to google/gcc-4_8 branches.
>>
>> Tested with spec2006, profiledbootstrap and google internal benchmarks.
>>
>> -Rong
Index: Makefile.in
===
--- Makefile.in (revision 206671)
+++ Makefile.in (working copy)
@@ -123,7 +123,8 @@ SUBDIRS =@subdirs@ build
 
 # Selection of languages to be made.
 CONFIG_LANGUAGES = @all_selected_languages@
-LANGUAGES = c gcov$(exeext) gcov-dump$(exeext) $(CONFIG_LANGUAGES)
+LANGUAGES = c gcov$(exeext) gcov-dump$(exeext) gcov-tool$(exeext) \
+$(CONFIG_LANGUAGES)
 
 # Default values for variables overridden in Makefile fragments.
 # CFLAGS is for the user to override to, e.g., do a cross build with -O2.
@@ -194,6 +195,7 @@ GCC_WARN_CXXFLAGS = $(LOOSE_WARN) $($(@D)-warn) $(
 build/gengtype-lex.o-warn = -Wno-error
 gengtype-lex.o-warn = -Wno-error
 expmed.o-warn = -Wno-error
+libgcov-util.o-warn = -Wno-error
 
 # All warnings have to be shut off in stage1 if the compiler used then
 # isn't gcc; configure determines that.  WARN_CFLAGS will be either
@@ -755,6 +757,7 @@ GCC_TARGET_INSTALL_NAME := $(target_noncanonical)-
 CPP_INSTALL_NAME := $(shell echo cpp|sed '$(program_transform_name)')
 GCOV_INSTALL_NAME := $(shell echo gcov|sed '$(program_transform_name)')
 PROFILE_TOOL_INSTALL_NAME := $(shell echo profile_tool|sed 
'$(program_transform_name)')
+GCOV_TOOL_INSTALL_NAME := $(shell echo gcov-tool|sed 
'$(program_transform_name)')
 
 # Setup the testing framework, if you have one
 EXPECT = `if [ -f $${rootme}/../expect/expect ] ; then \
@@ -1480,7 +1483,8 @@ ALL_HOST_FRONTEND_OBJS = $(foreach v,$(CONFIG_LANG
 
 ALL_HOST_BACKEND_OBJS = $(GCC_OBJS) $(OBJS) $(OBJS-libcommon) \
   $(OBJS-libcommon-target) @TREEBROWSER@ main.o c-family/cppspec.o \
-  $(COLLECT2_OBJS) $(EXTRA_GCC_OBJS) $(GCOV_OBJS) $(GCOV_DUMP_OBJS)
+  $(COLLECT2_OBJS) $(EXTRA_GCC_OBJS) $(GCOV_OBJS) $(GCOV_DUMP_OBJS) \
+  $(GCOV_TOOL_OBJS)
 
 # This lists all host object files, whether they are included in this
 # compilation or not.
@@ -1505,6 +1509,7 @@ MOSTLYCLEANFILES = insn-flags.h insn-config.h insn
  $(SPECS) collect2$(exeext) gcc-ar$(exeext) gcc-nm$(exeext) \
  gcc-ranlib$(exeext) \
  gcov-iov$(build_exeext) gcov$(exeext) gcov-dump$(exeext) \
+ gcov-tool$(exeect) \
  gengtype$(exeext) *.[0-9][0-9].* *.[si] *-checksum.c libbackend.a \
  libcommon-target.a libcommon.a libgcc.mk
 
@@ -4070,6 +4075,24 @@ GCOV_DUMP_OBJS = gcov-dump.o vec.o ggc-none.o
 gcov-dump$(exeext): $(GCOV_DUMP_OBJS) $(LIBDEPS)
+$(LINKER) $(ALL_LINKERFLAGS) $(LDFLAGS) $(GCOV_DUMP_OBJS) \
$(LIBS) -o $@
+
+libgcov-util.o: $(srcdir)/../libgcc/libgcov-util.c gcov-io.c $(GCOV_IO_H) \
+  $(srcdir)/../libgcc/libgcov-driver.c 
$(srcdir)/../libgcc/libgcov-driver-system.c \
+  $(srcdir)/../libgcc/libgcov-merge.c $(srcdir)/../libgcc/libgcov.h \
+  $(SYSTEM_H) coretypes.h $(TM_H) $(CONFIG_H) version.h intl.h $(DIAGNOSTIC_H)
+   +$(COMPILER) -c $(ALL_COMPILERFLAGS) $(ALL_CPPFLAGS) $(INCLUDES) -o $@ 
$<
+dyn-ipa.o: $(srcdir)/../libgcc/dyn-ipa.c gcov-io.c 
$(srcdir)/../libgcc/libgcov.h \
+   $(GCOV_IO_H) $(SYSTEM_H) coretypes.h \
+   $(TM_H) $(CONFIG_H) version.h intl.h $(DIAGNOSTIC_H)
+   +$(COMPILER) -DIN_GCOV_TOOL -c $(ALL_COMPILERFLAGS) $(ALL_CPPFLAGS) \
+ $(INCLUDES) -o $@ $<
+
+GCOV_TOOL_OBJS = gcov-tool.o libgcov-util.o dyn-ipa.o params.o
+gcov-tool.o: gcov-tool.c $(GCOV_IO_H) intl.h $(SYSTEM_H) coretypes.h $(TM_H) \
+   $(CONFIG_H) version.h $(DIAGNOSTIC_H)
+gcov-tool$(exeext): $(GCOV_TOOL_OBJS) $(LIBDEPS)
+   +$(LINKER) $(ALL_LINKERFLAGS) $(LDFLAGS) $(GCOV_TOOL_OBJS) \
+ $(LIBS) -o $@
 #
 # Build the include directories.  The stamp files are stmp-* rather than
 # s-* so that mostlyclean does not force the include dir

Re: [PATCH] offline gcda profile processing tool

2014-01-16 Thread Rong Xu
Ping.

-Rong

On Mon, Jan 13, 2014 at 12:43 PM, Rong Xu  wrote:
> Hi,
>
> This patch implements gcov-tool, a offline profile processing tool.
> This version supports merging two profiles with weights, and scaling
> the profile with a floating-point / fraction weight.
>
> Earlier discussion can be found
> http://gcc.gnu.org/ml/gcc-patches/2013-11/msg02631.html
>
> In the discussion of the earlier patch, Honza wanted a separated
> patch for interface change for gcov_merge_*() functions. In this patch,
> the changes for gcov_merge_*() are minimal by using a wrapper function.
> So I include everything in this patch.
>
> Tested with bootstrap profiledboostrap and SPEC2006 profiles.
>
> Thanks,
>
> -Rong


Re: [google gcc-4_8] backport libgcov re-factoring patches from trunk

2014-01-15 Thread Rong Xu
On Wed, Jan 15, 2014 at 10:40 AM, Xinliang David Li  wrote:
> In libgcov-driver.c,
>
> 1) there are couple of places with trailing white spaces (e.g, in
> gcov_sort_n_vals body), please remove
They are from the existing code. But I'll fix them.

> 2) gcov_exit_write_gcda in trunk takes eof_pos as an arg, and check it
> before writing the header. I think this is more correct than in your
> patch
That's true. But this logic is from the newer code in trunk.
Current 4_8 code does not check this. I deliberately did this because I thought
the backporting patch should not change this.

I'll change to the trunk verison then.


> 3) It would be better to keep the function order the same in trunk
> (e.g, compute summary related, merge gcda and write gcda etc); it is
> also helpful to keep LIPO related functions order in the same way as
> in google/main so that a better diff can be done
> 4) libdriver-profiler.c -- make the function ordering the same as in
> google/main would be helpful.

sure. I'll do this two items. Will send an updated patch soon.

>
> thanks,
>
> David
>
> On Wed, Jan 15, 2014 at 10:03 AM, Rong Xu  wrote:
>> The attached patch backports libgcov re-factoring patches from trunk.
>>
>> Tested with google internal benchmarks, SPEC2006, bootstrap and
>> profiledbootstrap.
>>
>> OK for google/gcc-4_8 branch?
>>
>> -Rong


Re: [RFC] libgcov.c re-factoring and offline profile-tool

2014-01-09 Thread Rong Xu
My bad.
Thanks for the fix!

-Rong

On Thu, Jan 9, 2014 at 11:47 AM, H.J. Lu  wrote:
> On Wed, Jan 8, 2014 at 2:33 PM, Rong Xu  wrote:
>> On Fri, Dec 6, 2013 at 6:23 AM, Jan Hubicka  wrote:
>>>> @@ -325,6 +311,9 @@ static struct gcov_summary all_prg;
>>>>  #endif
>>>>  /* crc32 for this program.  */
>>>>  static gcov_unsigned_t crc32;
>>>> +/* Use this summary checksum rather the computed one if the value is
>>>> + *non-zero.  */
>>>> +static gcov_unsigned_t saved_summary_checksum;
>>>
>>> Why do you need to save the checksum? Won't it reset summary back with 
>>> multiple streaming?
>>
>> This was for the gcov_tool. checksum will be recomputed in gcov_exit
>> and the value will depend on
>> the order of gcov_info list. (the order will be different after
>> reading from gcda files to memory). The purpose was
>> to have the same summary_checksum so that I can get identical gcov-dump 
>> output.
>>
>>>
>>> I would really like to avoid introducing those static vars that are used 
>>> exclusively
>>> by gcov_exit.  What about putting them into an gcov_context structure that
>>> is passed around the functions that was broken out?
>>
>> With my recently patch the localizes this_prg, we only use 64 more
>> bytes in bss. Do you still we have to remove
>> all these statics?
>>
>>>
>
> libgcc ChangeLog entries should be in libgcc/ChangeLog,
> not gcc/ChangeLog.  I checked in a patch to move them
> to libgcc/ChangeLog.
>
> --
> H.J.


Re: [RFC] libgcov.c re-factoring and offline profile-tool

2014-01-08 Thread Rong Xu
On Fri, Dec 6, 2013 at 6:23 AM, Jan Hubicka  wrote:
>> @@ -325,6 +311,9 @@ static struct gcov_summary all_prg;
>>  #endif
>>  /* crc32 for this program.  */
>>  static gcov_unsigned_t crc32;
>> +/* Use this summary checksum rather the computed one if the value is
>> + *non-zero.  */
>> +static gcov_unsigned_t saved_summary_checksum;
>
> Why do you need to save the checksum? Won't it reset summary back with 
> multiple streaming?

This was for the gcov_tool. checksum will be recomputed in gcov_exit
and the value will depend on
the order of gcov_info list. (the order will be different after
reading from gcda files to memory). The purpose was
to have the same summary_checksum so that I can get identical gcov-dump output.

>
> I would really like to avoid introducing those static vars that are used 
> exclusively
> by gcov_exit.  What about putting them into an gcov_context structure that
> is passed around the functions that was broken out?

With my recently patch the localizes this_prg, we only use 64 more
bytes in bss. Do you still we have to remove
all these statics?

>


Re: [RFC] libgcov.c re-factoring and offline profile-tool

2014-01-08 Thread Rong Xu
On Wed, Dec 18, 2013 at 9:28 AM, Xinliang David Li  wrote:
>>>
>>>  #ifdef L_gcov_merge_ior
>>>  /* The profile merging function that just adds the counters.  It is given
>>> -   an array COUNTERS of N_COUNTERS old counters and it reads the same 
>>> number
>>> -   of counters from the gcov file.  */
>>> +   an array COUNTERS of N_COUNTERS old counters.
>>> +   When SRC==NULL, it reads the same number of counters from the gcov file.
>>> +   Otherwise, it reads from SRC array.  */
>>>  void
>>> -__gcov_merge_ior (gcov_type *counters, unsigned n_counters)
>>> +__gcov_merge_ior (gcov_type *counters, unsigned n_counters,
>>> +  gcov_type *src, unsigned w __attribute__ ((unused)))
>>
>> So the new in-memory variants are introduced for merging tool, while libgcc 
>> use gcov_read_counter
>> interface?
>> Perhaps we can actually just duplicate the functions to avoid runtime to do 
>> all the scalling
>> and in_mem tests it won't need?
>
>
> I thought about this one a little. How about making the interface
> change conditionally, but still share the implementation?  The merge
> function bodies mostly remain unchanged and there is no runtime
> penalty for libgcov.  The new macros can be shared across most of the
> mergers.
>
> #ifdef IN_PREOFILE_TOOL
> #define GCOV_MERGE_EXTRA_ARGS  gcov_type *src, unsigned w
> #define GCOV_READ_COUNTER  *(src++) * w
> #else
> #define GCOV_MERGE_EXTRA_ARGS
> #define GCOV_READ_COUNTER gcov_read_counter ()
> #endif
>
> __gcov_merge_add (gcov_type *counters, unsigned n_counters,
>   GCOV_MERGE_EXTRA_ARGS)
> {
>
>  for (; n_counters; counters++, n_counters--)
>   {
>   *counters += GCOV_READ_COUNTER ;
>}
>
> }
>
> thanks,

Personally I don't think the run time test of in_mem will cause any
issue. This is in profile dumping, why don't we care a few more cycle
heres? it won't pollute the profile.

If you really don't like that, we can use the above approach, or I can
hide the logic in gcov_read_counter(), i.e. overload
gcov_read_counter() in profile_tool. For that, I will need a new
global variable SRC and set it before calling the merge function.
I would prefer to keep weight in _gcov_merge_* argument list.

What do you think?

-Rong

>
> David
>
>>
>> I would suggest going with libgcov.h changes and clenaups first, with 
>> interface changes next
>> and the gcov-tool is probably quite obvious at the end?
>> Do you think you can split the patch this way?
>>
>> Thanks and sorry for taking long to review. I should have more time again 
>> now.
>> Honza


Re: [RFC] libgcov.c re-factoring and offline profile-tool

2014-01-08 Thread Rong Xu
; 108   0   0 108  6c _gcov_execve.o (ex 
> ./lib64/gcc/x86_64-suse-linux/4.8/libgcov.a)
>  72   0   0  72  48 _gcov_reset.o (ex 
> ./lib64/gcc/x86_64-suse-linux/4.8/libgcov.a)
>  72   0   0  72  48 _gcov_dump.o (ex 
> ./lib64/gcc/x86_64-suse-linux/4.8/libgcov.a)
>  89   0   0  89  59 _gcov_interval_profiler.o (ex 
> ./lib64/gcc/x86_64-suse-linux/4.8/libgcov.a)
>  69   0   0  69  45 _gcov_pow2_profiler.o (ex 
> ./lib64/gcc/x86_64-suse-linux/4.8/libgcov.a)
> 115   0   0 115  73 _gcov_one_value_profiler.o (ex 
> ./lib64/gcc/x86_64-suse-linux/4.8/libgcov.a)
> 122   0   0 122  7a _gcov_indirect_call_profiler.o (ex 
> ./lib64/gcc/x86_64-suse-linux/4.8/libgcov.a)
>  57   0   0  57  39 _gcov_average_profiler.o (ex 
> ./lib64/gcc/x86_64-suse-linux/4.8/libgcov.a)
>  52   0   0  52  34 _gcov_ior_profiler.o (ex 
> ./lib64/gcc/x86_64-suse-linux/4.8/libgcov.a)
> 125   0   0 125  7d _gcov_merge_ior.o (ex 
> ./lib64/gcc/x86_64-suse-linux/4.8/libgcov.a)
>
2014-01-08  Rong Xu  

* libgcc/libgcov-driver.c (this_prg): make it local to save
bss space.
(gcov_exit_compute_summary): Ditto.
(gcov_exit_merge_gcda): Ditto.
(gcov_exit_merge_summary): Ditto.
(gcov_exit_dump_gcov): Ditto.

Index: libgcc/libgcov-driver.c
===
--- libgcc/libgcov-driver.c (revision 206437)
+++ libgcc/libgcov-driver.c (working copy)
@@ -303,8 +303,6 @@ gcov_compute_histogram (struct gcov_summary *sum)
 }
 }
 
-/* summary for program.  */
-static struct gcov_summary this_prg;
 /* gcda filename.  */
 static char *gi_filename;
 /* buffer for the fn_data from another program.  */
@@ -317,10 +315,10 @@ static struct gcov_summary_buffer *sum_buffer;
 static int run_accounted = 0;
 
 /* This funtions computes the program level summary and the histo-gram.
-   It computes and returns CRC32  and stored summari in THIS_PRG.  */
+   It computes and returns CRC32 and stored summary in THIS_PRG.  */
 
 static gcov_unsigned_t
-gcov_exit_compute_summary (void)
+gcov_exit_compute_summary (struct gcov_summary *this_prg)
 {
   struct gcov_info *gi_ptr;
   const struct gcov_fn_info *gfi_ptr;
@@ -332,7 +330,7 @@ static gcov_unsigned_t
   gcov_unsigned_t crc32 = 0;
 
   /* Find the totals for this execution.  */
-  memset (&this_prg, 0, sizeof (this_prg));
+  memset (this_prg, 0, sizeof (*this_prg));
   for (gi_ptr = gcov_list; gi_ptr; gi_ptr = gi_ptr->next)
 {
   crc32 = crc32_unsigned (crc32, gi_ptr->stamp);
@@ -357,7 +355,7 @@ static gcov_unsigned_t
   if (!gi_ptr->merge[t_ix])
 continue;
 
-  cs_ptr = &this_prg.ctrs[t_ix];
+  cs_ptr = &(this_prg->ctrs[t_ix]);
   cs_ptr->num += ci_ptr->num;
   crc32 = crc32_unsigned (crc32, ci_ptr->num);
 
@@ -371,7 +369,7 @@ static gcov_unsigned_t
 }
 }
 }
-  gcov_compute_histogram (&this_prg);
+  gcov_compute_histogram (this_prg);
   return crc32;
 }
 
@@ -393,6 +391,7 @@ struct gcov_filename_aux{
 static int
 gcov_exit_merge_gcda (struct gcov_info *gi_ptr,
   struct gcov_summary *prg_p,
+  struct gcov_summary *this_prg,
   gcov_position_t *summary_pos_p,
   gcov_position_t *eof_pos_p,
  gcov_unsigned_t crc32)
@@ -446,7 +445,7 @@ gcov_exit_merge_gcda (struct gcov_info *gi_ptr,
 goto next_summary;
 
   for (t_ix = 0; t_ix != GCOV_COUNTERS_SUMMABLE; t_ix++)
-if (tmp.ctrs[t_ix].num != this_prg.ctrs[t_ix].num)
+if (tmp.ctrs[t_ix].num != this_prg->ctrs[t_ix].num)
   goto next_summary;
   *prg_p = tmp;
   *summary_pos_p = *eof_pos_p;
@@ -636,7 +635,7 @@ gcov_exit_write_gcda (const struct gcov_info *gi_p
 
 static int
 gcov_exit_merge_summary (const struct gcov_info *gi_ptr, struct gcov_summary 
*prg,
-gcov_unsigned_t crc32,
+ struct gcov_summary *this_prg, gcov_unsigned_t crc32,
 struct gcov_summary *all_prg __attribute__ ((unused)))
 {
   struct gcov_ctr_summary *cs_prg, *cs_tprg;
@@ -650,7 +649,7 @@ gcov_exit_merge_summary (const struct gcov_info *g
   for (t_ix = 0; t_ix < GCOV_COUNTERS_SUMMABLE; t_ix++)
 {
   cs_prg = &(prg->ctrs[t_ix]);
-  cs_tprg = &this_prg.ctrs[t_ix];
+  cs_tprg = &(this_prg->ctrs[t_ix]);
 
   if (gi_ptr->merge[t_ix])
 {
@@ -719,7 +718,8 @@ gcov_exit_merge_summary (const struct gcov_info *g
 
 static void
 gcov_exit_dump_gcov (struct gcov_info *gi_ptr, struct gcov_filename_aux *gf,
-gcov_unsigned_t crc32, struct gcov_summary *

Re: [PATCH] Conditional count update for fast coverage test in multi-threaded programs

2013-12-20 Thread Rong Xu
 -3.01%   +2.87%-75.50%
benchmark_9   81.6+9.91%  +14.10%-69.81%
benchmark_10  69.8   +23.60%   -1.53% --
benchmark_11  89.9+0.34%   +4.32%-59.62%
benchmark_12  98.7+0.96%   +0.88%-10.52%
benchmark_13  80.5-8.31%   -0.55%-75.90%
benchmark_14  17.1  +429.47%   -5.21% -4.87%
benchmark_15  22.0  +295.70%   -1.29%-46.36%
benchmark_16  80.0-6.31%   +0.54%-62.61%
benchmark_17  83.5+4.84%  +10.71%-70.48%
benchmark_18  90.0-1.27%   +3.18%-72.30%
geometric mean   +24.51%   +4.81%  -62.44% (incomplete)

benchmark_13  77.1 -3.34%+5.75% --

On Mon, Nov 25, 2013 at 11:19 AM, Rong Xu  wrote:
> On Mon, Nov 25, 2013 at 2:11 AM, Richard Biener
>  wrote:
>> On Fri, Nov 22, 2013 at 10:49 PM, Rong Xu  wrote:
>>> On Fri, Nov 22, 2013 at 4:03 AM, Richard Biener
>>>  wrote:
>>>> On Fri, Nov 22, 2013 at 4:51 AM, Rong Xu  wrote:
>>>>> Hi,
>>>>>
>>>>> This patch injects a condition into the instrumented code for edge
>>>>> counter update. The counter value will not be updated after reaching
>>>>> value 1.
>>>>>
>>>>> The feature is under a new parameter --param=coverage-exec_once.
>>>>> Default is disabled and setting to 1 to enable.
>>>>>
>>>>> This extra check usually slows the program down. For SPEC 2006
>>>>> benchmarks (all single thread programs), we usually see around 20%-35%
>>>>> slow down in -O2 coverage build. This feature, however, is expected to
>>>>> improve the coverage run speed for multi-threaded programs, because
>>>>> there virtually no data race and false sharing in updating counters.
>>>>> The improvement can be significant for highly threaded programs -- we
>>>>> are seeing 7x speedup in coverage test run for some non-trivial google
>>>>> applications.
>>>>>
>>>>> Tested with bootstrap.
>>>>
>>>> Err - why not simply emit
>>>>
>>>>   counter = 1
>>>>
>>>> for the counter update itself with that --param (I don't like a --param
>>>> for this either).
>>>>
>>>> I assume that CPUs can avoid data-races and false sharing for
>>>> non-changing accesses?
>>>>
>>>
>>> I'm not aware of any CPU having this feature. I think a write to the
>>> shared cache line to invalidate all the shared copies. I cannot find
>>> any reference on checking the value of the write. Do you have any
>>> pointer to the feature?
>>
>> I don't have any pointer - but I remember seeing this in the context
>> of atomics thus it may be only in the context of using a xchg
>> or cmpxchg instruction.  Which would make it non-portable to
>> some extent (if you don't want to use atomic builtins here).
>>
>
> cmpxchg should work here -- it's a conditional write so the data race
> /false sharing can be avoided.
> I'm comparing the performance b/w explicit branch vs cmpxchg and will
> report back.
>
> -Rong
>
>
>> Richard.
>>
>>> I just tested this implementation vs. simply setting to 1, using
>>> google search as the benchmark.
>>> This one is 4.5x faster. The test was done on Intel Westmere systems.
>>>
>>> I can change the parameter to an option.
>>>
>>> -Rong
>>>
>>>> Richard.
>>>>
>>>>> Thanks,
>>>>>
>>>>> -Rong


Re: [PATCH] Conditional count update for fast coverage test in multi-threaded programs

2013-11-25 Thread Rong Xu
On Mon, Nov 25, 2013 at 2:11 AM, Richard Biener
 wrote:
> On Fri, Nov 22, 2013 at 10:49 PM, Rong Xu  wrote:
>> On Fri, Nov 22, 2013 at 4:03 AM, Richard Biener
>>  wrote:
>>> On Fri, Nov 22, 2013 at 4:51 AM, Rong Xu  wrote:
>>>> Hi,
>>>>
>>>> This patch injects a condition into the instrumented code for edge
>>>> counter update. The counter value will not be updated after reaching
>>>> value 1.
>>>>
>>>> The feature is under a new parameter --param=coverage-exec_once.
>>>> Default is disabled and setting to 1 to enable.
>>>>
>>>> This extra check usually slows the program down. For SPEC 2006
>>>> benchmarks (all single thread programs), we usually see around 20%-35%
>>>> slow down in -O2 coverage build. This feature, however, is expected to
>>>> improve the coverage run speed for multi-threaded programs, because
>>>> there virtually no data race and false sharing in updating counters.
>>>> The improvement can be significant for highly threaded programs -- we
>>>> are seeing 7x speedup in coverage test run for some non-trivial google
>>>> applications.
>>>>
>>>> Tested with bootstrap.
>>>
>>> Err - why not simply emit
>>>
>>>   counter = 1
>>>
>>> for the counter update itself with that --param (I don't like a --param
>>> for this either).
>>>
>>> I assume that CPUs can avoid data-races and false sharing for
>>> non-changing accesses?
>>>
>>
>> I'm not aware of any CPU having this feature. I think a write to the
>> shared cache line to invalidate all the shared copies. I cannot find
>> any reference on checking the value of the write. Do you have any
>> pointer to the feature?
>
> I don't have any pointer - but I remember seeing this in the context
> of atomics thus it may be only in the context of using a xchg
> or cmpxchg instruction.  Which would make it non-portable to
> some extent (if you don't want to use atomic builtins here).
>

cmpxchg should work here -- it's a conditional write so the data race
/false sharing can be avoided.
I'm comparing the performance b/w explicit branch vs cmpxchg and will
report back.

-Rong


> Richard.
>
>> I just tested this implementation vs. simply setting to 1, using
>> google search as the benchmark.
>> This one is 4.5x faster. The test was done on Intel Westmere systems.
>>
>> I can change the parameter to an option.
>>
>> -Rong
>>
>>> Richard.
>>>
>>>> Thanks,
>>>>
>>>> -Rong


Re: [PATCH] Conditional count update for fast coverage test in multi-threaded programs

2013-11-22 Thread Rong Xu
On Fri, Nov 22, 2013 at 4:03 AM, Richard Biener
 wrote:
> On Fri, Nov 22, 2013 at 4:51 AM, Rong Xu  wrote:
>> Hi,
>>
>> This patch injects a condition into the instrumented code for edge
>> counter update. The counter value will not be updated after reaching
>> value 1.
>>
>> The feature is under a new parameter --param=coverage-exec_once.
>> Default is disabled and setting to 1 to enable.
>>
>> This extra check usually slows the program down. For SPEC 2006
>> benchmarks (all single thread programs), we usually see around 20%-35%
>> slow down in -O2 coverage build. This feature, however, is expected to
>> improve the coverage run speed for multi-threaded programs, because
>> there virtually no data race and false sharing in updating counters.
>> The improvement can be significant for highly threaded programs -- we
>> are seeing 7x speedup in coverage test run for some non-trivial google
>> applications.
>>
>> Tested with bootstrap.
>
> Err - why not simply emit
>
>   counter = 1
>
> for the counter update itself with that --param (I don't like a --param
> for this either).
>
> I assume that CPUs can avoid data-races and false sharing for
> non-changing accesses?
>

I'm not aware of any CPU having this feature. I think a write to the
shared cache line to invalidate all the shared copies. I cannot find
any reference on checking the value of the write. Do you have any
pointer to the feature?

I just tested this implementation vs. simply setting to 1, using
google search as the benchmark.
This one is 4.5x faster. The test was done on Intel Westmere systems.

I can change the parameter to an option.

-Rong

> Richard.
>
>> Thanks,
>>
>> -Rong


[PATCH] Conditional count update for fast coverage test in multi-threaded programs

2013-11-21 Thread Rong Xu
Hi,

This patch injects a condition into the instrumented code for edge
counter update. The counter value will not be updated after reaching
value 1.

The feature is under a new parameter --param=coverage-exec_once.
Default is disabled and setting to 1 to enable.

This extra check usually slows the program down. For SPEC 2006
benchmarks (all single thread programs), we usually see around 20%-35%
slow down in -O2 coverage build. This feature, however, is expected to
improve the coverage run speed for multi-threaded programs, because
there virtually no data race and false sharing in updating counters.
The improvement can be significant for highly threaded programs -- we
are seeing 7x speedup in coverage test run for some non-trivial google
applications.

Tested with bootstrap.

Thanks,

-Rong
2013-11-21  Rong Xu  

* gcc/doc/invoke.texi (coverage-exec_once): Add document.
* gcc/params.def (coverage-exec_once): New param.
* gcc/profile.h (gcov_type sum_edge_counts): Add extern decl.
* gcc/profile.c (branch_prob): Add support for coverage-exec_once.
* gcc/tree-profile.c (gimple_gen_edge_profiler): Ditto.
(gimple_gen_ior_profiler): Ditto.
(insert_if_then): Ditto.
(add_execonce_wrapper): Ditto.
(is_pred_instrumentation_candidate): Ditto.
(add_predicate_to_edge_counters): Ditto.
(cleanup_pred_instrumentation): Ditto.
(init_pred_instrumentation): Ditto.
(tree_profiling): Ditto.

Index: gcc/doc/invoke.texi
===
--- gcc/doc/invoke.texi (revision 205226)
+++ gcc/doc/invoke.texi (working copy)
@@ -9937,6 +9937,10 @@ The default choice depends on the target.
 Set the maximum number of existing candidates that will be considered when
 seeking a basis for a new straight-line strength reduction candidate.
 
+@item coverage-exec_once
+Set to 1 to update each arc counter only once. This avoids false sharing
+and speeds up profile-generate run for multi-threaded programs.
+
 @end table
 @end table
 
Index: gcc/params.def
===
--- gcc/params.def  (revision 205226)
+++ gcc/params.def  (working copy)
@@ -441,6 +441,14 @@ DEFPARAM(TRACER_MIN_BRANCH_PROBABILITY,
 "Stop forward growth if the probability of best edge is less than this 
threshold (in percent). Used when profile feedback is not available",
 50, 0, 100)
 
+/* This parameter enables fast coverage test. Once the counter flips from 0
+   to 1, we no longer updating the value . This avoids false sharing and speeds
+   up profile-generate run for multi-threaded programs.  */
+DEFPARAM (PARAM_COVERAGE_EXEC_ONCE,
+ "coverage-exec_once",
+ "Stop incrementing edge counts once they become 1.",
+ 0, 0, 1)
+
 /* The maximum number of incoming edges to consider for crossjumping.  */
 DEFPARAM(PARAM_MAX_CROSSJUMP_EDGES,
 "max-crossjump-edges",
Index: gcc/profile.c
===
--- gcc/profile.c   (revision 205226)
+++ gcc/profile.c   (working copy)
@@ -67,6 +67,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "cfgloop.h"
 #include "dumpfile.h"
 #include "cgraph.h"
+#include "params.h"
 
 #include "profile.h"
 
@@ -1327,6 +1328,9 @@ branch_prob (void)
 
   /* Commit changes done by instrumentation.  */
   gsi_commit_edge_inserts ();
+
+  if (PARAM_VALUE (PARAM_COVERAGE_EXEC_ONCE))
+add_predicate_to_edge_counters ();
 }
 
   free_aux_for_edges ();
Index: gcc/profile.h
===
--- gcc/profile.h   (revision 205226)
+++ gcc/profile.h   (working copy)
@@ -46,6 +46,9 @@ extern gcov_type sum_edge_counts (vec
 extern void init_node_map (bool);
 extern void del_node_map (void);
 
+/* Insert a predicate to edge counter update stmt.  */
+extern void add_predicate_to_edge_counters (void);
+
 extern void get_working_sets (void);
 
 /* In predict.c.  */
Index: gcc/tree-profile.c
===
--- gcc/tree-profile.c  (revision 205226)
+++ gcc/tree-profile.c  (working copy)
@@ -52,6 +52,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "target.h"
 #include "tree-cfgcleanup.h"
 #include "tree-nested.h"
+#include "params.h"
 
 static GTY(()) tree gcov_type_node;
 static GTY(()) tree tree_interval_profiler_fn;
@@ -67,6 +68,11 @@ static GTY(()) tree ic_void_ptr_var;
 static GTY(()) tree ic_gcov_type_ptr_var;
 static GTY(()) tree ptr_void;
 
+/* A pointer-set of the last statement in each block of statements that need to
+   be applied a predicate .  */
+static struct pointer_set_t *pred_instrumentation_candidate = NULL;
+
+
 /* Do initiali

Re: [RFC] libgcov.c re-factoring and offline profile-tool

2013-11-20 Thread Rong Xu
Ping.

On Mon, Nov 18, 2013 at 12:31 PM, Xinliang David Li  wrote:
> gcov-dump tool does raw dump of profile data. In the long run, we
> should have only one gcov profile manipulation tool, so it might be
> better to incorporate gcov-dump into gcov-tool and get rid of
> 'gcov-dump'.
>
> David
>
> On Mon, Nov 18, 2013 at 12:24 PM, Rong Xu  wrote:
>> Hi, all
>>
>> This is the new patch for gcov-tool (previously profile-tool).
>>
>> Honza: can you comment on the new merge interface? David posted some
>> comments in an earlier email and we want to know what's your opinion.
>>
>> Test patch has been tested with boostrap, regresssion,
>> profiledbootstrap and SPEC2006.
>>
>> Noticeable changes from the earlier version:
>>
>> 1. create a new file libgcov.h and move libgcov-*.h headers to libgcov.h
>> So we can included multiple libgcov-*.c without adding new macros.
>>
>> 2. split libgcov.h specific code in gcvo-io.h to libcc/libgcov.h
>> Avoid multiple-page of code under IN_LIBGCOV macro -- this
>> improves the readability.
>>
>> 3. make gcov_var static, and move the definition from gcov-io.h to
>> gcov-io.c. Also
>>move some static functions accessing gcov_var to gcvo-io.c
>> Current code rely on GCOV_LINKAGE tricks to avoid multi-definition. I don't 
>> see
>> a reason that gcov_var needs to exposed as a global.
>>
>> 4. expose gcov_write_strings() and gcov_sync() to gcov_tool usage
>>
>> 5. rename profile-tool to gcov-tool per Honza's suggestion.
>>
>> Thanks,
>>
>> -Rong
>>
>> On Tue, Nov 12, 2013 at 4:27 PM, Rong Xu  wrote:
>>> The patch was checked in as r204730.
>>>
>>> Tested with
>>> configure with --enable-languages=all,obj-c++
>>> (ada is currently broken, so I was not be able test)
>>> bootstrap and profiledbootstrap
>>> regression for -m32 and -m64.
>>> spec2006
>>>
>>> The only semantic change is __gcov_flush() used to be under L_gcov.
>>> I put the function to libgcov-interface.c and made it under
>>> L_gcov_flush (newly added).
>>> So if you need this function, you have to enable L_gcov_flush in the
>>> libgcc/Makefile.in.
>>>
>>> Thanks,
>>>
>>> -Rong
>>>
>>> On Mon, Nov 11, 2013 at 10:19 AM, Rong Xu  wrote:
>>>> On Mon, Nov 11, 2013 at 7:02 AM, Jan Hubicka  wrote:
>>>>>> 2013-11-04  Rong Xu  
>>>>>>
>>>>>>   * libgcc/libgcov.c: Delete as part of re-factoring.
>>>>>>   * libgcc/libgcov-profiler.c (__gcov_interval_profiler): Moved from
>>>>>>   libgcov.c
>>>>>>   (__gcov_pow2_profiler): Ditto.
>>>>>>   (__gcov_one_value_profiler_body): Ditto.
>>>>>>   (__gcov_one_value_profiler): Ditto.
>>>>>>   (__gcov_indirect_call_profiler): Ditto.
>>>>>>   (__gcov_indirect_call_profiler_v2): Ditto.
>>>>>>   (__gcov_average_profiler): Ditto.
>>>>>>   (__gcov_ior_profiler): Ditto.
>>>>>>   * libgcc/libgcov-driver.c (this_prg): Make it file scope
>>>>>>   static variable.
>>>>>>   (all_prg): Ditto.
>>>>>>   (crc32): Ditto.
>>>>>>   (gi_filename): Ditto.
>>>>>>   (fn_buffer): Ditto.
>>>>>>   (sum_buffer): Ditto.
>>>>>> (struct gcov_filename_aux): New types to store auxiliary 
>>>>>> information
>>>>>> for gi_filename.
>>>>>>   (gcov_version): Moved from libgcov.c.
>>>>>>   (crc32_unsigned): Ditto.
>>>>>>   (gcov_histogram_insert): Ditto.
>>>>>>   (gcov_compute_histogram): Ditto.
>>>>>>   (gcov_exit): Ditto.
>>>>>>   (gcov_clear): Ditto.
>>>>>>   (__gcov_init): Ditto.
>>>>>>   (gcov_exit_compute_summary): New function split from gcov_exit().
>>>>>>   (gcov_exit_merge_gcda): Ditto.
>>>>>>   (gcov_exit_write_gcda): Ditto.
>>>>>>   (gcov_exit_dump_gcov): Ditto.
>>>>>>   * libgcc/libgcov-interface.c (init_mx): Moved from libgcov.c.
>>>>>>   (init_mx_once): Ditto.
>>>>>>   (__gcov_flush): Ditto.
>>>>>>   (__gcov_reset): Ditto.
>>>>>>   (__gcov_dump): Di

Re: atomic update of profile counters (issue7000044)

2013-11-20 Thread Rong Xu
OK. Sorry for miss-reading the message.
In that case, linking in libatomic becomes a separate issue. We don't
need to touch gcc.c in this patch.

Thanks,

-Rong

On Wed, Nov 20, 2013 at 2:19 PM, Andrew Pinski  wrote:
> On Wed, Nov 20, 2013 at 2:07 PM, Rong Xu  wrote:
>> Joseph and Andrew, thanks for the suggestion. That's really helpful.
>>
>> Here is the new patch for gcc.c.
>> Basically, it's just what you have suggested: enclosing -latomic with
>> --as-needed, and using macros.
>> For the case of no --as-needed support, I use static link. (just found
>> that some code already using this in the SPEC).
>> I'm flexible on this part -- if you think this is unnecessary, I can remove.
>
>
> I think Joseph's suggestion was also to include -latomic even when not
> generating atomic profiling due to the C11 code requiring it.
>
> Thanks,
> Andrew
>
>>
>> Thanks,
>>
>> -Rong
>>
>> Index: gcc.c
>> ===
>> --- gcc.c   (revision 205053)
>> +++ gcc.c   (working copy)
>> @@ -748,6 +748,23 @@ proper position among the other output files.  */
>>  %{fvtable-verify=preinit: -lvtv -u_vtable_map_vars_start
>> -u_vtable_map_vars_end}}"
>>  #endif
>>
>> +/* This spec is for linking in libatomic in gcov atomic counter update.
>> +   We will use the atomic functions defined in libatomic, only when the 
>> builtin
>> +   versions are not available. In the case of no LD_AS_NEEDED support, we
>> +   link libatomic statically.  */
>> +
>> +#ifndef GCOV_ATOMIC_SPEC
>> +#if USE_LD_AS_NEEDED
>> +#define GCOV_ATOMIC_SPEC "%{fprofile-generate-atomic=*:" 
>> LD_AS_NEEDED_OPTION \
>> +  " -latomic} " LD_NO_AS_NEEDED_OPTION
>> +#elif defined(HAVE_LD_STATIC_DYNAMIC)
>> +#define GCOV_ATOMIC_SPEC "%{fprofile-generate-atomic=*:" LD_STATIC_OPTION \
>> +" -latomic " LD_DYNAMIC_OPTION "}"
>> +#else /* !USE_LD_AS_NEEDED && !HAVE_LD_STATIC_DYNAMIC  */
>> +#define GCOV_ATOMIC_SPEC "%{fprofile-generate-atomic=*:-latomic}"
>> +#endif
>> +#endif
>> +
>>  /* -u* was put back because both BSD and SysV seem to support it.  */
>>  /* %{static:} simply prevents an error message if the target machine
>> doesn't handle -static.  */
>> @@ -771,7 +788,8 @@ proper position among the other output files.  */
>>  
>> %{fopenmp|ftree-parallelize-loops=*:%:include(libgomp.spec)%(link_gomp)}\
>>  %{fgnu-tm:%:include(libitm.spec)%(link_itm)}\
>>  %(mflib) " STACK_SPLIT_SPEC "\
>> -%{fprofile-arcs|fprofile-generate*|coverage:-lgcov} " SANITIZER_SPEC " \
>> +%{fprofile-arcs|fprofile-generate*|coverage:-lgcov\
>> +  " GCOV_ATOMIC_SPEC "} " SANITIZER_SPEC " \
>>  %{!nostdlib:%{!nodefaultlibs:%(link_ssp) %(link_gcc_c_sequence)}}\
>>  %{!nostdlib:%{!nostartfiles:%E}} %{T*} }}"
>>
>>
>>
>> On Wed, Nov 20, 2013 at 1:04 PM, Joseph S. Myers
>>  wrote:
>>> On Wed, 20 Nov 2013, Rong Xu wrote:
>>>
>>>> I could do this in the SPEC
>>>>   -Wl,-Bstatic -latomic -Wl,-Bdynamic
>>>> which would link libatomic statically.
>>>> I works for me. But it looks a little weird in gcc driver.
>>>
>>> I think we should generally link libatomic with --as-needed by default on
>>> platforms supporting --as-needed, in line with the general principle that
>>> C code just using language not library facilities (_Atomic in this case)
>>> shouldn't need any special options to link it (libatomic is like libgcc,
>>> which is linked in automatically); the trickier question is what to do
>>> with it on any systems supporting shared libraries but not --as-needed.
>>>
>>> --
>>> Joseph S. Myers
>>> jos...@codesourcery.com


Re: atomic update of profile counters (issue7000044)

2013-11-20 Thread Rong Xu
Joseph and Andrew, thanks for the suggestion. That's really helpful.

Here is the new patch for gcc.c.
Basically, it's just what you have suggested: enclosing -latomic with
--as-needed, and using macros.
For the case of no --as-needed support, I use static link. (just found
that some code already using this in the SPEC).
I'm flexible on this part -- if you think this is unnecessary, I can remove.

Thanks,

-Rong

Index: gcc.c
===
--- gcc.c   (revision 205053)
+++ gcc.c   (working copy)
@@ -748,6 +748,23 @@ proper position among the other output files.  */
 %{fvtable-verify=preinit: -lvtv -u_vtable_map_vars_start
-u_vtable_map_vars_end}}"
 #endif

+/* This spec is for linking in libatomic in gcov atomic counter update.
+   We will use the atomic functions defined in libatomic, only when the builtin
+   versions are not available. In the case of no LD_AS_NEEDED support, we
+   link libatomic statically.  */
+
+#ifndef GCOV_ATOMIC_SPEC
+#if USE_LD_AS_NEEDED
+#define GCOV_ATOMIC_SPEC "%{fprofile-generate-atomic=*:" LD_AS_NEEDED_OPTION \
+  " -latomic} " LD_NO_AS_NEEDED_OPTION
+#elif defined(HAVE_LD_STATIC_DYNAMIC)
+#define GCOV_ATOMIC_SPEC "%{fprofile-generate-atomic=*:" LD_STATIC_OPTION \
+" -latomic " LD_DYNAMIC_OPTION "}"
+#else /* !USE_LD_AS_NEEDED && !HAVE_LD_STATIC_DYNAMIC  */
+#define GCOV_ATOMIC_SPEC "%{fprofile-generate-atomic=*:-latomic}"
+#endif
+#endif
+
 /* -u* was put back because both BSD and SysV seem to support it.  */
 /* %{static:} simply prevents an error message if the target machine
doesn't handle -static.  */
@@ -771,7 +788,8 @@ proper position among the other output files.  */
 %{fopenmp|ftree-parallelize-loops=*:%:include(libgomp.spec)%(link_gomp)}\
 %{fgnu-tm:%:include(libitm.spec)%(link_itm)}\
 %(mflib) " STACK_SPLIT_SPEC "\
-%{fprofile-arcs|fprofile-generate*|coverage:-lgcov} " SANITIZER_SPEC " \
+%{fprofile-arcs|fprofile-generate*|coverage:-lgcov\
+  " GCOV_ATOMIC_SPEC "} " SANITIZER_SPEC " \
 %{!nostdlib:%{!nodefaultlibs:%(link_ssp) %(link_gcc_c_sequence)}}\
     %{!nostdlib:%{!nostartfiles:%E}} %{T*} }}"



On Wed, Nov 20, 2013 at 1:04 PM, Joseph S. Myers
 wrote:
> On Wed, 20 Nov 2013, Rong Xu wrote:
>
>> I could do this in the SPEC
>>   -Wl,-Bstatic -latomic -Wl,-Bdynamic
>> which would link libatomic statically.
>> I works for me. But it looks a little weird in gcc driver.
>
> I think we should generally link libatomic with --as-needed by default on
> platforms supporting --as-needed, in line with the general principle that
> C code just using language not library facilities (_Atomic in this case)
> shouldn't need any special options to link it (libatomic is like libgcc,
> which is linked in automatically); the trickier question is what to do
> with it on any systems supporting shared libraries but not --as-needed.
>
> --
> Joseph S. Myers
> jos...@codesourcery.com


Re: atomic update of profile counters (issue7000044)

2013-11-20 Thread Rong Xu
On Tue, Nov 19, 2013 at 5:11 PM, Andrew Pinski  wrote:
> On Tue, Nov 19, 2013 at 5:02 PM, Rong Xu  wrote:
>> Hi all,
>>
>> I merged this old patch with current trunk. I also make the following changes
>> (1) not using weak references. Now every *profile_atomic() has it's
>> own .o so that none of them will be in the final binary if
>> -fprofile-generate-atomic is not specified.
>> (2) more value profilers have the atomic version.
>> (3) not link to libatomic. I used to link the libatomic in the
>> presence of -fprofile-generate-atomic, per Andrew's suggestion. It
>> used to work. But now if I can add -latomic in the SPEC, it cannot
>> find the libatomic.so.1 (unless I specify the PATH). I did not find an
>> easy way to statically link libatomic.a. Andrew: Do you have any
>> suggestion? Or should we let the user link to libatomic.a if the
>> builtins are not expanded?
>
> It should work for an installed GCC.  For testing you might need
> something that is included inside testsuite/lib/atomic-dg.exp which
> sets the library path to include libatomic build directory.

When I change the SPEC to include libatomic,
the compiler can find libatomic. I.e. using
>> gcc -O2 -fprofile-generate -fprofile-generate-atomic=3 hello_world.c
generates a.out without any problem.

But since there are both shared and static libatomic in lib64, it
chooses to use the dynamic one.
>> ldd a.out
linux-vdso.so.1 =>  (0x7fff56bff000)
libatomic.so.1 => not found
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x2b0720261000)
/lib64/ld-linux-x86-64.so.2 (0x2b072003c000)

>> ./a.out
./a.out: error while loading shared libraries: libatomic.so.1: cannot
open shared object file: No such file or directory

while
>> LD_LIBRARY_PATH=/lib64 ./a.out
works fine.

I think that's the same reason we set the library path in
testsuite/lib/atomic-dg.exp, because /lib64
is not in the dynamic library search list.

I could do this in the SPEC
  -Wl,-Bstatic -latomic -Wl,-Bdynamic
which would link libatomic statically.
I works for me. But it looks a little weird in gcc driver.

Index: gcc.c
===
--- gcc.c   (revision 205053)
+++ gcc.c   (working copy)
@@ -771,7 +771,8 @@
 %{fopenmp|ftree-parallelize-loops=*:%:include(libgomp.spec)%(link_gomp)}\
 %{fgnu-tm:%:include(libitm.spec)%(link_itm)}\
 %(mflib) " STACK_SPLIT_SPEC "\
-%{fprofile-arcs|fprofile-generate*|coverage:-lgcov} " SANITIZER_SPEC " \
+%{fprofile-arcs|fprofile-generate*|coverage:-lgcov\
+  %{fprofile-generate-atomic=*:-Wl,-Bstatic -latomic
-Wl,-Bdynamic}} " SANITIZER_SPEC " \
 %{!nostdlib:%{!nodefaultlibs:%(link_ssp) %(link_gcc_c_sequence)}}\
 %{!nostdlib:%{!nostartfiles:%E}} %{T*} }}"
 #endif



> I think now we require libatomic in more cases (C11 atomic support for
> an example).
>
> Thanks,
> Andrew Pinski
>
>>
>> Is this OK for trunk?
>>
>> Thanks,
>>
>> -Rong
>>
>> On Mon, Jan 7, 2013 at 12:55 PM, Rong Xu  wrote:
>>> Function __gcov_indirect_call_profiler_atomic (which contains call to
>>> the atomic function) is always emitted in libgcov.
>>> Since we only link libatomic when -fprofile-gen-atomic is specified,
>>> we have to make the atomic function weak -- otherwise, there is a
>>> unsat for regular FDO gen build (of course, when the builtin is not
>>> expanded).
>>>
>>> An alternative it to always link libatomic together with libgcov. Then
>>> we don't need the weak stuff. I'm not sure when one is better.
>>>
>>> -Rong
>>>
>>> On Mon, Jan 7, 2013 at 12:36 PM, Richard Henderson  wrote:
>>>> On 01/03/2013 04:42 PM, Rong Xu wrote:
>>>>> It links libatomic when -fprofile-gen-atomic is specified for FDO
>>>>> instrumentation build. Here I assume libatomic is always installed.
>>>>> Andrew: do you think if this is reasonable?
>>>>>
>>>>> It also disables the functionality if target does not support weak
>>>>> (ie. TARGET_SUPPORTS_WEAK == 0).
>>>>
>>>> Since you're linking libatomic, you don't need weak references.
>>>>
>>>> I think its ok to assume libatomic is installed, given that the
>>>> user has had to explicitly use the command-line option.
>>>>
>>>>
>>>> r~


Re: atomic update of profile counters (issue7000044)

2013-11-19 Thread Rong Xu
Hi all,

I merged this old patch with current trunk. I also make the following changes
(1) not using weak references. Now every *profile_atomic() has it's
own .o so that none of them will be in the final binary if
-fprofile-generate-atomic is not specified.
(2) more value profilers have the atomic version.
(3) not link to libatomic. I used to link the libatomic in the
presence of -fprofile-generate-atomic, per Andrew's suggestion. It
used to work. But now if I can add -latomic in the SPEC, it cannot
find the libatomic.so.1 (unless I specify the PATH). I did not find an
easy way to statically link libatomic.a. Andrew: Do you have any
suggestion? Or should we let the user link to libatomic.a if the
builtins are not expanded?

Is this OK for trunk?

Thanks,

-Rong

On Mon, Jan 7, 2013 at 12:55 PM, Rong Xu  wrote:
> Function __gcov_indirect_call_profiler_atomic (which contains call to
> the atomic function) is always emitted in libgcov.
> Since we only link libatomic when -fprofile-gen-atomic is specified,
> we have to make the atomic function weak -- otherwise, there is a
> unsat for regular FDO gen build (of course, when the builtin is not
> expanded).
>
> An alternative it to always link libatomic together with libgcov. Then
> we don't need the weak stuff. I'm not sure when one is better.
>
> -Rong
>
> On Mon, Jan 7, 2013 at 12:36 PM, Richard Henderson  wrote:
>> On 01/03/2013 04:42 PM, Rong Xu wrote:
>>> It links libatomic when -fprofile-gen-atomic is specified for FDO
>>> instrumentation build. Here I assume libatomic is always installed.
>>> Andrew: do you think if this is reasonable?
>>>
>>> It also disables the functionality if target does not support weak
>>> (ie. TARGET_SUPPORTS_WEAK == 0).
>>
>> Since you're linking libatomic, you don't need weak references.
>>
>> I think its ok to assume libatomic is installed, given that the
>> user has had to explicitly use the command-line option.
>>
>>
>> r~
2013-11-19  Rong Xu  

* gcc/gcov-io.h: Add atomic function macros for compiler use.
* gcc/common.opt (fprofile-generate-atomic): New option.
* gcc/tree-profile.c (gimple_init_edge_profiler): Support for
atomic counter update.
(gimple_gen_edge_profiler): Ditto.
* libgcc/libgcov-profiler.c 
(__gcov_interval_profiler_atomic): Ditto.
(__gcov_pow2_profiler_atomic): Ditto.
(__gcov_one_value_profiler_body_atomic): Ditto.
(__gcov_one_value_profiler_atomic): Ditto.
(__gcov_indirect_call_profiler_atomic): Ditto.
(__gcov_indirect_call_profiler_v2_atomic): Ditto.
(__gcov_time_profiler_atomic): Ditto.
(__gcov_average_profiler_atomic): Ditto.
* gcc/gcc.c: Link libatomic when -fprofile-generate-atomic used.
* libgcc/Makefile.in: Add atomic objects.

Index: gcc/common.opt
===
--- gcc/common.opt  (revision 205053)
+++ gcc/common.opt  (working copy)
@@ -1684,6 +1684,15 @@ fprofile-correction
 Common Report Var(flag_profile_correction)
 Enable correction of flow inconsistent profile data input
 
+; fprofile-generate-atomic=0: default and disable atomical update.
+; fprofile-generate-atomic=1: atomically update edge profile counters.
+; fprofile-generate-atomic=2: atomically update value profile counters.
+; fprofile-generate-atomic=3: atomically update edge and value profile 
counters.
+; other values will be ignored (fall back to the default of 0).
+fprofile-generate-atomic=
+Common Joined UInteger Report Var(flag_profile_generate_atomic) Init(3) 
Optimization
+fprofile-generate-atomic=[0..3] Atomical increments of profile counters.
+
 fprofile-generate
 Common
 Enable common options for generating profile info for profile feedback 
directed optimizations
Index: gcc/tree-profile.c
===
--- gcc/tree-profile.c  (revision 205053)
+++ gcc/tree-profile.c  (working copy)
@@ -155,6 +155,9 @@ gimple_init_edge_profiler (void)
   tree ic_profiler_fn_type;
   tree average_profiler_fn_type;
   tree time_profiler_fn_type;
+  const char *fn_name;
+  bool profile_gen_value_atomic = (flag_profile_generate_atomic == 2 ||
+   flag_profile_generate_atomic == 3);
 
   if (!gcov_type_node)
 {
@@ -167,9 +170,10 @@ gimple_init_edge_profiler (void)
  gcov_type_ptr, gcov_type_node,
  integer_type_node,
  unsigned_type_node, NULL_TREE);
+  fn_name = profile_gen_value_atomic ? "__gcov_interval_profiler_atomic" : 
+ "__gcov_interval_profiler";
   tree_interval_profiler_fn
- = build_fn_decl ("__gcov_in

Re: Fix summary generation with fork

2013-11-19 Thread Rong Xu
On Mon, Nov 18, 2013 at 2:15 PM, Jan Hubicka  wrote:
> Hi,
> this patch fixes problem we noticed with Martin Liska where gcov_dump is 
> called
> several times per execution of firefox (on each fork and exec).  This causes
> runs to be large and makes functions executed once per program to be 
> considered
> cold.
>
> This patch makes us to update runs just once per execution and not on each
> streaming of profile data.  While testing it I also noticed that program
> summary is now broken, since crc32 is accumulated per each dumping instead 
> just
> once.

You are right. I forgot that gcov_exit() can be called multiple times.
Should have
initialized crc32 per gcov_exit() call.

Thanks for the fix.
>
> I believe the newly introduced static vars should go - there is nothing really
> preventing us from doing two concurent updates and also it just unnecesary
> increases to footprint of libgcov.  I converted thus all_prg and crc32 back to
> local vars.
>
> Bootstrapped/regtested x86_64-linux, comitted.
>
> * libgcov-driver.c (get_gcov_dump_complete): Update comments.
> (all_prg, crc32): Remove static vars.
> (gcov_exit_compute_summary): Rewrite to return crc32; do not clear
> all_prg.
> (gcov_exit_merge_gcda): Add crc32 parameter.
> (gcov_exit_merge_summary): Add crc32 and all_prg parameter;
> do not account run if it was already accounted.
> (gcov_exit_dump_gcov): Add crc32 and all_prg parameters.
> (gcov_exit): Initialize all_prg; update.
> Index: libgcov-driver.c
> ===
> --- libgcov-driver.c(revision 204945)
> +++ libgcov-driver.c(working copy)
> @@ -96,7 +96,7 @@ static size_t gcov_max_filename = 0;
>  /* Flag when the profile has already been dumped via __gcov_dump().  */
>  static int gcov_dump_complete;
>
> -/* A global functino that get the vaule of gcov_dump_complete.  */
> +/* A global function that get the vaule of gcov_dump_complete.  */
>
>  int
>  get_gcov_dump_complete (void)
> @@ -319,12 +319,6 @@ gcov_compute_histogram (struct gcov_summ
>
>  /* summary for program.  */
>  static struct gcov_summary this_prg;
> -#if !GCOV_LOCKED
> -/* summary for all instances of program.  */
> -static struct gcov_summary all_prg;
> -#endif
> -/* crc32 for this program.  */
> -static gcov_unsigned_t crc32;
>  /* gcda filename.  */
>  static char *gi_filename;
>  /* buffer for the fn_data from another program.  */
> @@ -333,10 +327,9 @@ static struct gcov_fn_buffer *fn_buffer;
>  static struct gcov_summary_buffer *sum_buffer;
>
>  /* This funtions computes the program level summary and the histo-gram.
> -   It initializes ALL_PRG, computes CRC32, and stores the summary in
> -   THIS_PRG. All these three variables are file statics.  */
> +   It computes and returns CRC32  and stored summari in THIS_PRG.  */
>
> -static void
> +static gcov_unsigned_t
>  gcov_exit_compute_summary (void)
>  {
>struct gcov_info *gi_ptr;
> @@ -346,10 +339,8 @@ gcov_exit_compute_summary (void)
>int f_ix;
>unsigned t_ix;
>gcov_unsigned_t c_num;
> +  gcov_unsigned_t crc32 = 0;
>
> -#if !GCOV_LOCKED
> -  memset (&all_prg, 0, sizeof (all_prg));
> -#endif
>/* Find the totals for this execution.  */
>memset (&this_prg, 0, sizeof (this_prg));
>for (gi_ptr = gcov_list; gi_ptr; gi_ptr = gi_ptr->next)
> @@ -391,6 +382,7 @@ gcov_exit_compute_summary (void)
>  }
>  }
>gcov_compute_histogram (&this_prg);
> +  return crc32;
>  }
>
>  /* A struct that bundles all the related information about the
> @@ -412,7 +404,8 @@ static int
>  gcov_exit_merge_gcda (struct gcov_info *gi_ptr,
>struct gcov_summary *prg_p,
>gcov_position_t *summary_pos_p,
> -  gcov_position_t *eof_pos_p)
> +  gcov_position_t *eof_pos_p,
> + gcov_unsigned_t crc32)
>  {
>gcov_unsigned_t tag, length;
>unsigned t_ix;
> @@ -652,13 +645,19 @@ gcov_exit_write_gcda (const struct gcov_
> Return -1 on error. Return 0 on success.  */
>
>  static int
> -gcov_exit_merge_summary (const struct gcov_info *gi_ptr, struct gcov_summary 
> *prg)
> +gcov_exit_merge_summary (const struct gcov_info *gi_ptr, struct gcov_summary 
> *prg,
> +gcov_unsigned_t crc32, struct gcov_summary *all_prg)
>  {
>struct gcov_ctr_summary *cs_prg, *cs_tprg;
> -#if !GCOV_LOCKED
> -  struct gcov_ctr_summary *cs_all;
> -#endif
>unsigned t_ix;
> +  /* If application calls fork or exec multiple times, we end up storing
> + profile repeadely.  We should not account this as multiple runs or
> + functions executed once may mistakely become cold.  */
> +  static int run_accounted = 0;
> +#if !GCOV_LOCKED
> +  /* summary for all instances of program.  */
> +  struct gcov_ctr_summary *cs_all;
> +#endif
>
>/* Merge the summaries.  */
>for (t_ix = 0; t_ix < GCOV_COUNTERS_SUMMABLE; t_ix++

Re: [RFC] libgcov.c re-factoring and offline profile-tool

2013-11-12 Thread Rong Xu
The patch was checked in as r204730.

Tested with
configure with --enable-languages=all,obj-c++
(ada is currently broken, so I was not be able test)
bootstrap and profiledbootstrap
regression for -m32 and -m64.
spec2006

The only semantic change is __gcov_flush() used to be under L_gcov.
I put the function to libgcov-interface.c and made it under
L_gcov_flush (newly added).
So if you need this function, you have to enable L_gcov_flush in the
libgcc/Makefile.in.

Thanks,

-Rong

On Mon, Nov 11, 2013 at 10:19 AM, Rong Xu  wrote:
> On Mon, Nov 11, 2013 at 7:02 AM, Jan Hubicka  wrote:
>>> 2013-11-04  Rong Xu  
>>>
>>>   * libgcc/libgcov.c: Delete as part of re-factoring.
>>>   * libgcc/libgcov-profiler.c (__gcov_interval_profiler): Moved from
>>>   libgcov.c
>>>   (__gcov_pow2_profiler): Ditto.
>>>   (__gcov_one_value_profiler_body): Ditto.
>>>   (__gcov_one_value_profiler): Ditto.
>>>   (__gcov_indirect_call_profiler): Ditto.
>>>   (__gcov_indirect_call_profiler_v2): Ditto.
>>>   (__gcov_average_profiler): Ditto.
>>>   (__gcov_ior_profiler): Ditto.
>>>   * libgcc/libgcov-driver.c (this_prg): Make it file scope
>>>   static variable.
>>>   (all_prg): Ditto.
>>>   (crc32): Ditto.
>>>   (gi_filename): Ditto.
>>>   (fn_buffer): Ditto.
>>>   (sum_buffer): Ditto.
>>> (struct gcov_filename_aux): New types to store auxiliary information
>>> for gi_filename.
>>>   (gcov_version): Moved from libgcov.c.
>>>   (crc32_unsigned): Ditto.
>>>   (gcov_histogram_insert): Ditto.
>>>   (gcov_compute_histogram): Ditto.
>>>   (gcov_exit): Ditto.
>>>   (gcov_clear): Ditto.
>>>   (__gcov_init): Ditto.
>>>   (gcov_exit_compute_summary): New function split from gcov_exit().
>>>   (gcov_exit_merge_gcda): Ditto.
>>>   (gcov_exit_write_gcda): Ditto.
>>>   (gcov_exit_dump_gcov): Ditto.
>>>   * libgcc/libgcov-interface.c (init_mx): Moved from libgcov.c.
>>>   (init_mx_once): Ditto.
>>>   (__gcov_flush): Ditto.
>>>   (__gcov_reset): Ditto.
>>>   (__gcov_dump): Ditto.
>>>   (__gcov_fork): Ditto.
>>>   (__gcov_execl): Ditto.
>>>   (__gcov_execlp): Ditto.
>>>   (__gcov_execle): Ditto.
>>>   (__gcov_execv): Ditto.
>>>   (__gcov_execvp): Ditto.
>>>   (__gcov_execve): Ditto.
>>>   * libgcc/libgcov-driver-system.c (gcov_error): New utility function.
>>>   (allocate_filename_struct): New function split from gcov_exit().
>>>   (gcov_exit_open_gcda_file): Ditto.
>>> (create_file_directory): Moved from libgcov.c.
>>>   * libgcc/libgcov-merge.c:
>>>   (__gcov_merge_add): Moved from libgcov.c.
>>>   (__gcov_merge_ior): Ditto.
>>>   (__gcov_merge_single): Ditto.
>>>   (__gcov_merge_delta): Ditto.
>>>   * libgcc/Makefile.in: Change to build newly added files.
>>>   * gcc/gcov-io.h (__gcov_merge_ior): Add the decl to avoid warning.
>>>
>>
>> Hi,
>> the patch looks resonable. I take your word on the fact that there are no 
>> changes
>> in the code, I did not read it all ;)))
>
> We did split gcov_exit() into more functions.
> But, semantically the code is the same.
>
>>> Index: libgcc/Makefile.in
>>> ===
>>> --- libgcc/Makefile.in(revision 204285)
>>> +++ libgcc/Makefile.in(working copy)
>>> @@ -853,17 +853,37 @@ include $(iterator)
>>>  # Build libgcov components.
>>>
>>>  # Defined in libgcov.c, included only in gcov library
>>> -LIBGCOV = _gcov _gcov_merge_add _gcov_merge_single _gcov_merge_delta \
>>> -_gcov_fork _gcov_execl _gcov_execlp _gcov_execle \
>>> -_gcov_execv _gcov_execvp _gcov_execve _gcov_reset _gcov_dump \
>>> -_gcov_interval_profiler _gcov_pow2_profiler _gcov_one_value_profiler \
>>> +##LIBGCOV = _gcov _gcov_merge_add _gcov_merge_single _gcov_merge_delta \
>>> +##_gcov_fork _gcov_execl _gcov_execlp _gcov_execle \
>>> +##_gcov_execv _gcov_execvp _gcov_execve _gcov_reset _gcov_dump \
>>> +##_gcov_interval_profiler _gcov_pow2_profiler _gcov_one_value_profiler 
>>> \
>>> +##_gcov_indirect_call_profiler _gcov_average_profiler 
>>> _gcov_ior_profiler \
>>> +##_gcov_merge_ior _gcov_indirect_cal

Re: [PATCH] Time profiler - phase 1

2013-11-12 Thread Rong Xu
Thanks. I'll have this change included in my libgcov.c re-re-factoring patch.

-Rong

On Tue, Nov 12, 2013 at 10:18 AM, Martin Liška  wrote:
> Hi Rong,
>   you are right, that's the only usage and so that you can declare it
> static.
>
> Thank you,
> Martin
>
> On Nov 12, 2013 7:10 PM, "Rong Xu"  wrote:
>>
>> A question about the newly added global variable function_counter.
>> Does it have to be a global? Can I make it a file static, within macro
>> L_gcov_time_profiler? I don't find a use other than in
>> __gcov_time_profiler().
>>
>> thanks,
>>
>> -Rong
>>
>>
>>
>> On Mon, Nov 11, 2013 at 9:52 AM, Martin Liška 
>> wrote:
>> > On 11 November 2013 15:57, Jan Hubicka  wrote:
>> >>> +2013-10-29  Martin Liska  
>> >>> + Jan Hubicka
>> >>> 
>> >>> +
>> >>> + * cgraph.c (dump_cgraph_node): Profile dump added.
>> >>> + * cgraph.h (struct cgraph_node): New time profile variable
>> >>> added.
>> >>> + * cgraphclones.c (cgraph_clone_node): Time profile is cloned.
>> >>> + * gcov-io.h (gcov_type): New profiler type introduced.
>> >>> + * ipa-profile.c (lto_output_node): Streaming for time profile
>> >>> added.
>> >>> + (input_node): Time profiler is read from LTO stream.
>> >>> + * predict.c (maybe_hot_count_p): Hot prediction changed.
>> >>> + * profile.c (instrument_values): New case for time profiler
>> >>> added.
>> >>> + (compute_value_histograms): Read of time profile.
>> >>> + * tree-pretty-print.c (dump_function_header): Time profiler is
>> >>> dumped.
>> >>> + * tree-profile.c (init_ic_make_global_vars): Time profiler
>> >>> function added.
>> >>> + (gimple_init_edge_profiler): TP function instrumentation.
>> >>> + (gimple_gen_time_profiler): New.
>> >>> + * value-prof.c (gimple_add_histogram_value): Support for time
>> >>> profiler
>> >>> + added.
>> >>> + (dump_histogram_value): TP type added to dumps.
>> >>> + (visit_hist): More sensitive check that takes TP into account.
>> >>> + (gimple_find_values_to_profile): TP instrumentation.
>> >>> + * value-prof.h (hist_type): New histogram type added.
>> >>> + (struct histogram_value_t): Pointer to struct function added.
>> >>> + * libgcc/Makefile.in: New GCOV merge function for TP added.
>> >>> + * libgcov.c: function_counter variable introduced.
>> >>> + (_gcov_merge_time_profile): New.
>> >>> + (_gcov_time_profiler): New.
>> >>> +
>> >>>  2013-11-05  Steven Bosscher  
>> >>>
>> >>>
>> >>> diff --git a/gcc/ipa-profile.c b/gcc/ipa-profile.c
>> >>> index 1260069..eff4b51 100644
>> >>> --- a/gcc/ipa-profile.c
>> >>> +++ b/gcc/ipa-profile.c
>> >>> @@ -465,6 +465,7 @@ ipa_propagate_frequency (struct cgraph_node *node)
>> >>>if (d.maybe_unlikely_executed)
>> >>>  {
>> >>>node->frequency = NODE_FREQUENCY_UNLIKELY_EXECUTED;
>> >>> +  node->tp_first_run = 0;
>> >>>if (dump_file)
>> >>>   fprintf (dump_file, "Node %s promoted to unlikely executed.\n",
>> >>>cgraph_node_name (node));
>> >>
>> >> Why do you put tp_first_run to 0?  The function may be unlikely but it
>> >> still may have
>> >> average possition in the program execution. I.e. when it is executed
>> >> once per many
>> >> invocations during the train run)
>> >
>> > That makes no sense, I've removed this assignment.
>> >
>> >>> @@ -895,9 +907,22 @@ compute_value_histograms (histogram_values
>> >>> values, unsigned cfg_checksum,
>> >>>hist->hvalue.counters =  XNEWVEC (gcov_type, hist->n_counters);
>> >>>for (j = 0; j < hist->n_counters; j++)
>> >>>  if (aact_count)
>> >>> -   hist->hvalue.counters[j] = aact_count[j];
>> >>> - else
>> >>> -   hist->hvalue.counters[j] = 0;
>> >>> +  hist-&

Re: [PATCH] Time profiler - phase 1

2013-11-12 Thread Rong Xu
A question about the newly added global variable function_counter.
Does it have to be a global? Can I make it a file static, within macro
L_gcov_time_profiler? I don't find a use other than in
__gcov_time_profiler().

thanks,

-Rong



On Mon, Nov 11, 2013 at 9:52 AM, Martin Liška  wrote:
> On 11 November 2013 15:57, Jan Hubicka  wrote:
>>> +2013-10-29  Martin Liska  
>>> + Jan Hubicka  
>>> +
>>> + * cgraph.c (dump_cgraph_node): Profile dump added.
>>> + * cgraph.h (struct cgraph_node): New time profile variable added.
>>> + * cgraphclones.c (cgraph_clone_node): Time profile is cloned.
>>> + * gcov-io.h (gcov_type): New profiler type introduced.
>>> + * ipa-profile.c (lto_output_node): Streaming for time profile added.
>>> + (input_node): Time profiler is read from LTO stream.
>>> + * predict.c (maybe_hot_count_p): Hot prediction changed.
>>> + * profile.c (instrument_values): New case for time profiler added.
>>> + (compute_value_histograms): Read of time profile.
>>> + * tree-pretty-print.c (dump_function_header): Time profiler is dumped.
>>> + * tree-profile.c (init_ic_make_global_vars): Time profiler function 
>>> added.
>>> + (gimple_init_edge_profiler): TP function instrumentation.
>>> + (gimple_gen_time_profiler): New.
>>> + * value-prof.c (gimple_add_histogram_value): Support for time profiler
>>> + added.
>>> + (dump_histogram_value): TP type added to dumps.
>>> + (visit_hist): More sensitive check that takes TP into account.
>>> + (gimple_find_values_to_profile): TP instrumentation.
>>> + * value-prof.h (hist_type): New histogram type added.
>>> + (struct histogram_value_t): Pointer to struct function added.
>>> + * libgcc/Makefile.in: New GCOV merge function for TP added.
>>> + * libgcov.c: function_counter variable introduced.
>>> + (_gcov_merge_time_profile): New.
>>> + (_gcov_time_profiler): New.
>>> +
>>>  2013-11-05  Steven Bosscher  
>>>
>>>
>>> diff --git a/gcc/ipa-profile.c b/gcc/ipa-profile.c
>>> index 1260069..eff4b51 100644
>>> --- a/gcc/ipa-profile.c
>>> +++ b/gcc/ipa-profile.c
>>> @@ -465,6 +465,7 @@ ipa_propagate_frequency (struct cgraph_node *node)
>>>if (d.maybe_unlikely_executed)
>>>  {
>>>node->frequency = NODE_FREQUENCY_UNLIKELY_EXECUTED;
>>> +  node->tp_first_run = 0;
>>>if (dump_file)
>>>   fprintf (dump_file, "Node %s promoted to unlikely executed.\n",
>>>cgraph_node_name (node));
>>
>> Why do you put tp_first_run to 0?  The function may be unlikely but it still 
>> may have
>> average possition in the program execution. I.e. when it is executed once 
>> per many
>> invocations during the train run)
>
> That makes no sense, I've removed this assignment.
>
>>> @@ -895,9 +907,22 @@ compute_value_histograms (histogram_values values, 
>>> unsigned cfg_checksum,
>>>hist->hvalue.counters =  XNEWVEC (gcov_type, hist->n_counters);
>>>for (j = 0; j < hist->n_counters; j++)
>>>  if (aact_count)
>>> -   hist->hvalue.counters[j] = aact_count[j];
>>> - else
>>> -   hist->hvalue.counters[j] = 0;
>>> +  hist->hvalue.counters[j] = aact_count[j];
>>> +else
>>> +  hist->hvalue.counters[j] = 0;
>>> +
>>> +  /* Time profiler counter is not related to any statement,
>>> +   * so that we have to read the counter and set the value to
>>> +   * the corresponding call graph node.  */
>>
>> Reformat comment to GNU style.
>
> Yes.
>
>>> +2013-10-28  Martin Liska 
>>> +
>>> + * gcc.dg/time-profiler-1.c: New test.
>>> + * gcc.dg/time-profiler-2.c: Ditto.
>>> +
>> It is custom to put all changelogs to the beggining of your patch (just 
>> nitpicking ;))
>>> @@ -222,6 +225,18 @@ gimple_init_edge_profiler (void)
>>>   = tree_cons (get_identifier ("leaf"), NULL,
>>>DECL_ATTRIBUTES (tree_indirect_call_profiler_fn));
>>>
>>> +  /* void (*) (gcov_type *, gcov_type, void *)  */
>>> +  time_profiler_fn_type
>>> += build_function_type_list (void_type_node,
>>> +   gcov_type_ptr, NULL_TREE);
>>> +  tree_time_profiler_fn
>>> +   = build_fn_decl ("__gcov_time_profiler",
>>> +  time_profiler_fn_type);
>>> +  TREE_NOTHROW (tree_time_profiler_fn) = 1;
>>> +  DECL_ATTRIBUTES (tree_time_profiler_fn)
>>> + = tree_cons (get_identifier ("leaf"), NULL,
>>> +  DECL_ATTRIBUTES (tree_time_profiler_fn));
>>
>> We should udpate this code to use set_call_expr_flags but by incremental 
>> patch.
>
> OK, I will include this change to the phase 2.
>
>> The patch is OK with changes above.  Do you have write permissin to commit 
>> it?
>> If not, just send updated version and I will commit it + follow the write 
>> access
>> instructions on gcc.gnu.org homepage.
>
> Yes, I do have commit right. I will bootstrap the pa

Re: [RFC] libgcov.c re-factoring and offline profile-tool

2013-11-11 Thread Rong Xu
On Mon, Nov 11, 2013 at 7:17 AM, Jan Hubicka  wrote:
>> Here is the patch that includes profile-tool.
>> Profile-tool now has two functions: merge and rewrite. I'll add diff later.
>>
>> Compiler is tested with spec2006 and profiledbootstrap.
>> profile-tool is tested with spec2006 profiles.
>
> Hi,
> it would be nice if you could elaborate bit more on the tool and what it does.
> I plan to implement relink only instrumentatino for LTO and for that I will 
> need
> profile reading/handling that is independent on functio nbodies, so perhaps
> we can unify the infrastructure.  My plans always was to merge the gcov and
> profile.c profile handling logic and make it bit more generic.

This tool is designed to manipulate raw profile counters. We plan to
have the following functionality:

(1) merge the gcda file with weights. We have fairly complicated build
system. Sometime online merging in
current libgcov.a not feasible (or not desirable). We need offline
tool the merge the FDO profiles.
The program may have multiple training inputs with different
importance -- this is where weighted merging
comes from.

(2) diff the profile. We want to know how different of two profile
data. Some tolerance of mismatch is required here.
This is crucial for performance triaging.

(3) rewrite the gcda file. We can do some (user-direct) fix-up work in
the rewrite. This is may not be important for FDO. But
this will be very useful for LIPO (like fix-up the indirect-target due
to mismatch, rewrite module-grouping)

(4) better dumping: like dump top-n-hottest
functions/objects/indirectly-targets etc.

This tool reconstructs the gcov_info list from gcda files and then
uses significant part of libgcov.c code:
merge, compute summary/histo-gram, and dump gcda files etc.

The merging into current libgcov can also use this piece of code to do
the in-memory merge. The advantage
here is we will have precise histro-gram.

>
> Also I think profile-tool is a bit generic name. I.e. it does not realy say it
> is related to GCC's coverage profiling.  Maybe gcc-profile-tool or
> gcov-profile-tool may be better?

Yes. gcov-profile-tool seems to be better, or simply gcov-tool?

> Perhaps an resonable option would also be to bundle it into gcov binary...

Bundling into gcov is an interesting idea. I do plan to read gcno file
to extract more information (at least the function name).
Bungling into gcda means the tool can work on bb level, rather on the
raw counters. Let me think a bit more on this.

>> Index: libgcc/libgcov-tool.c
>> ===
>> --- libgcc/libgcov-tool.c (revision 0)
>> +++ libgcc/libgcov-tool.c (revision 0)
>> @@ -0,0 +1,800 @@
>> +/* GCC instrumentation plugin for ThreadSanitizer.
>> +   Copyright (C) 2011-2013 Free Software Foundation, Inc.
>> +   Contributed by Dmitry Vyukov 
>
> You need to update this...
> Why there needs to be libgcov changes?

David pointed out the copyright issue in an earlier email. I fixed it.

The changes in libgcov are
(1) ifdef the target headers. I need all the libgcov functionality to
read/write the gcda files, as well as summary and histo-gram. But the
binary is for host machines.
(2) change the merge function so that I can take an in-memory
gcov-info as input.

>
> Honza


Re: [RFC] libgcov.c re-factoring and offline profile-tool

2013-11-11 Thread Rong Xu
On Mon, Nov 11, 2013 at 7:02 AM, Jan Hubicka  wrote:
>> 2013-11-04  Rong Xu  
>>
>>   * libgcc/libgcov.c: Delete as part of re-factoring.
>>   * libgcc/libgcov-profiler.c (__gcov_interval_profiler): Moved from
>>   libgcov.c
>>   (__gcov_pow2_profiler): Ditto.
>>   (__gcov_one_value_profiler_body): Ditto.
>>   (__gcov_one_value_profiler): Ditto.
>>   (__gcov_indirect_call_profiler): Ditto.
>>   (__gcov_indirect_call_profiler_v2): Ditto.
>>   (__gcov_average_profiler): Ditto.
>>   (__gcov_ior_profiler): Ditto.
>>   * libgcc/libgcov-driver.c (this_prg): Make it file scope
>>   static variable.
>>   (all_prg): Ditto.
>>   (crc32): Ditto.
>>   (gi_filename): Ditto.
>>   (fn_buffer): Ditto.
>>   (sum_buffer): Ditto.
>> (struct gcov_filename_aux): New types to store auxiliary information
>> for gi_filename.
>>   (gcov_version): Moved from libgcov.c.
>>   (crc32_unsigned): Ditto.
>>   (gcov_histogram_insert): Ditto.
>>   (gcov_compute_histogram): Ditto.
>>   (gcov_exit): Ditto.
>>   (gcov_clear): Ditto.
>>   (__gcov_init): Ditto.
>>   (gcov_exit_compute_summary): New function split from gcov_exit().
>>   (gcov_exit_merge_gcda): Ditto.
>>   (gcov_exit_write_gcda): Ditto.
>>   (gcov_exit_dump_gcov): Ditto.
>>   * libgcc/libgcov-interface.c (init_mx): Moved from libgcov.c.
>>   (init_mx_once): Ditto.
>>   (__gcov_flush): Ditto.
>>   (__gcov_reset): Ditto.
>>   (__gcov_dump): Ditto.
>>   (__gcov_fork): Ditto.
>>   (__gcov_execl): Ditto.
>>   (__gcov_execlp): Ditto.
>>   (__gcov_execle): Ditto.
>>   (__gcov_execv): Ditto.
>>   (__gcov_execvp): Ditto.
>>   (__gcov_execve): Ditto.
>>   * libgcc/libgcov-driver-system.c (gcov_error): New utility function.
>>   (allocate_filename_struct): New function split from gcov_exit().
>>   (gcov_exit_open_gcda_file): Ditto.
>> (create_file_directory): Moved from libgcov.c.
>>   * libgcc/libgcov-merge.c:
>>   (__gcov_merge_add): Moved from libgcov.c.
>>   (__gcov_merge_ior): Ditto.
>>   (__gcov_merge_single): Ditto.
>>   (__gcov_merge_delta): Ditto.
>>   * libgcc/Makefile.in: Change to build newly added files.
>>   * gcc/gcov-io.h (__gcov_merge_ior): Add the decl to avoid warning.
>>
>
> Hi,
> the patch looks resonable. I take your word on the fact that there are no 
> changes
> in the code, I did not read it all ;)))

We did split gcov_exit() into more functions.
But, semantically the code is the same.

>> Index: libgcc/Makefile.in
>> ===
>> --- libgcc/Makefile.in(revision 204285)
>> +++ libgcc/Makefile.in(working copy)
>> @@ -853,17 +853,37 @@ include $(iterator)
>>  # Build libgcov components.
>>
>>  # Defined in libgcov.c, included only in gcov library
>> -LIBGCOV = _gcov _gcov_merge_add _gcov_merge_single _gcov_merge_delta \
>> -_gcov_fork _gcov_execl _gcov_execlp _gcov_execle \
>> -_gcov_execv _gcov_execvp _gcov_execve _gcov_reset _gcov_dump \
>> -_gcov_interval_profiler _gcov_pow2_profiler _gcov_one_value_profiler \
>> +##LIBGCOV = _gcov _gcov_merge_add _gcov_merge_single _gcov_merge_delta \
>> +##_gcov_fork _gcov_execl _gcov_execlp _gcov_execle \
>> +##_gcov_execv _gcov_execvp _gcov_execve _gcov_reset _gcov_dump \
>> +##_gcov_interval_profiler _gcov_pow2_profiler _gcov_one_value_profiler \
>> +##_gcov_indirect_call_profiler _gcov_average_profiler 
>> _gcov_ior_profiler \
>> +##_gcov_merge_ior _gcov_indirect_call_profiler_v2
>
> Probably no need for this commnet block.

Yes. I ready remove in the most recently patch.

>> +
>> +LIBGCOV_MERGE = _gcov_merge_add _gcov_merge_single _gcov_merge_delta 
>> _gcov_merge_ior
>> +LIBGCOV_PROFILER = _gcov_interval_profiler _gcov_pow2_profiler 
>> _gcov_one_value_profiler \
>>  _gcov_indirect_call_profiler _gcov_average_profiler _gcov_ior_profiler \
>> -_gcov_merge_ior _gcov_indirect_call_profiler_v2
>> +_gcov_indirect_call_profiler_v2
>> +LIBGCOV_INTERFACE = _gcov_flush _gcov_fork _gcov_execl _gcov_execlp 
>> _gcov_execle \
>> +_gcov_execv _gcov_execvp _gcov_execve _gcov_reset _gcov_dump
>> +LIBGCOV_DRIVER = _gcov
>>
>> -libgcov-objects = $(patsubst %,%$(objext),$(LIBGCOV))
>> +libgcov-merge-objects = $(patsubst %,%$(objext),$(LIBGCOV_MERGE

Re: [RFC] libgcov.c re-factoring and offline profile-tool

2013-11-08 Thread Rong Xu
On Fri, Nov 8, 2013 at 11:30 AM, Xinliang David Li  wrote:
> On Fri, Nov 8, 2013 at 11:21 AM, Rong Xu  wrote:
>> On Fri, Nov 8, 2013 at 11:02 AM, Xinliang David Li  
>> wrote:
>>> On Fri, Nov 8, 2013 at 10:48 AM, Rong Xu  wrote:
>>>> On Fri, Nov 8, 2013 at 10:22 AM, Xinliang David Li  
>>>> wrote:
>>>>> in libgcov-driver.c
>>>>>
>>>>>> /* Flag when the profile has already been dumped via __gcov_dump().  */
>>>>>> static int gcov_dump_complete;
>>>>>
>>>>>> inline void
>>>>>> set_gcov_dump_complete (void)
>>>>>>{
>>>>>   > gcov_dump_complete = 1;
>>>>>>}
>>>>>
>>>>>>inline void
>>>>>>reset_gcov_dump_complete (void)
>>>>>>{
>>>>>>  gcov_dump_complete = 0;
>>>>>>}
>>>>>
>>>>>
>>>>> These should be moved to libgcov-interface.c. Is there reason to not
>>>>> mark them as static?
>>>>
>>>> gcov_dump_complete is used in gcov_exit() which is in
>>>> libgcov-driver.c, but it's set in __gcov_dump and __gcov_reset, both
>>>> in libgcov-interface.c.
>>>> I want gcov_dump_complete a static. So I have to add to global
>>>> functions that accessible from libgcov-interface.c.
>>>> Another choice is to move __gcov_dump and __gcov_reset to
>>>> libgcov-driver.c and that will simplify the code.
>>>>
>>>
>>> Ok  then you should remove the 'inline' keyword for the set and reset
>>> functions and keep them in libgcov-driver.c.
>>
>> Will remove 'inline' keyword.
>>
>>>
>>>>>
>>>>> in libgcov-profiler.c, line 142
>>>>>
>>>>>> #if defined(HAVE_CC_TLS) && !defined (USE_EMUTLS)
>>>>>> __thread
>>>>>
>>>>> trailing whilte space.
>>>>>
>>>>
>>>> Will fix.
>>>>
>>>>>
>>>>>> || (VTABLE_USES_DESCRIPTORS && __gcov_indirect_call_callee
>>>>>> && *(void **) cur_func == *(void **) 
>>>>>> __gcov_indirect_call_callee))
>>>>>
>>>>> trailing white space.
>>>>>
>>>>
>>>> Will fix.
>>>>
>>>>>
>>>>> in libgcov-merge.c
>>>>>
>>>>> 1) I don't think you need in_mem flag. For in memory merge, the source != 
>>>>> NULL.
>>>>> 2) document the new source and weight parameter -- especially the weight.
>>>>
>>>> Will do.
>>>>
>>>>> 3) How do you deal with integer overflow ?
>>>>
>>>> This is good question. gcvo_type is (typically) long long. I thought
>>>> the count value will not be nearly close to the max of long long.
>>>> (We did see overflow in compiler, but that's because of the scaling --
>>>> some of the scaling factors are really large.)
>>>>
>>>
>>> Why not passing weight as a scaled PERCENT  (as branch prob) for the
>>> source? THis way, the merge tool does not need to do scaling twice.
>>>
>>
>> there are two weights, so unless you have two weights in the merge_add
>> functions, you have to
>> traverse the list twice. Do you mean pass addition parameters?
>>
>> Currently, merge tools is done this way:
>> merge (p1, p2, w1, w2)
>> first pass: merge_add (p1, p1, w1)
>> second pass: merge_add (p1, p2, w2)
>>
>
> Only need to pass the normalized the weight (in percent) for one
> profile source -- say norm_w2 : w2/(w1+w2), and the merge function can
> do this in one pass as norm_w1 = 1-norm_w2.
>
>
>> I initially had both weights in merge_add function. but later I found that
>> to deal with mis-match profile, I need to find out the gcov_info in p1
>> but not p2, (they need to multiply by w1 only).
>> So I choose the above structure.
>>
>> Another thing about the PERCENT is the inaccuracy for floating points.
>
> This is how profile update work in profile-use compilation -- so that
> should not be a big problem.
>
> Before you fix this, I'd like to hear what Honza and other reviewers think.
>
> thanks,
>
> David
>

OK. Let's get more comments on this before I re-work on this.
I also want to point out that the traversal the 

Re: [RFC] libgcov.c re-factoring and offline profile-tool

2013-11-08 Thread Rong Xu
On Fri, Nov 8, 2013 at 11:02 AM, Xinliang David Li  wrote:
> On Fri, Nov 8, 2013 at 10:48 AM, Rong Xu  wrote:
>> On Fri, Nov 8, 2013 at 10:22 AM, Xinliang David Li  
>> wrote:
>>> in libgcov-driver.c
>>>
>>>> /* Flag when the profile has already been dumped via __gcov_dump().  */
>>>> static int gcov_dump_complete;
>>>
>>>> inline void
>>>> set_gcov_dump_complete (void)
>>>>{
>>>   > gcov_dump_complete = 1;
>>>>}
>>>
>>>>inline void
>>>>reset_gcov_dump_complete (void)
>>>>{
>>>>  gcov_dump_complete = 0;
>>>>}
>>>
>>>
>>> These should be moved to libgcov-interface.c. Is there reason to not
>>> mark them as static?
>>
>> gcov_dump_complete is used in gcov_exit() which is in
>> libgcov-driver.c, but it's set in __gcov_dump and __gcov_reset, both
>> in libgcov-interface.c.
>> I want gcov_dump_complete a static. So I have to add to global
>> functions that accessible from libgcov-interface.c.
>> Another choice is to move __gcov_dump and __gcov_reset to
>> libgcov-driver.c and that will simplify the code.
>>
>
> Ok  then you should remove the 'inline' keyword for the set and reset
> functions and keep them in libgcov-driver.c.

Will remove 'inline' keyword.

>
>>>
>>> in libgcov-profiler.c, line 142
>>>
>>>> #if defined(HAVE_CC_TLS) && !defined (USE_EMUTLS)
>>>> __thread
>>>
>>> trailing whilte space.
>>>
>>
>> Will fix.
>>
>>>
>>>> || (VTABLE_USES_DESCRIPTORS && __gcov_indirect_call_callee
>>>> && *(void **) cur_func == *(void **) __gcov_indirect_call_callee))
>>>
>>> trailing white space.
>>>
>>
>> Will fix.
>>
>>>
>>> in libgcov-merge.c
>>>
>>> 1) I don't think you need in_mem flag. For in memory merge, the source != 
>>> NULL.
>>> 2) document the new source and weight parameter -- especially the weight.
>>
>> Will do.
>>
>>> 3) How do you deal with integer overflow ?
>>
>> This is good question. gcvo_type is (typically) long long. I thought
>> the count value will not be nearly close to the max of long long.
>> (We did see overflow in compiler, but that's because of the scaling --
>> some of the scaling factors are really large.)
>>
>
> Why not passing weight as a scaled PERCENT  (as branch prob) for the
> source? THis way, the merge tool does not need to do scaling twice.
>

there are two weights, so unless you have two weights in the merge_add
functions, you have to
traverse the list twice. Do you mean pass addition parameters?

Currently, merge tools is done this way:
merge (p1, p2, w1, w2)
first pass: merge_add (p1, p1, w1)
second pass: merge_add (p1, p2, w2)

I initially had both weights in merge_add function. but later I found that
to deal with mis-match profile, I need to find out the gcov_info in p1
but not p2, (they need to multiply by w1 only).
So I choose the above structure.

Another thing about the PERCENT is the inaccuracy for floating points.

I have the scaling function in rewrite sub-command mainly for debug purpose:
I merge the same profile with a weight 1:9.
Then I scale the result profile by 0.1. I was expecting identical
profile as the source. But due to floating point inaccuracy, some
counters are off slightly.


> David
>
>
>
>>>
>>> David
>>>
>>>
>>>
>>>
>>>
>>>
>>> On Thu, Nov 7, 2013 at 3:34 PM, Rong Xu  wrote:
>>>> On Thu, Nov 7, 2013 at 9:40 AM, Joseph S. Myers  
>>>> wrote:
>>>>> On Thu, 7 Nov 2013, Rong Xu wrote:
>>>>>
>>>>>> Thanks Joseph for these detailed comments/suggestions.
>>>>>> The fixed patch is attached to this email.
>>>>>> The only thing left out is the Texinfo manual. Do you mean this tool
>>>>>> should have its it's own texi file in gcc/doc?
>>>>>
>>>>> Its own texi file, probably included as a chapter by gcc.texi (like
>>>>> gcov.texi is).
>>>>
>>>> OK. will add this in.
>>>>
>>>> My last patch that changes the command handling is actually broken.
>>>> Please ignore that patch and use the one in this email.
>>>>
>>>> Also did some minor adjust of the code in libgcov.
>>>>
>>>> Thanks,
>>>>
>>>> -Rong
>>>>
>>>>
>>>>>
>>>>> --
>>>>> Joseph S. Myers
>>>>> jos...@codesourcery.com


Re: [RFC] libgcov.c re-factoring and offline profile-tool

2013-11-08 Thread Rong Xu
A question about inhibit_libc.
When inhibit_libc is defined, we provide dummy functions for all the
__gcov_* methods.
Is it purposely to minimize the footprint?
I'm thinking to allow some codes that are independent of libc (and its
headers) under this. Is it OK?

-Rong

On Fri, Nov 8, 2013 at 10:48 AM, Rong Xu  wrote:
> On Fri, Nov 8, 2013 at 10:22 AM, Xinliang David Li  wrote:
>> in libgcov-driver.c
>>
>>> /* Flag when the profile has already been dumped via __gcov_dump().  */
>>> static int gcov_dump_complete;
>>
>>> inline void
>>> set_gcov_dump_complete (void)
>>>{
>>   > gcov_dump_complete = 1;
>>>}
>>
>>>inline void
>>>reset_gcov_dump_complete (void)
>>>{
>>>  gcov_dump_complete = 0;
>>>}
>>
>>
>> These should be moved to libgcov-interface.c. Is there reason to not
>> mark them as static?
>
> gcov_dump_complete is used in gcov_exit() which is in
> libgcov-driver.c, but it's set in __gcov_dump and __gcov_reset, both
> in libgcov-interface.c.
> I want gcov_dump_complete a static. So I have to add to global
> functions that accessible from libgcov-interface.c.
> Another choice is to move __gcov_dump and __gcov_reset to
> libgcov-driver.c and that will simplify the code.
>
>>
>> in libgcov-profiler.c, line 142
>>
>>> #if defined(HAVE_CC_TLS) && !defined (USE_EMUTLS)
>>> __thread
>>
>> trailing whilte space.
>>
>
> Will fix.
>
>>
>>> || (VTABLE_USES_DESCRIPTORS && __gcov_indirect_call_callee
>>> && *(void **) cur_func == *(void **) __gcov_indirect_call_callee))
>>
>> trailing white space.
>>
>
> Will fix.
>
>>
>> in libgcov-merge.c
>>
>> 1) I don't think you need in_mem flag. For in memory merge, the source != 
>> NULL.
>> 2) document the new source and weight parameter -- especially the weight.
>
> Will do.
>
>> 3) How do you deal with integer overflow ?
>
> This is good question. gcvo_type is (typically) long long. I thought
> the count value will not be nearly close to the max of long long.
> (We did see overflow in compiler, but that's because of the scaling --
> some of the scaling factors are really large.)
>
>>
>> David
>>
>>
>>
>>
>>
>>
>> On Thu, Nov 7, 2013 at 3:34 PM, Rong Xu  wrote:
>>> On Thu, Nov 7, 2013 at 9:40 AM, Joseph S. Myers  
>>> wrote:
>>>> On Thu, 7 Nov 2013, Rong Xu wrote:
>>>>
>>>>> Thanks Joseph for these detailed comments/suggestions.
>>>>> The fixed patch is attached to this email.
>>>>> The only thing left out is the Texinfo manual. Do you mean this tool
>>>>> should have its it's own texi file in gcc/doc?
>>>>
>>>> Its own texi file, probably included as a chapter by gcc.texi (like
>>>> gcov.texi is).
>>>
>>> OK. will add this in.
>>>
>>> My last patch that changes the command handling is actually broken.
>>> Please ignore that patch and use the one in this email.
>>>
>>> Also did some minor adjust of the code in libgcov.
>>>
>>> Thanks,
>>>
>>> -Rong
>>>
>>>
>>>>
>>>> --
>>>> Joseph S. Myers
>>>> jos...@codesourcery.com


Re: [RFC] libgcov.c re-factoring and offline profile-tool

2013-11-08 Thread Rong Xu
On Fri, Nov 8, 2013 at 10:22 AM, Xinliang David Li  wrote:
> in libgcov-driver.c
>
>> /* Flag when the profile has already been dumped via __gcov_dump().  */
>> static int gcov_dump_complete;
>
>> inline void
>> set_gcov_dump_complete (void)
>>{
>   > gcov_dump_complete = 1;
>>}
>
>>inline void
>>reset_gcov_dump_complete (void)
>>{
>>  gcov_dump_complete = 0;
>>}
>
>
> These should be moved to libgcov-interface.c. Is there reason to not
> mark them as static?

gcov_dump_complete is used in gcov_exit() which is in
libgcov-driver.c, but it's set in __gcov_dump and __gcov_reset, both
in libgcov-interface.c.
I want gcov_dump_complete a static. So I have to add to global
functions that accessible from libgcov-interface.c.
Another choice is to move __gcov_dump and __gcov_reset to
libgcov-driver.c and that will simplify the code.

>
> in libgcov-profiler.c, line 142
>
>> #if defined(HAVE_CC_TLS) && !defined (USE_EMUTLS)
>> __thread
>
> trailing whilte space.
>

Will fix.

>
>> || (VTABLE_USES_DESCRIPTORS && __gcov_indirect_call_callee
>> && *(void **) cur_func == *(void **) __gcov_indirect_call_callee))
>
> trailing white space.
>

Will fix.

>
> in libgcov-merge.c
>
> 1) I don't think you need in_mem flag. For in memory merge, the source != 
> NULL.
> 2) document the new source and weight parameter -- especially the weight.

Will do.

> 3) How do you deal with integer overflow ?

This is good question. gcvo_type is (typically) long long. I thought
the count value will not be nearly close to the max of long long.
(We did see overflow in compiler, but that's because of the scaling --
some of the scaling factors are really large.)

>
> David
>
>
>
>
>
>
> On Thu, Nov 7, 2013 at 3:34 PM, Rong Xu  wrote:
>> On Thu, Nov 7, 2013 at 9:40 AM, Joseph S. Myers  
>> wrote:
>>> On Thu, 7 Nov 2013, Rong Xu wrote:
>>>
>>>> Thanks Joseph for these detailed comments/suggestions.
>>>> The fixed patch is attached to this email.
>>>> The only thing left out is the Texinfo manual. Do you mean this tool
>>>> should have its it's own texi file in gcc/doc?
>>>
>>> Its own texi file, probably included as a chapter by gcc.texi (like
>>> gcov.texi is).
>>
>> OK. will add this in.
>>
>> My last patch that changes the command handling is actually broken.
>> Please ignore that patch and use the one in this email.
>>
>> Also did some minor adjust of the code in libgcov.
>>
>> Thanks,
>>
>> -Rong
>>
>>
>>>
>>> --
>>> Joseph S. Myers
>>> jos...@codesourcery.com


Re: [RFC] libgcov.c re-factoring and offline profile-tool

2013-11-04 Thread Rong Xu
BTW, this part of patch only does the re-factoring work.
I haven't finished the porting of profile-tool part of patch to the
trunk. They should be in a separate patch anyway.

-Rong

On Mon, Nov 4, 2013 at 2:43 PM, Rong Xu  wrote:
> Honza, Thanks for the comments!
> Attached is the patch.
> Passed spec2006 fdo build and profiledbootstrap in gcc.
> I'm doing some other tests.
>
> -Rong
>
> On Mon, Nov 4, 2013 at 1:55 AM, Jan Hubicka  wrote:
>>> Thanks! Can you attach the patch file for the proposed change?
>>>
>>> David
>>>
>>> On Fri, Nov 1, 2013 at 2:08 PM, Rong Xu  wrote:
>>> > libgcov.c is the main file to generate libgcov.a. This single source 
>>> > generates
>>> > 21 object files and then archives to a library. These objects are of
>>> > very different purposes but they are in the same file guarded by various 
>>> > macros.
>>> > The source file becomes quite large and its readability becomes very
>>> > low. In its current state:
>>> > 1) The code is very hard to understand by new developers;
>>
>> Agreed, libgcov has became hard to maintain since several parts are written
>> in very low-level way.
>>> > 2) It makes regressions more likely when making simple changes;
>>> > More importantly,
>>> > 3) it makes code reuse/sharing very difficult. For instance, Using
>>> > libgcov to implement an offline profile tool (see below); kernel FDO
>>> > support etc.
>>
>> Yep, it was my longer term plan to do this, too.
>>> >
>>> > We came to this issue when working on an offline tool to handle
>>> > profile counters.
>>> > Our plan is to have a profile-tool can merge, diff, collect statistics, 
>>> > and
>>> > better dump raw profile counters. This is one of the most requested 
>>> > features
>>> > from out internal users. This tool can also be very useful for any 
>>> > FDO/coverage
>>> > users. In the first part of tool, we want to have the weight
>>> > merge. Again, to reuse the code, we have to change libgcov.c. But we 
>>> > don't want
>>> > to further diverge libgcov.a in our google branches from the trunk.
>>> >
>>> > Another issue in libgcov.c is function gcov_exit(). It is a huge function
>>> > with about 467 lines of code. This function traverses gcov_list and dumps 
>>> > all
>>> > the gcda files. The code for merging the gcda files also sits in the same
>>> >  function. The structure makes the code-reuse and extension really 
>>> > difficult
>>> > (like in kernel FDO, we have to break this function to reuse some of the 
>>> > code).
>>> >
>>> > We propose to break gcov_exit into smaller functions and split libgcov.c 
>>> > into
>>> > several files. Our plan for profile-tool is listed in (3).
>>> >
>>> > 1. break gcov_exit() into the following structure:
>>> >gcov_exit()
>>> >   --> gcov_exit_compute_summary()
>>> >   --> allocate_filename_struct()
>>> >   for gi_ptr in gcov_list
>>> > --> gcov_exit_dump_gcov()
>>> >--> gcov_exit_open_gcda_file()
>>> >--> gcov_exit_merge_gcda ()
>>> >--> gcov_exit_write_gcda ()
>>
>> Just a short comment that summaries are also produced during the merging - 
>> i.e. they are
>> done on per-file basis. But otherwise I agree.
>> We should be cureful to not increase footprint of libgcov much for embedded 
>> users, but
>> I believe changes like this can be done w/o additional overhead in the 
>> optimized library.
>
> The merge of summary has two parts. One is to find the summary to
> merge to. That is in gcov_exit_merge_gcda().
> The other part of do the actual merging of summary is in 
> gcov_exit_dump_gcov().
>
> I tried to keep the current work flow. So except for a few more calls,
> I don't think there is much overhead.
>
>>> >
>>> > 2. split libgcov.c into the following files:
>>> >  libgcov-profiler.c
>>> >  libgcov-merge.c
>>> >  libgcov-interface.c
>>> >  libgcov-driver.c
>>> >libgcov-driver-system.c (source included into libgcov-driver.c)
>>
>> Seems resonable.  Splitting i/o stuff into separate module (driver) is 
>> something I p

[RFC] libgcov.c re-factoring and offline profile-tool

2013-11-01 Thread Rong Xu
libgcov.c is the main file to generate libgcov.a. This single source generates
21 object files and then archives to a library. These objects are of
very different purposes but they are in the same file guarded by various macros.
The source file becomes quite large and its readability becomes very
low. In its current state:
1) The code is very hard to understand by new developers;
2) It makes regressions more likely when making simple changes;
More importantly,
3) it makes code reuse/sharing very difficult. For instance, Using
libgcov to implement an offline profile tool (see below); kernel FDO
support etc.

We came to this issue when working on an offline tool to handle
profile counters.
Our plan is to have a profile-tool can merge, diff, collect statistics, and
better dump raw profile counters. This is one of the most requested features
from out internal users. This tool can also be very useful for any FDO/coverage
users. In the first part of tool, we want to have the weight
merge. Again, to reuse the code, we have to change libgcov.c. But we don't want
to further diverge libgcov.a in our google branches from the trunk.

Another issue in libgcov.c is function gcov_exit(). It is a huge function
with about 467 lines of code. This function traverses gcov_list and dumps all
the gcda files. The code for merging the gcda files also sits in the same
 function. The structure makes the code-reuse and extension really difficult
(like in kernel FDO, we have to break this function to reuse some of the code).

We propose to break gcov_exit into smaller functions and split libgcov.c into
several files. Our plan for profile-tool is listed in (3).

1. break gcov_exit() into the following structure:
   gcov_exit()
  --> gcov_exit_compute_summary()
  --> allocate_filename_struct()
  for gi_ptr in gcov_list
--> gcov_exit_dump_gcov()
   --> gcov_exit_open_gcda_file()
   --> gcov_exit_merge_gcda ()
   --> gcov_exit_write_gcda ()

2. split libgcov.c into the following files:
 libgcov-profiler.c
 libgcov-merge.c
 libgcov-interface.c
 libgcov-driver.c
   libgcov-driver-system.c (source included into libgcov-driver.c)

The details of the splitting:
libgcov-interface.c
/* This file contains API functions to other programs and the supporting
   functions.  */
  __gcov_dump()
  __gcov_execl()
  __gcov_execle()
  __gcov_execlp()
  __gcov_execv()
  __gcov_execve()
  __gcov_execvp()
  __gcov_flush()
  __gcov_fork()
  __gcov_reset()
  init_mx()
  init_mx_once()

libgcov-profiler.c
/* This file contains runtime profile handlers.  */
  variables:
__gcov_indirect_call_callee
__gcov_indirect_call_counters
  functions:
__gcov_average_profiler()
__gcov_indirect_call_profiler()
__gcov_indirect_call_profiler_v2()
__gcov_interval_profiler()
__gcov_ior_profiler()
__gcov_one_value_profiler()
__gcov_one_value_profiler_body()
__gcov_pow2_profiler()

libgcov-merge.c
/* This file contains the merge functions for various counters.  */
  functions:
__gcov_merge_add()
__gcov_merge_delta()
__gcov_merge_ior()
__gcov_merge_single()

libcov-driver.c
/* This file contains the gcov dumping functions. We separate the
   system dependent part to libgcov-driver-system.c.  */
  variables:
gcov_list
gcov_max_filename
gcov_dump_complete
-- newly added file static variables --
this_prg
all_prg
crc32
gi_filename
fn_buffer
fn_tail
sum_buffer
next_sum_buffer
sum_tail
- end -
  functions:
free_fn_data()
buffer_fn_data()
crc32_unsigned()
gcov_version()
gcov_histogram_insert()
gcov_compute_histogram()
gcov_clear()
__gcov_init()
gcov_exit()
--- newly added static functions --
gcov_exit_compute_summary ()
gcov_exit_merge_gcda()
gcov_exit_write_gcda()
gcov_exit_dump_gcov()
- end -

libgcov-driver-system.c
/* system dependent part of ligcov-driver.c.  */
  functions:
create_file_directory()
--- newly added static functions --
gcov_error() /* This replaces all fprintf(stderr, ...) */
allocate_filename_struct()
gcov_exit_open_gcda_file()

3. add offline profile-tool support.
   We will change the interface of merge functions to make it
   take in-memory gcov_info list, and a weight as the arguments.

   We will add libgcov-tool.c. It has two APIs:
   (1) read_profile_dir(): read a profile directory and reconstruct the
   gcov_info link list in-memory.
   (2) merge_profiles(): merge two in-memory gcov_info link list.

   We also add profile-tool.c in gcc directory. It will source-include
   libgcov-tool.c and build a host binary. (We don't do library style
   because that will need a hard dependence from gcc to libgcc).
   profile-tool.c will mainly handle user options and the write-out of
   gcov-info link list. Some changes in gcov-io.[ch] will be also needed.

Th

[google gcc-4_8] write ggc_memory to gcda file

2013-10-25 Thread Rong Xu
Hi,

This patch writes out ggc_memory to gcda files.

Google branches only. Test is ongoing. OK to check-in after test passes?

-Rong
2013-10-25  Rong Xu  

* contrib/profile_tool (ModuleInfo): write out ggc_memory to gcda file.
* gcc/gcov-io.c (gcov_read_module_info): Ditto.
* libgcc/dyn-ipa.c (gcov_write_module_info): Ditto.

Index: contrib/profile_tool
===
--- contrib/profile_tool(revision 204080)
+++ contrib/profile_tool(working copy)
@@ -674,6 +674,7 @@ class ModuleInfo(DataObject):
 self.module_id = reader.ReadWord()
 self.is_primary = reader.ReadWord()
 self.flags = reader.ReadWord()
+self.ggc_memory = reader.ReadWord()
 self.language = reader.ReadWord()
 self.num_quote_paths = reader.ReadWord()
 self.num_bracket_paths = reader.ReadWord()
@@ -710,6 +711,7 @@ class ModuleInfo(DataObject):
 writer.WriteWord(self.is_primary)
 writer.WriteWord(self.flags)
 writer.WriteWord(self.language)
+writer.WriteWord(self.ggc_memory)
 writer.WriteWord(self.num_quote_paths)
 writer.WriteWord(self.num_bracket_paths)
 writer.WriteWord(self.num_system_paths)
Index: gcc/gcov-io.c
===
--- gcc/gcov-io.c   (revision 204080)
+++ gcc/gcov-io.c   (working copy)
@@ -591,14 +591,15 @@ gcov_read_module_info (struct gcov_module_info *mo
   mod_info->ident = gcov_read_unsigned ();
   mod_info->is_primary = gcov_read_unsigned ();
   mod_info->flags = gcov_read_unsigned ();
-  mod_info->lang  = gcov_read_unsigned ();
+  mod_info->lang = gcov_read_unsigned ();
+  mod_info->ggc_memory = gcov_read_unsigned ();
   mod_info->num_quote_paths = gcov_read_unsigned ();
   mod_info->num_bracket_paths = gcov_read_unsigned ();
   mod_info->num_system_paths = gcov_read_unsigned ();
   mod_info->num_cpp_defines = gcov_read_unsigned ();
   mod_info->num_cpp_includes = gcov_read_unsigned ();
   mod_info->num_cl_args = gcov_read_unsigned ();
-  len -= 10;
+  len -= 11;
 
   filename_len = gcov_read_unsigned ();
   mod_info->da_filename = (char *) xmalloc (filename_len *
Index: libgcc/dyn-ipa.c
===
--- libgcc/dyn-ipa.c(revision 204080)
+++ libgcc/dyn-ipa.c(working copy)
@@ -2095,7 +2095,7 @@ gcov_write_module_info (const struct gcov_info *mo
   len += 1; /* Each string is lead by a length.  */
 }
 
-  len += 10; /* 9 more fields */
+  len += 11; /* 11 more fields */
 
   gcov_write_tag_length (GCOV_TAG_MODULE_INFO, len);
   gcov_write_unsigned (module_info->ident);
@@ -2104,6 +2104,7 @@ gcov_write_module_info (const struct gcov_info *mo
 SET_MODULE_INCLUDE_ALL_AUX (module_info);
   gcov_write_unsigned (module_info->flags);
   gcov_write_unsigned (module_info->lang);
+  gcov_write_unsigned (module_info->ggc_memory);
   gcov_write_unsigned (module_info->num_quote_paths);
   gcov_write_unsigned (module_info->num_bracket_paths);
   gcov_write_unsigned (module_info->num_system_paths);


Re: [GOOGLE] Check if varpool node exist for decl before checking if it's from auxiliary module.

2013-10-22 Thread Rong Xu
seems fine to me for google branches.

-Rong

On Tue, Oct 22, 2013 at 10:51 AM, Dehao Chen  wrote:
> This is fixing a LIPO bug when there -fexception is on.
>
> When compilation is finished, compile_file calls
> dw2_output_indirect_constants, which may generate decls like
> DW.ref.__gxx_personality_v0 (generated in
> dw2_output_indirect_constant_1). This function is a global function,
> but does not have associated varpool node. So original code will
> segfault when checking if the varpool node is from auxiliary module.
>
> Verified that the segfault is fixed with the patch. Regression test on-going.
>
> OK for google-4_8 branch if reg test is green?
>
> Thanks,
> Dehao
>
> Index: gcc/varasm.c
> ===
> --- gcc/varasm.c (revision 203910)
> +++ gcc/varasm.c (working copy)
> @@ -1484,7 +1484,7 @@ notice_global_symbol (tree decl)
>if (L_IPO_COMP_MODE
>&& ((TREE_CODE (decl) == FUNCTION_DECL
> && cgraph_is_auxiliary (decl))
> -  || (TREE_CODE (decl) == VAR_DECL
> +  || (TREE_CODE (decl) == VAR_DECL && varpool_get_node (decl)
>&& varpool_is_auxiliary (varpool_get_node (decl)
>  return;


[google gcc-4_8] LIPO: insert static vars to varpool

2013-10-14 Thread Rong Xu
Hi,

For google gcc-4_8 branch only.

This is fixes the NULL pointer dereference in copy_tree_r due to empty
varpool_node returned.

Passed the ICE compilation. Other tests are ongoing.

Thanks,

-Rong


2013-10-14  Rong Xu  

* cp/semantics.c (finish_compound_literal): Put static var to
varpool. This is for LIPO only.
* cp/call.c (make_temporary_var_for_ref_to_temp): Ditto.

Index: cp/semantics.c
===
--- cp/semantics.c  (revision 203471)
+++ cp/semantics.c  (working copy)
@@ -2486,6 +2486,10 @@ finish_compound_literal (tree type, tree compound_
   decl = pushdecl_top_level (decl);
   DECL_NAME (decl) = make_anon_name ();
   SET_DECL_ASSEMBLER_NAME (decl, DECL_NAME (decl));
+  /* Capture the current module info for statics.  */
+  if (L_IPO_COMP_MODE)
+varpool_node_for_decl (decl);
+
   return decl;
 }
   else
Index: cp/call.c
===
--- cp/call.c   (revision 203471)
+++ cp/call.c   (working copy)
@@ -9047,6 +9047,9 @@ make_temporary_var_for_ref_to_temp (tree decl, tre
   tree name;

   TREE_STATIC (var) = TREE_STATIC (decl);
+  /* Capture the current module info for statics.  */
+  if (L_IPO_COMP_MODE && TREE_STATIC (var))
+varpool_node_for_decl (var);
   DECL_TLS_MODEL (var) = DECL_TLS_MODEL (decl);
   name = mangle_ref_init_variable (decl);
   DECL_NAME (var) = name;


Re: [PATCH] increase builtin_expert probability in loop exit test

2013-10-11 Thread Rong Xu
Attached is the patchset 2. It takes the max to two hitrates then does
the incremental.


On Fri, Oct 11, 2013 at 2:01 PM, Rong Xu  wrote:
> Hi,
>
> An earlier patch (r203167) changed the default probability for
> builtin_expect to 90%.
> It does not work properly for the following case:
>while (__builin_expect (expr, 1)) { }
>
> W/o builtin_expect, the exit probability is 9% while w/
> builtin_expect, the exit probability is 10%.
> It seems to be wrong as we should estimate this loop has more
> iterations than w/o the annotation.
>
> This attached patch bump the probability for this particular case.
>
> Tested with bootstrap.
>
> Thanks,
>
> -Rong
2013-10-11  Rong Xu  

* predict.c (tree_predict_by_opcode): Bump the 
estimiated probability if builtin_expect expression is
in loop exit test.

Index: predict.c
===
--- predict.c   (revision 203463)
+++ predict.c   (working copy)
@@ -2001,11 +2001,39 @@ tree_predict_by_opcode (basic_block bb)
   if (val)
 {
   int percent = PARAM_VALUE (BUILTIN_EXPECT_PROBABILITY);
+  void **preds;
+  int hitrate;
 
   gcc_assert (percent >= 0 && percent <= 100);
+  /* This handles the cases like
+   while (__builtin_expect (exp, 1)) { ... }
+ W/o builtin_expect, the default HITRATE is 91%.
+ It does not make sense to estimate a lower probability of 90%
+ (current default for builtin_expect) with the annotation.
+ So here, we bump the probability by a small amount.  */
+  preds = pointer_map_contains (bb_predictions, bb);
+  hitrate = HITRATE (percent);
+  if (preds)
+{
+  struct edge_prediction *pred;
+  int exit_hitrate = predictor_info [(int) PRED_LOOP_EXIT].hitrate;
+
+  for (pred = (struct edge_prediction *) *preds; pred;
+   pred = pred->ep_next)
+{
+  if (pred->ep_predictor == PRED_LOOP_EXIT
+  && exit_hitrate > hitrate)
+{
+  hitrate = exit_hitrate + HITRATE (4);
+  if (hitrate > REG_BR_PROB_BASE)
+hitrate = REG_BR_PROB_BASE;
+  break;
+}
+}
+}
   if (integer_zerop (val))
-percent = 100 - percent;
-  predict_edge (then_edge, PRED_BUILTIN_EXPECT, HITRATE (percent));
+hitrate = REG_BR_PROB_BASE - hitrate;
+  predict_edge (then_edge, PRED_BUILTIN_EXPECT, hitrate);
 }
   /* Try "pointer heuristic."
  A comparison ptr == 0 is predicted as false.


Re: [google gcc-4_8] increase builtin_expect probability in loop exit test

2013-10-11 Thread Rong Xu
here is the new patch. Note that the hitrate won't change by this
patch for the case of
  while (__builtin_expect (exp, 0))

Index: predict.c
===
--- predict.c   (revision 203462)
+++ predict.c   (working copy)
@@ -1951,11 +1951,42 @@ tree_predict_by_opcode (basic_block bb)
   if (val)
 {
   int percent = PARAM_VALUE (BUILTIN_EXPECT_PROBABILITY);
+  int hitrate;

   gcc_assert (percent >= 0 && percent <= 100);
   if (integer_zerop (val))
-percent = 100 - percent;
-  predict_edge (then_edge, PRED_BUILTIN_EXPECT, HITRATE (percent));
+hitrate = HITRATE (100 - percent);
+  else
+{
+  /* This handles the cases like
+   while (__builtin_expect (exp, 1)) { ... }
+ W/o builtin_expect, the default HITRATE is 91%.
+ It does not make sense to estimate a lower probability of 90%
+ (current default for builtin_expect) with the annotation.
+ So here, we bump the probability by a small amount.  */
+  void **preds = pointer_map_contains (bb_predictions, bb);
+
+  hitrate = HITRATE (percent);
+  if (preds)
+{
+  struct edge_prediction *pred;
+  int exit_hitrate = predictor_info [(int) PRED_LOOP_EXIT].hitrate;
+
+  for (pred = (struct edge_prediction *) *preds; pred;
+   pred = pred->ep_next)
+{
+  if (pred->ep_predictor == PRED_LOOP_EXIT
+  && exit_hitrate > hitrate)
+{
+  hitrate = exit_hitrate + HITRATE (4);
+  if (hitrate > REG_BR_PROB_BASE)
+hitrate = REG_BR_PROB_BASE;
+  break;
+}
+}
+}
+}
+  predict_edge (then_edge, PRED_BUILTIN_EXPECT, hitrate);
 }
   /* Try "pointer heuristic."
  A comparison ptr == 0 is predicted as false.

On Fri, Oct 11, 2013 at 2:45 PM, Rong Xu  wrote:
> ok. that makes sense.
>
>
> On Fri, Oct 11, 2013 at 2:41 PM, Xinliang David Li  wrote:
>> Should it be max + some_delta then?
>>
>> David
>>
>> On Fri, Oct 11, 2013 at 2:32 PM, Rong Xu  wrote:
>>> I want to differentiate the cases w/o and w/ builtin.
>>> If I take the max, they will be the same (91%).
>>>
>>> -Rong
>>>
>>> On Fri, Oct 11, 2013 at 2:26 PM, Xinliang David Li  
>>> wrote:
>>>> Why this 'percent += 4'  instead of taking the max?
>>>>
>>>> David
>>>>
>>>> On Fri, Oct 11, 2013 at 2:10 PM, Rong Xu  wrote:
>>>>> The trunk version of this patch is submitted for review.
>>>>>
>>>>> David: can we have this patch for google/gcc-4_8 branch first?
>>>>> It tested with regression and google internal benchmarks.
>>>>>
>>>>> Thanks,
>>>>>
>>>>> -Rong


Re: [google gcc-4_8] increase builtin_expect probability in loop exit test

2013-10-11 Thread Rong Xu
ok. that makes sense.


On Fri, Oct 11, 2013 at 2:41 PM, Xinliang David Li  wrote:
> Should it be max + some_delta then?
>
> David
>
> On Fri, Oct 11, 2013 at 2:32 PM, Rong Xu  wrote:
>> I want to differentiate the cases w/o and w/ builtin.
>> If I take the max, they will be the same (91%).
>>
>> -Rong
>>
>> On Fri, Oct 11, 2013 at 2:26 PM, Xinliang David Li  
>> wrote:
>>> Why this 'percent += 4'  instead of taking the max?
>>>
>>> David
>>>
>>> On Fri, Oct 11, 2013 at 2:10 PM, Rong Xu  wrote:
>>>> The trunk version of this patch is submitted for review.
>>>>
>>>> David: can we have this patch for google/gcc-4_8 branch first?
>>>> It tested with regression and google internal benchmarks.
>>>>
>>>> Thanks,
>>>>
>>>> -Rong


Re: [google gcc-4_8] increase builtin_expect probability in loop exit test

2013-10-11 Thread Rong Xu
I want to differentiate the cases w/o and w/ builtin.
If I take the max, they will be the same (91%).

-Rong

On Fri, Oct 11, 2013 at 2:26 PM, Xinliang David Li  wrote:
> Why this 'percent += 4'  instead of taking the max?
>
> David
>
> On Fri, Oct 11, 2013 at 2:10 PM, Rong Xu  wrote:
>> The trunk version of this patch is submitted for review.
>>
>> David: can we have this patch for google/gcc-4_8 branch first?
>> It tested with regression and google internal benchmarks.
>>
>> Thanks,
>>
>> -Rong


[google gcc-4_8] increase builtin_expect probability in loop exit test

2013-10-11 Thread Rong Xu
The trunk version of this patch is submitted for review.

David: can we have this patch for google/gcc-4_8 branch first?
It tested with regression and google internal benchmarks.

Thanks,

-Rong
2013-10-11  Rong Xu  

* predict.c (tree_predict_by_opcode): Bump the 
estimiated probability if builtin_expect expression is
in loop exit test.

Index: predict.c
===
--- predict.c   (revision 203462)
+++ predict.c   (working copy)
@@ -1951,7 +1951,31 @@ tree_predict_by_opcode (basic_block bb)
   if (val)
 {
   int percent = PARAM_VALUE (BUILTIN_EXPECT_PROBABILITY);
+  void **preds;
 
+  /* This handles the cases like
+   while (__builtin_expect (exp, 1)) { ... }
+ W/o builtin_expect, the default HITRATE is 91%.
+It does not make sense to estimate a lower probability of 90%
+(current default for builtin_expect) with the annotation.
+So here, we bump the probability by a small amount.  */
+  preds = pointer_map_contains (bb_predictions, bb);
+  if (preds)
+{
+  struct edge_prediction *pred;
+
+  for (pred = (struct edge_prediction *) *preds; pred;
+  pred = pred->ep_next)
+   {
+ if (pred->ep_predictor == PRED_LOOP_EXIT
+ && predictor_info [(int) PRED_LOOP_EXIT].hitrate
+> HITRATE (percent))
+   percent += 4;
+ if (percent > 100)
+   percent = 100;
+   }
+   }
+
   gcc_assert (percent >= 0 && percent <= 100);
   if (integer_zerop (val))
 percent = 100 - percent;


[PATCH] increase builtin_expert probability in loop exit test

2013-10-11 Thread Rong Xu
Hi,

An earlier patch (r203167) changed the default probability for
builtin_expect to 90%.
It does not work properly for the following case:
   while (__builin_expect (expr, 1)) { }

W/o builtin_expect, the exit probability is 9% while w/
builtin_expect, the exit probability is 10%.
It seems to be wrong as we should estimate this loop has more
iterations than w/o the annotation.

This attached patch bump the probability for this particular case.

Tested with bootstrap.

Thanks,

-Rong


patch_diff
Description: Binary data


Re: [PATCH] alternative hirate for builtin_expert

2013-10-04 Thread Rong Xu
My change on the probability of builtin_expect does have an impact on
the partial inline (more outlined functions will get inline back to
the original function).
I think this triggers an existing issue.
Dehao will explain this in his coming email.

-Rong

On Fri, Oct 4, 2013 at 6:05 AM, Ramana Radhakrishnan  wrote:
> On 10/02/13 23:49, Rong Xu wrote:
>>
>> Here is the new patch. Honaz: Could you take a look?
>>
>> Thanks,
>>
>> -Rong
>>
>> On Wed, Oct 2, 2013 at 2:31 PM, Jan Hubicka  wrote:
>>>>
>>>> Thanks for the suggestion. This is much cleaner than to use binary
>>>> parameter.
>>>>
>>>> Just want to make sure I understand it correctly about the orginal
>>>> hitrate:
>>>> you want to retire the hitrate in PRED_BUILTIN_EXPECT and always use
>>>> the one specified in the biniltin-expect-probability parameter.
>>>
>>>
>>> Yes.
>>>>
>>>>
>>>> Should I use 90% as the default? It's hard to fit current value 0.9996
>>>> in percent form.
>>>
>>>
>>> Yes, 90% seems fine.  The original value was set quite arbitrarily and no
>>> real
>>> performance study was made as far as I know except yours. I think users
>>> that
>>> are sure they use expect to gueard completely cold edges may just use
>>> 100%
>>> instead of 0.9996, so I would not worry much about the precision.
>>>
>>> Honza
>>>>
>>>>
>>>> -Rong
>>>>>
>>>>>
>>>>> OK with that change.
>>>>>
>>>>> Honza
>
>
>
> This broke arm-linux-gnueabihf building libstdc++-v3. I haven't dug further
> yet but still reducing the testcase.
>
> See
>
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58619
>
> for details.
>
> Can you please deal with this appropriately ?
>
> regards
> Ramana
>
>


Re: [PATCH] alternative hirate for builtin_expert

2013-10-02 Thread Rong Xu
Here is the new patch. Honaz: Could you take a look?

Thanks,

-Rong

On Wed, Oct 2, 2013 at 2:31 PM, Jan Hubicka  wrote:
>> Thanks for the suggestion. This is much cleaner than to use binary parameter.
>>
>> Just want to make sure I understand it correctly about the orginal hitrate:
>> you want to retire the hitrate in PRED_BUILTIN_EXPECT and always use
>> the one specified in the biniltin-expect-probability parameter.
>
> Yes.
>>
>> Should I use 90% as the default? It's hard to fit current value 0.9996
>> in percent form.
>
> Yes, 90% seems fine.  The original value was set quite arbitrarily and no real
> performance study was made as far as I know except yours. I think users that
> are sure they use expect to gueard completely cold edges may just use 100%
> instead of 0.9996, so I would not worry much about the precision.
>
> Honza
>>
>> -Rong
>> >
>> > OK with that change.
>> >
>> > Honza


p2_patch_v2
Description: Binary data


Re: [PATCH] fix size_estimation for builtin_expect

2013-10-02 Thread Rong Xu
attached the new patch. OK for check in?

On Wed, Oct 2, 2013 at 9:12 AM, Jan Hubicka  wrote:
>> > Hi,
>> >
>> > builtin_expect should be a NOP in size_estimation. Indeed, the call
>> > stmt itself is 0 weight in size and time. But it may introduce
>> > an extra relation expr which has non-zero size/time. The end result
>> > is: for w/ and w/o builtin_expect, we have different size/time estimation
>> > for inlining.
>> >
>> > This patch fixes this problem.
>> >
>> > An earlier discussion of this patch is
>> >   https://mail.google.com/mail/u/0/?pli=1#label/gcc-paches/1415c590ad8c5315
>> >
>> > This new patch address Honza's comments.
>> > It passes the bootstrap and regression.
>> >
>> > Richard: I looked at your tree-ssa.c:walk_use_def_chains() code. I think
>> > that's an overkill for this simple problem. Your code is mostly dealing
>> > with the recursively walk the PHI node to find the real def stmts.
>> > Here the traversal is within one BB and I may need to continue on multiple
>> > real assignment. Calling walk_use_def_chains probably only uses
>> > the SSA_NAME_DEF_STMT() part of the code.
>> >
>> > Thanks,
>> >
>> > -Rong
>
> This patch is OK.  Add white space after
> +  bool match = false;
> +  bool done = false;
>
> and fix
> +  if (match && single_imm_use (var, &use_p, &use_stmt) &&
> +  (gimple_code (use_stmt) == GIMPLE_COND))
>
> && should be at beggining of new line..
>
> Thanks,
> Honza


p1_patch_v2
Description: Binary data


Re: [PATCH] alternative hirate for builtin_expert

2013-10-02 Thread Rong Xu
On Wed, Oct 2, 2013 at 9:08 AM, Jan Hubicka  wrote:
>> > Hi,
>> >
>> >
>> >
>> >   Current default probability for builtin_expect is 0.9996.
>> > This makes the freq of unlikely bb very low (4), which
>> > suppresses the inlining of any calls within those bb.
>> >
>> > We used FDO data to measure the branch probably for
>> > the branch annotated with builtin_expert.
>> > For google internal benchmarks, the weight average
>> > (the profile count value as the weight) is 0.9081.
>> >
>> > Linux kernel is another program that is heavily annotated
>> > with builtin-expert. We measured its weight average as 0.8717,
>> > using google search as the workload.
>> >
>> > This patch sets the alternate hirate probability for builtin_expert
>> > to 90%. With the alternate hirate, we measured performance
>> > improvement for google benchmarks and Linux kernel.
>> >
>> > An earlier discussion is
>> > https://mail.google.com/mail/u/0/?pli=1#label/gcc-paches/1415c5910054630b
>> >
>> > This new patch is for the trunk and addresses Honza's comments.
>> >
>> > Honza: this new probability is off by default. When we backport to google
>> > branch we will make it the default. Let me know if you want to do the same
>> > here.
>
> I do not like much the binary parameter for 
> builtin-expect-probability-relaxed.
>
> I would just add bulitin-expect-probability taking value in percents and then
> make predict.c to use it. Just use predict_edge instead of predict_edge_def
> and document hitrate value as unused in predict.def.

Thanks for the suggestion. This is much cleaner than to use binary parameter.

Just want to make sure I understand it correctly about the orginal hitrate:
you want to retire the hitrate in PRED_BUILTIN_EXPECT and always use
the one specified in the biniltin-expect-probability parameter.

Should I use 90% as the default? It's hard to fit current value 0.9996
in percent form.

-Rong
>
> OK with that change.
>
> Honza


Re: [PATCH] alternative hirate for builtin_expert

2013-10-01 Thread Rong Xu
ping.

On Fri, Sep 27, 2013 at 3:53 PM, Rong Xu  wrote:
> Hi,
>
>
>
>   Current default probability for builtin_expect is 0.9996.
> This makes the freq of unlikely bb very low (4), which
> suppresses the inlining of any calls within those bb.
>
> We used FDO data to measure the branch probably for
> the branch annotated with builtin_expert.
> For google internal benchmarks, the weight average
> (the profile count value as the weight) is 0.9081.
>
> Linux kernel is another program that is heavily annotated
> with builtin-expert. We measured its weight average as 0.8717,
> using google search as the workload.
>
> This patch sets the alternate hirate probability for builtin_expert
> to 90%. With the alternate hirate, we measured performance
> improvement for google benchmarks and Linux kernel.
>
> An earlier discussion is
> https://mail.google.com/mail/u/0/?pli=1#label/gcc-paches/1415c5910054630b
>
> This new patch is for the trunk and addresses Honza's comments.
>
> Honza: this new probability is off by default. When we backport to google
> branch we will make it the default. Let me know if you want to do the same
> here.
>
> Thanks,
>
> -Rong


Re: [PATCH] fix size_estimation for builtin_expect

2013-10-01 Thread Rong Xu
ping.

On Fri, Sep 27, 2013 at 3:56 PM, Rong Xu  wrote:
> Hi,
>
> builtin_expect should be a NOP in size_estimation. Indeed, the call
> stmt itself is 0 weight in size and time. But it may introduce
> an extra relation expr which has non-zero size/time. The end result
> is: for w/ and w/o builtin_expect, we have different size/time estimation
> for inlining.
>
> This patch fixes this problem.
>
> An earlier discussion of this patch is
>   https://mail.google.com/mail/u/0/?pli=1#label/gcc-paches/1415c590ad8c5315
>
> This new patch address Honza's comments.
> It passes the bootstrap and regression.
>
> Richard: I looked at your tree-ssa.c:walk_use_def_chains() code. I think
> that's an overkill for this simple problem. Your code is mostly dealing
> with the recursively walk the PHI node to find the real def stmts.
> Here the traversal is within one BB and I may need to continue on multiple
> real assignment. Calling walk_use_def_chains probably only uses
> the SSA_NAME_DEF_STMT() part of the code.
>
> Thanks,
>
> -Rong


[PATCH] fix size_estimation for builtin_expect

2013-09-27 Thread Rong Xu
Hi,

builtin_expect should be a NOP in size_estimation. Indeed, the call
stmt itself is 0 weight in size and time. But it may introduce
an extra relation expr which has non-zero size/time. The end result
is: for w/ and w/o builtin_expect, we have different size/time estimation
for inlining.

This patch fixes this problem.

An earlier discussion of this patch is
  https://mail.google.com/mail/u/0/?pli=1#label/gcc-paches/1415c590ad8c5315

This new patch address Honza's comments.
It passes the bootstrap and regression.

Richard: I looked at your tree-ssa.c:walk_use_def_chains() code. I think
that's an overkill for this simple problem. Your code is mostly dealing
with the recursively walk the PHI node to find the real def stmts.
Here the traversal is within one BB and I may need to continue on multiple
real assignment. Calling walk_use_def_chains probably only uses
the SSA_NAME_DEF_STMT() part of the code.

Thanks,

-Rong


p1_patch
Description: Binary data


[PATCH] alternative hirate for builtin_expert

2013-09-27 Thread Rong Xu
Hi,



  Current default probability for builtin_expect is 0.9996.
This makes the freq of unlikely bb very low (4), which
suppresses the inlining of any calls within those bb.

We used FDO data to measure the branch probably for
the branch annotated with builtin_expert.
For google internal benchmarks, the weight average
(the profile count value as the weight) is 0.9081.

Linux kernel is another program that is heavily annotated
with builtin-expert. We measured its weight average as 0.8717,
using google search as the workload.

This patch sets the alternate hirate probability for builtin_expert
to 90%. With the alternate hirate, we measured performance
improvement for google benchmarks and Linux kernel.

An earlier discussion is
https://mail.google.com/mail/u/0/?pli=1#label/gcc-paches/1415c5910054630b

This new patch is for the trunk and addresses Honza's comments.

Honza: this new probability is off by default. When we backport to google
branch we will make it the default. Let me know if you want to do the same
here.

Thanks,

-Rong


p2_patch
Description: Binary data


Re: [google gcc-4_8] alternate hirate for builtin_expert

2013-09-27 Thread Rong Xu
On Thu, Sep 26, 2013 at 3:27 PM, Jan Hubicka  wrote:
>> Hi,
>>
>> Current default probably for builtin_expect is 0.9996.
>> This makes the freq of unlikely bb very low (4), which
>> suppresses the inlining of any calls within those bb.
>>
>> We used FDO data to measure the branch probably for
>> the branch annotated with builtin_expert.
>>  For google
>> internal benchmarks, the weight average
>> (the profile count value as the weight) is 0.9081.
>>
>> Linux kernel is another program that is heavily annotated
>> with builtin-expert. We measured its weight average as 0.8717,
>>   using google search as
>> the workload.
>
> This is interesting.  I was always unsure if programmers use builtin_expect
> more often to mark an impossible paths (as those leading to crash) or just
> unlikely paths.  Your data seems to suggest the second.
>

I don't find an official guideline on how this should be used. Maybe
some documentation
in gcc user-guide helps.

> We probably also ought to get analyze_brprob working again. It was always
> useful to get such a data.
>>
>>
>> This patch sets the alternate hirate probability for
>> builtin_expert
>> to 90%. With the alternate hirate, we measured performance
>>   improvement for google
>> benchmarks and Linux kernel.
>>
>>
>>   -Rong
>> 2013-09-26  Rong Xu  
>>
>>   * params.def (DEFPARAM): New.
>>   * params.def: New.
>>   * predict.c (tree_predict_by_opcode): Alternate
>> probablity hirate for builtin_expect.
>
> This also seems resonable for mainline.  Please add a comment
> to the parameter explaining how the value was determined.
> Please also add invoke.texi documentation.
>

Will do.

> For patches that seems resonable for mainline in FDO/IPA area,
> i would apprechiate if you just added me to CC, so I do not lose
> track of them.

Will do.

> Honza


Re: [google/4_8] Disable -g/-gmlt during LIPO instrumentation

2013-09-27 Thread Rong Xu
I don't quite understand here. We use the profile-generate memory
consumption to estimate the profile use memory consumption.
we still have -g/-gmlt in profile-use compilation. Will this change
effectively under estimate the memory use in the use phrase?

-Rong

On Fri, Sep 27, 2013 at 11:50 AM, Teresa Johnson  wrote:
> David and Rong,
>
> The following patch will disable -g/-gmlt when instrumenting for LIPO
> since they will affect the recorded ggc_memory used in the module
> grouping decision. Added -fripa-allow-debug to override this behavior.
>
> Passes regression tests and simple tests on the new functionality.
>
> Ok for google/4_8?
>
> Teresa
>
> 2013-09-27  Teresa Johnson  
>
> * opts.c (finish_options): Suppress -g/-gmlt when we are
> building for LIPO instrumention.
> * common.opt (fripa-allow-debug): New option.
>
> Index: opts.c
> ===
> --- opts.c  (revision 202976)
> +++ opts.c  (working copy)
> @@ -799,7 +799,7 @@ finish_options (struct gcc_options *opts, struct g
>  #endif
>if (!opts->x_flag_fat_lto_objects && !HAVE_LTO_PLUGIN)
>  error_at (loc, "-fno-fat-lto-objects are supported only with
> linker plugin.");
> -}
> +}
>if ((opts->x_flag_lto_partition_balanced != 0) +
> (opts->x_flag_lto_partition_1to1 != 0)
> + (opts->x_flag_lto_partition_none != 0) >= 1)
>  {
> @@ -852,6 +852,26 @@ finish_options (struct gcc_options *opts, struct g
>/* Turn on -ffunction-sections when -freorder-functions=* is used.  */
>if (opts->x_flag_reorder_functions > 1)
>  opts->x_flag_function_sections = 1;
> +
> +  /* LIPO module grouping depends on the memory consumed by the profile-gen
> + parsing phase, recorded in a per-module ggc_memory field of the module
> + info struct. This will be higher when debug generation is on via
> + -g/-gmlt, which causes the FE to generate debug structures that will
> + increase the ggc_total_memory. This could in theory cause the module
> + groups to be slightly more conservative. Disable -g/-gmlt under
> + -fripa -fprofile-generate, but provide an option to override this
> + in case we actually need to debug an instrumented binary.  */
> +  if (opts->x_profile_arc_flag
> +  && opts->x_flag_dyn_ipa
> +  && opts->x_debug_info_level != DINFO_LEVEL_NONE
> +  && !opts->x_flag_dyn_ipa_allow_debug)
> +{
> +  inform (loc,
> + "Debug generation via -g option disabled under -fripa "
> +  "-fprofile-generate (use -fripa-allow-debug to override)");
> +  set_debug_level (NO_DEBUG, DEFAULT_GDB_EXTENSIONS, "0", opts, opts_set,
> +   loc);
> +}
>  }
>
>  #define LEFT_COLUMN27
> Index: common.opt
> ===
> --- common.opt  (revision 202976)
> +++ common.opt  (working copy)
> @@ -1155,6 +1155,10 @@ fripa
>  Common Report Var(flag_dyn_ipa)
>  Perform Dynamic Inter-Procedural Analysis.
>
> +fripa-allow-debug
> +Common Report Var(flag_dyn_ipa_allow_debug) Init(0)
> +Allow -g enablement for -fripa -fprofile-generate compiles.
> +
>  fripa-disallow-asm-modules
>  Common Report Var(flag_ripa_disallow_asm_modules)
>  Don't import an auxiliary module if it contains asm statements
>
> --
> Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413


Re: [google gcc-4_8] fix size_estimation for builtin_expect

2013-09-27 Thread Rong Xu
On Thu, Sep 26, 2013 at 3:23 PM, Jan Hubicka  wrote:
>> Hi,
>>
>> builtin_expect should be a NOP in size_estimation. Indeed, the call
>> stmt itself is 0 weight in size and time. But it may introduce
>> an extra relation expr which has non-zero size/time. The end result
>> is: for w/ and w/o builtin_expect, we have different size/time estimation
>> for early inlining.
>>
>> This patch fixes this problem.
>>
>> -Rong
>
>> 2013-09-26  Rong Xu  
>>
>>   * ipa-inline-analysis.c (estimate_function_body_sizes): fix
>> the size estimation for builtin_expect.
>
> This seems fine with an comment in the code what it is about.
> I also think we want to support mutiple builtin_expects in a BB so perhaps
> we want to have pointer set of statements to ignore?
>

Thanks for feedback. I'll add the comment and split the code into a
new function.

You have a good pointer about multiple builtin_expect. I think I need
to remove the early exit (the break stmt).
But I would argue that using pointer_set is probably not necessary.

Here is the reasoning: The typical usage of builtin_expect is like:
  if (__builtin_expect (a<=b, 1))  { ... }
The IR is:
  t1 = a <= b;
  t2 = (long int) t1;
  t3 = __builtin_expect (t2, 1);
  if (t3 != 0) goto ...
Without the builtin, the IR is
  if (a<=b) goto...

The code in the patch check the use of the builtin_expect return
value, to see if it's
used in a COND stmt. So even there are multiple builtins, only one can match.

-Rong

> To avoid spagetti code, please just move the new logic into separate 
> functions.
>
> Honza
>>
>> Index: ipa-inline-analysis.c
>> ===
>> --- ipa-inline-analysis.c (revision 202638)
>> +++ ipa-inline-analysis.c (working copy)
>> @@ -2266,6 +2266,8 @@ estimate_function_body_sizes (struct cgraph_node *
>>/* Estimate static overhead for function prologue/epilogue and alignment. 
>> */
>>int overhead = PARAM_VALUE (PARAM_INLINE_FUNCTION_OVERHEAD_SIZE);
>>int size = overhead;
>> +  gimple fix_expect_builtin;
>> +
>>/* Benefits are scaled by probability of elimination that is in range
>>   <0,2>.  */
>>basic_block bb;
>> @@ -2359,14 +2361,73 @@ estimate_function_body_sizes (struct cgraph_node *
>>   }
>>   }
>>
>> +  fix_expect_builtin = NULL;
>>for (bsi = gsi_start_bb (bb); !gsi_end_p (bsi); gsi_next (&bsi))
>>   {
>> gimple stmt = gsi_stmt (bsi);
>> +   if (gimple_call_builtin_p (stmt, BUILT_IN_EXPECT))
>> +{
>> +  tree var = gimple_call_lhs (stmt);
>> +  tree arg = gimple_call_arg (stmt, 0);
>> +  use_operand_p use_p;
>> +  gimple use_stmt;
>> +  bool match = false;
>> +  bool done = false;
>> +  gcc_assert (var && arg);
>> +  gcc_assert (TREE_CODE (var) == SSA_NAME);
>> +
>> +  while (TREE_CODE (arg) == SSA_NAME)
>> +{
>> +  gimple stmt_tmp = SSA_NAME_DEF_STMT (arg);
>> +  switch (gimple_assign_rhs_code (stmt_tmp))
>> +{
>> +  case LT_EXPR:
>> +  case LE_EXPR:
>> +  case GT_EXPR:
>> +  case GE_EXPR:
>> +  case EQ_EXPR:
>> +  case NE_EXPR:
>> +match = true;
>> +done = true;
>> +break;
>> +  case NOP_EXPR:
>> +break;
>> +  default:
>> +done = true;
>> +break;
>> +}
>> +  if (done)
>> +break;
>> +  arg = gimple_assign_rhs1 (stmt_tmp);
>> +}
>> +
>> +  if (match && single_imm_use (var, &use_p, &use_stmt))
>> +{
>> +  if (gimple_code (use_stmt) == GIMPLE_COND)
>> +{
>> +  fix_expect_builtin = use_stmt;
>> +}
>> +}
>> +
>> +  /* we should see one builtin_expert call in one bb.  */
>> +  break;
>> +}
>> +}
>> +
>> +  for (bsi = gsi_start_bb (bb); !gsi_end_p (bsi); gsi_next (&bsi))
>> + {
>> +   gimple stmt = gsi_stmt (bsi);
>> int this_size = estimate_num_insns (stmt, &eni_size_weights);
>> int this_time = estimate_num_insns (stmt, &eni_time_weights);
>> int prob;
>> struct predicate will_be_nonconstant;
>>
>> +   if (stmt == fix_expect_builtin)
>> +{
>> +  this_size--;
>> +  this_time--;
>> +}
>> +
>> if (dump_file && (dump_flags & TDF_DETAILS))
>>   {
>> fprintf (dump_file, "  ");
>


Re: [google gcc-4_8] fix size_estimation for builtin_expect

2013-09-27 Thread Rong Xu
On Fri, Sep 27, 2013 at 1:20 AM, Richard Biener
 wrote:
> On Fri, Sep 27, 2013 at 12:23 AM, Jan Hubicka  wrote:
>>> Hi,
>>>
>>> builtin_expect should be a NOP in size_estimation. Indeed, the call
>>> stmt itself is 0 weight in size and time. But it may introduce
>>> an extra relation expr which has non-zero size/time. The end result
>>> is: for w/ and w/o builtin_expect, we have different size/time estimation
>>> for early inlining.
>>>
>>> This patch fixes this problem.
>>>
>>> -Rong
>>
>>> 2013-09-26  Rong Xu  
>>>
>>>   * ipa-inline-analysis.c (estimate_function_body_sizes): fix
>>> the size estimation for builtin_expect.
>>
>> This seems fine with an comment in the code what it is about.
>> I also think we want to support mutiple builtin_expects in a BB so perhaps
>> we want to have pointer set of statements to ignore?
>>
>> To avoid spagetti code, please just move the new logic into separate 
>> functions.
>
> Looks like this could use tree-ssa.c:walk_use_def_chains (please
> change its implementation as necessary, make it C++, etc. - you will
> be the first user again).
>

Thanks for the suggestion. But it seems walk_use_def_chains() was
removed by r201951.

-Rong

> Richard.
>
>> Honza
>>>
>>> Index: ipa-inline-analysis.c
>>> ===
>>> --- ipa-inline-analysis.c (revision 202638)
>>> +++ ipa-inline-analysis.c (working copy)
>>> @@ -2266,6 +2266,8 @@ estimate_function_body_sizes (struct cgraph_node *
>>>/* Estimate static overhead for function prologue/epilogue and 
>>> alignment. */
>>>int overhead = PARAM_VALUE (PARAM_INLINE_FUNCTION_OVERHEAD_SIZE);
>>>int size = overhead;
>>> +  gimple fix_expect_builtin;
>>> +
>>>/* Benefits are scaled by probability of elimination that is in range
>>>   <0,2>.  */
>>>basic_block bb;
>>> @@ -2359,14 +2361,73 @@ estimate_function_body_sizes (struct cgraph_node *
>>>   }
>>>   }
>>>
>>> +  fix_expect_builtin = NULL;
>>>for (bsi = gsi_start_bb (bb); !gsi_end_p (bsi); gsi_next (&bsi))
>>>   {
>>> gimple stmt = gsi_stmt (bsi);
>>> +   if (gimple_call_builtin_p (stmt, BUILT_IN_EXPECT))
>>> +{
>>> +  tree var = gimple_call_lhs (stmt);
>>> +  tree arg = gimple_call_arg (stmt, 0);
>>> +  use_operand_p use_p;
>>> +  gimple use_stmt;
>>> +  bool match = false;
>>> +  bool done = false;
>>> +  gcc_assert (var && arg);
>>> +  gcc_assert (TREE_CODE (var) == SSA_NAME);
>>> +
>>> +  while (TREE_CODE (arg) == SSA_NAME)
>>> +{
>>> +  gimple stmt_tmp = SSA_NAME_DEF_STMT (arg);
>>> +  switch (gimple_assign_rhs_code (stmt_tmp))
>>> +{
>>> +  case LT_EXPR:
>>> +  case LE_EXPR:
>>> +  case GT_EXPR:
>>> +  case GE_EXPR:
>>> +  case EQ_EXPR:
>>> +  case NE_EXPR:
>>> +match = true;
>>> +done = true;
>>> +break;
>>> +  case NOP_EXPR:
>>> +break;
>>> +  default:
>>> +done = true;
>>> +break;
>>> +}
>>> +  if (done)
>>> +break;
>>> +  arg = gimple_assign_rhs1 (stmt_tmp);
>>> +}
>>> +
>>> +  if (match && single_imm_use (var, &use_p, &use_stmt))
>>> +{
>>> +  if (gimple_code (use_stmt) == GIMPLE_COND)
>>> +{
>>> +  fix_expect_builtin = use_stmt;
>>> +}
>>> +}
>>> +
>>> +  /* we should see one builtin_expert call in one bb.  */
>>> +  break;
>>> +}
>>> +}
>>> +
>>> +  for (bsi = gsi_start_bb (bb); !gsi_end_p (bsi); gsi_next (&bsi))
>>> + {
>>> +   gimple stmt = gsi_stmt (bsi);
>>> int this_size = estimate_num_insns (stmt, &eni_size_weights);
>>> int this_time = estimate_num_insns (stmt, &eni_time_weights);
>>> int prob;
>>> struct predicate will_be_nonconstant;
>>>
>>> +   if (stmt == fix_expect_builtin)
>>> +{
>>> +  this_size--;
>>> +  this_time--;
>>> +}
>>> +
>>> if (dump_file && (dump_flags & TDF_DETAILS))
>>>   {
>>> fprintf (dump_file, "  ");
>>


Re: [PATCH] Sanitize block partitioning under -freorder-blocks-and-partition

2013-09-26 Thread Rong Xu
On Thu, Sep 26, 2013 at 2:54 PM, Jan Hubicka  wrote:
>> > As for COMDAT merging, i would like to see the patch.  I am experimenting
>> > now with a patch to also privatize COMDATs during -fprofile-generate to
>> > avoid problems with lost profiles mentioned above.
>> >
>>
>> Do you mean you privatize every COMDAT function in the profile-generate?
>> We discussed this idea internally and we thought it would not work for
>> large applications (like in google) due to size.
>
> Yes, Martin and I plan to test this on firefox.  In a way you already have all
> the COMDAT functions unshared in the object files, so the resulting binary
> should not be completely off the limits.  But I do not have any quantitative
> data, yet, since we hit bug in constant folding and devirtualization I fixed 
> in
> meantime but we did not re-run the tests yet.

LInker removes a great numbers of duplicated copies, esp for those
template functions.
We don't have a quantitative numbers either. But I'll collect some soon.
>
>>
>> > As for context sensitivity, one approach would be to have two sets of
>> > counters for every comdat - one merged globally and one counting local
>> > instances.  We can then privatize always and at profile read in stage
>> > just clone every comdat and have two instances - one for offline copy
>> > and one for inlining.
>> >
>>
>> In my implementation, I also allow multiple sets of COMDAT profile
>> co-existing in one compilation.
>> Due to the auxiliary modules in LIPO, I actually have more than two.
>
> How does auxiliary modules work?

It pulls in multiple profiles from other compilation. So there might be multiple
inlined profiles.

>>
>> But I'm wondering how do you determine which profile to use for each
>> call-site -- the inline decision may not
>> be the same for profile-generate and profile-use compilation.
>
> My suggestion was to simply use the module local profile for all inline sites
> within the given module and the global profile for the offline copy of the
> function (that one will, in the case it survives linking, be shared across
> all the modules anyway).

For simple example like:
callsite1 --> comcat_function_foo
callsite2 --> comdat_function_foo

callsite1 is inlined in profile-generate, it has its own inlined
profile counter.
callsite2 is not inlined and the profile goes to the offline copies.
let's callsite 1 is cold (0 counter) and callsite 2 is hot. Using
local profile (the cold one)
for callsite2 will not be correct.

>
> I think this may work in the cases where i.e. use of hash templates in one
> module is very different (in average size) from other module.
> I did not really put much effort into it - I currently worry primarily about
> the cases where profile is lost completely since it gets attached to a 
> function
> not surviving final linking (or because we inline something we did not inlined
> at profile time).
>
> As for context sensitivity, we may try to consider developing more consistent
> solution for this.  COMDAT functions are definitely not only that may exhibit
> context sensitive behaviour.
> One approach would be to always have multiple counters for each function and
> hash based on cbacktraces collected by indirect call profiling 
> instrumentation.
> In a way this is same path profiling, but that would definitely add quite some
> overhead + we will need to think of resonable way to represent this within
> compiler.
>
> How do you decide what functions you want to have multiple profiles for?

I do the instrumentation after ipa-inline for comdat function. I know
if a callsite
is inlined or not. In profile-use phrase, I also need to provide to
the context (which module this is from) to pick
the right profile.

>
> Honza


[google gcc-4_8] alternate hirate for builtin_expert

2013-09-26 Thread Rong Xu
Hi,

Current default probably for builtin_expect is 0.9996.
This makes the freq of unlikely bb very low (4), which
suppresses the inlining of any calls within those bb.

We used FDO data to measure the branch probably for
the branch annotated with builtin_expert.
 For google
internal benchmarks, the weight average
(the profile count value as the weight) is 0.9081.

Linux kernel is another program that is heavily annotated
with builtin-expert. We measured its weight average as 0.8717,
  using google search as
the workload.


This patch sets the alternate hirate probability for
builtin_expert
to 90%. With the alternate hirate, we measured performance
  improvement for google
benchmarks and Linux kernel.


  -Rong
2013-09-26  Rong Xu  

* params.def (DEFPARAM): New.
* params.def: New.
* predict.c (tree_predict_by_opcode): Alternate 
probablity hirate for builtin_expect.

Index: params.def
===
--- params.def  (revision 202638)
+++ params.def  (working copy)
@@ -483,6 +483,10 @@ DEFPARAM(TRACER_MIN_BRANCH_PROBABILITY,
 "tracer-min-branch-probability",
 "Stop forward growth if the probability of best edge is less than this 
threshold (in percent). Used when profile feedback is not available",
 50, 0, 100)
+DEFPARAM(BUILTIN_EXPECT_PROBABILITY_RELAXED,
+"builtin-expect-probability-relaxed",
+"Set the estimated probability for builtin expect. By default using 
PROB_VERY_LIKELY, while value of 1 using HIRATE(90) probability.",
+0, 0, 1)
 
 /* The maximum number of incoming edges to consider for crossjumping.  */
 DEFPARAM(PARAM_MAX_CROSSJUMP_EDGES,
Index: params.def
===
--- params.def  (revision 202638)
+++ params.def  (working copy)
@@ -483,6 +483,10 @@ DEFPARAM(TRACER_MIN_BRANCH_PROBABILITY,
 "tracer-min-branch-probability",
 "Stop forward growth if the probability of best edge is less than this 
threshold (in percent). Used when profile feedback is not available",
 50, 0, 100)
+DEFPARAM(BUILTIN_EXPECT_PROBABILITY_RELAXED,
+"builtin-expect-probability-relaxed",
+"Set the estimated probability for builtin expect. By default using 
PROB_VERY_LIKELY, while value of 1 using HIRATE(90) probability.",
+0, 0, 1)
 
 /* The maximum number of incoming edges to consider for crossjumping.  */
 DEFPARAM(PARAM_MAX_CROSSJUMP_EDGES,
Index: predict.c
===
--- predict.c   (revision 202638)
+++ predict.c   (working copy)
@@ -1950,10 +1950,17 @@ tree_predict_by_opcode (basic_block bb)
   BITMAP_FREE (visited);
   if (val)
 {
+  enum br_predictor predictor;
+
+  if (PARAM_VALUE (BUILTIN_EXPECT_PROBABILITY_RELAXED) == 0)
+predictor = PRED_BUILTIN_EXPECT;
+  else
+predictor = PRED_BUILTIN_EXPECT_RELAXED;
+
   if (integer_zerop (val))
-   predict_edge_def (then_edge, PRED_BUILTIN_EXPECT, NOT_TAKEN);
+   predict_edge_def (then_edge, predictor, NOT_TAKEN);
   else
-   predict_edge_def (then_edge, PRED_BUILTIN_EXPECT, TAKEN);
+   predict_edge_def (then_edge, predictor, TAKEN);
   return;
 }
   /* Try "pointer heuristic."


[google gcc-4_8] fix size_estimation for builtin_expect

2013-09-26 Thread Rong Xu
Hi,

builtin_expect should be a NOP in size_estimation. Indeed, the call
stmt itself is 0 weight in size and time. But it may introduce
an extra relation expr which has non-zero size/time. The end result
is: for w/ and w/o builtin_expect, we have different size/time estimation
for early inlining.

This patch fixes this problem.

-Rong
2013-09-26  Rong Xu  

* ipa-inline-analysis.c (estimate_function_body_sizes): fix
the size estimation for builtin_expect.

Index: ipa-inline-analysis.c
===
--- ipa-inline-analysis.c   (revision 202638)
+++ ipa-inline-analysis.c   (working copy)
@@ -2266,6 +2266,8 @@ estimate_function_body_sizes (struct cgraph_node *
   /* Estimate static overhead for function prologue/epilogue and alignment. */
   int overhead = PARAM_VALUE (PARAM_INLINE_FUNCTION_OVERHEAD_SIZE);
   int size = overhead;
+  gimple fix_expect_builtin;
+
   /* Benefits are scaled by probability of elimination that is in range
  <0,2>.  */
   basic_block bb;
@@ -2359,14 +2361,73 @@ estimate_function_body_sizes (struct cgraph_node *
}
}
 
+  fix_expect_builtin = NULL;
   for (bsi = gsi_start_bb (bb); !gsi_end_p (bsi); gsi_next (&bsi))
{
  gimple stmt = gsi_stmt (bsi);
+ if (gimple_call_builtin_p (stmt, BUILT_IN_EXPECT))
+{
+  tree var = gimple_call_lhs (stmt);
+  tree arg = gimple_call_arg (stmt, 0);
+  use_operand_p use_p;
+  gimple use_stmt;
+  bool match = false;
+  bool done = false;
+  gcc_assert (var && arg);
+  gcc_assert (TREE_CODE (var) == SSA_NAME);
+
+  while (TREE_CODE (arg) == SSA_NAME)
+{
+  gimple stmt_tmp = SSA_NAME_DEF_STMT (arg);
+  switch (gimple_assign_rhs_code (stmt_tmp))
+{
+  case LT_EXPR:
+  case LE_EXPR:
+  case GT_EXPR:
+  case GE_EXPR:
+  case EQ_EXPR:
+  case NE_EXPR:
+match = true;
+done = true;
+break;
+  case NOP_EXPR:
+break;
+  default:
+done = true;
+break;
+}
+  if (done)
+break;
+  arg = gimple_assign_rhs1 (stmt_tmp);
+}
+
+  if (match && single_imm_use (var, &use_p, &use_stmt))
+{
+  if (gimple_code (use_stmt) == GIMPLE_COND)
+{
+  fix_expect_builtin = use_stmt;
+}
+}
+
+  /* we should see one builtin_expert call in one bb.  */
+  break;
+}
+}
+
+  for (bsi = gsi_start_bb (bb); !gsi_end_p (bsi); gsi_next (&bsi))
+   {
+ gimple stmt = gsi_stmt (bsi);
  int this_size = estimate_num_insns (stmt, &eni_size_weights);
  int this_time = estimate_num_insns (stmt, &eni_time_weights);
  int prob;
  struct predicate will_be_nonconstant;
 
+ if (stmt == fix_expect_builtin)
+{
+  this_size--;
+  this_time--;
+}
+
  if (dump_file && (dump_flags & TDF_DETAILS))
{
  fprintf (dump_file, "  ");


Re: [PATCH] Sanitize block partitioning under -freorder-blocks-and-partition

2013-09-26 Thread Rong Xu
On Tue, Sep 24, 2013 at 5:31 AM, Jan Hubicka  wrote:
>> Hi Honza,
>>
>> I am finally getting back to working on this after a few weeks of
>> working on some other priorities.
>
> I am also trying to return to this, so good timming ;)
> Martin has got smaller C++ programs (Inkscape) to not touch cold segment
> during the startup with FDO (w/o partitioning). Firefox still does, I think
> the problem are lost samples due to different linker decisions even with LTO.
> (i.e. linker pick an object from .a libraryat profile-generate time that i 
> never
> passes later.).
>
> I plan to look into that today.
>>
>> Did you mean to commit the above change? I see that it went in as part
>> of r202258 but doesn't show up in the ChangeLog entry for that
>> revision.
>
> Yes, I meant to check it in, but did not mean to do so w/o Changelog.  I wil
> fix that.
>>
>> >
>> > In other cases it was mostly loop unrolling in combination with jump 
>> > threading. So
>> > I modified my script to separately report when failure happens for test 
>> > trained
>> > once and test trained hundred times.
>>
>> Thanks for the linker script. I reproduced your results. I looked at a
>> couple cases. The first was one that failed after 1 training run only
>> (2910-2.c). It was due to jump threading, which you noted was a
>> problem. For this one I think we can handle it in the partitioning,
>> since there is an FDO insanity that we could probably treat more
>> conservatively when splitting.
>
> We should fix the roundoff issues - when I was introducing the
> frequency/probability/count system I made an assumptions that parts of 
> programs
> with very low counts do not matter, since they are not part of hot spot (and I
> cared only about the hot spot).  Now we care about identifying unlikely
> executed spots and we need to fix this.
>>
>> I looked at one that failed after 100 as well (20031204-1.c). In this
>> case, it was due to expansion which was creating multiple branches/bbs
>> from a logical OR and guessing incorrectly on how to assign the
>> counts:
>>
>>  if (octets == 4 && (*cp == ':' || *cp == '\0')) {
>>
>> The (*cp == ':' || *cp == '\0') part looked like the following going
>> into RTL expansion:
>>
>>   [20031204-1.c : 31:33] _29 = _28 == 58;
>>   [20031204-1.c : 31:33] _30 = _28 == 0;
>>   [20031204-1.c : 31:33] _31 = _29 | _30;
>>   [20031204-1.c : 31:18] if (_31 != 0)
>> goto ;
>>   else
>> goto ;
>>
>> where the result of the OR was always true, so bb 16 had a count of
>> 100 and bb 19 a count of 0. When it was expanded, the expanded version
>> of the above turned into 2 bbs with a branch in between. Both
>> comparisons were done in the first bb, but the first bb checked
>> whether the result of the *cp == '\0' compare was true, and if not
>> branched to the check for whether the *cp == ':' compare was true. It
>> gave the branch to the second check against ':' a count of 0, so that
>> bb got a count of 0 and was split out, and put the count of 100 on the
>> fall through assuming the compare with '\0' always evaluated to true.
>> In reality, this OR condition was always true because *cp was ':', not
>> '\0'. Therefore, the count of 0 on the second block with the check for
>> ':' was incorrect, we ended up trying to execute it, and failed.
>>
>> Presumably we had the correct profile data for both blocks, but the
>> accuracy was reduced when the OR was represented as a logical
>> computation with a single branch. We could change the expansion code
>> to do something different, e.g. treat as a 50-50 branch. But we would
>> still end up with integer truncation issues when there was a single
>> training run. But that could be dealt with conservatively in the
>> bbpart code as I suggested for the jump threading issue above. I.e. a
>> cold block with incoming non-cold edges conservatively not marked cold
>> for splitting.
>>
>> >
>> > FAIL1 /home/jh/trunk/gcc/testsuite/gcc.c-torture/execute/2422-1.c
>> > FAIL1 /home/jh/trunk/gcc/testsuite/gcc.c-torture/execute/2910-2.c
>> > FAIL1 /home/jh/trunk/gcc/testsuite/gcc.c-torture/execute/20020413-1.c
>> > FAIL1 /home/jh/trunk/gcc/testsuite/gcc.c-torture/execute/20030903-1.c
>> > FAIL1 /home/jh/trunk/gcc/testsuite/gcc.c-torture/execute/20031204-1.c
>> > FAIL /home/jh/trunk/gcc/testsuite/gcc.c-torture/execute/20031204-1.c
>> > FAIL1 /home/jh/trunk/gcc/testsuite/gcc.c-torture/execute/20060420-1.c
>> > FAIL1 /home/jh/trunk/gcc/testsuite/gcc.c-torture/execute/20060905-1.c
>> > FAIL1 /home/jh/trunk/gcc/testsuite/gcc.c-torture/execute/20120427-1.c
>> > FAIL1 /home/jh/trunk/gcc/testsuite/gcc.c-torture/execute/20120427-2.c
>> > FAIL1 /home/jh/trunk/gcc/testsuite/gcc.c-torture/execute/20120808-1.c
>> > FAIL1 /home/jh/trunk/gcc/testsuite/gcc.c-torture/execute/20121108-1.c
>> > FAIL /home/jh/trunk/gcc/testsuite/gcc.c-torture/execute/20121108-1.c
>> > FAIL1 /home/jh/trunk/gcc/testsuite/gcc.c-torture/execute/920501-6.c
>> > FAIL /home/jh/trunk/gcc/testsuite/gcc.c-torture/execute/920

Re: [PATCH] Sanitize block partitioning under -freorder-blocks-and-partition

2013-08-30 Thread Rong Xu
On Fri, Aug 30, 2013 at 7:50 AM, Teresa Johnson  wrote:
> Can someone review and ok the attached patch for trunk? It has been
> bootstrapped and tested on x86-64-unknown-linux-gnu, and tested by
> enabling -freorder-blocks-and-partition enabled for a
> profiledbootstrap as well.
>
> (Honza, see more responses inlined below. Rong, please see note below as 
> well).
>
> Thanks,
> Teresa
>
> On Fri, Aug 30, 2013 at 2:14 AM, Jan Hubicka  wrote:
>>> Great! Is this the LTO merging you were talking about in an earlier
>>> message, or the gcov runtime fix (that would presumably not be
>>> lto-specific)?
>>
>> It is the LTO path - we need to merge profiles there anyway for his code 
>> unification
>> work.
>
> Rong - can you send a summary of the approach you are working on? Is
> it LIPO-specific?

I'm also working to improve COMDAT handling in FDO/LIPO.  It's
applicable to regular FDO.
Our motivation case is different from the case talked here:
  COMDAT function F is defined in module A and is picked by linker as
the out-of-line copy in profile-gen phrase.
It gets all the counters. In profile-use compilation F may not be
emitted in A (like all the callsites are ipa-inlined in A),
We choose instance of F in module B as the out-of-line copy, and it's
not optimized.

We are seeing more of this kind of cases in LIPO due to multiple
COMDAT copies brought by the auxiliary modules.

Since COMDAT function may be inlined after instrumentation, multiple
copies of the counters make be co-exists
We want to differentiate inlined-copy counters or out-of-line-copy counters.
(In LIPO, we actually encourage the inline of COMDAT in IPA-inline to
get more context sensitive counters.)

Our current solution is to have another instrumentation only for
COMDAT functions.
* For each  comdat_key, we create a global var pointing to the
gcov_fn_info of the out-of-line copy.
* This global var is initialized by the instrumentation code placed in
the function entry, to the value of gcov_fn_info of current module.
   This is post IPA-inline instrumentation. So at most one
instrumentation (the one picked by linker) is executed.
* Expand gcov_fn_info to points to the global var.
In the run time, we can differentiate if the counter out-of-line copy
or inlined copy. We set a tag in gcda file accordingly.
(in the case of both out-of-line copy and inlined copy, we treat it as
out-of-line copy).

In the profile-use phrase, we use this info to resolve the decls of
COMDAT functions, and also make sure we only emit
the out-of-line copy of the COMDAT function (even if it's not
referenced in the module).

This has been done and we are testing it now.

I talked to Teresa about the her case some time back. It seems that
merging the counters is an easy change with the above patch because
we have the address of the out-of-line gcov_fn_info. We can do a
simple in-memory merge if the checksum matches. The concern is
that this may reduce the context sensitive information.

-Rong

>
>>
>>> > I have patch to track this.  Moreover vforks seems to produce repeated
>>> > merging of results.
>>>
>>> Aha, does this explain the gimp issue as well?
>>
>> Not really - we still need to debug why we hit cold section so many times 
>> with
>> partitioning.  I sitll think easier approach will be to lock the cold 
>> section and
>> then start probably with testsuite (i.e. write script to compile the small 
>> testcases
>> with FDO + partitioning and see what crash by hitting cold section).
>
> Ok, that is on my todo list.
>
>>
>>> >
>>> > Is it really necessary to run this from internal loop of the cfgcleanup?
>>>
>>> The reason I added it here is that just below there is a call to
>>> verify_flow_info, and that will fail with the new verification.
>>
>> Hmm, OK, I suppose we run the cleanup after partitioning just once or twice, 
>> right?
>> We can track this incrementally - I am not sure if we put it from the 
>> internal iteration
>> loop we would get anything substantial either.
>> Removing unreachable blocks twice is however ugly.
>
> When I was debugging the issue that led to this change I seemed to see
> 1-2 iterations typically. Although I haven't measured it
> scientifically. It would be good to revisit that and see if we can
> pull both parts out of the loop, but as a separate patch.
>
>>
>>> +/* Ensure that all hot bbs are included in a hot path through the
>>> +   procedure. This is done by calling this function twice, once
>>> +   with WALK_UP true (to look for paths from the entry to hot bbs) and
>>> +   once with WALK_UP false (to look for paths from hot bbs to the exit).
>>> +   Returns the updated value of COLD_BB_COUNT and adds newly-hot bbs
>>> +   to BBS_IN_HOT_PARTITION.  */
>>> +
>>> +static unsigned int
>>> +sanitize_hot_paths (bool walk_up, unsigned int cold_bb_count,
>>> +vec *bbs_in_hot_partition)
>>> +{
>>> +  /* Callers check this.  */
>>> +  gcc_checking_assert (cold_bb_count);
>>> +
>>> +  /* Keep examining hot bbs while we stil

[Google] fix a bug in lipo varpool node linking

2013-08-16 Thread Rong Xu
This patch fixed a bug in lipo varpool node linking.

C++ FE drops the initializer if it's not used in this TU. For current
varpool linking may
resolve the varpool node to the one with null initializer.

-Rong


Index: l-ipo.c
===
--- l-ipo.c (revision 201800)
+++ l-ipo.c (working copy)
@@ -2151,6 +2151,19 @@ resolve_varpool_node (struct varpool_node **slot,
   return;
 }

+  if (DECL_INITIAL (decl1) && !DECL_INITIAL (decl2))
+{
+  merge_addressable_attr (decl1, decl2);
+  return;
+}
+
+  if (!DECL_INITIAL (decl1) && DECL_INITIAL (decl2))
+{
+  *slot = node;
+  merge_addressable_attr (decl2, decl1);
+  return;
+}
+
   /* Either all complete or neither's type is complete. Just
  pick the primary module's decl.  */
   if (!varpool_is_auxiliary (*slot))


[google gcc-4_8] Force cmd-line match for option -ansi in LIPO use

2013-07-30 Thread Rong Xu
The following patch forces the command line match for -ansi option
in LIPO use build. Otherwise, it gets various undefined symbol errors.
This is exposed in LIPO random grouping test.

Tested with google internal benchmarks and gcc regression test.

2013-07-30  Rong Xu  

* gcc/coverage.c (force_matching_cg_opts): require
cmd line match on -ansi option in LIPO use build.

Index: gcc/coverage.c
===
--- gcc/coverage.c  (revision 201219)
+++ gcc/coverage.c  (working copy)
@@ -263,6 +263,7 @@ static struct opt_desc force_matching_cg_opts[] =
 { "-fsized-delete", "-fno-sized-delete", false },
 { "-frtti", "-fno-rtti", true },
 { "-fstrict-aliasing", "-fno-strict-aliasing", true },
+{ "-ansi", "", false },
 { NULL, NULL, false }
   };


Re: [Google] Fix profiledbootstrap failure

2013-07-30 Thread Rong Xu
Will do.

The patch was in gcc-4_7 by Dehao.

r194713 | dehao | 2012-12-24 16:49:06 -0800 (Mon, 24 Dec 2012) | 5 lines

Fix the profiled bootstrap:

1. Set the default value of gcov-debug to be 0.
2. Merge profile summaries from different instrumented binaries.

On Tue, Jul 30, 2013 at 12:58 PM, Xinliang David Li  wrote:
> Ok. Rong, can you help commit the parameter default setting patch?
>
> thanks,
>
> David
>
> On Tue, Jul 30, 2013 at 12:48 PM, Rong Xu  wrote:
>> We have seen the issue before. It does fail the profile boostrap as it
>> reads a wrong gcda file.
>> I thought it had been fixed. (The fix was as David mentioned, setting
>> the default value of the parameter to 0).
>>
>> -Rong
>>
>> On Tue, Jul 30, 2013 at 12:02 PM, Xinliang David Li  
>> wrote:
>>> I need to understand why this affects profile bootstrap -- is this due
>>> to file name conflict?
>>>
>>> The fix is wrong -- please do not remove the parameter. If it is a
>>> problem, a better fix is to change the default parameter value to 0.
>>>
>>> David
>>>
>>>
>>> On Tue, Jul 30, 2013 at 11:56 AM, Teresa Johnson  
>>> wrote:
>>>> cc'ing Rong and David since this came from LIPO support.
>>>>
>>>> The patch as-is removes the one use of PARAM_GCOV_DEBUG (which is on
>>>> by default) without removing the parameter itself. What is the failure
>>>> mode you see from this code?
>>>>
>>>> Thanks,
>>>> Teresa
>>>>
>>>> On Tue, Jul 30, 2013 at 11:50 AM, Dinar Temirbulatov
>>>>  wrote:
>>>>> Hello
>>>>>
>>>>> This change allows to complete profiledbootstrap on the google gcc-4.8
>>>>> branch, tested with make bootstrap with no new regressions.  OK for
>>>>> google 4.8?
>>>>>thanks, Dinar.
>>>>
>>>>
>>>>
>>>> --
>>>> Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413


Re: [Google] Fix profiledbootstrap failure

2013-07-30 Thread Rong Xu
We have seen the issue before. It does fail the profile boostrap as it
reads a wrong gcda file.
I thought it had been fixed. (The fix was as David mentioned, setting
the default value of the parameter to 0).

-Rong

On Tue, Jul 30, 2013 at 12:02 PM, Xinliang David Li  wrote:
> I need to understand why this affects profile bootstrap -- is this due
> to file name conflict?
>
> The fix is wrong -- please do not remove the parameter. If it is a
> problem, a better fix is to change the default parameter value to 0.
>
> David
>
>
> On Tue, Jul 30, 2013 at 11:56 AM, Teresa Johnson  wrote:
>> cc'ing Rong and David since this came from LIPO support.
>>
>> The patch as-is removes the one use of PARAM_GCOV_DEBUG (which is on
>> by default) without removing the parameter itself. What is the failure
>> mode you see from this code?
>>
>> Thanks,
>> Teresa
>>
>> On Tue, Jul 30, 2013 at 11:50 AM, Dinar Temirbulatov
>>  wrote:
>>> Hello
>>>
>>> This change allows to complete profiledbootstrap on the google gcc-4.8
>>> branch, tested with make bootstrap with no new regressions.  OK for
>>> google 4.8?
>>>thanks, Dinar.
>>
>>
>>
>> --
>> Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413


  1   2   >