Re: FDO usability patch -- cfg and lineno checksum

Xinliang David Li Wed, 20 Apr 2011 22:28:55 -0700

On Tue, Apr 19, 2011 at 3:20 PM, Jan Hubicka <hubi...@ucw.cz> wrote:
> I can not review tree.c changes.  I would probably suggest making crc_byte 
> inline.


Diego, can you review this change? This is just a simple refactoring.


>
>> +#if IN_LIBGCOV
>> +
>> +/* These functions are guarded by #if to avoid compile time warning.  */
>> +
>> +/* Return the number of words STRING would need including the length
>> +   field in the output stream itself.  This should be identical to
>> +   "alloc" calculation in gcov_write_string().  */
>
> Hmm, probably better to make gcov_write_string to use gcov_string_length then.


yes.

>> @@ -238,13 +265,15 @@ gcov_write_words (unsigned words)
>>
>>    gcc_assert (gcov_var.mode < 0);
>>  #if IN_LIBGCOV
>> -  if (gcov_var.offset >= GCOV_BLOCK_SIZE)
>> +  if (gcov_var.offset + words >= GCOV_BLOCK_SIZE)
>>      {
>> -      gcov_write_block (GCOV_BLOCK_SIZE);
>> +      gcov_write_block (MIN (gcov_var.offset, GCOV_BLOCK_SIZE));
>>        if (gcov_var.offset)
>>       {
>> -       gcc_assert (gcov_var.offset == 1);
>> -       memcpy (gcov_var.buffer, gcov_var.buffer + GCOV_BLOCK_SIZE, 4);
>> +       gcc_assert (gcov_var.offset < GCOV_BLOCK_SIZE);
>> +       memcpy (gcov_var.buffer,
>> +                  gcov_var.buffer + GCOV_BLOCK_SIZE,
>> +                  gcov_var.offset << 2);
>
> I don't really follow the logic here.  buffer is allocated to be size of
> block+4 and it is expected that gcov_write_words is not executed on size
> greater than 4.  Since gcov_write_string now seems to be expected to handle
> strings of bigger size, I think you acually need to make write_string to write
> in chunks when you reach block boundary?

gcov_write_words is used to reserve words*4 bytes in buffer for data
write later. The the old logic is wrong -- if 'words' is large enough,
it will lead to out of bound access.

>
> As soon as you decide to not write out in the GCOV_BLOCK_SIZE chunks, the
> MIN computation above seems unnecesary (and bogus, since we won't let
> gcov_var.offset to grow past GCOV_BLOCK_SIZE.

Yes, using gcov_var.offset should be good enough.

>
> What gets into libgcov is very problematic busyness for embedded targets, 
> where you really want
> libgcov to be small.  Why do you need to store strings now?

It is needed to store the assembler name for functions. It allows
lookup of the profile data using assembler name as key instead of
using function ids. For gcc, the total size of gcda files is about 59k
bytes without storing the names, and about 65k with names -- about 10%
increase. For eon, the size changes from 27k to 35k bytes.

I can split the patches into two parts -- one with cfg checksum and
one with the name string.

>> Index: gcov-io.h
>> ===================================================================
>> --- gcov-io.h (revision 172693)
>> +++ gcov-io.h (working copy)
>> @@ -101,9 +101,10 @@ see the files COPYING3 and COPYING.RUNTI
>>
>>     The basic block graph file contains the following records
>>       note: unit function-graph*
>> -     unit: header int32:checksum string:source
>> +     unit: header int32:checksum int32: string:source
>>       function-graph: announce_function basic_blocks {arcs | lines}*
>> -     announce_function: header int32:ident int32:checksum
>> +     announce_function: header int32:ident
>> +             int32:lineno_checksum int32:cfg_checksum
>>               string:name string:source int32:lineno
>>       basic_block: header int32:flags*
>>       arcs: header int32:block_no arc*
>> @@ -132,7 +133,9 @@ see the files COPYING3 and COPYING.RUNTI
>>          data: {unit function-data* summary:object summary:program*}*
>>       unit: header int32:checksum
>>          function-data:       announce_function arc_counts
>> -     announce_function: header int32:ident int32:checksum
>> +     announce_function: header int32:ident
>> +             int32:lineno_checksum int32:cfg_checksum
>> +             string:name
>>       arc_counts: header int64:count*
>>       summary: int32:checksum {count-summary}GCOV_COUNTERS
>>       count-summary:  int32:num int32:runs int64:sum
>
> We also need to bump gcov version, right?

Yes -- but the version is currently derived from gcc version number
and phase number --- this is wrong -- different version of gcc may
have compatible coverage data format.  Any suggestions to change this?
--- probably just hard code the GCOV_VERSION string?


>
>> @@ -411,11 +417,20 @@ struct gcov_summary
>>  /* Information about a single function.  This uses the trailing array
>>     idiom. The number of counters is determined from the counter_mask
>>     in gcov_info.  We hold an array of function info, so have to
>> -   explicitly calculate the correct array stride.  */
>> +   explicitly calculate the correct array stride.
>> +
>> +   "ident" is no longer used in hash computation.  Additionally,
>> +   "name" is used in hash computation.  This makes the profile data
>> +   not compatible across function name changes.
>> +   Also, the function pointer is stored for later use for indirect
>> +   function call name resolution.
>
> Hmm, your patch is not adding indirect function call name resolution,
> so independent change?

Yes -- the comment on indirect function call name is for SampleFDO change.


>> @@ -67,9 +71,11 @@ typedef struct counts_entry
>>    /* We hash by  */
>>    unsigned ident;
>>    unsigned ctr;
>> +  char *name;
> also const, right?

yes.

>> @@ -139,12 +144,21 @@ get_gcov_unsigned_t (void)
>>    return lang_hooks.types.type_for_size (32, true);
>>  }
>>
>> +/* Return the type node for const char *.  */
>> +
>> +static tree
>> +get_const_string_type (void)
>
> This is not really coverage specific and already mudflap is doing the same. 
> So probably it better
> should go into tree.c.


Ok.


>> -       if (entry->checksum != checksum)
>> -         inform (input_location, "checksum is %x instead of %x",
>> -                 entry->checksum, checksum);
>> +       if (entry->cfg_checksum != cfg_checksum)
>> +         inform (input_location, "cfg_checksum is %x instead of %x",
>> +                 entry->cfg_checksum, cfg_checksum);
>
> Well, the message is already cryptic, perhaps just expanding cfg to be 
> control flow graph
> so users get some clue.

Ok.


>> @@ -641,13 +744,24 @@ build_fn_info_type (unsigned int counter
>>    /* ident */
>>    fields = build_decl (BUILTINS_LOCATION,
>>                      FIELD_DECL, NULL_TREE, get_gcov_unsigned_t ());
>> -
>> -  /* checksum */
>> +  /* lineno_checksum */
>>    field = build_decl (BUILTINS_LOCATION,
>>                     FIELD_DECL, NULL_TREE, get_gcov_unsigned_t ());
>>    DECL_CHAIN (field) = fields;
>>    fields = field;
>>
>> +  /* cfg checksum */
>> +  field = build_decl (BUILTINS_LOCATION,
>> +                      FIELD_DECL, NULL_TREE, get_gcov_unsigned_t ());
>> +  DECL_CHAIN (field) = fields;
>> +  fields = field;
>
> Why do we need function names to end up there at runtime?

They need to be stored into gcda files at the end of the profile collection.

Thanks,


David


>
> Honza
>

Re: FDO usability patch -- cfg and lineno checksum

Reply via email to