Re: [PING][PATCH] Make function clone name numbering independent.

2018-08-20 Thread Jeff Law
On 08/13/2018 05:58 PM, Michael Ploujnikov wrote:
> Ping and I've updated the patch since last time as follows:
> 
>   - unittest scans assembly rather than the constprop dump because its
> forward changed
>   - unittests should handle different hosts where any of
> NO_DOT_IN_LABEL, NO_DOLLAR_IN_LABEL or __USER_LABEL_PREFIX__ may
> be defined
>   - not 100% it's safe to change DECL_NAME to DECL_ASSEMBLER_NAME in
> cgraph_node::create_virtual_clone, but I've attempted to reduce
> some code duplication
>   - lto-partition.c: privatize_symbol_name_1 *does* need numbered
> names
>   - but cold sections don't
>   - Expecting an IDENTIFIER_NODE in clone_function_name_1 avoids
> unreliable string pointer use as pointed out in the first review
>   - renamed clone_function_name_1 and clone_function_name to
> numbered_clone_function_name_1 and numbered_clone_function_name to
> clarify purpose and discourage future unintended uses
Richi has more state here than I do, so I'm going to let him own it.  I
know he's just returning from PTO, so it's going to take him a bit of
time to catch up.

jeff


[PING][PATCH] Make function clone name numbering independent.

2018-08-13 Thread Michael Ploujnikov
Ping and I've updated the patch since last time as follows:

  - unittest scans assembly rather than the constprop dump because its
forward changed
  - unittests should handle different hosts where any of
NO_DOT_IN_LABEL, NO_DOLLAR_IN_LABEL or __USER_LABEL_PREFIX__ may
be defined
  - not 100% it's safe to change DECL_NAME to DECL_ASSEMBLER_NAME in
cgraph_node::create_virtual_clone, but I've attempted to reduce
some code duplication
  - lto-partition.c: privatize_symbol_name_1 *does* need numbered
names
  - but cold sections don't
  - Expecting an IDENTIFIER_NODE in clone_function_name_1 avoids
unreliable string pointer use as pointed out in the first review
  - renamed clone_function_name_1 and clone_function_name to
numbered_clone_function_name_1 and numbered_clone_function_name to
clarify purpose and discourage future unintended uses

Also bootstrapped and regtested.

- Michael

On 2018-08-02 03:05 PM, Michael Ploujnikov wrote:
> On 2018-08-01 06:37 AM, Richard Biener wrote:
>> On Tue, Jul 31, 2018 at 7:40 PM Michael Ploujnikov
>>  wrote:
>>>
>>> On 2018-07-26 01:27 PM, Michael Ploujnikov wrote:
 On 2018-07-24 09:57 AM, Michael Ploujnikov wrote:
> On 2018-07-20 06:05 AM, Richard Biener wrote:
>>>  /* Return a new assembler name for a clone with SUFFIX of a decl named
>>> NAME.  */
>>> @@ -521,14 +521,13 @@ tree
>>>  clone_function_name_1 (const char *name, const char *suffix)
>>
>> pass this function the counter to use
>>
>>>  {
>>>size_t len = strlen (name);
>>> -  char *tmp_name, *prefix;
>>> +  char *prefix;
>>>
>>>prefix = XALLOCAVEC (char, len + strlen (suffix) + 2);
>>>memcpy (prefix, name, len);
>>>strcpy (prefix + len + 1, suffix);
>>>prefix[len] = symbol_table::symbol_suffix_separator ();
>>> -  ASM_FORMAT_PRIVATE_NAME (tmp_name, prefix, clone_fn_id_num++);
>>
>> and keep using ASM_FORMAT_PRIVATE_NAME here.  You need to change
>> the lto/lto-partition.c caller (just use zero as counter).
>>
>>> -  return get_identifier (tmp_name);
>>> +  return get_identifier (prefix);
>>>  }
>>>
>>>  /* Return a new assembler name for a clone of DECL with SUFFIX.  */
>>> @@ -537,7 +536,17 @@ tree
>>>  clone_function_name (tree decl, const char *suffix)
>>>  {
>>>tree name = DECL_ASSEMBLER_NAME (decl);
>>> -  return clone_function_name_1 (IDENTIFIER_POINTER (name), suffix);
>>> +  const char *decl_name = IDENTIFIER_POINTER (name);
>>> +  char *numbered_name;
>>> +  unsigned int *suffix_counter;
>>> +  if (!clone_fn_ids) {
>>> +/* Initialize the per-function counter hash table if this is the 
>>> first call */
>>> +clone_fn_ids = hash_map::create_ggc (64);
>>> +  }
>>
>> I still do not like throwing memory at the problem in this way for the
>> little benefit
>> this change provides :/
>>
>> So no approval from me at this point...
>>
>> Richard.
>
> Can you give me an idea of the memory constraints that are involved?
>
> The highest memory usage increase that I could find was when compiling
> a source file (from Linux) with roughly 10,000 functions. It showed a 2kB
> increase over the before-patch use of 6936kB which is barely 0.03%.
>
> Using a single counter can result in more confusing namespacing when
> you have .bar.clone.4 despite there only being 3 clones of .bar.
>
> From a practical point of view this change is helpful to anyone
> diffing binary output such as forensic analysts, Debian Reproducible
> Builds or even someone validating compiler output (before and after an 
> input
> source patch). The extra changes that this patch alleviates are a
> distraction and could even be misleading. For example, applying a
> source patch to the same Linux source produces the following binary
> diff before my change:
>
> --- /tmp/output.o.objdump
> +++ /tmp/patched-output.o.objdump
> @@ -1,5 +1,5 @@
>
> -/tmp/uverbs_cmd/output.o: file format elf32-i386
> +/tmp/uverbs_cmd/patched-output.o: file format elf32-i386
>
>
>  Disassembly of section .text.get_order:
> @@ -265,12 +265,12 @@
> 3:   e9 fc ff ff ff  jmp4 
>  4: R_386_PC32   .text.put_uobj_read
>
> -Disassembly of section .text.trace_kmalloc.constprop.3:
> +Disassembly of section .text.trace_kmalloc.constprop.4:
>
> - :
> + :
> 0:   83 3d 04 00 00 00 00cmpl   $0x0,0x4
>  2: R_386_32 __tracepoint_kmalloc
> -   7:   74 34   je 3d 
> 
> +   7:   74 34   je 3d 
> 
> 9:   55  push   %ebp
> a:   89 cd   mov%ecx,%ebp
>