Ping and I've updated the patch since last time as follows:
- unittest scans assembly rather than the constprop dump because its
forward changed
- unittests should handle different hosts where any of
NO_DOT_IN_LABEL, NO_DOLLAR_IN_LABEL or __USER_LABEL_PREFIX__ may
be defined
- not 100% it's safe to change DECL_NAME to DECL_ASSEMBLER_NAME in
cgraph_node::create_virtual_clone, but I've attempted to reduce
some code duplication
- lto-partition.c: privatize_symbol_name_1 *does* need numbered
names
- but cold sections don't
- Expecting an IDENTIFIER_NODE in clone_function_name_1 avoids
unreliable string pointer use as pointed out in the first review
- renamed clone_function_name_1 and clone_function_name to
numbered_clone_function_name_1 and numbered_clone_function_name to
clarify purpose and discourage future unintended uses
Also bootstrapped and regtested.
- Michael
On 2018-08-02 03:05 PM, Michael Ploujnikov wrote:
> On 2018-08-01 06:37 AM, Richard Biener wrote:
>> On Tue, Jul 31, 2018 at 7:40 PM Michael Ploujnikov
>> wrote:
>>>
>>> On 2018-07-26 01:27 PM, Michael Ploujnikov wrote:
On 2018-07-24 09:57 AM, Michael Ploujnikov wrote:
> On 2018-07-20 06:05 AM, Richard Biener wrote:
>>> /* Return a new assembler name for a clone with SUFFIX of a decl named
>>> NAME. */
>>> @@ -521,14 +521,13 @@ tree
>>> clone_function_name_1 (const char *name, const char *suffix)
>>
>> pass this function the counter to use
>>
>>> {
>>>size_t len = strlen (name);
>>> - char *tmp_name, *prefix;
>>> + char *prefix;
>>>
>>>prefix = XALLOCAVEC (char, len + strlen (suffix) + 2);
>>>memcpy (prefix, name, len);
>>>strcpy (prefix + len + 1, suffix);
>>>prefix[len] = symbol_table::symbol_suffix_separator ();
>>> - ASM_FORMAT_PRIVATE_NAME (tmp_name, prefix, clone_fn_id_num++);
>>
>> and keep using ASM_FORMAT_PRIVATE_NAME here. You need to change
>> the lto/lto-partition.c caller (just use zero as counter).
>>
>>> - return get_identifier (tmp_name);
>>> + return get_identifier (prefix);
>>> }
>>>
>>> /* Return a new assembler name for a clone of DECL with SUFFIX. */
>>> @@ -537,7 +536,17 @@ tree
>>> clone_function_name (tree decl, const char *suffix)
>>> {
>>>tree name = DECL_ASSEMBLER_NAME (decl);
>>> - return clone_function_name_1 (IDENTIFIER_POINTER (name), suffix);
>>> + const char *decl_name = IDENTIFIER_POINTER (name);
>>> + char *numbered_name;
>>> + unsigned int *suffix_counter;
>>> + if (!clone_fn_ids) {
>>> +/* Initialize the per-function counter hash table if this is the
>>> first call */
>>> +clone_fn_ids = hash_map::create_ggc (64);
>>> + }
>>
>> I still do not like throwing memory at the problem in this way for the
>> little benefit
>> this change provides :/
>>
>> So no approval from me at this point...
>>
>> Richard.
>
> Can you give me an idea of the memory constraints that are involved?
>
> The highest memory usage increase that I could find was when compiling
> a source file (from Linux) with roughly 10,000 functions. It showed a 2kB
> increase over the before-patch use of 6936kB which is barely 0.03%.
>
> Using a single counter can result in more confusing namespacing when
> you have .bar.clone.4 despite there only being 3 clones of .bar.
>
> From a practical point of view this change is helpful to anyone
> diffing binary output such as forensic analysts, Debian Reproducible
> Builds or even someone validating compiler output (before and after an
> input
> source patch). The extra changes that this patch alleviates are a
> distraction and could even be misleading. For example, applying a
> source patch to the same Linux source produces the following binary
> diff before my change:
>
> --- /tmp/output.o.objdump
> +++ /tmp/patched-output.o.objdump
> @@ -1,5 +1,5 @@
>
> -/tmp/uverbs_cmd/output.o: file format elf32-i386
> +/tmp/uverbs_cmd/patched-output.o: file format elf32-i386
>
>
> Disassembly of section .text.get_order:
> @@ -265,12 +265,12 @@
> 3: e9 fc ff ff ff jmp4
> 4: R_386_PC32 .text.put_uobj_read
>
> -Disassembly of section .text.trace_kmalloc.constprop.3:
> +Disassembly of section .text.trace_kmalloc.constprop.4:
>
> - :
> + :
> 0: 83 3d 04 00 00 00 00cmpl $0x0,0x4
> 2: R_386_32 __tracepoint_kmalloc
> - 7: 74 34 je 3d
>
> + 7: 74 34 je 3d
>
> 9: 55 push %ebp
> a: 89 cd mov%ecx,%ebp
>