> On Jan 13, 2021, at 9:10 AM, Richard Biener <rguent...@suse.de> wrote:
>
> On Wed, 13 Jan 2021, Qing Zhao wrote:
>
>>
>>
>>> On Jan 13, 2021, at 1:39 AM, Richard Biener <rguent...@suse.de> wrote:
>>>
>>> On Tue, 12 Jan 2021, Qing Zhao wrote:
>>>
>>>> Hi,
>>>>
>>>> Just checking in to see whether you have any comments or suggestions on this.
>>>>
>>>> FYI, I have been continuing with the Approach D implementation since last week:
>>>>
>>>> D. Adding calls to .DEFERRED_INIT during gimplification, then expanding
>>>> .DEFERRED_INIT to real initialization during expand. Adjusting the
>>>> uninitialized pass to handle the new references to “.DEFERRED_INIT”.
>>>>
>>>> The remaining work for Approach D was:
>>>>
>>>> ** complete the implementation of -ftrivial-auto-var-init=pattern;
>>>> ** complete the implementation of uninitialized warnings maintenance work
>>>> for D.
>>>>
>>>> I have completed the uninitialized warnings maintenance work for D,
>>>> and partially finished the -ftrivial-auto-var-init=pattern
>>>> implementation.
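To make the difference between the two modes concrete, here is a minimal sketch in plain C of what zero and pattern initialization amount to for a VLA, emulated with an explicit memset. The helper names and the 0xFE fill byte are assumptions made for illustration only (this thread does not fix the pattern value), and the compiler would emit the equivalent stores itself rather than rely on hand-written code.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical fill byte: the pattern value is not specified in this thread. */
#define PATTERN_BYTE 0xFE

/* Fills an n-byte VLA with fill_byte, as the compiler-inserted
   initialization would, and returns the sum of the bytes so the
   effect is observable. */
static unsigned long sum_after_fill(size_t n, int fill_byte)
{
    unsigned char buf[n > 0 ? n : 1];   /* VLA; avoid a zero-length array */
    memset(buf, fill_byte, sizeof buf); /* sizeof is evaluated at run time */

    unsigned long sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += buf[i];
    return sum;
}

/* Emulates -ftrivial-auto-var-init=zero for the VLA. */
unsigned long sum_zero_filled(size_t n)
{
    return sum_after_fill(n, 0);
}

/* Emulates -ftrivial-auto-var-init=pattern for the VLA. */
unsigned long sum_pattern_filled(size_t n)
{
    return sum_after_fill(n, PATTERN_BYTE);
}
```

The VLA case differs from fixed-size variables only in that the size of the fill is known at run time, which is why it shows up as a separate work item.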
>>>>
>>>> The following work remains for Approach D:
>>>>
>>>> ** -ftrivial-auto-var-init=pattern for VLA;
>>>> ** add a new variable attribute:
>>>> __attribute__((uninitialized))
>>>> The marked variable is intentionally left uninitialized for performance
>>>> reasons.
>>>> ** add complete test cases.
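As an illustration of where such an attribute would be used, here is a small C sketch. The macro name SKIP_AUTO_INIT and the function are hypothetical; the guard with __has_attribute is there so the code still compiles on a compiler that does not know the attribute.

```c
#include <stddef.h>

/* Hypothetical convenience macro: expands to the attribute when the
   compiler supports it, and to nothing otherwise. */
#if defined(__has_attribute)
#  if __has_attribute(uninitialized)
#    define SKIP_AUTO_INIT __attribute__((uninitialized))
#  endif
#endif
#ifndef SKIP_AUTO_INIT
#  define SKIP_AUTO_INIT
#endif

/* Returns 0^2 + 1^2 + ... + (n-1)^2, with n clamped to the buffer size. */
unsigned long sum_of_squares(size_t n)
{
    /* Every element that is read below is written first, so filling the
       whole buffer up front under -ftrivial-auto-var-init would be
       wasted work. The attribute opts this one variable out. */
    unsigned long scratch[1024] SKIP_AUTO_INIT;

    if (n > 1024)
        n = 1024;
    for (size_t i = 0; i < n; i++)
        scratch[i] = (unsigned long)i * i;

    unsigned long sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += scratch[i];
    return sum;
}
```

The attribute only makes sense on variables whose reads are provably preceded by writes; anywhere else it reintroduces the problem the option is meant to remove.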
>>>>
>>>>
>>>> Please let me know if you have any objections to my current decision to
>>>> implement approach D.
>>>
>>> Did you do any analysis on how stack usage and code size are changed
>>> with approach D?
>>
>> I did the code size comparison (I will provide the data in another
>> email). With this data, D works better than A in general. (This is
>> actually a surprise to me.)
>>
>> But I have not done the stack usage comparison. I am not sure how to collect
>> the stack usage data; do you have any suggestions?
>
> There is -fstack-usage you could use, then of course watching
> the stack segment at runtime.

I can do this for CPU2017 to collect the stack usage data and report back.

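For collecting the numbers, -fstack-usage writes a .su file per translation unit with one line per function. Below is a minimal C sketch for pulling the byte count out of such a line. It assumes the line has three tab-separated fields (function location and name, size in bytes, qualifier such as static or dynamic); the function name su_line_bytes and the sample file name demo.c are made up for illustration.

```c
#include <stdlib.h>
#include <string.h>

/* Extracts the stack size in bytes from one .su line, for example
   "demo.c:3:5:main\t48\tstatic". Returns -1 if the line does not have
   the assumed "name<TAB>bytes<TAB>qualifier" shape. */
long su_line_bytes(const char *line)
{
    /* The last tab separates the size from the qualifier. */
    const char *last = strrchr(line, '\t');
    if (last == NULL || last == line)
        return -1;

    /* Walk back to the start of the size field. */
    const char *p = last;
    while (p > line && p[-1] != '\t')
        p--;
    if (p == line)
        return -1;              /* only one tab: no size field */

    char *end;
    long bytes = strtol(p, &end, 10);
    if (end == p || end != last || bytes < 0)
        return -1;              /* size field is not a plain number */
    return bytes;
}
```

Parsing from the end of the line keeps the sketch working when the function name itself contains colons or spaces. Summing or taking the maximum of these values with and without -ftrivial-auto-var-init would give the static part of the comparison; the run-time stack depth still has to be observed separately.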
> I'm mostly concerned about
> stack-limited "processes" such as the linux kernel which I think
> is a primary target of your work.

I don’t have any experience building the Linux kernel.
Do we have to collect data for the Linux kernel at this time? Is the CPU2017
data not enough?

Qing
>
> Richard.
>
>>
>>> How does compile-time behave (we could gobble up
>>> lots of .DEFERRED_INIT calls I guess)?
>> I can collect this data too and report it later.
>>
>> Thanks.
>>
>> Qing
>>>
>>> Richard.
>>>
>>>> Thanks a lot for your help.
>>>>
>>>> Qing
>>>>
>>>>
>>>>> On Jan 5, 2021, at 1:05 PM, Qing Zhao via Gcc-patches
>>>>> <gcc-patches@gcc.gnu.org> wrote:
>>>>>
>>>>> Hi,
>>>>>
>>>>> This is an update for our previous discussion.
>>>>>
>>>>> 1. I implemented the following two different implementations in the
>>>>> latest upstream gcc:
>>>>>
>>>>> A. Adding real initialization during gimplification, without maintaining
>>>>> the uninitialized warnings.
>>>>>
>>>>> D. Adding calls to .DEFERRED_INIT during gimplification, then expanding
>>>>> .DEFERRED_INIT to real initialization during expand. Adjusting the
>>>>> uninitialized pass to handle the new references to “.DEFERRED_INIT”.
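Why the warnings are at stake in approach A can be seen in a small C sketch. The function below is hypothetical and only shows, at source level, roughly what adding real initialization during gimplification amounts to for the zero mode; it is not the compiler's actual output.

```c
/* Source as the programmer wrote it:
 *
 *     int pick(int flag)
 *     {
 *         int x;            // no initializer
 *         if (flag)
 *             x = 42;
 *         return x;         // may read x uninitialized
 *     }
 *
 * Without auto-initialization the compiler can warn about the read of x
 * on the path where flag is zero.
 */

/* Roughly what approach A makes of it: the initialization is an ordinary
   assignment, so to later passes x looks initialized on every path and
   there is nothing left to warn about. */
int pick_after_approach_a(int flag)
{
    int x = 0;      /* real initialization added early */
    if (flag)
        x = 42;
    return x;
}
```

Approach D avoids this by keeping the inserted initialization recognizable (a call to .DEFERRED_INIT) until expand, so the uninitialized pass can still treat the variable as not initialized by the user.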
>>>>>
>>>>> Note that in this initial implementation:
>>>>> ** I ONLY implemented -ftrivial-auto-var-init=zero; the implementation of
>>>>> -ftrivial-auto-var-init=pattern is not done yet. Therefore, the
>>>>> performance data covers only -ftrivial-auto-var-init=zero.
>>>>>
>>>>> ** I added a temporary option, -fauto-var-init-approach=A|B|C|D, to
>>>>> choose implementation A or D for the runtime performance study.
>>>>> ** I have not finished the uninitialized warnings maintenance work for D
>>>>> (that might take more time than I expected).
>>>>>
>>>>> 2. I collected runtime data for CPU2017 on an x86 machine with this new
>>>>> gcc for the following 3 cases:
>>>>>
>>>>> no: default (-g -O2 -march=native)
>>>>> A: default + -ftrivial-auto-var-init=zero -fauto-var-init-approach=A
>>>>> D: default + -ftrivial-auto-var-init=zero -fauto-var-init-approach=D
>>>>>
>>>>> I then computed the slowdown for both A and D as follows:
>>>>>
>>>>> benchmarks         A / no    D / no
>>>>>
>>>>> 500.perlbench_r     1.25%     1.25%
>>>>> 502.gcc_r           0.68%     1.80%
>>>>> 505.mcf_r           0.68%     0.14%
>>>>> 520.omnetpp_r       4.83%     4.68%
>>>>> 523.xalancbmk_r     0.18%     1.96%
>>>>> 525.x264_r          1.55%     2.07%
>>>>> 531.deepsjeng_r    11.57%    11.85%
>>>>> 541.leela_r         0.64%     0.80%
>>>>> 557.xz_r           -0.41%    -0.41%
>>>>>
>>>>> 507.cactuBSSN_r     0.44%     0.44%
>>>>> 508.namd_r          0.34%     0.34%
>>>>> 510.parest_r        0.17%     0.25%
>>>>> 511.povray_r       56.57%    57.27%
>>>>> 519.lbm_r           0.00%     0.00%
>>>>> 521.wrf_r          -0.28%    -0.37%
>>>>> 526.blender_r      16.96%    17.71%
>>>>> 527.cam4_r          0.70%     0.53%
>>>>> 538.imagick_r       2.40%     2.40%
>>>>> 544.nab_r           0.00%    -0.65%
>>>>>
>>>>> avg                 5.17%     5.37%
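The avg row can be reproduced from the per-benchmark figures. The C sketch below assumes avg is the plain arithmetic mean of the 19 percentages (the mail does not say which mean was used); the array and function names are made up for illustration, and the values are copied from the table above.

```c
#include <stddef.h>

/* Slowdown in percent relative to the "no" baseline, in table order:
   nine integer benchmarks followed by ten floating-point benchmarks. */
static const double slowdown_a[] = {
    1.25, 0.68, 0.68, 4.83, 0.18, 1.55, 11.57, 0.64, -0.41,
    0.44, 0.34, 0.17, 56.57, 0.00, -0.28, 16.96, 0.70, 2.40, 0.00
};
static const double slowdown_d[] = {
    1.25, 1.80, 0.14, 4.68, 1.96, 2.07, 11.85, 0.80, -0.41,
    0.44, 0.34, 0.25, 57.27, 0.00, -0.37, 17.71, 0.53, 2.40, -0.65
};

/* Arithmetic mean of n values; returns 0 for an empty array. */
double mean(const double *values, size_t n)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += values[i];
    return n > 0 ? sum / (double)n : 0.0;
}

double mean_slowdown_a(void)
{
    return mean(slowdown_a, sizeof slowdown_a / sizeof slowdown_a[0]);
}

double mean_slowdown_d(void)
{
    return mean(slowdown_d, sizeof slowdown_d / sizeof slowdown_d[0]);
}
```

Under that assumption the means come out at about 5.17 for A and 5.37 for D, matching the avg row. Note that 511.povray_r and 526.blender_r dominate both means.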
>>>>>
>>>>> From the above data, we can see that, in general, the runtime slowdowns
>>>>> for implementations A and D are similar for individual benchmarks.
>>>>>
>>>>> Several benchmarks show a significant slowdown with the newly added
>>>>> initialization for both A and D, for example 511.povray_r, 526.blender_r,
>>>>> and 531.deepsjeng_r. I will study in more detail which kinds of new
>>>>> initializations introduce such slowdowns.
>>>>>
>>>>> From the study so far, I think that approach D should be good enough for
>>>>> our final implementation. So I will try to finish approach D with the
>>>>> following remaining work:
>>>>>
>>>>> ** complete the implementation of -ftrivial-auto-var-init=pattern;
>>>>> ** complete the implementation of uninitialized warnings maintenance
>>>>> work for D.
>>>>>
>>>>>
>>>>> Let me know if you have any comments and suggestions on my current and
>>>>> future work.
>>>>>
>>>>> Thanks a lot for your help.
>>>>>
>>>>> Qing
>>>>>
>>>>>> On Dec 9, 2020, at 10:18 AM, Qing Zhao via Gcc-patches
>>>>>> <gcc-patches@gcc.gnu.org> wrote:
>>>>>>
>>>>>> The following are the approaches I will implement and compare:
>>>>>>
>>>>>> Our final goal is to keep the uninitialized warning and minimize the
>>>>>> run-time performance cost.
>>>>>>
>>>>>> A. Adding real initialization during gimplification, without maintaining
>>>>>> the uninitialized warnings.
>>>>>> B. Adding real initialization during gimplification, marking it with
>>>>>> “artificial_init”. Adjusting the uninitialized pass, maintaining the
>>>>>> annotation, and making sure the real init is not deleted from the fake
>>>>>> init.
>>>>>> C. Marking the DECL for an uninitialized auto variable as
>>>>>> “no_explicit_init” during gimplification, maintaining this
>>>>>> “no_explicit_init” bit until after pass_late_warn_uninitialized, or until
>>>>>> pass_expand, then adding real initialization for all DECLs that are
>>>>>> marked with “no_explicit_init”.
>>>>>> D. Adding .DEFERRED_INIT during gimplification, then expanding
>>>>>> .DEFERRED_INIT to real initialization during expand. Adjusting the
>>>>>> uninitialized pass to handle the new references to “.DEFERRED_INIT”.
>>>>>>
>>>>>>
>>>>>> Of the above, approach A will be the one with the minimum run-time
>>>>>> cost, so it will be the baseline for the performance comparison.
>>>>>>
>>>>>> I will then implement approach D. It is expected to have the most
>>>>>> run-time overhead of the approaches listed above, but its implementation
>>>>>> should be the cleanest among B, C, and D. Let’s see how much more
>>>>>> performance overhead this approach incurs. If the data is good, maybe we
>>>>>> can avoid the effort of implementing B and C.
>>>>>>
>>>>>> If the performance of D is not good, I will implement B or C at that
>>>>>> time.
>>>>>>
>>>>>> Let me know if you have any comments or suggestions.
>>>>>>
>>>>>> Thanks.
>>>>>>
>>>>>> Qing
>>>>>
>>>>
>>>>
>>>
>>> --
>>> Richard Biener <rguent...@suse.de>
>>> SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
>>> Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)
>>
>>
>
> --
> Richard Biener <rguent...@suse.de>
> SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
> Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)