> On Oct 26, 2023, at 1:21 AM, Jakub Jelinek <ja...@redhat.com> wrote:
> 
> On Wed, Oct 25, 2023 at 07:03:43PM +0000, Qing Zhao wrote:
>> For the code generation impact:
>> 
>> turning the original x.buf
>> into a builtin function call
>> __builtin_with_access_and_size(x.buf, x.L, -1)
>> 
>> might inhibit some optimizations from happening before the builtin is
>> evaluated into object size info (phase .objsz1).  I guess there might be
>> some performance impact.
>> 
>> However, if we mark this builtin as PURE, NOTHROW, etc., would the negative
>> performance impact be reduced to a minimum?
> 
> You can't drop it during objsz1 pass though, otherwise __bdos wouldn't
> be able to figure out the dynamic sizes in case of normal (non-early)
> inlining - caller takes address of a counted_by array, passes it down
> to callee which is only inlined late and uses __bdos, or callee takes address
> and returns it and caller uses __bdos, etc. - so it would need to be objsz2.
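
For reference, the late-inlining scenario described above can be sketched in C roughly as follows. This is only an illustration under assumptions: the struct name `S` and the helper names `callee_size` and `caller_takes_address` are made up, the `COUNTED_BY` and `BDOS` macros are local fallbacks for compilers without the attribute or the dynamic builtin, and whether the callee is inlined (and when) is up to the compiler.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Use counted_by only where the compiler knows it; otherwise expand to nothing. */
#if defined(__has_attribute)
#  if __has_attribute(counted_by)
#    define COUNTED_BY(m) __attribute__((counted_by(m)))
#  endif
#endif
#ifndef COUNTED_BY
#  define COUNTED_BY(m)
#endif

/* Fall back to the static builtin if the dynamic one is unavailable. */
#if defined(__has_builtin)
#  if __has_builtin(__builtin_dynamic_object_size)
#    define BDOS(p, t) __builtin_dynamic_object_size(p, t)
#  endif
#endif
#ifndef BDOS
#  define BDOS(p, t) __builtin_object_size(p, t)
#endif

/* Hypothetical struct mirroring the x.L / x.buf names used in this thread. */
struct S {
    size_t L;
    char buf[] COUNTED_BY(L);
};

/* Callee: only sees a plain pointer. Unless it is inlined into the
   caller, the size is unknown here and the builtin yields (size_t)-1. */
size_t callee_size(const char *p)
{
    return BDOS(p, 1);
}

/* Caller: takes the address of the counted_by array and passes it down. */
size_t caller_takes_address(size_t n)
{
    struct S *x = malloc(sizeof *x + n);
    if (x == NULL)
        return (size_t)-1;
    x->L = n;
    size_t s = callee_size(x->buf);
    /* Either "unknown" or a bound that covers the n elements. */
    assert(s == (size_t)-1 || s >= n);
    free(x);
    return s;
}
```

If the size information attached to `x->buf` were dropped before `callee_size` gets inlined, the builtin in the callee would have nothing left to work with, which is the concern raised above.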

I guess I didn’t explain this very clearly before. Let me try again:

My understanding is that there is an “early_objsz” phase followed by a later 
“objsz1” phase for -O[1|2|3]. 
For -Og, there is “early_objsz” followed by a later “objsz2”. 

So the “objsz1” I mentioned (for the -O[1|2|3] case) should be the same as 
the “objsz2” you mentioned above? :-)
In both cases it is the second objsz phase. 

By the second objsz phase, I believe that all inlining (both early 
inlining and IPA inlining) has already been applied?
> 
> And while the builtin (or if it is an internal detail rather than user
> accessible builtin an internal function)

Okay, I will use an “internal function” instead of a “builtin function”. 

> could even be const/nothrow/leaf if
> the arguments contain the loads from the 2 structure fields, I'm afraid it
> will still have huge code generation impact, prevent tons of pre-IPA
> optimizations.  And it will need some work to handle it properly during
> inlining heuristics, because in GIMPLE the COMPONENT_REF loads aren't gimple
> values, so it wouldn't be just the builtin/internal-fn call to be ignored,
> but also the count load from memory.
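
To make the point about the extra load concrete, here is a rough C-level model. It is not the actual GIMPLE or the proposed internal function: `model_with_access_and_size` is a hypothetical stand-in that simply returns its first argument, and the struct `S` with fields `L` and `buf` just reuses the names from this thread.

```c
#include <stddef.h>

/* Hypothetical struct; field names follow the x.L / x.buf example. */
struct S {
    size_t L;
    char buf[];
};

/* Stand-in for the proposed internal function: it passes the reference
   through unchanged and only carries the size and access mode along. */
static inline char *model_with_access_and_size(char *ref, size_t size,
                                               int access)
{
    (void)size;
    (void)access;
    return ref;
}

/* Today: taking the address of the array needs no load from memory. */
char *get_buf_plain(struct S *x)
{
    return x->buf;
}

/* With the wrapper: the count has to be loaded into a temporary first,
   so there is one extra statement (the load of x->L) besides the call. */
char *get_buf_wrapped(struct S *x)
{
    size_t count = x->L;  /* separate load; not an operand of the call */
    return model_with_access_and_size(x->buf, count, -1);
}
```

Both functions return the same pointer; the difference is only the additional load and call that the inliner's size estimates would see unless they are taught to ignore them.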

Are you worried that the potential additional loads will change the inlining 
decision, since the inlining heuristic depends on the number of loads from 
memory? 

In addition to the number of loads, the number of instructions and the number 
of calls in the function might increase too. Will these have an impact on the 
inlining decision? 

Besides the inlining decision, is there any impact on other IPA optimizations? 

thanks.

Qing


> 
>       Jakub
> 
