Am Donnerstag, dem 26.10.2023 um 17:48 +0200 schrieb Richard Biener:
> 
> > Am 26.10.2023 um 16:58 schrieb Qing Zhao <qing.z...@oracle.com>:
> > 
> > 
> > 
> > > On Oct 26, 2023, at 4:56 AM, Richard Biener <richard.guent...@gmail.com> 
> > > wrote:
> > > 
> > > > On Thu, Oct 26, 2023 at 7:22 AM Jakub Jelinek <ja...@redhat.com> wrote:
> > > > 
> > > > On Wed, Oct 25, 2023 at 07:03:43PM +0000, Qing Zhao wrote:
> > > > > For the code generation impact:
> > > > > 
> > > > > turning the original  x.buf
> > > > > to a builtin function call
> > > > > __builtin_with_access_and_size(x,buf, x.L,-1)
> > > > > 
> > > > > might inhibit some optimizations from happening before the builtin is
> > > > > evaluated into object size info (phase  .objsz1).  I guess there 
> > > > > might be
> > > > > some performance impact.
> > > > > 
> > > > > However, if we mark this builtin as PURE, NOTRROW, etc, then the 
> > > > > negative
> > > > > performance impact will be reduced to minimum?
> > > > 
> > > > You can't drop it during objsz1 pass though, otherwise __bdos wouldn't
> > > > be able to figure out the dynamic sizes in case of normal (non-early)
> > > > inlining - caller takes address of a counted_by array, passes it down
> > > > to callee which is only inlined late and uses __bdos, or callee takes 
> > > > address
> > > > and returns it and caller uses __bdos, etc. - so it would need to be 
> > > > objsz2.
> > > > 
> > > > And while the builtin (or if it is an internal detail rather than user
> > > > accessible builtin an internal function) could be even 
> > > > const/nothrow/leaf if
> > > > the arguments contain the loads from the structure 2 fields, I'm afraid 
> > > > it
> > > > will still have huge code generation impact, prevent tons of pre-IPA
> > > > optimizations.  And it will need some work to handle it properly during
> > > > inlining heuristics, because in GIMPLE the COMPONENT_REF loads aren't 
> > > > gimple
> > > > values, so it wouldn't be just the builtin/internal-fn call to be 
> > > > ignored,
> > > > but also the count load from memory.
> > > 
> > > I think we want to track the value, not the "memory" in the builtin call,
> > > so GIMPLE would be
> > > 
> > > _1 = x.L;
> > > .. = __builtin_with_access_and_size (&x.buf, _1, -1);
> > 
> > Before adding the __builtin_with_access_and_size, the code is:
> > 
> > &x.buf
> > 
> > After inserting the built-in, it becomes:
> > 
> > _1 = x.L;
> > __builtin_with_access_and_size (&x.buf, _1, -1).
> > 
> > 
> > So, the # of total instructions, the # of LOADs, and the # of calls will 
> > all be increased.
> > There will be impact to the inlining decision definitely.
> 
> Note we have to make sure, if x is a pointer and we want to instrument 
> &x->buf that we
> Can dereference x.  Possibly doing
> 
> _1 = x ? x->Len : -1;
> 
> I’m not sure the C standard makes accessing x->Len unconditionally not 
> undefined behavior when &x->buf is computed.  Definitely it’s a violation of 
> the abstract machine of Len is volatile qualified (but we can reject such 
> counted_by or instantiations as volatile qualified types).

I believe it is implicit UB to do &x->buf if there is
no object *x because the wording assumes the existence
of an object.  In that case accessing x->L should
be fine too.  

In practice the access may trap  for other reasons 
(mprotect etc.),  but I guess this is acceptable,
but should probably be documented...

We might need the x?  to not run into trouble with
those offsetof  implementations written using null
pointer.  Although in this case maybe one could
hope that the load will get optimized anyway ...

Martin

> 
> Richard 
> 
> > 
> > > 
> > > also please make sure to use an internal function for
> > > __builtin_with_access_and_size,
> > > I don't think we want to expose this to users - it's an implementation 
> > > detail.
> > 
> > Okay, will define it as an internal function (add it to internal-fn.def). 
> > -:)
> > 
> > Qing
> > > 
> > > Richard.
> > > 
> > > > 
> > > >       Jakub
> > > > 
> > 

Reply via email to