Re: [PATCH] tree-optimization/120929: Limit MEM_REF handling to .ACCESS_WITH_SIZE

Qing Zhao Mon, 28 Jul 2025 12:37:31 -0700


> On Jul 28, 2025, at 12:48, Jakub Jelinek <ja...@redhat.com> wrote:
> 
> On Wed, Jul 23, 2025 at 05:59:22PM +0000, Qing Zhao wrote:
>> struct S {
>>  int n;
>>  int *p __attribute__((counted_by(n)));
>> } *f;
>> int *g;
>> void setup (int **ptr, int count)
>> {
>> *ptr = __builtin_malloc (sizeof (int) * count);
>>  g = *ptr;
>> }
>> int main ()
>> {
>> f = __builtin_malloc (sizeof (struct S));
>> setup (&f->p, 10);
> 
> This is neither read nor write, it is taking an address of f->p.
> The above case is definitely questionable because nothing really initializes
> f->n, so any later uses of f->p would be invalid unless it is initialized
> first.


f->p is actually initialized by the call to the routine “setup” through the
1st argument. So, later uses of f->p after the call to “setup” in the routine
main should be valid.

However, without inter-procedural analysis and data-flow information, we
cannot determine whether a later read of f->p is valid or not.

That’s the major concern I have.
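
To make the concern concrete, here is a minimal sketch of the same pattern as a
self-contained program fragment. It is an illustration only, not part of the
patch. COUNTED_BY is a placeholder macro I introduce so the code builds with
any C compiler, and init_via_address is a hypothetical driver name. Unlike the
quoted test case, the sketch also stores f->n, since the count would otherwise
be indeterminate.

```c
#include <assert.h>
#include <stdlib.h>

/* Assumption: COUNTED_BY expands to nothing here so the sketch builds
   anywhere.  With a compiler that accepts counted_by on pointer fields it
   could expand to __attribute__((counted_by(x))).  */
#define COUNTED_BY(x)

struct S {
  int n;
  int *p COUNTED_BY (n);
};

/* Writes through PTR: the caller's f->p is initialized here, so the caller
   itself never contains a visible store to f->p.  */
static void setup (int **ptr, int count)
{
  *ptr = malloc (sizeof (int) * count);
}

/* Hypothetical driver: returns the count that accompanies the pointer.  */
static int init_via_address (void)
{
  struct S *f = malloc (sizeof (struct S));
  assert (f != NULL);

  setup (&f->p, 10);   /* address of f->p is taken: neither read nor write */
  f->n = 10;           /* the count must be stored before any use of f->p */

  int *q = f->p;       /* a read of f->p, valid only because setup ran */
  int count = (q != NULL) ? f->n : -1;

  free (q);
  free (f);
  return count;
}
```

Looking only at init_via_address, nothing shows that f->p was ever stored to.
That information is available only if setup is inlined or analyzed
inter-procedurally.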

> Anyway, the choices are not mark with .ACCESS_WITH_SIZE taking address of
> such pointers,

So, you mean: do not generate a call to .ACCESS_WITH_SIZE for f->p when its
address is taken, i.e., do not generate a call to .ACCESS_WITH_SIZE for &f->p?

> or mark it with another mode and handle it differently later.

What do you mean here? I am a little confused.

> At least for the start, I'd strongly suggest the former.

Okay, I can do that.  -:)

> With the above setup mess, it will always be just best effort, if it is
> inline, bos pass can see what it has been initialized to and associated
> with, if not, then it will simply not know it has a counted_by attribute.

Yes, the tradeoff for selectively generating .ACCESS_WITH_SIZE for f->p
depending on its context will be: we cannot get the counted_by information
reliably through the call to .ACCESS_WITH_SIZE in the middle end.

> 
>> C FE has no such capability to determine whether the f->p is a read or a 
>> write.  Is this right?
> 
> C certainly can determine that, otherwise e.g. the -Wunused-but-set-*
> warnings wouldn't work.

Thanks a lot for the info. I just checked how the C FE handles the
-Wunused-but-set-* options, and I see that DECL_READ_P and TREE_USED are used
for this purpose. I will check further.


> If there is an lvalue to rvalue conversion, it was read, so you can attach
> .ACCESS_WITH_SIZE to that if it is COMPONENT_REF with pointer type with
> counted_by attribute.
> If there is not an lvalue to rvalue conversion, it is write or something
> else.

What does “something else” include? Taking its address, and what else?

Can we just generate the call to .ACCESS_WITH_SIZE inside the routine
“convert_lvalue_to_rvalue” when the “EXP” is a COMPONENT_REF that represents
a pointer field with the counted_by attribute?

Will this routine include all the READs from a pointer field? 
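
For reference, a small sketch of how the C language rule separates these
contexts at the source level. It only illustrates the language rule, under the
assumption that the front end follows it closely. It says nothing about which
front-end routines are actually reached. classify_uses is a hypothetical name,
and the counted_by attribute is left out so that any C compiler accepts the
code.

```c
#include <assert.h>
#include <stddef.h>

struct S {
  int n;
  int *p;   /* imagine counted_by(n) here; omitted for portability */
};

/* Uses s->p in several contexts and returns how many of them are reads in
   the lvalue-to-rvalue sense.  */
static int classify_uses (struct S *s, int *buf)
{
  int reads = 0;

  s->p = buf;                  /* write: left operand of '=', no conversion */
  int **addr = &s->p;          /* address taken: operand of unary '&' */
  size_t sz = sizeof (s->p);   /* operand of sizeof: not evaluated */

  int *q = s->p;               /* read: lvalue-to-rvalue conversion */
  reads++;
  int first = s->p[0];         /* read: s->p is converted, then subscripted */
  reads++;

  /* The operands of ++ and -- are also exempt from the conversion in the C
     standard, although s->p++ both reads and writes the field.  */

  (void) addr; (void) sz; (void) q; (void) first;
  return reads;
}
```

So at the language level, “something else” covers at least the operands of
unary &, sizeof, ++ and --, in addition to the left operand of an assignment.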

> So, one possibility is e.g. to look for mark_exp_read calls.

> Another is try default_function_array_read_conversion and a few other spots.

Is there any possibility that we cannot find all the places to generate the
call to .ACCESS_WITH_SIZE for pointer fields that are read from?

> 
> Or another option might be don't mark even the loads with .ACCESS_WITH_SIZE
> when pointer type, tweak the content of the counted_by attribute (its
> argument) instead on the FIELD_DECLs such that the middle-end could figure
> it out and just handle it on the bos pass side.  

Inserting the call to .ACCESS_WITH_SIZE into the IL serves two major purposes
in our initial design:

struct S {
 int n;
 int *p __attribute__((counted_by(n)));
} *f;

For
 f->p = malloc (sz * sizeof (int));
 f->n = sz;
 __builtin_dynamic_object_size (f->p, 0);

1. Encode the implicit data dependence on “f->n” from
    “__builtin_dynamic_object_size (f->p, 0)” into the IL, to avoid incorrect
    reordering of “f->n = sz” and “__builtin_dynamic_object_size (f->p, 0)”
    before the object size phase.

Before any middle end optimization, we should generate:

  f->p = malloc (sz * sizeof (int));
  f->n = sz;
  tmp = .ACCESS_WITH_SIZE (f->p, &f->n, …);
  __builtin_dynamic_object_size (tmp, 0);

2. Carry the size information explicitly in the IL. Then the object size phase
    will consistently get the size information from the call to
    .ACCESS_WITH_SIZE without any other hack.

Due to the above 1, we should generate the call to .ACCESS_WITH_SIZE BEFORE any 
middle-end optimization. 
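
The two purposes can be modeled at user level with an ordinary function. This
is only a sketch under my own assumptions: .ACCESS_WITH_SIZE is an internal
function that cannot be called from source code, and model_access_with_size
and size_after_stores are hypothetical names. The additional arguments that
are elided above are not modeled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct S {
  int n;
  int *p;   /* stands in for a field declared with counted_by(n) */
};

/* Hypothetical model: returns PTR unchanged, but also takes the address of
   the count, so the load of the count is tied to this call (purpose 1) and
   the size travels with the returned pointer (purpose 2).  */
static int *model_access_with_size (int *ptr, const int *count_addr,
                                    size_t *size_out)
{
  *size_out = (size_t) *count_addr * sizeof (int);
  return ptr;
}

/* Mirrors the sequence above and returns the size seen at the query.  */
static size_t size_after_stores (int sz)
{
  struct S *f = malloc (sizeof (struct S));
  assert (f != NULL);

  f->p = malloc (sz * sizeof (int));
  f->n = sz;

  size_t size = 0;
  int *tmp = model_access_with_size (f->p, &f->n, &size);

  free (tmp);
  free (f);
  return size;
}
```

Because &f->n is an argument of the call, the store “f->n = sz” cannot be
moved past it without changing the visible result, which is the dependence
that purpose 1 wants to expose.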

> Though if counted_by
> argument is not just an identifier of a field in the same structure but
> complex expression, trying to reintroduce it into the IL might be too
> challenging at that point.

When the counted_by field is extended to a complex expression, the issue with
the implicit data dependence embedded in __builtin_dynamic_object_size still
exists (and might be even worse), so adding the expression explicitly into the
IL through an argument to .ACCESS_WITH_SIZE is still necessary.

For example:

int number;

struct S {
 int n;
 int *p __attribute__((counted_by_exp (int n; number * n)));
};

f->p = malloc (10 * 4 * sizeof (int));
number = 10;
f->n = 4;
__builtin_dynamic_object_size (f->p, 0);

In the above, there is no data dependence between “number = 10”, “f->n = 4” and
“__builtin_dynamic_object_size (f->p, 0)” in the IL, so compiler optimizations
are free to reorder them.

In order to avoid such incorrect compiler optimization, we should insert the
call to .ACCESS_WITH_SIZE to make the implicit data dependence explicit.

However, we need to figure out how to pass the expressions to the call to 
.ACCESS_WITH_SIZE. 
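
To illustrate the hazard with the expression form, the sketch below evaluates
the count “number * n” once before and once after the two stores. It is plain
C under my own assumptions: count_now and count_before_and_after are
hypothetical helpers, and the proposed counted_by_exp attribute is only
mentioned in a comment because no compiler accepts it today.

```c
#include <assert.h>
#include <stddef.h>

static int number;   /* the global used by the count expression */

struct S {
  int n;
  int *p;   /* imagine the proposed counted_by_exp (number * n) here */
};

/* Hypothetical helper: the element count the expression yields right now.  */
static size_t count_now (const struct S *s)
{
  return (size_t) number * (size_t) s->n;
}

/* Evaluates the count before and after the two stores from the example.  */
static void count_before_and_after (size_t *before, size_t *after)
{
  struct S s = { 0, NULL };
  number = 0;

  *before = count_now (&s);   /* what a query moved above the stores sees */
  number = 10;
  s.n = 4;
  *after = count_now (&s);    /* what the source order requires */
}
```

A size query that is moved above either store sees a different count, so both
“number” and “f->n” would need to be visible as inputs of the inserted call.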

Thanks.

Qing



> 
> Jakub
>
