[clang] [llvm] [Clang] Correct __builtin_dynamic_object_size for subobject types (PR #78526)

2024-01-25 Thread Bill Wendling via cfe-commits

bwendling wrote:

> > ```c
> > struct x {
> > int a;
> > char foo[2][40];
> > int b;
> > int c;
> > };
> > 
> > size_t f(struct x *p, int idx) {
> > return __builtin_dynamic_object_size(>foo[idx], 1);
> > }
> > ```
> 
> If I'm following correctly, the return here is 0, 40, or 80, depending on the 
> value of idx? That's not a constant, but the computation is entirely 
> syntactic; it doesn't matter what "p" actually points to. So clang can lower 
> the builtin itself. Currently it doesn't, I think, because all the relevant 
> code is in ExprConstant, but the code could be adapted.

Right. That's what I want to add to the front-end.

> The problem, really, is that we can't easily extend that approach to stuff 
> like the following:
> 
> ```c
> size_t f(struct x *p, int idx) {
> char *c = >foo[idx];
> return __builtin_dynamic_object_size(c, 1);
> }
> ```

Yup! I've been forbidden from doing this in the back-end, so I have to jump 
through hoops now and do partial solutions and hope that it works for most 
people and that when we get it wrong it doesn't hurt security (spoilers: it 
will).

https://github.com/llvm/llvm-project/pull/78526
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [Clang] Correct __builtin_dynamic_object_size for subobject types (PR #78526)

2024-01-24 Thread Bill Wendling via cfe-commits

bwendling wrote:

> > maybe we could add the subtype as part of the llvm.objectsize intrinsic and 
> > use that instead of grappling with the whole object's type
> 
> I'm not sure I follow; if you know the object's type, doesn't that mean you 
> also know its size?

Not necessarily. If you have something like this:

```c
struct x {
int a;
char foo[2][40];
int b;
int c;
};

size_t f(struct x *p, int idx) {
return __builtin_dynamic_object_size(>foo[idx], 1);
}
```

But in general, it would be tricky for something like:

```c
__builtin_dynamic_object_size(&((char *)p)[idx], 1);
```

which I'm not sure if that's something GCC can handle. The documentation says 
that if the pointer can point to multiple objects, then we can return the whole 
object value (or something to that affect).

> > (I don't readily know of any transformation that changes a structure's 
> > layout. Does it exist?)
> 
> Any such transform has to reason about all the uses, so the llvm.objectsize 
> call itself would prevent the transform from happening.

Bon! :-)

https://github.com/llvm/llvm-project/pull/78526
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [Clang] Correct __builtin_dynamic_object_size for subobject types (PR #78526)

2024-01-23 Thread Bill Wendling via cfe-commits

bwendling wrote:

It matches my understanding too. There are more issues with `__bdos` that arose 
due to this discussion and related discussions with the GCC maintainers.

I'm exploring some ways of fixing this issue:

* As @efriedma-quic mentioned, generating the size calculation in the front end 
(thus eliminating the need for the intrinsic) would be difficult in the general 
case, but may be possible for some simpler ones. One benefit of generating the 
calculation instead of the intrinsic is a potential win for inlining.

* If the main issue is that we don't know the subtype information, maybe we 
could add the subtype as part of the `llvm.objectsize` intrinsic and use that 
instead of grappling with the whole object's type. I'm unsure if this will work 
correctly for all cases though. (I don't readily know of any transformation 
that changes a structure's layout. Does it exist?) It's similar to the 
`llvm.memory.region.decl` proposal, but less intrusive.

https://github.com/llvm/llvm-project/pull/78526
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [Clang] Correct __builtin_dynamic_object_size for subobject types (PR #78526)

2024-01-23 Thread Nikita Popov via cfe-commits

nikic wrote:

Thanks for summarizing, that matches my understanding.

As to how to implement this "properly", that would probably be 
https://discourse.llvm.org/t/exploring-the-effects-and-uses-of-the-memory-region-declaration-intrinsic/72756.
 The memory.region.decl intrinsics effectively encode subobjects. However, 
doing this has some fairly substantial externalities.

https://github.com/llvm/llvm-project/pull/78526
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [Clang] Correct __builtin_dynamic_object_size for subobject types (PR #78526)

2024-01-23 Thread Eli Friedman via cfe-commits

efriedma-quic wrote:

Trying to summarize my understanding here:

- Using the type of an alloca in LLVM IR is wrong, for a variety of reasons.  
(At this point, the type of an alloca is basically just leftover junk from 
before the opaque pointer transition; I expect that we'll kill it off 
completely at some point.)
- There's some dispute over what it means to compute the size of a subobject 
"conservatively", in particular what's supposed to happen if we can't identify 
a subobject.
- clang returns significantly worse results than gcc in many cases because we 
don't encode subobject information into LLVM IR, and there's no IR between 
clang ASTs and LLVM IR we can use for this sort of analysis.  Encoding the 
information into LLVM IR (metadata on allocas, or something like that) might be 
feasible, but tricky to design well.  Doing dataflow analysis directly on the 
clang AST is extremely complicated and bug-prone because we don't directly 
encode control flow.
- Both gcc and LLVM compute object sizes after optimization in some cases; it's 
impossible to completely replicate that behavior because it depends on 
unrelated optimizations driven by heuristics.  But that's not really relevant 
for the cases we're discussing here.

https://github.com/llvm/llvm-project/pull/78526
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [Clang] Correct __builtin_dynamic_object_size for subobject types (PR #78526)

2024-01-22 Thread Siddhesh Poyarekar via cfe-commits

siddhesh wrote:

> Perhaps we need clarification on what GCC means by "may point to multiple 
> objects" in this instance. To me that means either "get me the size of the 
> largest of these multiple objects" or "size of the smallest." In my eyes, 
> that means pointing to a union field.
> 

It's not just a union field, it could literally point to one of many potential 
objects, e.g.:

```
  struct A *obj1 = ...;
  char *obj2 = ...;
  int *obj3 = ...;

  void *ptr = cond1 == satisfied ? (cond2 == satisfied ? obj1 : obj2) : obj3;

  return __builtin_object_size (ptr, 1);
```
Here, `__builtin_object_size` will return the maximum estimate it can compute 
among the three potential objects that `ptr` could point to.  
`__builtin_dynamic_object_size` is special in that it can handle this case, 
i.e. return an expression that will compute the size of the object returned 
through that condition but I retained the 'estimate' wording for the 
specification in that documentation because of reasons that have already been 
touched upon in this discussion.

In general, we want to allow returning a conservative estimate whenever 
possible and not fail. This is a deliberate design decision because the key 
usage of `__builtin_object_size` (and consequently, 
`__builtin_dynamic_object_size`) is exploit mitigation and we're much better 
off providing a conservative return value than completely bailing out. In the 
above case, returning the whole object size where the subobject size is 
unavailable is better than bailing out because while it cannot do the ideal 
thing of protecting the precise bounds, it at least bounds any potential 
overflows to the whole object., thus limiting what can be done in an exploit 
that uses this overflow.

I obviously have no comments on the patch itself, but hopefully this gives 
enough context to interpret the GCC implementation and/or documentation. It 
would be ideal to always return a precise object size expression, but falling 
back to an estimate is better than simply bailing out.

https://github.com/llvm/llvm-project/pull/78526
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [Clang] Correct __builtin_dynamic_object_size for subobject types (PR #78526)

2024-01-19 Thread Bill Wendling via cfe-commits

bwendling wrote:

> > My answer for the question "what's the semantics of GCC's builtin X?" has 
> > always been "whatever GCC does." It's the best we can rely upon. But then 
> > we get into situations like this, where you and @nikic have one 
> > interpretation of their documentation and I have another. I can point to 
> > their behavior to back up my claim, but in the end it's probably not 
> > exactly clear even to GCC.
> 
> [@nikic 
> demonstrated](https://github.com/llvm/llvm-project/pull/78526#issuecomment-1900439850)
>  that our current behavior is already compatible with GCC's behavior. If 
> GCC's behavior is the spec, then we are allowed to return 48 rather than only 
> 40 or -1 (or presumably 0 if `argc` is out of bounds) for the original 
> example, because in some cases GCC does so.

No, he exposed a fluke in their implementation. I gave examples of where clang 
gets it wrong, and yet I apparently haven't "proven" anything according to this 
logic?

> > My concern is that we want to use this for code hardening. Without precise 
> > object sizes, we're hampered in our goal. The unfortunate reality is that 
> > we can only get that size via these `__builtin_[dynamic_]object_size` 
> > functions.
> 
> That's a totally understandable desire, but I think it's not realistic to 
> expect precise _sub_object sizes in the same cases that GCC can provide them, 
> due to the different architectural choices in the two compilers. If we had a 
> mid-level IR for Clang that still had frontend information, we could do 
> better by evaluating BOS there, so maybe that's one long term path forward to 
> consider. And in the short term, while there are cases where we won't be able 
> to match GCC, I think Clang should do better than it currently does in the 
> frontend, 

We're not talking about something like inline asm, where the internals of their 
register allocator is more-or-less exposed to the user, and therefore it's next 
to impossible for us to replicate all of it. This feature is far simpler, and 
should be something we can determine based on all of the information we're 
given.

> specifically in cases like the one in the bug report where there's an obvious 
> better answer that doesn't require any sophisticated analysis to discover.

What "obvious better answer" is that?

https://github.com/llvm/llvm-project/pull/78526
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [Clang] Correct __builtin_dynamic_object_size for subobject types (PR #78526)

2024-01-19 Thread Bill Wendling via cfe-commits

bwendling wrote:

> > Perhaps we need clarification on what GCC means by "may point to multiple 
> > objects" in this instance. To me that means either "get me the size of the 
> > largest of these multiple objects" or "size of the smallest." In my eyes, 
> > that means pointing to a union field.
> 
> Per @nikic's example, it seems reasonably clear to me that GCC's intended 
> semantics are to get either the best upper bound or the best lower bound that 
> the compiler is able to compute. I mean sure, we can ask, and it does no harm 
> to do so, but it is not reasonable to expect that a quantity like this coming 
> from the vagaries of GCC's implementation details will be exactly the same 
> for all examples when computed by a different implementation. (It's not even 
> the same across optimization levels in GCC.)

I completely agree that using GCC's implementation and documentation is 
definitely lacking. That's the pitfalls of relying upon features added not by 
the standards committee, but by compiler developers. They're likely to be buggy 
as hell for unforeseen corner cases (despite some developers being on the 
standards committee).

My answer for the question "what's the semantics of GCC's builtin X?" has 
always been "whatever GCC does." It's the best we can rely upon. But then we 
get into situations like this, where you and @nikic have one interpretation of 
their documentation and I have another. I can point to their behavior to back 
up my claim, but in the end it's probably not exactly clear even to GCC.

My concern is that we want to use this for code hardening. Without precise 
object sizes, we're hampered in our goal. The unfortunate reality is that we 
can only get that size via these `__builtin_[dynamic_]object_size` functions. 
Thankfully, Apple has a way to help alleviate some of these issues, which 
they're push out soon-ish.

> > I know that we lose precise struct information going to LLVM IR. If that's 
> > what's needed here, there are ways to pass this information along. We 
> > retain this information via DWARF. We could use similar metadata for this 
> > instance. Would that be acceptable?
> 
> In theory, yes, we could preserve enough information in metadata to compute 
> this in the middle-end. But we would need to emit a _lot_ of metadata, just 
> on the off-chance that after optimization we happen to have a 
> builtin_object_size query that points to each object that we emit, so in 
> practice I don't think there's any chance we can do this. We can't use DWARF, 
> because it won't necessarily be available (and typically won't be available 
> in the interesting case where we find the object size only after 
> optimization), and in any case, the presence or absence of DWARF isn't 
> supposed to affect the executable code. Also, we'd need to annotate things 
> that simply don't exist at all in the LLVM IR:
> 
> ```
> typedef struct X { int a, b; } X;
> int f(void) {
>   X *p = malloc(sizeof(X));
>   int *q = >a;
>   return __builtin_object_size(q, 1);
> }
> ```
> 
> Here, `f` ideally would return 4, but at the LLVM IR level, `p` and `q` are 
> identical values and the `>a` operation is a no-op. In cases like this, 
> the best we can realistically do is to return 8.

The sub-object for `>a` and even `>b` is `struct X`, not the integers 
themselves. If you want that, you'll have to use casts: `&((char *)p->b)[2];`. 
(I had to take care to get that correct.) So `f` should return `8` (note it's 
likely to get `8` from the `alloc_size` attribute on `malloc` in your example).

For the record, I didn't mean that we *should* use DWARF. I'm not that much of 
a masochist. :-)

https://github.com/llvm/llvm-project/pull/78526
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [Clang] Correct __builtin_dynamic_object_size for subobject types (PR #78526)

2024-01-19 Thread Richard Smith via cfe-commits

zygoloid wrote:

> Perhaps we need clarification on what GCC means by "may point to multiple 
> objects" in this instance. To me that means either "get me the size of the 
> largest of these multiple objects" or "size of the smallest." In my eyes, 
> that means pointing to a union field.

Per @nikic's example, it seems reasonably clear to me that GCC's intended 
semantics are to get either the best upper bound or the best lower bound that 
the compiler is able to compute. I mean sure, we can ask, and it does no harm 
to do so, but it is not reasonable to expect that a quantity like this coming 
from the vagaries of GCC's implementation details will be exactly the same for 
all examples when computed by a different implementation. (It's not even the 
same across optimization levels in GCC.)

> I know that we lose precise struct information going to LLVM IR. If that's 
> what's needed here, there are ways to pass this information along. We retain 
> this information via DWARF. We could use similar metadata for this instance. 
> Would that be acceptable?

In theory, yes, we could preserve enough information in metadata to compute 
this in the middle-end. But we would need to emit a *lot* of metadata, just on 
the off-chance that after optimization we happen to have a builtin_object_size 
query that points to each object that we emit, so in practice I don't think 
there's any chance we can do this. We can't use DWARF, because it won't 
necessarily be available (and typically won't be available in the interesting 
case where we find the object size only after optimization), and in any case, 
the presence or absence of DWARF isn't supposed to affect the executable code. 
Also, we'd need to annotate things that simply don't exist at all in the LLVM 
IR:

```
typedef struct X { int a, b; } X;
int f(void) {
  X *p = malloc(sizeof(X));
  int *q = >a;
  return __builtin_object_size(q, 1);
}
```

Here, `f` ideally would return 4, but at the LLVM IR level, `p` and `q` are 
identical values and the `>a` operation is a no-op. In cases like this, the 
best we can realistically do is to return 8.

https://github.com/llvm/llvm-project/pull/78526
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [Clang] Correct __builtin_dynamic_object_size for subobject types (PR #78526)

2024-01-19 Thread Bill Wendling via cfe-commits

bwendling wrote:

> Taking a step back, while this patch is not the right direction, we can and 
> should do better for the original example. Probably the best way to do that 
> is to analyze the operand to `__builtin_[dynamic_]object_size` in the 
> frontend and compute a better bound based on the form of the expression. It 
> looks like it should be feasible to produce a tighter bound for an 
> array-to-pointer-decay expression like `f.bar[argc]` in subobject mode, as:
> 
> * `select llvm.is.constant(argc) and (argc < 0 or argc >= 2), 0, 
> sizeof(f.bar[argc])` for the non-dynamic case, and just
> * `select (argc < 0 or argc >= 2), 0, sizeof(f.bar[argc])` for the dynamic 
> case.
> 
> A possibly simpler alternative would be for the frontend to pass an upper 
> bound on the result to the LLVM builtin in mode 1, so Clang could say "I know 
> the result will never be more than 40" and LLVM could provide either that 
> size or the complete object size, whichever is smaller. That wouldn't give as 
> tight a bound for the argc == 2 case, though.

It might be possible to do something like this. If retaining precise type 
information from the front-end is the sticking issue, it could be as simple as 
casting this pointer top the type of the sub-object. I'll see what I can do.

https://github.com/llvm/llvm-project/pull/78526
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [Clang] Correct __builtin_dynamic_object_size for subobject types (PR #78526)

2024-01-19 Thread Bill Wendling via cfe-commits

bwendling wrote:

> > > But mode 0 and 1 are in general asking for an upper bound on the 
> > > accessible bytes (that is, an N so any.access beyond N bytes is 
> > > definitely out of bounds), so it seems to me that returning -1 is 
> > > strictly worse than returning 48.
> > 
> > 
> > It's the second bit that controls whether it's an upper bound or lower 
> > bound that's returned, not the least significant bit.
> 
> Right, that's what I said: modes 0 and 1 ask for an upper bound. (Modes 2 and 
> 3 ask for a lower bound.)

Perhaps we need clarification on what GCC means by "may point to multiple 
objects" in this instance. To me that means either "get me the size of the 
largest of these multiple objects" or "size of the smallest." In my eyes, that 
means pointing to a union field.

> > We're trying to implement a GNU builtin, and the only defined semantics we 
> > have to go on are GNU's documentation. I can't see how we can deviate from 
> > their documentation unless it's to say "we can't determine this value" and 
> > so return `-1` instead of an answer that might be wildly wrong and 
> > potentially cause a memory leak of some sort.
> 
> We're not really deviating from the documented rules. By the time we get to 
> LLVM IR lowering of the builtin, we have lost the precise frontend 
> information. But we know the pointer might point into the complete object 
> (where it would be able to write to 48 bytes) or to some subobject of that 
> complete object (where it would be able to write to 48 bytes or less). 
> Therefore the correct answer to return is 48.
> 
> In the same way, it would be correct to return 48 here:
> 
> ```
> char a[48];
> char b[40];
> bool always_false = false; // happens to never be modified
> int n = __builtin_dynamic_object_size(always_false ? a  : b, 1);
> ```
> 
> ... though we miscompile this and return 40 regardless of whether we're in 
> upper bound or lower bound mode. :-(
> 
> > In my made-up example, if we said, "Yes you can write up to 48 bytes into 
> > `p->failed_devs[argc]`, then a user may overwrite the two fields after 
> > `field_devs`. If we return `-1`, they'll have to take the "slow", but 
> > ideally more secure, route.
> 
> No, that is not what we are saying when we return 48. We're saying "a write 
> past 48 bytes is definitely bad". If you want a lower bound on the number of 
> bytes that you can write, then you need to use mode 2 or 3 instead.

But in the example, writing up to 48 bytes *is* definitely bad. And as I 
mentioned, we disagree on what's meant by "upper" and "lower" bounds in the 
case of a pointer to a non-union field. For me, that's managed by the first bit.

I know that we lose precise struct information going to LLVM IR. If that's 
what's needed here, there are ways to pass this information along. We retain 
this information via DWARF. We could use similar metadata for this instance. 
Would that be acceptable?

https://github.com/llvm/llvm-project/pull/78526
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [Clang] Correct __builtin_dynamic_object_size for subobject types (PR #78526)

2024-01-19 Thread Richard Smith via cfe-commits

zygoloid wrote:

Taking a step back, while this patch is not the right direction, we can and 
should do better for the original example. Probably the best way to do that is 
to analyze the operand to `__builtin_[dynamic_]object_size` in the frontend and 
compute a better bound based on the form of the expression. It looks like it 
should be feasible to produce a tighter bound for an array-to-pointer-decay 
expression like `f.bar[argc]` in subobject mode, as:

- `select llvm.is.constant(argc) and (argc < 0 or argc >= 2), 0, 
sizeof(f.bar[argc])` for the non-dynamic case, and just
- `select (argc < 0 or argc >= 2), 0, sizeof(f.bar[argc])` for the dynamic case.

A possibly simpler alternative would be for the frontend to pass an upper bound 
on the result to the LLVM builtin in mode 1, so Clang could say "I know the 
result will never be more than 40" and LLVM could provide either that size or 
the complete object size, whichever is smaller. That wouldn't give as tight a 
bound for the argc == 2 case, though.

https://github.com/llvm/llvm-project/pull/78526
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [Clang] Correct __builtin_dynamic_object_size for subobject types (PR #78526)

2024-01-19 Thread Bill Wendling via cfe-commits

bwendling wrote:

> > For unions, clang will use the type of the union member with the largest 
> > size as the alloca type, regardless of which union member is active. I 
> > haven't tried, but your patch will probably compute the subobject size 
> > based on that arbitrarily picked member, rather than the one being accessed.
> 
> Another concrete case where using the IR type won't work is arrays whose 
> initializer is mostly-zeroes. For example, for
> 
> ```
> int arr[20] = {[5] = 6};
> ```
> 
> we give `arr` the IR type `<{ i32, i32, i32, i32, i32, i32, [14 x i32] }>` so 
> that we can use a `zeroinitializer` to fill the trailing portion of the array.

My patch only works with structs.

https://github.com/llvm/llvm-project/pull/78526
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [Clang] Correct __builtin_dynamic_object_size for subobject types (PR #78526)

2024-01-19 Thread Richard Smith via cfe-commits

zygoloid wrote:

> For unions, clang will use the type of the union member with the largest size 
> as the alloca type, regardless of which union member is active. I haven't 
> tried, but your patch will probably compute the subobject size based on that 
> arbitrarily picked member, rather than the one being accessed.

Another concrete case where using the IR type won't work is arrays whose 
initializer is mostly-zeroes. For example, for
```
int arr[20] = {[5] = 6};
```
we give `arr` the IR type `<{ i32, i32, i32, i32, i32, i32, [14 x i32] }>` so 
that we can use a `zeroinitializer` to fill the trailing portion of the array.

https://github.com/llvm/llvm-project/pull/78526
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [Clang] Correct __builtin_dynamic_object_size for subobject types (PR #78526)

2024-01-18 Thread Bill Wendling via cfe-commits

bwendling wrote:

> Perhaps according to the GCC documentation as written. But mode 0 and 1 are 
> in general asking for an upper bound on the accessible bytes (that is, an N 
> so any.access beyond N bytes is definitely out of bounds), so it seems to me 
> that returning -1 is strictly worse than returning 48.

It's the second bit that controls whether it's an upper bound or lower bound 
that's returned, not the least significant bit.

> Do you have a use case for which -1 is a better answer?

I'm sure one could be constructed (for instance, someone wanting to check if 
it's okay to `strcpy` a string of size `48` to a pointer to `failed_devs[argc]` 
in the example above), but that's not really the point.

We're trying to implement a GNU builtin, and the only defined semantics we have 
to go on are GNU's documentation. I can't see how we can deviate from their 
documentation unless it's to say "we can't determine this value" and so return 
`-1` instead of an answer that might be wildly wrong and potentially cause a 
memory leak of some sort. In my made-up example, if we said, "Yes you can write 
up to 48 bytes into `p->failed_devs[argc]`, then a user may overwrite the two 
fields after `field_devs`. If we return `-1`, they'll have to take the "slow", 
but ideally more secure, route.

Lastly, returning `48` in the made-up case is different from the case when an 
immediate is used rather than a variable, which is certainly unexpected. We 
could document that we can't support GNU's implementation exactly, but I don't 
think that's good enough.

>From what I can tell, GNU's idea of what constitutes a "closest surrounding 
>sub-object" is one of a handful of things: the structure containing the field, 
>the array being pointed to with a struct, or the field being pointed into. 
>Example of the last one:

```
struct foo {
int a;
int b;
int c;
};

size_t foo(struct foo *p, int index) {
return __builtin_dynamic_object_size(&((char *)p->b)[idx], 1); // b is the 
sub-object.
}
```

All of these are explicit in the LLVM IR. Is the worry that they've been 
changed from some transformations? Or are there others I'm missing?

https://github.com/llvm/llvm-project/pull/78526
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [Clang] Correct __builtin_dynamic_object_size for subobject types (PR #78526)

2024-01-18 Thread Bill Wendling via cfe-commits

bwendling wrote:

> Using anything but the size and alignment of the alloca type in a way that 
> affects program semantics is illegal.

I'm sorry, but I don't understand your comment here. Could you supply some 
context?

https://github.com/llvm/llvm-project/pull/78526
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [Clang] Correct __builtin_dynamic_object_size for subobject types (PR #78526)

2024-01-18 Thread Bill Wendling via cfe-commits

bwendling wrote:

> We've discussed this before, years ago. Previously, the sticking point was: 
> how is LLVM going to know what the frontend considers the closest surrounding 
> subobject to be? The LLVM type doesn't give you that information, and it 
> seems like there's nowhere else that LLVM can get it from either, so this 
> flag ends up not being useful and the best we can do is to give a 
> conservatively-correct answer -- the complete object size if we want an upper 
> bound, or 0 if we want a lower bound.

Right now we're giving a wrong answer (see below), so that's no good. If we're 
going to err on the side of caution, then we should return the default "can't 
calculate the size" value.

When you say that we can't detect what the front-end considers the "closest 
surrounding subobject" to be, is that mostly due to corner cases or is it a 
more general concern? (Note, we're really only interested in supporting this 
for C structs. C++ structs / classes would require therapy.)

> Clang does respect the "subobject" flag if it can symbolically evaluate the 
> operand of `__builtin_object_size` sufficiently to determine which subobject 
> is being referenced. Previously we've thought that that was the best we could 
> do.

This is why the current behavior is wrong, in my opinion. The motivating 
example is below:

```
struct suspend_stats {
int success;
int fail;
int failed_freeze;
int failed_prepare;
int failed_suspend;
int failed_suspend_late;
int failed_suspend_noirq;
int failed_resume;
int failed_resume_early;
int failed_resume_noirq;
#define REC_FAILED_NUM  2
int last_failed_dev;
charfailed_devs[REC_FAILED_NUM][40]; /* offsetof(struct 
suspend_stats, failed_devs) == 44 */
int last_failed_errno;
int bar;
};

#define report(x) __builtin_printf(#x ": %zu\n", x)

int main(int argc, char *argv[])
{
struct suspend_stats foo;

report(sizeof(foo.failed_devs[1]));
report(sizeof(foo.failed_devs[argc]));
report(__builtin_dynamic_object_size(, 0));
report(__builtin_dynamic_object_size(, 1));
report(__builtin_dynamic_object_size(_freeze, 0));
report(__builtin_dynamic_object_size(foo.failed_devs[1], 0));
report(__builtin_dynamic_object_size(foo.failed_devs[1], 1));
report(__builtin_dynamic_object_size(foo.failed_devs[argc], 0));
report(__builtin_dynamic_object_size(foo.failed_devs[argc], 1));

return 0;
}
```

The output with this change is now:

```
__builtin_dynamic_object_size(, 0): 128
__builtin_dynamic_object_size(, 1): 4
__builtin_dynamic_object_size(_freeze, 0): 124
__builtin_dynamic_object_size(foo.failed_devs[1], 0): 48
__builtin_dynamic_object_size(foo.failed_devs[1], 1): 40
__builtin_dynamic_object_size(foo.failed_devs[argc], 0): 48
__builtin_dynamic_object_size(foo.failed_devs[argc], 1): 40
```

Without the change, the last line is:

```
__builtin_dynamic_object_size(foo.failed_devs[argc], 1): 48
```

Which isn't correct according to GNU's documentation. So if we can't honor the 
TYPE bit, then we should return `-1 / 0` here, right?

https://github.com/llvm/llvm-project/pull/78526
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [Clang] Correct __builtin_dynamic_object_size for subobject types (PR #78526)

2024-01-17 Thread Richard Smith via cfe-commits

https://github.com/zygoloid commented:

We've discussed this before, years ago. Previously, the sticking point was: how 
is LLVM going to know what the frontend considers the closest surrounding 
subobject to be? The LLVM type doesn't give you that information, and it seems 
like there's nowhere else that LLVM can get it from either, so this flag ends 
up not being useful and the best we can do is to give a conservatively-correct 
answer -- the complete object size if we want an upper bound, or 0 if we want a 
lower bound.

Clang does respect the "subobject" flag if it can symbolically evaluate the 
operand of `__builtin_object_size` sufficiently to determine which subobject is 
being referenced. Previously we've thought that that was the best we could do.

https://github.com/llvm/llvm-project/pull/78526
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [Clang] Correct __builtin_dynamic_object_size for subobject types (PR #78526)

2024-01-17 Thread via cfe-commits

github-actions[bot] wrote:




:warning: C/C++ code formatter, clang-format found issues in your code. 
:warning:



You can test this locally with the following command:


``bash
git-clang-format --diff 8947469ec1ad6d35b2feec0acc43d0d191514f0b 
ab13650e232b1a17cf9c7335f90a33f7eb71f741 -- clang/lib/CodeGen/CGBuiltin.cpp 
clang/lib/CodeGen/CGExpr.cpp clang/test/CodeGen/catch-undef-behavior.c 
clang/test/CodeGen/pass-object-size.c 
llvm/include/llvm/Analysis/MemoryBuiltins.h 
llvm/lib/Analysis/MemoryBuiltins.cpp llvm/lib/IR/AutoUpgrade.cpp 
llvm/lib/Target/AMDGPU/AMDGPUPromoteAlloca.cpp
``





View the diff from clang-format here.


``diff
diff --git a/llvm/lib/Analysis/MemoryBuiltins.cpp 
b/llvm/lib/Analysis/MemoryBuiltins.cpp
index 56d6313905..41b6eefffe 100644
--- a/llvm/lib/Analysis/MemoryBuiltins.cpp
+++ b/llvm/lib/Analysis/MemoryBuiltins.cpp
@@ -732,8 +732,8 @@ SizeOffsetAPInt ObjectSizeOffsetVisitor::computeImpl(Value 
*V) {
   if (!Options.WholeObjectSize && AllocaTy) {
 // At this point, SOT.Size is the size of the whole struct. However, we
 // want the size of the sub-object.
-const StructLayout  = *DL.getStructLayout(
-const_cast(AllocaTy));
+const StructLayout  =
+*DL.getStructLayout(const_cast(AllocaTy));
 
 unsigned Idx = SL.getElementContainingOffset(Offset.getLimitedValue());
 

``




https://github.com/llvm/llvm-project/pull/78526
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [Clang] Correct __builtin_dynamic_object_size for subobject types (PR #78526)

2024-01-17 Thread via cfe-commits

llvmbot wrote:



@llvm/pr-subscribers-llvm-globalisel
@llvm/pr-subscribers-backend-amdgpu

@llvm/pr-subscribers-clang-codegen

Author: Bill Wendling (bwendling)


Changes

The second argument of __builtin_dynamic_object_size controls whether it
returns the size of the whole object or the closest surrounding object.
For this struct:

  struct s {
int  foo;
char bar[2][40];
int  baz;
int  qux;
  };

  int main(int argc, char **argv) {
struct s f;

  #define report(x) fprintf(stderr, #x ": %zu\n", x)

argc = 1;
report(__builtin_dynamic_object_size(f.bar[argc], 0));
report(__builtin_dynamic_object_size(f.bar[argc], 1));
return 0;
  }

should return:

  __builtin_dynamic_object_size(f.bar[argc], 0): 48
  __builtin_dynamic_object_size(f.bar[argc], 1): 40

determined by the least significant bit of the TYPE.

Add a new parameter to the llvm.objectsize intrinsic to control which
size it should return.


---

Patch is 97.55 KiB, truncated to 20.00 KiB below, full version: 
https://github.com/llvm/llvm-project/pull/78526.diff


40 Files Affected:

- (modified) clang/lib/CodeGen/CGBuiltin.cpp (+6-1) 
- (modified) clang/lib/CodeGen/CGExpr.cpp (+3-1) 
- (modified) clang/test/CodeGen/catch-undef-behavior.c (+1-1) 
- (modified) clang/test/CodeGen/pass-object-size.c (+3-3) 
- (modified) llvm/docs/LangRef.rst (+21-13) 
- (modified) llvm/include/llvm/Analysis/MemoryBuiltins.h (+9) 
- (modified) llvm/include/llvm/IR/Intrinsics.td (+1-1) 
- (modified) llvm/lib/Analysis/MemoryBuiltins.cpp (+31-4) 
- (modified) llvm/lib/IR/AutoUpgrade.cpp (+37-6) 
- (modified) llvm/lib/Target/AMDGPU/AMDGPUPromoteAlloca.cpp (+2-2) 
- (modified) llvm/test/Analysis/CostModel/X86/free-intrinsics.ll (+4-4) 
- (modified) llvm/test/Analysis/CostModel/free-intrinsics-datalayout.ll (+4-4) 
- (modified) llvm/test/Analysis/CostModel/free-intrinsics-no_info.ll (+4-4) 
- (modified) llvm/test/Assembler/auto_upgrade_intrinsics.ll (+4-4) 
- (modified) llvm/test/Bitcode/objectsize-upgrade-7.0.ll (+2-2) 
- (modified) llvm/test/CodeGen/AArch64/GlobalISel/memcpy_chk_no_tail.ll (+2-2) 
- (modified) llvm/test/CodeGen/AArch64/memsize-remarks.ll (+16-16) 
- (modified) llvm/test/CodeGen/AMDGPU/promote-alloca-mem-intrinsics.ll (+4-4) 
- (modified) llvm/test/Other/cgscc-libcall-update.ll (+2-2) 
- (modified) llvm/test/Transforms/InferAddressSpaces/AMDGPU/debug-info.ll 
(+4-4) 
- (modified) llvm/test/Transforms/InferAddressSpaces/AMDGPU/intrinsics.ll 
(+6-6) 
- (modified) llvm/test/Transforms/InferAlignment/propagate-assume.ll (+2-2) 
- (modified) llvm/test/Transforms/Inline/call-intrinsic-objectsize.ll (+3-3) 
- (modified) llvm/test/Transforms/InstCombine/allocsize.ll (+5-5) 
- (modified) llvm/test/Transforms/InstCombine/builtin-dynamic-object-size.ll 
(+13-13) 
- (modified) llvm/test/Transforms/InstCombine/builtin-object-size-custom-dl.ll 
(+3-3) 
- (modified) 
llvm/test/Transforms/InstCombine/builtin-object-size-strdup-family.ll (+5-5) 
- (modified) llvm/test/Transforms/InstCombine/invoke.ll (+1-1) 
- (modified) llvm/test/Transforms/InstCombine/memset_chk-1.ll (+5-5) 
- (modified) llvm/test/Transforms/InstCombine/objsize.ll (+38-38) 
- (modified) llvm/test/Transforms/InstCombine/stpcpy_chk-1.ll (+4-4) 
- (modified) llvm/test/Transforms/InstCombine/strcpy_chk-1.ll (+5-5) 
- (modified) 
llvm/test/Transforms/LowerConstantIntrinsics/builtin-object-size-load.ll (+2-2) 
- (modified) 
llvm/test/Transforms/LowerConstantIntrinsics/builtin-object-size-phi.ll (+5-5) 
- (modified) 
llvm/test/Transforms/LowerConstantIntrinsics/builtin-object-size-posix-memalign.ll
 (+8-8) 
- (modified) 
llvm/test/Transforms/LowerConstantIntrinsics/constant-intrinsics.ll (+2-2) 
- (modified) 
llvm/test/Transforms/LowerConstantIntrinsics/crash-on-large-allocas.ll (+2-2) 
- (modified) llvm/test/Transforms/LowerConstantIntrinsics/objectsize_basic.ll 
(+18-19) 
- (modified) llvm/test/Transforms/LowerConstantIntrinsics/stale-worklist-phi.ll 
(+2-2) 
- (modified) llvm/test/Transforms/SCCP/issue59602-assume-like-call-users.ll 
(+5-5) 


``diff
diff --git a/clang/lib/CodeGen/CGBuiltin.cpp b/clang/lib/CodeGen/CGBuiltin.cpp
index de48b15645b1ab..513a007aa009b7 100644
--- a/clang/lib/CodeGen/CGBuiltin.cpp
+++ b/clang/lib/CodeGen/CGBuiltin.cpp
@@ -1103,12 +1103,17 @@ CodeGenFunction::emitBuiltinObjectSize(const Expr *E, 
unsigned Type,
   Function *F =
   CGM.getIntrinsic(Intrinsic::objectsize, {ResType, Ptr->getType()});
 
+  // If the least significant bit is clear, objects are whole variables. If
+  // it's set, a closest surrounding subobject is considered the object a
+  // pointer points to.
+  Value *WholeObj = Builder.getInt1((Type & 1) == 0);
+
   // LLVM only supports 0 and 2, make sure that we pass along that as a 
boolean.
   Value *Min = Builder.getInt1((Type & 2) != 0);
   // For GCC compatibility, __builtin_object_size treat NULL as unknown size.