[clang] [llvm] [Clang] Correct __builtin_dynamic_object_size for subobject types (PR #78526)
bwendling wrote: > > ```c > > struct x { > > int a; > > char foo[2][40]; > > int b; > > int c; > > }; > > > > size_t f(struct x *p, int idx) { > > return __builtin_dynamic_object_size(>foo[idx], 1); > > } > > ``` > > If I'm following correctly, the return here is 0, 40, or 80, depending on the > value of idx? That's not a constant, but the computation is entirely > syntactic; it doesn't matter what "p" actually points to. So clang can lower > the builtin itself. Currently it doesn't, I think, because all the relevant > code is in ExprConstant, but the code could be adapted. Right. That's what I want to add to the front-end. > The problem, really, is that we can't easily extend that approach to stuff > like the following: > > ```c > size_t f(struct x *p, int idx) { > char *c = >foo[idx]; > return __builtin_dynamic_object_size(c, 1); > } > ``` Yup! I've been forbidden from doing this in the back-end, so I have to jump through hoops now and do partial solutions and hope that it works for most people and that when we get it wrong it doesn't hurt security (spoilers: it will). https://github.com/llvm/llvm-project/pull/78526 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [Clang] Correct __builtin_dynamic_object_size for subobject types (PR #78526)
bwendling wrote: > > maybe we could add the subtype as part of the llvm.objectsize intrinsic and > > use that instead of grappling with the whole object's type > > I'm not sure I follow; if you know the object's type, doesn't that mean you > also know its size? Not necessarily. If you have something like this: ```c struct x { int a; char foo[2][40]; int b; int c; }; size_t f(struct x *p, int idx) { return __builtin_dynamic_object_size(>foo[idx], 1); } ``` But in general, it would be tricky for something like: ```c __builtin_dynamic_object_size(&((char *)p)[idx], 1); ``` which I'm not sure if that's something GCC can handle. The documentation says that if the pointer can point to multiple objects, then we can return the whole object value (or something to that affect). > > (I don't readily know of any transformation that changes a structure's > > layout. Does it exist?) > > Any such transform has to reason about all the uses, so the llvm.objectsize > call itself would prevent the transform from happening. Bon! :-) https://github.com/llvm/llvm-project/pull/78526 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [Clang] Correct __builtin_dynamic_object_size for subobject types (PR #78526)
bwendling wrote: It matches my understanding too. There are more issues with `__bdos` that arose due to this discussion and related discussions with the GCC maintainers. I'm exploring some ways of fixing this issue: * As @efriedma-quic mentioned, generating the size calculation in the front end (thus eliminating the need for the intrinsic) would be difficult in the general case, but may be possible for some simpler ones. One benefit of generating the calculation instead of the intrinsic is a potential win for inlining. * If the main issue is that we don't know the subtype information, maybe we could add the subtype as part of the `llvm.objectsize` intrinsic and use that instead of grappling with the whole object's type. I'm unsure if this will work correctly for all cases though. (I don't readily know of any transformation that changes a structure's layout. Does it exist?) It's similar to the `llvm.memory.region.decl` proposal, but less intrusive. https://github.com/llvm/llvm-project/pull/78526 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [Clang] Correct __builtin_dynamic_object_size for subobject types (PR #78526)
nikic wrote: Thanks for summarizing, that matches my understanding. As to how to implement this "properly", that would probably be https://discourse.llvm.org/t/exploring-the-effects-and-uses-of-the-memory-region-declaration-intrinsic/72756. The memory.region.decl intrinsics effectively encode subobjects. However, doing this has some fairly substantial externalities. https://github.com/llvm/llvm-project/pull/78526 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [Clang] Correct __builtin_dynamic_object_size for subobject types (PR #78526)
efriedma-quic wrote: Trying to summarize my understanding here: - Using the type of an alloca in LLVM IR is wrong, for a variety of reasons. (At this point, the type of an alloca is basically just leftover junk from before the opaque pointer transition; I expect that we'll kill it off completely at some point.) - There's some dispute over what it means to compute the size of a subobject "conservatively", in particular what's supposed to happen if we can't identify a subobject. - clang returns significantly worse results than gcc in many cases because we don't encode subobject information into LLVM IR, and there's no IR between clang ASTs and LLVM IR we can use for this sort of analysis. Encoding the information into LLVM IR (metadata on allocas, or something like that) might be feasible, but tricky to design well. Doing dataflow analysis directly on the clang AST is extremely complicated and bug-prone because we don't directly encode control flow. - Both gcc and LLVM compute object sizes after optimization in some cases; it's impossible to completely replicate that behavior because it depends on unrelated optimizations driven by heuristics. But that's not really relevant for the cases we're discussing here. https://github.com/llvm/llvm-project/pull/78526 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [Clang] Correct __builtin_dynamic_object_size for subobject types (PR #78526)
siddhesh wrote: > Perhaps we need clarification on what GCC means by "may point to multiple > objects" in this instance. To me that means either "get me the size of the > largest of these multiple objects" or "size of the smallest." In my eyes, > that means pointing to a union field. > It's not just a union field, it could literally point to one of many potential objects, e.g.: ``` struct A *obj1 = ...; char *obj2 = ...; int *obj3 = ...; void *ptr = cond1 == satisfied ? (cond2 == satisfied ? obj1 : obj2) : obj3; return __builtin_object_size (ptr, 1); ``` Here, `__builtin_object_size` will return the maximum estimate it can compute among the three potential objects that `ptr` could point to. `__builtin_dynamic_object_size` is special in that it can handle this case, i.e. return an expression that will compute the size of the object returned through that condition but I retained the 'estimate' wording for the specification in that documentation because of reasons that have already been touched upon in this discussion. In general, we want to allow returning a conservative estimate whenever possible and not fail. This is a deliberate design decision because the key usage of `__builtin_object_size` (and consequently, `__builtin_dynamic_object_size`) is exploit mitigation and we're much better off providing a conservative return value than completely bailing out. In the above case, returning the whole object size where the subobject size is unavailable is better than bailing out because while it cannot do the ideal thing of protecting the precise bounds, it at least bounds any potential overflows to the whole object., thus limiting what can be done in an exploit that uses this overflow. I obviously have no comments on the patch itself, but hopefully this gives enough context to interpret the GCC implementation and/or documentation. It would be ideal to always return a precise object size expression, but falling back to an estimate is better than simply bailing out. https://github.com/llvm/llvm-project/pull/78526 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [Clang] Correct __builtin_dynamic_object_size for subobject types (PR #78526)
bwendling wrote: > > My answer for the question "what's the semantics of GCC's builtin X?" has > > always been "whatever GCC does." It's the best we can rely upon. But then > > we get into situations like this, where you and @nikic have one > > interpretation of their documentation and I have another. I can point to > > their behavior to back up my claim, but in the end it's probably not > > exactly clear even to GCC. > > [@nikic > demonstrated](https://github.com/llvm/llvm-project/pull/78526#issuecomment-1900439850) > that our current behavior is already compatible with GCC's behavior. If > GCC's behavior is the spec, then we are allowed to return 48 rather than only > 40 or -1 (or presumably 0 if `argc` is out of bounds) for the original > example, because in some cases GCC does so. No, he exposed a fluke in their implementation. I gave examples of where clang gets it wrong, and yet I apparently haven't "proven" anything according to this logic? > > My concern is that we want to use this for code hardening. Without precise > > object sizes, we're hampered in our goal. The unfortunate reality is that > > we can only get that size via these `__builtin_[dynamic_]object_size` > > functions. > > That's a totally understandable desire, but I think it's not realistic to > expect precise _sub_object sizes in the same cases that GCC can provide them, > due to the different architectural choices in the two compilers. If we had a > mid-level IR for Clang that still had frontend information, we could do > better by evaluating BOS there, so maybe that's one long term path forward to > consider. And in the short term, while there are cases where we won't be able > to match GCC, I think Clang should do better than it currently does in the > frontend, We're not talking about something like inline asm, where the internals of their register allocator is more-or-less exposed to the user, and therefore it's next to impossible for us to replicate all of it. This feature is far simpler, and should be something we can determine based on all of the information we're given. > specifically in cases like the one in the bug report where there's an obvious > better answer that doesn't require any sophisticated analysis to discover. What "obvious better answer" is that? https://github.com/llvm/llvm-project/pull/78526 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [Clang] Correct __builtin_dynamic_object_size for subobject types (PR #78526)
bwendling wrote: > > Perhaps we need clarification on what GCC means by "may point to multiple > > objects" in this instance. To me that means either "get me the size of the > > largest of these multiple objects" or "size of the smallest." In my eyes, > > that means pointing to a union field. > > Per @nikic's example, it seems reasonably clear to me that GCC's intended > semantics are to get either the best upper bound or the best lower bound that > the compiler is able to compute. I mean sure, we can ask, and it does no harm > to do so, but it is not reasonable to expect that a quantity like this coming > from the vagaries of GCC's implementation details will be exactly the same > for all examples when computed by a different implementation. (It's not even > the same across optimization levels in GCC.) I completely agree that using GCC's implementation and documentation is definitely lacking. That's the pitfalls of relying upon features added not by the standards committee, but by compiler developers. They're likely to be buggy as hell for unforeseen corner cases (despite some developers being on the standards committee). My answer for the question "what's the semantics of GCC's builtin X?" has always been "whatever GCC does." It's the best we can rely upon. But then we get into situations like this, where you and @nikic have one interpretation of their documentation and I have another. I can point to their behavior to back up my claim, but in the end it's probably not exactly clear even to GCC. My concern is that we want to use this for code hardening. Without precise object sizes, we're hampered in our goal. The unfortunate reality is that we can only get that size via these `__builtin_[dynamic_]object_size` functions. Thankfully, Apple has a way to help alleviate some of these issues, which they're push out soon-ish. > > I know that we lose precise struct information going to LLVM IR. If that's > > what's needed here, there are ways to pass this information along. We > > retain this information via DWARF. We could use similar metadata for this > > instance. Would that be acceptable? > > In theory, yes, we could preserve enough information in metadata to compute > this in the middle-end. But we would need to emit a _lot_ of metadata, just > on the off-chance that after optimization we happen to have a > builtin_object_size query that points to each object that we emit, so in > practice I don't think there's any chance we can do this. We can't use DWARF, > because it won't necessarily be available (and typically won't be available > in the interesting case where we find the object size only after > optimization), and in any case, the presence or absence of DWARF isn't > supposed to affect the executable code. Also, we'd need to annotate things > that simply don't exist at all in the LLVM IR: > > ``` > typedef struct X { int a, b; } X; > int f(void) { > X *p = malloc(sizeof(X)); > int *q = >a; > return __builtin_object_size(q, 1); > } > ``` > > Here, `f` ideally would return 4, but at the LLVM IR level, `p` and `q` are > identical values and the `>a` operation is a no-op. In cases like this, > the best we can realistically do is to return 8. The sub-object for `>a` and even `>b` is `struct X`, not the integers themselves. If you want that, you'll have to use casts: `&((char *)p->b)[2];`. (I had to take care to get that correct.) So `f` should return `8` (note it's likely to get `8` from the `alloc_size` attribute on `malloc` in your example). For the record, I didn't mean that we *should* use DWARF. I'm not that much of a masochist. :-) https://github.com/llvm/llvm-project/pull/78526 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [Clang] Correct __builtin_dynamic_object_size for subobject types (PR #78526)
zygoloid wrote: > Perhaps we need clarification on what GCC means by "may point to multiple > objects" in this instance. To me that means either "get me the size of the > largest of these multiple objects" or "size of the smallest." In my eyes, > that means pointing to a union field. Per @nikic's example, it seems reasonably clear to me that GCC's intended semantics are to get either the best upper bound or the best lower bound that the compiler is able to compute. I mean sure, we can ask, and it does no harm to do so, but it is not reasonable to expect that a quantity like this coming from the vagaries of GCC's implementation details will be exactly the same for all examples when computed by a different implementation. (It's not even the same across optimization levels in GCC.) > I know that we lose precise struct information going to LLVM IR. If that's > what's needed here, there are ways to pass this information along. We retain > this information via DWARF. We could use similar metadata for this instance. > Would that be acceptable? In theory, yes, we could preserve enough information in metadata to compute this in the middle-end. But we would need to emit a *lot* of metadata, just on the off-chance that after optimization we happen to have a builtin_object_size query that points to each object that we emit, so in practice I don't think there's any chance we can do this. We can't use DWARF, because it won't necessarily be available (and typically won't be available in the interesting case where we find the object size only after optimization), and in any case, the presence or absence of DWARF isn't supposed to affect the executable code. Also, we'd need to annotate things that simply don't exist at all in the LLVM IR: ``` typedef struct X { int a, b; } X; int f(void) { X *p = malloc(sizeof(X)); int *q = >a; return __builtin_object_size(q, 1); } ``` Here, `f` ideally would return 4, but at the LLVM IR level, `p` and `q` are identical values and the `>a` operation is a no-op. In cases like this, the best we can realistically do is to return 8. https://github.com/llvm/llvm-project/pull/78526 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [Clang] Correct __builtin_dynamic_object_size for subobject types (PR #78526)
bwendling wrote: > Taking a step back, while this patch is not the right direction, we can and > should do better for the original example. Probably the best way to do that > is to analyze the operand to `__builtin_[dynamic_]object_size` in the > frontend and compute a better bound based on the form of the expression. It > looks like it should be feasible to produce a tighter bound for an > array-to-pointer-decay expression like `f.bar[argc]` in subobject mode, as: > > * `select llvm.is.constant(argc) and (argc < 0 or argc >= 2), 0, > sizeof(f.bar[argc])` for the non-dynamic case, and just > * `select (argc < 0 or argc >= 2), 0, sizeof(f.bar[argc])` for the dynamic > case. > > A possibly simpler alternative would be for the frontend to pass an upper > bound on the result to the LLVM builtin in mode 1, so Clang could say "I know > the result will never be more than 40" and LLVM could provide either that > size or the complete object size, whichever is smaller. That wouldn't give as > tight a bound for the argc == 2 case, though. It might be possible to do something like this. If retaining precise type information from the front-end is the sticking issue, it could be as simple as casting this pointer top the type of the sub-object. I'll see what I can do. https://github.com/llvm/llvm-project/pull/78526 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [Clang] Correct __builtin_dynamic_object_size for subobject types (PR #78526)
bwendling wrote: > > > But mode 0 and 1 are in general asking for an upper bound on the > > > accessible bytes (that is, an N so any.access beyond N bytes is > > > definitely out of bounds), so it seems to me that returning -1 is > > > strictly worse than returning 48. > > > > > > It's the second bit that controls whether it's an upper bound or lower > > bound that's returned, not the least significant bit. > > Right, that's what I said: modes 0 and 1 ask for an upper bound. (Modes 2 and > 3 ask for a lower bound.) Perhaps we need clarification on what GCC means by "may point to multiple objects" in this instance. To me that means either "get me the size of the largest of these multiple objects" or "size of the smallest." In my eyes, that means pointing to a union field. > > We're trying to implement a GNU builtin, and the only defined semantics we > > have to go on are GNU's documentation. I can't see how we can deviate from > > their documentation unless it's to say "we can't determine this value" and > > so return `-1` instead of an answer that might be wildly wrong and > > potentially cause a memory leak of some sort. > > We're not really deviating from the documented rules. By the time we get to > LLVM IR lowering of the builtin, we have lost the precise frontend > information. But we know the pointer might point into the complete object > (where it would be able to write to 48 bytes) or to some subobject of that > complete object (where it would be able to write to 48 bytes or less). > Therefore the correct answer to return is 48. > > In the same way, it would be correct to return 48 here: > > ``` > char a[48]; > char b[40]; > bool always_false = false; // happens to never be modified > int n = __builtin_dynamic_object_size(always_false ? a : b, 1); > ``` > > ... though we miscompile this and return 40 regardless of whether we're in > upper bound or lower bound mode. :-( > > > In my made-up example, if we said, "Yes you can write up to 48 bytes into > > `p->failed_devs[argc]`, then a user may overwrite the two fields after > > `field_devs`. If we return `-1`, they'll have to take the "slow", but > > ideally more secure, route. > > No, that is not what we are saying when we return 48. We're saying "a write > past 48 bytes is definitely bad". If you want a lower bound on the number of > bytes that you can write, then you need to use mode 2 or 3 instead. But in the example, writing up to 48 bytes *is* definitely bad. And as I mentioned, we disagree on what's meant by "upper" and "lower" bounds in the case of a pointer to a non-union field. For me, that's managed by the first bit. I know that we lose precise struct information going to LLVM IR. If that's what's needed here, there are ways to pass this information along. We retain this information via DWARF. We could use similar metadata for this instance. Would that be acceptable? https://github.com/llvm/llvm-project/pull/78526 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [Clang] Correct __builtin_dynamic_object_size for subobject types (PR #78526)
zygoloid wrote: Taking a step back, while this patch is not the right direction, we can and should do better for the original example. Probably the best way to do that is to analyze the operand to `__builtin_[dynamic_]object_size` in the frontend and compute a better bound based on the form of the expression. It looks like it should be feasible to produce a tighter bound for an array-to-pointer-decay expression like `f.bar[argc]` in subobject mode, as: - `select llvm.is.constant(argc) and (argc < 0 or argc >= 2), 0, sizeof(f.bar[argc])` for the non-dynamic case, and just - `select (argc < 0 or argc >= 2), 0, sizeof(f.bar[argc])` for the dynamic case. A possibly simpler alternative would be for the frontend to pass an upper bound on the result to the LLVM builtin in mode 1, so Clang could say "I know the result will never be more than 40" and LLVM could provide either that size or the complete object size, whichever is smaller. That wouldn't give as tight a bound for the argc == 2 case, though. https://github.com/llvm/llvm-project/pull/78526 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [Clang] Correct __builtin_dynamic_object_size for subobject types (PR #78526)
bwendling wrote: > > For unions, clang will use the type of the union member with the largest > > size as the alloca type, regardless of which union member is active. I > > haven't tried, but your patch will probably compute the subobject size > > based on that arbitrarily picked member, rather than the one being accessed. > > Another concrete case where using the IR type won't work is arrays whose > initializer is mostly-zeroes. For example, for > > ``` > int arr[20] = {[5] = 6}; > ``` > > we give `arr` the IR type `<{ i32, i32, i32, i32, i32, i32, [14 x i32] }>` so > that we can use a `zeroinitializer` to fill the trailing portion of the array. My patch only works with structs. https://github.com/llvm/llvm-project/pull/78526 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [Clang] Correct __builtin_dynamic_object_size for subobject types (PR #78526)
zygoloid wrote: > For unions, clang will use the type of the union member with the largest size > as the alloca type, regardless of which union member is active. I haven't > tried, but your patch will probably compute the subobject size based on that > arbitrarily picked member, rather than the one being accessed. Another concrete case where using the IR type won't work is arrays whose initializer is mostly-zeroes. For example, for ``` int arr[20] = {[5] = 6}; ``` we give `arr` the IR type `<{ i32, i32, i32, i32, i32, i32, [14 x i32] }>` so that we can use a `zeroinitializer` to fill the trailing portion of the array. https://github.com/llvm/llvm-project/pull/78526 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [Clang] Correct __builtin_dynamic_object_size for subobject types (PR #78526)
bwendling wrote: > Perhaps according to the GCC documentation as written. But mode 0 and 1 are > in general asking for an upper bound on the accessible bytes (that is, an N > so any.access beyond N bytes is definitely out of bounds), so it seems to me > that returning -1 is strictly worse than returning 48. It's the second bit that controls whether it's an upper bound or lower bound that's returned, not the least significant bit. > Do you have a use case for which -1 is a better answer? I'm sure one could be constructed (for instance, someone wanting to check if it's okay to `strcpy` a string of size `48` to a pointer to `failed_devs[argc]` in the example above), but that's not really the point. We're trying to implement a GNU builtin, and the only defined semantics we have to go on are GNU's documentation. I can't see how we can deviate from their documentation unless it's to say "we can't determine this value" and so return `-1` instead of an answer that might be wildly wrong and potentially cause a memory leak of some sort. In my made-up example, if we said, "Yes you can write up to 48 bytes into `p->failed_devs[argc]`, then a user may overwrite the two fields after `field_devs`. If we return `-1`, they'll have to take the "slow", but ideally more secure, route. Lastly, returning `48` in the made-up case is different from the case when an immediate is used rather than a variable, which is certainly unexpected. We could document that we can't support GNU's implementation exactly, but I don't think that's good enough. >From what I can tell, GNU's idea of what constitutes a "closest surrounding >sub-object" is one of a handful of things: the structure containing the field, >the array being pointed to with a struct, or the field being pointed into. >Example of the last one: ``` struct foo { int a; int b; int c; }; size_t foo(struct foo *p, int index) { return __builtin_dynamic_object_size(&((char *)p->b)[idx], 1); // b is the sub-object. } ``` All of these are explicit in the LLVM IR. Is the worry that they've been changed from some transformations? Or are there others I'm missing? https://github.com/llvm/llvm-project/pull/78526 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [Clang] Correct __builtin_dynamic_object_size for subobject types (PR #78526)
bwendling wrote: > Using anything but the size and alignment of the alloca type in a way that > affects program semantics is illegal. I'm sorry, but I don't understand your comment here. Could you supply some context? https://github.com/llvm/llvm-project/pull/78526 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [Clang] Correct __builtin_dynamic_object_size for subobject types (PR #78526)
bwendling wrote: > We've discussed this before, years ago. Previously, the sticking point was: > how is LLVM going to know what the frontend considers the closest surrounding > subobject to be? The LLVM type doesn't give you that information, and it > seems like there's nowhere else that LLVM can get it from either, so this > flag ends up not being useful and the best we can do is to give a > conservatively-correct answer -- the complete object size if we want an upper > bound, or 0 if we want a lower bound. Right now we're giving a wrong answer (see below), so that's no good. If we're going to err on the side of caution, then we should return the default "can't calculate the size" value. When you say that we can't detect what the front-end considers the "closest surrounding subobject" to be, is that mostly due to corner cases or is it a more general concern? (Note, we're really only interested in supporting this for C structs. C++ structs / classes would require therapy.) > Clang does respect the "subobject" flag if it can symbolically evaluate the > operand of `__builtin_object_size` sufficiently to determine which subobject > is being referenced. Previously we've thought that that was the best we could > do. This is why the current behavior is wrong, in my opinion. The motivating example is below: ``` struct suspend_stats { int success; int fail; int failed_freeze; int failed_prepare; int failed_suspend; int failed_suspend_late; int failed_suspend_noirq; int failed_resume; int failed_resume_early; int failed_resume_noirq; #define REC_FAILED_NUM 2 int last_failed_dev; charfailed_devs[REC_FAILED_NUM][40]; /* offsetof(struct suspend_stats, failed_devs) == 44 */ int last_failed_errno; int bar; }; #define report(x) __builtin_printf(#x ": %zu\n", x) int main(int argc, char *argv[]) { struct suspend_stats foo; report(sizeof(foo.failed_devs[1])); report(sizeof(foo.failed_devs[argc])); report(__builtin_dynamic_object_size(, 0)); report(__builtin_dynamic_object_size(, 1)); report(__builtin_dynamic_object_size(_freeze, 0)); report(__builtin_dynamic_object_size(foo.failed_devs[1], 0)); report(__builtin_dynamic_object_size(foo.failed_devs[1], 1)); report(__builtin_dynamic_object_size(foo.failed_devs[argc], 0)); report(__builtin_dynamic_object_size(foo.failed_devs[argc], 1)); return 0; } ``` The output with this change is now: ``` __builtin_dynamic_object_size(, 0): 128 __builtin_dynamic_object_size(, 1): 4 __builtin_dynamic_object_size(_freeze, 0): 124 __builtin_dynamic_object_size(foo.failed_devs[1], 0): 48 __builtin_dynamic_object_size(foo.failed_devs[1], 1): 40 __builtin_dynamic_object_size(foo.failed_devs[argc], 0): 48 __builtin_dynamic_object_size(foo.failed_devs[argc], 1): 40 ``` Without the change, the last line is: ``` __builtin_dynamic_object_size(foo.failed_devs[argc], 1): 48 ``` Which isn't correct according to GNU's documentation. So if we can't honor the TYPE bit, then we should return `-1 / 0` here, right? https://github.com/llvm/llvm-project/pull/78526 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [Clang] Correct __builtin_dynamic_object_size for subobject types (PR #78526)
https://github.com/zygoloid commented: We've discussed this before, years ago. Previously, the sticking point was: how is LLVM going to know what the frontend considers the closest surrounding subobject to be? The LLVM type doesn't give you that information, and it seems like there's nowhere else that LLVM can get it from either, so this flag ends up not being useful and the best we can do is to give a conservatively-correct answer -- the complete object size if we want an upper bound, or 0 if we want a lower bound. Clang does respect the "subobject" flag if it can symbolically evaluate the operand of `__builtin_object_size` sufficiently to determine which subobject is being referenced. Previously we've thought that that was the best we could do. https://github.com/llvm/llvm-project/pull/78526 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [Clang] Correct __builtin_dynamic_object_size for subobject types (PR #78526)
github-actions[bot] wrote: :warning: C/C++ code formatter, clang-format found issues in your code. :warning: You can test this locally with the following command: ``bash git-clang-format --diff 8947469ec1ad6d35b2feec0acc43d0d191514f0b ab13650e232b1a17cf9c7335f90a33f7eb71f741 -- clang/lib/CodeGen/CGBuiltin.cpp clang/lib/CodeGen/CGExpr.cpp clang/test/CodeGen/catch-undef-behavior.c clang/test/CodeGen/pass-object-size.c llvm/include/llvm/Analysis/MemoryBuiltins.h llvm/lib/Analysis/MemoryBuiltins.cpp llvm/lib/IR/AutoUpgrade.cpp llvm/lib/Target/AMDGPU/AMDGPUPromoteAlloca.cpp `` View the diff from clang-format here. ``diff diff --git a/llvm/lib/Analysis/MemoryBuiltins.cpp b/llvm/lib/Analysis/MemoryBuiltins.cpp index 56d6313905..41b6eefffe 100644 --- a/llvm/lib/Analysis/MemoryBuiltins.cpp +++ b/llvm/lib/Analysis/MemoryBuiltins.cpp @@ -732,8 +732,8 @@ SizeOffsetAPInt ObjectSizeOffsetVisitor::computeImpl(Value *V) { if (!Options.WholeObjectSize && AllocaTy) { // At this point, SOT.Size is the size of the whole struct. However, we // want the size of the sub-object. -const StructLayout = *DL.getStructLayout( -const_cast(AllocaTy)); +const StructLayout = +*DL.getStructLayout(const_cast(AllocaTy)); unsigned Idx = SL.getElementContainingOffset(Offset.getLimitedValue()); `` https://github.com/llvm/llvm-project/pull/78526 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [Clang] Correct __builtin_dynamic_object_size for subobject types (PR #78526)
llvmbot wrote: @llvm/pr-subscribers-llvm-globalisel @llvm/pr-subscribers-backend-amdgpu @llvm/pr-subscribers-clang-codegen Author: Bill Wendling (bwendling) Changes The second argument of __builtin_dynamic_object_size controls whether it returns the size of the whole object or the closest surrounding object. For this struct: struct s { int foo; char bar[2][40]; int baz; int qux; }; int main(int argc, char **argv) { struct s f; #define report(x) fprintf(stderr, #x ": %zu\n", x) argc = 1; report(__builtin_dynamic_object_size(f.bar[argc], 0)); report(__builtin_dynamic_object_size(f.bar[argc], 1)); return 0; } should return: __builtin_dynamic_object_size(f.bar[argc], 0): 48 __builtin_dynamic_object_size(f.bar[argc], 1): 40 determined by the least significant bit of the TYPE. Add a new parameter to the llvm.objectsize intrinsic to control which size it should return. --- Patch is 97.55 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/78526.diff 40 Files Affected: - (modified) clang/lib/CodeGen/CGBuiltin.cpp (+6-1) - (modified) clang/lib/CodeGen/CGExpr.cpp (+3-1) - (modified) clang/test/CodeGen/catch-undef-behavior.c (+1-1) - (modified) clang/test/CodeGen/pass-object-size.c (+3-3) - (modified) llvm/docs/LangRef.rst (+21-13) - (modified) llvm/include/llvm/Analysis/MemoryBuiltins.h (+9) - (modified) llvm/include/llvm/IR/Intrinsics.td (+1-1) - (modified) llvm/lib/Analysis/MemoryBuiltins.cpp (+31-4) - (modified) llvm/lib/IR/AutoUpgrade.cpp (+37-6) - (modified) llvm/lib/Target/AMDGPU/AMDGPUPromoteAlloca.cpp (+2-2) - (modified) llvm/test/Analysis/CostModel/X86/free-intrinsics.ll (+4-4) - (modified) llvm/test/Analysis/CostModel/free-intrinsics-datalayout.ll (+4-4) - (modified) llvm/test/Analysis/CostModel/free-intrinsics-no_info.ll (+4-4) - (modified) llvm/test/Assembler/auto_upgrade_intrinsics.ll (+4-4) - (modified) llvm/test/Bitcode/objectsize-upgrade-7.0.ll (+2-2) - (modified) llvm/test/CodeGen/AArch64/GlobalISel/memcpy_chk_no_tail.ll (+2-2) - (modified) llvm/test/CodeGen/AArch64/memsize-remarks.ll (+16-16) - (modified) llvm/test/CodeGen/AMDGPU/promote-alloca-mem-intrinsics.ll (+4-4) - (modified) llvm/test/Other/cgscc-libcall-update.ll (+2-2) - (modified) llvm/test/Transforms/InferAddressSpaces/AMDGPU/debug-info.ll (+4-4) - (modified) llvm/test/Transforms/InferAddressSpaces/AMDGPU/intrinsics.ll (+6-6) - (modified) llvm/test/Transforms/InferAlignment/propagate-assume.ll (+2-2) - (modified) llvm/test/Transforms/Inline/call-intrinsic-objectsize.ll (+3-3) - (modified) llvm/test/Transforms/InstCombine/allocsize.ll (+5-5) - (modified) llvm/test/Transforms/InstCombine/builtin-dynamic-object-size.ll (+13-13) - (modified) llvm/test/Transforms/InstCombine/builtin-object-size-custom-dl.ll (+3-3) - (modified) llvm/test/Transforms/InstCombine/builtin-object-size-strdup-family.ll (+5-5) - (modified) llvm/test/Transforms/InstCombine/invoke.ll (+1-1) - (modified) llvm/test/Transforms/InstCombine/memset_chk-1.ll (+5-5) - (modified) llvm/test/Transforms/InstCombine/objsize.ll (+38-38) - (modified) llvm/test/Transforms/InstCombine/stpcpy_chk-1.ll (+4-4) - (modified) llvm/test/Transforms/InstCombine/strcpy_chk-1.ll (+5-5) - (modified) llvm/test/Transforms/LowerConstantIntrinsics/builtin-object-size-load.ll (+2-2) - (modified) llvm/test/Transforms/LowerConstantIntrinsics/builtin-object-size-phi.ll (+5-5) - (modified) llvm/test/Transforms/LowerConstantIntrinsics/builtin-object-size-posix-memalign.ll (+8-8) - (modified) llvm/test/Transforms/LowerConstantIntrinsics/constant-intrinsics.ll (+2-2) - (modified) llvm/test/Transforms/LowerConstantIntrinsics/crash-on-large-allocas.ll (+2-2) - (modified) llvm/test/Transforms/LowerConstantIntrinsics/objectsize_basic.ll (+18-19) - (modified) llvm/test/Transforms/LowerConstantIntrinsics/stale-worklist-phi.ll (+2-2) - (modified) llvm/test/Transforms/SCCP/issue59602-assume-like-call-users.ll (+5-5) ``diff diff --git a/clang/lib/CodeGen/CGBuiltin.cpp b/clang/lib/CodeGen/CGBuiltin.cpp index de48b15645b1ab..513a007aa009b7 100644 --- a/clang/lib/CodeGen/CGBuiltin.cpp +++ b/clang/lib/CodeGen/CGBuiltin.cpp @@ -1103,12 +1103,17 @@ CodeGenFunction::emitBuiltinObjectSize(const Expr *E, unsigned Type, Function *F = CGM.getIntrinsic(Intrinsic::objectsize, {ResType, Ptr->getType()}); + // If the least significant bit is clear, objects are whole variables. If + // it's set, a closest surrounding subobject is considered the object a + // pointer points to. + Value *WholeObj = Builder.getInt1((Type & 1) == 0); + // LLVM only supports 0 and 2, make sure that we pass along that as a boolean. Value *Min = Builder.getInt1((Type & 2) != 0); // For GCC compatibility, __builtin_object_size treat NULL as unknown size.