aeubanks added inline comments.

================
Comment at: clang/include/clang/Basic/AttrDocs.td:1416-1417
+not significant. This allows global constants with the same contents to be
+merged. This can break global pointer identity, i.e. two different globals have
+the same address.
+
----------------
erichkeane wrote:
> aeubanks wrote:
> > aaron.ballman wrote:
> > > aeubanks wrote:
> > > > aaron.ballman wrote:
> > > > > aeubanks wrote:
> > > > > > aaron.ballman wrote:
> > > > > > > What happens for tentative definitions where the value isn't 
> > > > > > > known? e.g.,
> > > > > > > ```
> > > > > > > [[clang::unnamed_addr]] int i1, i2;
> > > > > > > ```
> > > > > > > 
> > > > > > > What happens if the types are similar but not the same?
> > > > > > > ```
> > > > > > > [[clang::unnamed_addr]] signed int i1 = 32;
> > > > > > > [[clang::unnamed_addr]] unsigned int i2 = 32;
> > > > > > > ```
> > > > > > > 
> > > > > > > Should we diagnose taking the address of such an attributed 
> > > > > > > variable so users have some hope of spotting the non-conforming 
> > > > > > > situations?
> > > > > > > 
> > > > > > > Does this attribute have impacts across translation unit 
> > > > > > > boundaries (perhaps only when doing LTO) or only within a single 
> > > > > > > TU?
> > > > > > > 
> > > > > > > What does this attribute do in C++ in the presence of 
> > > > > > > constructors and destructors? e.g.,
> > > > > > > ```
> > > > > > > struct S {
> > > > > > >   S();
> > > > > > >   ~S();
> > > > > > > };
> > > > > > > 
> > > > > > > [[clang::unnamed_addr]] S s1, s2; // Are these merged and there's 
> > > > > > > only one ctor/dtor call?
> > > > > > > ```
> > > > > > globals are only mergeable if they're known to be constant and have 
> > > > > > the same value/size. this can be done at compile time only if the 
> > > > > > optimizer can see the constant values, or at link time
> > > > > > 
> > > > > > so nothing would happen in any of the cases you've given.
> > > > > > 
> > > > > > but yeah that does imply that we should warn when the attribute is 
> > > > > > used on non const, non-POD globals. I'll update this patch to do 
> > > > > > that
> > > > > > 
> > > > > > as mentioned in the description, we actually do want to take the 
> > > > > > address of these globals for table-driven parsing, but we don't 
> > > > > > care about identity equality
> > > > > > globals are only mergeable if they're known to be constant and have 
> > > > > > the same value/size. this can be done at compile time only if the 
> > > > > > optimizer can see the constant values, or at link time
> > > > > >
> > > > > > so nothing would happen in any of the cases you've given.
> > > > > 
> > > > > Ahhhh that's good to know. So I assume we *will* merge these?
> > > > > 
> > > > > ```
> > > > > struct S {
> > > > >   int i, j;
> > > > >   float f;
> > > > > };
> > > > > 
> > > > > [[clang::unnamed_addr]] const S s1 = { 1, 2, 3.0f };
> > > > > [[clang::unnamed_addr]] const S s2 = { 1, 2, 3.0f };
> > > > > [[clang::unnamed_addr]] const S s3 = s2;
> > > > > ```
> > > > > 
> > > > > > but yeah that does imply that we should warn when the attribute is 
> > > > > > used on non const, non-POD globals. I'll update this patch to do 
> > > > > > that
> > > > > 
> > > > > Thank you, I think that will be more user-friendly
> > > > > 
> > > > > > as mentioned in the description, we actually do want to take the 
> > > > > > address of these globals for table-driven parsing, but we don't 
> > > > > > care about identity equality
> > > > > 
> > > > > Yeah, I still wonder if we want to diagnose just the same -- if the 
> > > > > address is never taken, there's not really a way to notice the 
> > > > > optimization, but if the address is taken, you basically get UB (and 
> > > > > I think we should explicitly document it as such). Given how easy it 
> > > > > is to accidentally take the address of something (like via a 
> > > > > reference in C++), I think we should warn by default, but still have 
> > > > > a warning group for folks who want to live life dangerously.
> > > > > > globals are only mergeable if they're known to be constant and have 
> > > > > > the same value/size. this can be done at compile time only if the 
> > > > > > optimizer can see the constant values, or at link time
> > > > > >
> > > > > > so nothing would happen in any of the cases you've given.
> > > > > 
> > > > > Ahhhh that's good to know. So I assume we *will* merge these?
> > > > > 
> > > > > ```
> > > > > struct S {
> > > > >   int i, j;
> > > > >   float f;
> > > > > };
> > > > > 
> > > > > [[clang::unnamed_addr]] const S s1 = { 1, 2, 3.0f };
> > > > > [[clang::unnamed_addr]] const S s2 = { 1, 2, 3.0f };
> > > > > [[clang::unnamed_addr]] const S s3 = s2;
> > > > > ```
> > > > yeah those are merged even just by clang
> > > > 
> > > > ```
> > > > struct S {
> > > >   int i, j;
> > > >   float f;
> > > > };
> > > > 
> > > > [[clang::unnamed_addr]] const S s1 = { 1, 2, 3.0f };
> > > > [[clang::unnamed_addr]] const S s2 = { 1, 2, 3.0f };
> > > > [[clang::unnamed_addr]] const S s3 = s2;
> > > > 
> > > > const void * f1() {
> > > >   return &s1;
> > > > }
> > > > 
> > > > const void * f2() {
> > > >   return &s2;
> > > > }
> > > > 
> > > > const void * f3() {
> > > >   return &s3;
> > > > }
> > > > 
> > > > $ ./build/rel/bin/clang++ -S -emit-llvm -o - -O2 /tmp/a.cc
> > > > ```
> > > > > 
> > > > > > but yeah that does imply that we should warn when the attribute is 
> > > > > > used on non const, non-POD globals. I'll update this patch to do 
> > > > > > that
> > > > > 
> > > > > Thank you, I think that will be more user-friendly
> > > > > 
> > > > > > as mentioned in the description, we actually do want to take the 
> > > > > > address of these globals for table-driven parsing, but we don't 
> > > > > > care about identity equality
> > > > > 
> > > > > Yeah, I still wonder if we want to diagnose just the same -- if the 
> > > > > address is never taken, there's not really a way to notice the 
> > > > > optimization, but if the address is taken, you basically get UB (and 
> > > > > I think we should explicitly document it as such). Given how easy it 
> > > > > is to accidentally take the address of something (like via a 
> > > > > reference in C++), I think we should warn by default, but still have 
> > > > > a warning group for folks who want to live life dangerously.
> > > > 
> > > > I don't understand how there would be any UB. Especially if the user 
> > > > only dereferences them, and isn't comparing pointers, which is the 
> > > > requested use case.
> > > > yeah those are merged even just by clang
> > > 
> > > Nice, thank you for the confirmation!
> > > 
> > > > I don't understand how there would be any UB. Especially if the user 
> > > > only dereferences them, and isn't comparing pointers, which is the 
> > > > requested use case.
> > > 
> > > That's just it -- nothing prevents the user from taking the address and 
> > > comparing the pointers, which is no longer defined behavior with this 
> > > attribute. It would require a static analysis check to catch this problem 
> > > unless the compiler statically warns on taking the address in the first 
> > > place (IMO, we shouldn't assume users will use the attribute properly and 
> > > thus need no help to do so). I was also thinking about things like 
> > > accidental sharing across thread boundaries -- but perhaps that's fine 
> > > because the data is constant. I was also thinking that this potentially 
> > > breaks `restrict` semantics but on reflection... that seems almost 
> > > intentional given the goal of the attribute. But things along these lines 
> > > are what have me worried -- the language assumes unique locations for 
> > > objects, so I expect there's going to be fallout when object locations 
> > > are no longer unique. If we can remove sharp edges for users without 
> > > compromising the utility of the attribute, I think that's beneficial. Or 
> > > are you saying that warning like this would basically compromise the 
> > > utility?
> > > > I don't understand how there would be any UB. Especially if the user 
> > > > only dereferences them, and isn't comparing pointers, which is the 
> > > > requested use case.
> > > 
> > > That's just it -- nothing prevents the user from taking the address and 
> > > comparing the pointers, which is no longer defined behavior with this 
> > > attribute. It would require a static analysis check to catch this problem 
> > > unless the compiler statically warns on taking the address in the first 
> > > place (IMO, we shouldn't assume users will use the attribute properly and 
> > > thus need no help to do so). I was also thinking about things like 
> > > accidental sharing across thread boundaries -- but perhaps that's fine 
> > > because the data is constant. I was also thinking that this potentially 
> > > breaks `restrict` semantics but on reflection... that seems almost 
> > > intentional given the goal of the attribute. But things along these lines 
> > > are what have me worried -- the language assumes unique locations for 
> > > objects, so I expect there's going to be fallout when object locations 
> > > are no longer unique. If we can remove sharp edges for users without 
> > > compromising the utility of the attribute, I think that's beneficial. Or 
> > > are you saying that warning like this would basically compromise the 
> > > utility?
> > 
> > when you say "undefined behavior" do you mean "it's unspecified what 
> > happens" or literally the C/C++ "undefined behavior" where the compiler can 
> > assume it doesn't happen?
> > 
> > I don't think there's any UB in the C/C++ "undefined behavior" sense, we're 
> > just dropping a C/C++ guarantee of unique pointer identity for certain 
> > globals.
> > 
> > Yes I believe the warning would compromise the utility since the underlying 
> > request behind this is a case where the user explicitly wants to take the 
> > address of these globals for table driven parsing but does not care about 
> > unique global identity. i.e. it's fine if we have duplicate addresses in 
> > the table as long as each entry points to the proper data.
> I think this IS causing undefined behavior, any program that assumes the 
> addresses aren't the same (such as inserting addresses into a map, explicitly 
> destructing/initializing/etc), or are comparing addresses are now exhibiting 
> UB (in the purest of C++ senses).  It isn't the 'taking the address' that is 
> UB, it is comparing them, but unfortunately we don't have a good way to 
> diagnose THAT.  I believe what Aaron is getting at is that the taking of the 
> addresses should be diagnosed, particularly if we end up taking said address 
> less-obviously.
> 
> It DOES have to be a Static Analysis type diagnostic however, since I don't 
> think it is accurate enough.
> 
perhaps I'm arguing too literally (or am just wrong), but even if the program 
behaves incorrectly due to comparisons of addresses now giving different 
results, that's not necessarily UB. even if a program were using pointers as a 
key into a map and two globals that previously were separate entries are now 
one, that's not necessarily UB, that's just a program behaving incorrectly.

unless there's a specific C/C++ rule that you have in mind that I'm missing?

I'm happy to add a warning that we can turn off internally if people agree that 
taking the address of an `unnamed_addr` global is dangerous in general, not 
sure if that should be default on or off


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D158223/new/

https://reviews.llvm.org/D158223

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to