zahiraam added a comment. In D146148#4330971 <https://reviews.llvm.org/D146148#4330971>, @rjmccall wrote:
> In D146148#4330611 <https://reviews.llvm.org/D146148#4330611>, @zahiraam > wrote: > >> In D146148#4221651 <https://reviews.llvm.org/D146148#4221651>, @rjmccall >> wrote: >> >>> In D146148#4220591 <https://reviews.llvm.org/D146148#4220591>, @zahiraam >>> wrote: >>> >>>> In D146148#4220495 <https://reviews.llvm.org/D146148#4220495>, >>>> @aaron.ballman wrote: >>>> >>>>> In D146148#4220433 <https://reviews.llvm.org/D146148#4220433>, @zahiraam >>>>> wrote: >>>>> >>>>>> In D146148#4220268 <https://reviews.llvm.org/D146148#4220268>, @rjmccall >>>>>> wrote: >>>>>> >>>>>>> Okay, so modifying `math.h` to use this attribute is acceptable? >>>>>>> That's great, that's definitely the best outcome for the compiler. >>>>>>> >>>>>>> Just a minor request about the diagnostic name, but otherwise LGTM. >>>>>> >>>>>> Yes. Added the attribute inside the included math.h header file. >>>>> >>>>> How does this work for, say, glibc, musl, or MSVC CRT? Won't those math.h >>>>> headers lack the attribute and thus run into problems when used with >>>>> Clang? >>>> >>>> Good point! @rjmccall are you thinking of something in particular with the >>>> attribute? >>> >>> Zahira, this is what I was asking you when I asked whether modifying the >>> math.h header was acceptable: whether you were prepared to accept that the >>> warning would only fire on system math.h headers that we'd modified, or >>> whether you cared about making it work with non-cooperative headers. I >>> wasn't asking if you were willing to change the test code. >>> >>>> If not I guess we will have to rely on string comparison for all the >>>> typedef in the TU. Aaron pointed out the compile time overhead. >>> >>> Well, the compile-time overhead of doing this on every typedef >>> *declaration* is way better than the overhead of doing this on every type >>> lookup, at least. >>> >>> Anyway, there are three possibilities I can see: >>> >>> - We accept that this needs cooperative system headers. >>> - We add a math.h compiler header that `#include_next`s the system math.h >>> and then adds the attribute. I believe you can just add an attribute to a >>> typedef retroactively with something like `typedef float_t float_t >>> __attribute__((whatever))`. >>> - We do checks on every typedef declaration. There's a builtin-identifier >>> trick that we do with library functions that we should be able to >>> generalize to typedefs, so you wouldn't need to actually do string >>> comparisons, you'd just check whether the `IdentifierInfo*` was storing a >>> special ID. We'd set that up in `initializeBuiltins` at the start of the >>> translation unit, so the overhead would just be that we'd create two extra >>> `IdentifierInfo*`s in every TU. >>> >>> The builtin ID logic is currently specific to builtin *functions*, and I >>> don't think we'd want to hack typedef names into that. But the same >>> storage in `IdentifierInfo*` is also used for identifying the ObjC >>> context-sensitive keywords, by just offsetting the builtin IDs by the >>> NUM_OBJC_KEYWORD. You should be able to generalize that by also >>> introducing a concept of a builtin *type* name, so that e.g. IDs in >>> [0,NUM_OBJC_KEYWORDS) are ObjCKeywordKinds, IDs in [NUM_OBJC_KEYWORDS, >>> NUM_OBJC_KEYWORDS+NUM_BUILTIN_TYPES) are BuiltinTypeKinds (when you >>> subtract NUM_OBJC_KEYWORDS), and IDs in >>> [NUM_OBJC_KEYWORDS+NUM_BUILTIN_TYPES,∞) are builtin function IDs (when you >>> subtract NUM_OBJC_KEYWORDS+NUM_BUILTIN_TYPES). >> >> Reading this more carefully... Does that mean that we initialize the >> float_t, double_t in initializeBuiltins even if they are not used in the >> source code? > > Yes. If we decide this is an overhead worth eliminating, we'll find a way to > do it lazily to the builtins, and then we'll be able to take advantage of it > here, too. The builtins are a much larger contributor to overhead. > >> Also not sure how to define the NUM_BUILTIN_TYPES since I don't need to add >> it to TokenKinds.h? I was proposing to do something like this: >> enum InterestingIdentifierKind { >> #define Interesting_Identifier(X) X, >> #include "clang/Basic/TokenKinds.def" >> >> NUM_INTERESTING_IDENTIFIERS >> >> }; >> But I guess I don't need since we don't want to add additional storage. Do I >> understand things correctly? > > We should have an enum like this, yes. And this is what we do in > `IdentifierTable.h` for all the other kinds. Alternatively, you can just > hard-code the number and then static_assert that it's correct in some .cpp > file. > > FWIW, I think I like NUM_INTERESTING_IDENTIFIERS as a name rather than > NUM_BUILTIN_TYPES. Okay, so I did add that in TokenKinds.h. Isn't that the right place for it? The same way it's done for the other builtins? And in TokenKinds.def I added the lines for the interesting identifiers? In D146148#4330971 <https://reviews.llvm.org/D146148#4330971>, @rjmccall wrote: > In D146148#4330611 <https://reviews.llvm.org/D146148#4330611>, @zahiraam > wrote: > >> In D146148#4221651 <https://reviews.llvm.org/D146148#4221651>, @rjmccall >> wrote: >> >>> In D146148#4220591 <https://reviews.llvm.org/D146148#4220591>, @zahiraam >>> wrote: >>> >>>> In D146148#4220495 <https://reviews.llvm.org/D146148#4220495>, >>>> @aaron.ballman wrote: >>>> >>>>> In D146148#4220433 <https://reviews.llvm.org/D146148#4220433>, @zahiraam >>>>> wrote: >>>>> >>>>>> In D146148#4220268 <https://reviews.llvm.org/D146148#4220268>, @rjmccall >>>>>> wrote: >>>>>> >>>>>>> Okay, so modifying `math.h` to use this attribute is acceptable? >>>>>>> That's great, that's definitely the best outcome for the compiler. >>>>>>> >>>>>>> Just a minor request about the diagnostic name, but otherwise LGTM. >>>>>> >>>>>> Yes. Added the attribute inside the included math.h header file. >>>>> >>>>> How does this work for, say, glibc, musl, or MSVC CRT? Won't those math.h >>>>> headers lack the attribute and thus run into problems when used with >>>>> Clang? >>>> >>>> Good point! @rjmccall are you thinking of something in particular with the >>>> attribute? >>> >>> Zahira, this is what I was asking you when I asked whether modifying the >>> math.h header was acceptable: whether you were prepared to accept that the >>> warning would only fire on system math.h headers that we'd modified, or >>> whether you cared about making it work with non-cooperative headers. I >>> wasn't asking if you were willing to change the test code. >>> >>>> If not I guess we will have to rely on string comparison for all the >>>> typedef in the TU. Aaron pointed out the compile time overhead. >>> >>> Well, the compile-time overhead of doing this on every typedef >>> *declaration* is way better than the overhead of doing this on every type >>> lookup, at least. >>> >>> Anyway, there are three possibilities I can see: >>> >>> - We accept that this needs cooperative system headers. >>> - We add a math.h compiler header that `#include_next`s the system math.h >>> and then adds the attribute. I believe you can just add an attribute to a >>> typedef retroactively with something like `typedef float_t float_t >>> __attribute__((whatever))`. >>> - We do checks on every typedef declaration. There's a builtin-identifier >>> trick that we do with library functions that we should be able to >>> generalize to typedefs, so you wouldn't need to actually do string >>> comparisons, you'd just check whether the `IdentifierInfo*` was storing a >>> special ID. We'd set that up in `initializeBuiltins` at the start of the >>> translation unit, so the overhead would just be that we'd create two extra >>> `IdentifierInfo*`s in every TU. >>> >>> The builtin ID logic is currently specific to builtin *functions*, and I >>> don't think we'd want to hack typedef names into that. But the same >>> storage in `IdentifierInfo*` is also used for identifying the ObjC >>> context-sensitive keywords, by just offsetting the builtin IDs by the >>> NUM_OBJC_KEYWORD. You should be able to generalize that by also >>> introducing a concept of a builtin *type* name, so that e.g. IDs in >>> [0,NUM_OBJC_KEYWORDS) are ObjCKeywordKinds, IDs in [NUM_OBJC_KEYWORDS, >>> NUM_OBJC_KEYWORDS+NUM_BUILTIN_TYPES) are BuiltinTypeKinds (when you >>> subtract NUM_OBJC_KEYWORDS), and IDs in >>> [NUM_OBJC_KEYWORDS+NUM_BUILTIN_TYPES,∞) are builtin function IDs (when you >>> subtract NUM_OBJC_KEYWORDS+NUM_BUILTIN_TYPES). >> >> Reading this more carefully... Does that mean that we initialize the >> float_t, double_t in initializeBuiltins even if they are not used in the >> source code? > > Yes. If we decide this is an overhead worth eliminating, we'll find a way to > do it lazily to the builtins, and then we'll be able to take advantage of it > here, too. The builtins are a much larger contributor to overhead. > >> Also not sure how to define the NUM_BUILTIN_TYPES since I don't need to add >> it to TokenKinds.h? I was proposing to do something like this: >> enum InterestingIdentifierKind { >> #define Interesting_Identifier(X) X, >> #include "clang/Basic/TokenKinds.def" >> >> NUM_INTERESTING_IDENTIFIERS >> >> }; >> But I guess I don't need since we don't want to add additional storage. Do I >> understand things correctly? > > We should have an enum like this, yes. And this is what we do in > `IdentifierTable.h` for all the other kinds. Alternatively, you can just > hard-code the number and then static_assert that it's correct in some .cpp > file. > > FWIW, I think I like NUM_INTERESTING_IDENTIFIERS as a name rather than > NUM_BUILTIN_TYPES. Okay but should this be added to TokenKinds.def or Builtin.defs? the patch is adding it to TokenKinds. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D146148/new/ https://reviews.llvm.org/D146148 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits