haoNoQ wrote:

Hi! Yes, the usual RFC process can probably handle that, it just needs to go to 
the [Clang Frontend](https://discourse.llvm.org/c/clang/6) section instead of 
the clang-tidy section. Just plan ahead a little bit, think about the 
motivation and the prior art, pros and cons and potential caveats, think how 
far you're willing to go on your own in this direction and how committed you'd 
be to maintaining this facility in the future, and lay it out in the RFC.

It may be a good idea to have a tiny proof-of-concept patch (implement the 
attribute and teach the clang-tidy check to consume it) (it's very easy to 
implement an attribute in clang - just add it to the list) but don't write too 
much code before you're able to confirm that clang maintainers are interested.

When it comes to motivation, yeah, every time you find yourself hardcoding a 
somewhat-standard function or class name in the checker code, it gets better 
when you have ways to annotate custom classes or replacements. Also the library 
headers thing is really useful.

Off the top of my head, here's a few places that could benefit:

- Clang Static Analyzer people such as myself, `@Xazax-hun`, `@steakhal`. (If 
you're mostly a `clang-tidy` person you know us as `clang-analyzer-*`.) Over 
the years it's been clear to us that many standard classes require some sort of 
manual modeling or summary-based modeling in order to handle them correctly. 
Every time we do that, we might as well support non-standard classes that 
behave the same way. Two particular pain points for us are:
  - Smart pointers that perform reference counting, such as `std::shared_ptr` 
or `llvm::IntrusiveRefCntPtr`. We produce weird use-after-free false positives 
when we misunderstand the initial reference count inside the smart pointer.
  - Collections and iterators. Collections such as `std::map` way are too 
complex for the analyzer to model manually by reading their source code. Custom 
collection classes are incredibly common too.
- The `-Wunsafe-buffer-usage` fixit machine: `@jkorous-apple`, `@dtarditi`. I'm 
not actively following their progress anymore so you gotta ask them, don't 
quote me as a confirmation that they actually need it. But I remember them 
expressing interest in recognizing custom replacements for `std::span` and 
other buffer containers, mostly for the purposes of suggesting replacements 
that are appropriate for the codebase that doesn't have access to the standard 
library. Identifying the appropriate replacement via attributes is one way to 
do that. If they still need that, an attribute-based solution would be 
particularly useful for them because hardcoding custom class names in the 
compiler proper is frowned upon. Like, it'd be super weird if compiler-proper 
worked differently depending on how you name your classes. That's only 
acceptable in static analysis tools.
- Not exactly a precise hit but: Any kind of use-after-move analysis 
([bugprone-use-after-move](https://clang.llvm.org/extra/clang-tidy/checks/bugprone/use-after-move.html),
 
[clang-analyzer-cplusplus.Move](https://clang.llvm.org/docs/analyzer/checkers.html#cplusplus-move-c))
 benefits from recognizing methods that "reset" the object to a valid state, 
making it legal to use after move. This usually includes assignments to the 
object and methods such as `.reset()` or `.clear()`. If custom classes can be 
annotated to tell the compiler about such methods the checker can be made 
significantly more powerful. Note that in this case the nature of the class 
itself isn't important, only the vague "role" of the method. So this is a 
slightly different problem but you can probably see how it's closely related 
and may potentially be addressed by a slight variation of your solution.

When it comes to caveats, it's somewhat important to recognize that many custom 
replacement classes *slightly diverge* from their standard counterparts.
- For example, is it really true that `llvm::IntrusiveRefCntPtr` "is basically 
a" `std::shared_ptr`? They may be "close enough" for some purposes (eg. they're 
both reference-counted smart pointers) but not for other purposes (eg. 
`shared_ptr` cannot be unwrapped into a raw pointer and later reconstructed 
from that raw pointer alone, say after tunneling through some plain C API). 
Maybe the attribute needs an extra "...for the purposes of" clause? The 
use-after-move example is kinda the opposite of this problem: it can be seen as 
"this method is basically a `.reset()` for an *extremely vague* purpose of 
movable class analysis regardless of all other properties of that class".
- Additionally, some custom classes would have methods that have absolutely no 
counterparts in the original class. For example, they could perform multiple 
standard operations in one step. Or they may perform an operation to which your 
checker is sensitive in a very weird way. Depending on how your checker works 
and what practical tradeoffs you're making in your analysis, it may or may not 
be important to annotate these methods as "weird" to let the checker know that 
the checker needs to actively forget the information it otherwise thinks it 
knows. (Or the checker may do that by default with every unannotated method; in 
this case a "weird" annotation isn't necessary.)

https://github.com/llvm/llvm-project/pull/144313
_______________________________________________
cfe-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to