Charusso added inline comments.
================ Comment at: clang/lib/StaticAnalyzer/Checkers/cert/StrChecker.cpp:124 + if (const SymbolicRegion *SR = DestMR->getSymbolicBase()) + if (const Expr *SizeExpr = getDynamicSizeExpr(C.getState(), SR)) + return exprToStr(SizeExpr, C); ---------------- NoQ wrote: > Charusso wrote: > > NoQ wrote: > > > Charusso wrote: > > > > Charusso wrote: > > > > > NoQ wrote: > > > > > > Again, you will have to highlight the allocation site with a note. > > > > > > Therefore you will have to write a bug visitor that traverses the > > > > > > size expression at some point (or, equivalently, a note tag when > > > > > > the size expression is evaluated). Therefore you don't need to > > > > > > store the expression in the program state. > > > > > Yes, you have pointed out the necessary visitor, but it needs more > > > > > thinking. > > > > > > > > > > I have a memory region which could be any kind of "memory block > > > > > region" therefore I have no idea where is the size expression. We are > > > > > supporting ~20 different allocations, which is nothing compared to > > > > > the wild with the not so uncommon 5+ parameter allocators. Therefore > > > > > I still do not want to reverse engineer a small MallocChecker + > > > > > ExprEngine + BuiltinFunctionChecker inside my checker. They provide > > > > > the necessary `DynamicSizeInfo` easily, which could be used in at > > > > > least 4 checkers at the moment (which I have pointed out earlier in > > > > > D68725). > > > > > > > > > > If I have the size expression in the dynamic size map, and I can > > > > > clearly point out the destination buffer, it is a lot more simplified > > > > > to traverse the graph where the buffer and its size comes from. > > > > Well, you really do not want to store `SizeExpr` of `malloc(SizeExpr)` > > > > and you are right I will have to traverse from it to see whether the > > > > `SizeExpr` is ambiguous or not, where it comes from. > > > > > > > > I want to rely on the `trackExpressionValue()` as the `SizeExpr` is > > > > available by `getDynamicSizeExpr()`, so it is one or two lines of code. > > > > > > > > Would you create your own switch-case to see where is the size > > > > expression goes in the allocation and use `trackExpressionValue()` on > > > > it? So that you do not store information in the global state which > > > > results in better run-time / less memory. > > > > > > > > At first I really wanted to model `malloc()` and `realloc()` and stuff, > > > > then I realized the `MallocChecker` provides every information I need. > > > > Would it be a better idea to create my own tiny `MallocChecker` inside > > > > my checker which does nothing but marks the size expression being > > > > interesting with `NoteTags`? > > > > > > > > Also I am thinking of a switch-case on the `DefinedOrUnknownSVal Size` > > > > which somewhere has an expression inside it which I could > > > > `trackExpressionValue()` on. > > > > > > > > Basically we are missing the rules what to use and I have picked the > > > > easiest solution. Could you share please which would be the right > > > > direction for such a simple task? > > > > I want to rely on the `trackExpressionValue()` as the `SizeExpr` is > > > > available by `getDynamicSizeExpr()`, so it is one or two lines of code. > > > > > > This won't work. `trackExpressionValue()` can only track an active > > > expression (that has, or at least should have, a value in the bug-node's > > > environment). You'll have to make a visitor or a note tag. > > > > > > You can either make your own visitor (which will detect the node in which > > > the extent information becomes present), or convert `MallocChecker` to > > > use note tags and then inter-operate with those tags (though the > > > interestingness map - "i mark the symbol as interesting so i'm interested > > > in highlighting the allocation site" - or a similar mechanism). The > > > second approach is more work because no such interoperation has ever been > > > implemented yet, but it should be pretty rewarding for the future. > > > This won't work. trackExpressionValue() can only track an active > > > expression (that has, or at least should have, a value in the bug-node's > > > environment). You'll have to make a visitor or a note tag. > > So because most likely after the `malloc()` the `size` symbol dies, the > > `trackExpressionValue()` cannot track dead symbols? Because we could make > > the `size` dying base on the `buffer`, we have some dependency logic for > > that. It also represents the truth, the size is part of that memory block's > > region. After that we could track the expression of the `size`? > > So because most likely after the malloc() the size symbol dies...? > > After the `malloc()` is consumed, the size //expression// dies and gets > cleaned up from the //Environment//. The symbol will only die if the value > wasn't put into the //Store// in the process of modeling the statement that > consumes the `malloc()` expression (such as an assignment). But > `trackExpressionValue()` can only track live (active) expressions. I see. Now I have tried out what we have. The `trackExpressionValue()` has a lookup to see where is the expression available: ```lang=c++ /// Find the ExplodedNode where the lvalue (the value of 'Ex') /// was computed. static const ExplodedNode* findNodeForExpression(const ExplodedNode *N, const Expr *Inner) { while (N) { if (N->getStmtForDiagnostics() == Inner) return N; N = N->getFirstPred(); } return N; } ``` from that point the expression was alive, and tracking is fine. ------------- The `InnerPointerChecker` has introduced a place: `AllocationState.h` to communicate with the `MallocBugVisitor`. I believe this is the simplest way to communicate. ================ Comment at: clang/lib/StaticAnalyzer/Checkers/cert/StrChecker.cpp:184 + if (IsFix) { + if (Optional<std::string> SizeStr = getSizeExprAsString(Call, CallC, C)) { + renameFunctionFix(UseSafeFunctions ? "gets_s" : "fgets", Call, *Report); ---------------- NoQ wrote: > Charusso wrote: > > NoQ wrote: > > > Also, which is probably more important, you will never be able to provide > > > a fixit for the malloced memory case, because there may be multiple > > > execution paths that reach the current point with different size > > > expressions (in fact, not necessarily all of them are malloced). > > > > > > Eg.: > > > ```lang=c > > > char *x = 0; > > > char y[10]; > > > > > > if (coin()) { > > > x = malloc(20); > > > } else { > > > x = y; > > > } > > > > > > gets(x); > > > ``` > > > > > > If you suggest replacing `gets(x)` with `gets_s(x, 20)`, you'll still > > > have a buffer overflow on the else-branch on which `x` points to an array > > > of 10 bytes. > > This checker going to evolve a lot, and all of the checked function calls > > have issues like that. I do not even think what else issues they have. I > > would like to cover the false alarm suppression when we are about to alarm. > > Is it would be okay? I really would like to see alarms first. > > > > For example, I have seen stuff in the wild so that I can state out 8-param > > allocators and we need to rely on the checkers provide information about > > allocation. > *summons @Szelethus* > > Apart from the obviously syntactic cases, you might actually be able to > implement fixits for the situation when the reaching-definitions analysis > displays exactly one definition for `x`, which additionally coincides with > the allocation site. If that definition is a simple assignment, you'll be > able to re-run the reaching definitions analysis for the RHS of that > assignment. If that definition comes from a function call, you might be able > to re-run the reaching definitions analysis on the return statement(s) of > that function (note that this function must have been inlined during > path-sensitive analysis, otherwise no definition in it would coincide with > the allocation site). And so on. > > This problem sheds some light on how much do we want to make the reaching > definitions analysis inter-procedural. My current guess is that we probably > don't need to; we'd rather have this guided by re-running the > reaching-definitions analysis based on the path-sensitive report data, than > have the reaching-definitions analysis be inter-procedural on our own. That is a cool idea! I hope @Szelethus has time for his project. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D69813/new/ https://reviews.llvm.org/D69813 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits