[clang] [analyzer] Fix crash of StreamChecker when eval calling 'fopen' (PR #100990)
@@ -1766,13 +1770,6 @@ are assumed to succeed.) fclose(p); } -**Limitations** - -The checker does not track the correspondence between integer file descriptors -and ``FILE *`` pointers. Operations on standard streams like ``stdin`` are not -treated specially and are therefore often not recognized (because these streams -are usually not opened explicitly by the program, and are global variables). NagyDonat wrote: It refers to functions like `write` [(man page)](https://www.man7.org/linux/man-pages/man2/write.2.html) where the file is specified by an `int` argument. https://github.com/llvm/llvm-project/pull/100990 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Fix crash of StreamChecker when eval calling 'fopen' (PR #100990)
@@ -1703,7 +1703,11 @@ are detected: * Invalid 3rd ("``whence``") argument to ``fseek``. The stream operations are by this checker usually split into two cases, a success -and a failure case. However, in the case of write operations (like ``fwrite``, +and a failure case. +On the success case it also assumes that the current value of ``stdout``, +``stderr``, or ``stdin`` can't be equal to the file pointer returned by ``fopen``. + +In the case of write operations (like ``fwrite``, NagyDonat wrote: Ok, then feel free to keep the short diff. https://github.com/llvm/llvm-project/pull/100990 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Fix crash of StreamChecker when eval calling 'fopen' (PR #100990)
NagyDonat wrote: > I decided to put the fixup NFC changes along with this PR (the ones were > submitted after I merged the original commit), but on hindsight probably it > would be better to merge those NFC changes separately. If you request, I'll > split the PR. Feel free to keep the NFC changes in this commit, just briefly mention them in the description ("This commit also contains some minor NFC code quality improvements." or something similar.) https://github.com/llvm/llvm-project/pull/100990 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Fix crash of StreamChecker when eval calling 'fopen' (PR #100990)
@@ -1703,7 +1703,11 @@ are detected: * Invalid 3rd ("``whence``") argument to ``fseek``. The stream operations are by this checker usually split into two cases, a success -and a failure case. However, in the case of write operations (like ``fwrite``, +and a failure case. +On the success case it also assumes that the current value of ``stdout``, +``stderr``, or ``stdin`` can't be equal to the file pointer returned by ``fopen``. + +In the case of write operations (like ``fwrite``, NagyDonat wrote: Personally I'd re-wrap the lines; but I can accept this if you think that it's important to minimize the diff footprint. https://github.com/llvm/llvm-project/pull/100990 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Fix crash of StreamChecker when eval calling 'fopen' (PR #100990)
@@ -1766,13 +1770,6 @@ are assumed to succeed.) fclose(p); } -**Limitations** - -The checker does not track the correspondence between integer file descriptors -and ``FILE *`` pointers. Operations on standard streams like ``stdin`` are not -treated specially and are therefore often not recognized (because these streams -are usually not opened explicitly by the program, and are global variables). NagyDonat wrote: Hmm, perhaps keep a note saying "This checker does not track operations that use integer file descriptors instead of `FILE *` pointers". (If I recall correctly this is true and might be relevant for someone who's used to work with low-level stuff.) However it's also completely fine (from my POV) if you delete the whole Limitations section. https://github.com/llvm/llvm-project/pull/100990 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Fix crash of StreamChecker when eval calling 'fopen' (PR #100990)
https://github.com/NagyDonat approved this pull request. LGTM. https://github.com/llvm/llvm-project/pull/100990 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Fix crash of StreamChecker when eval calling 'fopen' (PR #100990)
https://github.com/NagyDonat edited https://github.com/llvm/llvm-project/pull/100990 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Keep alive short-circuiting condition subexpressions in a conditional (PR #100745)
https://github.com/NagyDonat approved this pull request. Nice fix, thanks for upstreaming this! https://github.com/llvm/llvm-project/pull/100745 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang][ASTImporter] Fix import of template parameter default values. (PR #100100)
@@ -359,6 +359,31 @@ namespace clang { Params, Importer.getToContext().getTranslationUnitDecl()); } +template +void tryUpdateTemplateParmDeclInheritedFrom(NamedDecl *RecentParm, +NamedDecl *NewParm) { + if (auto *ParmT = dyn_cast(RecentParm)) { +if (ParmT->hasDefaultArgument()) { + auto *P = cast(NewParm); + P->removeDefaultArgument(); + P->setInheritedDefaultArgument(Importer.ToContext, ParmT); +} + } +} + +void updateTemplateParametersInheritedFrom( NagyDonat wrote: Thanks for the clarifications! https://github.com/llvm/llvm-project/pull/100100 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang][ASTImporter] Fix import of template parameter default values. (PR #100100)
https://github.com/NagyDonat approved this pull request. https://github.com/llvm/llvm-project/pull/100100 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer][NFC] Eliminate a dyn_cast (PR #100719)
https://github.com/NagyDonat approved this pull request. https://github.com/llvm/llvm-project/pull/100719 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer][NFC] Minor cleanup in two test files. (PR #100570)
NagyDonat wrote: In my opinion the "put unrelated things into separate commits" guideline is just a corollary of the "don't create commits that are too complex" rule. If a change is extremely trivial, I'm actively trying to aggregate it with other tangentially related changes, because one simple commit is easier to read and handle than two (or five or ten) even smaller commits. https://github.com/llvm/llvm-project/pull/100570 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer][NFC] Minor cleanup in two test files. (PR #100570)
https://github.com/NagyDonat closed https://github.com/llvm/llvm-project/pull/100570 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer][NFC] Minor cleanup in two test files. (PR #100570)
https://github.com/NagyDonat created https://github.com/llvm/llvm-project/pull/100570 This commit contains two unrelated trivial changes: (1) Three unused variables are removed from `ctor.mm`. (2) A FIXME block is removed from `ctor-array.cpp` because it described an issue that was resolved since then. From 15fae768b0ad2412858520a137a51e7190a591cd Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Don=C3=A1t=20Nagy?= Date: Thu, 25 Jul 2024 15:43:05 +0200 Subject: [PATCH] [analyzer][NFC] Minor cleanup in two test files. This commit contains two unrelated trivial changes: (1) Three unused variables are removed from `ctor.mm`. (2) A FIXME block is removed from `ctor-array.cpp` because it described an issue that was resolved since then. --- clang/test/Analysis/ctor-array.cpp | 12 +--- clang/test/Analysis/ctor.mm| 2 -- 2 files changed, 1 insertion(+), 13 deletions(-) diff --git a/clang/test/Analysis/ctor-array.cpp b/clang/test/Analysis/ctor-array.cpp index 053669cc2aada..49412ee5a68c7 100644 --- a/clang/test/Analysis/ctor-array.cpp +++ b/clang/test/Analysis/ctor-array.cpp @@ -1,4 +1,4 @@ -// RUN: %clang_analyze_cc1 -analyzer-checker=core,debug.ExprInspection -analyzer-disable-checker=cplusplus -analyzer-config c++-inlining=constructors -verify %s +// RUN: %clang_analyze_cc1 -analyzer-checker=core,debug.ExprInspection -analyzer-config c++-inlining=constructors -verify %s #include "Inputs/system-header-simulator-cxx.h" @@ -119,16 +119,6 @@ struct s5 { }; void g1(void) { - // FIXME: This test requires -analyzer-disable-checker=cplusplus, - // because of the checker's weird behaviour in case of arrays. - // E.g.: - //s3 *arr = new s3[4]; - //s3 *arr2 = new (arr + 1) s3[1]; - // ^~~ - // warning: 12 bytes is possibly not enough - //for array allocation which requires - //4 bytes. - s5::c = 0; s5 *arr = new s5[4]; new (arr + 1) s5[3]; diff --git a/clang/test/Analysis/ctor.mm b/clang/test/Analysis/ctor.mm index fb385833df9c7..6ac9050fc29f7 100644 --- a/clang/test/Analysis/ctor.mm +++ b/clang/test/Analysis/ctor.mm @@ -56,8 +56,6 @@ void testNonPODCopyConstructor() { namespace ConstructorVirtualCalls { class A { public: -int *out1, *out2, *out3; - virtual int get() { return 1; } A(int *out1) { ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Don't invalidate the super region when a std object ctor runs (PR #100405)
@@ -923,12 +923,31 @@ SVal AnyCXXConstructorCall::getCXXThisVal() const { return UnknownVal(); } +static bool isWithinStdNamespace(const Decl *D) { NagyDonat wrote: Of course, feel free to leave it for a followup patch. https://github.com/llvm/llvm-project/pull/100405 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Don't invalidate the super region when a std object ctor runs (PR #100405)
@@ -923,12 +923,31 @@ SVal AnyCXXConstructorCall::getCXXThisVal() const { return UnknownVal(); } +static bool isWithinStdNamespace(const Decl *D) { NagyDonat wrote: I think this function could be useful for other checkers as well; consider moving this to an externally visible location (e.g. turn this into a method of `Decl`). https://github.com/llvm/llvm-project/pull/100405 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Don't invalidate the super region when a std object ctor runs (PR #100405)
https://github.com/NagyDonat edited https://github.com/llvm/llvm-project/pull/100405 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Don't invalidate the super region when a std object ctor runs (PR #100405)
https://github.com/NagyDonat approved this pull request. LGTM, nice improvement. https://github.com/llvm/llvm-project/pull/100405 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang][ASTImporter] Fix import of template parameter default values. (PR #100100)
https://github.com/NagyDonat edited https://github.com/llvm/llvm-project/pull/100100 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang][ASTImporter] Fix import of template parameter default values. (PR #100100)
@@ -359,6 +359,31 @@ namespace clang { Params, Importer.getToContext().getTranslationUnitDecl()); } +template +void tryUpdateTemplateParmDeclInheritedFrom(NamedDecl *RecentParm, +NamedDecl *NewParm) { + if (auto *ParmT = dyn_cast(RecentParm)) { +if (ParmT->hasDefaultArgument()) { + auto *P = cast(NewParm); + P->removeDefaultArgument(); + P->setInheritedDefaultArgument(Importer.ToContext, ParmT); +} + } +} + +void updateTemplateParametersInheritedFrom( NagyDonat wrote: ```suggestion /// Update the parameter list `NewParams` of a template declaration by /// "inheriting" default argument values from `RecentParams`, which is /// the parameter list of an earlier declaration of the same template. /// (Note that "inheriting" default argument values this way is /// completely unrelated to the usual object-oriented "inheritance" /// relationship between classes.) /// /// Note that the standard specifies that the same parameter cannot be /// given default arguments twice in the same scope; but in our case /// the parameter list `NewParams` is freshly imported from a /// different TU, so it's possible that both `RecentParams` and /// `NewParams` specify a default argument for the same parameter. /// In this case, the code will use the default argument taken from /// `RecentParams`. /// FIXME: Report an error if the two default arguments are different. void updateTemplateParametersInheritedFrom( ``` I felt that it is important to explain that the "Inherited" within the name of this function is unrelated to the usual object-oriented "inheritance" between classes; and as I started to write this comment, I also ended up documenting the goals of this function. (I hope that I was correct -- please fix this comment block if I misunderstood something!) https://github.com/llvm/llvm-project/pull/100100 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang][ASTImporter] Fix import of template parameter default values. (PR #100100)
https://github.com/NagyDonat edited https://github.com/llvm/llvm-project/pull/100100 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang][ASTImporter] Fix import of template parameter default values. (PR #100100)
https://github.com/NagyDonat approved this pull request. LGTM, I'm very grateful that you can understand and resolve these tricky `ASTImporter` issues. https://github.com/llvm/llvm-project/pull/100100 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Assume the result of 'fopen' can't alias with 'std{in,out,err}' (PR #100085)
https://github.com/NagyDonat approved this pull request. LGTM, this is a nice improvement. I vaguely recall that a few months ago somebody else on our team had trouble with false positives similar false positives, so it's good to see that these will be fixed. https://github.com/llvm/llvm-project/pull/100085 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Assume the result of 'fopen' can't alias with 'std{in,out,err}' (PR #100085)
@@ -451,6 +462,10 @@ class StreamChecker : public Checker` In your code the three standard streams have exactly identical roles (as far as I see), and I think it would be good to emphasize this by storing them in a three-element array instead of three separate independently named variables. For example, I could imagine a solution like ```c++ const char *StdStreamNames[3] = {"stdin", "stdout", "stderr"}; mutable const VarDecl *StdStreamDecls[3] = {}; // ... in checkASTDecl() for (int i = 0; i < 3; i++) { // inline getGlobalStreamPointerByName here to initialize // StdStreamDecls[i] via StdStreamNames[i] } // ... in assumeNoAliasingWithStdStreams() for (const VarDecl *Var : StdStreamDecls) { // put the definition of the lambda `assumeRetNE` here } ``` (This is just a rough draft, there might be more idiomatic/elegant solutions for some parts.) `` https://github.com/llvm/llvm-project/pull/100085 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Assume the result of 'fopen' can't alias with 'std{in,out,err}' (PR #100085)
https://github.com/NagyDonat edited https://github.com/llvm/llvm-project/pull/100085 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Model builtin-like functions as builtin functions (PR #99886)
@@ -570,13 +570,8 @@ void differentBranchesTest(int i) { { A a; a.foo() > 0 ? a.foo() : A(std::move(a)).foo(); -#ifdef DFS -// peaceful-note@-2 {{Assuming the condition is false}} -// peaceful-note@-3 {{'?' condition is false}} -#else -// peaceful-note@-5 {{Assuming the condition is true}} -// peaceful-note@-6 {{'?' condition is true}} -#endif NagyDonat wrote: I see, thanks for the explanation. I agree that this isn't an important question and simply adjusting the TC is the right solution. https://github.com/llvm/llvm-project/pull/99886 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Model builtin-like functions as builtin functions (PR #99886)
https://github.com/NagyDonat commented: LGTM. https://github.com/llvm/llvm-project/pull/99886 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Improve bug report hashing, merge similar reports (PR #98621)
https://github.com/NagyDonat closed https://github.com/llvm/llvm-project/pull/98621 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Improve bug report hashing, merge similar reports (PR #98621)
https://github.com/NagyDonat updated https://github.com/llvm/llvm-project/pull/98621 From 2765bc97d3242d50fd73aedb9e9d38dfdcef814c Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Don=C3=A1t=20Nagy?= Date: Fri, 12 Jul 2024 13:57:53 +0200 Subject: [PATCH 1/3] [analyzer] Don't display the offset value in underflows Previously alpha.security.ArrayBoundV2 displayed the (negative) offset value when it reported an underflow, but this produced lots of very similar and redundant reports in certain situations. After this commit the offset won't be printed so the usual deduplication will handle these reports as equivalent (and print only one of them). See https://github.com/llvm/llvm-project/issues/86969 for background. --- .../Checkers/ArrayBoundCheckerV2.cpp | 15 ++- .../test/Analysis/out-of-bounds-diagnostics.c | 26 +-- 2 files changed, 32 insertions(+), 9 deletions(-) diff --git a/clang/lib/StaticAnalyzer/Checkers/ArrayBoundCheckerV2.cpp b/clang/lib/StaticAnalyzer/Checkers/ArrayBoundCheckerV2.cpp index f82288f1099e8..f177041abc411 100644 --- a/clang/lib/StaticAnalyzer/Checkers/ArrayBoundCheckerV2.cpp +++ b/clang/lib/StaticAnalyzer/Checkers/ArrayBoundCheckerV2.cpp @@ -374,13 +374,14 @@ static std::optional getConcreteValue(std::optional SV) { static Messages getPrecedesMsgs(const SubRegion *Region, NonLoc Offset) { std::string RegName = getRegionName(Region); - SmallString<128> Buf; - llvm::raw_svector_ostream Out(Buf); - Out << "Access of " << RegName << " at negative byte offset"; - if (auto ConcreteIdx = Offset.getAs()) -Out << ' ' << ConcreteIdx->getValue(); + + // We're not reporting the Offset, because we don't want to spam the user + // with similar reports that differ only in different offset values. + // See https://github.com/llvm/llvm-project/issues/86969 for details. + (void)Offset; + return {formatv("Out of bound access to memory preceding {0}", RegName), - std::string(Buf)}; + formatv("Access of {0} at negative byte offset", RegName)}; } /// Try to divide `Val1` and `Val2` (in place) by `Divisor` and return true if @@ -609,7 +610,7 @@ void ArrayBoundCheckerV2::performCheck(const Expr *E, CheckerContext ) const { // CHECK UPPER BOUND DefinedOrUnknownSVal Size = getDynamicExtent(State, Reg, SVB); if (auto KnownSize = Size.getAs()) { -// In a situation where both overflow and overflow are possible (but the +// In a situation where both underflow and overflow are possible (but the // index is either tainted or known to be invalid), the logic of this // checker will first assume that the offset is non-negative, and then // (with this additional assumption) it will detect an overflow error. diff --git a/clang/test/Analysis/out-of-bounds-diagnostics.c b/clang/test/Analysis/out-of-bounds-diagnostics.c index 92f983d8b1561..10cfe95a79a55 100644 --- a/clang/test/Analysis/out-of-bounds-diagnostics.c +++ b/clang/test/Analysis/out-of-bounds-diagnostics.c @@ -6,7 +6,7 @@ int TenElements[10]; void arrayUnderflow(void) { TenElements[-3] = 5; // expected-warning@-1 {{Out of bound access to memory preceding 'TenElements'}} - // expected-note@-2 {{Access of 'TenElements' at negative byte offset -12}} + // expected-note@-2 {{Access of 'TenElements' at negative byte offset}} } int underflowWithDeref(void) { @@ -14,7 +14,29 @@ int underflowWithDeref(void) { --p; return *p; // expected-warning@-1 {{Out of bound access to memory preceding 'TenElements'}} - // expected-note@-2 {{Access of 'TenElements' at negative byte offset -4}} + // expected-note@-2 {{Access of 'TenElements' at negative byte offset}} +} + +int rng(void); +int getIndex(void) { + switch (rng()) { +case 1: return -152; +case 2: return -160; +case 3: return -168; +default: return -172; + } +} + +void gh86959(void) { + // Previously code like this produced many almost-identical bug reports that + // only differed in the byte offset value (which was reported by the checker + // at that time). Verify that now we only see one report. + + // expected-note@+1 {{Entering loop body}} + while (rng()) +TenElements[getIndex()] = 10; + // expected-warning@-1 {{Out of bound access to memory preceding 'TenElements'}} + // expected-note@-2 {{Access of 'TenElements' at negative byte offset}} } int scanf(const char *restrict fmt, ...); From d7b71a21368c70da21b2ca6e7d02c7022ec21dde Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Don=C3=A1t=20Nagy?= Date: Fri, 12 Jul 2024 21:11:22 +0200 Subject: [PATCH 2/3] Alternative approach: tweak the `Profile()`` method of `BugReport`s This commit re-adds the concrete offset value at the end of the (long) `Description` of the underflow report, but ensures that the `Profile()` method of `PathSensitiveBugReport` only uses the _short_ description (as returned by `getShortDescription()`) instead of the the `Description` field. For the sake of consistency, the same
[clang] [analyzer] Improve bug report hashing, merge similar reports (PR #98621)
@@ -2213,7 +2213,7 @@ void BasicBugReport::Profile(llvm::FoldingSetNodeID& hash) const { void PathSensitiveBugReport::Profile(llvm::FoldingSetNodeID ) const { hash.AddInteger(static_cast(getKind())); hash.AddPointer(); - hash.AddString(Description); + hash.AddString(getShortDescription()); NagyDonat wrote: Yes, it's intentional, because the method call `getShortDescription()` is equivalent to `ShortDescripton.empty() ? Description : ShortDescription`. (The common case when the short and full descriptions are identical is represented internally by an empty `ShortDescription` and the shared value stored in `Description`.) https://github.com/llvm/llvm-project/pull/98621 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Improve bug report hashing, merge similar reports (PR #98621)
NagyDonat wrote: (Btw it seems that I accidentally added everyone as an "assignee" instead of a reviewer...) https://github.com/llvm/llvm-project/pull/98621 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Improve bug report hashing, merge similar reports (PR #98621)
https://github.com/NagyDonat unassigned https://github.com/llvm/llvm-project/pull/98621 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Improve bug report hashing, merge similar reports (PR #98621)
https://github.com/NagyDonat unassigned https://github.com/llvm/llvm-project/pull/98621 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Improve bug report hashing, merge similar reports (PR #98621)
https://github.com/NagyDonat unassigned https://github.com/llvm/llvm-project/pull/98621 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Improve bug report hashing, merge similar reports (PR #98621)
NagyDonat wrote: I'm back from the vacation, so I would like to restart and conclude this review process. @steakhal or anybody else please review when it's convenient for you. https://github.com/llvm/llvm-project/pull/98621 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang][ASTImporter] Fix import of anonymous enums if multiple are present (PR #99281)
https://github.com/NagyDonat approved this pull request. LGTM as far as I can judge this tricky situation. https://github.com/llvm/llvm-project/pull/99281 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Improve bug report hashing, merge similar reports (PR #98621)
https://github.com/NagyDonat edited https://github.com/llvm/llvm-project/pull/98621 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Improve bug report hashing, merge similar reports (PR #98621)
@@ -373,14 +373,14 @@ static std::optional getConcreteValue(std::optional SV) { } static Messages getPrecedesMsgs(const SubRegion *Region, NonLoc Offset) { - std::string RegName = getRegionName(Region); - SmallString<128> Buf; - llvm::raw_svector_ostream Out(Buf); - Out << "Access of " << RegName << " at negative byte offset"; - if (auto ConcreteIdx = Offset.getAs()) -Out << ' ' << ConcreteIdx->getValue(); NagyDonat wrote: This `raw_svector_ostream` solution is very clunky, so I replaced it with a less ugly alternative when I re-added the code that prints the offset. https://github.com/llvm/llvm-project/pull/98621 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Improve bug report hashing, merge similar reports (PR #98621)
https://github.com/NagyDonat edited https://github.com/llvm/llvm-project/pull/98621 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Improve bug report hashing, merge similar reports (PR #98621)
https://github.com/NagyDonat edited https://github.com/llvm/llvm-project/pull/98621 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Don't display the offset value in underflows (PR #98621)
NagyDonat wrote: _(Technical detail: I'll be on vacation during the next week, so I won't see updates on this PR until the 22nd of July. If you want to merge this PR, feel free to do so, but don't forget to adjust the description and the title to accurately reflect the current state of the PR.)_ https://github.com/llvm/llvm-project/pull/98621 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Don't display the offset value in underflows (PR #98621)
NagyDonat wrote: > I wonder if we could have something in between. I'm thinking of having the > concrete offset as a separate note, instead of having it part of the primary > message. That way after BR selection, we would still deterministically pick > the shortest parh, and also have the offset in the path notes. Actually I had a better solution for achieving this: during my earlier experimentation I noticed that the `Profile()` methods of the `BugReporter` use the (long) `Description` field (which is "Access of `` at negative byte offset -124", the final note on the bug path), but ignore the `ShortDescription` field (which is "Out of bound access to memory preceding ``", the stand-alone warning message). In the commit that I just pushed, I tweaked `Profile()` to ensure that it uses the short description and ignores the long one -- which yields the right deduplication behavior. I'd guess that `ShortDescription` was missing from the `Profile()` method because it was introduced at a later point in the development, and there are very few checkers that actually set different short and long descriptions. (For more details see the commit message of the commit that I just pushed onto this PR.) https://github.com/llvm/llvm-project/pull/98621 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Don't display the offset value in underflows (PR #98621)
https://github.com/NagyDonat updated https://github.com/llvm/llvm-project/pull/98621 From 2765bc97d3242d50fd73aedb9e9d38dfdcef814c Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Don=C3=A1t=20Nagy?= Date: Fri, 12 Jul 2024 13:57:53 +0200 Subject: [PATCH 1/2] [analyzer] Don't display the offset value in underflows Previously alpha.security.ArrayBoundV2 displayed the (negative) offset value when it reported an underflow, but this produced lots of very similar and redundant reports in certain situations. After this commit the offset won't be printed so the usual deduplication will handle these reports as equivalent (and print only one of them). See https://github.com/llvm/llvm-project/issues/86969 for background. --- .../Checkers/ArrayBoundCheckerV2.cpp | 15 ++- .../test/Analysis/out-of-bounds-diagnostics.c | 26 +-- 2 files changed, 32 insertions(+), 9 deletions(-) diff --git a/clang/lib/StaticAnalyzer/Checkers/ArrayBoundCheckerV2.cpp b/clang/lib/StaticAnalyzer/Checkers/ArrayBoundCheckerV2.cpp index f82288f1099e8..f177041abc411 100644 --- a/clang/lib/StaticAnalyzer/Checkers/ArrayBoundCheckerV2.cpp +++ b/clang/lib/StaticAnalyzer/Checkers/ArrayBoundCheckerV2.cpp @@ -374,13 +374,14 @@ static std::optional getConcreteValue(std::optional SV) { static Messages getPrecedesMsgs(const SubRegion *Region, NonLoc Offset) { std::string RegName = getRegionName(Region); - SmallString<128> Buf; - llvm::raw_svector_ostream Out(Buf); - Out << "Access of " << RegName << " at negative byte offset"; - if (auto ConcreteIdx = Offset.getAs()) -Out << ' ' << ConcreteIdx->getValue(); + + // We're not reporting the Offset, because we don't want to spam the user + // with similar reports that differ only in different offset values. + // See https://github.com/llvm/llvm-project/issues/86969 for details. + (void)Offset; + return {formatv("Out of bound access to memory preceding {0}", RegName), - std::string(Buf)}; + formatv("Access of {0} at negative byte offset", RegName)}; } /// Try to divide `Val1` and `Val2` (in place) by `Divisor` and return true if @@ -609,7 +610,7 @@ void ArrayBoundCheckerV2::performCheck(const Expr *E, CheckerContext ) const { // CHECK UPPER BOUND DefinedOrUnknownSVal Size = getDynamicExtent(State, Reg, SVB); if (auto KnownSize = Size.getAs()) { -// In a situation where both overflow and overflow are possible (but the +// In a situation where both underflow and overflow are possible (but the // index is either tainted or known to be invalid), the logic of this // checker will first assume that the offset is non-negative, and then // (with this additional assumption) it will detect an overflow error. diff --git a/clang/test/Analysis/out-of-bounds-diagnostics.c b/clang/test/Analysis/out-of-bounds-diagnostics.c index 92f983d8b1561..10cfe95a79a55 100644 --- a/clang/test/Analysis/out-of-bounds-diagnostics.c +++ b/clang/test/Analysis/out-of-bounds-diagnostics.c @@ -6,7 +6,7 @@ int TenElements[10]; void arrayUnderflow(void) { TenElements[-3] = 5; // expected-warning@-1 {{Out of bound access to memory preceding 'TenElements'}} - // expected-note@-2 {{Access of 'TenElements' at negative byte offset -12}} + // expected-note@-2 {{Access of 'TenElements' at negative byte offset}} } int underflowWithDeref(void) { @@ -14,7 +14,29 @@ int underflowWithDeref(void) { --p; return *p; // expected-warning@-1 {{Out of bound access to memory preceding 'TenElements'}} - // expected-note@-2 {{Access of 'TenElements' at negative byte offset -4}} + // expected-note@-2 {{Access of 'TenElements' at negative byte offset}} +} + +int rng(void); +int getIndex(void) { + switch (rng()) { +case 1: return -152; +case 2: return -160; +case 3: return -168; +default: return -172; + } +} + +void gh86959(void) { + // Previously code like this produced many almost-identical bug reports that + // only differed in the byte offset value (which was reported by the checker + // at that time). Verify that now we only see one report. + + // expected-note@+1 {{Entering loop body}} + while (rng()) +TenElements[getIndex()] = 10; + // expected-warning@-1 {{Out of bound access to memory preceding 'TenElements'}} + // expected-note@-2 {{Access of 'TenElements' at negative byte offset}} } int scanf(const char *restrict fmt, ...); From d7b71a21368c70da21b2ca6e7d02c7022ec21dde Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Don=C3=A1t=20Nagy?= Date: Fri, 12 Jul 2024 21:11:22 +0200 Subject: [PATCH 2/2] Alternative approach: tweak the `Profile()`` method of `BugReport`s This commit re-adds the concrete offset value at the end of the (long) `Description` of the underflow report, but ensures that the `Profile()` method of `PathSensitiveBugReport` only uses the _short_ description (as returned by `getShortDescription()`) instead of the the `Description` field. For the sake of consistency, the same
[clang] [analyzer] Don't display the offset value in underflows (PR #98621)
@@ -609,7 +610,7 @@ void ArrayBoundCheckerV2::performCheck(const Expr *E, CheckerContext ) const { // CHECK UPPER BOUND DefinedOrUnknownSVal Size = getDynamicExtent(State, Reg, SVB); if (auto KnownSize = Size.getAs()) { -// In a situation where both overflow and overflow are possible (but the +// In a situation where both underflow and overflow are possible (but the NagyDonat wrote: This is an unrelated change, I just noticed a mistake (that was introduced in one of my earlier commits). https://github.com/llvm/llvm-project/pull/98621 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Don't display the offset value in underflows (PR #98621)
https://github.com/NagyDonat created https://github.com/llvm/llvm-project/pull/98621 Previously alpha.security.ArrayBoundV2 displayed the (negative) offset value when it reported an underflow, but this produced lots of very similar and redundant reports in certain situations. After this commit the offset won't be printed so the usual deduplication will handle these reports as equivalent (and print only one of them). See https://github.com/llvm/llvm-project/issues/86969 for background. From 2765bc97d3242d50fd73aedb9e9d38dfdcef814c Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Don=C3=A1t=20Nagy?= Date: Fri, 12 Jul 2024 13:57:53 +0200 Subject: [PATCH] [analyzer] Don't display the offset value in underflows Previously alpha.security.ArrayBoundV2 displayed the (negative) offset value when it reported an underflow, but this produced lots of very similar and redundant reports in certain situations. After this commit the offset won't be printed so the usual deduplication will handle these reports as equivalent (and print only one of them). See https://github.com/llvm/llvm-project/issues/86969 for background. --- .../Checkers/ArrayBoundCheckerV2.cpp | 15 ++- .../test/Analysis/out-of-bounds-diagnostics.c | 26 +-- 2 files changed, 32 insertions(+), 9 deletions(-) diff --git a/clang/lib/StaticAnalyzer/Checkers/ArrayBoundCheckerV2.cpp b/clang/lib/StaticAnalyzer/Checkers/ArrayBoundCheckerV2.cpp index f82288f1099e8..f177041abc411 100644 --- a/clang/lib/StaticAnalyzer/Checkers/ArrayBoundCheckerV2.cpp +++ b/clang/lib/StaticAnalyzer/Checkers/ArrayBoundCheckerV2.cpp @@ -374,13 +374,14 @@ static std::optional getConcreteValue(std::optional SV) { static Messages getPrecedesMsgs(const SubRegion *Region, NonLoc Offset) { std::string RegName = getRegionName(Region); - SmallString<128> Buf; - llvm::raw_svector_ostream Out(Buf); - Out << "Access of " << RegName << " at negative byte offset"; - if (auto ConcreteIdx = Offset.getAs()) -Out << ' ' << ConcreteIdx->getValue(); + + // We're not reporting the Offset, because we don't want to spam the user + // with similar reports that differ only in different offset values. + // See https://github.com/llvm/llvm-project/issues/86969 for details. + (void)Offset; + return {formatv("Out of bound access to memory preceding {0}", RegName), - std::string(Buf)}; + formatv("Access of {0} at negative byte offset", RegName)}; } /// Try to divide `Val1` and `Val2` (in place) by `Divisor` and return true if @@ -609,7 +610,7 @@ void ArrayBoundCheckerV2::performCheck(const Expr *E, CheckerContext ) const { // CHECK UPPER BOUND DefinedOrUnknownSVal Size = getDynamicExtent(State, Reg, SVB); if (auto KnownSize = Size.getAs()) { -// In a situation where both overflow and overflow are possible (but the +// In a situation where both underflow and overflow are possible (but the // index is either tainted or known to be invalid), the logic of this // checker will first assume that the offset is non-negative, and then // (with this additional assumption) it will detect an overflow error. diff --git a/clang/test/Analysis/out-of-bounds-diagnostics.c b/clang/test/Analysis/out-of-bounds-diagnostics.c index 92f983d8b1561..10cfe95a79a55 100644 --- a/clang/test/Analysis/out-of-bounds-diagnostics.c +++ b/clang/test/Analysis/out-of-bounds-diagnostics.c @@ -6,7 +6,7 @@ int TenElements[10]; void arrayUnderflow(void) { TenElements[-3] = 5; // expected-warning@-1 {{Out of bound access to memory preceding 'TenElements'}} - // expected-note@-2 {{Access of 'TenElements' at negative byte offset -12}} + // expected-note@-2 {{Access of 'TenElements' at negative byte offset}} } int underflowWithDeref(void) { @@ -14,7 +14,29 @@ int underflowWithDeref(void) { --p; return *p; // expected-warning@-1 {{Out of bound access to memory preceding 'TenElements'}} - // expected-note@-2 {{Access of 'TenElements' at negative byte offset -4}} + // expected-note@-2 {{Access of 'TenElements' at negative byte offset}} +} + +int rng(void); +int getIndex(void) { + switch (rng()) { +case 1: return -152; +case 2: return -160; +case 3: return -168; +default: return -172; + } +} + +void gh86959(void) { + // Previously code like this produced many almost-identical bug reports that + // only differed in the byte offset value (which was reported by the checker + // at that time). Verify that now we only see one report. + + // expected-note@+1 {{Entering loop body}} + while (rng()) +TenElements[getIndex()] = 10; + // expected-warning@-1 {{Out of bound access to memory preceding 'TenElements'}} + // expected-note@-2 {{Access of 'TenElements' at negative byte offset}} } int scanf(const char *restrict fmt, ...); ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Splitting TaintPropagation checker into reporting and mode… (PR #98157)
@@ -1046,10 +1044,7 @@ bool GenericTaintChecker::generateReportIfTainted(const Expr *E, StringRef Msg, return false; // Generate diagnostic. NagyDonat wrote: :thinking: Perhaps add an `assert(BT)` here for the sake of paranoia? https://github.com/llvm/llvm-project/pull/98157 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Splitting TaintPropagation checker into reporting and mode… (PR #98157)
NagyDonat wrote: (By the way, this change doesn't have significant user-facing parts, so I don't think that we need to mention it in the release notes.) https://github.com/llvm/llvm-project/pull/98157 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Splitting TaintPropagation checker into reporting and mode… (PR #98157)
https://github.com/NagyDonat approved this pull request. LGTM, thanks for the updates. @steakhal Is it OK for you if we merge this? https://github.com/llvm/llvm-project/pull/98157 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Splitting TaintPropagation checker into reporting and mode… (PR #98157)
@@ -1122,10 +1131,20 @@ void GenericTaintChecker::taintUnsafeSocketProtocol(const CallEvent , } /// Checker registration -void ento::registerGenericTaintChecker(CheckerManager ) { +void ento::registerTaintPropagationChecker(CheckerManager ) { Mgr.registerChecker(); } +bool ento::shouldRegisterTaintPropagationChecker(const CheckerManager ) { + return true; +} + +void ento::registerGenericTaintChecker(CheckerManager ) { + GenericTaintChecker *checker = Mgr.getChecker(); + checker->isTaintReporterCheckerEnabled = true; + checker->reporterCheckerName = Mgr.getCurrentCheckerName(); NagyDonat wrote: Because this way we can access the checker name specified in `Checkers.td` via the function `Mgr.getCurrentCheckerName();`. When the checker class corresponds to just one checker defined in `Checkers.td` we can use the alternative constructor of `BugType` that takes the checker object (`this`) as a first argument and then queries the name from the checker object (which can save a _single_ name automatically). When a single class implements multiple checkers, we need to explicitly pass the right name to the `BugType` constructor, and so we need to either (1) postpone the construction of the `BugType` to a point after the registration or (2) duplicate the checker name between `Checkers.td` and the source file. AFAIK we always choose option (1), but I didn't check this systematically. https://github.com/llvm/llvm-project/pull/98157 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Splitting TaintPropagation checker into reporting and mode… (PR #98157)
@@ -1122,10 +1131,20 @@ void GenericTaintChecker::taintUnsafeSocketProtocol(const CallEvent , } /// Checker registration -void ento::registerGenericTaintChecker(CheckerManager ) { +void ento::registerTaintPropagationChecker(CheckerManager ) { Mgr.registerChecker(); } +bool ento::shouldRegisterTaintPropagationChecker(const CheckerManager ) { + return true; +} + +void ento::registerGenericTaintChecker(CheckerManager ) { + GenericTaintChecker *checker = Mgr.getChecker(); + checker->isTaintReporterCheckerEnabled = true; + checker->reporterCheckerName = Mgr.getCurrentCheckerName(); NagyDonat wrote: We only use that `unique_ptr` has a null / empty state in addition to representing the "real" `BugType` values. However, `std::optional` would be completely sufficient for this goal (here and AFAIK in every checker that uses `unique_ptr`s to juggle multiple `BugType`s). If we really want, in _this_ checker we could use a "bare" `BugType` data member if we leave it uninitialized in the constructor of the checker class (or initialize it with dummy values) and then unconditionally overwrite it with the "real" `BugType` in this `register...` function. (Other checkers have lazy initialization patterns that check the nullness of the `BugType`.) https://github.com/llvm/llvm-project/pull/98157 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Splitting TaintPropagation checker into reporting and mode… (PR #98157)
https://github.com/NagyDonat commented: Overall LGTM, I added some minor remarks in inline comments. Also note that with this change we can finally remove the note ``` The ``alpha.security.taint.TaintPropagation`` checker also needs to be enabled for this checker to give warnings. ``` from the documentation of `optin.taint.TaintedAlloc` (because the `TaintPropagation` modeling checker is now a _dependency_ of it). https://github.com/llvm/llvm-project/pull/98157 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Splitting TaintPropagation checker into reporting and mode… (PR #98157)
@@ -18,7 +21,7 @@ Taint analysis works by checking for the occurrence of special operations during the symbolic execution of the program. Taint analysis defines sources, sinks, and propagation rules. It identifies errors by detecting a flow of information that originates from a taint source, reaches a taint sink, and propagates through the program paths via propagation rules. -A source, sink, or an operation that propagates taint is mainly domain-specific knowledge, but there are some built-in defaults provided by :ref:`alpha-security-taint-TaintPropagation`. +A source, sink, or an operation that propagates taint is mainly domain-specific knowledge, but there are some built-in defaults provided by ``TaintPropagation`` checker. NagyDonat wrote: ```suggestion A source, sink, or an operation that propagates taint is mainly domain-specific knowledge, but there are some built-in defaults provided by the ``TaintPropagation`` checker. ``` https://github.com/llvm/llvm-project/pull/98157 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Splitting TaintPropagation checker into reporting and mode… (PR #98157)
@@ -1122,10 +1131,20 @@ void GenericTaintChecker::taintUnsafeSocketProtocol(const CallEvent , } /// Checker registration -void ento::registerGenericTaintChecker(CheckerManager ) { +void ento::registerTaintPropagationChecker(CheckerManager ) { Mgr.registerChecker(); } +bool ento::shouldRegisterTaintPropagationChecker(const CheckerManager ) { + return true; +} + +void ento::registerGenericTaintChecker(CheckerManager ) { + GenericTaintChecker *checker = Mgr.getChecker(); + checker->isTaintReporterCheckerEnabled = true; + checker->reporterCheckerName = Mgr.getCurrentCheckerName(); NagyDonat wrote: Move the initialization of `BT` to this point -- then it doesn't need to be a `mutable` and you won't need the string data member `reporterCheckerName`. Notes: - You may turn `BT` into a public data member -- its visibility is still limited to this one source file. - `BugType` is a very simple type (essentially a plain struct of three strings), there is no reason to postpone its initialization once you have all the data. - You can use `BugType::getCheckerName()` to get the checker name to construct the `CheckerProgramPointTag`. https://github.com/llvm/llvm-project/pull/98157 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Splitting TaintPropagation checker into reporting and mode… (PR #98157)
@@ -391,8 +392,11 @@ class GenericTaintChecker : public Checker { bool generateReportIfTainted(const Expr *E, StringRef Msg, CheckerContext ) const; + bool isTaintReporterCheckerEnabled = false; + CheckerNameRef reporterCheckerName; + private: - const BugType BT{this, "Use of Untrusted Data", categories::TaintedData}; + mutable std::unique_ptr BT; NagyDonat wrote: I'm a bit sad that this mutable unique_ptr trickery reappears in this checker, but I don't see a better alternative :frowning_face: https://github.com/llvm/llvm-project/pull/98157 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Splitting TaintPropagation checker into reporting and mode… (PR #98157)
https://github.com/NagyDonat edited https://github.com/llvm/llvm-project/pull/98157 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [NFCI][clang][analyzer] Make ProgramStatePartialTrait a template definition (PR #98150)
https://github.com/NagyDonat approved this pull request. LGTM, nice little update :) https://github.com/llvm/llvm-project/pull/98150 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer][NFC] Add some docs for LazyCompoundValue (PR #97407)
@@ -346,6 +352,39 @@ class CompoundVal : public NonLoc { static bool classof(SVal V) { return V.getKind() == CompoundValKind; } }; +/// The simplest example of a concrete compound value is nonloc::CompoundVal, +/// which represents a concrete r-value of an initializer-list or a string. +/// Internally, it contains an llvm::ImmutableList of SVal's stored inside the +/// literal. +/// +/// However, there is another compound value used in the analyzer, which appears +/// much more often during analysis, which is nonloc::LazyCompoundVal. This +/// value is an r-value that represents a snapshot of any structure "as a whole" +/// at a given moment during the analysis. Such value is already quite far from +/// being re- ferred to as "concrete", as many fields inside it would be unknown +/// or symbolic. nonloc::LazyCompoundVal operates by storing two things: +/// * a reference to the TypedValueRegion being snapshotted (yes, it is always +/// typed), and also +/// * a copy of the whole Store object, obtained from the ProgramState in +/// which it was created. NagyDonat wrote: ```suggestion /// * a reference to the whole Store object, obtained from the ProgramState in ///which the nonloc::LazyCompoundVal was created. /// /// Note that the old ProgramState and its Store is kept alive during the /// analysis because these are immutable functional data structures and each new /// Store value is represented as "earlier Store" + "additional binding". ``` What would you think about this phrasing? https://github.com/llvm/llvm-project/pull/97407 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer][NFC] Add some docs for LazyCompoundValue (PR #97407)
@@ -346,6 +352,39 @@ class CompoundVal : public NonLoc { static bool classof(SVal V) { return V.getKind() == CompoundValKind; } }; +/// The simplest example of a concrete compound value is nonloc::CompoundVal, +/// which represents a concrete r-value of an initializer-list or a string. +/// Internally, it contains an llvm::ImmutableList of SVal's stored inside the +/// literal. +/// +/// However, there is another compound value used in the analyzer, which appears +/// much more often during analysis, which is nonloc::LazyCompoundVal. This +/// value is an r-value that represents a snapshot of any structure "as a whole" +/// at a given moment during the analysis. Such value is already quite far from +/// being re- ferred to as "concrete", as many fields inside it would be unknown +/// or symbolic. nonloc::LazyCompoundVal operates by storing two things: +/// * a reference to the TypedValueRegion being snapshotted (yes, it is always +/// typed), and also +/// * a copy of the whole Store object, obtained from the ProgramState in NagyDonat wrote: I do understand functional programming (I had a phase when I learnt Haskell and for a few years I thought that it's the absolutely perfect language :star_struck: -- since then I'm more realistic but I still like it), I was just confused by the fact that this line emphasizes "a copy of the whole Store object" as opposed to the "reference to" in the previous line. In a functional language the word "copy" is practically non-existent (we speak about values, not objects and their identity, so there is no reason to say that this is _a copy of_ e.g. that list -- we simply say that "this is that list"), while in C++ saying "copy" instead of "move" or "pointer/reference" is an important synonym of "we allocate lots of new memory", so I'd still prefer tweaking this line. https://github.com/llvm/llvm-project/pull/97407 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer][NFC] Add some docs for LazyCompoundValue (PR #97407)
@@ -346,6 +352,39 @@ class CompoundVal : public NonLoc { static bool classof(SVal V) { return V.getKind() == CompoundValKind; } }; +/// The simplest example of a concrete compound value is nonloc::CompoundVal, +/// which represents a concrete r-value of an initializer-list or a string. +/// Internally, it contains an llvm::ImmutableList of SVal's stored inside the +/// literal. +/// +/// However, there is another compound value used in the analyzer, which appears +/// much more often during analysis, which is nonloc::LazyCompoundVal. This +/// value is an r-value that represents a snapshot of any structure "as a whole" +/// at a given moment during the analysis. Such value is already quite far from +/// being re- ferred to as "concrete", as many fields inside it would be unknown +/// or symbolic. nonloc::LazyCompoundVal operates by storing two things: +/// * a reference to the TypedValueRegion being snapshotted (yes, it is always +/// typed), and also +/// * a copy of the whole Store object, obtained from the ProgramState in NagyDonat wrote: Perhaps we should mention that `Store` is just a typedef for `const void *`... https://github.com/llvm/llvm-project/pull/97407 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer][NFC] Add some docs for LazyCompoundValue (PR #97407)
@@ -346,6 +352,39 @@ class CompoundVal : public NonLoc { static bool classof(SVal V) { return V.getKind() == CompoundValKind; } }; +/// The simplest example of a concrete compound value is nonloc::CompoundVal, +/// which represents a concrete r-value of an initializer-list or a string. +/// Internally, it contains an llvm::ImmutableList of SVal's stored inside the +/// literal. +/// +/// However, there is another compound value used in the analyzer, which appears +/// much more often during analysis, which is nonloc::LazyCompoundVal. This +/// value is an r-value that represents a snapshot of any structure "as a whole" +/// at a given moment during the analysis. Such value is already quite far from +/// being re- ferred to as "concrete", as many fields inside it would be unknown NagyDonat wrote: ```suggestion /// being referred to as "concrete", as many fields inside it would be unknown ``` https://github.com/llvm/llvm-project/pull/97407 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer][NFC] Add some docs for LazyCompoundValue (PR #97407)
https://github.com/NagyDonat edited https://github.com/llvm/llvm-project/pull/97407 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer][NFC] Add some docs for LazyCompoundValue (PR #97407)
@@ -346,6 +352,39 @@ class CompoundVal : public NonLoc { static bool classof(SVal V) { return V.getKind() == CompoundValKind; } }; +/// The simplest example of a concrete compound value is nonloc::CompoundVal, +/// which represents a concrete r-value of an initializer-list or a string. +/// Internally, it contains an llvm::ImmutableList of SVal's stored inside the +/// literal. +/// +/// However, there is another compound value used in the analyzer, which appears +/// much more often during analysis, which is nonloc::LazyCompoundVal. This +/// value is an r-value that represents a snapshot of any structure "as a whole" +/// at a given moment during the analysis. Such value is already quite far from +/// being re- ferred to as "concrete", as many fields inside it would be unknown +/// or symbolic. nonloc::LazyCompoundVal operates by storing two things: +/// * a reference to the TypedValueRegion being snapshotted (yes, it is always +/// typed), and also +/// * a copy of the whole Store object, obtained from the ProgramState in NagyDonat wrote: Is it truly a _copy_ of the store object? This seems to contradict the "is a very cheap operation" line in the next paragraph...? Or is it a _copied store object_ that _shares the same bulk allocations_ with the original one (but differs in minor details?) https://github.com/llvm/llvm-project/pull/97407 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer][NFC] Add some docs for LazyCompoundValue (PR #97407)
https://github.com/NagyDonat commented: I'm really happy that you decided to document these :smile: https://github.com/llvm/llvm-project/pull/97407 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang][analyzer][doc] Migrate user-related docs from HTML to RST (PR #97034)
https://github.com/NagyDonat edited https://github.com/llvm/llvm-project/pull/97034 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang][analyzer][doc] Migrate user-related docs from HTML to RST (PR #97034)
@@ -0,0 +1,239 @@ +Command Line Usage: scan-build and CodeChecker +== + +This document provides guidelines for running Clang Static Analyzer from the command line on whole projects. +CodeChecker and scan-build are two CLI tools for using CSA on multiple files (tranlation units). +Both provide a way of driving the analyzer, detecting compilation flags, and generating reports. +CodeChecker is more actively maintained, provides heuristics for working with multiple versions of popular compilers and it also comes with a web-based GUI for viewing, filtering, categorizing and suppressing the results. +Therefore CodeChecker is recommended in case you need any of the above features or just more customizability in general. + +Comparison of CodeChecker and scan-build + + +Static Analyzer is by design a GUI tool originally intended to be consumed by the XCode IDE. +Its purpose is to find buggy execution paths in the program, and such paths are very hard to comprehend by looking at a non-interactive standard output. +It is possible, however, to invoke the Static Analyzer from the command line in order to obtain analysis results, and then later view them interactively in a graphical interface. +The following tools are used commonly to run the analyzer from the command line. +Both tools are wrapper scripts to drive the analysis and the underlying invocations of the Clang compiler: + +1. scan-build_ is an old and simple command line tool that emits static analyzer warnings as HTML files while compiling your project. You can view the analysis results in your web browser. +- Useful for individual developers who simply want to view static analysis results at their desk, or in a very simple collaborative environment. +- Works on all major platforms (Windows, Linux, macOS) and is available as a package in many Linux distributions. +- Does not include support for cross-translation-unit analysis. + +2. CodeChecker_ is a driver and web server that runs the Static Analyzer on your projects on demand and maintains a database of issues. +- Perfect for managing large amounts of Static Analyzer warnings in a collaborative environment. +- Generally much more feature-rich than scan-build. +- Supports incremental analysis: Results can be stored in a database, subsequent analysis runs can be compared to list the newly added defects. +- :doc:`CrossTranslationUnit` is supported fully on Linux via CodeChecker. +- Can run clang-tidy checkers too. +- Open source, but out-of-tree, i.e. not part of the LLVM project. + +scan-build +-- + +**scan-build** is a command line utility that enables a user to run the static analyzer over their codebase as part of performing a regular build (from the command line). NagyDonat wrote: The use of "the Static Analyzer" vs "the static analyzer" is inconsistent -- it would be good to pick one of them and use it consistently. https://github.com/llvm/llvm-project/pull/97034 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang][analyzer][doc] Migrate user-related docs from HTML to RST (PR #97034)
@@ -0,0 +1,239 @@ +Command Line Usage: scan-build and CodeChecker +== + +This document provides guidelines for running Clang Static Analyzer from the command line on whole projects. +CodeChecker and scan-build are two CLI tools for using CSA on multiple files (tranlation units). +Both provide a way of driving the analyzer, detecting compilation flags, and generating reports. +CodeChecker is more actively maintained, provides heuristics for working with multiple versions of popular compilers and it also comes with a web-based GUI for viewing, filtering, categorizing and suppressing the results. +Therefore CodeChecker is recommended in case you need any of the above features or just more customizability in general. + +Comparison of CodeChecker and scan-build + + +Static Analyzer is by design a GUI tool originally intended to be consumed by the XCode IDE. NagyDonat wrote: Is this still true? (Even if it's obsolete, we should probably leave it for a follow-up commit.) https://github.com/llvm/llvm-project/pull/97034 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang][analyzer][doc] Migrate user-related docs from HTML to RST (PR #97034)
https://github.com/NagyDonat edited https://github.com/llvm/llvm-project/pull/97034 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang][analyzer][doc] Migrate user-related docs from HTML to RST (PR #97034)
https://github.com/NagyDonat commented: I'm really happy to see that this duplicated documentation is finally cleaned up after so many years :partying_face: I added a few minor remarks in inline comments, but the change looks good overall. As we discussed privately, this change should be (mostly) limited to the format changes, and then separate follow-up commit should eliminate the outdated content (and, perhaps, add new information if that's necessary). I'll update the PR description to emphasize this. https://github.com/llvm/llvm-project/pull/97034 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang][analyzer][doc] Migrate user-related docs from HTML to RST (PR #97034)
@@ -0,0 +1,238 @@ +Command-Line Usage: CodeChecker and scan-build +=== + +This document provides guidelines for running Clang Static Analyzer from the command line on whole projects. +CodeChecker and scan-build are two CLI tools for using CSA on multiple files (tranlation units). +Both provide a way of driving the analyzer, detecting compilation flags, and generating reports. +CodeChecker is more actively maintained, provides heuristics for working with multiple versions of popular compilers and it also comes with a web-based GUI for viewing, filtering, categorizing and suppressing the results. +Therefore CodeChecker is recommended in case you need any of the above features or just more customizability in general. + +Comparison of CodeChecker and scan-build + + +Static Analyzer is by design a GUI tool originally intended to be consumed by the XCode IDE. +Its purpose is to find buggy execution paths in the program, and such paths are very hard to comprehend by looking at a non-interactive standard output. +It is possible, however, to invoke the Static Analyzer from the command line in order to obtain analysis results, and then later view them interactively in a graphical interface. +The following tools are used commonly to run the analyzer from the commandline. +Both tools are wrapper scripts to drive the analysis and the underlying invocations of the Clang compiler: + +1. CodeChecker_ is a driver and web server that runs the Static Analyzer on your projects on demand and maintains a database of issues. NagyDonat wrote: Oh, that's surprising -- I was not familiar with that feature of RST. Naturally this invalidates my concern. https://github.com/llvm/llvm-project/pull/97034 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Check the correct first and last elements in cstring.UninitializedRead (PR #95408)
https://github.com/NagyDonat approved this pull request. LGTM, with the disclaimer that I'd still prefer e.g. "second argument instead of "2nd argument" -- but I won't block the review with this bikeshedding. https://github.com/llvm/llvm-project/pull/95408 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang][analyzer][doc] Migrate user-related docs from HTML to RST (PR #97034)
@@ -0,0 +1,238 @@ +Command-Line Usage: CodeChecker and scan-build +=== + +This document provides guidelines for running Clang Static Analyzer from the command line on whole projects. +CodeChecker and scan-build are two CLI tools for using CSA on multiple files (tranlation units). +Both provide a way of driving the analyzer, detecting compilation flags, and generating reports. +CodeChecker is more actively maintained, provides heuristics for working with multiple versions of popular compilers and it also comes with a web-based GUI for viewing, filtering, categorizing and suppressing the results. +Therefore CodeChecker is recommended in case you need any of the above features or just more customizability in general. + +Comparison of CodeChecker and scan-build + + +Static Analyzer is by design a GUI tool originally intended to be consumed by the XCode IDE. +Its purpose is to find buggy execution paths in the program, and such paths are very hard to comprehend by looking at a non-interactive standard output. +It is possible, however, to invoke the Static Analyzer from the command line in order to obtain analysis results, and then later view them interactively in a graphical interface. +The following tools are used commonly to run the analyzer from the commandline. +Both tools are wrapper scripts to drive the analysis and the underlying invocations of the Clang compiler: + +1. CodeChecker_ is a driver and web server that runs the Static Analyzer on your projects on demand and maintains a database of issues. NagyDonat wrote: It's still strange that there is only one underscore which is _after_ the word 'CodeChecker'. I'd guess that the intent was to italicize 'CodeChecker' and we should add another underscore _before_ this word for the sake of clarity (even if the actual HTML generation can work with this half-broken situation. https://github.com/llvm/llvm-project/pull/97034 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang][analyzer][doc] Migrate checkers-related docs from HTML to RST (PR #97032)
NagyDonat wrote: :thinking: Deleting the html files could break some links on external sites, so I think it would be better to replace them with a very simple "This content was moved to " placeholder. https://github.com/llvm/llvm-project/pull/97032 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang-tools-extra] [clang-tidy] Clarify diagnostics of bugprone-sizeof-expression (PR #95550)
https://github.com/NagyDonat closed https://github.com/llvm/llvm-project/pull/95550 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang-tools-extra] [clang-tidy] Clarify diagnostics of bugprone-sizeof-expression (PR #95550)
NagyDonat wrote: I mentioned this change in the paragraph that was describing my earlier commit (that was also modifying this check). https://github.com/llvm/llvm-project/pull/95550 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang-tools-extra] [clang-tidy] Clarify diagnostics of bugprone-sizeof-expression (PR #95550)
https://github.com/NagyDonat updated https://github.com/llvm/llvm-project/pull/95550 From 06adc063c2388ea534537f5a417751fdf64b22cd Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Don=C3=A1t=20Nagy?= Date: Fri, 14 Jun 2024 15:16:34 +0200 Subject: [PATCH 1/4] [clang-tidy] Clarify diagnostics of bugprone-sizeof-expression ...becasue they were strangely worded and in a few cases outright incorrect. --- .../bugprone/SizeofExpressionCheck.cpp| 22 ++--- .../checkers/bugprone/sizeof-expression-2.c | 12 +-- .../sizeof-expression-any-pointer.cpp | 86 +-- ...on-warn-on-sizeof-pointer-to-aggregate.cpp | 4 +- .../checkers/bugprone/sizeof-expression.cpp | 84 +- 5 files changed, 105 insertions(+), 103 deletions(-) diff --git a/clang-tools-extra/clang-tidy/bugprone/SizeofExpressionCheck.cpp b/clang-tools-extra/clang-tidy/bugprone/SizeofExpressionCheck.cpp index c25ee42d0899a..3afa0e5c39882 100644 --- a/clang-tools-extra/clang-tidy/bugprone/SizeofExpressionCheck.cpp +++ b/clang-tools-extra/clang-tidy/bugprone/SizeofExpressionCheck.cpp @@ -296,7 +296,7 @@ void SizeofExpressionCheck::check(const MatchFinder::MatchResult ) { } else if (const auto *E = Result.Nodes.getNodeAs("sizeof-integer-call")) { diag(E->getBeginLoc(), "suspicious usage of 'sizeof()' on an expression " - "that results in an integer") + "of integer type") << E->getSourceRange(); } else if (const auto *E = Result.Nodes.getNodeAs("sizeof-this")) { diag(E->getBeginLoc(), @@ -314,7 +314,7 @@ void SizeofExpressionCheck::check(const MatchFinder::MatchResult ) { << E->getSourceRange(); } else { diag(E->getBeginLoc(), "suspicious usage of 'sizeof()' on an expression " - "that results in a pointer") + "of pointer type") << E->getSourceRange(); } } else if (const auto *E = Result.Nodes.getNodeAs( @@ -342,31 +342,33 @@ void SizeofExpressionCheck::check(const MatchFinder::MatchResult ) { if (DenominatorSize > CharUnits::Zero() && !NumeratorSize.isMultipleOf(DenominatorSize)) { - diag(E->getOperatorLoc(), "suspicious usage of 'sizeof(...)/sizeof(...)';" + diag(E->getOperatorLoc(), "suspicious usage of 'sizeof(...)/sizeof(...)':" " numerator is not a multiple of denominator") << E->getLHS()->getSourceRange() << E->getRHS()->getSourceRange(); } else if (ElementSize > CharUnits::Zero() && DenominatorSize > CharUnits::Zero() && ElementSize != DenominatorSize) { - diag(E->getOperatorLoc(), "suspicious usage of 'sizeof(...)/sizeof(...)';" -" numerator is not a multiple of denominator") + diag(E->getOperatorLoc(), + "suspicious usage of 'sizeof(array)/sizeof(...)':" + " denominator differs from the size of array elements") << E->getLHS()->getSourceRange() << E->getRHS()->getSourceRange(); } else if (NumTy && DenomTy && NumTy == DenomTy) { - // FIXME: This message is wrong, it should not refer to sizeof "pointer" - // usage (and by the way, it would be to clarify all the messages). diag(E->getOperatorLoc(), - "suspicious usage of sizeof pointer 'sizeof(T)/sizeof(T)'") + "suspicious usage of 'sizeof(...)/sizeof(...)': both expressions " + "have the same type") << E->getLHS()->getSourceRange() << E->getRHS()->getSourceRange(); } else if (!WarnOnSizeOfPointer) { // When 'WarnOnSizeOfPointer' is enabled, these messages become redundant: if (PointedTy && DenomTy && PointedTy == DenomTy) { diag(E->getOperatorLoc(), - "suspicious usage of sizeof pointer 'sizeof(T*)/sizeof(T)'") + "suspicious usage of 'sizeof(...)/sizeof(...)': size of pointer " + "is divided by size of pointed type") << E->getLHS()->getSourceRange() << E->getRHS()->getSourceRange(); } else if (NumTy && DenomTy && NumTy->isPointerType() && DenomTy->isPointerType()) { diag(E->getOperatorLoc(), - "suspicious usage of sizeof pointer 'sizeof(P*)/sizeof(Q*)'") + "suspicious usage of 'sizeof(...)/sizeof(...)': both expressions " + "have pointer types") << E->getLHS()->getSourceRange() << E->getRHS()->getSourceRange(); } } diff --git a/clang-tools-extra/test/clang-tidy/checkers/bugprone/sizeof-expression-2.c b/clang-tools-extra/test/clang-tidy/checkers/bugprone/sizeof-expression-2.c index aef930f2c8fda..b898071a56613 100644 --- a/clang-tools-extra/test/clang-tidy/checkers/bugprone/sizeof-expression-2.c +++ b/clang-tools-extra/test/clang-tidy/checkers/bugprone/sizeof-expression-2.c @@ -34,24 +34,24 @@ int Test5() { int sum = 0; sum += sizeof(); - //
[clang] [clang][analyzer][doc] Migrate user-related docs from HTML to RST (PR #97034)
@@ -0,0 +1,37 @@ +Obtaining the Static Analyzer += + +This page describes how to download and install the analyzer. Once the analyzer is installed, follow the asdf :doc:`CommandLineUsage` on using the commandline to get started analyzing your code. NagyDonat wrote: Nice catch! :laughing: https://github.com/llvm/llvm-project/pull/97034 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Fix crash in Stream checker when using void pointers (PR #97199)
@@ -1034,16 +1034,16 @@ void StreamChecker::preWrite(const FnDescription *Desc, const CallEvent , C.addTransition(State); } -static std::optional getPointeeType(const MemRegion *R) { +static QualType getPointeeType(const MemRegion *R) { if (!R) -return std::nullopt; +return {}; if (const auto *ER = dyn_cast(R)) -return ER->getElementType(); +return ER->getElementType()->getCanonicalTypeUnqualified(); NagyDonat wrote: Consider moving the `->getCanonicalTypeUnqualified()` calls into `escapeByStartIndexAndCount()` because that's the only place where it will be relevant. (This would eliminate the code duplication.) https://github.com/llvm/llvm-project/pull/97199 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Fix crash in Stream checker when using void pointers (PR #97199)
@@ -1034,16 +1034,16 @@ void StreamChecker::preWrite(const FnDescription *Desc, const CallEvent , C.addTransition(State); } -static std::optional getPointeeType(const MemRegion *R) { +static QualType getPointeeType(const MemRegion *R) { if (!R) -return std::nullopt; +return {}; NagyDonat wrote: It's a bit strange that here you return `{}`, while at the end of the function `QualType{}` is explicitly specified. https://github.com/llvm/llvm-project/pull/97199 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Fix crash in Stream checker when using void pointers (PR #97199)
https://github.com/NagyDonat edited https://github.com/llvm/llvm-project/pull/97199 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Fix crash in Stream checker when using void pointers (PR #97199)
https://github.com/NagyDonat approved this pull request. Thanks for the updates! I added few minor comments, but the PR is already good enough to be merged. > > I only noticed that this PR was already merged after posting the review. > > There is no need to revert the commit -- it's better than nothing -- but > > I'd be happy if you created a followup change that also handles the > > testcase that I mentioned. > > I'm not sure what you refer to. This PR is not approved, hence not merged. > Please continue with the review. Oops, sorry -- I was probably confused by the purple "merged" icon in ![image](https://github.com/llvm/llvm-project/assets/43410265/e4b494d4-014f-4b03-942b-00debb90a709) ...but the real reason is that I got up at 4 AM today :sleeping: https://github.com/llvm/llvm-project/pull/97199 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] Reland "[analyzer] Harden safeguards for Z3 query times" (PR #97298)
https://github.com/NagyDonat approved this pull request. This should be re-landed as well; when it was merged earlier the only issue was the incompatibility with old Z3 and that was addressed since then (by bumping the version requirement). https://github.com/llvm/llvm-project/pull/97298 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang][analyzer] MmapWriteExecChecker improvements (PR #97078)
https://github.com/NagyDonat edited https://github.com/llvm/llvm-project/pull/97078 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang][analyzer] MmapWriteExecChecker improvements (PR #97078)
@@ -21,30 +21,55 @@ #include "clang/StaticAnalyzer/Core/PathSensitive/CallDescription.h" #include "clang/StaticAnalyzer/Core/PathSensitive/CallEvent.h" #include "clang/StaticAnalyzer/Core/PathSensitive/CheckerContext.h" +#include "clang/StaticAnalyzer/Core/PathSensitive/CheckerHelpers.h" using namespace clang; using namespace ento; namespace { -class MmapWriteExecChecker : public Checker { +class MmapWriteExecChecker +: public Checker { CallDescription MmapFn{CDM::CLibrary, {"mmap"}, 6}; CallDescription MprotectFn{CDM::CLibrary, {"mprotect"}, 3}; - static int ProtWrite; - static int ProtExec; - static int ProtRead; const BugType BT{this, "W^X check fails, Write Exec prot flags set", "Security"}; + mutable bool FlagsInitialized = false; + mutable int ProtRead = 0x01; + mutable int ProtWrite = 0x02; + mutable int ProtExec = 0x04; + public: + void checkBeginFunction(CheckerContext ) const; void checkPreCall(const CallEvent , CheckerContext ) const; + int ProtExecOv; int ProtReadOv; }; } -int MmapWriteExecChecker::ProtWrite = 0x02; -int MmapWriteExecChecker::ProtExec = 0x04; -int MmapWriteExecChecker::ProtRead = 0x01; +void MmapWriteExecChecker::checkBeginFunction(CheckerContext ) const { + if (FlagsInitialized) +return; + + FlagsInitialized = true; + + const std::optional FoundProtRead = + tryExpandAsInteger("PROT_READ", C.getPreprocessor()); + const std::optional FoundProtWrite = + tryExpandAsInteger("PROT_WRITE", C.getPreprocessor()); + const std::optional FoundProtExec = + tryExpandAsInteger("PROT_EXEC", C.getPreprocessor()); + if (FoundProtRead && FoundProtWrite && FoundProtExec) { +ProtRead = *FoundProtRead; +ProtWrite = *FoundProtWrite; +ProtExec = *FoundProtExec; + } else { +// FIXME: Are these useful? +ProtRead = ProtReadOv; +ProtExec = ProtExecOv; NagyDonat wrote: This code fragment lets the user specify the value of the `PROT_READ` and `PROT_EXEC` macros as checker options: ```cpp Mwec->ProtExecOv = mgr.getAnalyzerOptions() .getCheckerIntegerOption(Mwec, "MmapProtExec"); Mwec->ProtReadOv = mgr.getAnalyzerOptions() .getCheckerIntegerOption(Mwec, "MmapProtRead"); ``` This is a very marginal feature and I wouldn't spend time on implementing it; but now that it's already implemented I'm not sure that we should delete it. https://github.com/llvm/llvm-project/pull/97078 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] Reland "[analyzer][NFC] Reorganize Z3 report refutation" (PR #97265)
https://github.com/NagyDonat edited https://github.com/llvm/llvm-project/pull/97265 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] Reland "[analyzer][NFC] Reorganize Z3 report refutation" (PR #97265)
https://github.com/NagyDonat approved this pull request. Let's merge this again, after the Z3 version there shouldn't be any additional problems. https://github.com/llvm/llvm-project/pull/97265 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang][analyzer] MmapWriteExecChecker improvements (PR #97078)
https://github.com/NagyDonat edited https://github.com/llvm/llvm-project/pull/97078 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang][analyzer] MmapWriteExecChecker improvements (PR #97078)
@@ -21,30 +21,55 @@ #include "clang/StaticAnalyzer/Core/PathSensitive/CallDescription.h" #include "clang/StaticAnalyzer/Core/PathSensitive/CallEvent.h" #include "clang/StaticAnalyzer/Core/PathSensitive/CheckerContext.h" +#include "clang/StaticAnalyzer/Core/PathSensitive/CheckerHelpers.h" using namespace clang; using namespace ento; namespace { -class MmapWriteExecChecker : public Checker { +class MmapWriteExecChecker +: public Checker { CallDescription MmapFn{CDM::CLibrary, {"mmap"}, 6}; CallDescription MprotectFn{CDM::CLibrary, {"mprotect"}, 3}; - static int ProtWrite; - static int ProtExec; - static int ProtRead; const BugType BT{this, "W^X check fails, Write Exec prot flags set", "Security"}; + mutable bool FlagsInitialized = false; + mutable int ProtRead = 0x01; + mutable int ProtWrite = 0x02; + mutable int ProtExec = 0x04; + public: + void checkBeginFunction(CheckerContext ) const; void checkPreCall(const CallEvent , CheckerContext ) const; + int ProtExecOv; int ProtReadOv; }; } -int MmapWriteExecChecker::ProtWrite = 0x02; -int MmapWriteExecChecker::ProtExec = 0x04; -int MmapWriteExecChecker::ProtRead = 0x01; +void MmapWriteExecChecker::checkBeginFunction(CheckerContext ) const { + if (FlagsInitialized) +return; + + FlagsInitialized = true; + + const std::optional FoundProtRead = + tryExpandAsInteger("PROT_READ", C.getPreprocessor()); + const std::optional FoundProtWrite = + tryExpandAsInteger("PROT_WRITE", C.getPreprocessor()); + const std::optional FoundProtExec = + tryExpandAsInteger("PROT_EXEC", C.getPreprocessor()); + if (FoundProtRead && FoundProtWrite && FoundProtExec) { +ProtRead = *FoundProtRead; +ProtWrite = *FoundProtWrite; +ProtExec = *FoundProtExec; + } else { +// FIXME: Are these useful? NagyDonat wrote: They are not too useful (and it's a little bit surprising that there is no "ProtWriteOv"), but they're harmless and they don't require complex code, so let's keep them for now. https://github.com/llvm/llvm-project/pull/97078 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang][analyzer] MmapWriteExecChecker improvements (PR #97078)
https://github.com/NagyDonat approved this pull request. LGTM, straightforward change. Thanks! https://github.com/llvm/llvm-project/pull/97078 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Fix crash in Stream checker when using void pointers (PR #97199)
NagyDonat wrote: _I only noticed that this PR was already merged after posting the review. There is no need to revert the commit -- it's better than nothing -- but I'd be happy if you created a followup change that also handles the testcase that I mentioned._ https://github.com/llvm/llvm-project/pull/97199 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang][analyzer] Improve PointerSubChecker (PR #96501)
NagyDonat wrote: > Even protobuf contains this type of code: > https://codechecker-demo.eastus.cloudapp.azure.com/Default/report-detail?run=protobuf_v3.13.0_pointersub1=on=New=alpha.core.PointerSub=5545776=1bcd310fbaeccbcc13645b9b277239a2=%2adescriptor.pb.cc I still think that this (1) is undeniably undefined behavior (2) isn't common, so won't cause "spam" problems and (3( can be replaced by standard-compliant code (`offsetof`) so there is no need to introduce a special case for it. https://github.com/llvm/llvm-project/pull/96501 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Fix crash in Stream checker when using void pointers (PR #97199)
https://github.com/NagyDonat requested changes to this pull request. Unfortunately this PR is not a full solution, because e.g. the following test code still triggers the crash (if it is appended to the test file `stream.c`): ```c struct zerosized { int foo[0]; }; void fread_zerosized(struct zerosized *buffer) { FILE *f = fopen("/tmp/foo.txt", "r"); fread(buffer, 1, 1, f); fclose(f); } ``` (I know that zero-sized arrays are not standard-compliant, but they are a widespread extension: e.g. both clang and gcc accept this `struct zerosized` with the default settings.) Overall, I'd say that it's futile to try to recognize zero-sized types with a "canonical type equal to" check, so you should just check whether `ElemSizeInChars` is zero and do something based on that. (Either an early return, or you can say `ElemSizeInChars = 1` at that point if you think that that's the logically correct solution.) ``This way you could also avoid the immediately invoked lambda in `getPointeeType` which is really ugly in my opinion.`` https://github.com/llvm/llvm-project/pull/97199 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang][analyzer] Improve documentation of checker 'cplusplus.Move' (NFC) (PR #96295)
https://github.com/NagyDonat approved this pull request. The new changes also LGTM, feel free to merge this when you want. https://github.com/llvm/llvm-project/pull/96295 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang][analyzer] Improve PointerSubChecker (PR #96501)
NagyDonat wrote: > The warning message may be still misleading if the LHS or RHS "arrays" are > non-array variables. I think that the warning message is OK: "Subtraction of two pointers that do not point into the same array is undefined behavior." -- this also covers the case when one or both of the pointers do not point to arrays. (It doesn't mention the corner case that it's also valid to subtract two identical pointers that point to a non-array value, but that's completely irrelevant in practice, so wouldn't be a helpful suggestion.) > (or detect if `offsetof` can be used and include it in the message)? I think that would be a waste of time, because it's very rare that a project manually reimplements `offsetof` -- I think it only appears in `vim` becasue it's a very old codebase. (Also developers who play with this kind of low-level trickery should be familiar with the standard and understand what's the problem.) https://github.com/llvm/llvm-project/pull/96501 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang][analyzer] Improve PointerSubChecker (PR #96501)
NagyDonat wrote: > These results look correct according to the checker, but I am not sure if > such results are useful or really invalid: > https://codechecker-demo.eastus.cloudapp.azure.com/Default/reports?run=vim_v8.2.1920_pointersub1=on=New=alpha.core.PointerSub > In these cases the address difference of an (array) member of a struct and > start of the struct is taken. According to the checker rules, taking > difference of addresses of any variables or member variables is invalid. This trickery basically reimplements `offsetof()` (or something very similar to it), and as it is in the stable `vim` repo I assume that it's accepted by practically all compilers; but it clearly violates the language standard so I think it's good that the checker reports it. (Perhaps this trick would've been acceptable thirty years ago, but now we have standard `offsetof()` that usually expands to `__builtin_offsetof()` so developers who try to use homemade alternatives deserve a warning.) https://github.com/llvm/llvm-project/pull/96501 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang][analyzer] Improve PointerSubChecker (PR #96501)
https://github.com/NagyDonat edited https://github.com/llvm/llvm-project/pull/96501 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang][analyzer] Improve PointerSubChecker (PR #96501)
https://github.com/NagyDonat approved this pull request. LGTM. I'm a bit surprised to see that this checker duplicates the logic of the array bounds checkers (in the special case when the indexing operation is within a pointer subtraction). Right now this is OK but we'll need to delete this once ArrayBoundV2 becomes stable. https://github.com/llvm/llvm-project/pull/96501 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Check the correct first and last elements in cstring.UninitializedRead (PR #95408)
@@ -393,6 +401,162 @@ ProgramStateRef CStringChecker::checkNonNull(CheckerContext , return stateNonNull; } +static std::optional getIndex(ProgramStateRef State, + const ElementRegion *ER, CharKind CK) { + SValBuilder = State->getStateManager().getSValBuilder(); + ASTContext = SVB.getContext(); + + if (CK == CharKind::Regular) { +if (ER->getValueType() != Ctx.CharTy) + return {}; +return ER->getIndex(); + } + + if (ER->getValueType() != Ctx.WideCharTy) +return {}; + + QualType SizeTy = Ctx.getSizeType(); + NonLoc WideSize = + SVB.makeIntVal(Ctx.getTypeSizeInChars(Ctx.WideCharTy).getQuantity(), + SizeTy) + .castAs(); + SVal Offset = + SVB.evalBinOpNN(State, BO_Mul, ER->getIndex(), WideSize, SizeTy); + if (Offset.isUnknown()) +return {}; + return Offset.castAs(); +} + +// Try to get hold of the origin region (e.g. the actual array region from an +// element region). +static const TypedValueRegion *getOriginRegion(const ElementRegion *ER) { + const MemRegion *MR = ER->getSuperRegion(); + const MemRegion *Ret = MR; + assert(MR); + if (const auto *sym = MR->getAs()) { +SymbolRef sym2 = sym->getSymbol(); +if (!sym2) + return nullptr; +Ret = sym2->getOriginRegion(); + } + return dyn_cast_or_null(Ret); +} + +// Basically 1 -> 1st, 12 -> 12th, etc. +static void printIdxWithOrdinalSuffix(llvm::raw_ostream , unsigned Idx) { + Os << Idx << llvm::getOrdinalSuffix(Idx); NagyDonat wrote: Hmm, I agree that "(second)" doesn't look good when it represents an element index _(as highlighted)_: ![image](https://github.com/llvm/llvm-project/assets/43410265/bb01a19c-9d32-4a1d-a356-8cdf742d311d) but that's already changed to "(at index 1)". On the other hand, I still think "second argument" is significantly better than "2nd argument": ![image](https://github.com/llvm/llvm-project/assets/43410265/1e053a0d-7a38-47d6-8ebe-d66ebf46bba1) I can accept "2nd argument" if you think that it's significantly better, but if you're indifferent, then please change it. https://github.com/llvm/llvm-project/pull/95408 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang][analyzer] Improve documentation of checker 'cplusplus.Move' (NFC) (PR #96295)
https://github.com/NagyDonat edited https://github.com/llvm/llvm-project/pull/96295 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang][analyzer] Improve documentation of checker 'cplusplus.Move' (NFC) (PR #96295)
https://github.com/NagyDonat approved this pull request. LGTM, thanks for updating the docs! I only have one minor inline remark about formatting. https://github.com/llvm/llvm-project/pull/96295 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang][analyzer] Improve documentation of checker 'cplusplus.Move' (NFC) (PR #96295)
@@ -420,21 +420,52 @@ around, such as ``std::string_view``. cplusplus.Move (C++) -Method calls on a moved-from object and copying a moved-from object will be reported. - +Find use-after-move bugs in C++. This includes method calls on moved-from +objects, assignment of a moved-from object, and repeated move of a moved-from +object. .. code-block:: cpp - struct A { + struct A { void foo() {} }; - void f() { + void f1() { A a; A b = std::move(a); // note: 'a' became 'moved-from' here a.foo();// warn: method call on a 'moved-from' object 'a' } + void f2() { + A a; + A b = std::move(a); + A c(std::move(a)); // warn: move of an already moved-from object + } + + void f3() { + A a; + A b = std::move(a); + b = a; // warn: copy of moved-from object + } + +The checker option ``WarnOn`` controls on what objects the use-after-move is +checked. The most strict value is ``KnownsOnly``, in this mode only objects are +checked whose type is known to be move-unsafe. These include most STL objects +(but excluding move-safe ones) and smart pointers. With option value +``KnownsAndLocals`` local variables (of any type) are additionally checked. The +idea behind this is that local variables are usually not tempting to be re-used +so an use after move is more likely a bug than with member variables. With +option value ``All`` any use-after move condition is checked on all kinds of +variables, excluding global variables and known move-safe cases. Default value +is ``KnownsAndLocals``. NagyDonat wrote: ```suggestion The checker option ``WarnOn`` controls on what objects the use-after-move is checked. * The most strict value is ``KnownsOnly``, in this mode only objects are checked whose type is known to be move-unsafe. These include most STL objects (but excluding move-safe ones) and smart pointers. * With option value ``KnownsAndLocals`` local variables (of any type) are additionally checked. The idea behind this is that local variables are usually not tempting to be re-used so an use after move is more likely a bug than with member variables. * With option value ``All`` any use-after move condition is checked on all kinds of variables, excluding global variables and known move-safe cases. Default value is ``KnownsAndLocals``. ``` This paragraph was very long, re-flow it into a bullet point list. https://github.com/llvm/llvm-project/pull/96295 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits