[clang] [analyzer] Fix crash of StreamChecker when eval calling 'fopen' (PR #100990)

2024-07-29 Thread Donát Nagy via cfe-commits


@@ -1766,13 +1770,6 @@ are assumed to succeed.)
fclose(p);
  }
 
-**Limitations**
-
-The checker does not track the correspondence between integer file descriptors
-and ``FILE *`` pointers. Operations on standard streams like ``stdin`` are not
-treated specially and are therefore often not recognized (because these streams
-are usually not opened explicitly by the program, and are global variables).

NagyDonat wrote:

It refers to functions like `write` [(man 
page)](https://www.man7.org/linux/man-pages/man2/write.2.html) where the file 
is specified by an `int` argument.

https://github.com/llvm/llvm-project/pull/100990
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Fix crash of StreamChecker when eval calling 'fopen' (PR #100990)

2024-07-29 Thread Donát Nagy via cfe-commits


@@ -1703,7 +1703,11 @@ are detected:
 * Invalid 3rd ("``whence``") argument to ``fseek``.
 
 The stream operations are by this checker usually split into two cases, a 
success
-and a failure case. However, in the case of write operations (like ``fwrite``,
+and a failure case.
+On the success case it also assumes that the current value of ``stdout``,
+``stderr``, or ``stdin`` can't be equal to the file pointer returned by 
``fopen``.
+
+In the case of write operations (like ``fwrite``,

NagyDonat wrote:

Ok, then feel free to keep the short diff.

https://github.com/llvm/llvm-project/pull/100990
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Fix crash of StreamChecker when eval calling 'fopen' (PR #100990)

2024-07-29 Thread Donát Nagy via cfe-commits

NagyDonat wrote:

> I decided to put the fixup NFC changes along with this PR (the ones were 
> submitted after I merged the original commit), but on hindsight probably it 
> would be better to merge those NFC changes separately. If you request, I'll 
> split the PR.

Feel free to keep the NFC changes in this commit, just briefly mention them in 
the description ("This commit also contains some minor NFC code quality 
improvements." or something similar.) 

https://github.com/llvm/llvm-project/pull/100990
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Fix crash of StreamChecker when eval calling 'fopen' (PR #100990)

2024-07-29 Thread Donát Nagy via cfe-commits


@@ -1703,7 +1703,11 @@ are detected:
 * Invalid 3rd ("``whence``") argument to ``fseek``.
 
 The stream operations are by this checker usually split into two cases, a 
success
-and a failure case. However, in the case of write operations (like ``fwrite``,
+and a failure case.
+On the success case it also assumes that the current value of ``stdout``,
+``stderr``, or ``stdin`` can't be equal to the file pointer returned by 
``fopen``.
+
+In the case of write operations (like ``fwrite``,

NagyDonat wrote:

Personally I'd re-wrap the lines; but I can accept this if you think that it's 
important to minimize the diff footprint.

https://github.com/llvm/llvm-project/pull/100990
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Fix crash of StreamChecker when eval calling 'fopen' (PR #100990)

2024-07-29 Thread Donát Nagy via cfe-commits


@@ -1766,13 +1770,6 @@ are assumed to succeed.)
fclose(p);
  }
 
-**Limitations**
-
-The checker does not track the correspondence between integer file descriptors
-and ``FILE *`` pointers. Operations on standard streams like ``stdin`` are not
-treated specially and are therefore often not recognized (because these streams
-are usually not opened explicitly by the program, and are global variables).

NagyDonat wrote:

Hmm, perhaps keep a note saying "This checker does not track operations that 
use integer file descriptors instead of `FILE *` pointers". (If I recall 
correctly this is true and might be relevant for someone who's used to work 
with low-level stuff.) However it's also completely fine (from my POV) if you 
delete the whole Limitations section.



https://github.com/llvm/llvm-project/pull/100990
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Fix crash of StreamChecker when eval calling 'fopen' (PR #100990)

2024-07-29 Thread Donát Nagy via cfe-commits

https://github.com/NagyDonat approved this pull request.

LGTM.

https://github.com/llvm/llvm-project/pull/100990
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Fix crash of StreamChecker when eval calling 'fopen' (PR #100990)

2024-07-29 Thread Donát Nagy via cfe-commits

https://github.com/NagyDonat edited 
https://github.com/llvm/llvm-project/pull/100990
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Keep alive short-circuiting condition subexpressions in a conditional (PR #100745)

2024-07-26 Thread Donát Nagy via cfe-commits

https://github.com/NagyDonat approved this pull request.

Nice fix, thanks for upstreaming this!

https://github.com/llvm/llvm-project/pull/100745
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][ASTImporter] Fix import of template parameter default values. (PR #100100)

2024-07-26 Thread Donát Nagy via cfe-commits


@@ -359,6 +359,31 @@ namespace clang {
   Params, Importer.getToContext().getTranslationUnitDecl());
 }
 
+template 
+void tryUpdateTemplateParmDeclInheritedFrom(NamedDecl *RecentParm,
+NamedDecl *NewParm) {
+  if (auto *ParmT = dyn_cast(RecentParm)) {
+if (ParmT->hasDefaultArgument()) {
+  auto *P = cast(NewParm);
+  P->removeDefaultArgument();
+  P->setInheritedDefaultArgument(Importer.ToContext, ParmT);
+}
+  }
+}
+
+void updateTemplateParametersInheritedFrom(

NagyDonat wrote:

Thanks for the clarifications!

https://github.com/llvm/llvm-project/pull/100100
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][ASTImporter] Fix import of template parameter default values. (PR #100100)

2024-07-26 Thread Donát Nagy via cfe-commits

https://github.com/NagyDonat approved this pull request.


https://github.com/llvm/llvm-project/pull/100100
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer][NFC] Eliminate a dyn_cast (PR #100719)

2024-07-26 Thread Donát Nagy via cfe-commits

https://github.com/NagyDonat approved this pull request.


https://github.com/llvm/llvm-project/pull/100719
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer][NFC] Minor cleanup in two test files. (PR #100570)

2024-07-25 Thread Donát Nagy via cfe-commits

NagyDonat wrote:

In my opinion the "put unrelated things into separate commits" guideline is 
just a corollary of the "don't create commits that are too complex" rule. If a 
change is extremely trivial, I'm actively trying to aggregate it with other 
tangentially related changes, because one simple commit is easier to read and 
handle than two (or five or ten) even smaller commits.

https://github.com/llvm/llvm-project/pull/100570
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer][NFC] Minor cleanup in two test files. (PR #100570)

2024-07-25 Thread Donát Nagy via cfe-commits

https://github.com/NagyDonat closed 
https://github.com/llvm/llvm-project/pull/100570
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer][NFC] Minor cleanup in two test files. (PR #100570)

2024-07-25 Thread Donát Nagy via cfe-commits

https://github.com/NagyDonat created 
https://github.com/llvm/llvm-project/pull/100570

This commit contains two unrelated trivial changes:

(1) Three unused variables are removed from `ctor.mm`.
(2) A FIXME block is removed from `ctor-array.cpp` because it described
an issue that was resolved since then.

From 15fae768b0ad2412858520a137a51e7190a591cd Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Don=C3=A1t=20Nagy?= 
Date: Thu, 25 Jul 2024 15:43:05 +0200
Subject: [PATCH] [analyzer][NFC] Minor cleanup in two test files.

This commit contains two unrelated trivial changes:
(1) Three unused variables are removed from `ctor.mm`.
(2) A FIXME block is removed from `ctor-array.cpp` because it described
an issue that was resolved since then.
---
 clang/test/Analysis/ctor-array.cpp | 12 +---
 clang/test/Analysis/ctor.mm|  2 --
 2 files changed, 1 insertion(+), 13 deletions(-)

diff --git a/clang/test/Analysis/ctor-array.cpp 
b/clang/test/Analysis/ctor-array.cpp
index 053669cc2aada..49412ee5a68c7 100644
--- a/clang/test/Analysis/ctor-array.cpp
+++ b/clang/test/Analysis/ctor-array.cpp
@@ -1,4 +1,4 @@
-// RUN: %clang_analyze_cc1 -analyzer-checker=core,debug.ExprInspection 
-analyzer-disable-checker=cplusplus -analyzer-config c++-inlining=constructors 
-verify %s
+// RUN: %clang_analyze_cc1 -analyzer-checker=core,debug.ExprInspection 
-analyzer-config c++-inlining=constructors -verify %s
 
 #include "Inputs/system-header-simulator-cxx.h"
 
@@ -119,16 +119,6 @@ struct s5 {
 };
 
 void g1(void) {
-  // FIXME: This test requires -analyzer-disable-checker=cplusplus,
-  // because of the checker's weird behaviour in case of arrays.
-  // E.g.:
-  //s3 *arr = new s3[4];
-  //s3 *arr2 = new (arr + 1) s3[1];
-  //   ^~~
-  //   warning: 12 bytes is possibly not enough
-  //for array allocation which requires
-  //4 bytes.
-
   s5::c = 0;
   s5 *arr = new s5[4];
   new (arr + 1) s5[3];
diff --git a/clang/test/Analysis/ctor.mm b/clang/test/Analysis/ctor.mm
index fb385833df9c7..6ac9050fc29f7 100644
--- a/clang/test/Analysis/ctor.mm
+++ b/clang/test/Analysis/ctor.mm
@@ -56,8 +56,6 @@ void testNonPODCopyConstructor() {
 namespace ConstructorVirtualCalls {
   class A {
   public:
-int *out1, *out2, *out3;
-
 virtual int get() { return 1; }
 
 A(int *out1) {

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Don't invalidate the super region when a std object ctor runs (PR #100405)

2024-07-25 Thread Donát Nagy via cfe-commits


@@ -923,12 +923,31 @@ SVal AnyCXXConstructorCall::getCXXThisVal() const {
   return UnknownVal();
 }
 
+static bool isWithinStdNamespace(const Decl *D) {

NagyDonat wrote:

Of course, feel free to leave it for a followup patch.

https://github.com/llvm/llvm-project/pull/100405
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Don't invalidate the super region when a std object ctor runs (PR #100405)

2024-07-25 Thread Donát Nagy via cfe-commits


@@ -923,12 +923,31 @@ SVal AnyCXXConstructorCall::getCXXThisVal() const {
   return UnknownVal();
 }
 
+static bool isWithinStdNamespace(const Decl *D) {

NagyDonat wrote:

I think this function could be useful for other checkers as well; consider 
moving this to an externally visible location (e.g. turn this into a method of 
`Decl`).

https://github.com/llvm/llvm-project/pull/100405
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Don't invalidate the super region when a std object ctor runs (PR #100405)

2024-07-25 Thread Donát Nagy via cfe-commits

https://github.com/NagyDonat edited 
https://github.com/llvm/llvm-project/pull/100405
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Don't invalidate the super region when a std object ctor runs (PR #100405)

2024-07-25 Thread Donát Nagy via cfe-commits

https://github.com/NagyDonat approved this pull request.

LGTM, nice improvement.

https://github.com/llvm/llvm-project/pull/100405
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][ASTImporter] Fix import of template parameter default values. (PR #100100)

2024-07-23 Thread Donát Nagy via cfe-commits

https://github.com/NagyDonat edited 
https://github.com/llvm/llvm-project/pull/100100
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][ASTImporter] Fix import of template parameter default values. (PR #100100)

2024-07-23 Thread Donát Nagy via cfe-commits


@@ -359,6 +359,31 @@ namespace clang {
   Params, Importer.getToContext().getTranslationUnitDecl());
 }
 
+template 
+void tryUpdateTemplateParmDeclInheritedFrom(NamedDecl *RecentParm,
+NamedDecl *NewParm) {
+  if (auto *ParmT = dyn_cast(RecentParm)) {
+if (ParmT->hasDefaultArgument()) {
+  auto *P = cast(NewParm);
+  P->removeDefaultArgument();
+  P->setInheritedDefaultArgument(Importer.ToContext, ParmT);
+}
+  }
+}
+
+void updateTemplateParametersInheritedFrom(

NagyDonat wrote:

```suggestion
  /// Update the parameter list `NewParams` of a template declaration by
  /// "inheriting" default argument values from `RecentParams`, which is
  /// the parameter list of an earlier declaration of the same template.
  /// (Note that "inheriting" default argument values this way is
  /// completely unrelated to the usual object-oriented "inheritance"
  /// relationship between classes.)
  ///
  /// Note that the standard specifies that the same parameter cannot be
  /// given default arguments twice in the same scope; but in our case
  /// the parameter list `NewParams` is freshly imported from a
  /// different TU, so it's possible that both `RecentParams` and
  /// `NewParams` specify a default argument for the same parameter.
  /// In this case, the code will use the default argument taken from
  /// `RecentParams`.
  /// FIXME: Report an error if the two default arguments are different.
void updateTemplateParametersInheritedFrom(
```
I felt that it is important to explain that the "Inherited" within the name of 
this function is unrelated to the usual object-oriented "inheritance" between 
classes; and as I started to write this comment, I also ended up documenting 
the goals of this function. (I hope that I was correct -- please fix this 
comment block if I misunderstood something!) 

https://github.com/llvm/llvm-project/pull/100100
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][ASTImporter] Fix import of template parameter default values. (PR #100100)

2024-07-23 Thread Donát Nagy via cfe-commits

https://github.com/NagyDonat edited 
https://github.com/llvm/llvm-project/pull/100100
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][ASTImporter] Fix import of template parameter default values. (PR #100100)

2024-07-23 Thread Donát Nagy via cfe-commits

https://github.com/NagyDonat approved this pull request.

LGTM, I'm very grateful that you can understand and resolve these tricky 
`ASTImporter` issues.

https://github.com/llvm/llvm-project/pull/100100
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Assume the result of 'fopen' can't alias with 'std{in,out,err}' (PR #100085)

2024-07-23 Thread Donát Nagy via cfe-commits

https://github.com/NagyDonat approved this pull request.

LGTM, this is a nice improvement. I vaguely recall that a few months ago 
somebody else on our team had trouble with false positives similar false 
positives, so it's good to see that these will be fixed.

https://github.com/llvm/llvm-project/pull/100085
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Assume the result of 'fopen' can't alias with 'std{in,out,err}' (PR #100085)

2024-07-23 Thread Donát Nagy via cfe-commits


@@ -451,6 +462,10 @@ class StreamChecker : public Checker`
In your code the three standard streams have exactly identical roles (as far as 
I see), and I think it would be good to emphasize this by storing them in a 
three-element array instead of three separate independently named variables.

For example, I could imagine a solution like 
```c++
  const char *StdStreamNames[3] = {"stdin", "stdout", "stderr"};
  mutable const VarDecl *StdStreamDecls[3] = {};
  
  // ... in checkASTDecl()
  for (int i = 0; i < 3; i++) {
// inline getGlobalStreamPointerByName here to initialize
// StdStreamDecls[i] via StdStreamNames[i]
  }
  
  // ... in assumeNoAliasingWithStdStreams()
  for (const VarDecl *Var : StdStreamDecls) {
// put the definition of the lambda `assumeRetNE` here
  }
```
(This is just a rough draft, there might be more idiomatic/elegant solutions 
for some parts.)
``

https://github.com/llvm/llvm-project/pull/100085
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Assume the result of 'fopen' can't alias with 'std{in,out,err}' (PR #100085)

2024-07-23 Thread Donát Nagy via cfe-commits

https://github.com/NagyDonat edited 
https://github.com/llvm/llvm-project/pull/100085
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Model builtin-like functions as builtin functions (PR #99886)

2024-07-23 Thread Donát Nagy via cfe-commits


@@ -570,13 +570,8 @@ void differentBranchesTest(int i) {
   {
 A a;
 a.foo() > 0 ? a.foo() : A(std::move(a)).foo();
-#ifdef DFS
-// peaceful-note@-2 {{Assuming the condition is false}}
-// peaceful-note@-3 {{'?' condition is false}}
-#else
-// peaceful-note@-5 {{Assuming the condition is true}}
-// peaceful-note@-6 {{'?' condition is true}}
-#endif

NagyDonat wrote:

I see, thanks for the explanation. I agree that this isn't an important 
question and simply adjusting the TC is the right solution.

https://github.com/llvm/llvm-project/pull/99886
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Model builtin-like functions as builtin functions (PR #99886)

2024-07-23 Thread Donát Nagy via cfe-commits

https://github.com/NagyDonat commented:

LGTM.

https://github.com/llvm/llvm-project/pull/99886
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Improve bug report hashing, merge similar reports (PR #98621)

2024-07-22 Thread Donát Nagy via cfe-commits

https://github.com/NagyDonat closed 
https://github.com/llvm/llvm-project/pull/98621
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Improve bug report hashing, merge similar reports (PR #98621)

2024-07-22 Thread Donát Nagy via cfe-commits

https://github.com/NagyDonat updated 
https://github.com/llvm/llvm-project/pull/98621

From 2765bc97d3242d50fd73aedb9e9d38dfdcef814c Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Don=C3=A1t=20Nagy?= 
Date: Fri, 12 Jul 2024 13:57:53 +0200
Subject: [PATCH 1/3] [analyzer] Don't display the offset value in underflows

Previously alpha.security.ArrayBoundV2 displayed the (negative) offset
value when it reported an underflow, but this produced lots of very
similar and redundant reports in certain situations.

After this commit the offset won't be printed so the usual deduplication
will handle these reports as equivalent (and print only one of them).

See https://github.com/llvm/llvm-project/issues/86969 for background.
---
 .../Checkers/ArrayBoundCheckerV2.cpp  | 15 ++-
 .../test/Analysis/out-of-bounds-diagnostics.c | 26 +--
 2 files changed, 32 insertions(+), 9 deletions(-)

diff --git a/clang/lib/StaticAnalyzer/Checkers/ArrayBoundCheckerV2.cpp 
b/clang/lib/StaticAnalyzer/Checkers/ArrayBoundCheckerV2.cpp
index f82288f1099e8..f177041abc411 100644
--- a/clang/lib/StaticAnalyzer/Checkers/ArrayBoundCheckerV2.cpp
+++ b/clang/lib/StaticAnalyzer/Checkers/ArrayBoundCheckerV2.cpp
@@ -374,13 +374,14 @@ static std::optional 
getConcreteValue(std::optional SV) {
 
 static Messages getPrecedesMsgs(const SubRegion *Region, NonLoc Offset) {
   std::string RegName = getRegionName(Region);
-  SmallString<128> Buf;
-  llvm::raw_svector_ostream Out(Buf);
-  Out << "Access of " << RegName << " at negative byte offset";
-  if (auto ConcreteIdx = Offset.getAs())
-Out << ' ' << ConcreteIdx->getValue();
+
+  // We're not reporting the Offset, because we don't want to spam the user
+  // with similar reports that differ only in different offset values.
+  // See https://github.com/llvm/llvm-project/issues/86969 for details.
+  (void)Offset;
+
   return {formatv("Out of bound access to memory preceding {0}", RegName),
-  std::string(Buf)};
+  formatv("Access of {0} at negative byte offset", RegName)};
 }
 
 /// Try to divide `Val1` and `Val2` (in place) by `Divisor` and return true if
@@ -609,7 +610,7 @@ void ArrayBoundCheckerV2::performCheck(const Expr *E, 
CheckerContext ) const {
   // CHECK UPPER BOUND
   DefinedOrUnknownSVal Size = getDynamicExtent(State, Reg, SVB);
   if (auto KnownSize = Size.getAs()) {
-// In a situation where both overflow and overflow are possible (but the
+// In a situation where both underflow and overflow are possible (but the
 // index is either tainted or known to be invalid), the logic of this
 // checker will first assume that the offset is non-negative, and then
 // (with this additional assumption) it will detect an overflow error.
diff --git a/clang/test/Analysis/out-of-bounds-diagnostics.c 
b/clang/test/Analysis/out-of-bounds-diagnostics.c
index 92f983d8b1561..10cfe95a79a55 100644
--- a/clang/test/Analysis/out-of-bounds-diagnostics.c
+++ b/clang/test/Analysis/out-of-bounds-diagnostics.c
@@ -6,7 +6,7 @@ int TenElements[10];
 void arrayUnderflow(void) {
   TenElements[-3] = 5;
   // expected-warning@-1 {{Out of bound access to memory preceding 
'TenElements'}}
-  // expected-note@-2 {{Access of 'TenElements' at negative byte offset -12}}
+  // expected-note@-2 {{Access of 'TenElements' at negative byte offset}}
 }
 
 int underflowWithDeref(void) {
@@ -14,7 +14,29 @@ int underflowWithDeref(void) {
   --p;
   return *p;
   // expected-warning@-1 {{Out of bound access to memory preceding 
'TenElements'}}
-  // expected-note@-2 {{Access of 'TenElements' at negative byte offset -4}}
+  // expected-note@-2 {{Access of 'TenElements' at negative byte offset}}
+}
+
+int rng(void);
+int getIndex(void) {
+  switch (rng()) {
+case 1: return -152;
+case 2: return -160;
+case 3: return -168;
+default: return -172;
+  }
+}
+
+void gh86959(void) {
+  // Previously code like this produced many almost-identical bug reports that
+  // only differed in the byte offset value (which was reported by the checker
+  // at that time). Verify that now we only see one report.
+
+  // expected-note@+1 {{Entering loop body}}
+  while (rng())
+TenElements[getIndex()] = 10;
+  // expected-warning@-1 {{Out of bound access to memory preceding 
'TenElements'}}
+  // expected-note@-2 {{Access of 'TenElements' at negative byte offset}}
 }
 
 int scanf(const char *restrict fmt, ...);

From d7b71a21368c70da21b2ca6e7d02c7022ec21dde Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Don=C3=A1t=20Nagy?= 
Date: Fri, 12 Jul 2024 21:11:22 +0200
Subject: [PATCH 2/3] Alternative approach: tweak the `Profile()`` method of
 `BugReport`s

This commit re-adds the concrete offset value at the end of the (long)
`Description` of the underflow report, but ensures that the `Profile()`
method of `PathSensitiveBugReport` only uses the _short_ description (as
returned by `getShortDescription()`) instead of the the `Description`
field.

For the sake of consistency, the same 

[clang] [analyzer] Improve bug report hashing, merge similar reports (PR #98621)

2024-07-22 Thread Donát Nagy via cfe-commits


@@ -2213,7 +2213,7 @@ void BasicBugReport::Profile(llvm::FoldingSetNodeID& 
hash) const {
 void PathSensitiveBugReport::Profile(llvm::FoldingSetNodeID ) const {
   hash.AddInteger(static_cast(getKind()));
   hash.AddPointer();
-  hash.AddString(Description);
+  hash.AddString(getShortDescription());

NagyDonat wrote:

Yes, it's intentional, because the method call `getShortDescription()` is 
equivalent to `ShortDescripton.empty() ? Description : ShortDescription`. (The 
common case when the short and full descriptions are identical is represented 
internally by an empty `ShortDescription` and the shared value stored in 
`Description`.)

https://github.com/llvm/llvm-project/pull/98621
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Improve bug report hashing, merge similar reports (PR #98621)

2024-07-22 Thread Donát Nagy via cfe-commits

NagyDonat wrote:

(Btw it seems that I accidentally added everyone as an "assignee" instead of a 
reviewer...)

https://github.com/llvm/llvm-project/pull/98621
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Improve bug report hashing, merge similar reports (PR #98621)

2024-07-22 Thread Donát Nagy via cfe-commits

https://github.com/NagyDonat unassigned 
https://github.com/llvm/llvm-project/pull/98621
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Improve bug report hashing, merge similar reports (PR #98621)

2024-07-22 Thread Donát Nagy via cfe-commits

https://github.com/NagyDonat unassigned 
https://github.com/llvm/llvm-project/pull/98621
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Improve bug report hashing, merge similar reports (PR #98621)

2024-07-22 Thread Donát Nagy via cfe-commits

https://github.com/NagyDonat unassigned 
https://github.com/llvm/llvm-project/pull/98621
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Improve bug report hashing, merge similar reports (PR #98621)

2024-07-22 Thread Donát Nagy via cfe-commits

NagyDonat wrote:

I'm back from the vacation, so I would like to restart and conclude this review 
process. @steakhal or anybody else please review when it's convenient for you.

https://github.com/llvm/llvm-project/pull/98621
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][ASTImporter] Fix import of anonymous enums if multiple are present (PR #99281)

2024-07-22 Thread Donát Nagy via cfe-commits

https://github.com/NagyDonat approved this pull request.

LGTM as far as I can judge this tricky situation.

https://github.com/llvm/llvm-project/pull/99281
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Improve bug report hashing, merge similar reports (PR #98621)

2024-07-12 Thread Donát Nagy via cfe-commits

https://github.com/NagyDonat edited 
https://github.com/llvm/llvm-project/pull/98621
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Improve bug report hashing, merge similar reports (PR #98621)

2024-07-12 Thread Donát Nagy via cfe-commits


@@ -373,14 +373,14 @@ static std::optional 
getConcreteValue(std::optional SV) {
 }
 
 static Messages getPrecedesMsgs(const SubRegion *Region, NonLoc Offset) {
-  std::string RegName = getRegionName(Region);
-  SmallString<128> Buf;
-  llvm::raw_svector_ostream Out(Buf);
-  Out << "Access of " << RegName << " at negative byte offset";
-  if (auto ConcreteIdx = Offset.getAs())
-Out << ' ' << ConcreteIdx->getValue();

NagyDonat wrote:

This `raw_svector_ostream` solution is very clunky, so I replaced it with a 
less ugly alternative when I re-added the code that prints the offset.

https://github.com/llvm/llvm-project/pull/98621
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Improve bug report hashing, merge similar reports (PR #98621)

2024-07-12 Thread Donát Nagy via cfe-commits

https://github.com/NagyDonat edited 
https://github.com/llvm/llvm-project/pull/98621
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Improve bug report hashing, merge similar reports (PR #98621)

2024-07-12 Thread Donát Nagy via cfe-commits

https://github.com/NagyDonat edited 
https://github.com/llvm/llvm-project/pull/98621
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Don't display the offset value in underflows (PR #98621)

2024-07-12 Thread Donát Nagy via cfe-commits

NagyDonat wrote:

_(Technical detail: I'll be on vacation during the next week, so I won't see 
updates on this PR until the 22nd of July. If you want to merge this PR, feel 
free to do so, but don't forget to adjust the description and the title to 
accurately reflect the current state of the PR.)_ 

https://github.com/llvm/llvm-project/pull/98621
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Don't display the offset value in underflows (PR #98621)

2024-07-12 Thread Donát Nagy via cfe-commits

NagyDonat wrote:

> I wonder if we could have something in between. I'm thinking of having the 
> concrete offset as a separate note, instead of having it part of the primary 
> message. That way after BR selection, we would still deterministically pick 
> the shortest parh, and also have the offset in the path notes.

Actually I had a better solution for achieving this: during my earlier 
experimentation I noticed that the `Profile()` methods of the `BugReporter` use 
the (long) `Description` field (which is "Access of `` at negative byte 
offset -124", the final note on the bug path), but ignore the 
`ShortDescription` field (which is "Out of bound access to memory preceding 
``", the stand-alone warning message). In the commit that I just 
pushed, I tweaked `Profile()` to ensure that it uses the short description and 
ignores the long one -- which yields the right deduplication behavior.

I'd guess that `ShortDescription` was missing from the `Profile()` method 
because it was introduced at a later point in the development, and there are 
very few checkers that actually set different short and long descriptions.

(For more details see the commit message of the commit that I just pushed onto 
this PR.)



https://github.com/llvm/llvm-project/pull/98621
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Don't display the offset value in underflows (PR #98621)

2024-07-12 Thread Donát Nagy via cfe-commits

https://github.com/NagyDonat updated 
https://github.com/llvm/llvm-project/pull/98621

From 2765bc97d3242d50fd73aedb9e9d38dfdcef814c Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Don=C3=A1t=20Nagy?= 
Date: Fri, 12 Jul 2024 13:57:53 +0200
Subject: [PATCH 1/2] [analyzer] Don't display the offset value in underflows

Previously alpha.security.ArrayBoundV2 displayed the (negative) offset
value when it reported an underflow, but this produced lots of very
similar and redundant reports in certain situations.

After this commit the offset won't be printed so the usual deduplication
will handle these reports as equivalent (and print only one of them).

See https://github.com/llvm/llvm-project/issues/86969 for background.
---
 .../Checkers/ArrayBoundCheckerV2.cpp  | 15 ++-
 .../test/Analysis/out-of-bounds-diagnostics.c | 26 +--
 2 files changed, 32 insertions(+), 9 deletions(-)

diff --git a/clang/lib/StaticAnalyzer/Checkers/ArrayBoundCheckerV2.cpp 
b/clang/lib/StaticAnalyzer/Checkers/ArrayBoundCheckerV2.cpp
index f82288f1099e8..f177041abc411 100644
--- a/clang/lib/StaticAnalyzer/Checkers/ArrayBoundCheckerV2.cpp
+++ b/clang/lib/StaticAnalyzer/Checkers/ArrayBoundCheckerV2.cpp
@@ -374,13 +374,14 @@ static std::optional 
getConcreteValue(std::optional SV) {
 
 static Messages getPrecedesMsgs(const SubRegion *Region, NonLoc Offset) {
   std::string RegName = getRegionName(Region);
-  SmallString<128> Buf;
-  llvm::raw_svector_ostream Out(Buf);
-  Out << "Access of " << RegName << " at negative byte offset";
-  if (auto ConcreteIdx = Offset.getAs())
-Out << ' ' << ConcreteIdx->getValue();
+
+  // We're not reporting the Offset, because we don't want to spam the user
+  // with similar reports that differ only in different offset values.
+  // See https://github.com/llvm/llvm-project/issues/86969 for details.
+  (void)Offset;
+
   return {formatv("Out of bound access to memory preceding {0}", RegName),
-  std::string(Buf)};
+  formatv("Access of {0} at negative byte offset", RegName)};
 }
 
 /// Try to divide `Val1` and `Val2` (in place) by `Divisor` and return true if
@@ -609,7 +610,7 @@ void ArrayBoundCheckerV2::performCheck(const Expr *E, 
CheckerContext ) const {
   // CHECK UPPER BOUND
   DefinedOrUnknownSVal Size = getDynamicExtent(State, Reg, SVB);
   if (auto KnownSize = Size.getAs()) {
-// In a situation where both overflow and overflow are possible (but the
+// In a situation where both underflow and overflow are possible (but the
 // index is either tainted or known to be invalid), the logic of this
 // checker will first assume that the offset is non-negative, and then
 // (with this additional assumption) it will detect an overflow error.
diff --git a/clang/test/Analysis/out-of-bounds-diagnostics.c 
b/clang/test/Analysis/out-of-bounds-diagnostics.c
index 92f983d8b1561..10cfe95a79a55 100644
--- a/clang/test/Analysis/out-of-bounds-diagnostics.c
+++ b/clang/test/Analysis/out-of-bounds-diagnostics.c
@@ -6,7 +6,7 @@ int TenElements[10];
 void arrayUnderflow(void) {
   TenElements[-3] = 5;
   // expected-warning@-1 {{Out of bound access to memory preceding 
'TenElements'}}
-  // expected-note@-2 {{Access of 'TenElements' at negative byte offset -12}}
+  // expected-note@-2 {{Access of 'TenElements' at negative byte offset}}
 }
 
 int underflowWithDeref(void) {
@@ -14,7 +14,29 @@ int underflowWithDeref(void) {
   --p;
   return *p;
   // expected-warning@-1 {{Out of bound access to memory preceding 
'TenElements'}}
-  // expected-note@-2 {{Access of 'TenElements' at negative byte offset -4}}
+  // expected-note@-2 {{Access of 'TenElements' at negative byte offset}}
+}
+
+int rng(void);
+int getIndex(void) {
+  switch (rng()) {
+case 1: return -152;
+case 2: return -160;
+case 3: return -168;
+default: return -172;
+  }
+}
+
+void gh86959(void) {
+  // Previously code like this produced many almost-identical bug reports that
+  // only differed in the byte offset value (which was reported by the checker
+  // at that time). Verify that now we only see one report.
+
+  // expected-note@+1 {{Entering loop body}}
+  while (rng())
+TenElements[getIndex()] = 10;
+  // expected-warning@-1 {{Out of bound access to memory preceding 
'TenElements'}}
+  // expected-note@-2 {{Access of 'TenElements' at negative byte offset}}
 }
 
 int scanf(const char *restrict fmt, ...);

From d7b71a21368c70da21b2ca6e7d02c7022ec21dde Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Don=C3=A1t=20Nagy?= 
Date: Fri, 12 Jul 2024 21:11:22 +0200
Subject: [PATCH 2/2] Alternative approach: tweak the `Profile()`` method of
 `BugReport`s

This commit re-adds the concrete offset value at the end of the (long)
`Description` of the underflow report, but ensures that the `Profile()`
method of `PathSensitiveBugReport` only uses the _short_ description (as
returned by `getShortDescription()`) instead of the the `Description`
field.

For the sake of consistency, the same 

[clang] [analyzer] Don't display the offset value in underflows (PR #98621)

2024-07-12 Thread Donát Nagy via cfe-commits


@@ -609,7 +610,7 @@ void ArrayBoundCheckerV2::performCheck(const Expr *E, 
CheckerContext ) const {
   // CHECK UPPER BOUND
   DefinedOrUnknownSVal Size = getDynamicExtent(State, Reg, SVB);
   if (auto KnownSize = Size.getAs()) {
-// In a situation where both overflow and overflow are possible (but the
+// In a situation where both underflow and overflow are possible (but the

NagyDonat wrote:

This is an unrelated change, I just noticed a mistake (that was introduced in 
one of my earlier commits).

https://github.com/llvm/llvm-project/pull/98621
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Don't display the offset value in underflows (PR #98621)

2024-07-12 Thread Donát Nagy via cfe-commits

https://github.com/NagyDonat created 
https://github.com/llvm/llvm-project/pull/98621

Previously alpha.security.ArrayBoundV2 displayed the (negative) offset value 
when it reported an underflow, but this produced lots of very similar and 
redundant reports in certain situations.

After this commit the offset won't be printed so the usual deduplication will 
handle these reports as equivalent (and print only one of them).

See https://github.com/llvm/llvm-project/issues/86969 for background.

From 2765bc97d3242d50fd73aedb9e9d38dfdcef814c Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Don=C3=A1t=20Nagy?= 
Date: Fri, 12 Jul 2024 13:57:53 +0200
Subject: [PATCH] [analyzer] Don't display the offset value in underflows

Previously alpha.security.ArrayBoundV2 displayed the (negative) offset
value when it reported an underflow, but this produced lots of very
similar and redundant reports in certain situations.

After this commit the offset won't be printed so the usual deduplication
will handle these reports as equivalent (and print only one of them).

See https://github.com/llvm/llvm-project/issues/86969 for background.
---
 .../Checkers/ArrayBoundCheckerV2.cpp  | 15 ++-
 .../test/Analysis/out-of-bounds-diagnostics.c | 26 +--
 2 files changed, 32 insertions(+), 9 deletions(-)

diff --git a/clang/lib/StaticAnalyzer/Checkers/ArrayBoundCheckerV2.cpp 
b/clang/lib/StaticAnalyzer/Checkers/ArrayBoundCheckerV2.cpp
index f82288f1099e8..f177041abc411 100644
--- a/clang/lib/StaticAnalyzer/Checkers/ArrayBoundCheckerV2.cpp
+++ b/clang/lib/StaticAnalyzer/Checkers/ArrayBoundCheckerV2.cpp
@@ -374,13 +374,14 @@ static std::optional 
getConcreteValue(std::optional SV) {
 
 static Messages getPrecedesMsgs(const SubRegion *Region, NonLoc Offset) {
   std::string RegName = getRegionName(Region);
-  SmallString<128> Buf;
-  llvm::raw_svector_ostream Out(Buf);
-  Out << "Access of " << RegName << " at negative byte offset";
-  if (auto ConcreteIdx = Offset.getAs())
-Out << ' ' << ConcreteIdx->getValue();
+
+  // We're not reporting the Offset, because we don't want to spam the user
+  // with similar reports that differ only in different offset values.
+  // See https://github.com/llvm/llvm-project/issues/86969 for details.
+  (void)Offset;
+
   return {formatv("Out of bound access to memory preceding {0}", RegName),
-  std::string(Buf)};
+  formatv("Access of {0} at negative byte offset", RegName)};
 }
 
 /// Try to divide `Val1` and `Val2` (in place) by `Divisor` and return true if
@@ -609,7 +610,7 @@ void ArrayBoundCheckerV2::performCheck(const Expr *E, 
CheckerContext ) const {
   // CHECK UPPER BOUND
   DefinedOrUnknownSVal Size = getDynamicExtent(State, Reg, SVB);
   if (auto KnownSize = Size.getAs()) {
-// In a situation where both overflow and overflow are possible (but the
+// In a situation where both underflow and overflow are possible (but the
 // index is either tainted or known to be invalid), the logic of this
 // checker will first assume that the offset is non-negative, and then
 // (with this additional assumption) it will detect an overflow error.
diff --git a/clang/test/Analysis/out-of-bounds-diagnostics.c 
b/clang/test/Analysis/out-of-bounds-diagnostics.c
index 92f983d8b1561..10cfe95a79a55 100644
--- a/clang/test/Analysis/out-of-bounds-diagnostics.c
+++ b/clang/test/Analysis/out-of-bounds-diagnostics.c
@@ -6,7 +6,7 @@ int TenElements[10];
 void arrayUnderflow(void) {
   TenElements[-3] = 5;
   // expected-warning@-1 {{Out of bound access to memory preceding 
'TenElements'}}
-  // expected-note@-2 {{Access of 'TenElements' at negative byte offset -12}}
+  // expected-note@-2 {{Access of 'TenElements' at negative byte offset}}
 }
 
 int underflowWithDeref(void) {
@@ -14,7 +14,29 @@ int underflowWithDeref(void) {
   --p;
   return *p;
   // expected-warning@-1 {{Out of bound access to memory preceding 
'TenElements'}}
-  // expected-note@-2 {{Access of 'TenElements' at negative byte offset -4}}
+  // expected-note@-2 {{Access of 'TenElements' at negative byte offset}}
+}
+
+int rng(void);
+int getIndex(void) {
+  switch (rng()) {
+case 1: return -152;
+case 2: return -160;
+case 3: return -168;
+default: return -172;
+  }
+}
+
+void gh86959(void) {
+  // Previously code like this produced many almost-identical bug reports that
+  // only differed in the byte offset value (which was reported by the checker
+  // at that time). Verify that now we only see one report.
+
+  // expected-note@+1 {{Entering loop body}}
+  while (rng())
+TenElements[getIndex()] = 10;
+  // expected-warning@-1 {{Out of bound access to memory preceding 
'TenElements'}}
+  // expected-note@-2 {{Access of 'TenElements' at negative byte offset}}
 }
 
 int scanf(const char *restrict fmt, ...);

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Splitting TaintPropagation checker into reporting and mode… (PR #98157)

2024-07-10 Thread Donát Nagy via cfe-commits


@@ -1046,10 +1044,7 @@ bool GenericTaintChecker::generateReportIfTainted(const 
Expr *E, StringRef Msg,
 return false;
 
   // Generate diagnostic.

NagyDonat wrote:

:thinking: Perhaps add an `assert(BT)` here for the sake of paranoia?

https://github.com/llvm/llvm-project/pull/98157
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Splitting TaintPropagation checker into reporting and mode… (PR #98157)

2024-07-10 Thread Donát Nagy via cfe-commits

NagyDonat wrote:

(By the way, this change doesn't have significant user-facing parts, so I don't 
think that we need to mention it in the release notes.)

https://github.com/llvm/llvm-project/pull/98157
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Splitting TaintPropagation checker into reporting and mode… (PR #98157)

2024-07-10 Thread Donát Nagy via cfe-commits

https://github.com/NagyDonat approved this pull request.

LGTM, thanks for the updates.

@steakhal Is it OK for you if we merge this?

https://github.com/llvm/llvm-project/pull/98157
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Splitting TaintPropagation checker into reporting and mode… (PR #98157)

2024-07-10 Thread Donát Nagy via cfe-commits


@@ -1122,10 +1131,20 @@ void 
GenericTaintChecker::taintUnsafeSocketProtocol(const CallEvent ,
 }
 
 /// Checker registration
-void ento::registerGenericTaintChecker(CheckerManager ) {
+void ento::registerTaintPropagationChecker(CheckerManager ) {
   Mgr.registerChecker();
 }
 
+bool ento::shouldRegisterTaintPropagationChecker(const CheckerManager ) {
+  return true;
+}
+
+void ento::registerGenericTaintChecker(CheckerManager ) {
+  GenericTaintChecker *checker = Mgr.getChecker();
+  checker->isTaintReporterCheckerEnabled = true;
+  checker->reporterCheckerName = Mgr.getCurrentCheckerName();

NagyDonat wrote:

Because this way we can access the checker name specified in `Checkers.td` via 
the function `Mgr.getCurrentCheckerName();`.

When the checker class corresponds to just one checker defined in `Checkers.td` 
we can use the alternative constructor of `BugType` that takes the checker 
object (`this`) as a first argument and then queries the name from the checker 
object (which can save a _single_ name automatically).

When a single class implements multiple checkers, we need to explicitly pass 
the right name to the `BugType` constructor, and so we need to either (1) 
postpone the construction of the `BugType` to a point after the registration or 
(2) duplicate the checker name between `Checkers.td` and the source file.

AFAIK we always choose option (1), but I didn't check this systematically.

https://github.com/llvm/llvm-project/pull/98157
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Splitting TaintPropagation checker into reporting and mode… (PR #98157)

2024-07-10 Thread Donát Nagy via cfe-commits


@@ -1122,10 +1131,20 @@ void 
GenericTaintChecker::taintUnsafeSocketProtocol(const CallEvent ,
 }
 
 /// Checker registration
-void ento::registerGenericTaintChecker(CheckerManager ) {
+void ento::registerTaintPropagationChecker(CheckerManager ) {
   Mgr.registerChecker();
 }
 
+bool ento::shouldRegisterTaintPropagationChecker(const CheckerManager ) {
+  return true;
+}
+
+void ento::registerGenericTaintChecker(CheckerManager ) {
+  GenericTaintChecker *checker = Mgr.getChecker();
+  checker->isTaintReporterCheckerEnabled = true;
+  checker->reporterCheckerName = Mgr.getCurrentCheckerName();

NagyDonat wrote:

We only use that `unique_ptr` has a null / empty state in addition to 
representing the "real" `BugType` values. However, `std::optional` would be 
completely sufficient for this goal (here and AFAIK in every checker that uses 
`unique_ptr`s to juggle multiple `BugType`s).

If we really want, in _this_ checker we could use a "bare" `BugType` data 
member if we leave it uninitialized in the constructor of the checker class (or 
initialize it with dummy values) and then unconditionally overwrite it with the 
"real" `BugType` in this `register...` function. (Other checkers have lazy 
initialization patterns that check the nullness of the `BugType`.)

https://github.com/llvm/llvm-project/pull/98157
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Splitting TaintPropagation checker into reporting and mode… (PR #98157)

2024-07-09 Thread Donát Nagy via cfe-commits

https://github.com/NagyDonat commented:

Overall LGTM, I added some minor remarks in inline comments.

Also note that with this change we can finally remove the note
```
The ``alpha.security.taint.TaintPropagation`` checker also needs to be enabled 
for
this checker to give warnings.
```
from the documentation of `optin.taint.TaintedAlloc` (because the 
`TaintPropagation` modeling checker is now a _dependency_ of it).

https://github.com/llvm/llvm-project/pull/98157
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Splitting TaintPropagation checker into reporting and mode… (PR #98157)

2024-07-09 Thread Donát Nagy via cfe-commits


@@ -18,7 +21,7 @@ 
 
 Taint analysis works by checking for the occurrence of special operations 
during the symbolic execution of the program.
 Taint analysis defines sources, sinks, and propagation rules. It identifies 
errors by detecting a flow of information that originates from a taint source, 
reaches a taint sink, and propagates through the program paths via propagation 
rules.
-A source, sink, or an operation that propagates taint is mainly 
domain-specific knowledge, but there are some built-in defaults provided by 
:ref:`alpha-security-taint-TaintPropagation`.
+A source, sink, or an operation that propagates taint is mainly 
domain-specific knowledge, but there are some built-in defaults provided by 
``TaintPropagation`` checker.

NagyDonat wrote:

```suggestion
A source, sink, or an operation that propagates taint is mainly domain-specific 
knowledge, but there are some built-in defaults provided by the 
``TaintPropagation`` checker.
```

https://github.com/llvm/llvm-project/pull/98157
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Splitting TaintPropagation checker into reporting and mode… (PR #98157)

2024-07-09 Thread Donát Nagy via cfe-commits


@@ -1122,10 +1131,20 @@ void 
GenericTaintChecker::taintUnsafeSocketProtocol(const CallEvent ,
 }
 
 /// Checker registration
-void ento::registerGenericTaintChecker(CheckerManager ) {
+void ento::registerTaintPropagationChecker(CheckerManager ) {
   Mgr.registerChecker();
 }
 
+bool ento::shouldRegisterTaintPropagationChecker(const CheckerManager ) {
+  return true;
+}
+
+void ento::registerGenericTaintChecker(CheckerManager ) {
+  GenericTaintChecker *checker = Mgr.getChecker();
+  checker->isTaintReporterCheckerEnabled = true;
+  checker->reporterCheckerName = Mgr.getCurrentCheckerName();

NagyDonat wrote:

Move the initialization of `BT` to this point -- then it doesn't need to be a 
`mutable` and you won't need the string data member `reporterCheckerName`.

Notes:
- You may turn `BT` into a public data member -- its visibility is still 
limited to this one source file. 
- `BugType` is a very simple type (essentially a plain struct of three 
strings), there is no reason to postpone its initialization once you have all 
the data.
- You can use `BugType::getCheckerName()` to get the checker name to construct 
the `CheckerProgramPointTag`.

https://github.com/llvm/llvm-project/pull/98157
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Splitting TaintPropagation checker into reporting and mode… (PR #98157)

2024-07-09 Thread Donát Nagy via cfe-commits


@@ -391,8 +392,11 @@ class GenericTaintChecker : public Checker {
   bool generateReportIfTainted(const Expr *E, StringRef Msg,
CheckerContext ) const;
 
+  bool isTaintReporterCheckerEnabled = false;
+  CheckerNameRef reporterCheckerName;
+
 private:
-  const BugType BT{this, "Use of Untrusted Data", categories::TaintedData};
+  mutable std::unique_ptr BT;

NagyDonat wrote:

I'm a bit sad that this mutable unique_ptr trickery reappears in this checker, 
but I don't see a better alternative :frowning_face: 

https://github.com/llvm/llvm-project/pull/98157
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Splitting TaintPropagation checker into reporting and mode… (PR #98157)

2024-07-09 Thread Donát Nagy via cfe-commits

https://github.com/NagyDonat edited 
https://github.com/llvm/llvm-project/pull/98157
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [NFCI][clang][analyzer] Make ProgramStatePartialTrait a template definition (PR #98150)

2024-07-09 Thread Donát Nagy via cfe-commits

https://github.com/NagyDonat approved this pull request.

LGTM, nice little update :)

https://github.com/llvm/llvm-project/pull/98150
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer][NFC] Add some docs for LazyCompoundValue (PR #97407)

2024-07-03 Thread Donát Nagy via cfe-commits


@@ -346,6 +352,39 @@ class CompoundVal : public NonLoc {
   static bool classof(SVal V) { return V.getKind() == CompoundValKind; }
 };
 
+/// The simplest example of a concrete compound value is nonloc::CompoundVal,
+/// which represents a concrete r-value of an initializer-list or a string.
+/// Internally, it contains an llvm::ImmutableList of SVal's stored inside the
+/// literal.
+///
+/// However, there is another compound value used in the analyzer, which 
appears
+/// much more often during analysis, which is nonloc::LazyCompoundVal. This
+/// value is an r-value that represents a snapshot of any structure "as a 
whole"
+/// at a given moment during the analysis. Such value is already quite far from
+/// being re- ferred to as "concrete", as many fields inside it would be 
unknown
+/// or symbolic. nonloc::LazyCompoundVal operates by storing two things:
+///   * a reference to the TypedValueRegion being snapshotted (yes, it is 
always
+/// typed), and also
+///   * a copy of the whole Store object, obtained from the ProgramState in
+/// which it was created.

NagyDonat wrote:

```suggestion
///  * a reference to the whole Store object, obtained from the ProgramState in
///which the nonloc::LazyCompoundVal was created.
///
/// Note that the old ProgramState and its Store is kept alive during the
/// analysis because these are immutable functional data structures and each new
/// Store value is represented as "earlier Store" + "additional binding".
```

What would you think about this phrasing?

https://github.com/llvm/llvm-project/pull/97407
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer][NFC] Add some docs for LazyCompoundValue (PR #97407)

2024-07-03 Thread Donát Nagy via cfe-commits


@@ -346,6 +352,39 @@ class CompoundVal : public NonLoc {
   static bool classof(SVal V) { return V.getKind() == CompoundValKind; }
 };
 
+/// The simplest example of a concrete compound value is nonloc::CompoundVal,
+/// which represents a concrete r-value of an initializer-list or a string.
+/// Internally, it contains an llvm::ImmutableList of SVal's stored inside the
+/// literal.
+///
+/// However, there is another compound value used in the analyzer, which 
appears
+/// much more often during analysis, which is nonloc::LazyCompoundVal. This
+/// value is an r-value that represents a snapshot of any structure "as a 
whole"
+/// at a given moment during the analysis. Such value is already quite far from
+/// being re- ferred to as "concrete", as many fields inside it would be 
unknown
+/// or symbolic. nonloc::LazyCompoundVal operates by storing two things:
+///   * a reference to the TypedValueRegion being snapshotted (yes, it is 
always
+/// typed), and also
+///   * a copy of the whole Store object, obtained from the ProgramState in

NagyDonat wrote:

I do understand functional programming (I had a phase when I learnt Haskell and 
for a few years I thought that it's the absolutely perfect language 
:star_struck: -- since then I'm more realistic but I still like it), I was just 
confused by the fact that this line emphasizes "a copy of the whole Store 
object" as opposed to the "reference to" in the previous line.

In a functional language the word "copy" is practically non-existent (we speak 
about values, not objects and their identity, so there is no reason to say that 
this is _a copy of_ e.g. that list -- we simply say that "this is that list"), 
while in C++ saying "copy" instead of "move" or "pointer/reference" is an 
important synonym of "we allocate lots of new memory", so I'd still prefer 
tweaking this line.

https://github.com/llvm/llvm-project/pull/97407
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer][NFC] Add some docs for LazyCompoundValue (PR #97407)

2024-07-02 Thread Donát Nagy via cfe-commits


@@ -346,6 +352,39 @@ class CompoundVal : public NonLoc {
   static bool classof(SVal V) { return V.getKind() == CompoundValKind; }
 };
 
+/// The simplest example of a concrete compound value is nonloc::CompoundVal,
+/// which represents a concrete r-value of an initializer-list or a string.
+/// Internally, it contains an llvm::ImmutableList of SVal's stored inside the
+/// literal.
+///
+/// However, there is another compound value used in the analyzer, which 
appears
+/// much more often during analysis, which is nonloc::LazyCompoundVal. This
+/// value is an r-value that represents a snapshot of any structure "as a 
whole"
+/// at a given moment during the analysis. Such value is already quite far from
+/// being re- ferred to as "concrete", as many fields inside it would be 
unknown
+/// or symbolic. nonloc::LazyCompoundVal operates by storing two things:
+///   * a reference to the TypedValueRegion being snapshotted (yes, it is 
always
+/// typed), and also
+///   * a copy of the whole Store object, obtained from the ProgramState in

NagyDonat wrote:

Perhaps we should mention that `Store` is just a typedef for `const void *`...

https://github.com/llvm/llvm-project/pull/97407
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer][NFC] Add some docs for LazyCompoundValue (PR #97407)

2024-07-02 Thread Donát Nagy via cfe-commits


@@ -346,6 +352,39 @@ class CompoundVal : public NonLoc {
   static bool classof(SVal V) { return V.getKind() == CompoundValKind; }
 };
 
+/// The simplest example of a concrete compound value is nonloc::CompoundVal,
+/// which represents a concrete r-value of an initializer-list or a string.
+/// Internally, it contains an llvm::ImmutableList of SVal's stored inside the
+/// literal.
+///
+/// However, there is another compound value used in the analyzer, which 
appears
+/// much more often during analysis, which is nonloc::LazyCompoundVal. This
+/// value is an r-value that represents a snapshot of any structure "as a 
whole"
+/// at a given moment during the analysis. Such value is already quite far from
+/// being re- ferred to as "concrete", as many fields inside it would be 
unknown

NagyDonat wrote:

```suggestion
/// being referred to as "concrete", as many fields inside it would be unknown
```

https://github.com/llvm/llvm-project/pull/97407
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer][NFC] Add some docs for LazyCompoundValue (PR #97407)

2024-07-02 Thread Donát Nagy via cfe-commits

https://github.com/NagyDonat edited 
https://github.com/llvm/llvm-project/pull/97407
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer][NFC] Add some docs for LazyCompoundValue (PR #97407)

2024-07-02 Thread Donát Nagy via cfe-commits


@@ -346,6 +352,39 @@ class CompoundVal : public NonLoc {
   static bool classof(SVal V) { return V.getKind() == CompoundValKind; }
 };
 
+/// The simplest example of a concrete compound value is nonloc::CompoundVal,
+/// which represents a concrete r-value of an initializer-list or a string.
+/// Internally, it contains an llvm::ImmutableList of SVal's stored inside the
+/// literal.
+///
+/// However, there is another compound value used in the analyzer, which 
appears
+/// much more often during analysis, which is nonloc::LazyCompoundVal. This
+/// value is an r-value that represents a snapshot of any structure "as a 
whole"
+/// at a given moment during the analysis. Such value is already quite far from
+/// being re- ferred to as "concrete", as many fields inside it would be 
unknown
+/// or symbolic. nonloc::LazyCompoundVal operates by storing two things:
+///   * a reference to the TypedValueRegion being snapshotted (yes, it is 
always
+/// typed), and also
+///   * a copy of the whole Store object, obtained from the ProgramState in

NagyDonat wrote:

Is it truly a _copy_ of the store object? This seems to contradict the "is a 
very cheap operation" line in the next paragraph...?

Or is it a _copied store object_ that _shares the same bulk allocations_ with 
the original one (but differs in minor details?) 

https://github.com/llvm/llvm-project/pull/97407
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer][NFC] Add some docs for LazyCompoundValue (PR #97407)

2024-07-02 Thread Donát Nagy via cfe-commits

https://github.com/NagyDonat commented:

I'm really happy that you decided to document these :smile: 

https://github.com/llvm/llvm-project/pull/97407
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][analyzer][doc] Migrate user-related docs from HTML to RST (PR #97034)

2024-07-02 Thread Donát Nagy via cfe-commits

https://github.com/NagyDonat edited 
https://github.com/llvm/llvm-project/pull/97034
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][analyzer][doc] Migrate user-related docs from HTML to RST (PR #97034)

2024-07-02 Thread Donát Nagy via cfe-commits


@@ -0,0 +1,239 @@
+Command Line Usage: scan-build and CodeChecker
+==
+
+This document provides guidelines for running Clang Static Analyzer from the 
command line on whole projects.
+CodeChecker and scan-build are two CLI tools for using CSA on multiple files 
(tranlation units).
+Both provide a way of driving the analyzer, detecting compilation flags, and 
generating reports.
+CodeChecker is more actively maintained, provides heuristics for working with 
multiple versions of popular compilers and it also comes with a web-based GUI 
for viewing, filtering, categorizing and suppressing the results.
+Therefore CodeChecker is recommended in case you need any of the above 
features or just more customizability in general.
+
+Comparison of CodeChecker and scan-build
+
+
+Static Analyzer is by design a GUI tool originally intended to be consumed by 
the XCode IDE.
+Its purpose is to find buggy execution paths in the program, and such paths 
are very hard to comprehend by looking at a non-interactive standard output.
+It is possible, however, to invoke the Static Analyzer from the command line 
in order to obtain analysis results, and then later view them interactively in 
a graphical interface.
+The following tools are used commonly to run the analyzer from the command 
line.
+Both tools are wrapper scripts to drive the analysis and the underlying 
invocations of the Clang compiler:
+
+1. scan-build_ is an old and simple command line tool that emits static 
analyzer warnings as HTML files while compiling your project. You can view the 
analysis results in your web browser.
+- Useful for individual developers who simply want to view static analysis 
results at their desk, or in a very simple collaborative environment.
+- Works on all major platforms (Windows, Linux, macOS) and is available as 
a package in many Linux distributions.
+- Does not include support for cross-translation-unit analysis.
+
+2. CodeChecker_ is a driver and web server that runs the Static Analyzer on 
your projects on demand and maintains a database of issues.
+- Perfect for managing large amounts of Static Analyzer warnings in a 
collaborative environment.
+- Generally much more feature-rich than scan-build.
+- Supports incremental analysis: Results can be stored in a database, 
subsequent analysis runs can be compared to list the newly added defects.
+- :doc:`CrossTranslationUnit` is supported fully on Linux via CodeChecker.
+- Can run clang-tidy checkers too.
+- Open source, but out-of-tree, i.e. not part of the LLVM project.
+
+scan-build
+--
+
+**scan-build** is a command line utility that enables a user to run the static 
analyzer over their codebase as part of performing a regular build (from the 
command line).

NagyDonat wrote:

The use of "the Static Analyzer" vs "the static analyzer" is inconsistent -- it 
would be good to pick one of them and use it consistently.


https://github.com/llvm/llvm-project/pull/97034
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][analyzer][doc] Migrate user-related docs from HTML to RST (PR #97034)

2024-07-02 Thread Donát Nagy via cfe-commits


@@ -0,0 +1,239 @@
+Command Line Usage: scan-build and CodeChecker
+==
+
+This document provides guidelines for running Clang Static Analyzer from the 
command line on whole projects.
+CodeChecker and scan-build are two CLI tools for using CSA on multiple files 
(tranlation units).
+Both provide a way of driving the analyzer, detecting compilation flags, and 
generating reports.
+CodeChecker is more actively maintained, provides heuristics for working with 
multiple versions of popular compilers and it also comes with a web-based GUI 
for viewing, filtering, categorizing and suppressing the results.
+Therefore CodeChecker is recommended in case you need any of the above 
features or just more customizability in general.
+
+Comparison of CodeChecker and scan-build
+
+
+Static Analyzer is by design a GUI tool originally intended to be consumed by 
the XCode IDE.

NagyDonat wrote:

Is this still true?

(Even if it's obsolete, we should probably leave it for a follow-up commit.)

https://github.com/llvm/llvm-project/pull/97034
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][analyzer][doc] Migrate user-related docs from HTML to RST (PR #97034)

2024-07-02 Thread Donát Nagy via cfe-commits

https://github.com/NagyDonat edited 
https://github.com/llvm/llvm-project/pull/97034
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][analyzer][doc] Migrate user-related docs from HTML to RST (PR #97034)

2024-07-02 Thread Donát Nagy via cfe-commits

https://github.com/NagyDonat commented:

I'm really happy to see that this duplicated documentation is finally cleaned 
up after so many years :partying_face:

I added a few minor remarks in inline comments, but the change looks good 
overall.

As we discussed privately, this change should be (mostly) limited to the format 
changes, and then separate follow-up commit should eliminate the outdated 
content (and, perhaps, add new information if that's necessary). I'll update 
the PR description to emphasize this.

https://github.com/llvm/llvm-project/pull/97034
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][analyzer][doc] Migrate user-related docs from HTML to RST (PR #97034)

2024-07-02 Thread Donát Nagy via cfe-commits


@@ -0,0 +1,238 @@
+Command-Line Usage: CodeChecker and scan-build
+===
+
+This document provides guidelines for running Clang Static Analyzer from the 
command line on whole projects.
+CodeChecker and scan-build are two CLI tools for using CSA on multiple files 
(tranlation units).
+Both provide a way of driving the analyzer, detecting compilation flags, and 
generating reports.
+CodeChecker is more actively maintained, provides heuristics for working with 
multiple versions of popular compilers and it also comes with a web-based GUI 
for viewing, filtering, categorizing and suppressing the results.
+Therefore CodeChecker is recommended in case you need any of the above 
features or just more customizability in general.
+
+Comparison of CodeChecker and scan-build
+
+
+Static Analyzer is by design a GUI tool originally intended to be consumed by 
the XCode IDE.
+Its purpose is to find buggy execution paths in the program, and such paths 
are very hard to comprehend by looking at a non-interactive standard output.
+It is possible, however, to invoke the Static Analyzer from the command line 
in order to obtain analysis results, and then later view them interactively in 
a graphical interface.
+The following tools are used commonly to run the analyzer from the commandline.
+Both tools are wrapper scripts to drive the analysis and the underlying 
invocations of the Clang compiler:
+
+1. CodeChecker_ is a driver and web server that runs the Static Analyzer on 
your projects on demand and maintains a database of issues.

NagyDonat wrote:

Oh, that's surprising -- I was not familiar with that feature of RST. Naturally 
this invalidates my concern.

https://github.com/llvm/llvm-project/pull/97034
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Check the correct first and last elements in cstring.UninitializedRead (PR #95408)

2024-07-02 Thread Donát Nagy via cfe-commits

https://github.com/NagyDonat approved this pull request.

LGTM, with the disclaimer that I'd still prefer e.g. "second argument instead 
of "2nd argument" -- but I won't block the review with this bikeshedding.

https://github.com/llvm/llvm-project/pull/95408
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][analyzer][doc] Migrate user-related docs from HTML to RST (PR #97034)

2024-07-02 Thread Donát Nagy via cfe-commits


@@ -0,0 +1,238 @@
+Command-Line Usage: CodeChecker and scan-build
+===
+
+This document provides guidelines for running Clang Static Analyzer from the 
command line on whole projects.
+CodeChecker and scan-build are two CLI tools for using CSA on multiple files 
(tranlation units).
+Both provide a way of driving the analyzer, detecting compilation flags, and 
generating reports.
+CodeChecker is more actively maintained, provides heuristics for working with 
multiple versions of popular compilers and it also comes with a web-based GUI 
for viewing, filtering, categorizing and suppressing the results.
+Therefore CodeChecker is recommended in case you need any of the above 
features or just more customizability in general.
+
+Comparison of CodeChecker and scan-build
+
+
+Static Analyzer is by design a GUI tool originally intended to be consumed by 
the XCode IDE.
+Its purpose is to find buggy execution paths in the program, and such paths 
are very hard to comprehend by looking at a non-interactive standard output.
+It is possible, however, to invoke the Static Analyzer from the command line 
in order to obtain analysis results, and then later view them interactively in 
a graphical interface.
+The following tools are used commonly to run the analyzer from the commandline.
+Both tools are wrapper scripts to drive the analysis and the underlying 
invocations of the Clang compiler:
+
+1. CodeChecker_ is a driver and web server that runs the Static Analyzer on 
your projects on demand and maintains a database of issues.

NagyDonat wrote:

It's still strange that there is only one underscore which is _after_ the word 
'CodeChecker'. I'd guess that the intent was to italicize 'CodeChecker' and we 
should add another underscore _before_ this word for the sake of clarity (even 
if the actual HTML generation can work with this half-broken situation.

https://github.com/llvm/llvm-project/pull/97034
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][analyzer][doc] Migrate checkers-related docs from HTML to RST (PR #97032)

2024-07-02 Thread Donát Nagy via cfe-commits

NagyDonat wrote:

:thinking: Deleting the html files could break some links on external sites, so 
I think it would be better to replace them with a very simple "This content was 
moved to " placeholder.

https://github.com/llvm/llvm-project/pull/97032
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang-tools-extra] [clang-tidy] Clarify diagnostics of bugprone-sizeof-expression (PR #95550)

2024-07-02 Thread Donát Nagy via cfe-commits

https://github.com/NagyDonat closed 
https://github.com/llvm/llvm-project/pull/95550
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang-tools-extra] [clang-tidy] Clarify diagnostics of bugprone-sizeof-expression (PR #95550)

2024-07-02 Thread Donát Nagy via cfe-commits

NagyDonat wrote:

I mentioned this change in the paragraph that was describing my earlier commit 
(that was also modifying this check).

https://github.com/llvm/llvm-project/pull/95550
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang-tools-extra] [clang-tidy] Clarify diagnostics of bugprone-sizeof-expression (PR #95550)

2024-07-02 Thread Donát Nagy via cfe-commits

https://github.com/NagyDonat updated 
https://github.com/llvm/llvm-project/pull/95550

From 06adc063c2388ea534537f5a417751fdf64b22cd Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Don=C3=A1t=20Nagy?= 
Date: Fri, 14 Jun 2024 15:16:34 +0200
Subject: [PATCH 1/4] [clang-tidy] Clarify diagnostics of
 bugprone-sizeof-expression

...becasue they were strangely worded and in a few cases outright
incorrect.
---
 .../bugprone/SizeofExpressionCheck.cpp| 22 ++---
 .../checkers/bugprone/sizeof-expression-2.c   | 12 +--
 .../sizeof-expression-any-pointer.cpp | 86 +--
 ...on-warn-on-sizeof-pointer-to-aggregate.cpp |  4 +-
 .../checkers/bugprone/sizeof-expression.cpp   | 84 +-
 5 files changed, 105 insertions(+), 103 deletions(-)

diff --git a/clang-tools-extra/clang-tidy/bugprone/SizeofExpressionCheck.cpp 
b/clang-tools-extra/clang-tidy/bugprone/SizeofExpressionCheck.cpp
index c25ee42d0899a..3afa0e5c39882 100644
--- a/clang-tools-extra/clang-tidy/bugprone/SizeofExpressionCheck.cpp
+++ b/clang-tools-extra/clang-tidy/bugprone/SizeofExpressionCheck.cpp
@@ -296,7 +296,7 @@ void SizeofExpressionCheck::check(const 
MatchFinder::MatchResult ) {
   } else if (const auto *E =
  Result.Nodes.getNodeAs("sizeof-integer-call")) {
 diag(E->getBeginLoc(), "suspicious usage of 'sizeof()' on an expression "
-   "that results in an integer")
+   "of integer type")
 << E->getSourceRange();
   } else if (const auto *E = Result.Nodes.getNodeAs("sizeof-this")) {
 diag(E->getBeginLoc(),
@@ -314,7 +314,7 @@ void SizeofExpressionCheck::check(const 
MatchFinder::MatchResult ) {
   << E->getSourceRange();
 } else {
   diag(E->getBeginLoc(), "suspicious usage of 'sizeof()' on an expression "
- "that results in a pointer")
+ "of pointer type")
   << E->getSourceRange();
 }
   } else if (const auto *E = Result.Nodes.getNodeAs(
@@ -342,31 +342,33 @@ void SizeofExpressionCheck::check(const 
MatchFinder::MatchResult ) {
 
 if (DenominatorSize > CharUnits::Zero() &&
 !NumeratorSize.isMultipleOf(DenominatorSize)) {
-  diag(E->getOperatorLoc(), "suspicious usage of 
'sizeof(...)/sizeof(...)';"
+  diag(E->getOperatorLoc(), "suspicious usage of 
'sizeof(...)/sizeof(...)':"
 " numerator is not a multiple of denominator")
   << E->getLHS()->getSourceRange() << E->getRHS()->getSourceRange();
 } else if (ElementSize > CharUnits::Zero() &&
DenominatorSize > CharUnits::Zero() &&
ElementSize != DenominatorSize) {
-  diag(E->getOperatorLoc(), "suspicious usage of 
'sizeof(...)/sizeof(...)';"
-" numerator is not a multiple of denominator")
+  diag(E->getOperatorLoc(),
+   "suspicious usage of 'sizeof(array)/sizeof(...)':"
+   " denominator differs from the size of array elements")
   << E->getLHS()->getSourceRange() << E->getRHS()->getSourceRange();
 } else if (NumTy && DenomTy && NumTy == DenomTy) {
-  // FIXME: This message is wrong, it should not refer to sizeof "pointer"
-  // usage (and by the way, it would be to clarify all the messages).
   diag(E->getOperatorLoc(),
-   "suspicious usage of sizeof pointer 'sizeof(T)/sizeof(T)'")
+   "suspicious usage of 'sizeof(...)/sizeof(...)': both expressions "
+   "have the same type")
   << E->getLHS()->getSourceRange() << E->getRHS()->getSourceRange();
 } else if (!WarnOnSizeOfPointer) {
   // When 'WarnOnSizeOfPointer' is enabled, these messages become 
redundant:
   if (PointedTy && DenomTy && PointedTy == DenomTy) {
 diag(E->getOperatorLoc(),
- "suspicious usage of sizeof pointer 'sizeof(T*)/sizeof(T)'")
+ "suspicious usage of 'sizeof(...)/sizeof(...)': size of pointer "
+ "is divided by size of pointed type")
 << E->getLHS()->getSourceRange() << E->getRHS()->getSourceRange();
   } else if (NumTy && DenomTy && NumTy->isPointerType() &&
  DenomTy->isPointerType()) {
 diag(E->getOperatorLoc(),
- "suspicious usage of sizeof pointer 'sizeof(P*)/sizeof(Q*)'")
+ "suspicious usage of 'sizeof(...)/sizeof(...)': both expressions "
+ "have pointer types")
 << E->getLHS()->getSourceRange() << E->getRHS()->getSourceRange();
   }
 }
diff --git 
a/clang-tools-extra/test/clang-tidy/checkers/bugprone/sizeof-expression-2.c 
b/clang-tools-extra/test/clang-tidy/checkers/bugprone/sizeof-expression-2.c
index aef930f2c8fda..b898071a56613 100644
--- a/clang-tools-extra/test/clang-tidy/checkers/bugprone/sizeof-expression-2.c
+++ b/clang-tools-extra/test/clang-tidy/checkers/bugprone/sizeof-expression-2.c
@@ -34,24 +34,24 @@ int Test5() {
 
   int sum = 0;
   sum += sizeof();
-  // 

[clang] [clang][analyzer][doc] Migrate user-related docs from HTML to RST (PR #97034)

2024-07-02 Thread Donát Nagy via cfe-commits


@@ -0,0 +1,37 @@
+Obtaining the Static Analyzer
+=
+
+This page describes how to download and install the analyzer. Once the 
analyzer is installed, follow the asdf :doc:`CommandLineUsage` on using the 
commandline to get started analyzing your code.

NagyDonat wrote:

Nice catch! :laughing: 

https://github.com/llvm/llvm-project/pull/97034
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Fix crash in Stream checker when using void pointers (PR #97199)

2024-07-01 Thread Donát Nagy via cfe-commits


@@ -1034,16 +1034,16 @@ void StreamChecker::preWrite(const FnDescription *Desc, 
const CallEvent ,
   C.addTransition(State);
 }
 
-static std::optional getPointeeType(const MemRegion *R) {
+static QualType getPointeeType(const MemRegion *R) {
   if (!R)
-return std::nullopt;
+return {};
   if (const auto *ER = dyn_cast(R))
-return ER->getElementType();
+return ER->getElementType()->getCanonicalTypeUnqualified();

NagyDonat wrote:

Consider moving the `->getCanonicalTypeUnqualified()` calls into 
`escapeByStartIndexAndCount()` because that's the only place where it will be 
relevant. (This would eliminate the code duplication.)


https://github.com/llvm/llvm-project/pull/97199
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Fix crash in Stream checker when using void pointers (PR #97199)

2024-07-01 Thread Donát Nagy via cfe-commits


@@ -1034,16 +1034,16 @@ void StreamChecker::preWrite(const FnDescription *Desc, 
const CallEvent ,
   C.addTransition(State);
 }
 
-static std::optional getPointeeType(const MemRegion *R) {
+static QualType getPointeeType(const MemRegion *R) {
   if (!R)
-return std::nullopt;
+return {};

NagyDonat wrote:

It's a bit strange that here you return `{}`, while at the end of the function 
`QualType{}` is explicitly specified.

https://github.com/llvm/llvm-project/pull/97199
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Fix crash in Stream checker when using void pointers (PR #97199)

2024-07-01 Thread Donát Nagy via cfe-commits

https://github.com/NagyDonat edited 
https://github.com/llvm/llvm-project/pull/97199
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Fix crash in Stream checker when using void pointers (PR #97199)

2024-07-01 Thread Donát Nagy via cfe-commits

https://github.com/NagyDonat approved this pull request.

Thanks for the updates! I added few minor comments, but the PR is already good 
enough to be merged.

> >  I only noticed that this PR was already merged after posting the review. 
> > There is no need to revert the commit -- it's better than nothing -- but 
> > I'd be happy if you created a followup change that also handles the 
> > testcase that I mentioned.
>
> I'm not sure what you refer to. This PR is not approved, hence not merged. 
> Please continue with the review.

Oops, sorry -- I was probably confused by the purple "merged" icon in 
![image](https://github.com/llvm/llvm-project/assets/43410265/e4b494d4-014f-4b03-942b-00debb90a709)
...but the real reason is that I got up at 4 AM today :sleeping: 


https://github.com/llvm/llvm-project/pull/97199
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] Reland "[analyzer] Harden safeguards for Z3 query times" (PR #97298)

2024-07-01 Thread Donát Nagy via cfe-commits

https://github.com/NagyDonat approved this pull request.

This should be re-landed as well; when it was merged earlier the only issue was 
the incompatibility with old Z3 and that was addressed since then (by bumping 
the version requirement).

https://github.com/llvm/llvm-project/pull/97298
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][analyzer] MmapWriteExecChecker improvements (PR #97078)

2024-07-01 Thread Donát Nagy via cfe-commits

https://github.com/NagyDonat edited 
https://github.com/llvm/llvm-project/pull/97078
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][analyzer] MmapWriteExecChecker improvements (PR #97078)

2024-07-01 Thread Donát Nagy via cfe-commits


@@ -21,30 +21,55 @@
 #include "clang/StaticAnalyzer/Core/PathSensitive/CallDescription.h"
 #include "clang/StaticAnalyzer/Core/PathSensitive/CallEvent.h"
 #include "clang/StaticAnalyzer/Core/PathSensitive/CheckerContext.h"
+#include "clang/StaticAnalyzer/Core/PathSensitive/CheckerHelpers.h"
 
 using namespace clang;
 using namespace ento;
 
 namespace {
-class MmapWriteExecChecker : public Checker {
+class MmapWriteExecChecker
+: public Checker {
   CallDescription MmapFn{CDM::CLibrary, {"mmap"}, 6};
   CallDescription MprotectFn{CDM::CLibrary, {"mprotect"}, 3};
-  static int ProtWrite;
-  static int ProtExec;
-  static int ProtRead;
   const BugType BT{this, "W^X check fails, Write Exec prot flags set",
"Security"};
 
+  mutable bool FlagsInitialized = false;
+  mutable int ProtRead = 0x01;
+  mutable int ProtWrite = 0x02;
+  mutable int ProtExec = 0x04;
+
 public:
+  void checkBeginFunction(CheckerContext ) const;
   void checkPreCall(const CallEvent , CheckerContext ) const;
+
   int ProtExecOv;
   int ProtReadOv;
 };
 }
 
-int MmapWriteExecChecker::ProtWrite = 0x02;
-int MmapWriteExecChecker::ProtExec  = 0x04;
-int MmapWriteExecChecker::ProtRead  = 0x01;
+void MmapWriteExecChecker::checkBeginFunction(CheckerContext ) const {
+  if (FlagsInitialized)
+return;
+
+  FlagsInitialized = true;
+
+  const std::optional FoundProtRead =
+  tryExpandAsInteger("PROT_READ", C.getPreprocessor());
+  const std::optional FoundProtWrite =
+  tryExpandAsInteger("PROT_WRITE", C.getPreprocessor());
+  const std::optional FoundProtExec =
+  tryExpandAsInteger("PROT_EXEC", C.getPreprocessor());
+  if (FoundProtRead && FoundProtWrite && FoundProtExec) {
+ProtRead = *FoundProtRead;
+ProtWrite = *FoundProtWrite;
+ProtExec = *FoundProtExec;
+  } else {
+// FIXME: Are these useful?
+ProtRead = ProtReadOv;
+ProtExec = ProtExecOv;

NagyDonat wrote:

This code fragment lets the user specify the value of the `PROT_READ` and 
`PROT_EXEC` macros as checker options:
```cpp
Mwec->ProtExecOv =
  mgr.getAnalyzerOptions()
.getCheckerIntegerOption(Mwec, "MmapProtExec");
Mwec->ProtReadOv =
  mgr.getAnalyzerOptions()
.getCheckerIntegerOption(Mwec, "MmapProtRead");
```
This is a very marginal feature and I wouldn't spend time on implementing it; 
but now that it's already implemented I'm not sure that we should delete it.

https://github.com/llvm/llvm-project/pull/97078
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] Reland "[analyzer][NFC] Reorganize Z3 report refutation" (PR #97265)

2024-07-01 Thread Donát Nagy via cfe-commits

https://github.com/NagyDonat edited 
https://github.com/llvm/llvm-project/pull/97265
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] Reland "[analyzer][NFC] Reorganize Z3 report refutation" (PR #97265)

2024-07-01 Thread Donát Nagy via cfe-commits

https://github.com/NagyDonat approved this pull request.

Let's merge this again, after the Z3 version there shouldn't be any additional 
problems. 

https://github.com/llvm/llvm-project/pull/97265
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][analyzer] MmapWriteExecChecker improvements (PR #97078)

2024-07-01 Thread Donát Nagy via cfe-commits

https://github.com/NagyDonat edited 
https://github.com/llvm/llvm-project/pull/97078
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][analyzer] MmapWriteExecChecker improvements (PR #97078)

2024-07-01 Thread Donát Nagy via cfe-commits


@@ -21,30 +21,55 @@
 #include "clang/StaticAnalyzer/Core/PathSensitive/CallDescription.h"
 #include "clang/StaticAnalyzer/Core/PathSensitive/CallEvent.h"
 #include "clang/StaticAnalyzer/Core/PathSensitive/CheckerContext.h"
+#include "clang/StaticAnalyzer/Core/PathSensitive/CheckerHelpers.h"
 
 using namespace clang;
 using namespace ento;
 
 namespace {
-class MmapWriteExecChecker : public Checker {
+class MmapWriteExecChecker
+: public Checker {
   CallDescription MmapFn{CDM::CLibrary, {"mmap"}, 6};
   CallDescription MprotectFn{CDM::CLibrary, {"mprotect"}, 3};
-  static int ProtWrite;
-  static int ProtExec;
-  static int ProtRead;
   const BugType BT{this, "W^X check fails, Write Exec prot flags set",
"Security"};
 
+  mutable bool FlagsInitialized = false;
+  mutable int ProtRead = 0x01;
+  mutable int ProtWrite = 0x02;
+  mutable int ProtExec = 0x04;
+
 public:
+  void checkBeginFunction(CheckerContext ) const;
   void checkPreCall(const CallEvent , CheckerContext ) const;
+
   int ProtExecOv;
   int ProtReadOv;
 };
 }
 
-int MmapWriteExecChecker::ProtWrite = 0x02;
-int MmapWriteExecChecker::ProtExec  = 0x04;
-int MmapWriteExecChecker::ProtRead  = 0x01;
+void MmapWriteExecChecker::checkBeginFunction(CheckerContext ) const {
+  if (FlagsInitialized)
+return;
+
+  FlagsInitialized = true;
+
+  const std::optional FoundProtRead =
+  tryExpandAsInteger("PROT_READ", C.getPreprocessor());
+  const std::optional FoundProtWrite =
+  tryExpandAsInteger("PROT_WRITE", C.getPreprocessor());
+  const std::optional FoundProtExec =
+  tryExpandAsInteger("PROT_EXEC", C.getPreprocessor());
+  if (FoundProtRead && FoundProtWrite && FoundProtExec) {
+ProtRead = *FoundProtRead;
+ProtWrite = *FoundProtWrite;
+ProtExec = *FoundProtExec;
+  } else {
+// FIXME: Are these useful?

NagyDonat wrote:

They are not too useful (and it's a little bit surprising that there is no 
"ProtWriteOv"), but they're harmless and they don't require complex code, so 
let's keep them for now.

https://github.com/llvm/llvm-project/pull/97078
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][analyzer] MmapWriteExecChecker improvements (PR #97078)

2024-07-01 Thread Donát Nagy via cfe-commits

https://github.com/NagyDonat approved this pull request.

LGTM, straightforward change. Thanks!

https://github.com/llvm/llvm-project/pull/97078
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Fix crash in Stream checker when using void pointers (PR #97199)

2024-07-01 Thread Donát Nagy via cfe-commits

NagyDonat wrote:

_I only noticed that this PR was already merged after posting the review. There 
is no need to revert the commit -- it's better than nothing -- but I'd be happy 
if you created a followup change that also handles the testcase that I 
mentioned._

https://github.com/llvm/llvm-project/pull/97199
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][analyzer] Improve PointerSubChecker (PR #96501)

2024-07-01 Thread Donát Nagy via cfe-commits

NagyDonat wrote:

> Even protobuf contains this type of code: 
> https://codechecker-demo.eastus.cloudapp.azure.com/Default/report-detail?run=protobuf_v3.13.0_pointersub1=on=New=alpha.core.PointerSub=5545776=1bcd310fbaeccbcc13645b9b277239a2=%2adescriptor.pb.cc

I still think that this
 (1) is undeniably undefined behavior
 (2) isn't common, so won't cause "spam"  problems and
 (3( can be replaced by standard-compliant code (`offsetof`)
so there is no need to introduce a special case for it.

https://github.com/llvm/llvm-project/pull/96501
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Fix crash in Stream checker when using void pointers (PR #97199)

2024-07-01 Thread Donát Nagy via cfe-commits

https://github.com/NagyDonat requested changes to this pull request.

Unfortunately this PR is not a full solution, because e.g. the following test 
code still triggers the crash (if it is appended to the test file `stream.c`):
```c
struct zerosized {
int foo[0];
};

void fread_zerosized(struct zerosized *buffer) {
  FILE *f = fopen("/tmp/foo.txt", "r");
  fread(buffer, 1, 1, f);
  fclose(f);
}
```

(I know that zero-sized arrays are not standard-compliant, but they are a 
widespread extension: e.g. both clang and gcc accept this `struct zerosized` 
with the default settings.)

Overall, I'd say that it's futile to try to recognize zero-sized types with a 
"canonical type equal to" check, so you should just check whether 
`ElemSizeInChars` is zero and do something based on that. (Either an early 
return, or you can say `ElemSizeInChars = 1` at that point if you think that 
that's the logically correct solution.)

``This way you could also avoid the immediately invoked lambda in 
`getPointeeType` which is really ugly in my opinion.``

https://github.com/llvm/llvm-project/pull/97199
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][analyzer] Improve documentation of checker 'cplusplus.Move' (NFC) (PR #96295)

2024-06-27 Thread Donát Nagy via cfe-commits

https://github.com/NagyDonat approved this pull request.

The new changes also LGTM, feel free to merge this when you want.

https://github.com/llvm/llvm-project/pull/96295
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][analyzer] Improve PointerSubChecker (PR #96501)

2024-06-27 Thread Donát Nagy via cfe-commits

NagyDonat wrote:

> The warning message may be still misleading if the LHS or RHS "arrays" are 
> non-array variables.

I think that the warning message is OK: "Subtraction of two pointers that do 
not point into the same array is undefined behavior." -- this also covers the 
case when one or both of the pointers do not point to arrays. (It doesn't 
mention the corner case that it's also valid to subtract two identical pointers 
that point to a non-array value, but that's completely irrelevant in practice, 
so wouldn't be a helpful suggestion.)

> (or detect if `offsetof` can be used and include it in the message)?

I think that would be a waste of time, because it's very rare that a project 
manually reimplements `offsetof` -- I think it only appears in `vim` becasue 
it's a very old codebase. (Also developers who play with this kind of low-level 
trickery should be familiar with the standard and understand what's the 
problem.)


https://github.com/llvm/llvm-project/pull/96501
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][analyzer] Improve PointerSubChecker (PR #96501)

2024-06-26 Thread Donát Nagy via cfe-commits

NagyDonat wrote:

> These results look correct according to the checker, but I am not sure if 
> such results are useful or really invalid: 
> https://codechecker-demo.eastus.cloudapp.azure.com/Default/reports?run=vim_v8.2.1920_pointersub1=on=New=alpha.core.PointerSub
>  In these cases the address difference of an (array) member of a struct and 
> start of the struct is taken. According to the checker rules, taking 
> difference of addresses of any variables or member variables is invalid.

This trickery basically reimplements `offsetof()` (or something very similar to 
it), and as it is in the stable `vim` repo I assume that it's accepted by 
practically all compilers; but it clearly violates the language standard so I 
think it's good that the checker reports it. (Perhaps this trick would've been 
acceptable thirty years ago, but now we have standard `offsetof()` that usually 
expands to `__builtin_offsetof()` so developers who try to use homemade 
alternatives deserve a warning.) 

https://github.com/llvm/llvm-project/pull/96501
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][analyzer] Improve PointerSubChecker (PR #96501)

2024-06-25 Thread Donát Nagy via cfe-commits

https://github.com/NagyDonat edited 
https://github.com/llvm/llvm-project/pull/96501
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][analyzer] Improve PointerSubChecker (PR #96501)

2024-06-25 Thread Donát Nagy via cfe-commits

https://github.com/NagyDonat approved this pull request.

LGTM.

I'm a bit surprised to see that this checker duplicates the logic of the array 
bounds checkers (in the special case when the indexing operation is within a 
pointer subtraction). Right now this is OK but we'll need to delete this once 
ArrayBoundV2 becomes stable.

https://github.com/llvm/llvm-project/pull/96501
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Check the correct first and last elements in cstring.UninitializedRead (PR #95408)

2024-06-24 Thread Donát Nagy via cfe-commits


@@ -393,6 +401,162 @@ ProgramStateRef 
CStringChecker::checkNonNull(CheckerContext ,
   return stateNonNull;
 }
 
+static std::optional getIndex(ProgramStateRef State,
+  const ElementRegion *ER, CharKind CK) {
+  SValBuilder  = State->getStateManager().getSValBuilder();
+  ASTContext  = SVB.getContext();
+
+  if (CK == CharKind::Regular) {
+if (ER->getValueType() != Ctx.CharTy)
+  return {};
+return ER->getIndex();
+  }
+
+  if (ER->getValueType() != Ctx.WideCharTy)
+return {};
+
+  QualType SizeTy = Ctx.getSizeType();
+  NonLoc WideSize =
+  SVB.makeIntVal(Ctx.getTypeSizeInChars(Ctx.WideCharTy).getQuantity(),
+ SizeTy)
+  .castAs();
+  SVal Offset =
+  SVB.evalBinOpNN(State, BO_Mul, ER->getIndex(), WideSize, SizeTy);
+  if (Offset.isUnknown())
+return {};
+  return Offset.castAs();
+}
+
+// Try to get hold of the origin region (e.g. the actual array region from an
+// element region).
+static const TypedValueRegion *getOriginRegion(const ElementRegion *ER) {
+  const MemRegion *MR = ER->getSuperRegion();
+  const MemRegion *Ret = MR;
+  assert(MR);
+  if (const auto *sym = MR->getAs()) {
+SymbolRef sym2 = sym->getSymbol();
+if (!sym2)
+  return nullptr;
+Ret = sym2->getOriginRegion();
+  }
+  return dyn_cast_or_null(Ret);
+}
+
+// Basically 1 -> 1st, 12 -> 12th, etc.
+static void printIdxWithOrdinalSuffix(llvm::raw_ostream , unsigned Idx) {
+  Os << Idx << llvm::getOrdinalSuffix(Idx);

NagyDonat wrote:

Hmm, I agree that "(second)" doesn't look good when it represents an element 
index _(as highlighted)_:
![image](https://github.com/llvm/llvm-project/assets/43410265/bb01a19c-9d32-4a1d-a356-8cdf742d311d)
but that's already changed to "(at index 1)".

On the other hand, I still think "second argument" is significantly better than 
"2nd argument":
![image](https://github.com/llvm/llvm-project/assets/43410265/1e053a0d-7a38-47d6-8ebe-d66ebf46bba1)

I can accept "2nd argument" if you think that it's significantly better, but if 
you're indifferent, then please change it.


https://github.com/llvm/llvm-project/pull/95408
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][analyzer] Improve documentation of checker 'cplusplus.Move' (NFC) (PR #96295)

2024-06-21 Thread Donát Nagy via cfe-commits

https://github.com/NagyDonat edited 
https://github.com/llvm/llvm-project/pull/96295
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][analyzer] Improve documentation of checker 'cplusplus.Move' (NFC) (PR #96295)

2024-06-21 Thread Donát Nagy via cfe-commits

https://github.com/NagyDonat approved this pull request.

LGTM, thanks for updating the docs!

I only have one minor inline remark about formatting.

https://github.com/llvm/llvm-project/pull/96295
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][analyzer] Improve documentation of checker 'cplusplus.Move' (NFC) (PR #96295)

2024-06-21 Thread Donát Nagy via cfe-commits


@@ -420,21 +420,52 @@ around, such as ``std::string_view``.
 
 cplusplus.Move (C++)
 
-Method calls on a moved-from object and copying a moved-from object will be 
reported.
-
+Find use-after-move bugs in C++. This includes method calls on moved-from
+objects, assignment of a moved-from object, and repeated move of a moved-from
+object.
 
 .. code-block:: cpp
 
-  struct A {
+ struct A {
void foo() {}
  };
 
- void f() {
+ void f1() {
A a;
A b = std::move(a); // note: 'a' became 'moved-from' here
a.foo();// warn: method call on a 'moved-from' object 'a'
  }
 
+ void f2() {
+   A a;
+   A b = std::move(a);
+   A c(std::move(a)); // warn: move of an already moved-from object
+ }
+
+ void f3() {
+   A a;
+   A b = std::move(a);
+   b = a; // warn: copy of moved-from object
+ }
+
+The checker option ``WarnOn`` controls on what objects the use-after-move is
+checked. The most strict value is ``KnownsOnly``, in this mode only objects are
+checked whose type is known to be move-unsafe. These include most STL objects
+(but excluding move-safe ones) and smart pointers. With option value
+``KnownsAndLocals`` local variables (of any type) are additionally checked. The
+idea behind this is that local variables are usually not tempting to be re-used
+so an use after move is more likely a bug than with member variables. With
+option value ``All`` any use-after move condition is checked on all kinds of
+variables, excluding global variables and known move-safe cases. Default value
+is ``KnownsAndLocals``.

NagyDonat wrote:

```suggestion
The checker option ``WarnOn`` controls on what objects the use-after-move is
checked.
* The most strict value is ``KnownsOnly``, in this mode only objects are
  checked whose type is known to be move-unsafe. These include most STL objects
  (but excluding move-safe ones) and smart pointers.
* With option value ``KnownsAndLocals`` local variables (of any type) are
  additionally checked. The idea behind this is that local variables are
  usually not tempting to be re-used so an use after move is more likely a bug
  than with member variables.
* With option value ``All`` any use-after move condition is checked on all
  kinds of variables, excluding global variables and known move-safe cases.
Default value is ``KnownsAndLocals``.
```
This paragraph was very long, re-flow it into a bullet point list.

https://github.com/llvm/llvm-project/pull/96295
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


  1   2   3   4   >