[clang] [analyzer] MallocChecker: Recognize std::atomics in smart pointer suppression. (PR #90918)

2024-05-02 Thread Artem Dergachev via cfe-commits

https://github.com/haoNoQ created 
https://github.com/llvm/llvm-project/pull/90918

Fixes #90498.

Same as 5337efc69cdd5 for atomic builtins, but for `std::atomic` this time. 
This is useful because even though the actual builtin atomic is still there, it 
may be buried beyond the inlining depth limit.

Also add one popular custom smart pointer class name to the name-based 
heuristics, which isn't necessary to fix the bug but arguably a good idea 
regardless.

>From 6f7c0e33d10d6b3dec2a96a8cdc132eff71463fc Mon Sep 17 00:00:00 2001
From: Artem Dergachev 
Date: Thu, 2 May 2024 15:15:27 -0700
Subject: [PATCH] [analyzer] MallocChecker: Recognize std::atomics in smart
 pointer suppression.

Fixes #90498.

Same as 5337efc69cdd5 for atomic builtins, but for `std::atomic` this time.
This is useful because even though the actual builtin atomic is still there,
it may be buried beyond the inlining depth limit.

Also add one popular custom smart pointer class name to the name-based
heuristics, which isn't necessary to fix the bug but arguably a good idea
regardless.
---
 .../StaticAnalyzer/Checkers/MallocChecker.cpp |  19 ++-
 .../Inputs/system-header-simulator-cxx.h  |   7 ++
 clang/test/Analysis/NewDelete-atomics.cpp | 116 --
 3 files changed, 130 insertions(+), 12 deletions(-)

diff --git a/clang/lib/StaticAnalyzer/Checkers/MallocChecker.cpp 
b/clang/lib/StaticAnalyzer/Checkers/MallocChecker.cpp
index 11651fd491f743..3d89d1e6343490 100644
--- a/clang/lib/StaticAnalyzer/Checkers/MallocChecker.cpp
+++ b/clang/lib/StaticAnalyzer/Checkers/MallocChecker.cpp
@@ -3447,7 +3447,7 @@ static bool isReferenceCountingPointerDestructor(const 
CXXDestructorDecl *DD) {
 if (N.contains_insensitive("ptr") || N.contains_insensitive("pointer")) {
   if (N.contains_insensitive("ref") || N.contains_insensitive("cnt") ||
   N.contains_insensitive("intrusive") ||
-  N.contains_insensitive("shared")) {
+  N.contains_insensitive("shared") || N.ends_with_insensitive("rc")) {
 return true;
   }
 }
@@ -3479,13 +3479,24 @@ PathDiagnosticPieceRef 
MallocBugVisitor::VisitNode(const ExplodedNode *N,
   // original reference count is positive, we should not report use-after-frees
   // on objects deleted in such destructors. This can probably be improved
   // through better shared pointer modeling.
-  if (ReleaseDestructorLC) {
+  if (ReleaseDestructorLC && (ReleaseDestructorLC == CurrentLC ||
+  ReleaseDestructorLC->isParentOf(CurrentLC))) {
 if (const auto *AE = dyn_cast(S)) {
+  // Check for manual use of atomic builtins.
   AtomicExpr::AtomicOp Op = AE->getOp();
   if (Op == AtomicExpr::AO__c11_atomic_fetch_add ||
   Op == AtomicExpr::AO__c11_atomic_fetch_sub) {
-if (ReleaseDestructorLC == CurrentLC ||
-ReleaseDestructorLC->isParentOf(CurrentLC)) {
+BR.markInvalid(getTag(), S);
+  }
+} else if (const auto *CE = dyn_cast(S)) {
+  // Check for `std::atomic` and such. This covers both regular method 
calls
+  // and operator calls.
+  if (const auto *MD =
+  dyn_cast_or_null(CE->getDirectCallee())) {
+const CXXRecordDecl *RD = MD->getParent();
+// A bit wobbly with ".contains()" because it may be like
+// "__atomic_base" or something.
+if (StringRef(RD->getNameAsString()).contains("atomic")) {
   BR.markInvalid(getTag(), S);
 }
   }
diff --git a/clang/test/Analysis/Inputs/system-header-simulator-cxx.h 
b/clang/test/Analysis/Inputs/system-header-simulator-cxx.h
index 1c2be322f83c20..29326ec1f9280d 100644
--- a/clang/test/Analysis/Inputs/system-header-simulator-cxx.h
+++ b/clang/test/Analysis/Inputs/system-header-simulator-cxx.h
@@ -1260,6 +1260,13 @@ template<
 iterator end() const { return iterator(val + 1); }
 };
 
+template 
+class atomic {
+public:
+  T operator++();
+  T operator--();
+};
+
 namespace execution {
 class sequenced_policy {};
 }
diff --git a/clang/test/Analysis/NewDelete-atomics.cpp 
b/clang/test/Analysis/NewDelete-atomics.cpp
index 54fce17ea7bd22..1425acab7489ba 100644
--- a/clang/test/Analysis/NewDelete-atomics.cpp
+++ b/clang/test/Analysis/NewDelete-atomics.cpp
@@ -20,7 +20,7 @@ typedef enum memory_order {
   memory_order_seq_cst = __ATOMIC_SEQ_CST
 } memory_order;
 
-class Obj {
+class RawObj {
   int RefCnt;
 
 public:
@@ -37,11 +37,27 @@ class Obj {
   void foo();
 };
 
+class StdAtomicObj {
+  std::atomic RefCnt;
+
+public:
+  int incRef() {
+return ++RefCnt;
+  }
+
+  int decRef() {
+return --RefCnt;
+  }
+
+  void foo();
+};
+
+template 
 class IntrusivePtr {
-  Obj *Ptr;
+  T *Ptr;
 
 public:
-  IntrusivePtr(Obj *Ptr) : Ptr(Ptr) {
+  IntrusivePtr(T *Ptr) : Ptr(Ptr) {
 Ptr->incRef();
   }
 
@@ -55,22 +71,106 @@ class IntrusivePtr {
   delete Ptr;
   }
 
-  Obj *getPtr() const { return Ptr; } // no-warning
+  T *getPtr() const { return Ptr; } // no-warning
+};
+
+// A

[clang] [analyzer] MallocChecker: Recognize std::atomics in smart pointer suppression. (PR #90918)

2024-05-02 Thread via cfe-commits

llvmbot wrote:




@llvm/pr-subscribers-clang-static-analyzer-1

Author: Artem Dergachev (haoNoQ)


Changes

Fixes #90498.

Same as 5337efc69cdd5 for atomic builtins, but for `std::atomic` this time. 
This is useful because even though the actual builtin atomic is still there, it 
may be buried beyond the inlining depth limit.

Also add one popular custom smart pointer class name to the name-based 
heuristics, which isn't necessary to fix the bug but arguably a good idea 
regardless.

---
Full diff: https://github.com/llvm/llvm-project/pull/90918.diff


3 Files Affected:

- (modified) clang/lib/StaticAnalyzer/Checkers/MallocChecker.cpp (+15-4) 
- (modified) clang/test/Analysis/Inputs/system-header-simulator-cxx.h (+7) 
- (modified) clang/test/Analysis/NewDelete-atomics.cpp (+108-8) 


``diff
diff --git a/clang/lib/StaticAnalyzer/Checkers/MallocChecker.cpp 
b/clang/lib/StaticAnalyzer/Checkers/MallocChecker.cpp
index 11651fd491f743..3d89d1e6343490 100644
--- a/clang/lib/StaticAnalyzer/Checkers/MallocChecker.cpp
+++ b/clang/lib/StaticAnalyzer/Checkers/MallocChecker.cpp
@@ -3447,7 +3447,7 @@ static bool isReferenceCountingPointerDestructor(const 
CXXDestructorDecl *DD) {
 if (N.contains_insensitive("ptr") || N.contains_insensitive("pointer")) {
   if (N.contains_insensitive("ref") || N.contains_insensitive("cnt") ||
   N.contains_insensitive("intrusive") ||
-  N.contains_insensitive("shared")) {
+  N.contains_insensitive("shared") || N.ends_with_insensitive("rc")) {
 return true;
   }
 }
@@ -3479,13 +3479,24 @@ PathDiagnosticPieceRef 
MallocBugVisitor::VisitNode(const ExplodedNode *N,
   // original reference count is positive, we should not report use-after-frees
   // on objects deleted in such destructors. This can probably be improved
   // through better shared pointer modeling.
-  if (ReleaseDestructorLC) {
+  if (ReleaseDestructorLC && (ReleaseDestructorLC == CurrentLC ||
+  ReleaseDestructorLC->isParentOf(CurrentLC))) {
 if (const auto *AE = dyn_cast(S)) {
+  // Check for manual use of atomic builtins.
   AtomicExpr::AtomicOp Op = AE->getOp();
   if (Op == AtomicExpr::AO__c11_atomic_fetch_add ||
   Op == AtomicExpr::AO__c11_atomic_fetch_sub) {
-if (ReleaseDestructorLC == CurrentLC ||
-ReleaseDestructorLC->isParentOf(CurrentLC)) {
+BR.markInvalid(getTag(), S);
+  }
+} else if (const auto *CE = dyn_cast(S)) {
+  // Check for `std::atomic` and such. This covers both regular method 
calls
+  // and operator calls.
+  if (const auto *MD =
+  dyn_cast_or_null(CE->getDirectCallee())) {
+const CXXRecordDecl *RD = MD->getParent();
+// A bit wobbly with ".contains()" because it may be like
+// "__atomic_base" or something.
+if (StringRef(RD->getNameAsString()).contains("atomic")) {
   BR.markInvalid(getTag(), S);
 }
   }
diff --git a/clang/test/Analysis/Inputs/system-header-simulator-cxx.h 
b/clang/test/Analysis/Inputs/system-header-simulator-cxx.h
index 1c2be322f83c20..29326ec1f9280d 100644
--- a/clang/test/Analysis/Inputs/system-header-simulator-cxx.h
+++ b/clang/test/Analysis/Inputs/system-header-simulator-cxx.h
@@ -1260,6 +1260,13 @@ template<
 iterator end() const { return iterator(val + 1); }
 };
 
+template 
+class atomic {
+public:
+  T operator++();
+  T operator--();
+};
+
 namespace execution {
 class sequenced_policy {};
 }
diff --git a/clang/test/Analysis/NewDelete-atomics.cpp 
b/clang/test/Analysis/NewDelete-atomics.cpp
index 54fce17ea7bd22..1425acab7489ba 100644
--- a/clang/test/Analysis/NewDelete-atomics.cpp
+++ b/clang/test/Analysis/NewDelete-atomics.cpp
@@ -20,7 +20,7 @@ typedef enum memory_order {
   memory_order_seq_cst = __ATOMIC_SEQ_CST
 } memory_order;
 
-class Obj {
+class RawObj {
   int RefCnt;
 
 public:
@@ -37,11 +37,27 @@ class Obj {
   void foo();
 };
 
+class StdAtomicObj {
+  std::atomic RefCnt;
+
+public:
+  int incRef() {
+return ++RefCnt;
+  }
+
+  int decRef() {
+return --RefCnt;
+  }
+
+  void foo();
+};
+
+template 
 class IntrusivePtr {
-  Obj *Ptr;
+  T *Ptr;
 
 public:
-  IntrusivePtr(Obj *Ptr) : Ptr(Ptr) {
+  IntrusivePtr(T *Ptr) : Ptr(Ptr) {
 Ptr->incRef();
   }
 
@@ -55,22 +71,106 @@ class IntrusivePtr {
   delete Ptr;
   }
 
-  Obj *getPtr() const { return Ptr; } // no-warning
+  T *getPtr() const { return Ptr; } // no-warning
+};
+
+// Also IntrusivePtr but let's dodge name-based heuristics.
+template 
+class DifferentlyNamed {
+  T *Ptr;
+
+public:
+  DifferentlyNamed(T *Ptr) : Ptr(Ptr) {
+Ptr->incRef();
+  }
+
+  DifferentlyNamed(const DifferentlyNamed &Other) : Ptr(Other.Ptr) {
+Ptr->incRef();
+  }
+
+  ~DifferentlyNamed() {
+  // We should not take the path on which the object is deleted.
+if (Ptr->decRef() == 1)
+  delete Ptr;
+  }
+
+  T *getPtr() const { return Ptr; } // no-warning
 };
 
 void testDestroyLoc

[clang] [analyzer] MallocChecker: Recognize std::atomics in smart pointer suppression. (PR #90918)

2024-05-02 Thread Gábor Horváth via cfe-commits

https://github.com/Xazax-hun edited 
https://github.com/llvm/llvm-project/pull/90918
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] MallocChecker: Recognize std::atomics in smart pointer suppression. (PR #90918)

2024-05-02 Thread Gábor Horváth via cfe-commits

https://github.com/Xazax-hun approved this pull request.

LG!

https://github.com/llvm/llvm-project/pull/90918
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] MallocChecker: Recognize std::atomics in smart pointer suppression. (PR #90918)

2024-05-02 Thread Gábor Horváth via cfe-commits


@@ -3479,13 +3479,24 @@ PathDiagnosticPieceRef 
MallocBugVisitor::VisitNode(const ExplodedNode *N,
   // original reference count is positive, we should not report use-after-frees
   // on objects deleted in such destructors. This can probably be improved
   // through better shared pointer modeling.
-  if (ReleaseDestructorLC) {
+  if (ReleaseDestructorLC && (ReleaseDestructorLC == CurrentLC ||
+  ReleaseDestructorLC->isParentOf(CurrentLC))) {
 if (const auto *AE = dyn_cast(S)) {
+  // Check for manual use of atomic builtins.
   AtomicExpr::AtomicOp Op = AE->getOp();
   if (Op == AtomicExpr::AO__c11_atomic_fetch_add ||
   Op == AtomicExpr::AO__c11_atomic_fetch_sub) {
-if (ReleaseDestructorLC == CurrentLC ||
-ReleaseDestructorLC->isParentOf(CurrentLC)) {
+BR.markInvalid(getTag(), S);
+  }
+} else if (const auto *CE = dyn_cast(S)) {
+  // Check for `std::atomic` and such. This covers both regular method 
calls
+  // and operator calls.
+  if (const auto *MD =
+  dyn_cast_or_null(CE->getDirectCallee())) {
+const CXXRecordDecl *RD = MD->getParent();
+// A bit wobbly with ".contains()" because it may be like
+// "__atomic_base" or something.
+if (StringRef(RD->getNameAsString()).contains("atomic")) {

Xazax-hun wrote:

Do we need to do the heavy weight `getNameAsString` here? Could we get the 
identifier instead?

https://github.com/llvm/llvm-project/pull/90918
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] MallocChecker: Recognize std::atomics in smart pointer suppression. (PR #90918)

2024-05-03 Thread Balazs Benics via cfe-commits

https://github.com/steakhal commented:

Hah!
I've just debugged an identical case last week, wrt. llvm intrusive pointers, 
which also used std::atomics inside for the refcount and lead to a Leak FP.

When I dag into the case, I realized that the root cause is that we can't track 
the value of the `std::atomic` from zero, inc, dec and then fail to prove that 
it must be equal to zero and take the "delete" path.

This is not strictly related to leak reports, probably that's the most affected 
checker - but one can craft any other case with other checkers too. This is why 
I wanted a solution not special-casing the leak checker, but something more 
generic.

I considered modeling the atomic member functions, until the first place they 
escape, after which we can no longer ever reason about them. This made me to 
look into suppressions.

My first idea was to check what "interesting symbols" we have after a BR 
traversal and check if any is a conjured one conjured for a callexpr statement 
calling a std::atomic-related function - and suppress the report in that case.
I realized that this wouldn't be enough, as control dependencies such as 
conditions, wouldn't be considered for leaks, thus not suppress the motivating 
FP example.

My next idea was to also check the `TrackedConditions` of the bug report, and 
look for expressions that refer to a subexpression for which we associate a 
conjured symbol, etc. and apply the same logic as before.
It didn't work because it would suppress all paths crossing that condition, no 
matter if the branch was taken or not.
Now, the final idea was to only consider conditions, which dominate the basic 
block of the bug report itself. This wouldn't solve the leak FP per-say, but 
would work for any other bug report.

I believe, that such a heuristic with dominators could work for the rest of the 
checkers - where the bug report is attached to some given statement - and not 
delayed until some other future statement like in the leak checker.

All in all, I think the suppression you propose here makes sense.
I've just left here what my thought process was when I dig into a similar case. 
It might be useful, who knows.

https://github.com/llvm/llvm-project/pull/90918
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] MallocChecker: Recognize std::atomics in smart pointer suppression. (PR #90918)

2024-05-03 Thread Balazs Benics via cfe-commits


@@ -3479,13 +3479,24 @@ PathDiagnosticPieceRef 
MallocBugVisitor::VisitNode(const ExplodedNode *N,
   // original reference count is positive, we should not report use-after-frees
   // on objects deleted in such destructors. This can probably be improved
   // through better shared pointer modeling.
-  if (ReleaseDestructorLC) {
+  if (ReleaseDestructorLC && (ReleaseDestructorLC == CurrentLC ||
+  ReleaseDestructorLC->isParentOf(CurrentLC))) {
 if (const auto *AE = dyn_cast(S)) {
+  // Check for manual use of atomic builtins.
   AtomicExpr::AtomicOp Op = AE->getOp();
   if (Op == AtomicExpr::AO__c11_atomic_fetch_add ||
   Op == AtomicExpr::AO__c11_atomic_fetch_sub) {
-if (ReleaseDestructorLC == CurrentLC ||
-ReleaseDestructorLC->isParentOf(CurrentLC)) {
+BR.markInvalid(getTag(), S);
+  }
+} else if (const auto *CE = dyn_cast(S)) {
+  // Check for `std::atomic` and such. This covers both regular method 
calls
+  // and operator calls.
+  if (const auto *MD =
+  dyn_cast_or_null(CE->getDirectCallee())) {
+const CXXRecordDecl *RD = MD->getParent();
+// A bit wobbly with ".contains()" because it may be like
+// "__atomic_base" or something.
+if (StringRef(RD->getNameAsString()).contains("atomic")) {

steakhal wrote:

Do we have any safeguard to only match names within the `std` namespace?
Could you add a test case demonstrating that a user-defined type wouldn't be 
mistaken for `atomic` here?

https://github.com/llvm/llvm-project/pull/90918
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] MallocChecker: Recognize std::atomics in smart pointer suppression. (PR #90918)

2024-05-03 Thread Balazs Benics via cfe-commits

https://github.com/steakhal edited 
https://github.com/llvm/llvm-project/pull/90918
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] MallocChecker: Recognize std::atomics in smart pointer suppression. (PR #90918)

2024-05-06 Thread Artem Dergachev via cfe-commits


@@ -3479,13 +3479,24 @@ PathDiagnosticPieceRef 
MallocBugVisitor::VisitNode(const ExplodedNode *N,
   // original reference count is positive, we should not report use-after-frees
   // on objects deleted in such destructors. This can probably be improved
   // through better shared pointer modeling.
-  if (ReleaseDestructorLC) {
+  if (ReleaseDestructorLC && (ReleaseDestructorLC == CurrentLC ||
+  ReleaseDestructorLC->isParentOf(CurrentLC))) {
 if (const auto *AE = dyn_cast(S)) {
+  // Check for manual use of atomic builtins.
   AtomicExpr::AtomicOp Op = AE->getOp();
   if (Op == AtomicExpr::AO__c11_atomic_fetch_add ||
   Op == AtomicExpr::AO__c11_atomic_fetch_sub) {
-if (ReleaseDestructorLC == CurrentLC ||
-ReleaseDestructorLC->isParentOf(CurrentLC)) {
+BR.markInvalid(getTag(), S);
+  }
+} else if (const auto *CE = dyn_cast(S)) {
+  // Check for `std::atomic` and such. This covers both regular method 
calls
+  // and operator calls.
+  if (const auto *MD =
+  dyn_cast_or_null(CE->getDirectCallee())) {
+const CXXRecordDecl *RD = MD->getParent();
+// A bit wobbly with ".contains()" because it may be like
+// "__atomic_base" or something.
+if (StringRef(RD->getNameAsString()).contains("atomic")) {

haoNoQ wrote:

> Do we need to do the heavy weight `getNameAsString` here? Could we get the 
> identifier instead?

Dunno do we still care about this? This isn't a hot path, and it's so annoying 
and error-prone to check the crash pre-condition first. If it's fast enough for 
ASTMatchers it's probably fast enough for us.

I feel like, since nobody provided a safer method, and since the documentation 
appears to be very stale (`getNameAsString` is described as deprecated but it's 
already clear that it's not going anywhere), this doesn't appear to be a real 
problem to anybody.

Though, I'm very open to changing my mind about this :)

https://github.com/llvm/llvm-project/pull/90918
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] MallocChecker: Recognize std::atomics in smart pointer suppression. (PR #90918)

2024-05-06 Thread Artem Dergachev via cfe-commits


@@ -3479,13 +3479,24 @@ PathDiagnosticPieceRef 
MallocBugVisitor::VisitNode(const ExplodedNode *N,
   // original reference count is positive, we should not report use-after-frees
   // on objects deleted in such destructors. This can probably be improved
   // through better shared pointer modeling.
-  if (ReleaseDestructorLC) {
+  if (ReleaseDestructorLC && (ReleaseDestructorLC == CurrentLC ||
+  ReleaseDestructorLC->isParentOf(CurrentLC))) {
 if (const auto *AE = dyn_cast(S)) {
+  // Check for manual use of atomic builtins.
   AtomicExpr::AtomicOp Op = AE->getOp();
   if (Op == AtomicExpr::AO__c11_atomic_fetch_add ||
   Op == AtomicExpr::AO__c11_atomic_fetch_sub) {
-if (ReleaseDestructorLC == CurrentLC ||
-ReleaseDestructorLC->isParentOf(CurrentLC)) {
+BR.markInvalid(getTag(), S);
+  }
+} else if (const auto *CE = dyn_cast(S)) {
+  // Check for `std::atomic` and such. This covers both regular method 
calls
+  // and operator calls.
+  if (const auto *MD =
+  dyn_cast_or_null(CE->getDirectCallee())) {
+const CXXRecordDecl *RD = MD->getParent();
+// A bit wobbly with ".contains()" because it may be like
+// "__atomic_base" or something.
+if (StringRef(RD->getNameAsString()).contains("atomic")) {

haoNoQ wrote:

> Do we have any safeguard to only match names within the `std` namespace? 
> Could you add a test case demonstrating that a user-defined type wouldn't be 
> mistaken for `atomic` here?

There aren't any safeguards, but I'm not sure we want them. This is already a 
crude heuristic that goes at like 45 degrees against the desired direction. I 
think I'd rather have it catch more false positives by respecting user-defined 
types that are probably atomics, than eliminate a few false negatives when our 
tool is applied to... Chemistry software probably? A few video games come to 
mind? Which are both amazing and I'd love to catch a few bugs in them. But I'm 
generally more worried about the entire projects that can't use our tool at all 
because they use custom atomic classes, dealing with problems similar to the 
original bug report.

Because this heuristic applies only to method calls inside destructors (which 
doesn't include other destructor calls), the exact situation where this causes 
problems is _"somebody explicitly calls a method on a class named 
'...atomic...' which isn't an actual atomic integer, in a destructor which 
isn't a destructor of a smart pointer, and we're tracking a MallocChecker 
use-after-free report where memory was released inside that destructor, and 
that report is actually desired by the user"_. Which is definitely not 
impossible, but even in projects where this could happen, it would not happen 
every time; I hope that only one or two reports are affected in practice. 
MallocChecker isn't even that good in C++ code in which the programmers know 
what a destructor is, so I think even the "report is actually desired by the 
user" part would be fairly hard to satisfy.

So I'm future-proofing this a bit, acknowledging that if we went with strict 
`std::atomic` requirement, we'd be likely to relax it in the future when more 
bug reports come in.

Dunno, am I being overly pessimistic? Again, I'm very open to changing my mind.

https://github.com/llvm/llvm-project/pull/90918
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] MallocChecker: Recognize std::atomics in smart pointer suppression. (PR #90918)

2024-05-06 Thread Artem Dergachev via cfe-commits

haoNoQ wrote:

> I've just left here what my thought process was when I dig into a similar 
> case. It might be useful, who knows.

Medium-to-long-term I think an attribute-based approach might make sense there:
- Either annotate reference-counting pointers as "hey I'm a smart pointer (I'm 
following a different safety model in which symbolic execution based analysis 
would do more harm than good) so please ignore memory managed by me";
- Or annotate reference-counted objects (i.e. objects that contain an intrusive 
reference count and are always managed by intrusive reference-counted pointers) 
with an attribute "hey I'm always well-managed, please ignore my allocations 
entirely (for the same reason)".

Or both.

I've definitely seen a few codebases, that are security-critical to a large-ish 
chunk of humanity, that have like 20 competing ad-hoc reference-counting 
schemes. Mostly in plain C where MallocChecker would otherwise be very useful, 
if it wasn't ruined by false positives of a similar nature from all these 
reference counting schemes. These schemes probably don't make sense to hardcode 
in the compiler because there's no inheritance so you'd need to hardcode every 
struct that follows one of those schemes, not just every scheme in and of 
itself.

> I considered modeling the atomic member functions, until the first place they 
> escape, after which we can no longer ever reason about them. This made me to 
> look into suppressions.

Yeah I think it's not sufficient to model the atomics. It's a good idea to 
model them anyway, but even when atomics aren't used (eg. the custom smart 
pointer never needed to be thread-safe), the fundamental problem is that we 
still don't know the *initial* value of the reference count. In particular we 
can't refute the possibility that the original value was like `-1` or something.

Modeling atomics would make it better when the smart pointer was first created 
*during* analysis, so we actually know that the initial value is 0. Then by 
carefully tracking it we might arrive to the correct conclusion. But when the 
initial value is symbolic we're fundamentally powerless without the 
domain-specific knowledge that the value could not have been `-1`.

It's possible that this domain-specific knowledge could be transferred to us 
with the help of well-placed assertions across the smart pointer class. But an 
attribute could achieve the same in a more direct manner.

> I believe, that such a heuristic with dominators could work for the rest of 
> the checkers - where the bug report is attached to some given statement - and 
> not delayed until some other future statement like in the leak checker.

Yes, I think there has to be something good in this area, even though I haven't 
got any good specific solutions in my head. @Szelethus did a lot of initial 
experimentation in this area, which resulted in improved condition tracking, 
but we haven't used it for false positive suppression yet. Once we research 
this deeper and make it more principled, maybe we should really start doing 
that.

It may also be a good idea to not look at the bug report in isolation, but 
consider the entire space of execution paths on which it was found. If the 
space isn't "vast" enough to convince us that we aren't stepping onto an 
#61669, suppress the warning.

https://github.com/llvm/llvm-project/pull/90918
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] MallocChecker: Recognize std::atomics in smart pointer suppression. (PR #90918)

2024-05-06 Thread via cfe-commits

sharkautarch wrote:


> Yeah I think it's not sufficient to model the atomics. It's a good idea to 
> model them anyway, but even when atomics aren't used (eg. the custom smart 
> pointer never needed to be thread-safe), the fundamental problem is that we 
> still don't know the _initial_ value of the reference count. In particular we 
> can't refute the possibility that the original value was like `-1` or 
> something.

I was actually thinking about something similar to that sort of conundrum, and 
I did have some ideas that don't involve whole program analysis:

- Track function/method calls that pass freeable pointers/references to 
freeable stuff that cross TU boundaries. 
 Maybe that would allow the analyzer to backtrack beyond the local TU, without 
having to try to analyze all of the TUs?

- Specifically for use-after-free FPs with reference counting implementations, 
with perhaps the exception of static variables, I imagine you'd have to have 
allocated memory for you do to a use-after-free on it. Maybe for 
classes/objects recognized as reference counting implementations, you'd ignore 
any weird edgecase for semi-cross-TU use-after-free analysis (only thing I 
could think of is if in one TU memory could be allocated and then the pointer 
to it is set to null, and somehow another TU still has a pointer to said memory 
and frees the memory, and then a third TU uses the memory after it is freed) 
 So, if you assume that memory has to have been allocated at some point (again 
only applicable for use-after-free analysis), you could assume that the 
reference count for said memory *should* have been more than zero at *some 
point*

https://github.com/llvm/llvm-project/pull/90918
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] MallocChecker: Recognize std::atomics in smart pointer suppression. (PR #90918)

2024-05-06 Thread Balazs Benics via cfe-commits


@@ -3479,13 +3479,24 @@ PathDiagnosticPieceRef 
MallocBugVisitor::VisitNode(const ExplodedNode *N,
   // original reference count is positive, we should not report use-after-frees
   // on objects deleted in such destructors. This can probably be improved
   // through better shared pointer modeling.
-  if (ReleaseDestructorLC) {
+  if (ReleaseDestructorLC && (ReleaseDestructorLC == CurrentLC ||
+  ReleaseDestructorLC->isParentOf(CurrentLC))) {
 if (const auto *AE = dyn_cast(S)) {
+  // Check for manual use of atomic builtins.
   AtomicExpr::AtomicOp Op = AE->getOp();
   if (Op == AtomicExpr::AO__c11_atomic_fetch_add ||
   Op == AtomicExpr::AO__c11_atomic_fetch_sub) {
-if (ReleaseDestructorLC == CurrentLC ||
-ReleaseDestructorLC->isParentOf(CurrentLC)) {
+BR.markInvalid(getTag(), S);
+  }
+} else if (const auto *CE = dyn_cast(S)) {
+  // Check for `std::atomic` and such. This covers both regular method 
calls
+  // and operator calls.
+  if (const auto *MD =
+  dyn_cast_or_null(CE->getDirectCallee())) {
+const CXXRecordDecl *RD = MD->getParent();
+// A bit wobbly with ".contains()" because it may be like
+// "__atomic_base" or something.
+if (StringRef(RD->getNameAsString()).contains("atomic")) {

steakhal wrote:

Makes sense.

https://github.com/llvm/llvm-project/pull/90918
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] MallocChecker: Recognize std::atomics in smart pointer suppression. (PR #90918)

2024-05-08 Thread Gábor Horváth via cfe-commits


@@ -3479,13 +3479,24 @@ PathDiagnosticPieceRef 
MallocBugVisitor::VisitNode(const ExplodedNode *N,
   // original reference count is positive, we should not report use-after-frees
   // on objects deleted in such destructors. This can probably be improved
   // through better shared pointer modeling.
-  if (ReleaseDestructorLC) {
+  if (ReleaseDestructorLC && (ReleaseDestructorLC == CurrentLC ||
+  ReleaseDestructorLC->isParentOf(CurrentLC))) {
 if (const auto *AE = dyn_cast(S)) {
+  // Check for manual use of atomic builtins.
   AtomicExpr::AtomicOp Op = AE->getOp();
   if (Op == AtomicExpr::AO__c11_atomic_fetch_add ||
   Op == AtomicExpr::AO__c11_atomic_fetch_sub) {
-if (ReleaseDestructorLC == CurrentLC ||
-ReleaseDestructorLC->isParentOf(CurrentLC)) {
+BR.markInvalid(getTag(), S);
+  }
+} else if (const auto *CE = dyn_cast(S)) {
+  // Check for `std::atomic` and such. This covers both regular method 
calls
+  // and operator calls.
+  if (const auto *MD =
+  dyn_cast_or_null(CE->getDirectCallee())) {
+const CXXRecordDecl *RD = MD->getParent();
+// A bit wobbly with ".contains()" because it may be like
+// "__atomic_base" or something.
+if (StringRef(RD->getNameAsString()).contains("atomic")) {

Xazax-hun wrote:

> Though, I'm very open to changing my mind about this :)

I do not insist :) I am OK with the code as is. 

https://github.com/llvm/llvm-project/pull/90918
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] MallocChecker: Recognize std::atomics in smart pointer suppression. (PR #90918)

2024-05-08 Thread Artem Dergachev via cfe-commits

haoNoQ wrote:

@sharkautarch Yeah I think the situation where we observe the allocation site 
during analysis may be significantly different in presence of smart pointers. 
It may be a good idea to include such check in the heuristic, to reclaim some 
of the false negatives currently silenced by it. If we've seen the initial 
reference count and we've modeled it perfectly so far, these false positives 
can't happen. But this implies that we model our atomics perfectly. And given 
that we don't even know what a "reference count" is (maybe it's just some 
unrelated integer?), it's quite hard to get that right.

Also we don't really do "whole program analysis" and we probably can't do that 
without major changes to our technique; it simply doesn't scale to that scale. 
Our existing cross-TU mode is more about _occasionally_ breaking translation 
unit boundaries when we really want something specific from another TU (like 
the effects of a function call we just encountered), but we still focus on 
small portions of the program at a time. We never really derive whole-program 
conclusions from these small independent invocations of our analysis. That'd 
require very different technology. It's not necessarily difficult per se – we 
could develop such technology and occasionally make our main analysis 
interoperate with it. But we haven't done any of that yet and our cross-TU mode 
isn't that kind of cross-TU mode.

https://github.com/llvm/llvm-project/pull/90918
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] MallocChecker: Recognize std::atomics in smart pointer suppression. (PR #90918)

2024-05-08 Thread Artem Dergachev via cfe-commits

https://github.com/haoNoQ closed https://github.com/llvm/llvm-project/pull/90918
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits