[clang] [analyzer] Improve diagnostics from ArrayBoundCheckerV2 (PR #70056)

2023-11-07 Thread via cfe-commits
=?utf-8?q?Donát?= Nagy ,
=?utf-8?q?Donát?= Nagy ,
=?utf-8?q?Donát?= Nagy ,
=?utf-8?q?Donát?= Nagy 
Message-ID:
In-Reply-To: 


https://github.com/DonatNagyE closed 
https://github.com/llvm/llvm-project/pull/70056
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Improve diagnostics from ArrayBoundCheckerV2 (PR #70056)

2023-11-07 Thread via cfe-commits
=?utf-8?q?Donát?= Nagy ,
=?utf-8?q?Donát?= Nagy ,
=?utf-8?q?Donát?= Nagy ,
=?utf-8?q?Donát?= Nagy 
Message-ID:
In-Reply-To: 


DonatNagyE wrote:

I decided to keep the `reportOOB()` calls without additional changes. I feel 
that adding `std::move()` has limited benefits (performance impact is 
negligible and the next line is a `return` so the reader can already see that 
the variable won't be used again) and it would make those already long lines 
even more convoluted. I also considered refactoring the data flow, but I didn't 
find a more elegant arrangement than the current one.

https://github.com/llvm/llvm-project/pull/70056
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Improve diagnostics from ArrayBoundCheckerV2 (PR #70056)

2023-11-07 Thread via cfe-commits
=?utf-8?q?Donát?= Nagy ,
=?utf-8?q?Donát?= Nagy ,
=?utf-8?q?Donát?= Nagy ,
=?utf-8?q?Donát?= Nagy 
Message-ID:
In-Reply-To: 


https://github.com/DonatNagyE updated 
https://github.com/llvm/llvm-project/pull/70056

>From 77143e74edda6177248bebdf0424db915aa68a05 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Don=C3=A1t=20Nagy?= 
Date: Mon, 2 Oct 2023 13:32:31 +0200
Subject: [PATCH 1/5] [analyzer] Improve reports from ArrayBoundCheckerV2

Previously alpha.security.ArrayBoundV2 produced very spartan bug
reports; this commit ensures that the relevant (and known) details are
reported to the user.

The logic for detecting bugs is not changed, after this commit the
checker will report the same set of issues, but with better messages.

To test the details of the message generation this commit adds a new
test file 'out-of-bounds-diagnostics.c'. Three of the testcases are
added with FIXME notes because they reveal shortcomings of the existing
modeling and bounds checking code. I will try to fix them in separate
follow-up commits.
---
 .../Checkers/ArrayBoundCheckerV2.cpp  | 181 ++
 .../test/Analysis/out-of-bounds-diagnostics.c | 149 ++
 clang/test/Analysis/out-of-bounds-new.cpp |  26 +--
 clang/test/Analysis/out-of-bounds.c   |  28 +--
 .../test/Analysis/taint-diagnostic-visitor.c  |   4 +-
 clang/test/Analysis/taint-generic.c   |  20 +-
 clang/test/Analysis/taint-generic.cpp |  14 +-
 7 files changed, 339 insertions(+), 83 deletions(-)
 create mode 100644 clang/test/Analysis/out-of-bounds-diagnostics.c

diff --git a/clang/lib/StaticAnalyzer/Checkers/ArrayBoundCheckerV2.cpp 
b/clang/lib/StaticAnalyzer/Checkers/ArrayBoundCheckerV2.cpp
index e80a06e38587520..7d97f36e4cffe44 100644
--- a/clang/lib/StaticAnalyzer/Checkers/ArrayBoundCheckerV2.cpp
+++ b/clang/lib/StaticAnalyzer/Checkers/ArrayBoundCheckerV2.cpp
@@ -22,23 +22,25 @@
 #include "clang/StaticAnalyzer/Core/PathSensitive/DynamicExtent.h"
 #include "clang/StaticAnalyzer/Core/PathSensitive/ExprEngine.h"
 #include "llvm/ADT/SmallString.h"
+#include "llvm/Support/FormatVariadic.h"
 #include "llvm/Support/raw_ostream.h"
 #include 
 
 using namespace clang;
 using namespace ento;
 using namespace taint;
+using llvm::formatv;
 
 namespace {
+enum OOB_Kind { OOB_Precedes, OOB_Exceeds, OOB_Taint };
+
 class ArrayBoundCheckerV2 :
 public Checker {
   BugType BT{this, "Out-of-bound access"};
   BugType TaintBT{this, "Out-of-bound access", categories::TaintedData};
 
-  enum OOB_Kind { OOB_Precedes, OOB_Exceeds, OOB_Taint };
-
   void reportOOB(CheckerContext &C, ProgramStateRef ErrorState, OOB_Kind Kind,
- SVal TaintedSVal = UnknownVal()) const;
+ NonLoc Offset, std::string RegName, std::string Msg) const;
 
   static bool isFromCtypeMacro(const Stmt *S, ASTContext &AC);
 
@@ -53,7 +55,7 @@ class ArrayBoundCheckerV2 :
 /// Arr and the distance of Location from the beginning of Arr (expressed in a
 /// NonLoc that specifies the number of CharUnits). Returns nullopt when these
 /// cannot be determined.
-std::optional>
+static std::optional>
 computeOffset(ProgramStateRef State, SValBuilder &SVB, SVal Location) {
   QualType T = SVB.getArrayIndexType();
   auto EvalBinOp = [&SVB, State, T](BinaryOperatorKind Op, NonLoc L, NonLoc R) 
{
@@ -174,9 +176,119 @@ compareValueToThreshold(ProgramStateRef State, NonLoc 
Value, NonLoc Threshold,
   return {nullptr, nullptr};
 }
 
+static std::string getRegionName(const SubRegion *Region) {
+  std::string RegName = Region->getDescriptiveName();
+  if (!RegName.empty())
+return RegName;
+
+  // Field regions only have descriptive names when their parent has a
+  // descriptive name; so we provide a fallback representation for them:
+  if (const auto *FR = Region->getAs()) {
+StringRef Name = FR->getDecl()->getName();
+if (!Name.empty())
+  return formatv("the field '{0}'", Name);
+return "the unnamed field";
+  }
+
+  if (isa(Region))
+return "the memory returned by 'alloca'";
+
+  if (isa(Region) &&
+  isa(Region->getMemorySpace()))
+return "the heap area";
+
+  if (isa(Region))
+return "the string literal";
+
+  return "the region";
+}
+
+static std::optional getConcreteValue(NonLoc SV) {
+  if (auto ConcreteVal = SV.getAs()) {
+const llvm::APSInt &IntVal = ConcreteVal->getValue();
+return IntVal.tryExtValue();
+  }
+  return std::nullopt;
+}
+
+static const char *ShortMsgTemplates[] = {
+"Out of bound access to memory preceding {0}",
+"Out of bound access to memory after the end of {0}",
+"Potential out of bound access to {0} with tainted offset"};
+
+static std::string getShortMsg(OOB_Kind Kind, std::string RegName) {
+  return formatv(ShortMsgTemplates[Kind], RegName);
+}
+
+static std::string getPrecedesMsg(std::string RegName, NonLoc Offset) {
+  SmallString<128> Buf;
+  llvm::raw_svector_ostream Out(Buf);
+  Out << "Access of " << RegName << " at negative byte offset";
+  if (auto Co

[clang] [analyzer] Improve diagnostics from ArrayBoundCheckerV2 (PR #70056)

2023-11-07 Thread via cfe-commits
=?utf-8?q?Donát?= Nagy ,
=?utf-8?q?Donát?= Nagy ,
=?utf-8?q?Donát?= Nagy 
Message-ID:
In-Reply-To: 


https://github.com/DonatNagyE updated 
https://github.com/llvm/llvm-project/pull/70056

>From 77143e74edda6177248bebdf0424db915aa68a05 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Don=C3=A1t=20Nagy?= 
Date: Mon, 2 Oct 2023 13:32:31 +0200
Subject: [PATCH 1/4] [analyzer] Improve reports from ArrayBoundCheckerV2

Previously alpha.security.ArrayBoundV2 produced very spartan bug
reports; this commit ensures that the relevant (and known) details are
reported to the user.

The logic for detecting bugs is not changed, after this commit the
checker will report the same set of issues, but with better messages.

To test the details of the message generation this commit adds a new
test file 'out-of-bounds-diagnostics.c'. Three of the testcases are
added with FIXME notes because they reveal shortcomings of the existing
modeling and bounds checking code. I will try to fix them in separate
follow-up commits.
---
 .../Checkers/ArrayBoundCheckerV2.cpp  | 181 ++
 .../test/Analysis/out-of-bounds-diagnostics.c | 149 ++
 clang/test/Analysis/out-of-bounds-new.cpp |  26 +--
 clang/test/Analysis/out-of-bounds.c   |  28 +--
 .../test/Analysis/taint-diagnostic-visitor.c  |   4 +-
 clang/test/Analysis/taint-generic.c   |  20 +-
 clang/test/Analysis/taint-generic.cpp |  14 +-
 7 files changed, 339 insertions(+), 83 deletions(-)
 create mode 100644 clang/test/Analysis/out-of-bounds-diagnostics.c

diff --git a/clang/lib/StaticAnalyzer/Checkers/ArrayBoundCheckerV2.cpp 
b/clang/lib/StaticAnalyzer/Checkers/ArrayBoundCheckerV2.cpp
index e80a06e38587520..7d97f36e4cffe44 100644
--- a/clang/lib/StaticAnalyzer/Checkers/ArrayBoundCheckerV2.cpp
+++ b/clang/lib/StaticAnalyzer/Checkers/ArrayBoundCheckerV2.cpp
@@ -22,23 +22,25 @@
 #include "clang/StaticAnalyzer/Core/PathSensitive/DynamicExtent.h"
 #include "clang/StaticAnalyzer/Core/PathSensitive/ExprEngine.h"
 #include "llvm/ADT/SmallString.h"
+#include "llvm/Support/FormatVariadic.h"
 #include "llvm/Support/raw_ostream.h"
 #include 
 
 using namespace clang;
 using namespace ento;
 using namespace taint;
+using llvm::formatv;
 
 namespace {
+enum OOB_Kind { OOB_Precedes, OOB_Exceeds, OOB_Taint };
+
 class ArrayBoundCheckerV2 :
 public Checker {
   BugType BT{this, "Out-of-bound access"};
   BugType TaintBT{this, "Out-of-bound access", categories::TaintedData};
 
-  enum OOB_Kind { OOB_Precedes, OOB_Exceeds, OOB_Taint };
-
   void reportOOB(CheckerContext &C, ProgramStateRef ErrorState, OOB_Kind Kind,
- SVal TaintedSVal = UnknownVal()) const;
+ NonLoc Offset, std::string RegName, std::string Msg) const;
 
   static bool isFromCtypeMacro(const Stmt *S, ASTContext &AC);
 
@@ -53,7 +55,7 @@ class ArrayBoundCheckerV2 :
 /// Arr and the distance of Location from the beginning of Arr (expressed in a
 /// NonLoc that specifies the number of CharUnits). Returns nullopt when these
 /// cannot be determined.
-std::optional>
+static std::optional>
 computeOffset(ProgramStateRef State, SValBuilder &SVB, SVal Location) {
   QualType T = SVB.getArrayIndexType();
   auto EvalBinOp = [&SVB, State, T](BinaryOperatorKind Op, NonLoc L, NonLoc R) 
{
@@ -174,9 +176,119 @@ compareValueToThreshold(ProgramStateRef State, NonLoc 
Value, NonLoc Threshold,
   return {nullptr, nullptr};
 }
 
+static std::string getRegionName(const SubRegion *Region) {
+  std::string RegName = Region->getDescriptiveName();
+  if (!RegName.empty())
+return RegName;
+
+  // Field regions only have descriptive names when their parent has a
+  // descriptive name; so we provide a fallback representation for them:
+  if (const auto *FR = Region->getAs()) {
+StringRef Name = FR->getDecl()->getName();
+if (!Name.empty())
+  return formatv("the field '{0}'", Name);
+return "the unnamed field";
+  }
+
+  if (isa(Region))
+return "the memory returned by 'alloca'";
+
+  if (isa(Region) &&
+  isa(Region->getMemorySpace()))
+return "the heap area";
+
+  if (isa(Region))
+return "the string literal";
+
+  return "the region";
+}
+
+static std::optional getConcreteValue(NonLoc SV) {
+  if (auto ConcreteVal = SV.getAs()) {
+const llvm::APSInt &IntVal = ConcreteVal->getValue();
+return IntVal.tryExtValue();
+  }
+  return std::nullopt;
+}
+
+static const char *ShortMsgTemplates[] = {
+"Out of bound access to memory preceding {0}",
+"Out of bound access to memory after the end of {0}",
+"Potential out of bound access to {0} with tainted offset"};
+
+static std::string getShortMsg(OOB_Kind Kind, std::string RegName) {
+  return formatv(ShortMsgTemplates[Kind], RegName);
+}
+
+static std::string getPrecedesMsg(std::string RegName, NonLoc Offset) {
+  SmallString<128> Buf;
+  llvm::raw_svector_ostream Out(Buf);
+  Out << "Access of " << RegName << " at negative byte offset";
+  if (auto ConcreteIdx = Offset.getAs(

[clang] [analyzer] Improve diagnostics from ArrayBoundCheckerV2 (PR #70056)

2023-11-02 Thread Balazs Benics via cfe-commits
=?utf-8?q?Donát?= Nagy ,
=?utf-8?q?Donát?= Nagy 
Message-ID:
In-Reply-To: 



@@ -217,80 +326,71 @@ void ArrayBoundCheckerV2::checkLocation(SVal location, 
bool isLoad,
 // MallocChecker that call SValBuilder::getConjuredHeapSymbolVal()) and
 // non-symbolic regions (e.g. a field subregion of a symbolic region) in
 // unknown space.
-auto [state_precedesLowerBound, state_withinLowerBound] =
-compareValueToThreshold(state, ByteOffset,
-svalBuilder.makeZeroArrayIndex(), svalBuilder);
+auto [PrecedesLowerBound, WithinLowerBound] = compareValueToThreshold(
+State, ByteOffset, SVB.makeZeroArrayIndex(), SVB);
 
-if (state_precedesLowerBound && !state_withinLowerBound) {
+if (PrecedesLowerBound && !WithinLowerBound) {
   // We know that the index definitely precedes the lower bound.
-  reportOOB(checkerContext, state_precedesLowerBound, OOB_Precedes);
+  std::string RegName = getRegionName(Reg);
+  std::string Msg = getPrecedesMsg(RegName, ByteOffset);
+  reportOOB(C, PrecedesLowerBound, OOB_Precedes, ByteOffset, RegName, Msg);

steakhal wrote:

To me, `move` describes the intent: I produced something, and this is the only 
place that is supposed to consume it. Keep in mind that this code is cold, thus 
we can do whatever we want, including making unnecessary copies.
BTW I didn't see that the result of the first is actually used for the second 
call. This way this code makes all sense.

https://github.com/llvm/llvm-project/pull/70056
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Improve diagnostics from ArrayBoundCheckerV2 (PR #70056)

2023-11-02 Thread Balazs Benics via cfe-commits
=?utf-8?q?Donát?= Nagy ,
=?utf-8?q?Donát?= Nagy 
Message-ID:
In-Reply-To: 



@@ -0,0 +1,149 @@
+// RUN: %clang_analyze_cc1 -Wno-array-bounds -analyzer-output=text\
+// RUN: 
-analyzer-checker=core,alpha.security.ArrayBoundV2,unix.Malloc,alpha.security.taint
 -verify %s
+
+int array[10];
+
+void arrayUnderflow(void) {
+  array[-3] = 5;
+  // expected-warning@-1 {{Out of bound access to memory preceding 'array'}}
+  // expected-note@-2 {{Access of 'array' at negative byte offset -12}}
+}
+
+int scanf(const char *restrict fmt, ...);
+
+void taintedIndex(void) {
+  int index;
+  scanf("%d", &index);
+  // expected-note@-1 {{Taint originated here}}
+  // expected-note@-2 {{Taint propagated to the 2nd argument}}
+  array[index] = 5;
+  // expected-warning@-1 {{Potential out of bound access to 'array' with 
tainted offset}}
+  // expected-note@-2 {{Access of 'array' with a tainted offset that may be 
too large}}
+}
+
+void arrayOverflow(void) {
+  array[12] = 5;
+  // expected-warning@-1 {{Out of bound access to memory after the end of 
'array'}}
+  // expected-note@-2 {{Access of 'array' at index 12, while it holds only 10 
'int' elements}}
+}
+
+int scalar;
+int scalarOverflow(void) {
+  return (&scalar)[1];
+  // expected-warning@-1 {{Out of bound access to memory after the end of 
'scalar'}}
+  // expected-note@-2 {{Access of 'scalar' at index 1, while it holds only a 
single 'int' element}}
+}
+
+int oneElementArray[1];
+int oneElementArrayOverflow(void) {
+  return oneElementArray[1];
+  // expected-warning@-1 {{Out of bound access to memory after the end of 
'oneElementArray'}}
+  // expected-note@-2 {{Access of 'oneElementArray' at index 1, while it holds 
only a single 'int' element}}
+}
+
+short convertedArray(void) {
+  return ((short*)array)[47];
+  // expected-warning@-1 {{Out of bound access to memory after the end of 
'array'}}
+  // expected-note@-2 {{Access of 'array' at index 47, while it holds only 20 
'short' elements}}
+}
+
+struct vec {
+  int len;
+  double elems[64];
+} v;
+
+double arrayInStruct(void) {
+  return v.elems[64];
+  // expected-warning@-1 {{Out of bound access to memory after the end of 
'v.elems'}}
+  // expected-note@-2 {{Access of 'v.elems' at index 64, while it holds only 
64 'double' elements}}
+}
+
+double arrayInStructPtr(struct vec *pv) {
+  return pv->elems[64];
+  // expected-warning@-1 {{Out of bound access to memory after the end of the 
field 'elems'}}
+  // expected-note@-2 {{Access of the field 'elems' at index 64, while it 
holds only 64 'double' elements}}
+}
+
+struct two_bytes {
+  char lo, hi;
+};
+
+struct two_bytes convertedArray2(void) {
+  // We report this with byte offsets because the offset is not divisible by 
the element size.
+  struct two_bytes a = {0, 0};
+  char *p = (char*)&a;
+  return *((struct two_bytes*)(p + 7));
+  // expected-warning@-1 {{Out of bound access to memory after the end of 'a'}}
+  // expected-note@-2 {{Access of 'a' at byte offset 7, while it holds only 2 
bytes}}
+}
+
+int intFromString(void) {
+  // We report this with byte offsets because the extent is not divisible by 
the element size.
+  return ((const int*)"this is a string of 33 characters")[20];
+  // expected-warning@-1 {{Out of bound access to memory after the end of the 
string literal}}
+  // expected-note@-2 {{Access of the string literal at byte offset 80, while 
it holds only 34 bytes}}
+}
+
+int intFromStringDivisible(void) {
+  // However, this is reported with indices/elements, because the extent 
happens to be a multiple of 4.
+  return ((const int*)"abc")[20];
+  // expected-warning@-1 {{Out of bound access to memory after the end of the 
string literal}}
+  // expected-note@-2 {{Access of the string literal at index 20, while it 
holds only a single 'int' element}}
+}

steakhal wrote:

I have no strong opinion.
Thanks for the explanation.

https://github.com/llvm/llvm-project/pull/70056
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Improve diagnostics from ArrayBoundCheckerV2 (PR #70056)

2023-11-02 Thread via cfe-commits
=?utf-8?q?Donát?= Nagy ,
=?utf-8?q?Donát?= Nagy 
Message-ID:
In-Reply-To: 


DonatNagyE wrote:

Thanks for the suggestions; I'll probably upload the next revision tomorrow.

> One remark about the review workflow, I'd prefer if the conversation starter 
> would resolve the conversations. Let me explain why:
> 
> Given the amount of reviewers for CSA, I'd suggest explicitly supporting 
> reviewer experience. Speaking of that, personally I find the following 
> workflow to suite myself the best:
> 
> * All comments needs a reaction, either by explict commenting on it or by 
> putting there a thums-up emoji or something. This helps the author follow 
> which comments were addressed downstream and is pending for submission. An 
> explicit comment is expected for challenging a comment. I prefer emojies, 
> becase they don't send an email to everyone subscribe - unlike for individual 
> comments; and usually I also batch individual commits into a "self-review" as 
> patch author when I reply for the same reason.
> 
> * All comments needs to be reacted before requestion the next round of review 
> from the person who added those comments.
> 
> * The comment should be only marked resolved by the person who raised the 
> concern. This ensures that the comment was adequately solved, thus the issue 
> is no longer relevant.
> 
> * To merge a PR, the PR should have no open conversations. Post-approval 
> comments are also fine and leaves room for further discussions, but the goal 
> is no longer to directly challenge the legitimacy of the patch.

Sounds reasonable, I'll try to follow this process, especially when interacting 
with you. In the previous round I resolved the conversations because I applied 
them (mostly verbatim) in the new revision that I uploaded; but I can instead 
apply a thumbsup emoji if you'd prefer that. 

https://github.com/llvm/llvm-project/pull/70056
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Improve diagnostics from ArrayBoundCheckerV2 (PR #70056)

2023-11-02 Thread via cfe-commits
=?utf-8?q?Donát?= Nagy ,
=?utf-8?q?Donát?= Nagy 
Message-ID:
In-Reply-To: 



@@ -0,0 +1,149 @@
+// RUN: %clang_analyze_cc1 -Wno-array-bounds -analyzer-output=text\
+// RUN: 
-analyzer-checker=core,alpha.security.ArrayBoundV2,unix.Malloc,alpha.security.taint
 -verify %s
+
+int array[10];
+
+void arrayUnderflow(void) {
+  array[-3] = 5;
+  // expected-warning@-1 {{Out of bound access to memory preceding 'array'}}
+  // expected-note@-2 {{Access of 'array' at negative byte offset -12}}
+}
+
+int scanf(const char *restrict fmt, ...);
+
+void taintedIndex(void) {
+  int index;
+  scanf("%d", &index);
+  // expected-note@-1 {{Taint originated here}}
+  // expected-note@-2 {{Taint propagated to the 2nd argument}}
+  array[index] = 5;
+  // expected-warning@-1 {{Potential out of bound access to 'array' with 
tainted offset}}
+  // expected-note@-2 {{Access of 'array' with a tainted offset that may be 
too large}}
+}
+
+void arrayOverflow(void) {
+  array[12] = 5;
+  // expected-warning@-1 {{Out of bound access to memory after the end of 
'array'}}
+  // expected-note@-2 {{Access of 'array' at index 12, while it holds only 10 
'int' elements}}
+}
+
+int scalar;
+int scalarOverflow(void) {
+  return (&scalar)[1];
+  // expected-warning@-1 {{Out of bound access to memory after the end of 
'scalar'}}
+  // expected-note@-2 {{Access of 'scalar' at index 1, while it holds only a 
single 'int' element}}
+}
+
+int oneElementArray[1];
+int oneElementArrayOverflow(void) {
+  return oneElementArray[1];
+  // expected-warning@-1 {{Out of bound access to memory after the end of 
'oneElementArray'}}
+  // expected-note@-2 {{Access of 'oneElementArray' at index 1, while it holds 
only a single 'int' element}}
+}
+
+short convertedArray(void) {
+  return ((short*)array)[47];
+  // expected-warning@-1 {{Out of bound access to memory after the end of 
'array'}}
+  // expected-note@-2 {{Access of 'array' at index 47, while it holds only 20 
'short' elements}}
+}
+
+struct vec {
+  int len;
+  double elems[64];
+} v;
+
+double arrayInStruct(void) {
+  return v.elems[64];
+  // expected-warning@-1 {{Out of bound access to memory after the end of 
'v.elems'}}
+  // expected-note@-2 {{Access of 'v.elems' at index 64, while it holds only 
64 'double' elements}}
+}
+
+double arrayInStructPtr(struct vec *pv) {
+  return pv->elems[64];
+  // expected-warning@-1 {{Out of bound access to memory after the end of the 
field 'elems'}}
+  // expected-note@-2 {{Access of the field 'elems' at index 64, while it 
holds only 64 'double' elements}}
+}
+
+struct two_bytes {
+  char lo, hi;
+};
+
+struct two_bytes convertedArray2(void) {
+  // We report this with byte offsets because the offset is not divisible by 
the element size.
+  struct two_bytes a = {0, 0};
+  char *p = (char*)&a;
+  return *((struct two_bytes*)(p + 7));
+  // expected-warning@-1 {{Out of bound access to memory after the end of 'a'}}
+  // expected-note@-2 {{Access of 'a' at byte offset 7, while it holds only 2 
bytes}}
+}
+
+int intFromString(void) {
+  // We report this with byte offsets because the extent is not divisible by 
the element size.
+  return ((const int*)"this is a string of 33 characters")[20];
+  // expected-warning@-1 {{Out of bound access to memory after the end of the 
string literal}}
+  // expected-note@-2 {{Access of the string literal at byte offset 80, while 
it holds only 34 bytes}}
+}
+
+int intFromStringDivisible(void) {
+  // However, this is reported with indices/elements, because the extent 
happens to be a multiple of 4.
+  return ((const int*)"abc")[20];
+  // expected-warning@-1 {{Out of bound access to memory after the end of the 
string literal}}
+  // expected-note@-2 {{Access of the string literal at index 20, while it 
holds only a single 'int' element}}
+}

DonatNagyE wrote:

The note mentions `int` because it's using the array subscript operator on an 
`int *` pointer (to get an `int` value). This string literal memory area 
consists of the 4 bytes `abc\0`, which is equivalent to 'a single' 4-byte 
integer.

I can add a special case to avoid messages like this e.g. by always using byte 
offsets for reporting issues that are coming from string literals.

https://github.com/llvm/llvm-project/pull/70056
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Improve diagnostics from ArrayBoundCheckerV2 (PR #70056)

2023-11-02 Thread via cfe-commits
=?utf-8?q?Donát?= Nagy ,
=?utf-8?q?Donát?= Nagy 
Message-ID:
In-Reply-To: 



@@ -217,80 +326,71 @@ void ArrayBoundCheckerV2::checkLocation(SVal location, 
bool isLoad,
 // MallocChecker that call SValBuilder::getConjuredHeapSymbolVal()) and
 // non-symbolic regions (e.g. a field subregion of a symbolic region) in
 // unknown space.
-auto [state_precedesLowerBound, state_withinLowerBound] =
-compareValueToThreshold(state, ByteOffset,
-svalBuilder.makeZeroArrayIndex(), svalBuilder);
+auto [PrecedesLowerBound, WithinLowerBound] = compareValueToThreshold(
+State, ByteOffset, SVB.makeZeroArrayIndex(), SVB);
 
-if (state_precedesLowerBound && !state_withinLowerBound) {
+if (PrecedesLowerBound && !WithinLowerBound) {
   // We know that the index definitely precedes the lower bound.
-  reportOOB(checkerContext, state_precedesLowerBound, OOB_Precedes);
+  std::string RegName = getRegionName(Reg);
+  std::string Msg = getPrecedesMsg(RegName, ByteOffset);
+  reportOOB(C, PrecedesLowerBound, OOB_Precedes, ByteOffset, RegName, Msg);

DonatNagyE wrote:

Good point, I'm not familiar enough with move use to spot situations where it's 
needed. (I'll also consider refactoring the code, because the `reportOOB()` 
call becomes very long with the addition of the moves. Perhaps I'll just 
pessimize the code and allow it to call `getRegionName()` twice, because 
passing around the region name string seems to be a premature optimization.)

https://github.com/llvm/llvm-project/pull/70056
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Improve diagnostics from ArrayBoundCheckerV2 (PR #70056)

2023-11-02 Thread via cfe-commits
=?utf-8?q?Donát?= Nagy ,
=?utf-8?q?Donát?= Nagy 
Message-ID:
In-Reply-To: 



@@ -22,23 +22,25 @@
 #include "clang/StaticAnalyzer/Core/PathSensitive/DynamicExtent.h"
 #include "clang/StaticAnalyzer/Core/PathSensitive/ExprEngine.h"
 #include "llvm/ADT/SmallString.h"
+#include "llvm/Support/FormatVariadic.h"
 #include "llvm/Support/raw_ostream.h"
 #include 
 
 using namespace clang;
 using namespace ento;
 using namespace taint;
+using llvm::formatv;
 
 namespace {
+enum OOB_Kind { OOB_Precedes, OOB_Exceeds, OOB_Taint };

DonatNagyE wrote:

The enumerator naming style was "inherited" from old versions of the checker. 
Perhaps I'll rename them later when I need to touch them for some reason.

https://github.com/llvm/llvm-project/pull/70056
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Improve diagnostics from ArrayBoundCheckerV2 (PR #70056)

2023-11-02 Thread via cfe-commits
=?utf-8?q?Donát?= Nagy ,
=?utf-8?q?Donát?= Nagy 
Message-ID:
In-Reply-To: 



@@ -174,9 +176,116 @@ compareValueToThreshold(ProgramStateRef State, NonLoc 
Value, NonLoc Threshold,
   return {nullptr, nullptr};
 }
 
-void ArrayBoundCheckerV2::checkLocation(SVal location, bool isLoad,
-const Stmt* LoadS,
-CheckerContext &checkerContext) const {
+static std::string getRegionName(const SubRegion *Region) {
+  if (std::string RegName = Region->getDescriptiveName(); !RegName.empty())
+return RegName;
+
+  // Field regions only have descriptive names when their parent has a
+  // descriptive name; so we provide a fallback representation for them:
+  if (const auto *FR = Region->getAs()) {
+if (StringRef Name = FR->getDecl()->getName(); !Name.empty())
+  return formatv("the field '{0}'", Name);
+return "the unnamed field";
+  }
+
+  if (isa(Region))
+return "the memory returned by 'alloca'";
+
+  if (isa(Region) &&
+  isa(Region->getMemorySpace()))
+return "the heap area";
+
+  if (isa(Region))
+return "the string literal";
+
+  return "the region";
+}
+
+static std::optional getConcreteValue(NonLoc SV) {
+  if (auto ConcreteVal = SV.getAs()) {
+return ConcreteVal->getValue().tryExtValue();
+  }
+  return std::nullopt;
+}
+
+static std::string getShortMsg(OOB_Kind Kind, std::string RegName) {
+  static const char *ShortMsgTemplates[] = {
+  "Out of bound access to memory preceding {0}",
+  "Out of bound access to memory after the end of {0}",
+  "Potential out of bound access to {0} with tainted offset"};
+
+  return formatv(ShortMsgTemplates[Kind], RegName);
+}
+
+static std::string getPrecedesMsg(std::string RegName, NonLoc Offset) {
+  SmallString<128> Buf;
+  llvm::raw_svector_ostream Out(Buf);

DonatNagyE wrote:

This is a minor performance optimization, which was suggested by the LLVM 
Programmer's Manual:
> The major disadvantage of std::string is that almost every operation that 
> makes them larger can allocate memory, which is slow. As such, it is better 
> to use SmallVector or Twine as a scratch buffer, but then use std::string to 
> persist the result.
(from https://www.llvm.org/docs/ProgrammersManual.html#std-string)

I'd slightly prefer keeping it, but I can replace it with `std::string` if you 
want.


https://github.com/llvm/llvm-project/pull/70056
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Improve diagnostics from ArrayBoundCheckerV2 (PR #70056)

2023-10-31 Thread Balazs Benics via cfe-commits
=?utf-8?q?Donát?= Nagy ,
=?utf-8?q?Donát?= Nagy 
Message-ID:
In-Reply-To: 


https://github.com/steakhal edited 
https://github.com/llvm/llvm-project/pull/70056
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Improve diagnostics from ArrayBoundCheckerV2 (PR #70056)

2023-10-31 Thread Balazs Benics via cfe-commits
=?utf-8?q?Donát?= Nagy ,
=?utf-8?q?Donát?= Nagy 
Message-ID:
In-Reply-To: 


https://github.com/steakhal edited 
https://github.com/llvm/llvm-project/pull/70056
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Improve diagnostics from ArrayBoundCheckerV2 (PR #70056)

2023-10-31 Thread Balazs Benics via cfe-commits
=?utf-8?q?Donát?= Nagy ,
=?utf-8?q?Donát?= Nagy 
Message-ID:
In-Reply-To: 



@@ -22,23 +22,25 @@
 #include "clang/StaticAnalyzer/Core/PathSensitive/DynamicExtent.h"
 #include "clang/StaticAnalyzer/Core/PathSensitive/ExprEngine.h"
 #include "llvm/ADT/SmallString.h"
+#include "llvm/Support/FormatVariadic.h"
 #include "llvm/Support/raw_ostream.h"
 #include 
 
 using namespace clang;
 using namespace ento;
 using namespace taint;
+using llvm::formatv;
 
 namespace {
+enum OOB_Kind { OOB_Precedes, OOB_Exceeds, OOB_Taint };

steakhal wrote:

FYI you can use any identifiers here because these are only exposed to this 
single TU.
The ugly prefixes are only applied to enums exposed in headers etc. where there 
is an unbounded chance of having collisions.
No action is expected.

https://github.com/llvm/llvm-project/pull/70056
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Improve diagnostics from ArrayBoundCheckerV2 (PR #70056)

2023-10-31 Thread Balazs Benics via cfe-commits
=?utf-8?q?Donát?= Nagy ,
=?utf-8?q?Donát?= Nagy 
Message-ID:
In-Reply-To: 



@@ -0,0 +1,149 @@
+// RUN: %clang_analyze_cc1 -Wno-array-bounds -analyzer-output=text\
+// RUN: 
-analyzer-checker=core,alpha.security.ArrayBoundV2,unix.Malloc,alpha.security.taint
 -verify %s
+
+int array[10];
+
+void arrayUnderflow(void) {
+  array[-3] = 5;
+  // expected-warning@-1 {{Out of bound access to memory preceding 'array'}}
+  // expected-note@-2 {{Access of 'array' at negative byte offset -12}}
+}
+
+int scanf(const char *restrict fmt, ...);
+
+void taintedIndex(void) {
+  int index;
+  scanf("%d", &index);
+  // expected-note@-1 {{Taint originated here}}
+  // expected-note@-2 {{Taint propagated to the 2nd argument}}
+  array[index] = 5;
+  // expected-warning@-1 {{Potential out of bound access to 'array' with 
tainted offset}}
+  // expected-note@-2 {{Access of 'array' with a tainted offset that may be 
too large}}
+}
+
+void arrayOverflow(void) {
+  array[12] = 5;
+  // expected-warning@-1 {{Out of bound access to memory after the end of 
'array'}}
+  // expected-note@-2 {{Access of 'array' at index 12, while it holds only 10 
'int' elements}}
+}
+
+int scalar;
+int scalarOverflow(void) {
+  return (&scalar)[1];
+  // expected-warning@-1 {{Out of bound access to memory after the end of 
'scalar'}}
+  // expected-note@-2 {{Access of 'scalar' at index 1, while it holds only a 
single 'int' element}}
+}
+
+int oneElementArray[1];
+int oneElementArrayOverflow(void) {
+  return oneElementArray[1];
+  // expected-warning@-1 {{Out of bound access to memory after the end of 
'oneElementArray'}}
+  // expected-note@-2 {{Access of 'oneElementArray' at index 1, while it holds 
only a single 'int' element}}
+}
+
+short convertedArray(void) {
+  return ((short*)array)[47];
+  // expected-warning@-1 {{Out of bound access to memory after the end of 
'array'}}
+  // expected-note@-2 {{Access of 'array' at index 47, while it holds only 20 
'short' elements}}
+}
+
+struct vec {
+  int len;
+  double elems[64];
+} v;
+
+double arrayInStruct(void) {
+  return v.elems[64];
+  // expected-warning@-1 {{Out of bound access to memory after the end of 
'v.elems'}}
+  // expected-note@-2 {{Access of 'v.elems' at index 64, while it holds only 
64 'double' elements}}
+}
+
+double arrayInStructPtr(struct vec *pv) {
+  return pv->elems[64];
+  // expected-warning@-1 {{Out of bound access to memory after the end of the 
field 'elems'}}
+  // expected-note@-2 {{Access of the field 'elems' at index 64, while it 
holds only 64 'double' elements}}
+}
+
+struct two_bytes {
+  char lo, hi;
+};
+
+struct two_bytes convertedArray2(void) {
+  // We report this with byte offsets because the offset is not divisible by 
the element size.
+  struct two_bytes a = {0, 0};
+  char *p = (char*)&a;
+  return *((struct two_bytes*)(p + 7));
+  // expected-warning@-1 {{Out of bound access to memory after the end of 'a'}}
+  // expected-note@-2 {{Access of 'a' at byte offset 7, while it holds only 2 
bytes}}
+}
+
+int intFromString(void) {
+  // We report this with byte offsets because the extent is not divisible by 
the element size.
+  return ((const int*)"this is a string of 33 characters")[20];
+  // expected-warning@-1 {{Out of bound access to memory after the end of the 
string literal}}
+  // expected-note@-2 {{Access of the string literal at byte offset 80, while 
it holds only 34 bytes}}
+}
+
+int intFromStringDivisible(void) {
+  // However, this is reported with indices/elements, because the extent 
happens to be a multiple of 4.
+  return ((const int*)"abc")[20];
+  // expected-warning@-1 {{Out of bound access to memory after the end of the 
string literal}}
+  // expected-note@-2 {{Access of the string literal at index 20, while it 
holds only a single 'int' element}}
+}

steakhal wrote:

I couldn't wrap my head around the end of the note: `while it holds only a 
single 'int' element`
Could you elaborate on why is `int` mentioned here, and if so, why only a 
`single`?

https://github.com/llvm/llvm-project/pull/70056
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Improve diagnostics from ArrayBoundCheckerV2 (PR #70056)

2023-10-31 Thread Balazs Benics via cfe-commits
=?utf-8?q?Donát?= Nagy ,
=?utf-8?q?Donát?= Nagy 
Message-ID:
In-Reply-To: 



@@ -217,80 +326,71 @@ void ArrayBoundCheckerV2::checkLocation(SVal location, 
bool isLoad,
 // MallocChecker that call SValBuilder::getConjuredHeapSymbolVal()) and
 // non-symbolic regions (e.g. a field subregion of a symbolic region) in
 // unknown space.
-auto [state_precedesLowerBound, state_withinLowerBound] =
-compareValueToThreshold(state, ByteOffset,
-svalBuilder.makeZeroArrayIndex(), svalBuilder);
+auto [PrecedesLowerBound, WithinLowerBound] = compareValueToThreshold(
+State, ByteOffset, SVB.makeZeroArrayIndex(), SVB);
 
-if (state_precedesLowerBound && !state_withinLowerBound) {
+if (PrecedesLowerBound && !WithinLowerBound) {
   // We know that the index definitely precedes the lower bound.
-  reportOOB(checkerContext, state_precedesLowerBound, OOB_Precedes);
+  std::string RegName = getRegionName(Reg);
+  std::string Msg = getPrecedesMsg(RegName, ByteOffset);
+  reportOOB(C, PrecedesLowerBound, OOB_Precedes, ByteOffset, RegName, Msg);
   return;
 }
 
-if (state_withinLowerBound)
-  state = state_withinLowerBound;
+if (WithinLowerBound)
+  State = WithinLowerBound;
   }
 
   // CHECK UPPER BOUND
-  DefinedOrUnknownSVal Size = getDynamicExtent(state, Reg, svalBuilder);
+  DefinedOrUnknownSVal Size = getDynamicExtent(State, Reg, SVB);
   if (auto KnownSize = Size.getAs()) {
-auto [state_withinUpperBound, state_exceedsUpperBound] =
-compareValueToThreshold(state, ByteOffset, *KnownSize, svalBuilder);
+auto [WithinUpperBound, ExceedsUpperBound] =
+compareValueToThreshold(State, ByteOffset, *KnownSize, SVB);
 
-if (state_exceedsUpperBound) {
-  if (!state_withinUpperBound) {
+if (ExceedsUpperBound) {
+  if (!WithinUpperBound) {
 // We know that the index definitely exceeds the upper bound.
-reportOOB(checkerContext, state_exceedsUpperBound, OOB_Exceeds);
+std::string RegName = getRegionName(Reg);
+std::string Msg = getExceedsMsg(C.getASTContext(), RegName, ByteOffset,
+*KnownSize, Location);
+reportOOB(C, ExceedsUpperBound, OOB_Exceeds, ByteOffset, RegName, Msg);

steakhal wrote:

```suggestion
reportOOB(C, ExceedsUpperBound, OOB_Exceeds, ByteOffset, 
std::move(RegName), std::move(Msg));
```

https://github.com/llvm/llvm-project/pull/70056
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Improve diagnostics from ArrayBoundCheckerV2 (PR #70056)

2023-10-31 Thread Balazs Benics via cfe-commits
=?utf-8?q?Donát?= Nagy ,
=?utf-8?q?Donát?= Nagy 
Message-ID:
In-Reply-To: 



@@ -217,80 +326,71 @@ void ArrayBoundCheckerV2::checkLocation(SVal location, 
bool isLoad,
 // MallocChecker that call SValBuilder::getConjuredHeapSymbolVal()) and
 // non-symbolic regions (e.g. a field subregion of a symbolic region) in
 // unknown space.
-auto [state_precedesLowerBound, state_withinLowerBound] =
-compareValueToThreshold(state, ByteOffset,
-svalBuilder.makeZeroArrayIndex(), svalBuilder);
+auto [PrecedesLowerBound, WithinLowerBound] = compareValueToThreshold(
+State, ByteOffset, SVB.makeZeroArrayIndex(), SVB);
 
-if (state_precedesLowerBound && !state_withinLowerBound) {
+if (PrecedesLowerBound && !WithinLowerBound) {
   // We know that the index definitely precedes the lower bound.
-  reportOOB(checkerContext, state_precedesLowerBound, OOB_Precedes);
+  std::string RegName = getRegionName(Reg);
+  std::string Msg = getPrecedesMsg(RegName, ByteOffset);
+  reportOOB(C, PrecedesLowerBound, OOB_Precedes, ByteOffset, RegName, Msg);
   return;
 }
 
-if (state_withinLowerBound)
-  state = state_withinLowerBound;
+if (WithinLowerBound)
+  State = WithinLowerBound;
   }
 
   // CHECK UPPER BOUND
-  DefinedOrUnknownSVal Size = getDynamicExtent(state, Reg, svalBuilder);
+  DefinedOrUnknownSVal Size = getDynamicExtent(State, Reg, SVB);
   if (auto KnownSize = Size.getAs()) {
-auto [state_withinUpperBound, state_exceedsUpperBound] =
-compareValueToThreshold(state, ByteOffset, *KnownSize, svalBuilder);
+auto [WithinUpperBound, ExceedsUpperBound] =
+compareValueToThreshold(State, ByteOffset, *KnownSize, SVB);
 
-if (state_exceedsUpperBound) {
-  if (!state_withinUpperBound) {
+if (ExceedsUpperBound) {
+  if (!WithinUpperBound) {
 // We know that the index definitely exceeds the upper bound.
-reportOOB(checkerContext, state_exceedsUpperBound, OOB_Exceeds);
+std::string RegName = getRegionName(Reg);
+std::string Msg = getExceedsMsg(C.getASTContext(), RegName, ByteOffset,
+*KnownSize, Location);
+reportOOB(C, ExceedsUpperBound, OOB_Exceeds, ByteOffset, RegName, Msg);
 return;
   }
-  if (isTainted(state, ByteOffset)) {
+  if (isTainted(State, ByteOffset)) {
 // Both cases are possible, but the index is tainted, so report.
-reportOOB(checkerContext, state_exceedsUpperBound, OOB_Taint,
-  ByteOffset);
+std::string RegName = getRegionName(Reg);
+std::string Msg = getTaintMsg(RegName);
+reportOOB(C, ExceedsUpperBound, OOB_Taint, ByteOffset, RegName, Msg);

steakhal wrote:

```suggestion
reportOOB(C, ExceedsUpperBound, OOB_Taint, ByteOffset, 
std::move(RegName), std::move(Msg));
```

https://github.com/llvm/llvm-project/pull/70056
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Improve diagnostics from ArrayBoundCheckerV2 (PR #70056)

2023-10-31 Thread Balazs Benics via cfe-commits
=?utf-8?q?Donát?= Nagy ,
=?utf-8?q?Donát?= Nagy 
Message-ID:
In-Reply-To: 



@@ -174,9 +176,116 @@ compareValueToThreshold(ProgramStateRef State, NonLoc 
Value, NonLoc Threshold,
   return {nullptr, nullptr};
 }
 
-void ArrayBoundCheckerV2::checkLocation(SVal location, bool isLoad,
-const Stmt* LoadS,
-CheckerContext &checkerContext) const {
+static std::string getRegionName(const SubRegion *Region) {
+  if (std::string RegName = Region->getDescriptiveName(); !RegName.empty())
+return RegName;
+
+  // Field regions only have descriptive names when their parent has a
+  // descriptive name; so we provide a fallback representation for them:
+  if (const auto *FR = Region->getAs()) {
+if (StringRef Name = FR->getDecl()->getName(); !Name.empty())
+  return formatv("the field '{0}'", Name);
+return "the unnamed field";
+  }
+
+  if (isa(Region))
+return "the memory returned by 'alloca'";
+
+  if (isa(Region) &&
+  isa(Region->getMemorySpace()))
+return "the heap area";
+
+  if (isa(Region))
+return "the string literal";
+
+  return "the region";
+}
+
+static std::optional getConcreteValue(NonLoc SV) {
+  if (auto ConcreteVal = SV.getAs()) {
+return ConcreteVal->getValue().tryExtValue();
+  }
+  return std::nullopt;
+}
+
+static std::string getShortMsg(OOB_Kind Kind, std::string RegName) {
+  static const char *ShortMsgTemplates[] = {
+  "Out of bound access to memory preceding {0}",
+  "Out of bound access to memory after the end of {0}",
+  "Potential out of bound access to {0} with tainted offset"};
+
+  return formatv(ShortMsgTemplates[Kind], RegName);
+}
+
+static std::string getPrecedesMsg(std::string RegName, NonLoc Offset) {
+  SmallString<128> Buf;
+  llvm::raw_svector_ostream Out(Buf);

steakhal wrote:

Have you considered using `std::string` and `llvm::raw_string_ostream`, given 
that you convert it to `std::string` anyways. BTW sorry for not spotting this 
at the previous round.
This applies to other similar cases as well.

https://github.com/llvm/llvm-project/pull/70056
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Improve diagnostics from ArrayBoundCheckerV2 (PR #70056)

2023-10-31 Thread Balazs Benics via cfe-commits
=?utf-8?q?Donát?= Nagy ,
=?utf-8?q?Donát?= Nagy 
Message-ID:
In-Reply-To: 



@@ -217,80 +326,71 @@ void ArrayBoundCheckerV2::checkLocation(SVal location, 
bool isLoad,
 // MallocChecker that call SValBuilder::getConjuredHeapSymbolVal()) and
 // non-symbolic regions (e.g. a field subregion of a symbolic region) in
 // unknown space.
-auto [state_precedesLowerBound, state_withinLowerBound] =
-compareValueToThreshold(state, ByteOffset,
-svalBuilder.makeZeroArrayIndex(), svalBuilder);
+auto [PrecedesLowerBound, WithinLowerBound] = compareValueToThreshold(
+State, ByteOffset, SVB.makeZeroArrayIndex(), SVB);
 
-if (state_precedesLowerBound && !state_withinLowerBound) {
+if (PrecedesLowerBound && !WithinLowerBound) {
   // We know that the index definitely precedes the lower bound.
-  reportOOB(checkerContext, state_precedesLowerBound, OOB_Precedes);
+  std::string RegName = getRegionName(Reg);
+  std::string Msg = getPrecedesMsg(RegName, ByteOffset);
+  reportOOB(C, PrecedesLowerBound, OOB_Precedes, ByteOffset, RegName, Msg);

steakhal wrote:

```suggestion
  reportOOB(C, PrecedesLowerBound, OOB_Precedes, ByteOffset, 
std::move(RegName), std::move(Msg));
```

https://github.com/llvm/llvm-project/pull/70056
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Improve diagnostics from ArrayBoundCheckerV2 (PR #70056)

2023-10-31 Thread Balazs Benics via cfe-commits
=?utf-8?q?Donát?= Nagy ,
=?utf-8?q?Donát?= Nagy 
Message-ID:
In-Reply-To: 


https://github.com/steakhal approved this pull request.

I only had a couple `std::move`s missing, an FYI comment, and one question 
about the diagnostics in the tests.

Even in the current state, I think it's a good baby step in improving the 
status quo. 
Thank you for pushing this forward!
---

One remark about the review workflow, I'd prefer if the conversation starter 
would resolve the conversations. Let me explain why:

Given the amount of reviewers for CSA, I'd suggest explicitly supporting 
reviewer experience.
Speaking of that, personally I find the following workflow to suite myself the 
best:
 - All comments needs a reaction, either by explict commenting on it or by 
putting there a thums-up emoji or something. This helps the author follow which 
comments were addressed downstream and is pending for submission. An explicit 
comment is expected for challenging a comment. I prefer emojies, becase they 
don't send an email to everyone subscribe - unlike for individual comments; and 
usually I also batch individual commits into a "self-review" as patch author 
when I reply for the same reason.
 - All comments needs to be reacted before requestion the next round of review 
from the person who added those comments.
 - The comment should be only marked resolved by the person who raised the 
concern. This ensures that the comment was adequately solved, thus the issue is 
no longer relevant.
 - To merge a PR, the PR should have no open conversations.

Note that I'm not raising these because I feel we have to adjust, but rather 
because I find the reviews experience on GitHub really poor in general.
Comments disappear, comments don't show up for the patch author, only if they 
look at the right pane like "Conversations" and "Files changed" - yes, not all 
comments are present on each -.- And I've been bitten by it as a patch author 
many many times, and expectations around "reacting on comments" helped to 
mitigate this pain to some degree.

Note that this is only my personal preference, and if I'm not mistaken, there 
is no legitimized consensus around the project. At least, last time I checked 
the following relevant thread around GitHub review workflows:
https://discourse.llvm.org/t/rfc-github-pr-resolve-conversation-button/73178/



https://github.com/llvm/llvm-project/pull/70056
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Improve diagnostics from ArrayBoundCheckerV2 (PR #70056)

2023-10-31 Thread Balazs Benics via cfe-commits
=?utf-8?q?Donát?= Nagy ,
=?utf-8?q?Donát?= Nagy 
Message-ID:
In-Reply-To: 


https://github.com/steakhal edited 
https://github.com/llvm/llvm-project/pull/70056
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Improve diagnostics from ArrayBoundCheckerV2 (PR #70056)

2023-10-31 Thread via cfe-commits
=?utf-8?q?Don=C3=A1t?= Nagy ,
=?utf-8?q?Don=C3=A1t?= Nagy 
Message-ID:
In-Reply-To: 


DonatNagyE wrote:

The title change suggestion was a very good point, thanks!

https://github.com/llvm/llvm-project/pull/70056
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Improve diagnostics from ArrayBoundCheckerV2 (PR #70056)

2023-10-31 Thread via cfe-commits
=?utf-8?q?Donát?= Nagy ,
=?utf-8?q?Donát?= Nagy 
Message-ID:
In-Reply-To: 


https://github.com/DonatNagyE edited 
https://github.com/llvm/llvm-project/pull/70056
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits