[clang] [analyzer] Refine invalidation caused by `fread` (PR #93408)

2024-06-07 Thread Balazs Benics via cfe-commits

steakhal wrote:

Let's do another round.

https://github.com/llvm/llvm-project/pull/93408
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Refine invalidation caused by `fread` (PR #93408)

2024-06-07 Thread Balazs Benics via cfe-commits

https://github.com/steakhal updated 
https://github.com/llvm/llvm-project/pull/93408

>From f9e841ddaa865d529c806b2d115d5ddbc7109243 Mon Sep 17 00:00:00 2001
From: Balazs Benics 
Date: Sun, 26 May 2024 11:40:01 +0200
Subject: [PATCH 01/16] [analyzer] Refine invalidation caused by `fread`

This change enables more accurate modeling of the write effects of `fread`.
In particular, instead of invalidating the whole buffer, in a best-effort
basis, we would try to invalidate the actually accesses elements of the buffer.
This preserves the previous value of the buffer of the unaffected slots.
As a result, diagnose more uninitialized buffer uses for example.

Currently, this refined invalidation only triggers for `fread` if and
only if the `count` parameter and the buffer pointer's index component
are concrete or perfectly-constrained symbols.
Additionally, if the `fread` would read more than 64 elements, the whole
buffer is invalidated as before. This is to have safeguards against
performance issues.

Refer to the comments of the assertions in the following example to see
the changes in the diagnostics:

```c++
void demo() {
  FILE *fp = fopen("/home/test", "rb+");
  if (!fp) return;
  int buffer[10]; // uninitialized
  int read_items = fread(buffer+1, sizeof(int), 5, fp);
  if (5 == read_items) {
int v1 = buffer[1]; // Unknown value but not garbage.
clang_analyzer_isTainted(v1); // expected-warning {{YES}} <-- Would be "NO" 
without this patch.
clang_analyzer_dump(v1); // expected-warning {{conj_}} <-- Not a "derived" 
symbol, so it's directly invalidated now.
int v0 = buffer[0]; // expected-warning {{Assigned value is garbage or 
undefined}} <-- Had no report here before.
(void)(v1 + v0);
  } else {
// If 'fread' had an error.
int v0 = buffer[0]; // expected-warning {{Assigned value is garbage or 
undefined}} <-- Had no report here before.
(void)v0;
  }
  fclose(fp);
}
```

[CPP-3247](https://sonarsource.atlassian.net/browse/CPP-3247)

Patch by Marco Borgeaud (marco-antognini-sonarsource)
---
 .../StaticAnalyzer/Checkers/StreamChecker.cpp |  88 -
 clang/test/Analysis/fread.cpp | 328 ++
 2 files changed, 405 insertions(+), 11 deletions(-)
 create mode 100644 clang/test/Analysis/fread.cpp

diff --git a/clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp 
b/clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp
index d4e020f7a72a0..7b42c4f72b322 100644
--- a/clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp
+++ b/clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp
@@ -717,18 +717,71 @@ const ExplodedNode 
*StreamChecker::getAcquisitionSite(const ExplodedNode *N,
   return nullptr;
 }
 
+/// Invalidate only the requested elements instead of the whole buffer.
+/// This is basically a refinement of the more generic 'escapeArgs' or
+/// the plain old 'invalidateRegions'.
+/// This only works if the \p StartIndex and \p Count are concrete or
+/// perfectly-constrained.
+static ProgramStateRef
+escapeByStartIndexAndCount(ProgramStateRef State, CheckerContext ,
+   const CallEvent , const MemRegion *Buffer,
+   QualType ElemType, SVal StartIndex, SVal Count) {
+  if (!llvm::isa_and_nonnull(Buffer))
+return State;
+
+  auto UnboxAsInt = [, ](SVal V) -> std::optional {
+auto  = C.getSValBuilder();
+if (const llvm::APSInt *Int = SVB.getKnownValue(State, V))
+  return Int->tryExtValue();
+return std::nullopt;
+  };
+
+  auto StartIndexVal = UnboxAsInt(StartIndex);
+  auto CountVal = UnboxAsInt(Count);
+
+  // FIXME: Maybe we could make this more generic, and expose this by the
+  // 'invalidateRegions' API. After doing so, it might make sense to make this
+  // limit configurable.
+  constexpr int MaxInvalidatedElementsLimit = 64;
+  if (!StartIndexVal || !CountVal || *CountVal > MaxInvalidatedElementsLimit) {
+return State->invalidateRegions({loc::MemRegionVal{Buffer}},
+Call.getOriginExpr(), C.blockCount(),
+C.getLocationContext(),
+/*CausesPointerEscape=*/false);
+  }
+
+  constexpr auto DoNotInvalidateSuperRegion =
+  RegionAndSymbolInvalidationTraits::InvalidationKinds::
+  TK_DoNotInvalidateSuperRegion;
+
+  auto  = Buffer->getMemRegionManager();
+  SmallVector EscapingVals;
+  EscapingVals.reserve(*CountVal);
+
+  RegionAndSymbolInvalidationTraits ITraits;
+  for (auto Idx : llvm::seq(*StartIndexVal, *StartIndexVal + *CountVal)) {
+NonLoc Index = C.getSValBuilder().makeArrayIndex(Idx);
+const auto *Element = RegionManager.getElementRegion(
+ElemType, Index, cast(Buffer), C.getASTContext());
+EscapingVals.push_back(loc::MemRegionVal(Element));
+ITraits.setTrait(Element, DoNotInvalidateSuperRegion);
+  }
+  return State->invalidateRegions(EscapingVals, Call.getOriginExpr(),
+  C.blockCount(), 

[clang] [analyzer] Refine invalidation caused by `fread` (PR #93408)

2024-06-07 Thread Balazs Benics via cfe-commits

https://github.com/steakhal updated 
https://github.com/llvm/llvm-project/pull/93408

>From f9e841ddaa865d529c806b2d115d5ddbc7109243 Mon Sep 17 00:00:00 2001
From: Balazs Benics 
Date: Sun, 26 May 2024 11:40:01 +0200
Subject: [PATCH 01/16] [analyzer] Refine invalidation caused by `fread`

This change enables more accurate modeling of the write effects of `fread`.
In particular, instead of invalidating the whole buffer, in a best-effort
basis, we would try to invalidate the actually accesses elements of the buffer.
This preserves the previous value of the buffer of the unaffected slots.
As a result, diagnose more uninitialized buffer uses for example.

Currently, this refined invalidation only triggers for `fread` if and
only if the `count` parameter and the buffer pointer's index component
are concrete or perfectly-constrained symbols.
Additionally, if the `fread` would read more than 64 elements, the whole
buffer is invalidated as before. This is to have safeguards against
performance issues.

Refer to the comments of the assertions in the following example to see
the changes in the diagnostics:

```c++
void demo() {
  FILE *fp = fopen("/home/test", "rb+");
  if (!fp) return;
  int buffer[10]; // uninitialized
  int read_items = fread(buffer+1, sizeof(int), 5, fp);
  if (5 == read_items) {
int v1 = buffer[1]; // Unknown value but not garbage.
clang_analyzer_isTainted(v1); // expected-warning {{YES}} <-- Would be "NO" 
without this patch.
clang_analyzer_dump(v1); // expected-warning {{conj_}} <-- Not a "derived" 
symbol, so it's directly invalidated now.
int v0 = buffer[0]; // expected-warning {{Assigned value is garbage or 
undefined}} <-- Had no report here before.
(void)(v1 + v0);
  } else {
// If 'fread' had an error.
int v0 = buffer[0]; // expected-warning {{Assigned value is garbage or 
undefined}} <-- Had no report here before.
(void)v0;
  }
  fclose(fp);
}
```

[CPP-3247](https://sonarsource.atlassian.net/browse/CPP-3247)

Patch by Marco Borgeaud (marco-antognini-sonarsource)
---
 .../StaticAnalyzer/Checkers/StreamChecker.cpp |  88 -
 clang/test/Analysis/fread.cpp | 328 ++
 2 files changed, 405 insertions(+), 11 deletions(-)
 create mode 100644 clang/test/Analysis/fread.cpp

diff --git a/clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp 
b/clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp
index d4e020f7a72a0..7b42c4f72b322 100644
--- a/clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp
+++ b/clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp
@@ -717,18 +717,71 @@ const ExplodedNode 
*StreamChecker::getAcquisitionSite(const ExplodedNode *N,
   return nullptr;
 }
 
+/// Invalidate only the requested elements instead of the whole buffer.
+/// This is basically a refinement of the more generic 'escapeArgs' or
+/// the plain old 'invalidateRegions'.
+/// This only works if the \p StartIndex and \p Count are concrete or
+/// perfectly-constrained.
+static ProgramStateRef
+escapeByStartIndexAndCount(ProgramStateRef State, CheckerContext ,
+   const CallEvent , const MemRegion *Buffer,
+   QualType ElemType, SVal StartIndex, SVal Count) {
+  if (!llvm::isa_and_nonnull(Buffer))
+return State;
+
+  auto UnboxAsInt = [, ](SVal V) -> std::optional {
+auto  = C.getSValBuilder();
+if (const llvm::APSInt *Int = SVB.getKnownValue(State, V))
+  return Int->tryExtValue();
+return std::nullopt;
+  };
+
+  auto StartIndexVal = UnboxAsInt(StartIndex);
+  auto CountVal = UnboxAsInt(Count);
+
+  // FIXME: Maybe we could make this more generic, and expose this by the
+  // 'invalidateRegions' API. After doing so, it might make sense to make this
+  // limit configurable.
+  constexpr int MaxInvalidatedElementsLimit = 64;
+  if (!StartIndexVal || !CountVal || *CountVal > MaxInvalidatedElementsLimit) {
+return State->invalidateRegions({loc::MemRegionVal{Buffer}},
+Call.getOriginExpr(), C.blockCount(),
+C.getLocationContext(),
+/*CausesPointerEscape=*/false);
+  }
+
+  constexpr auto DoNotInvalidateSuperRegion =
+  RegionAndSymbolInvalidationTraits::InvalidationKinds::
+  TK_DoNotInvalidateSuperRegion;
+
+  auto  = Buffer->getMemRegionManager();
+  SmallVector EscapingVals;
+  EscapingVals.reserve(*CountVal);
+
+  RegionAndSymbolInvalidationTraits ITraits;
+  for (auto Idx : llvm::seq(*StartIndexVal, *StartIndexVal + *CountVal)) {
+NonLoc Index = C.getSValBuilder().makeArrayIndex(Idx);
+const auto *Element = RegionManager.getElementRegion(
+ElemType, Index, cast(Buffer), C.getASTContext());
+EscapingVals.push_back(loc::MemRegionVal(Element));
+ITraits.setTrait(Element, DoNotInvalidateSuperRegion);
+  }
+  return State->invalidateRegions(EscapingVals, Call.getOriginExpr(),
+  C.blockCount(), 

[clang] [analyzer] Refine invalidation caused by `fread` (PR #93408)

2024-06-07 Thread Balazs Benics via cfe-commits


@@ -0,0 +1,443 @@
+// RUN: %clang_analyze_cc1 -verify %s \
+// RUN:   -triple x86_64-linux-gnu  \
+// RUN:   -analyzer-checker=core,unix.Stream,alpha.security.taint \
+// RUN:   -analyzer-checker=debug.ExprInspection
+
+#include "Inputs/system-header-simulator-for-simple-stream.h"
+
+#define EOF (-1)
+
+void clang_analyzer_dump(int);
+void clang_analyzer_isTainted(int);
+void clang_analyzer_warnIfReached(void);
+
+// A stream is only tracked by StreamChecker if it results from a call to 
"fopen".
+// Otherwise, there is no specific modelling of "fread".
+void untracked_stream(FILE *fp) {
+  char c;
+  if (1 == fread(, 1, 1, fp)) {
+char p = c; // Unknown value but not garbage and not modeled by checker.
+  } else {
+char p = c; // Possibly indeterminate value but not modeled by checker.
+  }
+}
+
+void fgetc_props_taint(void) {
+  FILE *fp = fopen("/home/test", "rb+");
+  if (fp) {
+int c = fgetc(fp); // c is tainted.
+if (c != EOF) {
+  clang_analyzer_isTainted(c); // expected-warning{{YES}}
+}
+fclose(fp);
+  }
+}
+
+void fread_props_taint(void) {
+  FILE *fp = fopen("/home/test", "rb+");
+  if (fp) {
+char buffer[10];
+int c = fread(buffer, 1, 10, fp); // c is tainted.
+if (c != 10) {
+  // If the read failed, then the number of bytes successfully read should 
be tainted.
+  clang_analyzer_isTainted(c); // expected-warning{{YES}}
+}
+fclose(fp);
+  }
+}
+
+void read_one_byte1(void) {
+  FILE *fp = fopen("/home/test", "rb+");
+  if (fp) {
+char c;
+if (1 == fread(, 1, 1, fp)) {
+  char p = c; // Unknown value but not garbage.
+  clang_analyzer_isTainted(p); // expected-warning{{YES}}
+} else {
+  char p = c; // Possibly indeterminate value but not modeled by checker.
+  clang_analyzer_isTainted(p); // expected-warning{{YES}}
+}
+fclose(fp);
+  }
+}
+
+void read_one_byte2(char *buffer) {
+  FILE *fp = fopen("/home/test", "rb+");
+  if (fp) {
+if (1 == fread(buffer, 1, 1, fp)) {
+  char p = buffer[0]; // Unknown value but not garbage.
+  clang_analyzer_isTainted(p); // expected-warning{{YES}}
+} else {
+  char p = buffer[0]; // Possibly indeterminate value but not modeled by 
checker.
+  clang_analyzer_isTainted(p); // expected-warning{{YES}}
+}
+fclose(fp);
+  }
+}
+
+void read_one_byte3(char *buffer) {
+  buffer[1] = 10;
+  FILE *fp = fopen("/home/test", "rb+");
+  if (fp) {
+// buffer[1] is not mutated by fread and remains not tainted.
+fread(buffer, 1, 1, fp);
+char p = buffer[1];
+clang_analyzer_isTainted(p); // expected-warning{{NO}}
+clang_analyzer_dump(buffer[1]); // expected-warning{{10 S32b}}
+fclose(fp);
+  }
+}
+
+void read_many_bytes(char *buffer) {
+  FILE *fp = fopen("/home/test", "rb+");
+  if (fp) {
+if (42 == fread(buffer, 1, 42, fp)) {
+  char p = buffer[0]; // Unknown value but not garbage.
+  clang_analyzer_isTainted(p); // expected-warning{{YES}}
+} else {
+  char p = buffer[0]; // Possibly indeterminate value but not modeled.
+  clang_analyzer_isTainted(p); // expected-warning{{YES}}
+}
+fclose(fp);
+  }
+}
+
+void random_access_read1(int index) {
+  FILE *fp = fopen("/home/test", "rb+");
+  if (fp) {
+long c[4];
+int success = 2 == fread(c + 1, sizeof(long), 2, fp);
+
+switch (index) {
+case 0:
+  // c[0] is not mutated by fread.
+  if (success) {
+char p = c[0]; // expected-warning {{Assigned value is garbage or 
undefined}} We kept the first byte intact.
+  } else {
+char p = c[0]; // expected-warning {{Assigned value is garbage or 
undefined}} We kept the first byte intact.
+  }
+  break;
+
+case 1:
+  if (success) {
+// Unknown value but not garbage.
+clang_analyzer_isTainted(c[1]); // expected-warning {{YES}}
+clang_analyzer_dump(c[1]); // expected-warning {{conj_}}
+  } else {
+// Possibly indeterminate value but not modeled.
+clang_analyzer_isTainted(c[1]); // expected-warning {{YES}}
+clang_analyzer_dump(c[1]); // expected-warning {{conj_}}
+  }
+  break;
+
+case 2:
+  if (success) {
+long p = c[2]; // Unknown value but not garbage.
+// FIXME: Taint analysis only marks the first byte of a memory region. 
See getPointeeOf in GenericTaintChecker.cpp.
+clang_analyzer_isTainted(c[2]); // expected-warning {{NO}}
+clang_analyzer_dump(c[2]); // expected-warning {{conj_}}
+  } else {
+// Possibly indeterminate value but not modeled.
+clang_analyzer_isTainted(c[2]); // expected-warning {{NO}} // FIXME: 
See above.
+clang_analyzer_dump(c[2]); // expected-warning {{conj_}}
+  }
+  break;
+
+case 3:
+  // c[3] is not mutated by fread.
+  if (success) {
+long p = c[3]; // expected-warning {{Assigned value is garbage or 
undefined}}
+  } else {
+long p = c[3]; // 

[clang] [analyzer] Refine invalidation caused by `fread` (PR #93408)

2024-06-07 Thread Balazs Benics via cfe-commits


@@ -0,0 +1,443 @@
+// RUN: %clang_analyze_cc1 -verify %s \
+// RUN:   -triple x86_64-linux-gnu  \
+// RUN:   -analyzer-checker=core,unix.Stream,alpha.security.taint \
+// RUN:   -analyzer-checker=debug.ExprInspection
+
+#include "Inputs/system-header-simulator-for-simple-stream.h"
+
+#define EOF (-1)
+
+void clang_analyzer_dump(int);
+void clang_analyzer_isTainted(int);
+void clang_analyzer_warnIfReached(void);
+
+// A stream is only tracked by StreamChecker if it results from a call to 
"fopen".
+// Otherwise, there is no specific modelling of "fread".
+void untracked_stream(FILE *fp) {
+  char c;
+  if (1 == fread(, 1, 1, fp)) {
+char p = c; // Unknown value but not garbage and not modeled by checker.
+  } else {
+char p = c; // Possibly indeterminate value but not modeled by checker.
+  }
+}
+
+void fgetc_props_taint(void) {
+  FILE *fp = fopen("/home/test", "rb+");
+  if (fp) {
+int c = fgetc(fp); // c is tainted.
+if (c != EOF) {
+  clang_analyzer_isTainted(c); // expected-warning{{YES}}
+}
+fclose(fp);
+  }
+}
+
+void fread_props_taint(void) {
+  FILE *fp = fopen("/home/test", "rb+");
+  if (fp) {
+char buffer[10];
+int c = fread(buffer, 1, 10, fp); // c is tainted.
+if (c != 10) {
+  // If the read failed, then the number of bytes successfully read should 
be tainted.
+  clang_analyzer_isTainted(c); // expected-warning{{YES}}
+}
+fclose(fp);
+  }
+}
+
+void read_one_byte1(void) {
+  FILE *fp = fopen("/home/test", "rb+");
+  if (fp) {
+char c;
+if (1 == fread(, 1, 1, fp)) {
+  char p = c; // Unknown value but not garbage.
+  clang_analyzer_isTainted(p); // expected-warning{{YES}}
+} else {
+  char p = c; // Possibly indeterminate value but not modeled by checker.
+  clang_analyzer_isTainted(p); // expected-warning{{YES}}
+}
+fclose(fp);
+  }
+}
+
+void read_one_byte2(char *buffer) {
+  FILE *fp = fopen("/home/test", "rb+");
+  if (fp) {
+if (1 == fread(buffer, 1, 1, fp)) {
+  char p = buffer[0]; // Unknown value but not garbage.
+  clang_analyzer_isTainted(p); // expected-warning{{YES}}
+} else {
+  char p = buffer[0]; // Possibly indeterminate value but not modeled by 
checker.
+  clang_analyzer_isTainted(p); // expected-warning{{YES}}
+}
+fclose(fp);
+  }
+}
+
+void read_one_byte3(char *buffer) {
+  buffer[1] = 10;
+  FILE *fp = fopen("/home/test", "rb+");
+  if (fp) {
+// buffer[1] is not mutated by fread and remains not tainted.
+fread(buffer, 1, 1, fp);
+char p = buffer[1];
+clang_analyzer_isTainted(p); // expected-warning{{NO}}
+clang_analyzer_dump(buffer[1]); // expected-warning{{10 S32b}}
+fclose(fp);
+  }
+}
+
+void read_many_bytes(char *buffer) {
+  FILE *fp = fopen("/home/test", "rb+");
+  if (fp) {
+if (42 == fread(buffer, 1, 42, fp)) {
+  char p = buffer[0]; // Unknown value but not garbage.
+  clang_analyzer_isTainted(p); // expected-warning{{YES}}
+} else {
+  char p = buffer[0]; // Possibly indeterminate value but not modeled.
+  clang_analyzer_isTainted(p); // expected-warning{{YES}}
+}
+fclose(fp);
+  }
+}
+
+void random_access_read1(int index) {
+  FILE *fp = fopen("/home/test", "rb+");
+  if (fp) {
+long c[4];
+int success = 2 == fread(c + 1, sizeof(long), 2, fp);
+
+switch (index) {
+case 0:
+  // c[0] is not mutated by fread.
+  if (success) {
+char p = c[0]; // expected-warning {{Assigned value is garbage or 
undefined}} We kept the first byte intact.
+  } else {
+char p = c[0]; // expected-warning {{Assigned value is garbage or 
undefined}} We kept the first byte intact.
+  }
+  break;
+
+case 1:
+  if (success) {
+// Unknown value but not garbage.
+clang_analyzer_isTainted(c[1]); // expected-warning {{YES}}
+clang_analyzer_dump(c[1]); // expected-warning {{conj_}}
+  } else {
+// Possibly indeterminate value but not modeled.
+clang_analyzer_isTainted(c[1]); // expected-warning {{YES}}
+clang_analyzer_dump(c[1]); // expected-warning {{conj_}}
+  }
+  break;
+
+case 2:
+  if (success) {
+long p = c[2]; // Unknown value but not garbage.
+// FIXME: Taint analysis only marks the first byte of a memory region. 
See getPointeeOf in GenericTaintChecker.cpp.
+clang_analyzer_isTainted(c[2]); // expected-warning {{NO}}
+clang_analyzer_dump(c[2]); // expected-warning {{conj_}}
+  } else {
+// Possibly indeterminate value but not modeled.
+clang_analyzer_isTainted(c[2]); // expected-warning {{NO}} // FIXME: 
See above.
+clang_analyzer_dump(c[2]); // expected-warning {{conj_}}
+  }
+  break;
+
+case 3:
+  // c[3] is not mutated by fread.
+  if (success) {
+long p = c[3]; // expected-warning {{Assigned value is garbage or 
undefined}}
+  } else {
+long p = c[3]; // 

[clang] [analyzer] Refine invalidation caused by `fread` (PR #93408)

2024-06-07 Thread Balazs Benics via cfe-commits


@@ -0,0 +1,443 @@
+// RUN: %clang_analyze_cc1 -verify %s \
+// RUN:   -triple x86_64-linux-gnu  \
+// RUN:   -analyzer-checker=core,unix.Stream,alpha.security.taint \
+// RUN:   -analyzer-checker=debug.ExprInspection
+
+#include "Inputs/system-header-simulator-for-simple-stream.h"
+
+#define EOF (-1)
+
+void clang_analyzer_dump(int);
+void clang_analyzer_isTainted(int);
+void clang_analyzer_warnIfReached(void);
+
+// A stream is only tracked by StreamChecker if it results from a call to 
"fopen".
+// Otherwise, there is no specific modelling of "fread".
+void untracked_stream(FILE *fp) {
+  char c;
+  if (1 == fread(, 1, 1, fp)) {
+char p = c; // Unknown value but not garbage and not modeled by checker.
+  } else {
+char p = c; // Possibly indeterminate value but not modeled by checker.
+  }
+}
+
+void fgetc_props_taint(void) {
+  FILE *fp = fopen("/home/test", "rb+");
+  if (fp) {
+int c = fgetc(fp); // c is tainted.
+if (c != EOF) {
+  clang_analyzer_isTainted(c); // expected-warning{{YES}}
+}
+fclose(fp);
+  }
+}
+
+void fread_props_taint(void) {
+  FILE *fp = fopen("/home/test", "rb+");
+  if (fp) {
+char buffer[10];
+int c = fread(buffer, 1, 10, fp); // c is tainted.
+if (c != 10) {
+  // If the read failed, then the number of bytes successfully read should 
be tainted.
+  clang_analyzer_isTainted(c); // expected-warning{{YES}}
+}
+fclose(fp);
+  }
+}
+
+void read_one_byte1(void) {
+  FILE *fp = fopen("/home/test", "rb+");
+  if (fp) {
+char c;
+if (1 == fread(, 1, 1, fp)) {
+  char p = c; // Unknown value but not garbage.
+  clang_analyzer_isTainted(p); // expected-warning{{YES}}
+} else {
+  char p = c; // Possibly indeterminate value but not modeled by checker.
+  clang_analyzer_isTainted(p); // expected-warning{{YES}}
+}
+fclose(fp);
+  }
+}
+
+void read_one_byte2(char *buffer) {
+  FILE *fp = fopen("/home/test", "rb+");
+  if (fp) {
+if (1 == fread(buffer, 1, 1, fp)) {
+  char p = buffer[0]; // Unknown value but not garbage.
+  clang_analyzer_isTainted(p); // expected-warning{{YES}}
+} else {
+  char p = buffer[0]; // Possibly indeterminate value but not modeled by 
checker.
+  clang_analyzer_isTainted(p); // expected-warning{{YES}}
+}
+fclose(fp);
+  }
+}
+
+void read_one_byte3(char *buffer) {
+  buffer[1] = 10;
+  FILE *fp = fopen("/home/test", "rb+");
+  if (fp) {
+// buffer[1] is not mutated by fread and remains not tainted.
+fread(buffer, 1, 1, fp);
+char p = buffer[1];
+clang_analyzer_isTainted(p); // expected-warning{{NO}}
+clang_analyzer_dump(buffer[1]); // expected-warning{{10 S32b}}
+fclose(fp);
+  }
+}
+
+void read_many_bytes(char *buffer) {
+  FILE *fp = fopen("/home/test", "rb+");
+  if (fp) {
+if (42 == fread(buffer, 1, 42, fp)) {
+  char p = buffer[0]; // Unknown value but not garbage.
+  clang_analyzer_isTainted(p); // expected-warning{{YES}}
+} else {
+  char p = buffer[0]; // Possibly indeterminate value but not modeled.
+  clang_analyzer_isTainted(p); // expected-warning{{YES}}
+}
+fclose(fp);
+  }
+}
+
+void random_access_read1(int index) {
+  FILE *fp = fopen("/home/test", "rb+");
+  if (fp) {
+long c[4];
+int success = 2 == fread(c + 1, sizeof(long), 2, fp);
+
+switch (index) {
+case 0:
+  // c[0] is not mutated by fread.
+  if (success) {
+char p = c[0]; // expected-warning {{Assigned value is garbage or 
undefined}} We kept the first byte intact.
+  } else {
+char p = c[0]; // expected-warning {{Assigned value is garbage or 
undefined}} We kept the first byte intact.
+  }
+  break;
+
+case 1:
+  if (success) {
+// Unknown value but not garbage.
+clang_analyzer_isTainted(c[1]); // expected-warning {{YES}}
+clang_analyzer_dump(c[1]); // expected-warning {{conj_}}
+  } else {
+// Possibly indeterminate value but not modeled.
+clang_analyzer_isTainted(c[1]); // expected-warning {{YES}}
+clang_analyzer_dump(c[1]); // expected-warning {{conj_}}
+  }
+  break;
+
+case 2:
+  if (success) {
+long p = c[2]; // Unknown value but not garbage.
+// FIXME: Taint analysis only marks the first byte of a memory region. 
See getPointeeOf in GenericTaintChecker.cpp.
+clang_analyzer_isTainted(c[2]); // expected-warning {{NO}}
+clang_analyzer_dump(c[2]); // expected-warning {{conj_}}
+  } else {
+// Possibly indeterminate value but not modeled.
+clang_analyzer_isTainted(c[2]); // expected-warning {{NO}} // FIXME: 
See above.
+clang_analyzer_dump(c[2]); // expected-warning {{conj_}}
+  }
+  break;
+
+case 3:
+  // c[3] is not mutated by fread.
+  if (success) {
+long p = c[3]; // expected-warning {{Assigned value is garbage or 
undefined}}
+  } else {
+long p = c[3]; // 

[clang] [llvm] Add SonarCloud (PR #94745)

2024-06-07 Thread Balazs Benics via cfe-commits

https://github.com/steakhal updated 
https://github.com/llvm/llvm-project/pull/94745

>From 099e93f425293daf376eccbe6fd771f297126588 Mon Sep 17 00:00:00 2001
From: Balazs Benics 
Date: Fri, 7 Jun 2024 12:55:07 +0200
Subject: [PATCH 1/3] Add initial SonarCloud config

---
 .github/workflows/clang-tests-sonar-cloud.yml |  38 +
 .../llvm-project-tests-sonar-cloud.yml| 146 ++
 sonar-project.properties  |   7 +
 3 files changed, 191 insertions(+)
 create mode 100644 .github/workflows/clang-tests-sonar-cloud.yml
 create mode 100644 .github/workflows/llvm-project-tests-sonar-cloud.yml
 create mode 100644 sonar-project.properties

diff --git a/.github/workflows/clang-tests-sonar-cloud.yml 
b/.github/workflows/clang-tests-sonar-cloud.yml
new file mode 100644
index 0..2969687708629
--- /dev/null
+++ b/.github/workflows/clang-tests-sonar-cloud.yml
@@ -0,0 +1,38 @@
+name: Clang Tests with SonarScanner
+
+permissions:
+  contents: read
+
+on:
+  workflow_dispatch:
+  push:
+branches:
+  - 'release/**'
+paths:
+  - 'clang/**'
+  - '.github/workflows/clang-tests-sonar-cloud.yml'
+  - '.github/workflows/llvm-project-tests-sonar-cloud.yml'
+  - '!llvm/**'
+  pull_request:
+branches:
+  - 'release/**'
+paths:
+  - 'clang/**'
+  - '.github/workflows/clang-tests-sonar-cloud.yml'
+  - '.github/workflows/llvm-project-tests-sonar-cloud.yml'
+  - '!llvm/**'
+
+concurrency:
+  # Skip intermediate builds: always.
+  # Cancel intermediate builds: only if it is a pull request build.
+  group: ${{ github.workflow }}-${{ github.ref }}
+  cancel-in-progress: ${{ startsWith(github.ref, 'refs/pull/') }}
+
+jobs:
+  check_clang:
+if: github.repository_owner == 'llvm'
+name: Test clang,lldb,libclc
+uses: ./.github/workflows/llvm-project-tests-sonar-cloud.yml
+with:
+  build_target: check-clang
+  projects: clang;lldb;libclc
diff --git a/.github/workflows/llvm-project-tests-sonar-cloud.yml 
b/.github/workflows/llvm-project-tests-sonar-cloud.yml
new file mode 100644
index 0..7c489effda92f
--- /dev/null
+++ b/.github/workflows/llvm-project-tests-sonar-cloud.yml
@@ -0,0 +1,146 @@
+name: LLVM Project Tests and analyze with SonarScanner
+
+permissions:
+  contents: read
+
+on:
+  workflow_dispatch:
+inputs:
+  build_target:
+required: false
+  projects:
+required: false
+  extra_cmake_args:
+required: false
+  os_list:
+required: false
+default: '["ubuntu-latest"]'
+  python_version:
+required: false
+type: string
+default: '3.11'
+  workflow_call:
+inputs:
+  build_target:
+required: false
+type: string
+default: "all"
+
+  projects:
+required: true
+type: string
+
+  extra_cmake_args:
+required: false
+type: string
+
+  os_list:
+required: false
+type: string
+# Use windows-2019 due to:
+# 
https://developercommunity.visualstudio.com/t/Prev-Issue---with-__assume-isnan-/1597317
+default: '["ubuntu-latest"]'
+
+  python_version:
+required: false
+type: string
+default: '3.11'
+
+concurrency:
+  # Skip intermediate builds: always.
+  # Cancel intermediate builds: only if it is a pull request build.
+  # If the group name here is the same as the group name in the workflow that 
includes
+  # this one, then the action will try to wait on itself and get stuck.
+  group: llvm-project-${{ github.workflow }}-${{ inputs.projects }}${{ 
github.ref }}
+  cancel-in-progress: ${{ startsWith(github.ref, 'refs/pull/') }}
+
+jobs:
+  lit-tests:
+name: Lit Tests
+runs-on: ${{ matrix.os }}
+container:
+  image: ${{(startsWith(matrix.os, 'ubuntu') && 
'ghcr.io/llvm/ci-ubuntu-22.04:latest') || null}}
+  volumes:
+- /mnt/:/mnt/
+strategy:
+  fail-fast: false
+  matrix:
+os: ${{ fromJSON(inputs.os_list) }}
+steps:
+  - name: Setup Windows
+if: startsWith(matrix.os, 'windows')
+uses: llvm/actions/setup-windows@main
+with:
+  arch: amd64
+  # On Windows, starting with win19/20220814.1, cmake choose the 32-bit
+  # python3.10.6 libraries instead of the 64-bit libraries when building
+  # lldb.  Using this setup-python action to make 3.10 the default
+  # python fixes this.
+  - name: Setup Python
+uses: actions/setup-python@v4
+with:
+  python-version: ${{ inputs.python_version }}
+  - name: Install Ninja
+if: runner.os != 'Linux'
+uses: llvm/actions/install-ninja@main
+  # actions/checkout deletes any existing files in the new git directory,
+  # so this needs to either run before ccache-action or it has to use
+  # clean: false.
+  - uses: actions/checkout@v4
+with:
+  fetch-depth: 250
+  - name: Setup 

[clang] [llvm] Add SonarCloud (PR #94745)

2024-06-07 Thread Balazs Benics via cfe-commits

https://github.com/steakhal updated 
https://github.com/llvm/llvm-project/pull/94745

>From 099e93f425293daf376eccbe6fd771f297126588 Mon Sep 17 00:00:00 2001
From: Balazs Benics 
Date: Fri, 7 Jun 2024 12:55:07 +0200
Subject: [PATCH 1/2] Add initial SonarCloud config

---
 .github/workflows/clang-tests-sonar-cloud.yml |  38 +
 .../llvm-project-tests-sonar-cloud.yml| 146 ++
 sonar-project.properties  |   7 +
 3 files changed, 191 insertions(+)
 create mode 100644 .github/workflows/clang-tests-sonar-cloud.yml
 create mode 100644 .github/workflows/llvm-project-tests-sonar-cloud.yml
 create mode 100644 sonar-project.properties

diff --git a/.github/workflows/clang-tests-sonar-cloud.yml 
b/.github/workflows/clang-tests-sonar-cloud.yml
new file mode 100644
index 0..2969687708629
--- /dev/null
+++ b/.github/workflows/clang-tests-sonar-cloud.yml
@@ -0,0 +1,38 @@
+name: Clang Tests with SonarScanner
+
+permissions:
+  contents: read
+
+on:
+  workflow_dispatch:
+  push:
+branches:
+  - 'release/**'
+paths:
+  - 'clang/**'
+  - '.github/workflows/clang-tests-sonar-cloud.yml'
+  - '.github/workflows/llvm-project-tests-sonar-cloud.yml'
+  - '!llvm/**'
+  pull_request:
+branches:
+  - 'release/**'
+paths:
+  - 'clang/**'
+  - '.github/workflows/clang-tests-sonar-cloud.yml'
+  - '.github/workflows/llvm-project-tests-sonar-cloud.yml'
+  - '!llvm/**'
+
+concurrency:
+  # Skip intermediate builds: always.
+  # Cancel intermediate builds: only if it is a pull request build.
+  group: ${{ github.workflow }}-${{ github.ref }}
+  cancel-in-progress: ${{ startsWith(github.ref, 'refs/pull/') }}
+
+jobs:
+  check_clang:
+if: github.repository_owner == 'llvm'
+name: Test clang,lldb,libclc
+uses: ./.github/workflows/llvm-project-tests-sonar-cloud.yml
+with:
+  build_target: check-clang
+  projects: clang;lldb;libclc
diff --git a/.github/workflows/llvm-project-tests-sonar-cloud.yml 
b/.github/workflows/llvm-project-tests-sonar-cloud.yml
new file mode 100644
index 0..7c489effda92f
--- /dev/null
+++ b/.github/workflows/llvm-project-tests-sonar-cloud.yml
@@ -0,0 +1,146 @@
+name: LLVM Project Tests and analyze with SonarScanner
+
+permissions:
+  contents: read
+
+on:
+  workflow_dispatch:
+inputs:
+  build_target:
+required: false
+  projects:
+required: false
+  extra_cmake_args:
+required: false
+  os_list:
+required: false
+default: '["ubuntu-latest"]'
+  python_version:
+required: false
+type: string
+default: '3.11'
+  workflow_call:
+inputs:
+  build_target:
+required: false
+type: string
+default: "all"
+
+  projects:
+required: true
+type: string
+
+  extra_cmake_args:
+required: false
+type: string
+
+  os_list:
+required: false
+type: string
+# Use windows-2019 due to:
+# 
https://developercommunity.visualstudio.com/t/Prev-Issue---with-__assume-isnan-/1597317
+default: '["ubuntu-latest"]'
+
+  python_version:
+required: false
+type: string
+default: '3.11'
+
+concurrency:
+  # Skip intermediate builds: always.
+  # Cancel intermediate builds: only if it is a pull request build.
+  # If the group name here is the same as the group name in the workflow that 
includes
+  # this one, then the action will try to wait on itself and get stuck.
+  group: llvm-project-${{ github.workflow }}-${{ inputs.projects }}${{ 
github.ref }}
+  cancel-in-progress: ${{ startsWith(github.ref, 'refs/pull/') }}
+
+jobs:
+  lit-tests:
+name: Lit Tests
+runs-on: ${{ matrix.os }}
+container:
+  image: ${{(startsWith(matrix.os, 'ubuntu') && 
'ghcr.io/llvm/ci-ubuntu-22.04:latest') || null}}
+  volumes:
+- /mnt/:/mnt/
+strategy:
+  fail-fast: false
+  matrix:
+os: ${{ fromJSON(inputs.os_list) }}
+steps:
+  - name: Setup Windows
+if: startsWith(matrix.os, 'windows')
+uses: llvm/actions/setup-windows@main
+with:
+  arch: amd64
+  # On Windows, starting with win19/20220814.1, cmake choose the 32-bit
+  # python3.10.6 libraries instead of the 64-bit libraries when building
+  # lldb.  Using this setup-python action to make 3.10 the default
+  # python fixes this.
+  - name: Setup Python
+uses: actions/setup-python@v4
+with:
+  python-version: ${{ inputs.python_version }}
+  - name: Install Ninja
+if: runner.os != 'Linux'
+uses: llvm/actions/install-ninja@main
+  # actions/checkout deletes any existing files in the new git directory,
+  # so this needs to either run before ccache-action or it has to use
+  # clean: false.
+  - uses: actions/checkout@v4
+with:
+  fetch-depth: 250
+  - name: Setup 

[clang] [clang][analyzer] Improved PointerSubChecker (PR #93676)

2024-06-07 Thread Balazs Benics via cfe-commits
=?utf-8?q?Balázs_Kéri?= ,
=?utf-8?q?Balázs_Kéri?= ,
=?utf-8?q?Balázs_Kéri?= 
Message-ID:
In-Reply-To: 


https://github.com/steakhal edited 
https://github.com/llvm/llvm-project/pull/93676
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][analyzer] Improved PointerSubChecker (PR #93676)

2024-06-07 Thread Balazs Benics via cfe-commits
=?utf-8?q?Bal=C3=A1zs_K=C3=A9ri?= ,
=?utf-8?q?Bal=C3=A1zs_K=C3=A9ri?= ,
=?utf-8?q?Bal=C3=A1zs_K=C3=A9ri?= 
Message-ID:
In-Reply-To: 


https://github.com/steakhal approved this pull request.

LGTM, thanks. Minor nits.

https://github.com/llvm/llvm-project/pull/93676
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][analyzer] Improved PointerSubChecker (PR #93676)

2024-06-07 Thread Balazs Benics via cfe-commits
=?utf-8?q?Bal=C3=A1zs_K=C3=A9ri?= ,
=?utf-8?q?Bal=C3=A1zs_K=C3=A9ri?= ,
=?utf-8?q?Bal=C3=A1zs_K=C3=A9ri?= 
Message-ID:
In-Reply-To: 



@@ -53,10 +62,10 @@ void f4(void) {
   int (*p)[m] = a; // p == [0]
   p += 1; // p == [1]
 
-  // FIXME: This warning is not needed
+  // FIXME: This is a known problem with -Wpointer-arith

steakhal wrote:

Please put the issue link here and to the other place.

https://github.com/llvm/llvm-project/pull/93676
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Refine invalidation caused by `fread` (PR #93408)

2024-06-06 Thread Balazs Benics via cfe-commits

steakhal wrote:

Addressed all the feedback.

https://github.com/llvm/llvm-project/pull/93408
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Refine invalidation caused by `fread` (PR #93408)

2024-06-06 Thread Balazs Benics via cfe-commits

https://github.com/steakhal updated 
https://github.com/llvm/llvm-project/pull/93408




  
Unicorn!  GitHub

  body {
background-color: #f1f1f1;
margin: 0;
font-family: "Helvetica Neue", Helvetica, Arial, sans-serif;
  }

  .container { margin: 50px auto 40px auto; width: 600px; text-align: 
center; }

  a { color: #4183c4; text-decoration: none; }
  a:hover { text-decoration: underline; }

  h1 { letter-spacing: -1px; line-height: 60px; font-size: 60px; 
font-weight: 100; margin: 0px; text-shadow: 0 1px 0 #fff; }
  p { color: rgba(0, 0, 0, 0.5); margin: 10px 0 10px; font-size: 18px; 
font-weight: 200; line-height: 1.6em;}

  ul { list-style: none; margin: 25px 0; padding: 0; }
  li { display: table-cell; font-weight: bold; width: 1%; }

  .logo { display: inline-block; margin-top: 35px; }
  .logo-img-2x { display: none; }
  @media
  only screen and (-webkit-min-device-pixel-ratio: 2),
  only screen and (   min--moz-device-pixel-ratio: 2),
  only screen and ( -o-min-device-pixel-ratio: 2/1),
  only screen and (min-device-pixel-ratio: 2),
  only screen and (min-resolution: 192dpi),
  only screen and (min-resolution: 2dppx) {
.logo-img-1x { display: none; }
.logo-img-2x { display: inline-block; }
  }

  #suggestions {
margin-top: 35px;
color: #ccc;
  }
  #suggestions a {
color: #66;
font-weight: 200;
font-size: 14px;
margin: 0 10px;
  }


  
  


  

  

  We couldn't respond to your request in time.
  Sorry about that. Please try refreshing and contact us if the problem 
persists.
  
https://github.com/contact;>Contact Support 
https://www.githubstatus.com;>GitHub Status 
https://twitter.com/githubstatus;>@githubstatus
  

  

  

  

  

  


___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Refine invalidation caused by `fread` (PR #93408)

2024-06-06 Thread Balazs Benics via cfe-commits


@@ -907,6 +945,73 @@ void StreamChecker::preWrite(const FnDescription *Desc, 
const CallEvent ,
   C.addTransition(State);
 }
 
+static std::optional getPointeeType(const MemRegion *R) {
+  if (!R)
+return std::nullopt;
+  if (const auto *ER = dyn_cast(R))
+return ER->getElementType();
+  if (const auto *TR = dyn_cast(R))
+return TR->getValueType();
+  if (const auto *SR = dyn_cast(R))
+return SR->getPointeeStaticType();
+  return std::nullopt;
+}
+
+static std::optional getStartIndex(SValBuilder ,
+   const MemRegion *R) {
+  if (!R)
+return std::nullopt;
+
+  auto Zero = [] {
+BasicValueFactory  = SVB.getBasicValueFactory();
+return nonloc::ConcreteInt(BVF.getIntValue(0, /*isUnsigned=*/false));
+  };
+
+  if (const auto *ER = dyn_cast(R))
+return ER->getIndex();
+  if (const auto *TR = dyn_cast(R))
+return Zero();
+  if (const auto *SR = dyn_cast(R))
+return Zero();
+  return std::nullopt;
+}
+
+static ProgramStateRef
+tryToInvalidateFReadBufferByElements(ProgramStateRef State, CheckerContext ,
+ const CallEvent , NonLoc SizeVal,
+ NonLoc NMembVal) {
+  // Try to invalidate the individual elements.
+  const auto *Buffer =
+  dyn_cast_or_null(Call.getArgSVal(0).getAsRegion());
+
+  std::optional ElemTy = getPointeeType(Buffer);
+  std::optional StartElementIndex =
+  getStartIndex(C.getSValBuilder(), Buffer);
+
+  // Drop the outermost ElementRegion to get the buffer.
+  if (const auto *ER = dyn_cast_or_null(Buffer))
+Buffer = dyn_cast(ER->getSuperRegion());
+
+  std::optional CountVal = getKnownValue(State, NMembVal);
+  std::optional Size = getKnownValue(State, SizeVal);
+  std::optional StartIndexVal =
+  getKnownValue(State, StartElementIndex.value_or(UnknownVal()));
+
+  if (ElemTy && CountVal && Size && StartIndexVal) {
+int64_t NumBytesRead = Size.value() * CountVal.value();
+int64_t ElemSizeInChars =
+C.getASTContext().getTypeSizeInChars(*ElemTy).getQuantity();
+bool DivisibleAccessSpan = (NumBytesRead % ElemSizeInChars) == 0;
+int64_t NumElementsRead = NumBytesRead / ElemSizeInChars;
+constexpr int MaxInvalidatedElementsLimit = 64;
+if (DivisibleAccessSpan && NumElementsRead <= MaxInvalidatedElementsLimit) 
{

steakhal wrote:

Applied a patch to this code such that the last partial element is considered 
as a full access.
This is indeed better than doing a fallback and invalidating the full buffer.
Also modified the test to demonstrate this, but there is yet another bug in the 
current Store:
If we read from a byte offset of an element, we still get unknown as a result: 
`int buffer[2] = {2,3}` then reading from `((char*)buffer)[1]` results in 
unknown. [Compiler explorer](https://godbolt.org/z/dTKGnYo9W).

https://github.com/llvm/llvm-project/pull/93408
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Refine invalidation caused by `fread` (PR #93408)

2024-06-06 Thread Balazs Benics via cfe-commits


@@ -0,0 +1,412 @@
+// RUN: %clang_analyze_cc1 -verify %s \
+// RUN:   -triple x86_64-linux-gnu  \
+// RUN:   -analyzer-checker=core,unix.Stream,alpha.security.taint \
+// RUN:   -analyzer-checker=debug.ExprInspection
+
+#include "Inputs/system-header-simulator-for-simple-stream.h"
+
+#define EOF (-1)
+
+void clang_analyzer_dump(int);
+void clang_analyzer_isTainted(int);
+void clang_analyzer_warnIfReached(void);
+
+// A stream is only tracked by StreamChecker if it results from a call to 
"fopen".
+// Otherwise, there is no specific modelling of "fread".
+void untracked_stream(FILE *fp) {
+  char c;
+  if (1 == fread(, 1, 1, fp)) {
+char p = c; // Unknown value but not garbage and not modeled by checker.
+  } else {
+char p = c; // Possibly indeterminate value but not modeled by checker.
+  }
+}
+
+void fgetc_props_taint(void) {
+  FILE *fp = fopen("/home/test", "rb+");
+  if (fp) {
+int c = fgetc(fp); // c is tainted.
+if (c != EOF) {
+  clang_analyzer_isTainted(c); // expected-warning{{YES}}
+}
+fclose(fp);
+  }
+}
+
+void fread_props_taint(void) {
+  FILE *fp = fopen("/home/test", "rb+");
+  if (fp) {
+char buffer[10];
+int c = fread(buffer, 1, 10, fp); // c is tainted.
+if (c != 10) {
+  // If the read failed, then the number of bytes successfully read should 
be tainted.
+  clang_analyzer_isTainted(c); // expected-warning{{YES}}
+}
+fclose(fp);
+  }
+}
+
+void read_one_byte1(void) {
+  FILE *fp = fopen("/home/test", "rb+");
+  if (fp) {
+char c;
+if (1 == fread(, 1, 1, fp)) {
+  char p = c; // Unknown value but not garbage.
+  clang_analyzer_isTainted(p); // expected-warning{{YES}}
+} else {
+  char p = c; // Possibly indeterminate value but not modeled by checker.
+  clang_analyzer_isTainted(p); // expected-warning{{YES}}
+}
+fclose(fp);
+  }
+}
+
+void read_one_byte2(char *buffer) {
+  FILE *fp = fopen("/home/test", "rb+");
+  if (fp) {
+if (1 == fread(buffer, 1, 1, fp)) {
+  char p = buffer[0]; // Unknown value but not garbage.
+  clang_analyzer_isTainted(p); // expected-warning{{YES}}
+} else {
+  char p = buffer[0]; // Possibly indeterminate value but not modeled by 
checker.
+  clang_analyzer_isTainted(p); // expected-warning{{YES}}
+}
+fclose(fp);
+  }
+}
+
+void read_one_byte3(char *buffer) {
+  buffer[1] = 10;
+  FILE *fp = fopen("/home/test", "rb+");
+  if (fp) {
+// buffer[1] is not mutated by fread and remains not tainted.
+fread(buffer, 1, 1, fp);
+char p = buffer[1];
+clang_analyzer_isTainted(p); // expected-warning{{NO}}
+clang_analyzer_dump(buffer[1]); // expected-warning{{10 S32b}}
+fclose(fp);
+  }
+}
+
+void read_many_bytes(char *buffer) {
+  FILE *fp = fopen("/home/test", "rb+");
+  if (fp) {
+if (42 == fread(buffer, 1, 42, fp)) {
+  char p = buffer[0]; // Unknown value but not garbage.
+  clang_analyzer_isTainted(p); // expected-warning{{YES}}
+} else {
+  char p = buffer[0]; // Possibly indeterminate value but not modeled.
+  clang_analyzer_isTainted(p); // expected-warning{{YES}}
+}
+fclose(fp);
+  }
+}
+
+void random_access_write1(int index) {
+  FILE *fp = fopen("/home/test", "rb+");
+  if (fp) {
+long c[4];
+int success = 2 == fread(c + 1, sizeof(long), 2, fp);
+
+switch (index) {
+case 0:
+  // c[0] is not mutated by fread.
+  if (success) {
+char p = c[0]; // expected-warning {{Assigned value is garbage or 
undefined}} We kept the first byte intact.
+  } else {
+char p = c[0]; // expected-warning {{Assigned value is garbage or 
undefined}} We kept the first byte intact.
+  }
+  break;
+
+case 1:
+  if (success) {
+// Unknown value but not garbage.
+clang_analyzer_isTainted(c[1]); // expected-warning {{YES}}
+clang_analyzer_dump(c[1]); // expected-warning {{conj_}}
+  } else {
+// Possibly indeterminate value but not modeled.
+clang_analyzer_isTainted(c[1]); // expected-warning {{YES}}
+clang_analyzer_dump(c[1]); // expected-warning {{conj_}}
+  }
+  break;
+
+case 2:
+  if (success) {
+long p = c[2]; // Unknown value but not garbage.
+// FIXME: Taint analysis only marks the first byte of a memory region. 
See getPointeeOf in GenericTaintChecker.cpp.
+clang_analyzer_isTainted(c[2]); // expected-warning {{NO}}
+clang_analyzer_dump(c[2]); // expected-warning {{conj_}}
+  } else {
+// Possibly indeterminate value but not modeled.
+clang_analyzer_isTainted(c[2]); // expected-warning {{NO}} // FIXME: 
See above.
+clang_analyzer_dump(c[2]); // expected-warning {{conj_}}
+  }
+  break;
+
+case 3:
+  // c[3] is not mutated by fread.
+  if (success) {
+long p = c[3]; // expected-warning {{Assigned value is garbage or 
undefined}}
+  } else {
+long p = c[3]; // 

[clang] [analyzer] Refine invalidation caused by `fread` (PR #93408)

2024-06-06 Thread Balazs Benics via cfe-commits


@@ -0,0 +1,412 @@
+// RUN: %clang_analyze_cc1 -verify %s \
+// RUN:   -triple x86_64-linux-gnu  \
+// RUN:   -analyzer-checker=core,unix.Stream,alpha.security.taint \
+// RUN:   -analyzer-checker=debug.ExprInspection
+
+#include "Inputs/system-header-simulator-for-simple-stream.h"
+
+#define EOF (-1)
+
+void clang_analyzer_dump(int);
+void clang_analyzer_isTainted(int);
+void clang_analyzer_warnIfReached(void);
+
+// A stream is only tracked by StreamChecker if it results from a call to 
"fopen".
+// Otherwise, there is no specific modelling of "fread".
+void untracked_stream(FILE *fp) {
+  char c;
+  if (1 == fread(, 1, 1, fp)) {
+char p = c; // Unknown value but not garbage and not modeled by checker.
+  } else {
+char p = c; // Possibly indeterminate value but not modeled by checker.
+  }
+}
+
+void fgetc_props_taint(void) {
+  FILE *fp = fopen("/home/test", "rb+");
+  if (fp) {
+int c = fgetc(fp); // c is tainted.
+if (c != EOF) {
+  clang_analyzer_isTainted(c); // expected-warning{{YES}}
+}
+fclose(fp);
+  }
+}
+
+void fread_props_taint(void) {
+  FILE *fp = fopen("/home/test", "rb+");
+  if (fp) {
+char buffer[10];
+int c = fread(buffer, 1, 10, fp); // c is tainted.
+if (c != 10) {
+  // If the read failed, then the number of bytes successfully read should 
be tainted.
+  clang_analyzer_isTainted(c); // expected-warning{{YES}}
+}
+fclose(fp);
+  }
+}
+
+void read_one_byte1(void) {
+  FILE *fp = fopen("/home/test", "rb+");
+  if (fp) {
+char c;
+if (1 == fread(, 1, 1, fp)) {
+  char p = c; // Unknown value but not garbage.
+  clang_analyzer_isTainted(p); // expected-warning{{YES}}
+} else {
+  char p = c; // Possibly indeterminate value but not modeled by checker.
+  clang_analyzer_isTainted(p); // expected-warning{{YES}}
+}
+fclose(fp);
+  }
+}
+
+void read_one_byte2(char *buffer) {
+  FILE *fp = fopen("/home/test", "rb+");
+  if (fp) {
+if (1 == fread(buffer, 1, 1, fp)) {
+  char p = buffer[0]; // Unknown value but not garbage.
+  clang_analyzer_isTainted(p); // expected-warning{{YES}}
+} else {
+  char p = buffer[0]; // Possibly indeterminate value but not modeled by 
checker.
+  clang_analyzer_isTainted(p); // expected-warning{{YES}}
+}
+fclose(fp);
+  }
+}
+
+void read_one_byte3(char *buffer) {
+  buffer[1] = 10;
+  FILE *fp = fopen("/home/test", "rb+");
+  if (fp) {
+// buffer[1] is not mutated by fread and remains not tainted.
+fread(buffer, 1, 1, fp);
+char p = buffer[1];
+clang_analyzer_isTainted(p); // expected-warning{{NO}}
+clang_analyzer_dump(buffer[1]); // expected-warning{{10 S32b}}
+fclose(fp);
+  }
+}
+
+void read_many_bytes(char *buffer) {
+  FILE *fp = fopen("/home/test", "rb+");
+  if (fp) {
+if (42 == fread(buffer, 1, 42, fp)) {
+  char p = buffer[0]; // Unknown value but not garbage.
+  clang_analyzer_isTainted(p); // expected-warning{{YES}}
+} else {
+  char p = buffer[0]; // Possibly indeterminate value but not modeled.
+  clang_analyzer_isTainted(p); // expected-warning{{YES}}
+}
+fclose(fp);
+  }
+}
+
+void random_access_write1(int index) {
+  FILE *fp = fopen("/home/test", "rb+");
+  if (fp) {
+long c[4];
+int success = 2 == fread(c + 1, sizeof(long), 2, fp);
+
+switch (index) {
+case 0:
+  // c[0] is not mutated by fread.
+  if (success) {
+char p = c[0]; // expected-warning {{Assigned value is garbage or 
undefined}} We kept the first byte intact.
+  } else {
+char p = c[0]; // expected-warning {{Assigned value is garbage or 
undefined}} We kept the first byte intact.
+  }
+  break;
+
+case 1:
+  if (success) {
+// Unknown value but not garbage.
+clang_analyzer_isTainted(c[1]); // expected-warning {{YES}}
+clang_analyzer_dump(c[1]); // expected-warning {{conj_}}
+  } else {
+// Possibly indeterminate value but not modeled.
+clang_analyzer_isTainted(c[1]); // expected-warning {{YES}}
+clang_analyzer_dump(c[1]); // expected-warning {{conj_}}
+  }
+  break;
+
+case 2:
+  if (success) {
+long p = c[2]; // Unknown value but not garbage.
+// FIXME: Taint analysis only marks the first byte of a memory region. 
See getPointeeOf in GenericTaintChecker.cpp.
+clang_analyzer_isTainted(c[2]); // expected-warning {{NO}}
+clang_analyzer_dump(c[2]); // expected-warning {{conj_}}
+  } else {
+// Possibly indeterminate value but not modeled.
+clang_analyzer_isTainted(c[2]); // expected-warning {{NO}} // FIXME: 
See above.
+clang_analyzer_dump(c[2]); // expected-warning {{conj_}}
+  }
+  break;
+
+case 3:
+  // c[3] is not mutated by fread.
+  if (success) {
+long p = c[3]; // expected-warning {{Assigned value is garbage or 
undefined}}
+  } else {
+long p = c[3]; // 

[clang] [analyzer] Refine invalidation caused by `fread` (PR #93408)

2024-06-06 Thread Balazs Benics via cfe-commits


@@ -0,0 +1,412 @@
+// RUN: %clang_analyze_cc1 -verify %s \
+// RUN:   -triple x86_64-linux-gnu  \
+// RUN:   -analyzer-checker=core,unix.Stream,alpha.security.taint \
+// RUN:   -analyzer-checker=debug.ExprInspection
+
+#include "Inputs/system-header-simulator-for-simple-stream.h"
+
+#define EOF (-1)
+
+void clang_analyzer_dump(int);
+void clang_analyzer_isTainted(int);
+void clang_analyzer_warnIfReached(void);
+
+// A stream is only tracked by StreamChecker if it results from a call to 
"fopen".
+// Otherwise, there is no specific modelling of "fread".
+void untracked_stream(FILE *fp) {
+  char c;
+  if (1 == fread(, 1, 1, fp)) {
+char p = c; // Unknown value but not garbage and not modeled by checker.
+  } else {
+char p = c; // Possibly indeterminate value but not modeled by checker.
+  }
+}
+
+void fgetc_props_taint(void) {
+  FILE *fp = fopen("/home/test", "rb+");
+  if (fp) {
+int c = fgetc(fp); // c is tainted.
+if (c != EOF) {
+  clang_analyzer_isTainted(c); // expected-warning{{YES}}
+}
+fclose(fp);
+  }
+}
+
+void fread_props_taint(void) {
+  FILE *fp = fopen("/home/test", "rb+");
+  if (fp) {
+char buffer[10];
+int c = fread(buffer, 1, 10, fp); // c is tainted.
+if (c != 10) {
+  // If the read failed, then the number of bytes successfully read should 
be tainted.
+  clang_analyzer_isTainted(c); // expected-warning{{YES}}
+}
+fclose(fp);
+  }
+}
+
+void read_one_byte1(void) {
+  FILE *fp = fopen("/home/test", "rb+");
+  if (fp) {
+char c;
+if (1 == fread(, 1, 1, fp)) {
+  char p = c; // Unknown value but not garbage.
+  clang_analyzer_isTainted(p); // expected-warning{{YES}}
+} else {
+  char p = c; // Possibly indeterminate value but not modeled by checker.
+  clang_analyzer_isTainted(p); // expected-warning{{YES}}
+}
+fclose(fp);
+  }
+}
+
+void read_one_byte2(char *buffer) {
+  FILE *fp = fopen("/home/test", "rb+");
+  if (fp) {
+if (1 == fread(buffer, 1, 1, fp)) {
+  char p = buffer[0]; // Unknown value but not garbage.
+  clang_analyzer_isTainted(p); // expected-warning{{YES}}
+} else {
+  char p = buffer[0]; // Possibly indeterminate value but not modeled by 
checker.
+  clang_analyzer_isTainted(p); // expected-warning{{YES}}
+}
+fclose(fp);
+  }
+}
+
+void read_one_byte3(char *buffer) {
+  buffer[1] = 10;
+  FILE *fp = fopen("/home/test", "rb+");
+  if (fp) {
+// buffer[1] is not mutated by fread and remains not tainted.
+fread(buffer, 1, 1, fp);
+char p = buffer[1];
+clang_analyzer_isTainted(p); // expected-warning{{NO}}
+clang_analyzer_dump(buffer[1]); // expected-warning{{10 S32b}}
+fclose(fp);
+  }
+}
+
+void read_many_bytes(char *buffer) {
+  FILE *fp = fopen("/home/test", "rb+");
+  if (fp) {
+if (42 == fread(buffer, 1, 42, fp)) {
+  char p = buffer[0]; // Unknown value but not garbage.
+  clang_analyzer_isTainted(p); // expected-warning{{YES}}
+} else {
+  char p = buffer[0]; // Possibly indeterminate value but not modeled.
+  clang_analyzer_isTainted(p); // expected-warning{{YES}}
+}
+fclose(fp);
+  }
+}
+
+void random_access_write1(int index) {
+  FILE *fp = fopen("/home/test", "rb+");
+  if (fp) {
+long c[4];
+int success = 2 == fread(c + 1, sizeof(long), 2, fp);
+
+switch (index) {
+case 0:
+  // c[0] is not mutated by fread.
+  if (success) {
+char p = c[0]; // expected-warning {{Assigned value is garbage or 
undefined}} We kept the first byte intact.
+  } else {
+char p = c[0]; // expected-warning {{Assigned value is garbage or 
undefined}} We kept the first byte intact.
+  }
+  break;
+
+case 1:
+  if (success) {
+// Unknown value but not garbage.
+clang_analyzer_isTainted(c[1]); // expected-warning {{YES}}
+clang_analyzer_dump(c[1]); // expected-warning {{conj_}}
+  } else {
+// Possibly indeterminate value but not modeled.
+clang_analyzer_isTainted(c[1]); // expected-warning {{YES}}
+clang_analyzer_dump(c[1]); // expected-warning {{conj_}}
+  }
+  break;
+
+case 2:
+  if (success) {
+long p = c[2]; // Unknown value but not garbage.
+// FIXME: Taint analysis only marks the first byte of a memory region. 
See getPointeeOf in GenericTaintChecker.cpp.
+clang_analyzer_isTainted(c[2]); // expected-warning {{NO}}
+clang_analyzer_dump(c[2]); // expected-warning {{conj_}}
+  } else {
+// Possibly indeterminate value but not modeled.
+clang_analyzer_isTainted(c[2]); // expected-warning {{NO}} // FIXME: 
See above.
+clang_analyzer_dump(c[2]); // expected-warning {{conj_}}
+  }
+  break;
+
+case 3:
+  // c[3] is not mutated by fread.
+  if (success) {
+long p = c[3]; // expected-warning {{Assigned value is garbage or 
undefined}}
+  } else {
+long p = c[3]; // 

[clang] [analyzer] Refine invalidation caused by `fread` (PR #93408)

2024-06-06 Thread Balazs Benics via cfe-commits


@@ -0,0 +1,412 @@
+// RUN: %clang_analyze_cc1 -verify %s \
+// RUN:   -triple x86_64-linux-gnu  \
+// RUN:   -analyzer-checker=core,unix.Stream,alpha.security.taint \
+// RUN:   -analyzer-checker=debug.ExprInspection
+
+#include "Inputs/system-header-simulator-for-simple-stream.h"
+
+#define EOF (-1)
+
+void clang_analyzer_dump(int);
+void clang_analyzer_isTainted(int);
+void clang_analyzer_warnIfReached(void);
+
+// A stream is only tracked by StreamChecker if it results from a call to 
"fopen".
+// Otherwise, there is no specific modelling of "fread".
+void untracked_stream(FILE *fp) {
+  char c;
+  if (1 == fread(, 1, 1, fp)) {
+char p = c; // Unknown value but not garbage and not modeled by checker.
+  } else {
+char p = c; // Possibly indeterminate value but not modeled by checker.
+  }
+}
+
+void fgetc_props_taint(void) {
+  FILE *fp = fopen("/home/test", "rb+");
+  if (fp) {
+int c = fgetc(fp); // c is tainted.
+if (c != EOF) {
+  clang_analyzer_isTainted(c); // expected-warning{{YES}}
+}
+fclose(fp);
+  }
+}
+
+void fread_props_taint(void) {
+  FILE *fp = fopen("/home/test", "rb+");
+  if (fp) {
+char buffer[10];
+int c = fread(buffer, 1, 10, fp); // c is tainted.
+if (c != 10) {
+  // If the read failed, then the number of bytes successfully read should 
be tainted.
+  clang_analyzer_isTainted(c); // expected-warning{{YES}}
+}
+fclose(fp);
+  }
+}
+
+void read_one_byte1(void) {
+  FILE *fp = fopen("/home/test", "rb+");
+  if (fp) {
+char c;
+if (1 == fread(, 1, 1, fp)) {
+  char p = c; // Unknown value but not garbage.
+  clang_analyzer_isTainted(p); // expected-warning{{YES}}
+} else {
+  char p = c; // Possibly indeterminate value but not modeled by checker.
+  clang_analyzer_isTainted(p); // expected-warning{{YES}}
+}
+fclose(fp);
+  }
+}
+
+void read_one_byte2(char *buffer) {
+  FILE *fp = fopen("/home/test", "rb+");
+  if (fp) {
+if (1 == fread(buffer, 1, 1, fp)) {
+  char p = buffer[0]; // Unknown value but not garbage.
+  clang_analyzer_isTainted(p); // expected-warning{{YES}}
+} else {
+  char p = buffer[0]; // Possibly indeterminate value but not modeled by 
checker.
+  clang_analyzer_isTainted(p); // expected-warning{{YES}}
+}
+fclose(fp);
+  }
+}
+
+void read_one_byte3(char *buffer) {
+  buffer[1] = 10;
+  FILE *fp = fopen("/home/test", "rb+");
+  if (fp) {
+// buffer[1] is not mutated by fread and remains not tainted.
+fread(buffer, 1, 1, fp);
+char p = buffer[1];
+clang_analyzer_isTainted(p); // expected-warning{{NO}}
+clang_analyzer_dump(buffer[1]); // expected-warning{{10 S32b}}
+fclose(fp);
+  }
+}
+
+void read_many_bytes(char *buffer) {
+  FILE *fp = fopen("/home/test", "rb+");
+  if (fp) {
+if (42 == fread(buffer, 1, 42, fp)) {
+  char p = buffer[0]; // Unknown value but not garbage.
+  clang_analyzer_isTainted(p); // expected-warning{{YES}}
+} else {
+  char p = buffer[0]; // Possibly indeterminate value but not modeled.
+  clang_analyzer_isTainted(p); // expected-warning{{YES}}
+}
+fclose(fp);
+  }
+}
+
+void random_access_write1(int index) {
+  FILE *fp = fopen("/home/test", "rb+");
+  if (fp) {
+long c[4];
+int success = 2 == fread(c + 1, sizeof(long), 2, fp);
+
+switch (index) {
+case 0:
+  // c[0] is not mutated by fread.
+  if (success) {
+char p = c[0]; // expected-warning {{Assigned value is garbage or 
undefined}} We kept the first byte intact.
+  } else {
+char p = c[0]; // expected-warning {{Assigned value is garbage or 
undefined}} We kept the first byte intact.
+  }
+  break;
+
+case 1:
+  if (success) {
+// Unknown value but not garbage.
+clang_analyzer_isTainted(c[1]); // expected-warning {{YES}}
+clang_analyzer_dump(c[1]); // expected-warning {{conj_}}
+  } else {
+// Possibly indeterminate value but not modeled.
+clang_analyzer_isTainted(c[1]); // expected-warning {{YES}}
+clang_analyzer_dump(c[1]); // expected-warning {{conj_}}
+  }
+  break;
+
+case 2:
+  if (success) {
+long p = c[2]; // Unknown value but not garbage.
+// FIXME: Taint analysis only marks the first byte of a memory region. 
See getPointeeOf in GenericTaintChecker.cpp.
+clang_analyzer_isTainted(c[2]); // expected-warning {{NO}}
+clang_analyzer_dump(c[2]); // expected-warning {{conj_}}
+  } else {
+// Possibly indeterminate value but not modeled.
+clang_analyzer_isTainted(c[2]); // expected-warning {{NO}} // FIXME: 
See above.
+clang_analyzer_dump(c[2]); // expected-warning {{conj_}}
+  }
+  break;
+
+case 3:
+  // c[3] is not mutated by fread.
+  if (success) {
+long p = c[3]; // expected-warning {{Assigned value is garbage or 
undefined}}
+  } else {
+long p = c[3]; // 

[clang] [analyzer] Refine invalidation caused by `fread` (PR #93408)

2024-06-06 Thread Balazs Benics via cfe-commits

https://github.com/steakhal edited 
https://github.com/llvm/llvm-project/pull/93408
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Refine invalidation caused by `fread` (PR #93408)

2024-06-06 Thread Balazs Benics via cfe-commits


@@ -0,0 +1,412 @@
+// RUN: %clang_analyze_cc1 -verify %s \
+// RUN:   -triple x86_64-linux-gnu  \
+// RUN:   -analyzer-checker=core,unix.Stream,alpha.security.taint \
+// RUN:   -analyzer-checker=debug.ExprInspection
+
+#include "Inputs/system-header-simulator-for-simple-stream.h"
+
+#define EOF (-1)
+
+void clang_analyzer_dump(int);
+void clang_analyzer_isTainted(int);
+void clang_analyzer_warnIfReached(void);
+
+// A stream is only tracked by StreamChecker if it results from a call to 
"fopen".
+// Otherwise, there is no specific modelling of "fread".
+void untracked_stream(FILE *fp) {
+  char c;
+  if (1 == fread(, 1, 1, fp)) {
+char p = c; // Unknown value but not garbage and not modeled by checker.
+  } else {
+char p = c; // Possibly indeterminate value but not modeled by checker.
+  }
+}
+
+void fgetc_props_taint(void) {
+  FILE *fp = fopen("/home/test", "rb+");
+  if (fp) {
+int c = fgetc(fp); // c is tainted.
+if (c != EOF) {
+  clang_analyzer_isTainted(c); // expected-warning{{YES}}
+}
+fclose(fp);
+  }
+}
+
+void fread_props_taint(void) {
+  FILE *fp = fopen("/home/test", "rb+");
+  if (fp) {
+char buffer[10];
+int c = fread(buffer, 1, 10, fp); // c is tainted.
+if (c != 10) {
+  // If the read failed, then the number of bytes successfully read should 
be tainted.
+  clang_analyzer_isTainted(c); // expected-warning{{YES}}
+}
+fclose(fp);
+  }
+}
+
+void read_one_byte1(void) {
+  FILE *fp = fopen("/home/test", "rb+");
+  if (fp) {
+char c;
+if (1 == fread(, 1, 1, fp)) {
+  char p = c; // Unknown value but not garbage.
+  clang_analyzer_isTainted(p); // expected-warning{{YES}}
+} else {
+  char p = c; // Possibly indeterminate value but not modeled by checker.
+  clang_analyzer_isTainted(p); // expected-warning{{YES}}
+}
+fclose(fp);
+  }
+}
+
+void read_one_byte2(char *buffer) {
+  FILE *fp = fopen("/home/test", "rb+");
+  if (fp) {
+if (1 == fread(buffer, 1, 1, fp)) {
+  char p = buffer[0]; // Unknown value but not garbage.
+  clang_analyzer_isTainted(p); // expected-warning{{YES}}
+} else {
+  char p = buffer[0]; // Possibly indeterminate value but not modeled by 
checker.
+  clang_analyzer_isTainted(p); // expected-warning{{YES}}
+}
+fclose(fp);
+  }
+}
+
+void read_one_byte3(char *buffer) {
+  buffer[1] = 10;
+  FILE *fp = fopen("/home/test", "rb+");
+  if (fp) {
+// buffer[1] is not mutated by fread and remains not tainted.
+fread(buffer, 1, 1, fp);
+char p = buffer[1];
+clang_analyzer_isTainted(p); // expected-warning{{NO}}
+clang_analyzer_dump(buffer[1]); // expected-warning{{10 S32b}}
+fclose(fp);
+  }
+}
+
+void read_many_bytes(char *buffer) {
+  FILE *fp = fopen("/home/test", "rb+");
+  if (fp) {
+if (42 == fread(buffer, 1, 42, fp)) {
+  char p = buffer[0]; // Unknown value but not garbage.
+  clang_analyzer_isTainted(p); // expected-warning{{YES}}
+} else {
+  char p = buffer[0]; // Possibly indeterminate value but not modeled.
+  clang_analyzer_isTainted(p); // expected-warning{{YES}}
+}
+fclose(fp);
+  }
+}
+
+void random_access_write1(int index) {
+  FILE *fp = fopen("/home/test", "rb+");
+  if (fp) {
+long c[4];
+int success = 2 == fread(c + 1, sizeof(long), 2, fp);
+
+switch (index) {
+case 0:
+  // c[0] is not mutated by fread.
+  if (success) {
+char p = c[0]; // expected-warning {{Assigned value is garbage or 
undefined}} We kept the first byte intact.
+  } else {
+char p = c[0]; // expected-warning {{Assigned value is garbage or 
undefined}} We kept the first byte intact.
+  }
+  break;
+
+case 1:
+  if (success) {
+// Unknown value but not garbage.
+clang_analyzer_isTainted(c[1]); // expected-warning {{YES}}
+clang_analyzer_dump(c[1]); // expected-warning {{conj_}}
+  } else {
+// Possibly indeterminate value but not modeled.
+clang_analyzer_isTainted(c[1]); // expected-warning {{YES}}
+clang_analyzer_dump(c[1]); // expected-warning {{conj_}}
+  }
+  break;
+
+case 2:
+  if (success) {
+long p = c[2]; // Unknown value but not garbage.
+// FIXME: Taint analysis only marks the first byte of a memory region. 
See getPointeeOf in GenericTaintChecker.cpp.
+clang_analyzer_isTainted(c[2]); // expected-warning {{NO}}
+clang_analyzer_dump(c[2]); // expected-warning {{conj_}}
+  } else {
+// Possibly indeterminate value but not modeled.
+clang_analyzer_isTainted(c[2]); // expected-warning {{NO}} // FIXME: 
See above.
+clang_analyzer_dump(c[2]); // expected-warning {{conj_}}
+  }
+  break;
+
+case 3:
+  // c[3] is not mutated by fread.
+  if (success) {
+long p = c[3]; // expected-warning {{Assigned value is garbage or 
undefined}}
+  } else {
+long p = c[3]; // 

[clang] [analyzer] Refine invalidation caused by `fread` (PR #93408)

2024-06-06 Thread Balazs Benics via cfe-commits


@@ -907,6 +945,73 @@ void StreamChecker::preWrite(const FnDescription *Desc, 
const CallEvent ,
   C.addTransition(State);
 }
 
+static std::optional getPointeeType(const MemRegion *R) {
+  if (!R)
+return std::nullopt;
+  if (const auto *ER = dyn_cast(R))
+return ER->getElementType();
+  if (const auto *TR = dyn_cast(R))
+return TR->getValueType();
+  if (const auto *SR = dyn_cast(R))
+return SR->getPointeeStaticType();
+  return std::nullopt;
+}
+
+static std::optional getStartIndex(SValBuilder ,
+   const MemRegion *R) {
+  if (!R)
+return std::nullopt;
+
+  auto Zero = [] {
+BasicValueFactory  = SVB.getBasicValueFactory();
+return nonloc::ConcreteInt(BVF.getIntValue(0, /*isUnsigned=*/false));
+  };

steakhal wrote:

This way it's lazily evaluated.

https://github.com/llvm/llvm-project/pull/93408
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] New optin.taint.TaintedAlloc checker for catching unbounded memory allocation calls (PR #92420)

2024-06-05 Thread Balazs Benics via cfe-commits

https://github.com/steakhal approved this pull request.


https://github.com/llvm/llvm-project/pull/92420
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Factor out NoOwnershipChangeVisitor (PR #94357)

2024-06-05 Thread Balazs Benics via cfe-commits

steakhal wrote:

Is this NFC?

https://github.com/llvm/llvm-project/pull/94357
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] New optin.taint.TaintedAlloc checker for catching unbounded memory allocation calls (PR #92420)

2024-06-05 Thread Balazs Benics via cfe-commits


@@ -938,6 +938,53 @@ optin.portability.UnixAPI
 "
 Finds implementation-defined behavior in UNIX/Posix functions.
 
+.. _optin-taint-TaintedAlloc:
+
+optin.taint.TaintedAlloc (C, C++)
+"""

steakhal wrote:

```suggestion
"
```

https://github.com/llvm/llvm-project/pull/92420
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Refine invalidation caused by `fread` (PR #93408)

2024-06-03 Thread Balazs Benics via cfe-commits

steakhal wrote:

Checkout the new implementation and the added "weird" fread tests.
FYI unfortunately our store interferes a bit, as you will see in the last test 
(`test_unaligned_start_read`) when the store does not purge the previous 
binding when we have an overlapping write to the store.
Consequently, when reading, we can read the old value - which was partially 
overwritten by the escape/invalidation. 

https://github.com/llvm/llvm-project/pull/93408
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Refine invalidation caused by `fread` (PR #93408)

2024-06-03 Thread Balazs Benics via cfe-commits


@@ -937,8 +990,21 @@ void StreamChecker::evalFreadFwrite(const FnDescription 
*Desc,
 
   // At read, invalidate the buffer in any case of error or success,
   // except if EOF was already present.
-  if (IsFread && !E.isStreamEof())
-State = escapeArgs(State, C, Call, {0});
+  if (IsFread && !E.isStreamEof()) {
+// Try to invalidate the individual elements.
+if (const auto *BufferFirstElem =
+dyn_cast_or_null(Call.getArgSVal(0).getAsRegion())) 
{

steakhal wrote:

Implemented.

https://github.com/llvm/llvm-project/pull/93408
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Refine invalidation caused by `fread` (PR #93408)

2024-06-03 Thread Balazs Benics via cfe-commits


@@ -717,18 +717,71 @@ const ExplodedNode 
*StreamChecker::getAcquisitionSite(const ExplodedNode *N,
   return nullptr;
 }
 
+/// Invalidate only the requested elements instead of the whole buffer.
+/// This is basically a refinement of the more generic 'escapeArgs' or
+/// the plain old 'invalidateRegions'.
+/// This only works if the \p StartIndex and \p Count are concrete or
+/// perfectly-constrained.
+static ProgramStateRef
+escapeByStartIndexAndCount(ProgramStateRef State, CheckerContext ,
+   const CallEvent , const MemRegion *Buffer,
+   QualType ElemType, SVal StartIndex, SVal Count) {
+  if (!llvm::isa_and_nonnull(Buffer))
+return State;
+
+  auto UnboxAsInt = [, ](SVal V) -> std::optional {
+auto  = C.getSValBuilder();
+if (const llvm::APSInt *Int = SVB.getKnownValue(State, V))
+  return Int->tryExtValue();
+return std::nullopt;
+  };
+
+  auto StartIndexVal = UnboxAsInt(StartIndex);
+  auto CountVal = UnboxAsInt(Count);
+
+  // FIXME: Maybe we could make this more generic, and expose this by the
+  // 'invalidateRegions' API. After doing so, it might make sense to make this
+  // limit configurable.
+  constexpr int MaxInvalidatedElementsLimit = 64;
+  if (!StartIndexVal || !CountVal || *CountVal > MaxInvalidatedElementsLimit) {
+return State->invalidateRegions({loc::MemRegionVal{Buffer}},
+Call.getOriginExpr(), C.blockCount(),
+C.getLocationContext(),
+/*CausesPointerEscape=*/false);
+  }
+
+  constexpr auto DoNotInvalidateSuperRegion =
+  RegionAndSymbolInvalidationTraits::InvalidationKinds::
+  TK_DoNotInvalidateSuperRegion;
+
+  auto  = Buffer->getMemRegionManager();
+  SmallVector EscapingVals;
+  EscapingVals.reserve(*CountVal);
+
+  RegionAndSymbolInvalidationTraits ITraits;
+  for (auto Idx : llvm::seq(*StartIndexVal, *StartIndexVal + *CountVal)) {

steakhal wrote:

Implemented.

https://github.com/llvm/llvm-project/pull/93408
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Refine invalidation caused by `fread` (PR #93408)

2024-06-03 Thread Balazs Benics via cfe-commits

https://github.com/steakhal updated 
https://github.com/llvm/llvm-project/pull/93408

>From f9e841ddaa865d529c806b2d115d5ddbc7109243 Mon Sep 17 00:00:00 2001
From: Balazs Benics 
Date: Sun, 26 May 2024 11:40:01 +0200
Subject: [PATCH 1/8] [analyzer] Refine invalidation caused by `fread`

This change enables more accurate modeling of the write effects of `fread`.
In particular, instead of invalidating the whole buffer, in a best-effort
basis, we would try to invalidate the actually accesses elements of the buffer.
This preserves the previous value of the buffer of the unaffected slots.
As a result, diagnose more uninitialized buffer uses for example.

Currently, this refined invalidation only triggers for `fread` if and
only if the `count` parameter and the buffer pointer's index component
are concrete or perfectly-constrained symbols.
Additionally, if the `fread` would read more than 64 elements, the whole
buffer is invalidated as before. This is to have safeguards against
performance issues.

Refer to the comments of the assertions in the following example to see
the changes in the diagnostics:

```c++
void demo() {
  FILE *fp = fopen("/home/test", "rb+");
  if (!fp) return;
  int buffer[10]; // uninitialized
  int read_items = fread(buffer+1, sizeof(int), 5, fp);
  if (5 == read_items) {
int v1 = buffer[1]; // Unknown value but not garbage.
clang_analyzer_isTainted(v1); // expected-warning {{YES}} <-- Would be "NO" 
without this patch.
clang_analyzer_dump(v1); // expected-warning {{conj_}} <-- Not a "derived" 
symbol, so it's directly invalidated now.
int v0 = buffer[0]; // expected-warning {{Assigned value is garbage or 
undefined}} <-- Had no report here before.
(void)(v1 + v0);
  } else {
// If 'fread' had an error.
int v0 = buffer[0]; // expected-warning {{Assigned value is garbage or 
undefined}} <-- Had no report here before.
(void)v0;
  }
  fclose(fp);
}
```

[CPP-3247](https://sonarsource.atlassian.net/browse/CPP-3247)

Patch by Marco Borgeaud (marco-antognini-sonarsource)
---
 .../StaticAnalyzer/Checkers/StreamChecker.cpp |  88 -
 clang/test/Analysis/fread.cpp | 328 ++
 2 files changed, 405 insertions(+), 11 deletions(-)
 create mode 100644 clang/test/Analysis/fread.cpp

diff --git a/clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp 
b/clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp
index d4e020f7a72a0..7b42c4f72b322 100644
--- a/clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp
+++ b/clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp
@@ -717,18 +717,71 @@ const ExplodedNode 
*StreamChecker::getAcquisitionSite(const ExplodedNode *N,
   return nullptr;
 }
 
+/// Invalidate only the requested elements instead of the whole buffer.
+/// This is basically a refinement of the more generic 'escapeArgs' or
+/// the plain old 'invalidateRegions'.
+/// This only works if the \p StartIndex and \p Count are concrete or
+/// perfectly-constrained.
+static ProgramStateRef
+escapeByStartIndexAndCount(ProgramStateRef State, CheckerContext ,
+   const CallEvent , const MemRegion *Buffer,
+   QualType ElemType, SVal StartIndex, SVal Count) {
+  if (!llvm::isa_and_nonnull(Buffer))
+return State;
+
+  auto UnboxAsInt = [, ](SVal V) -> std::optional {
+auto  = C.getSValBuilder();
+if (const llvm::APSInt *Int = SVB.getKnownValue(State, V))
+  return Int->tryExtValue();
+return std::nullopt;
+  };
+
+  auto StartIndexVal = UnboxAsInt(StartIndex);
+  auto CountVal = UnboxAsInt(Count);
+
+  // FIXME: Maybe we could make this more generic, and expose this by the
+  // 'invalidateRegions' API. After doing so, it might make sense to make this
+  // limit configurable.
+  constexpr int MaxInvalidatedElementsLimit = 64;
+  if (!StartIndexVal || !CountVal || *CountVal > MaxInvalidatedElementsLimit) {
+return State->invalidateRegions({loc::MemRegionVal{Buffer}},
+Call.getOriginExpr(), C.blockCount(),
+C.getLocationContext(),
+/*CausesPointerEscape=*/false);
+  }
+
+  constexpr auto DoNotInvalidateSuperRegion =
+  RegionAndSymbolInvalidationTraits::InvalidationKinds::
+  TK_DoNotInvalidateSuperRegion;
+
+  auto  = Buffer->getMemRegionManager();
+  SmallVector EscapingVals;
+  EscapingVals.reserve(*CountVal);
+
+  RegionAndSymbolInvalidationTraits ITraits;
+  for (auto Idx : llvm::seq(*StartIndexVal, *StartIndexVal + *CountVal)) {
+NonLoc Index = C.getSValBuilder().makeArrayIndex(Idx);
+const auto *Element = RegionManager.getElementRegion(
+ElemType, Index, cast(Buffer), C.getASTContext());
+EscapingVals.push_back(loc::MemRegionVal(Element));
+ITraits.setTrait(Element, DoNotInvalidateSuperRegion);
+  }
+  return State->invalidateRegions(EscapingVals, Call.getOriginExpr(),
+  C.blockCount(), 

[clang] [analyzer] Refine invalidation caused by `fread` (PR #93408)

2024-06-03 Thread Balazs Benics via cfe-commits

steakhal wrote:

Fixed most NFC typos and suggestions.
Let's continue the discussion.

https://github.com/llvm/llvm-project/pull/93408
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Refine invalidation caused by `fread` (PR #93408)

2024-06-03 Thread Balazs Benics via cfe-commits

https://github.com/steakhal updated 
https://github.com/llvm/llvm-project/pull/93408

>From f9e841ddaa865d529c806b2d115d5ddbc7109243 Mon Sep 17 00:00:00 2001
From: Balazs Benics 
Date: Sun, 26 May 2024 11:40:01 +0200
Subject: [PATCH 1/6] [analyzer] Refine invalidation caused by `fread`

This change enables more accurate modeling of the write effects of `fread`.
In particular, instead of invalidating the whole buffer, in a best-effort
basis, we would try to invalidate the actually accesses elements of the buffer.
This preserves the previous value of the buffer of the unaffected slots.
As a result, diagnose more uninitialized buffer uses for example.

Currently, this refined invalidation only triggers for `fread` if and
only if the `count` parameter and the buffer pointer's index component
are concrete or perfectly-constrained symbols.
Additionally, if the `fread` would read more than 64 elements, the whole
buffer is invalidated as before. This is to have safeguards against
performance issues.

Refer to the comments of the assertions in the following example to see
the changes in the diagnostics:

```c++
void demo() {
  FILE *fp = fopen("/home/test", "rb+");
  if (!fp) return;
  int buffer[10]; // uninitialized
  int read_items = fread(buffer+1, sizeof(int), 5, fp);
  if (5 == read_items) {
int v1 = buffer[1]; // Unknown value but not garbage.
clang_analyzer_isTainted(v1); // expected-warning {{YES}} <-- Would be "NO" 
without this patch.
clang_analyzer_dump(v1); // expected-warning {{conj_}} <-- Not a "derived" 
symbol, so it's directly invalidated now.
int v0 = buffer[0]; // expected-warning {{Assigned value is garbage or 
undefined}} <-- Had no report here before.
(void)(v1 + v0);
  } else {
// If 'fread' had an error.
int v0 = buffer[0]; // expected-warning {{Assigned value is garbage or 
undefined}} <-- Had no report here before.
(void)v0;
  }
  fclose(fp);
}
```

[CPP-3247](https://sonarsource.atlassian.net/browse/CPP-3247)

Patch by Marco Borgeaud (marco-antognini-sonarsource)
---
 .../StaticAnalyzer/Checkers/StreamChecker.cpp |  88 -
 clang/test/Analysis/fread.cpp | 328 ++
 2 files changed, 405 insertions(+), 11 deletions(-)
 create mode 100644 clang/test/Analysis/fread.cpp

diff --git a/clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp 
b/clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp
index d4e020f7a72a0..7b42c4f72b322 100644
--- a/clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp
+++ b/clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp
@@ -717,18 +717,71 @@ const ExplodedNode 
*StreamChecker::getAcquisitionSite(const ExplodedNode *N,
   return nullptr;
 }
 
+/// Invalidate only the requested elements instead of the whole buffer.
+/// This is basically a refinement of the more generic 'escapeArgs' or
+/// the plain old 'invalidateRegions'.
+/// This only works if the \p StartIndex and \p Count are concrete or
+/// perfectly-constrained.
+static ProgramStateRef
+escapeByStartIndexAndCount(ProgramStateRef State, CheckerContext ,
+   const CallEvent , const MemRegion *Buffer,
+   QualType ElemType, SVal StartIndex, SVal Count) {
+  if (!llvm::isa_and_nonnull(Buffer))
+return State;
+
+  auto UnboxAsInt = [, ](SVal V) -> std::optional {
+auto  = C.getSValBuilder();
+if (const llvm::APSInt *Int = SVB.getKnownValue(State, V))
+  return Int->tryExtValue();
+return std::nullopt;
+  };
+
+  auto StartIndexVal = UnboxAsInt(StartIndex);
+  auto CountVal = UnboxAsInt(Count);
+
+  // FIXME: Maybe we could make this more generic, and expose this by the
+  // 'invalidateRegions' API. After doing so, it might make sense to make this
+  // limit configurable.
+  constexpr int MaxInvalidatedElementsLimit = 64;
+  if (!StartIndexVal || !CountVal || *CountVal > MaxInvalidatedElementsLimit) {
+return State->invalidateRegions({loc::MemRegionVal{Buffer}},
+Call.getOriginExpr(), C.blockCount(),
+C.getLocationContext(),
+/*CausesPointerEscape=*/false);
+  }
+
+  constexpr auto DoNotInvalidateSuperRegion =
+  RegionAndSymbolInvalidationTraits::InvalidationKinds::
+  TK_DoNotInvalidateSuperRegion;
+
+  auto  = Buffer->getMemRegionManager();
+  SmallVector EscapingVals;
+  EscapingVals.reserve(*CountVal);
+
+  RegionAndSymbolInvalidationTraits ITraits;
+  for (auto Idx : llvm::seq(*StartIndexVal, *StartIndexVal + *CountVal)) {
+NonLoc Index = C.getSValBuilder().makeArrayIndex(Idx);
+const auto *Element = RegionManager.getElementRegion(
+ElemType, Index, cast(Buffer), C.getASTContext());
+EscapingVals.push_back(loc::MemRegionVal(Element));
+ITraits.setTrait(Element, DoNotInvalidateSuperRegion);
+  }
+  return State->invalidateRegions(EscapingVals, Call.getOriginExpr(),
+  C.blockCount(), 

[clang] [analyzer] Refine invalidation caused by `fread` (PR #93408)

2024-06-03 Thread Balazs Benics via cfe-commits


@@ -717,18 +717,71 @@ const ExplodedNode 
*StreamChecker::getAcquisitionSite(const ExplodedNode *N,
   return nullptr;
 }
 
+/// Invalidate only the requested elements instead of the whole buffer.
+/// This is basically a refinement of the more generic 'escapeArgs' or
+/// the plain old 'invalidateRegions'.
+/// This only works if the \p StartIndex and \p Count are concrete or
+/// perfectly-constrained.
+static ProgramStateRef
+escapeByStartIndexAndCount(ProgramStateRef State, CheckerContext ,
+   const CallEvent , const MemRegion *Buffer,
+   QualType ElemType, SVal StartIndex, SVal Count) {
+  if (!llvm::isa_and_nonnull(Buffer))
+return State;
+
+  auto UnboxAsInt = [, ](SVal V) -> std::optional {
+auto  = C.getSValBuilder();
+if (const llvm::APSInt *Int = SVB.getKnownValue(State, V))
+  return Int->tryExtValue();
+return std::nullopt;
+  };
+
+  auto StartIndexVal = UnboxAsInt(StartIndex);
+  auto CountVal = UnboxAsInt(Count);
+
+  // FIXME: Maybe we could make this more generic, and expose this by the
+  // 'invalidateRegions' API. After doing so, it might make sense to make this
+  // limit configurable.
+  constexpr int MaxInvalidatedElementsLimit = 64;
+  if (!StartIndexVal || !CountVal || *CountVal > MaxInvalidatedElementsLimit) {
+return State->invalidateRegions({loc::MemRegionVal{Buffer}},
+Call.getOriginExpr(), C.blockCount(),
+C.getLocationContext(),
+/*CausesPointerEscape=*/false);
+  }
+
+  constexpr auto DoNotInvalidateSuperRegion =
+  RegionAndSymbolInvalidationTraits::InvalidationKinds::
+  TK_DoNotInvalidateSuperRegion;
+
+  auto  = Buffer->getMemRegionManager();
+  SmallVector EscapingVals;
+  EscapingVals.reserve(*CountVal);
+
+  RegionAndSymbolInvalidationTraits ITraits;
+  for (auto Idx : llvm::seq(*StartIndexVal, *StartIndexVal + *CountVal)) {

steakhal wrote:

Yes, this is a problem.
To properly invalidate element-wise, I must bind the correctly typed conjured 
symbols so that subsequent lookups in the store would hit the entry.
Consequently, I can't consider the buffer as a char array and invalidate each 
chars in the range.
I must use `ElementRegions`, thus I need properly aligned item-wise iteration 
and invalidation.

So, I could try to convert the element size and item count into a start byte 
offset and length in bytes. After this I could check if the start bytes offset 
would land on an item boundary and if the length would cover a whole element 
object.

This way we could cover the case like:
```
int buffer[100]; // assuming 4 byte ints.
fread(buffer + 1, 2, 10, file); // Reads 20 bytes of data, which is dividable 
by 4, thus we can think of the 'buffer[1:6]' elements as individually 
invalidated.
``` 

WDYT?

https://github.com/llvm/llvm-project/pull/93408
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Refine invalidation caused by `fread` (PR #93408)

2024-06-03 Thread Balazs Benics via cfe-commits


@@ -937,8 +990,21 @@ void StreamChecker::evalFreadFwrite(const FnDescription 
*Desc,
 
   // At read, invalidate the buffer in any case of error or success,
   // except if EOF was already present.
-  if (IsFread && !E.isStreamEof())
-State = escapeArgs(State, C, Call, {0});
+  if (IsFread && !E.isStreamEof()) {
+// Try to invalidate the individual elements.
+if (const auto *BufferFirstElem =
+dyn_cast_or_null(Call.getArgSVal(0).getAsRegion())) 
{

steakhal wrote:

In case `fread` reads to the beginning of a buffer, we won't have an 
`ElementRegion`, thus the heuristic for eagerly binding the invalidated 
elements won't trigger. This is unfortunate, but you can think of this as we 
keep the previous behavior.
To circumvent this, I'd need to know the type for for the pointee.
This would imply that I should special-case `TypedValueRegion` and 
`SymbolicRegion`.

I'll think about it.

https://github.com/llvm/llvm-project/pull/93408
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][analyzer] Move unix.BlockInCriticalSection out of alpha (PR #93815)

2024-06-03 Thread Balazs Benics via cfe-commits
Endre =?utf-8?q?F=C3=BCl=C3=B6p?= 
Message-ID:
In-Reply-To: 


https://github.com/steakhal approved this pull request.


https://github.com/llvm/llvm-project/pull/93815
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][analyzer] Move PutenvStackArrayChecker out of alpha package (PR #93980)

2024-06-03 Thread Balazs Benics via cfe-commits
=?utf-8?q?Bal=C3=A1zs_K=C3=A9ri?= 
Message-ID:
In-Reply-To: 


https://github.com/steakhal approved this pull request.


https://github.com/llvm/llvm-project/pull/93980
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][analyzer] Move PutenvStackArrayChecker out of alpha package (PR #93980)

2024-06-03 Thread Balazs Benics via cfe-commits
=?utf-8?q?Balázs_Kéri?= 
Message-ID:
In-Reply-To: 


https://github.com/steakhal edited 
https://github.com/llvm/llvm-project/pull/93980
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][analyzer] Improved PointerSubChecker (PR #93676)

2024-06-01 Thread Balazs Benics via cfe-commits
=?utf-8?q?Balázs_Kéri?= ,
=?utf-8?q?Balázs_Kéri?= 
Message-ID:
In-Reply-To: 



@@ -0,0 +1,74 @@
+// RUN: %clang_analyze_cc1 -analyzer-checker=alpha.core.PointerSub -verify %s
+
+void f1(void) {
+  int x, y, z[10];
+  int d =  -  // expected-warning{{Subtraction of two pointers that do 
not point into the same array is undefined behavior}}
+  d = z -  // expected-warning{{Subtraction of two pointers that do not 
point into the same array is undefined behavior}}
+  d =  -  // expected-warning{{Subtraction of two pointers that do not 
point into the same array is undefined behavior}}
+  d = (long*) - (long*)
+}
+
+void f2(void) {
+  int a[10], b[10], c;
+  int *p = [2];
+  int *q = [8];
+  int d = q - p; // no-warning
+
+  q = [3];
+  d = q - p; // expected-warning{{Subtraction of two pointers that}}
+
+  q = a + 10;
+  d = q - p; // no warning (use of pointer to one after the end is allowed)
+  d = [4] - a; // no warning
+
+  q = a + 11;
+  d = q - a; // ?
+
+  d =  - p; // expected-warning{{Subtraction of two pointers that}}
+}
+
+void f3(void) {
+  int a[3][4];
+  int d;
+
+  d = &(a[2]) - &(a[1]);
+  d = a[2] - a[1]; // expected-warning{{Subtraction of two pointers that}}
+  d = a[1] - a[1];
+  d = &(a[1][2]) - &(a[1][0]);
+  d = &(a[1][2]) - &(a[0][0]); // expected-warning{{Subtraction of two 
pointers that}}
+}
+
+void f4(void) {
+  int n = 4, m = 3;
+  int a[n][m];
+  int (*p)[m] = a; // p == [0]
+  p += 1; // p == [1]
+  int d = p - a; // d == 1 // expected-warning{{subtraction of pointers to 
type 'int[m]' of zero size has undefined behavior}}
+
+  d = &(a[2]) - &(a[1]); // expected-warning{{subtraction of pointers to type 
'int[m]' of zero size has undefined behavior}}
+  d = a[2] - a[1]; // expected-warning{{Subtraction of two pointers that}}

steakhal wrote:

Ah, that's gotta be it. Thanks!

https://github.com/llvm/llvm-project/pull/93676
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][analyzer] Move PutenvStackArrayChecker out of alpha package. (PR #93980)

2024-06-01 Thread Balazs Benics via cfe-commits


@@ -1179,6 +1179,41 @@ security.insecureAPI.DeprecatedOrUnsafeBufferHandling (C)
strncpy(buf, "a", 1); // warn
  }
 
+.. _security-putenv-stack-array:
+
+security.PutenvStackArray (C)
+"
+Finds calls to the ``putenv`` function which pass a pointer to a 
stack-allocated
+(automatic) array as the argument. Function ``putenv`` does not copy the passed
+string, only a pointer to the data is stored and this data can be read even by
+other threads. Content of a stack-allocated array is likely to be overwritten
+after returning from the parent function.

steakhal wrote:

```suggestion
after returning to the parent function.
```

https://github.com/llvm/llvm-project/pull/93980
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][analyzer] Move PutenvStackArrayChecker out of alpha package. (PR #93980)

2024-06-01 Thread Balazs Benics via cfe-commits

https://github.com/steakhal edited 
https://github.com/llvm/llvm-project/pull/93980
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][analyzer] Move PutenvStackArrayChecker out of alpha package. (PR #93980)

2024-06-01 Thread Balazs Benics via cfe-commits


@@ -1179,6 +1179,41 @@ security.insecureAPI.DeprecatedOrUnsafeBufferHandling (C)
strncpy(buf, "a", 1); // warn
  }
 
+.. _security-putenv-stack-array:
+
+security.PutenvStackArray (C)
+"
+Finds calls to the ``putenv`` function which pass a pointer to a 
stack-allocated
+(automatic) array as the argument. Function ``putenv`` does not copy the passed
+string, only a pointer to the data is stored and this data can be read even by
+other threads. Content of a stack-allocated array is likely to be overwritten
+after returning from the parent function.
+
+The problem can be solved by using a static array variable or dynamically
+allocated memory. Even better is to avoid using ``putenv`` (it has other
+problems related to memory leaks) and use ``setenv`` instead.
+
+The check corresponds to CERT rule
+`POS34-C. Do not call putenv() with a pointer to an automatic variable as the 
argument
+`_.
+
+.. code-block:: c
+
+  int f() {
+char env[] = "NAME=value";
+return putenv(env); // putenv function should not be called with 
stack-allocated string
+  }
+
+There is one case where the checker can report a false positive. This is when
+the stack-allocated array is used at `putenv` in a function or code branch that
+does not return (calls `fork` or `exec` like function).

steakhal wrote:

I feel this isn't accurate. Both `fork` and `exec` eventually returns.
I think the reason for having an FP for these is because these will copy the 
environment variables and not directly use the one we manipulated by `putenv`.

https://github.com/llvm/llvm-project/pull/93980
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][analyzer] Move PutenvStackArrayChecker out of alpha package. (PR #93980)

2024-06-01 Thread Balazs Benics via cfe-commits

https://github.com/steakhal commented:

I have only a handful of remarks. LGTM otherwise.

https://github.com/llvm/llvm-project/pull/93980
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][analyzer] Move unix.BlockInCriticalSection out of alpha (PR #93815)

2024-06-01 Thread Balazs Benics via cfe-commits


@@ -1235,6 +1235,49 @@ Check calls to various UNIX/Posix functions: ``open, 
pthread_once, calloc, mallo
 .. literalinclude:: checkers/unix_api_example.c
 :language: c
 
+.. _unix-BlockInCriticalSection:
+
+unix.BlockInCriticalSection (C)
+"
+Check for calls to blocking functions inside a critical section.
+Blocking functions detected by this checker: ``sleep, getc, fgets, read, 
recv``.
+Critical section handling functions modelled by this checker: ``lock, unlock, 
pthread_mutex_lock, pthread_mutex_trylock, pthread_mutex_unlock, mtx_lock, 
mtx_timedlock, mtx_trylock, mtx_unlock, lock_guard, unique_lock``.

steakhal wrote:

This line is probably already too long. Let's break it.

https://github.com/llvm/llvm-project/pull/93815
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][analyzer] Move unix.BlockInCriticalSection out of alpha (PR #93815)

2024-06-01 Thread Balazs Benics via cfe-commits


@@ -1235,6 +1235,49 @@ Check calls to various UNIX/Posix functions: ``open, 
pthread_once, calloc, mallo
 .. literalinclude:: checkers/unix_api_example.c
 :language: c
 
+.. _unix-BlockInCriticalSection:
+
+unix.BlockInCriticalSection (C)
+"
+Check for calls to blocking functions inside a critical section.
+Blocking functions detected by this checker: ``sleep, getc, fgets, read, 
recv``.
+Critical section handling functions modelled by this checker: ``lock, unlock, 
pthread_mutex_lock, pthread_mutex_trylock, pthread_mutex_unlock, mtx_lock, 
mtx_timedlock, mtx_trylock, mtx_unlock, lock_guard, unique_lock``.
+
+.. code-block:: c
+
+ void pthread_lock_example(pthread_mutex_t *m) {
+   pthread_mutex_lock(m); // note: entering critical section here
+   sleep(10); // warn: Call to blocking function 'sleep' inside of critical 
section
+   pthread_mutex_unlock(m);
+ }
+
+.. code-block:: cpp
+
+ void overlapping_critical_sections(mtx_t *m1, std::mutex ) {
+   std::lock_guard lg{m2}; // note: entering critical section here
+   mtx_lock(m1); // note: entering critical section here
+   sleep(10); // warn: Call to blocking function 'sleep' inside of critical 
section
+   mtx_unlock(m1);
+   sleep(10); // warn: Call to blocking function 'sleep' inside of critical 
section
+ // still inside of the critical section of the std::lock_guard

steakhal wrote:

```suggestion
  // still inside of the critical section of the std::lock_guard
```

https://github.com/llvm/llvm-project/pull/93815
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][analyzer] Move unix.BlockInCriticalSection out of alpha (PR #93815)

2024-06-01 Thread Balazs Benics via cfe-commits


@@ -1235,6 +1235,49 @@ Check calls to various UNIX/Posix functions: ``open, 
pthread_once, calloc, mallo
 .. literalinclude:: checkers/unix_api_example.c
 :language: c
 
+.. _unix-BlockInCriticalSection:
+
+unix.BlockInCriticalSection (C)
+"
+Check for calls to blocking functions inside a critical section.
+Blocking functions detected by this checker: ``sleep, getc, fgets, read, 
recv``.
+Critical section handling functions modelled by this checker: ``lock, unlock, 
pthread_mutex_lock, pthread_mutex_trylock, pthread_mutex_unlock, mtx_lock, 
mtx_timedlock, mtx_trylock, mtx_unlock, lock_guard, unique_lock``.

steakhal wrote:

```suggestion
Critical section handling functions modeled by this checker:
``lock, unlock, pthread_mutex_lock, pthread_mutex_trylock, 
pthread_mutex_unlock, mtx_lock, mtx_timedlock, mtx_trylock, mtx_unlock, 
lock_guard, unique_lock``.
```

https://github.com/llvm/llvm-project/pull/93815
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][analyzer] Move unix.BlockInCriticalSection out of alpha (PR #93815)

2024-06-01 Thread Balazs Benics via cfe-commits

https://github.com/steakhal approved this pull request.

LGTM. Minor typos/doc suggestions.

https://github.com/llvm/llvm-project/pull/93815
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][analyzer] Move unix.BlockInCriticalSection out of alpha (PR #93815)

2024-06-01 Thread Balazs Benics via cfe-commits


@@ -1235,6 +1235,49 @@ Check calls to various UNIX/Posix functions: ``open, 
pthread_once, calloc, mallo
 .. literalinclude:: checkers/unix_api_example.c
 :language: c
 
+.. _unix-BlockInCriticalSection:
+
+unix.BlockInCriticalSection (C)
+"

steakhal wrote:

```suggestion
unix.BlockInCriticalSection (C, C++)

```

https://github.com/llvm/llvm-project/pull/93815
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][analyzer] Move unix.BlockInCriticalSection out of alpha (PR #93815)

2024-06-01 Thread Balazs Benics via cfe-commits

https://github.com/steakhal edited 
https://github.com/llvm/llvm-project/pull/93815
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][analyzer] Move unix.BlockInCriticalSection out of alpha (PR #93815)

2024-06-01 Thread Balazs Benics via cfe-commits

https://github.com/steakhal edited 
https://github.com/llvm/llvm-project/pull/93815
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer][NFC] Fix comparison to True/False (PR #94038)

2024-06-01 Thread Balazs Benics via cfe-commits

https://github.com/steakhal closed 
https://github.com/llvm/llvm-project/pull/94038
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer][NFC] Fix comparison to True/False (PR #94038)

2024-06-01 Thread Balazs Benics via cfe-commits

https://github.com/steakhal edited 
https://github.com/llvm/llvm-project/pull/94038
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] fix(clang/**.py): fix comparison to True/False (PR #94038)

2024-06-01 Thread Balazs Benics via cfe-commits

https://github.com/steakhal approved this pull request.

I'm not sure if llvm follows PEP8, but the change looks good regardless.

https://github.com/llvm/llvm-project/pull/94038
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][analyzer][NFC] Improve docs of alpha.unix.BlockInCriticalSection (PR #93812)

2024-05-31 Thread Balazs Benics via cfe-commits

https://github.com/steakhal approved this pull request.


https://github.com/llvm/llvm-project/pull/93812
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] New optin.taint.TaintAlloc checker for catching unbounded memory allocation calls (PR #92420)

2024-05-30 Thread Balazs Benics via cfe-commits


@@ -938,6 +938,53 @@ optin.portability.UnixAPI
 "
 Finds implementation-defined behavior in UNIX/Posix functions.
 
+.. _optin-taint-TaintAlloc:
+
+optin.taint.TaintAlloc (C, C++)
+"""
+
+This checker warns for cases when the ``size`` parameter of the ``malloc`` ,
+``calloc``, ``realloc``, ``alloca`` or the size parameter of the
+array new C++ operator is tainted (potentially attacker controlled).
+If an attacker can inject a large value as the size parameter, memory 
exhaustion
+denial of service attack can be carried out.
+
+The ``alpha.security.taint.TaintPropagation`` checker also needs to be enabled 
for
+this checker to give warnings.
+
+The analyzer emits warning only if it cannot prove that the size parameter is
+within reasonable bounds (``<= SIZE_MAX/4``). This functionality partially
+covers the SEI Cert coding standard rule `INT04-C
+`_.
+
+You can silence this warning either by bound checking the ``size`` parameter, 
or
+by explicitly marking the ``size`` parameter as sanitized. See the
+:ref:`alpha-security-taint-TaintPropagation` checker for more details.
+
+.. code-block:: c
+
+  void vulnerable(void) {
+size_t size;

steakhal wrote:

```suggestion
size_t size = 0;
```
Let's initialize the value to be similar to the other cases.

https://github.com/llvm/llvm-project/pull/92420
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] New optin.taint.TaintAlloc checker for catching unbounded memory allocation calls (PR #92420)

2024-05-30 Thread Balazs Benics via cfe-commits


@@ -1695,6 +1707,12 @@ MallocChecker::processNewAllocation(const 
CXXAllocatorCall ,
   // MallocUpdateRefState() instead of MallocMemAux() which breaks the
   // existing binding.
   SVal Target = Call.getObjectUnderConstruction();
+  if (Call.getOriginExpr()->isArray()) {
+std::optional SizeEx = NE->getArraySize();
+if (SizeEx)

steakhal wrote:

```suggestion
if (auto SizeEx = NE->getArraySize())
```

https://github.com/llvm/llvm-project/pull/92420
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] New optin.taint.TaintAlloc checker for catching unbounded memory allocation calls (PR #92420)

2024-05-30 Thread Balazs Benics via cfe-commits


@@ -938,6 +938,53 @@ optin.portability.UnixAPI
 "
 Finds implementation-defined behavior in UNIX/Posix functions.
 
+.. _optin-taint-TaintAlloc:
+
+optin.taint.TaintAlloc (C, C++)

steakhal wrote:

I was thinking of suggesting `TaintedAlloc` (same in past tense), as it sounds 
better to me. I have no strong opinion on this though.

https://github.com/llvm/llvm-project/pull/92420
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] New optin.taint.TaintAlloc checker for catching unbounded memory allocation calls (PR #92420)

2024-05-30 Thread Balazs Benics via cfe-commits


@@ -1730,6 +1721,21 @@ def UnixAPIPortabilityChecker : Checker<"UnixAPI">,
 
 } // end optin.portability
 
+
+//===--===//
+// Taint checkers.
+//===--===//
+
+let ParentPackage = TaintOptIn in {
+
+def TaintMallocChecker: Checker<"TaintMalloc">,
+  HelpText<"Check for memory allocations, where the size parameter "
+   "might be a tainted (attacker controlled) value.">,
+  Dependencies<[DynamicMemoryModeling]>,

steakhal wrote:

Why can't we make it a dependency now? (I don't have a strong opinion, I'm more 
just curious)

https://github.com/llvm/llvm-project/pull/92420
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] New optin.taint.TaintAlloc checker for catching unbounded memory allocation calls (PR #92420)

2024-05-30 Thread Balazs Benics via cfe-commits

https://github.com/steakhal edited 
https://github.com/llvm/llvm-project/pull/92420
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] New optin.taint.TaintAlloc checker for catching unbounded memory allocation calls (PR #92420)

2024-05-30 Thread Balazs Benics via cfe-commits

https://github.com/steakhal approved this pull request.

I join Donát, and I agree that this looks good as it is.
I had a handful of final remarks but I have no strong opinion on any of the 
raised points.
Merge this, once you considered them and took action if you agreed.

https://github.com/llvm/llvm-project/pull/92420
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][analyzer] Improved PointerSubChecker (PR #93676)

2024-05-30 Thread Balazs Benics via cfe-commits

https://github.com/steakhal commented:

I'd say it looks pretty good. Objectively speaking it improves the TP rate.
I have compared these reports against the constant interpreters of clang and 
gcc to see if in constexpr context which of these expression would trigger "UB" 
there.
I found I think only 3 places where this implementation disagrees what I think 
we should report.

https://github.com/llvm/llvm-project/pull/93676
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][analyzer] Improved PointerSubChecker (PR #93676)

2024-05-30 Thread Balazs Benics via cfe-commits


@@ -0,0 +1,74 @@
+// RUN: %clang_analyze_cc1 -analyzer-checker=alpha.core.PointerSub -verify %s
+
+void f1(void) {
+  int x, y, z[10];
+  int d =  -  // expected-warning{{Subtraction of two pointers that do 
not point into the same array is undefined behavior}}
+  d = z -  // expected-warning{{Subtraction of two pointers that do not 
point into the same array is undefined behavior}}
+  d =  -  // expected-warning{{Subtraction of two pointers that do not 
point into the same array is undefined behavior}}

steakhal wrote:

The tests look good, except for ` - `, which should be valid.
Wrt. ` - 1`, that should be valid as well, given that the resulting pointer 
is not dereferenced. However, clang's constrant interpreter disagrees with 
gcc's on that case: https://godbolt.org/z/dMd1rMazY
However, like I said, I believe gcc's right by accepting that code.


https://github.com/llvm/llvm-project/pull/93676
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][analyzer] Improved PointerSubChecker (PR #93676)

2024-05-30 Thread Balazs Benics via cfe-commits


@@ -0,0 +1,74 @@
+// RUN: %clang_analyze_cc1 -analyzer-checker=alpha.core.PointerSub -verify %s
+
+void f1(void) {
+  int x, y, z[10];
+  int d =  -  // expected-warning{{Subtraction of two pointers that do 
not point into the same array is undefined behavior}}
+  d = z -  // expected-warning{{Subtraction of two pointers that do not 
point into the same array is undefined behavior}}
+  d =  -  // expected-warning{{Subtraction of two pointers that do not 
point into the same array is undefined behavior}}
+  d = (long*) - (long*)
+}
+
+void f2(void) {
+  int a[10], b[10], c;
+  int *p = [2];
+  int *q = [8];
+  int d = q - p; // no-warning
+
+  q = [3];
+  d = q - p; // expected-warning{{Subtraction of two pointers that}}
+
+  q = a + 10;
+  d = q - p; // no warning (use of pointer to one after the end is allowed)
+  d = [4] - a; // no warning
+
+  q = a + 11;
+  d = q - a; // ?
+
+  d =  - p; // expected-warning{{Subtraction of two pointers that}}
+}
+
+void f3(void) {
+  int a[3][4];
+  int d;
+
+  d = &(a[2]) - &(a[1]);
+  d = a[2] - a[1]; // expected-warning{{Subtraction of two pointers that}}
+  d = a[1] - a[1];
+  d = &(a[1][2]) - &(a[1][0]);
+  d = &(a[1][2]) - &(a[0][0]); // expected-warning{{Subtraction of two 
pointers that}}
+}
+
+void f4(void) {
+  int n = 4, m = 3;
+  int a[n][m];
+  int (*p)[m] = a; // p == [0]
+  p += 1; // p == [1]
+  int d = p - a; // d == 1 // expected-warning{{subtraction of pointers to 
type 'int[m]' of zero size has undefined behavior}}
+
+  d = &(a[2]) - &(a[1]); // expected-warning{{subtraction of pointers to type 
'int[m]' of zero size has undefined behavior}}
+  d = a[2] - a[1]; // expected-warning{{Subtraction of two pointers that}}

steakhal wrote:

Interestingly, clang's and gcc's constexpr interpreter accepts these lines, 
except for the last one. Who is right then?

https://github.com/llvm/llvm-project/pull/93676
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][analyzer] Improved PointerSubChecker (PR #93676)

2024-05-30 Thread Balazs Benics via cfe-commits

https://github.com/steakhal edited 
https://github.com/llvm/llvm-project/pull/93676
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Adding taint analysis capability to unix.Malloc checker (PR #92420)

2024-05-26 Thread Balazs Benics via cfe-commits

steakhal wrote:

> I'll port this patch downstream to see how this would behave on the Juliet 
> C++ benchmark or on some real-world code.

Ah nvm. llvm/main diverged quite a bit since 18.1.6. I can't just pick this 
one. Given this, I won't backport and test this PR.

https://github.com/llvm/llvm-project/pull/92420
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Adding taint analysis capability to unix.Malloc checker (PR #92420)

2024-05-26 Thread Balazs Benics via cfe-commits

https://github.com/steakhal commented:

The patch makes sense to me.
Have you considered applying the same heuristic to C++ array new allocations?

I'll port this patch downstream to see how this would behave on the Juliet C++ 
benchmark or on some real-world code.

https://github.com/llvm/llvm-project/pull/92420
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][analyzer] PutenvStackArrayChecker: No warning from 'main' (PR #93299)

2024-05-26 Thread Balazs Benics via cfe-commits

https://github.com/steakhal approved this pull request.

LGTM

https://github.com/llvm/llvm-project/pull/93299
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Refine invalidation caused by `fread` (PR #93408)

2024-05-26 Thread Balazs Benics via cfe-commits

https://github.com/steakhal edited 
https://github.com/llvm/llvm-project/pull/93408
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Refine invalidation caused by `fread` (PR #93408)

2024-05-26 Thread Balazs Benics via cfe-commits

https://github.com/steakhal created 
https://github.com/llvm/llvm-project/pull/93408

This change enables more accurate modeling of the write effects of `fread`. In 
particular, instead of invalidating the whole buffer, in a best-effort basis, 
we would try to invalidate the actually accesses elements of the buffer. This 
preserves the previous value of the buffer of the unaffected slots. As a 
result, diagnose more uninitialized buffer uses for example.

Currently, this refined invalidation only triggers for `fread` if and only if 
the `count` parameter and the buffer pointer's index component are concrete or 
perfectly-constrained symbols.
Additionally, if the `fread` would read more than 64 elements, the whole buffer 
is invalidated as before. This is to have safeguards against performance issues.

Refer to the comments of the assertions in the following example to see the 
changes in the diagnostics:

```c++
void demo() {
  FILE *fp = fopen("/home/test", "rb+");
  if (!fp) return;
  int buffer[10]; // uninitialized
  int read_items = fread(buffer+1, sizeof(int), 5, fp);
  if (5 == read_items) {
int v1 = buffer[1]; // Unknown value but not garbage.
clang_analyzer_isTainted(v1); // expected-warning {{YES}} <-- Would be "NO" 
without this patch.
clang_analyzer_dump(v1); // expected-warning {{conj_}} <-- Not a "derived" 
symbol, so it's directly invalidated now.
int v0 = buffer[0]; // expected-warning {{Assigned value is garbage or 
undefined}} <-- Had no report here before.
(void)(v1 + v0);
  } else {
// If 'fread' had an error.
int v0 = buffer[0]; // expected-warning {{Assigned value is garbage or 
undefined}} <-- Had no report here before.
(void)v0;
  }
  fclose(fp);
}
```

[CPP-3247](https://sonarsource.atlassian.net/browse/CPP-3247)

Patch by Marco Borgeaud (marco-antognini-sonarsource)

>From f9e841ddaa865d529c806b2d115d5ddbc7109243 Mon Sep 17 00:00:00 2001
From: Balazs Benics 
Date: Sun, 26 May 2024 11:40:01 +0200
Subject: [PATCH] [analyzer] Refine invalidation caused by `fread`

This change enables more accurate modeling of the write effects of `fread`.
In particular, instead of invalidating the whole buffer, in a best-effort
basis, we would try to invalidate the actually accesses elements of the buffer.
This preserves the previous value of the buffer of the unaffected slots.
As a result, diagnose more uninitialized buffer uses for example.

Currently, this refined invalidation only triggers for `fread` if and
only if the `count` parameter and the buffer pointer's index component
are concrete or perfectly-constrained symbols.
Additionally, if the `fread` would read more than 64 elements, the whole
buffer is invalidated as before. This is to have safeguards against
performance issues.

Refer to the comments of the assertions in the following example to see
the changes in the diagnostics:

```c++
void demo() {
  FILE *fp = fopen("/home/test", "rb+");
  if (!fp) return;
  int buffer[10]; // uninitialized
  int read_items = fread(buffer+1, sizeof(int), 5, fp);
  if (5 == read_items) {
int v1 = buffer[1]; // Unknown value but not garbage.
clang_analyzer_isTainted(v1); // expected-warning {{YES}} <-- Would be "NO" 
without this patch.
clang_analyzer_dump(v1); // expected-warning {{conj_}} <-- Not a "derived" 
symbol, so it's directly invalidated now.
int v0 = buffer[0]; // expected-warning {{Assigned value is garbage or 
undefined}} <-- Had no report here before.
(void)(v1 + v0);
  } else {
// If 'fread' had an error.
int v0 = buffer[0]; // expected-warning {{Assigned value is garbage or 
undefined}} <-- Had no report here before.
(void)v0;
  }
  fclose(fp);
}
```

[CPP-3247](https://sonarsource.atlassian.net/browse/CPP-3247)

Patch by Marco Borgeaud (marco-antognini-sonarsource)
---
 .../StaticAnalyzer/Checkers/StreamChecker.cpp |  88 -
 clang/test/Analysis/fread.cpp | 328 ++
 2 files changed, 405 insertions(+), 11 deletions(-)
 create mode 100644 clang/test/Analysis/fread.cpp

diff --git a/clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp 
b/clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp
index d4e020f7a72a0..7b42c4f72b322 100644
--- a/clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp
+++ b/clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp
@@ -717,18 +717,71 @@ const ExplodedNode 
*StreamChecker::getAcquisitionSite(const ExplodedNode *N,
   return nullptr;
 }
 
+/// Invalidate only the requested elements instead of the whole buffer.
+/// This is basically a refinement of the more generic 'escapeArgs' or
+/// the plain old 'invalidateRegions'.
+/// This only works if the \p StartIndex and \p Count are concrete or
+/// perfectly-constrained.
+static ProgramStateRef
+escapeByStartIndexAndCount(ProgramStateRef State, CheckerContext ,
+   const CallEvent , const MemRegion *Buffer,
+   QualType ElemType, SVal StartIndex, SVal Count) {
+  if 

[clang] [Analyzer][CFG] Correctly handle rebuilt default arg and default init expression (PR #91879)

2024-05-23 Thread Balazs Benics via cfe-commits

https://github.com/steakhal approved this pull request.

LGTM, thanks!

https://github.com/llvm/llvm-project/pull/91879
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer][NFC] Use ArrayRef for input parameters (PR #93203)

2024-05-23 Thread Balazs Benics via cfe-commits

https://github.com/steakhal created 
https://github.com/llvm/llvm-project/pull/93203

Fixes #79684

>From ef65ed8c193c43c1914dc39bf1cd48da83872fc5 Mon Sep 17 00:00:00 2001
From: Balazs Benics 
Date: Thu, 23 May 2024 10:56:33 +0200
Subject: [PATCH] [analyzer][NFC] Use ArrayRef for input parameters

Fixes #79684
---
 .../StaticAnalyzer/Checkers/StdLibraryFunctionsChecker.cpp| 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/clang/lib/StaticAnalyzer/Checkers/StdLibraryFunctionsChecker.cpp 
b/clang/lib/StaticAnalyzer/Checkers/StdLibraryFunctionsChecker.cpp
index 845a5f9b390dc..8f4bd17afc858 100644
--- a/clang/lib/StaticAnalyzer/Checkers/StdLibraryFunctionsChecker.cpp
+++ b/clang/lib/StaticAnalyzer/Checkers/StdLibraryFunctionsChecker.cpp
@@ -672,7 +672,7 @@ class StdLibraryFunctionsChecker
 StringRef getNote() const { return Note; }
   };
 
-  using ArgTypes = std::vector>;
+  using ArgTypes = ArrayRef>;
   using RetType = std::optional;
 
   // A placeholder type, we use it whenever we do not care about the concrete
@@ -1746,7 +1746,7 @@ void StdLibraryFunctionsChecker::initFunctionSummaries(
 }
 // Add the same summary for different names with the Signature explicitly
 // given.
-void operator()(std::vector Names, Signature Sign, Summary Sum) 
{
+void operator()(ArrayRef Names, Signature Sign, Summary Sum) {
   for (StringRef Name : Names)
 operator()(Name, Sign, Sum);
 }

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][AST] Fix end location of DeclarationNameInfo on instantiated methods (PR #92654)

2024-05-22 Thread Balazs Benics via cfe-commits

https://github.com/steakhal closed 
https://github.com/llvm/llvm-project/pull/92654
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][AST] Fix end location of DeclarationNameInfo on instantiated methods (PR #92654)

2024-05-22 Thread Balazs Benics via cfe-commits

https://github.com/steakhal updated 
https://github.com/llvm/llvm-project/pull/92654

>From 3f3c98a55a6d89ddb05085c41d1fffad331595ce Mon Sep 17 00:00:00 2001
From: Alejandro _lvarez Ayll_n 
Date: Sat, 18 May 2024 16:53:33 +0200
Subject: [PATCH] [clang][AST] Fix end location of DeclarationNameInfo on
 instantiated methods

Fixes #71161

[D64087](https://reviews.llvm.org/D64087) updated some locations of the
instantiated method but forgot `DNLoc`.

`FunctionDecl::getNameInfo()` constructs a `DeclarationNameInfo` using
`Decl::Loc` as the beginning of the declaration name, and
`FunctionDecl::DNLoc` to compute the end of the declaration name. The
former was updated, but the latter was not, so
`DeclarationName::getSourceRange()` would return a range where the end
of the declaration name could come before its beginning.

Patch by Alejandro Alvarez Ayllon
Co-authored-by: steakhal

CPP-5166
---
 clang/docs/ReleaseNotes.rst   |  1 +
 clang/include/clang/AST/Decl.h|  2 ++
 .../lib/Sema/SemaTemplateInstantiateDecl.cpp  |  1 +
 clang/unittests/AST/DeclTest.cpp  | 31 +++
 4 files changed, 35 insertions(+)

diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index 2899bc5ed35ad..93b6ba59ecf9f 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -766,6 +766,7 @@ Miscellaneous Bug Fixes
 
 - Fixed an infinite recursion in ASTImporter, on return type declared inside
   body of C++11 lambda without trailing return (#GH68775).
+- Fixed declaration name source location of instantiated function definitions 
(GH71161).
 
 Miscellaneous Clang Crashes Fixed
 ^
diff --git a/clang/include/clang/AST/Decl.h b/clang/include/clang/AST/Decl.h
index 5e485ccb85a13..7fd80b90d1033 100644
--- a/clang/include/clang/AST/Decl.h
+++ b/clang/include/clang/AST/Decl.h
@@ -2188,6 +2188,8 @@ class FunctionDecl : public DeclaratorDecl,
 
   void setRangeEnd(SourceLocation E) { EndRangeLoc = E; }
 
+  void setDeclarationNameLoc(DeclarationNameLoc L) { DNLoc = L; }
+
   /// Returns the location of the ellipsis of a variadic function.
   SourceLocation getEllipsisLoc() const {
 const auto *FPT = getType()->getAs();
diff --git a/clang/lib/Sema/SemaTemplateInstantiateDecl.cpp 
b/clang/lib/Sema/SemaTemplateInstantiateDecl.cpp
index 09812946bd383..bb49aae2cb666 100644
--- a/clang/lib/Sema/SemaTemplateInstantiateDecl.cpp
+++ b/clang/lib/Sema/SemaTemplateInstantiateDecl.cpp
@@ -5055,6 +5055,7 @@ void Sema::InstantiateFunctionDefinition(SourceLocation 
PointOfInstantiation,
   Function->setLocation(PatternDecl->getLocation());
   Function->setInnerLocStart(PatternDecl->getInnerLocStart());
   Function->setRangeEnd(PatternDecl->getEndLoc());
+  Function->setDeclarationNameLoc(PatternDecl->getNameInfo().getInfo());
 
   EnterExpressionEvaluationContext EvalContext(
   *this, Sema::ExpressionEvaluationContext::PotentiallyEvaluated);
diff --git a/clang/unittests/AST/DeclTest.cpp b/clang/unittests/AST/DeclTest.cpp
index 2530ce74eb6a3..16aa2b50b7a06 100644
--- a/clang/unittests/AST/DeclTest.cpp
+++ b/clang/unittests/AST/DeclTest.cpp
@@ -545,3 +545,34 @@ TEST(Decl, TemplateArgumentDefaulted) {
   EXPECT_TRUE(ArgList.get(2).getIsDefaulted());
   EXPECT_TRUE(ArgList.get(3).getIsDefaulted());
 }
+
+TEST(Decl, CXXDestructorDeclsShouldHaveWellFormedNameInfoRanges) {
+  // GH71161
+  llvm::Annotations Code(R"cpp(
+template  struct Resource {
+  ~Resource(); // 1
+};
+template 
+Resource::~Resource() {} // 2,3
+
+void instantiate_template() {
+  Resource x;
+}
+)cpp");
+
+  auto AST = tooling::buildASTFromCode(Code.code());
+  ASTContext  = AST->getASTContext();
+
+  const auto  = Ctx.getSourceManager();
+  auto GetNameInfoRange = [](const BoundNodes ) {
+const auto *D = Match.getNodeAs("dtor");
+return D->getNameInfo().getSourceRange().printToString(SM);
+  };
+
+  auto Matches = match(findAll(cxxDestructorDecl().bind("dtor")),
+   *Ctx.getTranslationUnitDecl(), Ctx);
+  ASSERT_EQ(Matches.size(), 3U);
+  EXPECT_EQ(GetNameInfoRange(Matches[0]), "");
+  EXPECT_EQ(GetNameInfoRange(Matches[1]), "");
+  EXPECT_EQ(GetNameInfoRange(Matches[2]), "");
+}

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][analyzer] Move checker alpha.security.cert.pos.34c into security.PutenvWithAuto (PR #92424)

2024-05-21 Thread Balazs Benics via cfe-commits
=?utf-8?q?Balázs_Kéri?= ,
=?utf-8?q?Balázs_Kéri?= 
Message-ID:
In-Reply-To: 


steakhal wrote:

Make sure you adjust/sync the commit title, content and the PR title before 
merging.

https://github.com/llvm/llvm-project/pull/92424
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][AST] Fix end location of DeclarationNameInfo on instantiated methods (PR #92654)

2024-05-21 Thread Balazs Benics via cfe-commits

steakhal wrote:

> nit: add a note in `clang/docs/ReleaseNotes.rst`

Thanks. Added. Let me know if it's in the right section. @hokein

https://github.com/llvm/llvm-project/pull/92654
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][AST] Fix end location of DeclarationNameInfo on instantiated methods (PR #92654)

2024-05-21 Thread Balazs Benics via cfe-commits

https://github.com/steakhal updated 
https://github.com/llvm/llvm-project/pull/92654

>From 58ca4d9be1dbc43ad40b6e26b8a5d79e20be8d93 Mon Sep 17 00:00:00 2001
From: Alejandro _lvarez Ayll_n 
Date: Sat, 18 May 2024 16:53:33 +0200
Subject: [PATCH 1/2] [clang][AST] Fix end location of DeclarationNameInfo on
 instantiated methods

Fixes #71161

[D64087](https://reviews.llvm.org/D64087) updated some locations of the
instantiated method but forgot `DNLoc`.

`FunctionDecl::getNameInfo()` constructs a `DeclarationNameInfo` using
`Decl::Loc` as the beginning of the declaration name, and
`FunctionDecl::DNLoc` to compute the end of the declaration name. The
former was updated, but the latter was not, so
`DeclarationName::getSourceRange()` would return a range where the end
of the declaration name could come before its beginning.

Patch by Alejandro Alvarez Ayllon
Co-authored-by: steakhal

CPP-5166
---
 clang/include/clang/AST/Decl.h|  2 ++
 .../lib/Sema/SemaTemplateInstantiateDecl.cpp  |  1 +
 clang/unittests/AST/DeclTest.cpp  | 31 +++
 3 files changed, 34 insertions(+)

diff --git a/clang/include/clang/AST/Decl.h b/clang/include/clang/AST/Decl.h
index 5e485ccb85a13..7fd80b90d1033 100644
--- a/clang/include/clang/AST/Decl.h
+++ b/clang/include/clang/AST/Decl.h
@@ -2188,6 +2188,8 @@ class FunctionDecl : public DeclaratorDecl,
 
   void setRangeEnd(SourceLocation E) { EndRangeLoc = E; }
 
+  void setDeclarationNameLoc(DeclarationNameLoc L) { DNLoc = L; }
+
   /// Returns the location of the ellipsis of a variadic function.
   SourceLocation getEllipsisLoc() const {
 const auto *FPT = getType()->getAs();
diff --git a/clang/lib/Sema/SemaTemplateInstantiateDecl.cpp 
b/clang/lib/Sema/SemaTemplateInstantiateDecl.cpp
index 381d79b2fcd46..6d736d0771eac 100644
--- a/clang/lib/Sema/SemaTemplateInstantiateDecl.cpp
+++ b/clang/lib/Sema/SemaTemplateInstantiateDecl.cpp
@@ -5055,6 +5055,7 @@ void Sema::InstantiateFunctionDefinition(SourceLocation 
PointOfInstantiation,
   Function->setLocation(PatternDecl->getLocation());
   Function->setInnerLocStart(PatternDecl->getInnerLocStart());
   Function->setRangeEnd(PatternDecl->getEndLoc());
+  Function->setDeclarationNameLoc(PatternDecl->getNameInfo().getInfo());
 
   EnterExpressionEvaluationContext EvalContext(
   *this, Sema::ExpressionEvaluationContext::PotentiallyEvaluated);
diff --git a/clang/unittests/AST/DeclTest.cpp b/clang/unittests/AST/DeclTest.cpp
index 2530ce74eb6a3..16aa2b50b7a06 100644
--- a/clang/unittests/AST/DeclTest.cpp
+++ b/clang/unittests/AST/DeclTest.cpp
@@ -545,3 +545,34 @@ TEST(Decl, TemplateArgumentDefaulted) {
   EXPECT_TRUE(ArgList.get(2).getIsDefaulted());
   EXPECT_TRUE(ArgList.get(3).getIsDefaulted());
 }
+
+TEST(Decl, CXXDestructorDeclsShouldHaveWellFormedNameInfoRanges) {
+  // GH71161
+  llvm::Annotations Code(R"cpp(
+template  struct Resource {
+  ~Resource(); // 1
+};
+template 
+Resource::~Resource() {} // 2,3
+
+void instantiate_template() {
+  Resource x;
+}
+)cpp");
+
+  auto AST = tooling::buildASTFromCode(Code.code());
+  ASTContext  = AST->getASTContext();
+
+  const auto  = Ctx.getSourceManager();
+  auto GetNameInfoRange = [](const BoundNodes ) {
+const auto *D = Match.getNodeAs("dtor");
+return D->getNameInfo().getSourceRange().printToString(SM);
+  };
+
+  auto Matches = match(findAll(cxxDestructorDecl().bind("dtor")),
+   *Ctx.getTranslationUnitDecl(), Ctx);
+  ASSERT_EQ(Matches.size(), 3U);
+  EXPECT_EQ(GetNameInfoRange(Matches[0]), "");
+  EXPECT_EQ(GetNameInfoRange(Matches[1]), "");
+  EXPECT_EQ(GetNameInfoRange(Matches[2]), "");
+}

>From ca454ba5c74e7bfdce2a4bfbfa55b0ad7b0c814e Mon Sep 17 00:00:00 2001
From: Balazs Benics 
Date: Tue, 21 May 2024 09:23:40 +0200
Subject: [PATCH 2/2] NFC Add release docs for this fix

---
 clang/docs/ReleaseNotes.rst | 1 +
 1 file changed, 1 insertion(+)

diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index 2f83f5c6d54e9..7f4b1d2a00df7 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -770,6 +770,7 @@ Miscellaneous Bug Fixes
 
 - Fixed an infinite recursion in ASTImporter, on return type declared inside
   body of C++11 lambda without trailing return (#GH68775).
+- Fixed declaration name source location of instantiated function definitions 
(GH71161).
 
 Miscellaneous Clang Crashes Fixed
 ^

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][AST] Fix end location of DeclarationNameInfo on instantiated methods (PR #92654)

2024-05-18 Thread Balazs Benics via cfe-commits

https://github.com/steakhal created 
https://github.com/llvm/llvm-project/pull/92654

Fixes #71161

[D64087](https://reviews.llvm.org/D64087) updated some locations of the 
instantiated method but forgot `DNLoc`.

`FunctionDecl::getNameInfo()` constructs a `DeclarationNameInfo` using 
`Decl::Loc` as the beginning of the declaration name, and `FunctionDecl::DNLoc` 
to compute the end of the declaration name. The former was updated, but the 
latter was not, so `DeclarationName::getSourceRange()` would return a range 
where the end of the declaration name could come before its beginning.

Patch by Alejandro Alvarez Ayllon
Co-authored-by: steakhal

CPP-5166

>From 58ca4d9be1dbc43ad40b6e26b8a5d79e20be8d93 Mon Sep 17 00:00:00 2001
From: Alejandro _lvarez Ayll_n 
Date: Sat, 18 May 2024 16:53:33 +0200
Subject: [PATCH] [clang][AST] Fix end location of DeclarationNameInfo on
 instantiated methods

Fixes #71161

[D64087](https://reviews.llvm.org/D64087) updated some locations of the
instantiated method but forgot `DNLoc`.

`FunctionDecl::getNameInfo()` constructs a `DeclarationNameInfo` using
`Decl::Loc` as the beginning of the declaration name, and
`FunctionDecl::DNLoc` to compute the end of the declaration name. The
former was updated, but the latter was not, so
`DeclarationName::getSourceRange()` would return a range where the end
of the declaration name could come before its beginning.

Patch by Alejandro Alvarez Ayllon
Co-authored-by: steakhal

CPP-5166
---
 clang/include/clang/AST/Decl.h|  2 ++
 .../lib/Sema/SemaTemplateInstantiateDecl.cpp  |  1 +
 clang/unittests/AST/DeclTest.cpp  | 31 +++
 3 files changed, 34 insertions(+)

diff --git a/clang/include/clang/AST/Decl.h b/clang/include/clang/AST/Decl.h
index 5e485ccb85a13..7fd80b90d1033 100644
--- a/clang/include/clang/AST/Decl.h
+++ b/clang/include/clang/AST/Decl.h
@@ -2188,6 +2188,8 @@ class FunctionDecl : public DeclaratorDecl,
 
   void setRangeEnd(SourceLocation E) { EndRangeLoc = E; }
 
+  void setDeclarationNameLoc(DeclarationNameLoc L) { DNLoc = L; }
+
   /// Returns the location of the ellipsis of a variadic function.
   SourceLocation getEllipsisLoc() const {
 const auto *FPT = getType()->getAs();
diff --git a/clang/lib/Sema/SemaTemplateInstantiateDecl.cpp 
b/clang/lib/Sema/SemaTemplateInstantiateDecl.cpp
index 381d79b2fcd46..6d736d0771eac 100644
--- a/clang/lib/Sema/SemaTemplateInstantiateDecl.cpp
+++ b/clang/lib/Sema/SemaTemplateInstantiateDecl.cpp
@@ -5055,6 +5055,7 @@ void Sema::InstantiateFunctionDefinition(SourceLocation 
PointOfInstantiation,
   Function->setLocation(PatternDecl->getLocation());
   Function->setInnerLocStart(PatternDecl->getInnerLocStart());
   Function->setRangeEnd(PatternDecl->getEndLoc());
+  Function->setDeclarationNameLoc(PatternDecl->getNameInfo().getInfo());
 
   EnterExpressionEvaluationContext EvalContext(
   *this, Sema::ExpressionEvaluationContext::PotentiallyEvaluated);
diff --git a/clang/unittests/AST/DeclTest.cpp b/clang/unittests/AST/DeclTest.cpp
index 2530ce74eb6a3..16aa2b50b7a06 100644
--- a/clang/unittests/AST/DeclTest.cpp
+++ b/clang/unittests/AST/DeclTest.cpp
@@ -545,3 +545,34 @@ TEST(Decl, TemplateArgumentDefaulted) {
   EXPECT_TRUE(ArgList.get(2).getIsDefaulted());
   EXPECT_TRUE(ArgList.get(3).getIsDefaulted());
 }
+
+TEST(Decl, CXXDestructorDeclsShouldHaveWellFormedNameInfoRanges) {
+  // GH71161
+  llvm::Annotations Code(R"cpp(
+template  struct Resource {
+  ~Resource(); // 1
+};
+template 
+Resource::~Resource() {} // 2,3
+
+void instantiate_template() {
+  Resource x;
+}
+)cpp");
+
+  auto AST = tooling::buildASTFromCode(Code.code());
+  ASTContext  = AST->getASTContext();
+
+  const auto  = Ctx.getSourceManager();
+  auto GetNameInfoRange = [](const BoundNodes ) {
+const auto *D = Match.getNodeAs("dtor");
+return D->getNameInfo().getSourceRange().printToString(SM);
+  };
+
+  auto Matches = match(findAll(cxxDestructorDecl().bind("dtor")),
+   *Ctx.getTranslationUnitDecl(), Ctx);
+  ASSERT_EQ(Matches.size(), 3U);
+  EXPECT_EQ(GetNameInfoRange(Matches[0]), "");
+  EXPECT_EQ(GetNameInfoRange(Matches[1]), "");
+  EXPECT_EQ(GetNameInfoRange(Matches[2]), "");
+}

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][analyzer] Move checker alpha.security.cert.pos.34c into security.PutenvWithAuto (PR #92424)

2024-05-18 Thread Balazs Benics via cfe-commits
=?utf-8?q?Bal=C3=A1zs_K=C3=A9ri?= 
Message-ID:
In-Reply-To: 


https://github.com/steakhal approved this pull request.


https://github.com/llvm/llvm-project/pull/92424
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][analyzer] Move checker alpha.security.cert.pos.34c into security.PutenvWithAuto (PR #92424)

2024-05-18 Thread Balazs Benics via cfe-commits
=?utf-8?q?Balázs_Kéri?= 
Message-ID:
In-Reply-To: 


https://github.com/steakhal edited 
https://github.com/llvm/llvm-project/pull/92424
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][analyzer] Move checker alpha.security.cert.pos.34c into security.PutenvWithAuto (PR #92424)

2024-05-18 Thread Balazs Benics via cfe-commits
=?utf-8?q?Balázs_Kéri?= 
Message-ID:
In-Reply-To: 



@@ -2792,6 +2792,31 @@ Warn on mmap() calls that are both writable and 
executable.
//   code
  }
 
+.. _alpha-security-putenv-stack-array:
+
+alpha.security.PutenvStackArray
+"""

steakhal wrote:

```suggestion
alpha.security.PutenvStackArray (C)
"""
```

https://github.com/llvm/llvm-project/pull/92424
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][analyzer] Move checker alpha.security.cert.pos.34c into security.PutenvWithAuto (PR #92424)

2024-05-18 Thread Balazs Benics via cfe-commits
=?utf-8?q?Balázs_Kéri?= 
Message-ID:
In-Reply-To: 



@@ -2792,6 +2792,31 @@ Warn on mmap() calls that are both writable and 
executable.
//   code
  }
 
+.. _alpha-security-putenv-stack-array:
+
+alpha.security.PutenvStackArray
+"""
+Finds calls to the ``putenv`` function which pass a pointer to a 
stack-allocated
+(automatic) array as the argument. Function ``putenv`` does not copy the passed
+string, only a pointer to the data is stored and this data can be read even by
+other threads. Content of a stack-allocated array is likely to be overwritten
+after returning from the parent function.
+
+The problem can be solved by using a static array variable or dynamically
+allocated memory. Even better is to avoid using ``putenv`` (it has other
+problems related to memory leaks) and use ``setenv`` instead.
+
+The check corresponds to CERT rule
+`POS34-C. Do not call putenv() with a pointer to an automatic variable as the 
argument
+`_.
+
+.. code-block:: c
+
+  int f() {
+char[] env = "NAME=value";

steakhal wrote:

```suggestion
char env[] = "NAME=value";
```

https://github.com/llvm/llvm-project/pull/92424
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][analyzer] Move checker alpha.security.cert.pos.34c into security.PutenvWithAuto (PR #92424)

2024-05-17 Thread Balazs Benics via cfe-commits


@@ -1179,6 +1179,54 @@ security.insecureAPI.DeprecatedOrUnsafeBufferHandling (C)
strncpy(buf, "a", 1); // warn
  }
 
+.. _security-putenv-with-auto:
+
+security.PutenvWithAuto
+"""
+Finds calls to the ``putenv`` function which pass a pointer to an automatic 
variable as the argument.
+Function ``putenv`` does not copy the passed string, only a pointer to the 
data is stored.
+Content of an automatic variable is likely to be overwritten after returning 
from the parent function.

steakhal wrote:

Even though it's formally called `automatic storage duration`, I'd say that 
`stack`-variable is more commonly understood among programmers.
Consequently, I'd suggest `security.PutenvWithStack` or 
`security.PutenvWithStackVar` instead. I think it would be easier to discover 
that way.
But I guess, this should be discussed separately.

https://github.com/llvm/llvm-project/pull/92424
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][analyzer] Move checker alpha.security.cert.pos.34c into security.PutenvWithAuto (PR #92424)

2024-05-17 Thread Balazs Benics via cfe-commits

steakhal wrote:

> The "cert" package looks not useful and the checker has not a meaningful name 
> with the old naming scheme.
> Additionally tests and documentation is updated.
> The checker looks good enough to be moved into non-alpha package.

Personally, I prefer reviewing content changes separately from deciding/moving 
the checker out of alpha. Here are the reasons for doing so:
 - When content is moved around, a different set of mistakes can be made than 
when updating some doc content.
 - These are orthogonal decisions.

---

I'd suggest to split this PR into two:
 - Keep this PR for updating the content.
 - Have a separate PR for moving it out of alpha.



https://github.com/llvm/llvm-project/pull/92424
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][analyzer] Move checker alpha.security.cert.pos.34c into security.PutenvWithAuto (PR #92424)

2024-05-17 Thread Balazs Benics via cfe-commits

https://github.com/steakhal edited 
https://github.com/llvm/llvm-project/pull/92424
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Adding taint analysis capability to unix.Malloc checker (PR #92420)

2024-05-17 Thread Balazs Benics via cfe-commits


@@ -1273,6 +1273,41 @@ Check for memory leaks, double free, and use-after-free 
problems. Traces memory
 .. literalinclude:: checkers/unix_malloc_example.c
 :language: c
 
+If the ``alpha.security.taint.TaintPropagation`` checker is enabled, the 
checker
+warns for cases when the ``size`` parameter of the ``malloc`` , ``calloc``,
+``realloc``, ``alloca`` is tainted (potentially attacker controlled). If an
+attacker can inject a large value as the size parameter, memory exhaustion
+denial of service attack can be carried out.
+
+The analyzer emits warning only if it cannot prove that the size parameter is
+within reasonable bounds (``<= SIZE_MAX/4``). This functionality partially
+covers the SEI Cert coding standard rule `INT04-C
+`_.
+
+You can silence this warning either by bound checking the ``size`` parameter, 
or
+by explicitly marking the ``size`` parameter as sanitized. See the
+:ref:`alpha-security-taint-TaintPropagation` checker for more details.
+
+.. code-block:: c
+
+  void t1(void) {
+size_t size;
+scanf("%zu", );
+int *p = malloc(size); // warn: malloc is called with a tainted 
(potentially attacker controlled) value
+free(p);
+  }
+
+  void t3(void) {
+size_t size;
+scanf("%zu", );
+if (1024https://github.com/llvm/llvm-project/pull/92420
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Adding taint analysis capability to unix.Malloc checker (PR #92420)

2024-05-17 Thread Balazs Benics via cfe-commits


@@ -1779,18 +1790,79 @@ ProgramStateRef 
MallocChecker::MallocMemAux(CheckerContext ,
 const CallEvent ,
 const Expr *SizeEx, SVal Init,
 ProgramStateRef State,
-AllocationFamily Family) {
+AllocationFamily Family) const {
   if (!State)
 return nullptr;
 
   assert(SizeEx);
   return MallocMemAux(C, Call, C.getSVal(SizeEx), Init, State, Family);
 }
 
+void MallocChecker::reportTaintBug(StringRef Msg, ProgramStateRef State,
+   CheckerContext ,
+   llvm::ArrayRef TaintedSyms,
+   AllocationFamily Family,
+   const Expr *SizeEx) const {
+  if (ExplodedNode *N = C.generateErrorNode(State)) {
+
+std::optional CheckKind =
+getCheckIfTracked(Family);
+if (!CheckKind)
+  return;
+if (!BT_TaintedAlloc[*CheckKind])
+  BT_TaintedAlloc[*CheckKind].reset(new BugType(CheckNames[*CheckKind],
+"Tainted Memory 
Allocation",
+categories::MemoryError));
+auto R = std::make_unique(
+*BT_TaintedAlloc[*CheckKind], Msg, N);
+
+bugreporter::trackExpressionValue(N, SizeEx, *R);
+for (auto Sym : TaintedSyms)
+  R->markInteresting(Sym);
+C.emitReport(std::move(R));
+  }
+}
+
+void MallocChecker::CheckTaintedness(CheckerContext , const CallEvent ,
+ const SVal SizeSVal, ProgramStateRef 
State,
+ AllocationFamily Family) const {
+  std::vector TaintedSyms =
+  clang::ento::taint::getTaintedSymbols(State, SizeSVal);
+  if (!TaintedSyms.empty()) {

steakhal wrote:

Transform this into an early-return to reduce indentation.

https://github.com/llvm/llvm-project/pull/92420
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Adding taint analysis capability to unix.Malloc checker (PR #92420)

2024-05-17 Thread Balazs Benics via cfe-commits


@@ -48,6 +49,45 @@ void myfoo(int *p);
 void myfooint(int p);
 char *fooRetPtr(void);
 
+void t1(void) {
+  size_t size;
+  scanf("%zu", );
+  int *p = malloc(size); // expected-warning{{malloc is called with a tainted 
(potentially attacker controlled) value}}
+  free(p);
+}
+
+void t2(void) {
+  size_t size;
+  scanf("%zu", );
+  int *p = calloc(size,2); // expected-warning{{calloc is called with a 
tainted (potentially attacker controlled) value}}
+  free(p);
+}
+
+void t3(void) {
+  size_t size;
+  scanf("%zu", );
+  if (1024https://github.com/llvm/llvm-project/pull/92420
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Adding taint analysis capability to unix.Malloc checker (PR #92420)

2024-05-17 Thread Balazs Benics via cfe-commits


@@ -1779,18 +1790,79 @@ ProgramStateRef 
MallocChecker::MallocMemAux(CheckerContext ,
 const CallEvent ,
 const Expr *SizeEx, SVal Init,
 ProgramStateRef State,
-AllocationFamily Family) {
+AllocationFamily Family) const {
   if (!State)
 return nullptr;
 
   assert(SizeEx);
   return MallocMemAux(C, Call, C.getSVal(SizeEx), Init, State, Family);
 }
 
+void MallocChecker::reportTaintBug(StringRef Msg, ProgramStateRef State,
+   CheckerContext ,
+   llvm::ArrayRef TaintedSyms,
+   AllocationFamily Family,
+   const Expr *SizeEx) const {
+  if (ExplodedNode *N = C.generateErrorNode(State)) {
+
+std::optional CheckKind =
+getCheckIfTracked(Family);
+if (!CheckKind)
+  return;
+if (!BT_TaintedAlloc[*CheckKind])
+  BT_TaintedAlloc[*CheckKind].reset(new BugType(CheckNames[*CheckKind],
+"Tainted Memory 
Allocation",
+categories::MemoryError));
+auto R = std::make_unique(
+*BT_TaintedAlloc[*CheckKind], Msg, N);
+
+bugreporter::trackExpressionValue(N, SizeEx, *R);
+for (auto Sym : TaintedSyms)
+  R->markInteresting(Sym);
+C.emitReport(std::move(R));
+  }
+}
+
+void MallocChecker::CheckTaintedness(CheckerContext , const CallEvent ,
+ const SVal SizeSVal, ProgramStateRef 
State,
+ AllocationFamily Family) const {
+  std::vector TaintedSyms =
+  clang::ento::taint::getTaintedSymbols(State, SizeSVal);

steakhal wrote:

```suggestion
  taint::getTaintedSymbols(State, SizeSVal);
```

https://github.com/llvm/llvm-project/pull/92420
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Adding taint analysis capability to unix.Malloc checker (PR #92420)

2024-05-17 Thread Balazs Benics via cfe-commits

https://github.com/steakhal edited 
https://github.com/llvm/llvm-project/pull/92420
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Adding taint analysis capability to unix.Malloc checker (PR #92420)

2024-05-17 Thread Balazs Benics via cfe-commits


@@ -1273,6 +1273,41 @@ Check for memory leaks, double free, and use-after-free 
problems. Traces memory
 .. literalinclude:: checkers/unix_malloc_example.c
 :language: c
 
+If the ``alpha.security.taint.TaintPropagation`` checker is enabled, the 
checker
+warns for cases when the ``size`` parameter of the ``malloc`` , ``calloc``,
+``realloc``, ``alloca`` is tainted (potentially attacker controlled). If an
+attacker can inject a large value as the size parameter, memory exhaustion
+denial of service attack can be carried out.
+
+The analyzer emits warning only if it cannot prove that the size parameter is
+within reasonable bounds (``<= SIZE_MAX/4``). This functionality partially
+covers the SEI Cert coding standard rule `INT04-C
+`_.
+
+You can silence this warning either by bound checking the ``size`` parameter, 
or
+by explicitly marking the ``size`` parameter as sanitized. See the
+:ref:`alpha-security-taint-TaintPropagation` checker for more details.
+
+.. code-block:: c
+
+  void t1(void) {
+size_t size;
+scanf("%zu", );
+int *p = malloc(size); // warn: malloc is called with a tainted 
(potentially attacker controlled) value
+free(p);
+  }
+
+  void t3(void) {
+size_t size;
+scanf("%zu", );
+if (1024https://github.com/llvm/llvm-project/pull/92420
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Adding taint analysis capability to unix.Malloc checker (PR #92420)

2024-05-17 Thread Balazs Benics via cfe-commits


@@ -48,6 +49,45 @@ void myfoo(int *p);
 void myfooint(int p);
 char *fooRetPtr(void);
 
+void t1(void) {
+  size_t size;
+  scanf("%zu", );
+  int *p = malloc(size); // expected-warning{{malloc is called with a tainted 
(potentially attacker controlled) value}}
+  free(p);
+}
+
+void t2(void) {
+  size_t size;
+  scanf("%zu", );
+  int *p = calloc(size,2); // expected-warning{{calloc is called with a 
tainted (potentially attacker controlled) value}}
+  free(p);
+}
+
+void t3(void) {
+  size_t size;
+  scanf("%zu", );
+  if (1024https://github.com/llvm/llvm-project/pull/92420
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Adding taint analysis capability to unix.Malloc checker (PR #92420)

2024-05-17 Thread Balazs Benics via cfe-commits

https://github.com/steakhal commented:

The patch makes sense to me.
I'll not repeat the existing comments, they raise relevant concerns.

It would be nice to extend some test case with a tainted malloc to see how 
those note tags play out from the generic taint checker in this context. For 
this, I'd suggest you to have a look at some taint tests where we enable the 
`text` diagnostic output.

https://github.com/llvm/llvm-project/pull/92420
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer][NFC] Require explicit matching mode for CallDescriptions (PR #92454)

2024-05-17 Thread Balazs Benics via cfe-commits
=?utf-8?q?Don=C3=A1t?= Nagy 
Message-ID:
In-Reply-To: 


https://github.com/steakhal approved this pull request.


https://github.com/llvm/llvm-project/pull/92454
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][analyzer] Add checker 'security.SetgidSetuidOrder' (PR #91445)

2024-05-16 Thread Balazs Benics via cfe-commits
=?utf-8?q?Balázs_Kéri?= ,
=?utf-8?q?Balázs_Kéri?= ,
=?utf-8?q?Balázs_Kéri?= ,
=?utf-8?q?Balázs_Kéri?= ,
=?utf-8?q?Balázs_Kéri?= ,
=?utf-8?q?Balázs_Kéri?= ,
=?utf-8?q?Balázs_Kéri?= 
Message-ID:
In-Reply-To: 



@@ -0,0 +1,75 @@
+// RUN: %clang_analyze_cc1 -analyzer-checker=core,security.SetgidSetuidOrder 
-analyzer-output=text -verify %s
+
+typedef int uid_t;
+typedef int gid_t;
+
+int setuid(uid_t);
+int setgid(gid_t);
+
+uid_t getuid();
+gid_t getgid();
+
+
+
+void test_note_1() {
+  if (setuid(getuid()) == -1) // expected-note{{Assuming the condition is 
false}} \
+  // expected-note{{Taking false branch}}
+return;
+  if (setuid(getuid()) == -1) // expected-note{{Call to 'setuid' found here 
that removes superuser privileges}} \
+  // expected-note{{Assuming the condition is 
false}} \
+  // expected-note{{Taking false branch}}
+return;
+  if (setgid(getgid()) == -1) // expected-warning{{A 'setgid(getgid())' call 
following a 'setuid(getuid())' call is likely to fail}} \
+  // expected-note{{A 'setgid(getgid())' call 
following a 'setuid(getuid())' call is likely to fail}}
+return;
+}
+
+void test_note_2() {
+  if (setuid(getuid()) == -1) // expected-note{{Call to 'setuid' found here 
that removes superuser privileges}} \
+  // expected-note{{Assuming the condition is 
false}} \
+  // expected-note{{Taking false branch}} \
+  // expected-note{{Assuming the condition is 
false}} \
+  // expected-note{{Taking false branch}}

steakhal wrote:

```suggestion
  // expected-note 2 {{Assuming the condition is 
false}} \
  // expected-note 2 {{Taking false branch}}
```

https://github.com/llvm/llvm-project/pull/91445
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][analyzer] Add checker 'security.SetgidSetuidOrder' (PR #91445)

2024-05-16 Thread Balazs Benics via cfe-commits
=?utf-8?q?Balázs_Kéri?= ,
=?utf-8?q?Balázs_Kéri?= ,
=?utf-8?q?Balázs_Kéri?= ,
=?utf-8?q?Balázs_Kéri?= ,
=?utf-8?q?Balázs_Kéri?= ,
=?utf-8?q?Balázs_Kéri?= ,
=?utf-8?q?Balázs_Kéri?= 
Message-ID:
In-Reply-To: 


https://github.com/steakhal edited 
https://github.com/llvm/llvm-project/pull/91445
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][analyzer] Add checker 'security.SetgidSetuidOrder' (PR #91445)

2024-05-16 Thread Balazs Benics via cfe-commits
=?utf-8?q?Balázs_Kéri?= ,
=?utf-8?q?Balázs_Kéri?= ,
=?utf-8?q?Balázs_Kéri?= ,
=?utf-8?q?Balázs_Kéri?= ,
=?utf-8?q?Balázs_Kéri?= ,
=?utf-8?q?Balázs_Kéri?= ,
=?utf-8?q?Balázs_Kéri?= 
Message-ID:
In-Reply-To: 


https://github.com/steakhal approved this pull request.

LGTM now, modulo the license concern mentioned 
[here](https://github.com/llvm/llvm-project/pull/91445#discussion_r1599766388).
I approve this, given that is checked/resolved.

https://github.com/llvm/llvm-project/pull/91445
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang-tools-extra] [clang] Fix CXXNewExpr end source location for 'new struct S' (PR #92266)

2024-05-16 Thread Balazs Benics via cfe-commits

https://github.com/steakhal closed 
https://github.com/llvm/llvm-project/pull/92266
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][analyzer] Add checker 'security.SetgidSetuidOrder' (PR #91445)

2024-05-15 Thread Balazs Benics via cfe-commits
=?utf-8?q?Balázs_Kéri?= ,
=?utf-8?q?Balázs_Kéri?= ,
=?utf-8?q?Balázs_Kéri?= ,
=?utf-8?q?Balázs_Kéri?= ,
=?utf-8?q?Balázs_Kéri?= ,
=?utf-8?q?Balázs_Kéri?= 
Message-ID:
In-Reply-To: 



@@ -0,0 +1,197 @@
+//===-- SetgidSetuidOrderChecker.cpp - check privilege revocation calls 
---===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM 
Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===--===//
+//
+//  This file defines a checker to detect possible reversed order of privilege
+//  revocations when 'setgid' and 'setuid' is used.
+//
+//===--===//
+
+#include "clang/StaticAnalyzer/Checkers/BuiltinCheckerRegistration.h"
+#include "clang/StaticAnalyzer/Core/BugReporter/BugType.h"
+#include "clang/StaticAnalyzer/Core/Checker.h"
+#include "clang/StaticAnalyzer/Core/CheckerManager.h"
+#include "clang/StaticAnalyzer/Core/PathSensitive/CallDescription.h"
+#include "clang/StaticAnalyzer/Core/PathSensitive/CallEvent.h"
+#include "clang/StaticAnalyzer/Core/PathSensitive/CheckerContext.h"
+#include "clang/StaticAnalyzer/Core/PathSensitive/ProgramState.h"
+#include "clang/StaticAnalyzer/Core/PathSensitive/ProgramStateTrait.h"
+
+using namespace clang;
+using namespace ento;
+
+namespace {
+
+enum SetPrivilegeFunctionKind { Irrelevant, Setuid, Setgid };
+
+class SetgidSetuidOrderChecker : public Checker 
{
+  const BugType BT{this, "Possible wrong order of privilege revocation"};
+
+  const CallDescription SetuidDesc{CDM::CLibrary, {"setuid"}, 1};
+  const CallDescription SetgidDesc{CDM::CLibrary, {"setgid"}, 1};
+
+  const CallDescription GetuidDesc{CDM::CLibrary, {"getuid"}, 0};
+  const CallDescription GetgidDesc{CDM::CLibrary, {"getgid"}, 0};
+
+  const CallDescriptionSet OtherSetPrivilegeDesc{
+  {CDM::CLibrary, {"seteuid"}, 1},   {CDM::CLibrary, {"setegid"}, 1},
+  {CDM::CLibrary, {"setreuid"}, 2},  {CDM::CLibrary, {"setregid"}, 2},
+  {CDM::CLibrary, {"setresuid"}, 3}, {CDM::CLibrary, {"setresgid"}, 3}};
+
+public:
+  void checkPostCall(const CallEvent , CheckerContext ) const;
+  ProgramStateRef evalAssume(ProgramStateRef State, SVal Cond,
+ bool Assumption) const;
+  const BugType *getBT() const { return  }
+
+private:
+  void processSetuid(ProgramStateRef State, const CallEvent ,
+ CheckerContext ) const;
+  void processSetgid(ProgramStateRef State, const CallEvent ,
+ CheckerContext ) const;
+  void processOther(ProgramStateRef State, const CallEvent ,
+CheckerContext ) const;
+  /// Check if a function like \c getuid or \c getgid is called directly from
+  /// the first argument of function called from \a Call.
+  bool isFunctionCalledInArg(const CallDescription ,
+ const CallEvent ) const;
+  void emitReport(ProgramStateRef State, CheckerContext ) const;
+};
+
+} // end anonymous namespace
+
+/// Store if there was a call to 'setuid(getuid())' or 'setgid(getgid())' not
+/// followed by other different privilege-change functions.
+/// If the value \c Setuid is stored and a 'setgid(getgid())' call is found we
+/// have found the bug to be reported. Value \c Setgid is used too to prevent
+/// warnings at a setgid-setuid-setgid sequence.
+REGISTER_TRAIT_WITH_PROGRAMSTATE(LastSetPrivilegeCall, 
SetPrivilegeFunctionKind)
+/// Store the symbol value of the last 'setuid(getuid())' call. This is used to
+/// detect if the result is compared to -1 and avoid warnings on that branch
+/// (which is the failure branch of the call), and for identification of note
+/// tags.
+REGISTER_TRAIT_WITH_PROGRAMSTATE(LastSetuidCallSVal, SymbolRef)
+
+void SetgidSetuidOrderChecker::checkPostCall(const CallEvent ,
+ CheckerContext ) const {
+  ProgramStateRef State = C.getState();
+  if (SetuidDesc.matches(Call)) {
+processSetuid(State, Call, C);
+  } else if (SetgidDesc.matches(Call)) {
+processSetgid(State, Call, C);
+  } else if (OtherSetPrivilegeDesc.contains(Call)) {
+processOther(State, Call, C);
+  }
+}
+
+ProgramStateRef SetgidSetuidOrderChecker::evalAssume(ProgramStateRef State,
+ SVal Cond,
+ bool Assumption) const {
+  SValBuilder  = State->getStateManager().getSValBuilder();
+  SymbolRef LastSetuidSym = State->get();
+  if (!LastSetuidSym)
+return State;
+
+  // Check if the most recent call to 'setuid(getuid())' is assumed to be != 0.
+  // It should be only -1 at failure, but we want to accept a "!= 0" check too.
+  // (But now an invalid failure check like "!= 1" will be recognized as 
correct
+  // too. The "invalid failure check" is a different bug that is not the 

[clang] [clang][analyzer] Add checker 'security.SetgidSetuidOrder' (PR #91445)

2024-05-15 Thread Balazs Benics via cfe-commits
=?utf-8?q?Balázs_Kéri?= ,
=?utf-8?q?Balázs_Kéri?= ,
=?utf-8?q?Balázs_Kéri?= ,
=?utf-8?q?Balázs_Kéri?= ,
=?utf-8?q?Balázs_Kéri?= ,
=?utf-8?q?Balázs_Kéri?= 
Message-ID:
In-Reply-To: 



@@ -0,0 +1,197 @@
+//===-- SetgidSetuidOrderChecker.cpp - check privilege revocation calls 
---===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM 
Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===--===//
+//
+//  This file defines a checker to detect possible reversed order of privilege
+//  revocations when 'setgid' and 'setuid' is used.
+//
+//===--===//
+
+#include "clang/StaticAnalyzer/Checkers/BuiltinCheckerRegistration.h"
+#include "clang/StaticAnalyzer/Core/BugReporter/BugType.h"
+#include "clang/StaticAnalyzer/Core/Checker.h"
+#include "clang/StaticAnalyzer/Core/CheckerManager.h"
+#include "clang/StaticAnalyzer/Core/PathSensitive/CallDescription.h"
+#include "clang/StaticAnalyzer/Core/PathSensitive/CallEvent.h"
+#include "clang/StaticAnalyzer/Core/PathSensitive/CheckerContext.h"
+#include "clang/StaticAnalyzer/Core/PathSensitive/ProgramState.h"
+#include "clang/StaticAnalyzer/Core/PathSensitive/ProgramStateTrait.h"
+
+using namespace clang;
+using namespace ento;
+
+namespace {
+
+enum SetPrivilegeFunctionKind { Irrelevant, Setuid, Setgid };
+
+class SetgidSetuidOrderChecker : public Checker 
{
+  const BugType BT{this, "Possible wrong order of privilege revocation"};
+
+  const CallDescription SetuidDesc{CDM::CLibrary, {"setuid"}, 1};
+  const CallDescription SetgidDesc{CDM::CLibrary, {"setgid"}, 1};
+
+  const CallDescription GetuidDesc{CDM::CLibrary, {"getuid"}, 0};
+  const CallDescription GetgidDesc{CDM::CLibrary, {"getgid"}, 0};
+
+  const CallDescriptionSet OtherSetPrivilegeDesc{
+  {CDM::CLibrary, {"seteuid"}, 1},   {CDM::CLibrary, {"setegid"}, 1},
+  {CDM::CLibrary, {"setreuid"}, 2},  {CDM::CLibrary, {"setregid"}, 2},
+  {CDM::CLibrary, {"setresuid"}, 3}, {CDM::CLibrary, {"setresgid"}, 3}};
+
+public:
+  void checkPostCall(const CallEvent , CheckerContext ) const;
+  ProgramStateRef evalAssume(ProgramStateRef State, SVal Cond,
+ bool Assumption) const;
+  const BugType *getBT() const { return  }

steakhal wrote:

I guess you dont need this.

https://github.com/llvm/llvm-project/pull/91445
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][analyzer] Add checker 'security.SetgidSetuidOrder' (PR #91445)

2024-05-15 Thread Balazs Benics via cfe-commits
=?utf-8?q?Balázs_Kéri?= ,
=?utf-8?q?Balázs_Kéri?= ,
=?utf-8?q?Balázs_Kéri?= ,
=?utf-8?q?Balázs_Kéri?= ,
=?utf-8?q?Balázs_Kéri?= ,
=?utf-8?q?Balázs_Kéri?= 
Message-ID:
In-Reply-To: 


https://github.com/steakhal requested changes to this pull request.

NoteTags, yeey.

Please add tests for the Notes added by the NoteTag callbacks. Inother words, 
add a test with the text diagnostic output.

https://github.com/llvm/llvm-project/pull/91445
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][analyzer] Add checker 'security.SetgidSetuidOrder' (PR #91445)

2024-05-15 Thread Balazs Benics via cfe-commits
=?utf-8?q?Balázs_Kéri?= ,
=?utf-8?q?Balázs_Kéri?= ,
=?utf-8?q?Balázs_Kéri?= ,
=?utf-8?q?Balázs_Kéri?= ,
=?utf-8?q?Balázs_Kéri?= ,
=?utf-8?q?Balázs_Kéri?= 
Message-ID:
In-Reply-To: 


https://github.com/steakhal edited 
https://github.com/llvm/llvm-project/pull/91445
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang] Fix CXXNewExpr end source location for 'new struct S' (PR #92266)

2024-05-15 Thread Balazs Benics via cfe-commits

https://github.com/steakhal updated 
https://github.com/llvm/llvm-project/pull/92266

>From eeb24ddbf261fd7667dd05feee14637bc379d182 Mon Sep 17 00:00:00 2001
From: Arseniy Zaostrovnykh 
Date: Wed, 15 May 2024 16:02:07 +0200
Subject: [PATCH] Fix CXXNewExpr end source location for 'new struct S'

---
 clang/lib/Parse/ParseDeclCXX.cpp  | 1 +
 clang/test/Misc/ast-source-ranges.cpp | 5 +
 2 files changed, 6 insertions(+)
 create mode 100644 clang/test/Misc/ast-source-ranges.cpp

diff --git a/clang/lib/Parse/ParseDeclCXX.cpp b/clang/lib/Parse/ParseDeclCXX.cpp
index 65ddebca49bc6..32c4e923243a9 100644
--- a/clang/lib/Parse/ParseDeclCXX.cpp
+++ b/clang/lib/Parse/ParseDeclCXX.cpp
@@ -1883,6 +1883,7 @@ void Parser::ParseClassSpecifier(tok::TokenKind 
TagTokKind,
   if (Tok.is(tok::identifier)) {
 Name = Tok.getIdentifierInfo();
 NameLoc = ConsumeToken();
+DS.SetRangeEnd(NameLoc);
 
 if (Tok.is(tok::less) && getLangOpts().CPlusPlus) {
   // The name was supposed to refer to a template, but didn't.
diff --git a/clang/test/Misc/ast-source-ranges.cpp 
b/clang/test/Misc/ast-source-ranges.cpp
new file mode 100644
index 0..9c0ab9449a6f5
--- /dev/null
+++ b/clang/test/Misc/ast-source-ranges.cpp
@@ -0,0 +1,5 @@
+// RUN: %clang_cc1 -ast-dump %s  2>&1 | FileCheck %s
+
+struct Sock {};
+void leakNewFn() { new struct Sock; }
+// CHECK: CXXNewExpr {{.*}}  'struct Sock *'

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang] Fix CXXNewExpr end source location for 'new struct S' (PR #92266)

2024-05-15 Thread Balazs Benics via cfe-commits

https://github.com/steakhal edited 
https://github.com/llvm/llvm-project/pull/92266
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


  1   2   3   4   5   6   7   8   9   10   >