[clang] [analyzer] Refactor recognition of the errno getter functions (PR #91531)

2024-05-15 Thread Donát Nagy via cfe-commits

https://github.com/NagyDonat closed 
https://github.com/llvm/llvm-project/pull/91531
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Refactor recognition of the errno getter functions (PR #91531)

2024-05-14 Thread Balazs Benics via cfe-commits
=?utf-8?q?Don=C3=A1t?= Nagy ,
=?utf-8?q?Don=C3=A1t?= Nagy ,
=?utf-8?q?Don=C3=A1t?= Nagy ,
=?utf-8?q?Don=C3=A1t?= Nagy ,
=?utf-8?q?Don=C3=A1t?= Nagy 
Message-ID:
In-Reply-To: 


https://github.com/steakhal approved this pull request.


https://github.com/llvm/llvm-project/pull/91531
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Refactor recognition of the errno getter functions (PR #91531)

2024-05-14 Thread Donát Nagy via cfe-commits

https://github.com/NagyDonat updated 
https://github.com/llvm/llvm-project/pull/91531

From 07dc4dd5c60c8a04637cce686b379e195deb5b67 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Don=C3=A1t=20Nagy?= 
Date: Wed, 8 May 2024 20:01:57 +0200
Subject: [PATCH 1/6] [analyzer] Refactor recognition of the errno getter
 functions

There are many environments where `errno` is a macro that expands to
something like `(*__errno())` (different standard library
implementations use different names instead of "__errno").

In these environments the ErrnoModeling checker creates a symbolic
region which will be used to represent the return value of this "get the
location of errno" function.

Previously this symbol was only created when the checker was able to
find the declaration of the "get the location of errno" function; but
this commit eliminates the complex logic that was responsible for this
and always creates the symbolic region when `errno` is not available as
a "regular" global variable.

This significantly simplifies a code and only introduces a minimal
performance reduction (one extra symbol) in the unlikely case when
`errno` is not declared (neither as a variable nor as a function), but
the `ErrnoModeling` checker is enabled.

In addition to this simplification, this commit specifies that the
`CallDescription`s for the "get the location of errno" functions are
matched in `CDM::CLibrary` mode. (This was my original goal, but I was
sidetracked by resolving a FIXME above the `CallDescriptionSet` in
`ErrnoModeling.cpp`.)

This change is very close to being NFC, but it fixes weird corner
cases like the handling of a C++ method that happens to be named
"__errno()" (previously it could've been recognized as an errno
location getter function).
---
 .../StaticAnalyzer/Checkers/ErrnoChecker.cpp  |   2 +-
 .../StaticAnalyzer/Checkers/ErrnoModeling.cpp | 127 ++
 .../StaticAnalyzer/Checkers/ErrnoModeling.h   |   9 +-
 clang/test/Analysis/memory-model.cpp  |  18 +--
 4 files changed, 53 insertions(+), 103 deletions(-)

diff --git a/clang/lib/StaticAnalyzer/Checkers/ErrnoChecker.cpp 
b/clang/lib/StaticAnalyzer/Checkers/ErrnoChecker.cpp
index 18e718e085536..72fd6781a7561 100644
--- a/clang/lib/StaticAnalyzer/Checkers/ErrnoChecker.cpp
+++ b/clang/lib/StaticAnalyzer/Checkers/ErrnoChecker.cpp
@@ -205,7 +205,7 @@ void ErrnoChecker::checkPreCall(const CallEvent ,
   // Probably 'strerror'?
   if (CallF->isExternC() && CallF->isGlobal() &&
   C.getSourceManager().isInSystemHeader(CallF->getLocation()) &&
-  !isErrno(CallF)) {
+  !isErrnoLocationCall(Call)) {
 if (getErrnoState(C.getState()) == MustBeChecked) {
   std::optional ErrnoLoc = getErrnoLoc(C.getState());
   assert(ErrnoLoc && "ErrnoLoc should exist if an errno state is set.");
diff --git a/clang/lib/StaticAnalyzer/Checkers/ErrnoModeling.cpp 
b/clang/lib/StaticAnalyzer/Checkers/ErrnoModeling.cpp
index 1b34ea0e056e5..0612cd4c87248 100644
--- a/clang/lib/StaticAnalyzer/Checkers/ErrnoModeling.cpp
+++ b/clang/lib/StaticAnalyzer/Checkers/ErrnoModeling.cpp
@@ -39,10 +39,15 @@ namespace {
 // Name of the "errno" variable.
 // FIXME: Is there a system where it is not called "errno" but is a variable?
 const char *ErrnoVarName = "errno";
+
 // Names of functions that return a location of the "errno" value.
 // FIXME: Are there other similar function names?
-const char *ErrnoLocationFuncNames[] = {"__errno_location", "___errno",
-"__errno", "_errno", "__error"};
+CallDescriptionSet ErrnoLocationCalls{
+{CDM::SimpleFunc, {"__errno_location"}, 0, 0},
+{CDM::SimpleFunc, {"___errno"}, 0, 0},
+{CDM::SimpleFunc, {"__errno"}, 0, 0},
+{CDM::SimpleFunc, {"_errno"}, 0, 0},
+{CDM::SimpleFunc, {"__error"}, 0, 0}};
 
 class ErrnoModeling
 : public Checker, check::BeginFunction,
@@ -54,16 +59,10 @@ class ErrnoModeling
   void checkLiveSymbols(ProgramStateRef State, SymbolReaper ) const;
   bool evalCall(const CallEvent , CheckerContext ) const;
 
-  // The declaration of an "errno" variable or "errno location" function.
-  mutable const Decl *ErrnoDecl = nullptr;
-
 private:
-  // FIXME: Names from `ErrnoLocationFuncNames` are used to build this set.
-  CallDescriptionSet ErrnoLocationCalls{{{"__errno_location"}, 0, 0},
-{{"___errno"}, 0, 0},
-{{"__errno"}, 0, 0},
-{{"_errno"}, 0, 0},
-{{"__error"}, 0, 0}};
+  // The declaration of an "errno" variable on systems where errno is
+  // represented by a variable (and not a function that queries its location).
+  mutable const Decl *ErrnoDecl = nullptr;
 };
 
 } // namespace
@@ -74,9 +73,13 @@ REGISTER_TRAIT_WITH_PROGRAMSTATE(ErrnoRegion, const 
MemRegion *)
 
 REGISTER_TRAIT_WITH_PROGRAMSTATE(ErrnoState, errno_modeling::ErrnoCheckState)
 
-/// Search for a variable called "errno" in the AST.

[clang] [analyzer] Refactor recognition of the errno getter functions (PR #91531)

2024-05-14 Thread Balazs Benics via cfe-commits
=?utf-8?q?Donát?= Nagy ,
=?utf-8?q?Donát?= Nagy ,
=?utf-8?q?Donát?= Nagy ,
=?utf-8?q?Donát?= Nagy 
Message-ID:
In-Reply-To: 



@@ -136,53 +100,49 @@ void ErrnoModeling::checkBeginFunction(CheckerContext ) 
const {
   ASTContext  = C.getASTContext();
   ProgramStateRef State = C.getState();
 
-  if (const auto *ErrnoVar = dyn_cast_or_null(ErrnoDecl)) {
-// There is an external 'errno' variable.
-// Use its memory region.
-// The memory region for an 'errno'-like variable is allocated in system
-// space by MemRegionManager.
-const MemRegion *ErrnoR =
-State->getRegion(ErrnoVar, C.getLocationContext());
+  const MemRegion *ErrnoR;

steakhal wrote:

```suggestion
  const MemRegion *ErrnoR = nullptr;
```

https://github.com/llvm/llvm-project/pull/91531
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Refactor recognition of the errno getter functions (PR #91531)

2024-05-14 Thread Balazs Benics via cfe-commits
=?utf-8?q?Donát?= Nagy ,
=?utf-8?q?Donát?= Nagy ,
=?utf-8?q?Donát?= Nagy ,
=?utf-8?q?Donát?= Nagy 
Message-ID:
In-Reply-To: 



@@ -136,53 +100,49 @@ void ErrnoModeling::checkBeginFunction(CheckerContext ) 
const {
   ASTContext  = C.getASTContext();
   ProgramStateRef State = C.getState();
 
-  if (const auto *ErrnoVar = dyn_cast_or_null(ErrnoDecl)) {
-// There is an external 'errno' variable.
-// Use its memory region.
-// The memory region for an 'errno'-like variable is allocated in system
-// space by MemRegionManager.
-const MemRegion *ErrnoR =
-State->getRegion(ErrnoVar, C.getLocationContext());
+  const MemRegion *ErrnoR;
+
+  if (ErrnoDecl) {
+// There is an external 'errno' variable, so we can simply use the memory
+// region that's associated with it.
+ErrnoR = State->getRegion(ErrnoDecl, C.getLocationContext());
 assert(ErrnoR && "Memory region should exist for the 'errno' variable.");
-State = State->set(ErrnoR);
-State =
-errno_modeling::setErrnoValue(State, C, 0, errno_modeling::Irrelevant);
-C.addTransition(State);
-  } else if (ErrnoDecl) {
-assert(isa(ErrnoDecl) && "Invalid errno location function.");
-// There is a function that returns the location of 'errno'.
-// We must create a memory region for it in system space.
-// Currently a symbolic region is used with an artifical symbol.
-// FIXME: It is better to have a custom (new) kind of MemRegion for such
-// cases.
+  } else {
+// There is no 'errno' variable, so create a new symbolic memory region
+// that can be used to model the return value of the "get the location of
+// errno" internal functions.
+// NOTE: this `SVal` is created even if errno is not defined or used.
 SValBuilder  = C.getSValBuilder();
 MemRegionManager  = C.getStateManager().getRegionManager();
 
 const MemSpaceRegion *GlobalSystemSpace =
 RMgr.getGlobalsRegion(MemRegion::GlobalSystemSpaceRegionKind);
 
 // Create an artifical symbol for the region.
-// It is not possible to associate a statement or expression in this case.
+// Note that it is not possible to associate a statement or expression in
+// this case and the `symbolTag` (opaque pointer tag) is just the address
+// of the data member `ErrnoDecl` of the singleton `ErrnoModeling` checker
+// object.
 const SymbolConjured *Sym = SVB.conjureSymbol(
 nullptr, C.getLocationContext(),
 ACtx.getLValueReferenceType(ACtx.IntTy), C.blockCount(), );
 
 // The symbolic region is untyped, create a typed sub-region in it.
 // The ElementRegion is used to make the errno region a typed region.
-const MemRegion *ErrnoR = RMgr.getElementRegion(
+ErrnoR = RMgr.getElementRegion(
 ACtx.IntTy, SVB.makeZeroArrayIndex(),
 RMgr.getSymbolicRegion(Sym, GlobalSystemSpace), C.getASTContext());
-State = State->set(ErrnoR);
-State =
-errno_modeling::setErrnoValue(State, C, 0, errno_modeling::Irrelevant);
-C.addTransition(State);
   }
+  State = State->set(ErrnoR);

steakhal wrote:

```suggestion
  assert(ErrnoR);
  State = State->set(ErrnoR);
```

https://github.com/llvm/llvm-project/pull/91531
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Refactor recognition of the errno getter functions (PR #91531)

2024-05-14 Thread Balazs Benics via cfe-commits
=?utf-8?q?Donát?= Nagy ,
=?utf-8?q?Donát?= Nagy ,
=?utf-8?q?Donát?= Nagy ,
=?utf-8?q?Donát?= Nagy 
Message-ID:
In-Reply-To: 


https://github.com/steakhal edited 
https://github.com/llvm/llvm-project/pull/91531
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Refactor recognition of the errno getter functions (PR #91531)

2024-05-14 Thread Balazs Benics via cfe-commits
=?utf-8?q?Don=C3=A1t?= Nagy ,
=?utf-8?q?Don=C3=A1t?= Nagy ,
=?utf-8?q?Don=C3=A1t?= Nagy ,
=?utf-8?q?Don=C3=A1t?= Nagy 
Message-ID:
In-Reply-To: 


https://github.com/steakhal approved this pull request.

LGTM.
I'd only add some safety barriers for some places, denoted my suggested edits.

https://github.com/llvm/llvm-project/pull/91531
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Refactor recognition of the errno getter functions (PR #91531)

2024-05-13 Thread Donát Nagy via cfe-commits


@@ -136,53 +100,48 @@ void ErrnoModeling::checkBeginFunction(CheckerContext ) 
const {
   ASTContext  = C.getASTContext();
   ProgramStateRef State = C.getState();
 
-  if (const auto *ErrnoVar = dyn_cast_or_null(ErrnoDecl)) {
-// There is an external 'errno' variable.
-// Use its memory region.
-// The memory region for an 'errno'-like variable is allocated in system
-// space by MemRegionManager.
-const MemRegion *ErrnoR =
-State->getRegion(ErrnoVar, C.getLocationContext());
+  const MemRegion *ErrnoR;
+
+  if (ErrnoDecl) {
+// There is an external 'errno' variable, so we can simply use the memory
+// region that's associated with it.
+ErrnoR = State->getRegion(ErrnoDecl, C.getLocationContext());
 assert(ErrnoR && "Memory region should exist for the 'errno' variable.");
-State = State->set(ErrnoR);
-State =
-errno_modeling::setErrnoValue(State, C, 0, errno_modeling::Irrelevant);
-C.addTransition(State);
-  } else if (ErrnoDecl) {
-assert(isa(ErrnoDecl) && "Invalid errno location function.");
-// There is a function that returns the location of 'errno'.
-// We must create a memory region for it in system space.
-// Currently a symbolic region is used with an artifical symbol.
-// FIXME: It is better to have a custom (new) kind of MemRegion for such
-// cases.
+  } else {
+// The 'errno' location is accessed via an internal getter function, so
+// create a new symbolic memory region that can be used as the return value
+// of that function.

NagyDonat wrote:

I extended the comment to emphasize this.

https://github.com/llvm/llvm-project/pull/91531
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Refactor recognition of the errno getter functions (PR #91531)

2024-05-13 Thread Donát Nagy via cfe-commits

https://github.com/NagyDonat updated 
https://github.com/llvm/llvm-project/pull/91531

From 07dc4dd5c60c8a04637cce686b379e195deb5b67 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Don=C3=A1t=20Nagy?= 
Date: Wed, 8 May 2024 20:01:57 +0200
Subject: [PATCH 1/5] [analyzer] Refactor recognition of the errno getter
 functions

There are many environments where `errno` is a macro that expands to
something like `(*__errno())` (different standard library
implementations use different names instead of "__errno").

In these environments the ErrnoModeling checker creates a symbolic
region which will be used to represent the return value of this "get the
location of errno" function.

Previously this symbol was only created when the checker was able to
find the declaration of the "get the location of errno" function; but
this commit eliminates the complex logic that was responsible for this
and always creates the symbolic region when `errno` is not available as
a "regular" global variable.

This significantly simplifies a code and only introduces a minimal
performance reduction (one extra symbol) in the unlikely case when
`errno` is not declared (neither as a variable nor as a function), but
the `ErrnoModeling` checker is enabled.

In addition to this simplification, this commit specifies that the
`CallDescription`s for the "get the location of errno" functions are
matched in `CDM::CLibrary` mode. (This was my original goal, but I was
sidetracked by resolving a FIXME above the `CallDescriptionSet` in
`ErrnoModeling.cpp`.)

This change is very close to being NFC, but it fixes weird corner
cases like the handling of a C++ method that happens to be named
"__errno()" (previously it could've been recognized as an errno
location getter function).
---
 .../StaticAnalyzer/Checkers/ErrnoChecker.cpp  |   2 +-
 .../StaticAnalyzer/Checkers/ErrnoModeling.cpp | 127 ++
 .../StaticAnalyzer/Checkers/ErrnoModeling.h   |   9 +-
 clang/test/Analysis/memory-model.cpp  |  18 +--
 4 files changed, 53 insertions(+), 103 deletions(-)

diff --git a/clang/lib/StaticAnalyzer/Checkers/ErrnoChecker.cpp 
b/clang/lib/StaticAnalyzer/Checkers/ErrnoChecker.cpp
index 18e718e085536..72fd6781a7561 100644
--- a/clang/lib/StaticAnalyzer/Checkers/ErrnoChecker.cpp
+++ b/clang/lib/StaticAnalyzer/Checkers/ErrnoChecker.cpp
@@ -205,7 +205,7 @@ void ErrnoChecker::checkPreCall(const CallEvent ,
   // Probably 'strerror'?
   if (CallF->isExternC() && CallF->isGlobal() &&
   C.getSourceManager().isInSystemHeader(CallF->getLocation()) &&
-  !isErrno(CallF)) {
+  !isErrnoLocationCall(Call)) {
 if (getErrnoState(C.getState()) == MustBeChecked) {
   std::optional ErrnoLoc = getErrnoLoc(C.getState());
   assert(ErrnoLoc && "ErrnoLoc should exist if an errno state is set.");
diff --git a/clang/lib/StaticAnalyzer/Checkers/ErrnoModeling.cpp 
b/clang/lib/StaticAnalyzer/Checkers/ErrnoModeling.cpp
index 1b34ea0e056e5..0612cd4c87248 100644
--- a/clang/lib/StaticAnalyzer/Checkers/ErrnoModeling.cpp
+++ b/clang/lib/StaticAnalyzer/Checkers/ErrnoModeling.cpp
@@ -39,10 +39,15 @@ namespace {
 // Name of the "errno" variable.
 // FIXME: Is there a system where it is not called "errno" but is a variable?
 const char *ErrnoVarName = "errno";
+
 // Names of functions that return a location of the "errno" value.
 // FIXME: Are there other similar function names?
-const char *ErrnoLocationFuncNames[] = {"__errno_location", "___errno",
-"__errno", "_errno", "__error"};
+CallDescriptionSet ErrnoLocationCalls{
+{CDM::SimpleFunc, {"__errno_location"}, 0, 0},
+{CDM::SimpleFunc, {"___errno"}, 0, 0},
+{CDM::SimpleFunc, {"__errno"}, 0, 0},
+{CDM::SimpleFunc, {"_errno"}, 0, 0},
+{CDM::SimpleFunc, {"__error"}, 0, 0}};
 
 class ErrnoModeling
 : public Checker, check::BeginFunction,
@@ -54,16 +59,10 @@ class ErrnoModeling
   void checkLiveSymbols(ProgramStateRef State, SymbolReaper ) const;
   bool evalCall(const CallEvent , CheckerContext ) const;
 
-  // The declaration of an "errno" variable or "errno location" function.
-  mutable const Decl *ErrnoDecl = nullptr;
-
 private:
-  // FIXME: Names from `ErrnoLocationFuncNames` are used to build this set.
-  CallDescriptionSet ErrnoLocationCalls{{{"__errno_location"}, 0, 0},
-{{"___errno"}, 0, 0},
-{{"__errno"}, 0, 0},
-{{"_errno"}, 0, 0},
-{{"__error"}, 0, 0}};
+  // The declaration of an "errno" variable on systems where errno is
+  // represented by a variable (and not a function that queries its location).
+  mutable const Decl *ErrnoDecl = nullptr;
 };
 
 } // namespace
@@ -74,9 +73,13 @@ REGISTER_TRAIT_WITH_PROGRAMSTATE(ErrnoRegion, const 
MemRegion *)
 
 REGISTER_TRAIT_WITH_PROGRAMSTATE(ErrnoState, errno_modeling::ErrnoCheckState)
 
-/// Search for a variable called "errno" in the AST.

[clang] [analyzer] Refactor recognition of the errno getter functions (PR #91531)

2024-05-13 Thread Balázs Kéri via cfe-commits


@@ -136,53 +100,48 @@ void ErrnoModeling::checkBeginFunction(CheckerContext ) 
const {
   ASTContext  = C.getASTContext();
   ProgramStateRef State = C.getState();
 
-  if (const auto *ErrnoVar = dyn_cast_or_null(ErrnoDecl)) {
-// There is an external 'errno' variable.
-// Use its memory region.
-// The memory region for an 'errno'-like variable is allocated in system
-// space by MemRegionManager.
-const MemRegion *ErrnoR =
-State->getRegion(ErrnoVar, C.getLocationContext());
+  const MemRegion *ErrnoR;
+
+  if (ErrnoDecl) {
+// There is an external 'errno' variable, so we can simply use the memory
+// region that's associated with it.
+ErrnoR = State->getRegion(ErrnoDecl, C.getLocationContext());
 assert(ErrnoR && "Memory region should exist for the 'errno' variable.");
-State = State->set(ErrnoR);
-State =
-errno_modeling::setErrnoValue(State, C, 0, errno_modeling::Irrelevant);
-C.addTransition(State);
-  } else if (ErrnoDecl) {
-assert(isa(ErrnoDecl) && "Invalid errno location function.");
-// There is a function that returns the location of 'errno'.
-// We must create a memory region for it in system space.
-// Currently a symbolic region is used with an artifical symbol.
-// FIXME: It is better to have a custom (new) kind of MemRegion for such
-// cases.
+  } else {
+// The 'errno' location is accessed via an internal getter function, so
+// create a new symbolic memory region that can be used as the return value
+// of that function.

balazske wrote:

I wanted to point out in the comment that if there is no `errno` definition at 
all (no variable and no getter function) and `errno` is not used in the 
program, the region is still created.

https://github.com/llvm/llvm-project/pull/91531
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Refactor recognition of the errno getter functions (PR #91531)

2024-05-10 Thread Donát Nagy via cfe-commits


@@ -136,53 +100,48 @@ void ErrnoModeling::checkBeginFunction(CheckerContext ) 
const {
   ASTContext  = C.getASTContext();
   ProgramStateRef State = C.getState();
 
-  if (const auto *ErrnoVar = dyn_cast_or_null(ErrnoDecl)) {
-// There is an external 'errno' variable.
-// Use its memory region.
-// The memory region for an 'errno'-like variable is allocated in system
-// space by MemRegionManager.
-const MemRegion *ErrnoR =
-State->getRegion(ErrnoVar, C.getLocationContext());
+  const MemRegion *ErrnoR;
+
+  if (ErrnoDecl) {
+// There is an external 'errno' variable, so we can simply use the memory
+// region that's associated with it.
+ErrnoR = State->getRegion(ErrnoDecl, C.getLocationContext());
 assert(ErrnoR && "Memory region should exist for the 'errno' variable.");
-State = State->set(ErrnoR);
-State =
-errno_modeling::setErrnoValue(State, C, 0, errno_modeling::Irrelevant);
-C.addTransition(State);
-  } else if (ErrnoDecl) {
-assert(isa(ErrnoDecl) && "Invalid errno location function.");
-// There is a function that returns the location of 'errno'.
-// We must create a memory region for it in system space.
-// Currently a symbolic region is used with an artifical symbol.
-// FIXME: It is better to have a custom (new) kind of MemRegion for such
-// cases.
+  } else {
+// The 'errno' location is accessed via an internal getter function, so
+// create a new symbolic memory region that can be used as the return value
+// of that function.

NagyDonat wrote:

Good point, I fixed it.

https://github.com/llvm/llvm-project/pull/91531
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Refactor recognition of the errno getter functions (PR #91531)

2024-05-10 Thread Donát Nagy via cfe-commits

https://github.com/NagyDonat updated 
https://github.com/llvm/llvm-project/pull/91531

From 07dc4dd5c60c8a04637cce686b379e195deb5b67 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Don=C3=A1t=20Nagy?= 
Date: Wed, 8 May 2024 20:01:57 +0200
Subject: [PATCH 1/4] [analyzer] Refactor recognition of the errno getter
 functions

There are many environments where `errno` is a macro that expands to
something like `(*__errno())` (different standard library
implementations use different names instead of "__errno").

In these environments the ErrnoModeling checker creates a symbolic
region which will be used to represent the return value of this "get the
location of errno" function.

Previously this symbol was only created when the checker was able to
find the declaration of the "get the location of errno" function; but
this commit eliminates the complex logic that was responsible for this
and always creates the symbolic region when `errno` is not available as
a "regular" global variable.

This significantly simplifies a code and only introduces a minimal
performance reduction (one extra symbol) in the unlikely case when
`errno` is not declared (neither as a variable nor as a function), but
the `ErrnoModeling` checker is enabled.

In addition to this simplification, this commit specifies that the
`CallDescription`s for the "get the location of errno" functions are
matched in `CDM::CLibrary` mode. (This was my original goal, but I was
sidetracked by resolving a FIXME above the `CallDescriptionSet` in
`ErrnoModeling.cpp`.)

This change is very close to being NFC, but it fixes weird corner
cases like the handling of a C++ method that happens to be named
"__errno()" (previously it could've been recognized as an errno
location getter function).
---
 .../StaticAnalyzer/Checkers/ErrnoChecker.cpp  |   2 +-
 .../StaticAnalyzer/Checkers/ErrnoModeling.cpp | 127 ++
 .../StaticAnalyzer/Checkers/ErrnoModeling.h   |   9 +-
 clang/test/Analysis/memory-model.cpp  |  18 +--
 4 files changed, 53 insertions(+), 103 deletions(-)

diff --git a/clang/lib/StaticAnalyzer/Checkers/ErrnoChecker.cpp 
b/clang/lib/StaticAnalyzer/Checkers/ErrnoChecker.cpp
index 18e718e085536..72fd6781a7561 100644
--- a/clang/lib/StaticAnalyzer/Checkers/ErrnoChecker.cpp
+++ b/clang/lib/StaticAnalyzer/Checkers/ErrnoChecker.cpp
@@ -205,7 +205,7 @@ void ErrnoChecker::checkPreCall(const CallEvent ,
   // Probably 'strerror'?
   if (CallF->isExternC() && CallF->isGlobal() &&
   C.getSourceManager().isInSystemHeader(CallF->getLocation()) &&
-  !isErrno(CallF)) {
+  !isErrnoLocationCall(Call)) {
 if (getErrnoState(C.getState()) == MustBeChecked) {
   std::optional ErrnoLoc = getErrnoLoc(C.getState());
   assert(ErrnoLoc && "ErrnoLoc should exist if an errno state is set.");
diff --git a/clang/lib/StaticAnalyzer/Checkers/ErrnoModeling.cpp 
b/clang/lib/StaticAnalyzer/Checkers/ErrnoModeling.cpp
index 1b34ea0e056e5..0612cd4c87248 100644
--- a/clang/lib/StaticAnalyzer/Checkers/ErrnoModeling.cpp
+++ b/clang/lib/StaticAnalyzer/Checkers/ErrnoModeling.cpp
@@ -39,10 +39,15 @@ namespace {
 // Name of the "errno" variable.
 // FIXME: Is there a system where it is not called "errno" but is a variable?
 const char *ErrnoVarName = "errno";
+
 // Names of functions that return a location of the "errno" value.
 // FIXME: Are there other similar function names?
-const char *ErrnoLocationFuncNames[] = {"__errno_location", "___errno",
-"__errno", "_errno", "__error"};
+CallDescriptionSet ErrnoLocationCalls{
+{CDM::SimpleFunc, {"__errno_location"}, 0, 0},
+{CDM::SimpleFunc, {"___errno"}, 0, 0},
+{CDM::SimpleFunc, {"__errno"}, 0, 0},
+{CDM::SimpleFunc, {"_errno"}, 0, 0},
+{CDM::SimpleFunc, {"__error"}, 0, 0}};
 
 class ErrnoModeling
 : public Checker, check::BeginFunction,
@@ -54,16 +59,10 @@ class ErrnoModeling
   void checkLiveSymbols(ProgramStateRef State, SymbolReaper ) const;
   bool evalCall(const CallEvent , CheckerContext ) const;
 
-  // The declaration of an "errno" variable or "errno location" function.
-  mutable const Decl *ErrnoDecl = nullptr;
-
 private:
-  // FIXME: Names from `ErrnoLocationFuncNames` are used to build this set.
-  CallDescriptionSet ErrnoLocationCalls{{{"__errno_location"}, 0, 0},
-{{"___errno"}, 0, 0},
-{{"__errno"}, 0, 0},
-{{"_errno"}, 0, 0},
-{{"__error"}, 0, 0}};
+  // The declaration of an "errno" variable on systems where errno is
+  // represented by a variable (and not a function that queries its location).
+  mutable const Decl *ErrnoDecl = nullptr;
 };
 
 } // namespace
@@ -74,9 +73,13 @@ REGISTER_TRAIT_WITH_PROGRAMSTATE(ErrnoRegion, const 
MemRegion *)
 
 REGISTER_TRAIT_WITH_PROGRAMSTATE(ErrnoState, errno_modeling::ErrnoCheckState)
 
-/// Search for a variable called "errno" in the AST.

[clang] [analyzer] Refactor recognition of the errno getter functions (PR #91531)

2024-05-10 Thread Balázs Kéri via cfe-commits


@@ -136,53 +100,48 @@ void ErrnoModeling::checkBeginFunction(CheckerContext ) 
const {
   ASTContext  = C.getASTContext();
   ProgramStateRef State = C.getState();
 
-  if (const auto *ErrnoVar = dyn_cast_or_null(ErrnoDecl)) {
-// There is an external 'errno' variable.
-// Use its memory region.
-// The memory region for an 'errno'-like variable is allocated in system
-// space by MemRegionManager.
-const MemRegion *ErrnoR =
-State->getRegion(ErrnoVar, C.getLocationContext());
+  const MemRegion *ErrnoR;
+
+  if (ErrnoDecl) {
+// There is an external 'errno' variable, so we can simply use the memory
+// region that's associated with it.
+ErrnoR = State->getRegion(ErrnoDecl, C.getLocationContext());
 assert(ErrnoR && "Memory region should exist for the 'errno' variable.");
-State = State->set(ErrnoR);
-State =
-errno_modeling::setErrnoValue(State, C, 0, errno_modeling::Irrelevant);
-C.addTransition(State);
-  } else if (ErrnoDecl) {
-assert(isa(ErrnoDecl) && "Invalid errno location function.");
-// There is a function that returns the location of 'errno'.
-// We must create a memory region for it in system space.
-// Currently a symbolic region is used with an artifical symbol.
-// FIXME: It is better to have a custom (new) kind of MemRegion for such
-// cases.
+  } else {
+// The 'errno' location is accessed via an internal getter function, so
+// create a new symbolic memory region that can be used as the return value
+// of that function.

balazske wrote:

We should mention that this case happens even if there is no errno getter 
function at all

https://github.com/llvm/llvm-project/pull/91531
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Refactor recognition of the errno getter functions (PR #91531)

2024-05-10 Thread Donát Nagy via cfe-commits

https://github.com/NagyDonat updated 
https://github.com/llvm/llvm-project/pull/91531

From 07dc4dd5c60c8a04637cce686b379e195deb5b67 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Don=C3=A1t=20Nagy?= 
Date: Wed, 8 May 2024 20:01:57 +0200
Subject: [PATCH 1/3] [analyzer] Refactor recognition of the errno getter
 functions

There are many environments where `errno` is a macro that expands to
something like `(*__errno())` (different standard library
implementations use different names instead of "__errno").

In these environments the ErrnoModeling checker creates a symbolic
region which will be used to represent the return value of this "get the
location of errno" function.

Previously this symbol was only created when the checker was able to
find the declaration of the "get the location of errno" function; but
this commit eliminates the complex logic that was responsible for this
and always creates the symbolic region when `errno` is not available as
a "regular" global variable.

This significantly simplifies a code and only introduces a minimal
performance reduction (one extra symbol) in the unlikely case when
`errno` is not declared (neither as a variable nor as a function), but
the `ErrnoModeling` checker is enabled.

In addition to this simplification, this commit specifies that the
`CallDescription`s for the "get the location of errno" functions are
matched in `CDM::CLibrary` mode. (This was my original goal, but I was
sidetracked by resolving a FIXME above the `CallDescriptionSet` in
`ErrnoModeling.cpp`.)

This change is very close to being NFC, but it fixes weird corner
cases like the handling of a C++ method that happens to be named
"__errno()" (previously it could've been recognized as an errno
location getter function).
---
 .../StaticAnalyzer/Checkers/ErrnoChecker.cpp  |   2 +-
 .../StaticAnalyzer/Checkers/ErrnoModeling.cpp | 127 ++
 .../StaticAnalyzer/Checkers/ErrnoModeling.h   |   9 +-
 clang/test/Analysis/memory-model.cpp  |  18 +--
 4 files changed, 53 insertions(+), 103 deletions(-)

diff --git a/clang/lib/StaticAnalyzer/Checkers/ErrnoChecker.cpp 
b/clang/lib/StaticAnalyzer/Checkers/ErrnoChecker.cpp
index 18e718e085536..72fd6781a7561 100644
--- a/clang/lib/StaticAnalyzer/Checkers/ErrnoChecker.cpp
+++ b/clang/lib/StaticAnalyzer/Checkers/ErrnoChecker.cpp
@@ -205,7 +205,7 @@ void ErrnoChecker::checkPreCall(const CallEvent ,
   // Probably 'strerror'?
   if (CallF->isExternC() && CallF->isGlobal() &&
   C.getSourceManager().isInSystemHeader(CallF->getLocation()) &&
-  !isErrno(CallF)) {
+  !isErrnoLocationCall(Call)) {
 if (getErrnoState(C.getState()) == MustBeChecked) {
   std::optional ErrnoLoc = getErrnoLoc(C.getState());
   assert(ErrnoLoc && "ErrnoLoc should exist if an errno state is set.");
diff --git a/clang/lib/StaticAnalyzer/Checkers/ErrnoModeling.cpp 
b/clang/lib/StaticAnalyzer/Checkers/ErrnoModeling.cpp
index 1b34ea0e056e5..0612cd4c87248 100644
--- a/clang/lib/StaticAnalyzer/Checkers/ErrnoModeling.cpp
+++ b/clang/lib/StaticAnalyzer/Checkers/ErrnoModeling.cpp
@@ -39,10 +39,15 @@ namespace {
 // Name of the "errno" variable.
 // FIXME: Is there a system where it is not called "errno" but is a variable?
 const char *ErrnoVarName = "errno";
+
 // Names of functions that return a location of the "errno" value.
 // FIXME: Are there other similar function names?
-const char *ErrnoLocationFuncNames[] = {"__errno_location", "___errno",
-"__errno", "_errno", "__error"};
+CallDescriptionSet ErrnoLocationCalls{
+{CDM::SimpleFunc, {"__errno_location"}, 0, 0},
+{CDM::SimpleFunc, {"___errno"}, 0, 0},
+{CDM::SimpleFunc, {"__errno"}, 0, 0},
+{CDM::SimpleFunc, {"_errno"}, 0, 0},
+{CDM::SimpleFunc, {"__error"}, 0, 0}};
 
 class ErrnoModeling
 : public Checker, check::BeginFunction,
@@ -54,16 +59,10 @@ class ErrnoModeling
   void checkLiveSymbols(ProgramStateRef State, SymbolReaper ) const;
   bool evalCall(const CallEvent , CheckerContext ) const;
 
-  // The declaration of an "errno" variable or "errno location" function.
-  mutable const Decl *ErrnoDecl = nullptr;
-
 private:
-  // FIXME: Names from `ErrnoLocationFuncNames` are used to build this set.
-  CallDescriptionSet ErrnoLocationCalls{{{"__errno_location"}, 0, 0},
-{{"___errno"}, 0, 0},
-{{"__errno"}, 0, 0},
-{{"_errno"}, 0, 0},
-{{"__error"}, 0, 0}};
+  // The declaration of an "errno" variable on systems where errno is
+  // represented by a variable (and not a function that queries its location).
+  mutable const Decl *ErrnoDecl = nullptr;
 };
 
 } // namespace
@@ -74,9 +73,13 @@ REGISTER_TRAIT_WITH_PROGRAMSTATE(ErrnoRegion, const 
MemRegion *)
 
 REGISTER_TRAIT_WITH_PROGRAMSTATE(ErrnoState, errno_modeling::ErrnoCheckState)
 
-/// Search for a variable called "errno" in the AST.

[clang] [analyzer] Refactor recognition of the errno getter functions (PR #91531)

2024-05-10 Thread Balázs Kéri via cfe-commits


@@ -74,9 +73,13 @@ REGISTER_TRAIT_WITH_PROGRAMSTATE(ErrnoRegion, const 
MemRegion *)
 
 REGISTER_TRAIT_WITH_PROGRAMSTATE(ErrnoState, errno_modeling::ErrnoCheckState)
 
-/// Search for a variable called "errno" in the AST.
-/// Return nullptr if not found.
-static const VarDecl *getErrnoVar(ASTContext ) {
+void ErrnoModeling::checkASTDecl(const TranslationUnitDecl *D,
+ AnalysisManager , BugReporter ) const {
+  // Try to find the declaration of the external variable `int errno;`.
+  // There are also C library implementations, where the `errno` location is
+  // accessed via a function that returns its address; in those environments
+  // this callback does nothing.

balazske wrote:

```suggestion
  // this callback has no effect.
```

https://github.com/llvm/llvm-project/pull/91531
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Refactor recognition of the errno getter functions (PR #91531)

2024-05-10 Thread Donát Nagy via cfe-commits


@@ -71,12 +71,9 @@ ProgramStateRef setErrnoState(ProgramStateRef State, 
ErrnoCheckState EState);
 /// Clear state of errno (make it irrelevant).
 ProgramStateRef clearErrnoState(ProgramStateRef State);
 
-/// Determine if a `Decl` node related to 'errno'.
-/// This is true if the declaration is the errno variable or a function
-/// that returns a pointer to the 'errno' value (usually the 'errno' macro is
-/// defined with this function). \p D is not required to be a canonical
-/// declaration.
-bool isErrno(const Decl *D);
+/// Determine if `Call` is a call to a "magical" function that returns the
+/// location of `errno` (in environments where errno is accessed this way).
+bool isErrnoLocationCall(const CallEvent );

NagyDonat wrote:

Yes, I see that `isErrno` was intended for a different behavior; but its 
variable declaration handling "side" became unused, so I deleted that and 
modified the remaining function-handling case to accept a call instead of a 
declaration.

https://github.com/llvm/llvm-project/pull/91531
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Refactor recognition of the errno getter functions (PR #91531)

2024-05-10 Thread Donát Nagy via cfe-commits

NagyDonat wrote:

Thanks for the review, I updated my commit!

https://github.com/llvm/llvm-project/pull/91531
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Refactor recognition of the errno getter functions (PR #91531)

2024-05-10 Thread Donát Nagy via cfe-commits


@@ -54,16 +59,10 @@ class ErrnoModeling
   void checkLiveSymbols(ProgramStateRef State, SymbolReaper ) const;
   bool evalCall(const CallEvent , CheckerContext ) const;
 
-  // The declaration of an "errno" variable or "errno location" function.
-  mutable const Decl *ErrnoDecl = nullptr;
-
 private:
-  // FIXME: Names from `ErrnoLocationFuncNames` are used to build this set.
-  CallDescriptionSet ErrnoLocationCalls{{{"__errno_location"}, 0, 0},
-{{"___errno"}, 0, 0},
-{{"__errno"}, 0, 0},
-{{"_errno"}, 0, 0},
-{{"__error"}, 0, 0}};
+  // The declaration of an "errno" variable on systems where errno is
+  // represented by a variable (and not a function that queries its location).
+  mutable const Decl *ErrnoDecl = nullptr;

NagyDonat wrote:

Good point!

https://github.com/llvm/llvm-project/pull/91531
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Refactor recognition of the errno getter functions (PR #91531)

2024-05-10 Thread Donát Nagy via cfe-commits


@@ -39,10 +39,15 @@ namespace {
 // Name of the "errno" variable.
 // FIXME: Is there a system where it is not called "errno" but is a variable?
 const char *ErrnoVarName = "errno";
+
 // Names of functions that return a location of the "errno" value.
 // FIXME: Are there other similar function names?
-const char *ErrnoLocationFuncNames[] = {"__errno_location", "___errno",
-"__errno", "_errno", "__error"};
+CallDescriptionSet ErrnoLocationCalls{
+{CDM::SimpleFunc, {"__errno_location"}, 0, 0},
+{CDM::SimpleFunc, {"___errno"}, 0, 0},
+{CDM::SimpleFunc, {"__errno"}, 0, 0},
+{CDM::SimpleFunc, {"_errno"}, 0, 0},
+{CDM::SimpleFunc, {"__error"}, 0, 0}};

NagyDonat wrote:

I updated them to `CLibrary` because it's slightly better for this usecase (it 
won't match functions within a user-defined C++ namespace).

By the way, note that the `CLibrary` matching mode intentionally **doesn't 
check whether the function was declared within a system header**, because 
"being in a system header" is a very fragile property -- consider e.g. a case 
when somebody tries to submit a preprocessed source file for bug reproduction 
and the declaration becomes part of the (tweaked, minified) source file. I also 
suspect that there might be some unusual/internal functions (like these ones) 
whose declaration has no location, so it isn't in a system header...

https://github.com/llvm/llvm-project/pull/91531
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Refactor recognition of the errno getter functions (PR #91531)

2024-05-10 Thread Donát Nagy via cfe-commits

https://github.com/NagyDonat updated 
https://github.com/llvm/llvm-project/pull/91531

From 07dc4dd5c60c8a04637cce686b379e195deb5b67 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Don=C3=A1t=20Nagy?= 
Date: Wed, 8 May 2024 20:01:57 +0200
Subject: [PATCH 1/2] [analyzer] Refactor recognition of the errno getter
 functions

There are many environments where `errno` is a macro that expands to
something like `(*__errno())` (different standard library
implementations use different names instead of "__errno").

In these environments the ErrnoModeling checker creates a symbolic
region which will be used to represent the return value of this "get the
location of errno" function.

Previously this symbol was only created when the checker was able to
find the declaration of the "get the location of errno" function; but
this commit eliminates the complex logic that was responsible for this
and always creates the symbolic region when `errno` is not available as
a "regular" global variable.

This significantly simplifies a code and only introduces a minimal
performance reduction (one extra symbol) in the unlikely case when
`errno` is not declared (neither as a variable nor as a function), but
the `ErrnoModeling` checker is enabled.

In addition to this simplification, this commit specifies that the
`CallDescription`s for the "get the location of errno" functions are
matched in `CDM::CLibrary` mode. (This was my original goal, but I was
sidetracked by resolving a FIXME above the `CallDescriptionSet` in
`ErrnoModeling.cpp`.)

This change is very close to being NFC, but it fixes weird corner
cases like the handling of a C++ method that happens to be named
"__errno()" (previously it could've been recognized as an errno
location getter function).
---
 .../StaticAnalyzer/Checkers/ErrnoChecker.cpp  |   2 +-
 .../StaticAnalyzer/Checkers/ErrnoModeling.cpp | 127 ++
 .../StaticAnalyzer/Checkers/ErrnoModeling.h   |   9 +-
 clang/test/Analysis/memory-model.cpp  |  18 +--
 4 files changed, 53 insertions(+), 103 deletions(-)

diff --git a/clang/lib/StaticAnalyzer/Checkers/ErrnoChecker.cpp 
b/clang/lib/StaticAnalyzer/Checkers/ErrnoChecker.cpp
index 18e718e085536..72fd6781a7561 100644
--- a/clang/lib/StaticAnalyzer/Checkers/ErrnoChecker.cpp
+++ b/clang/lib/StaticAnalyzer/Checkers/ErrnoChecker.cpp
@@ -205,7 +205,7 @@ void ErrnoChecker::checkPreCall(const CallEvent ,
   // Probably 'strerror'?
   if (CallF->isExternC() && CallF->isGlobal() &&
   C.getSourceManager().isInSystemHeader(CallF->getLocation()) &&
-  !isErrno(CallF)) {
+  !isErrnoLocationCall(Call)) {
 if (getErrnoState(C.getState()) == MustBeChecked) {
   std::optional ErrnoLoc = getErrnoLoc(C.getState());
   assert(ErrnoLoc && "ErrnoLoc should exist if an errno state is set.");
diff --git a/clang/lib/StaticAnalyzer/Checkers/ErrnoModeling.cpp 
b/clang/lib/StaticAnalyzer/Checkers/ErrnoModeling.cpp
index 1b34ea0e056e5..0612cd4c87248 100644
--- a/clang/lib/StaticAnalyzer/Checkers/ErrnoModeling.cpp
+++ b/clang/lib/StaticAnalyzer/Checkers/ErrnoModeling.cpp
@@ -39,10 +39,15 @@ namespace {
 // Name of the "errno" variable.
 // FIXME: Is there a system where it is not called "errno" but is a variable?
 const char *ErrnoVarName = "errno";
+
 // Names of functions that return a location of the "errno" value.
 // FIXME: Are there other similar function names?
-const char *ErrnoLocationFuncNames[] = {"__errno_location", "___errno",
-"__errno", "_errno", "__error"};
+CallDescriptionSet ErrnoLocationCalls{
+{CDM::SimpleFunc, {"__errno_location"}, 0, 0},
+{CDM::SimpleFunc, {"___errno"}, 0, 0},
+{CDM::SimpleFunc, {"__errno"}, 0, 0},
+{CDM::SimpleFunc, {"_errno"}, 0, 0},
+{CDM::SimpleFunc, {"__error"}, 0, 0}};
 
 class ErrnoModeling
 : public Checker, check::BeginFunction,
@@ -54,16 +59,10 @@ class ErrnoModeling
   void checkLiveSymbols(ProgramStateRef State, SymbolReaper ) const;
   bool evalCall(const CallEvent , CheckerContext ) const;
 
-  // The declaration of an "errno" variable or "errno location" function.
-  mutable const Decl *ErrnoDecl = nullptr;
-
 private:
-  // FIXME: Names from `ErrnoLocationFuncNames` are used to build this set.
-  CallDescriptionSet ErrnoLocationCalls{{{"__errno_location"}, 0, 0},
-{{"___errno"}, 0, 0},
-{{"__errno"}, 0, 0},
-{{"_errno"}, 0, 0},
-{{"__error"}, 0, 0}};
+  // The declaration of an "errno" variable on systems where errno is
+  // represented by a variable (and not a function that queries its location).
+  mutable const Decl *ErrnoDecl = nullptr;
 };
 
 } // namespace
@@ -74,9 +73,13 @@ REGISTER_TRAIT_WITH_PROGRAMSTATE(ErrnoRegion, const 
MemRegion *)
 
 REGISTER_TRAIT_WITH_PROGRAMSTATE(ErrnoState, errno_modeling::ErrnoCheckState)
 
-/// Search for a variable called "errno" in the AST.

[clang] [analyzer] Refactor recognition of the errno getter functions (PR #91531)

2024-05-09 Thread Balázs Kéri via cfe-commits


@@ -71,12 +71,9 @@ ProgramStateRef setErrnoState(ProgramStateRef State, 
ErrnoCheckState EState);
 /// Clear state of errno (make it irrelevant).
 ProgramStateRef clearErrnoState(ProgramStateRef State);
 
-/// Determine if a `Decl` node related to 'errno'.
-/// This is true if the declaration is the errno variable or a function
-/// that returns a pointer to the 'errno' value (usually the 'errno' macro is
-/// defined with this function). \p D is not required to be a canonical
-/// declaration.
-bool isErrno(const Decl *D);
+/// Determine if `Call` is a call to a "magical" function that returns the

balazske wrote:

I do not like word "magical" here (and at another comments) (some people may 
only guess what this means), it can be "hidden" or "internal" but can be just 
removed.

https://github.com/llvm/llvm-project/pull/91531
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Refactor recognition of the errno getter functions (PR #91531)

2024-05-09 Thread Balázs Kéri via cfe-commits


@@ -54,16 +59,10 @@ class ErrnoModeling
   void checkLiveSymbols(ProgramStateRef State, SymbolReaper ) const;
   bool evalCall(const CallEvent , CheckerContext ) const;
 
-  // The declaration of an "errno" variable or "errno location" function.
-  mutable const Decl *ErrnoDecl = nullptr;
-
 private:
-  // FIXME: Names from `ErrnoLocationFuncNames` are used to build this set.
-  CallDescriptionSet ErrnoLocationCalls{{{"__errno_location"}, 0, 0},
-{{"___errno"}, 0, 0},
-{{"__errno"}, 0, 0},
-{{"_errno"}, 0, 0},
-{{"__error"}, 0, 0}};
+  // The declaration of an "errno" variable on systems where errno is
+  // represented by a variable (and not a function that queries its location).
+  mutable const Decl *ErrnoDecl = nullptr;

balazske wrote:

This can be changed to `VarDecl`.

https://github.com/llvm/llvm-project/pull/91531
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Refactor recognition of the errno getter functions (PR #91531)

2024-05-09 Thread Balázs Kéri via cfe-commits


@@ -71,12 +71,9 @@ ProgramStateRef setErrnoState(ProgramStateRef State, 
ErrnoCheckState EState);
 /// Clear state of errno (make it irrelevant).
 ProgramStateRef clearErrnoState(ProgramStateRef State);
 
-/// Determine if a `Decl` node related to 'errno'.
-/// This is true if the declaration is the errno variable or a function
-/// that returns a pointer to the 'errno' value (usually the 'errno' macro is
-/// defined with this function). \p D is not required to be a canonical
-/// declaration.
-bool isErrno(const Decl *D);
+/// Determine if `Call` is a call to a "magical" function that returns the
+/// location of `errno` (in environments where errno is accessed this way).
+bool isErrnoLocationCall(const CallEvent );

balazske wrote:

Intent of this function was to determine if the passed `Decl` is the "errno" 
value (independent of how it is defined). This is now changed, but not a 
problem because all uses are for the function case.

https://github.com/llvm/llvm-project/pull/91531
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Refactor recognition of the errno getter functions (PR #91531)

2024-05-09 Thread Balázs Kéri via cfe-commits


@@ -39,10 +39,15 @@ namespace {
 // Name of the "errno" variable.
 // FIXME: Is there a system where it is not called "errno" but is a variable?
 const char *ErrnoVarName = "errno";
+
 // Names of functions that return a location of the "errno" value.
 // FIXME: Are there other similar function names?
-const char *ErrnoLocationFuncNames[] = {"__errno_location", "___errno",
-"__errno", "_errno", "__error"};
+CallDescriptionSet ErrnoLocationCalls{
+{CDM::SimpleFunc, {"__errno_location"}, 0, 0},
+{CDM::SimpleFunc, {"___errno"}, 0, 0},
+{CDM::SimpleFunc, {"__errno"}, 0, 0},
+{CDM::SimpleFunc, {"_errno"}, 0, 0},
+{CDM::SimpleFunc, {"__error"}, 0, 0}};

balazske wrote:

Should these be `CLibrary`? Probably not, because these are already internal 
function names hidden behind a macro ("errno" is the non-internal name). But 
only global functions should match that are in a system header.

https://github.com/llvm/llvm-project/pull/91531
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Refactor recognition of the errno getter functions (PR #91531)

2024-05-08 Thread via cfe-commits

llvmbot wrote:




@llvm/pr-subscribers-clang-static-analyzer-1

Author: Donát Nagy (NagyDonat)


Changes

There are many environments where `errno` is a macro that expands to something 
like `(*__errno())` (different standard library implementations use different 
names instead of "__errno").

In these environments the ErrnoModeling checker creates a symbolic region which 
will be used to represent the return value of this "get the location of errno" 
function.

Previously this symbol was only created when the checker was able to find the 
declaration of the "get the location of errno" function; but this commit 
eliminates the complex logic that was responsible for this and always creates 
the symbolic region when `errno` is not available as a "regular" global 
variable.

This significantly simplifies a code and only introduces a minimal performance 
reduction (one extra symbol) in the unlikely case when `errno` is not declared 
(neither as a variable nor as a function), but the `ErrnoModeling` checker is 
enabled.

In addition to this simplification, this commit specifies that the 
`CallDescription`s for the "get the location of errno" functions are matched in 
`CDM::CLibrary` mode. (This was my original goal, but I was sidetracked by 
resolving a FIXME above the `CallDescriptionSet` in `ErrnoModeling.cpp`.)

This change is very close to being NFC, but it fixes weird corner cases like 
the handling of a C++ method that happens to be named "__errno()" (previously 
it could've been recognized as an errno location getter function).

---
Full diff: https://github.com/llvm/llvm-project/pull/91531.diff


4 Files Affected:

- (modified) clang/lib/StaticAnalyzer/Checkers/ErrnoChecker.cpp (+1-1) 
- (modified) clang/lib/StaticAnalyzer/Checkers/ErrnoModeling.cpp (+40-87) 
- (modified) clang/lib/StaticAnalyzer/Checkers/ErrnoModeling.h (+3-6) 
- (modified) clang/test/Analysis/memory-model.cpp (+9-9) 


``diff
diff --git a/clang/lib/StaticAnalyzer/Checkers/ErrnoChecker.cpp 
b/clang/lib/StaticAnalyzer/Checkers/ErrnoChecker.cpp
index 18e718e085536..72fd6781a7561 100644
--- a/clang/lib/StaticAnalyzer/Checkers/ErrnoChecker.cpp
+++ b/clang/lib/StaticAnalyzer/Checkers/ErrnoChecker.cpp
@@ -205,7 +205,7 @@ void ErrnoChecker::checkPreCall(const CallEvent ,
   // Probably 'strerror'?
   if (CallF->isExternC() && CallF->isGlobal() &&
   C.getSourceManager().isInSystemHeader(CallF->getLocation()) &&
-  !isErrno(CallF)) {
+  !isErrnoLocationCall(Call)) {
 if (getErrnoState(C.getState()) == MustBeChecked) {
   std::optional ErrnoLoc = getErrnoLoc(C.getState());
   assert(ErrnoLoc && "ErrnoLoc should exist if an errno state is set.");
diff --git a/clang/lib/StaticAnalyzer/Checkers/ErrnoModeling.cpp 
b/clang/lib/StaticAnalyzer/Checkers/ErrnoModeling.cpp
index 1b34ea0e056e5..0612cd4c87248 100644
--- a/clang/lib/StaticAnalyzer/Checkers/ErrnoModeling.cpp
+++ b/clang/lib/StaticAnalyzer/Checkers/ErrnoModeling.cpp
@@ -39,10 +39,15 @@ namespace {
 // Name of the "errno" variable.
 // FIXME: Is there a system where it is not called "errno" but is a variable?
 const char *ErrnoVarName = "errno";
+
 // Names of functions that return a location of the "errno" value.
 // FIXME: Are there other similar function names?
-const char *ErrnoLocationFuncNames[] = {"__errno_location", "___errno",
-"__errno", "_errno", "__error"};
+CallDescriptionSet ErrnoLocationCalls{
+{CDM::SimpleFunc, {"__errno_location"}, 0, 0},
+{CDM::SimpleFunc, {"___errno"}, 0, 0},
+{CDM::SimpleFunc, {"__errno"}, 0, 0},
+{CDM::SimpleFunc, {"_errno"}, 0, 0},
+{CDM::SimpleFunc, {"__error"}, 0, 0}};
 
 class ErrnoModeling
 : public Checker, check::BeginFunction,
@@ -54,16 +59,10 @@ class ErrnoModeling
   void checkLiveSymbols(ProgramStateRef State, SymbolReaper ) const;
   bool evalCall(const CallEvent , CheckerContext ) const;
 
-  // The declaration of an "errno" variable or "errno location" function.
-  mutable const Decl *ErrnoDecl = nullptr;
-
 private:
-  // FIXME: Names from `ErrnoLocationFuncNames` are used to build this set.
-  CallDescriptionSet ErrnoLocationCalls{{{"__errno_location"}, 0, 0},
-{{"___errno"}, 0, 0},
-{{"__errno"}, 0, 0},
-{{"_errno"}, 0, 0},
-{{"__error"}, 0, 0}};
+  // The declaration of an "errno" variable on systems where errno is
+  // represented by a variable (and not a function that queries its location).
+  mutable const Decl *ErrnoDecl = nullptr;
 };
 
 } // namespace
@@ -74,9 +73,13 @@ REGISTER_TRAIT_WITH_PROGRAMSTATE(ErrnoRegion, const 
MemRegion *)
 
 REGISTER_TRAIT_WITH_PROGRAMSTATE(ErrnoState, errno_modeling::ErrnoCheckState)
 
-/// Search for a variable called "errno" in the AST.
-/// Return nullptr if not found.
-static const VarDecl *getErrnoVar(ASTContext ) {
+void 

[clang] [analyzer] Refactor recognition of the errno getter functions (PR #91531)

2024-05-08 Thread Donát Nagy via cfe-commits

https://github.com/NagyDonat created 
https://github.com/llvm/llvm-project/pull/91531

There are many environments where `errno` is a macro that expands to something 
like `(*__errno())` (different standard library implementations use different 
names instead of "__errno").

In these environments the ErrnoModeling checker creates a symbolic region which 
will be used to represent the return value of this "get the location of errno" 
function.

Previously this symbol was only created when the checker was able to find the 
declaration of the "get the location of errno" function; but this commit 
eliminates the complex logic that was responsible for this and always creates 
the symbolic region when `errno` is not available as a "regular" global 
variable.

This significantly simplifies a code and only introduces a minimal performance 
reduction (one extra symbol) in the unlikely case when `errno` is not declared 
(neither as a variable nor as a function), but the `ErrnoModeling` checker is 
enabled.

In addition to this simplification, this commit specifies that the 
`CallDescription`s for the "get the location of errno" functions are matched in 
`CDM::CLibrary` mode. (This was my original goal, but I was sidetracked by 
resolving a FIXME above the `CallDescriptionSet` in `ErrnoModeling.cpp`.)

This change is very close to being NFC, but it fixes weird corner cases like 
the handling of a C++ method that happens to be named "__errno()" (previously 
it could've been recognized as an errno location getter function).

From 07dc4dd5c60c8a04637cce686b379e195deb5b67 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Don=C3=A1t=20Nagy?= 
Date: Wed, 8 May 2024 20:01:57 +0200
Subject: [PATCH] [analyzer] Refactor recognition of the errno getter functions

There are many environments where `errno` is a macro that expands to
something like `(*__errno())` (different standard library
implementations use different names instead of "__errno").

In these environments the ErrnoModeling checker creates a symbolic
region which will be used to represent the return value of this "get the
location of errno" function.

Previously this symbol was only created when the checker was able to
find the declaration of the "get the location of errno" function; but
this commit eliminates the complex logic that was responsible for this
and always creates the symbolic region when `errno` is not available as
a "regular" global variable.

This significantly simplifies a code and only introduces a minimal
performance reduction (one extra symbol) in the unlikely case when
`errno` is not declared (neither as a variable nor as a function), but
the `ErrnoModeling` checker is enabled.

In addition to this simplification, this commit specifies that the
`CallDescription`s for the "get the location of errno" functions are
matched in `CDM::CLibrary` mode. (This was my original goal, but I was
sidetracked by resolving a FIXME above the `CallDescriptionSet` in
`ErrnoModeling.cpp`.)

This change is very close to being NFC, but it fixes weird corner
cases like the handling of a C++ method that happens to be named
"__errno()" (previously it could've been recognized as an errno
location getter function).
---
 .../StaticAnalyzer/Checkers/ErrnoChecker.cpp  |   2 +-
 .../StaticAnalyzer/Checkers/ErrnoModeling.cpp | 127 ++
 .../StaticAnalyzer/Checkers/ErrnoModeling.h   |   9 +-
 clang/test/Analysis/memory-model.cpp  |  18 +--
 4 files changed, 53 insertions(+), 103 deletions(-)

diff --git a/clang/lib/StaticAnalyzer/Checkers/ErrnoChecker.cpp 
b/clang/lib/StaticAnalyzer/Checkers/ErrnoChecker.cpp
index 18e718e085536..72fd6781a7561 100644
--- a/clang/lib/StaticAnalyzer/Checkers/ErrnoChecker.cpp
+++ b/clang/lib/StaticAnalyzer/Checkers/ErrnoChecker.cpp
@@ -205,7 +205,7 @@ void ErrnoChecker::checkPreCall(const CallEvent ,
   // Probably 'strerror'?
   if (CallF->isExternC() && CallF->isGlobal() &&
   C.getSourceManager().isInSystemHeader(CallF->getLocation()) &&
-  !isErrno(CallF)) {
+  !isErrnoLocationCall(Call)) {
 if (getErrnoState(C.getState()) == MustBeChecked) {
   std::optional ErrnoLoc = getErrnoLoc(C.getState());
   assert(ErrnoLoc && "ErrnoLoc should exist if an errno state is set.");
diff --git a/clang/lib/StaticAnalyzer/Checkers/ErrnoModeling.cpp 
b/clang/lib/StaticAnalyzer/Checkers/ErrnoModeling.cpp
index 1b34ea0e056e5..0612cd4c87248 100644
--- a/clang/lib/StaticAnalyzer/Checkers/ErrnoModeling.cpp
+++ b/clang/lib/StaticAnalyzer/Checkers/ErrnoModeling.cpp
@@ -39,10 +39,15 @@ namespace {
 // Name of the "errno" variable.
 // FIXME: Is there a system where it is not called "errno" but is a variable?
 const char *ErrnoVarName = "errno";
+
 // Names of functions that return a location of the "errno" value.
 // FIXME: Are there other similar function names?
-const char *ErrnoLocationFuncNames[] = {"__errno_location", "___errno",
-"__errno", "_errno", "__error"};
+CallDescriptionSet