[clang] [analyzer] Refactor recognition of the errno getter functions (PR #91531)
https://github.com/NagyDonat closed https://github.com/llvm/llvm-project/pull/91531 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Refactor recognition of the errno getter functions (PR #91531)
=?utf-8?q?Don=C3=A1t?= Nagy , =?utf-8?q?Don=C3=A1t?= Nagy , =?utf-8?q?Don=C3=A1t?= Nagy , =?utf-8?q?Don=C3=A1t?= Nagy , =?utf-8?q?Don=C3=A1t?= Nagy Message-ID: In-Reply-To: https://github.com/steakhal approved this pull request. https://github.com/llvm/llvm-project/pull/91531 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Refactor recognition of the errno getter functions (PR #91531)
https://github.com/NagyDonat updated https://github.com/llvm/llvm-project/pull/91531 From 07dc4dd5c60c8a04637cce686b379e195deb5b67 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Don=C3=A1t=20Nagy?= Date: Wed, 8 May 2024 20:01:57 +0200 Subject: [PATCH 1/6] [analyzer] Refactor recognition of the errno getter functions There are many environments where `errno` is a macro that expands to something like `(*__errno())` (different standard library implementations use different names instead of "__errno"). In these environments the ErrnoModeling checker creates a symbolic region which will be used to represent the return value of this "get the location of errno" function. Previously this symbol was only created when the checker was able to find the declaration of the "get the location of errno" function; but this commit eliminates the complex logic that was responsible for this and always creates the symbolic region when `errno` is not available as a "regular" global variable. This significantly simplifies a code and only introduces a minimal performance reduction (one extra symbol) in the unlikely case when `errno` is not declared (neither as a variable nor as a function), but the `ErrnoModeling` checker is enabled. In addition to this simplification, this commit specifies that the `CallDescription`s for the "get the location of errno" functions are matched in `CDM::CLibrary` mode. (This was my original goal, but I was sidetracked by resolving a FIXME above the `CallDescriptionSet` in `ErrnoModeling.cpp`.) This change is very close to being NFC, but it fixes weird corner cases like the handling of a C++ method that happens to be named "__errno()" (previously it could've been recognized as an errno location getter function). --- .../StaticAnalyzer/Checkers/ErrnoChecker.cpp | 2 +- .../StaticAnalyzer/Checkers/ErrnoModeling.cpp | 127 ++ .../StaticAnalyzer/Checkers/ErrnoModeling.h | 9 +- clang/test/Analysis/memory-model.cpp | 18 +-- 4 files changed, 53 insertions(+), 103 deletions(-) diff --git a/clang/lib/StaticAnalyzer/Checkers/ErrnoChecker.cpp b/clang/lib/StaticAnalyzer/Checkers/ErrnoChecker.cpp index 18e718e085536..72fd6781a7561 100644 --- a/clang/lib/StaticAnalyzer/Checkers/ErrnoChecker.cpp +++ b/clang/lib/StaticAnalyzer/Checkers/ErrnoChecker.cpp @@ -205,7 +205,7 @@ void ErrnoChecker::checkPreCall(const CallEvent , // Probably 'strerror'? if (CallF->isExternC() && CallF->isGlobal() && C.getSourceManager().isInSystemHeader(CallF->getLocation()) && - !isErrno(CallF)) { + !isErrnoLocationCall(Call)) { if (getErrnoState(C.getState()) == MustBeChecked) { std::optional ErrnoLoc = getErrnoLoc(C.getState()); assert(ErrnoLoc && "ErrnoLoc should exist if an errno state is set."); diff --git a/clang/lib/StaticAnalyzer/Checkers/ErrnoModeling.cpp b/clang/lib/StaticAnalyzer/Checkers/ErrnoModeling.cpp index 1b34ea0e056e5..0612cd4c87248 100644 --- a/clang/lib/StaticAnalyzer/Checkers/ErrnoModeling.cpp +++ b/clang/lib/StaticAnalyzer/Checkers/ErrnoModeling.cpp @@ -39,10 +39,15 @@ namespace { // Name of the "errno" variable. // FIXME: Is there a system where it is not called "errno" but is a variable? const char *ErrnoVarName = "errno"; + // Names of functions that return a location of the "errno" value. // FIXME: Are there other similar function names? -const char *ErrnoLocationFuncNames[] = {"__errno_location", "___errno", -"__errno", "_errno", "__error"}; +CallDescriptionSet ErrnoLocationCalls{ +{CDM::SimpleFunc, {"__errno_location"}, 0, 0}, +{CDM::SimpleFunc, {"___errno"}, 0, 0}, +{CDM::SimpleFunc, {"__errno"}, 0, 0}, +{CDM::SimpleFunc, {"_errno"}, 0, 0}, +{CDM::SimpleFunc, {"__error"}, 0, 0}}; class ErrnoModeling : public Checker, check::BeginFunction, @@ -54,16 +59,10 @@ class ErrnoModeling void checkLiveSymbols(ProgramStateRef State, SymbolReaper ) const; bool evalCall(const CallEvent , CheckerContext ) const; - // The declaration of an "errno" variable or "errno location" function. - mutable const Decl *ErrnoDecl = nullptr; - private: - // FIXME: Names from `ErrnoLocationFuncNames` are used to build this set. - CallDescriptionSet ErrnoLocationCalls{{{"__errno_location"}, 0, 0}, -{{"___errno"}, 0, 0}, -{{"__errno"}, 0, 0}, -{{"_errno"}, 0, 0}, -{{"__error"}, 0, 0}}; + // The declaration of an "errno" variable on systems where errno is + // represented by a variable (and not a function that queries its location). + mutable const Decl *ErrnoDecl = nullptr; }; } // namespace @@ -74,9 +73,13 @@ REGISTER_TRAIT_WITH_PROGRAMSTATE(ErrnoRegion, const MemRegion *) REGISTER_TRAIT_WITH_PROGRAMSTATE(ErrnoState, errno_modeling::ErrnoCheckState) -/// Search for a variable called "errno" in the AST.
[clang] [analyzer] Refactor recognition of the errno getter functions (PR #91531)
=?utf-8?q?Donát?= Nagy , =?utf-8?q?Donát?= Nagy , =?utf-8?q?Donát?= Nagy , =?utf-8?q?Donát?= Nagy Message-ID: In-Reply-To: @@ -136,53 +100,49 @@ void ErrnoModeling::checkBeginFunction(CheckerContext ) const { ASTContext = C.getASTContext(); ProgramStateRef State = C.getState(); - if (const auto *ErrnoVar = dyn_cast_or_null(ErrnoDecl)) { -// There is an external 'errno' variable. -// Use its memory region. -// The memory region for an 'errno'-like variable is allocated in system -// space by MemRegionManager. -const MemRegion *ErrnoR = -State->getRegion(ErrnoVar, C.getLocationContext()); + const MemRegion *ErrnoR; steakhal wrote: ```suggestion const MemRegion *ErrnoR = nullptr; ``` https://github.com/llvm/llvm-project/pull/91531 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Refactor recognition of the errno getter functions (PR #91531)
=?utf-8?q?Donát?= Nagy , =?utf-8?q?Donát?= Nagy , =?utf-8?q?Donát?= Nagy , =?utf-8?q?Donát?= Nagy Message-ID: In-Reply-To: @@ -136,53 +100,49 @@ void ErrnoModeling::checkBeginFunction(CheckerContext ) const { ASTContext = C.getASTContext(); ProgramStateRef State = C.getState(); - if (const auto *ErrnoVar = dyn_cast_or_null(ErrnoDecl)) { -// There is an external 'errno' variable. -// Use its memory region. -// The memory region for an 'errno'-like variable is allocated in system -// space by MemRegionManager. -const MemRegion *ErrnoR = -State->getRegion(ErrnoVar, C.getLocationContext()); + const MemRegion *ErrnoR; + + if (ErrnoDecl) { +// There is an external 'errno' variable, so we can simply use the memory +// region that's associated with it. +ErrnoR = State->getRegion(ErrnoDecl, C.getLocationContext()); assert(ErrnoR && "Memory region should exist for the 'errno' variable."); -State = State->set(ErrnoR); -State = -errno_modeling::setErrnoValue(State, C, 0, errno_modeling::Irrelevant); -C.addTransition(State); - } else if (ErrnoDecl) { -assert(isa(ErrnoDecl) && "Invalid errno location function."); -// There is a function that returns the location of 'errno'. -// We must create a memory region for it in system space. -// Currently a symbolic region is used with an artifical symbol. -// FIXME: It is better to have a custom (new) kind of MemRegion for such -// cases. + } else { +// There is no 'errno' variable, so create a new symbolic memory region +// that can be used to model the return value of the "get the location of +// errno" internal functions. +// NOTE: this `SVal` is created even if errno is not defined or used. SValBuilder = C.getSValBuilder(); MemRegionManager = C.getStateManager().getRegionManager(); const MemSpaceRegion *GlobalSystemSpace = RMgr.getGlobalsRegion(MemRegion::GlobalSystemSpaceRegionKind); // Create an artifical symbol for the region. -// It is not possible to associate a statement or expression in this case. +// Note that it is not possible to associate a statement or expression in +// this case and the `symbolTag` (opaque pointer tag) is just the address +// of the data member `ErrnoDecl` of the singleton `ErrnoModeling` checker +// object. const SymbolConjured *Sym = SVB.conjureSymbol( nullptr, C.getLocationContext(), ACtx.getLValueReferenceType(ACtx.IntTy), C.blockCount(), ); // The symbolic region is untyped, create a typed sub-region in it. // The ElementRegion is used to make the errno region a typed region. -const MemRegion *ErrnoR = RMgr.getElementRegion( +ErrnoR = RMgr.getElementRegion( ACtx.IntTy, SVB.makeZeroArrayIndex(), RMgr.getSymbolicRegion(Sym, GlobalSystemSpace), C.getASTContext()); -State = State->set(ErrnoR); -State = -errno_modeling::setErrnoValue(State, C, 0, errno_modeling::Irrelevant); -C.addTransition(State); } + State = State->set(ErrnoR); steakhal wrote: ```suggestion assert(ErrnoR); State = State->set(ErrnoR); ``` https://github.com/llvm/llvm-project/pull/91531 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Refactor recognition of the errno getter functions (PR #91531)
=?utf-8?q?Donát?= Nagy , =?utf-8?q?Donát?= Nagy , =?utf-8?q?Donát?= Nagy , =?utf-8?q?Donát?= Nagy Message-ID: In-Reply-To: https://github.com/steakhal edited https://github.com/llvm/llvm-project/pull/91531 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Refactor recognition of the errno getter functions (PR #91531)
=?utf-8?q?Don=C3=A1t?= Nagy , =?utf-8?q?Don=C3=A1t?= Nagy , =?utf-8?q?Don=C3=A1t?= Nagy , =?utf-8?q?Don=C3=A1t?= Nagy Message-ID: In-Reply-To: https://github.com/steakhal approved this pull request. LGTM. I'd only add some safety barriers for some places, denoted my suggested edits. https://github.com/llvm/llvm-project/pull/91531 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Refactor recognition of the errno getter functions (PR #91531)
@@ -136,53 +100,48 @@ void ErrnoModeling::checkBeginFunction(CheckerContext ) const { ASTContext = C.getASTContext(); ProgramStateRef State = C.getState(); - if (const auto *ErrnoVar = dyn_cast_or_null(ErrnoDecl)) { -// There is an external 'errno' variable. -// Use its memory region. -// The memory region for an 'errno'-like variable is allocated in system -// space by MemRegionManager. -const MemRegion *ErrnoR = -State->getRegion(ErrnoVar, C.getLocationContext()); + const MemRegion *ErrnoR; + + if (ErrnoDecl) { +// There is an external 'errno' variable, so we can simply use the memory +// region that's associated with it. +ErrnoR = State->getRegion(ErrnoDecl, C.getLocationContext()); assert(ErrnoR && "Memory region should exist for the 'errno' variable."); -State = State->set(ErrnoR); -State = -errno_modeling::setErrnoValue(State, C, 0, errno_modeling::Irrelevant); -C.addTransition(State); - } else if (ErrnoDecl) { -assert(isa(ErrnoDecl) && "Invalid errno location function."); -// There is a function that returns the location of 'errno'. -// We must create a memory region for it in system space. -// Currently a symbolic region is used with an artifical symbol. -// FIXME: It is better to have a custom (new) kind of MemRegion for such -// cases. + } else { +// The 'errno' location is accessed via an internal getter function, so +// create a new symbolic memory region that can be used as the return value +// of that function. NagyDonat wrote: I extended the comment to emphasize this. https://github.com/llvm/llvm-project/pull/91531 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Refactor recognition of the errno getter functions (PR #91531)
https://github.com/NagyDonat updated https://github.com/llvm/llvm-project/pull/91531 From 07dc4dd5c60c8a04637cce686b379e195deb5b67 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Don=C3=A1t=20Nagy?= Date: Wed, 8 May 2024 20:01:57 +0200 Subject: [PATCH 1/5] [analyzer] Refactor recognition of the errno getter functions There are many environments where `errno` is a macro that expands to something like `(*__errno())` (different standard library implementations use different names instead of "__errno"). In these environments the ErrnoModeling checker creates a symbolic region which will be used to represent the return value of this "get the location of errno" function. Previously this symbol was only created when the checker was able to find the declaration of the "get the location of errno" function; but this commit eliminates the complex logic that was responsible for this and always creates the symbolic region when `errno` is not available as a "regular" global variable. This significantly simplifies a code and only introduces a minimal performance reduction (one extra symbol) in the unlikely case when `errno` is not declared (neither as a variable nor as a function), but the `ErrnoModeling` checker is enabled. In addition to this simplification, this commit specifies that the `CallDescription`s for the "get the location of errno" functions are matched in `CDM::CLibrary` mode. (This was my original goal, but I was sidetracked by resolving a FIXME above the `CallDescriptionSet` in `ErrnoModeling.cpp`.) This change is very close to being NFC, but it fixes weird corner cases like the handling of a C++ method that happens to be named "__errno()" (previously it could've been recognized as an errno location getter function). --- .../StaticAnalyzer/Checkers/ErrnoChecker.cpp | 2 +- .../StaticAnalyzer/Checkers/ErrnoModeling.cpp | 127 ++ .../StaticAnalyzer/Checkers/ErrnoModeling.h | 9 +- clang/test/Analysis/memory-model.cpp | 18 +-- 4 files changed, 53 insertions(+), 103 deletions(-) diff --git a/clang/lib/StaticAnalyzer/Checkers/ErrnoChecker.cpp b/clang/lib/StaticAnalyzer/Checkers/ErrnoChecker.cpp index 18e718e085536..72fd6781a7561 100644 --- a/clang/lib/StaticAnalyzer/Checkers/ErrnoChecker.cpp +++ b/clang/lib/StaticAnalyzer/Checkers/ErrnoChecker.cpp @@ -205,7 +205,7 @@ void ErrnoChecker::checkPreCall(const CallEvent , // Probably 'strerror'? if (CallF->isExternC() && CallF->isGlobal() && C.getSourceManager().isInSystemHeader(CallF->getLocation()) && - !isErrno(CallF)) { + !isErrnoLocationCall(Call)) { if (getErrnoState(C.getState()) == MustBeChecked) { std::optional ErrnoLoc = getErrnoLoc(C.getState()); assert(ErrnoLoc && "ErrnoLoc should exist if an errno state is set."); diff --git a/clang/lib/StaticAnalyzer/Checkers/ErrnoModeling.cpp b/clang/lib/StaticAnalyzer/Checkers/ErrnoModeling.cpp index 1b34ea0e056e5..0612cd4c87248 100644 --- a/clang/lib/StaticAnalyzer/Checkers/ErrnoModeling.cpp +++ b/clang/lib/StaticAnalyzer/Checkers/ErrnoModeling.cpp @@ -39,10 +39,15 @@ namespace { // Name of the "errno" variable. // FIXME: Is there a system where it is not called "errno" but is a variable? const char *ErrnoVarName = "errno"; + // Names of functions that return a location of the "errno" value. // FIXME: Are there other similar function names? -const char *ErrnoLocationFuncNames[] = {"__errno_location", "___errno", -"__errno", "_errno", "__error"}; +CallDescriptionSet ErrnoLocationCalls{ +{CDM::SimpleFunc, {"__errno_location"}, 0, 0}, +{CDM::SimpleFunc, {"___errno"}, 0, 0}, +{CDM::SimpleFunc, {"__errno"}, 0, 0}, +{CDM::SimpleFunc, {"_errno"}, 0, 0}, +{CDM::SimpleFunc, {"__error"}, 0, 0}}; class ErrnoModeling : public Checker, check::BeginFunction, @@ -54,16 +59,10 @@ class ErrnoModeling void checkLiveSymbols(ProgramStateRef State, SymbolReaper ) const; bool evalCall(const CallEvent , CheckerContext ) const; - // The declaration of an "errno" variable or "errno location" function. - mutable const Decl *ErrnoDecl = nullptr; - private: - // FIXME: Names from `ErrnoLocationFuncNames` are used to build this set. - CallDescriptionSet ErrnoLocationCalls{{{"__errno_location"}, 0, 0}, -{{"___errno"}, 0, 0}, -{{"__errno"}, 0, 0}, -{{"_errno"}, 0, 0}, -{{"__error"}, 0, 0}}; + // The declaration of an "errno" variable on systems where errno is + // represented by a variable (and not a function that queries its location). + mutable const Decl *ErrnoDecl = nullptr; }; } // namespace @@ -74,9 +73,13 @@ REGISTER_TRAIT_WITH_PROGRAMSTATE(ErrnoRegion, const MemRegion *) REGISTER_TRAIT_WITH_PROGRAMSTATE(ErrnoState, errno_modeling::ErrnoCheckState) -/// Search for a variable called "errno" in the AST.
[clang] [analyzer] Refactor recognition of the errno getter functions (PR #91531)
@@ -136,53 +100,48 @@ void ErrnoModeling::checkBeginFunction(CheckerContext ) const { ASTContext = C.getASTContext(); ProgramStateRef State = C.getState(); - if (const auto *ErrnoVar = dyn_cast_or_null(ErrnoDecl)) { -// There is an external 'errno' variable. -// Use its memory region. -// The memory region for an 'errno'-like variable is allocated in system -// space by MemRegionManager. -const MemRegion *ErrnoR = -State->getRegion(ErrnoVar, C.getLocationContext()); + const MemRegion *ErrnoR; + + if (ErrnoDecl) { +// There is an external 'errno' variable, so we can simply use the memory +// region that's associated with it. +ErrnoR = State->getRegion(ErrnoDecl, C.getLocationContext()); assert(ErrnoR && "Memory region should exist for the 'errno' variable."); -State = State->set(ErrnoR); -State = -errno_modeling::setErrnoValue(State, C, 0, errno_modeling::Irrelevant); -C.addTransition(State); - } else if (ErrnoDecl) { -assert(isa(ErrnoDecl) && "Invalid errno location function."); -// There is a function that returns the location of 'errno'. -// We must create a memory region for it in system space. -// Currently a symbolic region is used with an artifical symbol. -// FIXME: It is better to have a custom (new) kind of MemRegion for such -// cases. + } else { +// The 'errno' location is accessed via an internal getter function, so +// create a new symbolic memory region that can be used as the return value +// of that function. balazske wrote: I wanted to point out in the comment that if there is no `errno` definition at all (no variable and no getter function) and `errno` is not used in the program, the region is still created. https://github.com/llvm/llvm-project/pull/91531 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Refactor recognition of the errno getter functions (PR #91531)
@@ -136,53 +100,48 @@ void ErrnoModeling::checkBeginFunction(CheckerContext ) const { ASTContext = C.getASTContext(); ProgramStateRef State = C.getState(); - if (const auto *ErrnoVar = dyn_cast_or_null(ErrnoDecl)) { -// There is an external 'errno' variable. -// Use its memory region. -// The memory region for an 'errno'-like variable is allocated in system -// space by MemRegionManager. -const MemRegion *ErrnoR = -State->getRegion(ErrnoVar, C.getLocationContext()); + const MemRegion *ErrnoR; + + if (ErrnoDecl) { +// There is an external 'errno' variable, so we can simply use the memory +// region that's associated with it. +ErrnoR = State->getRegion(ErrnoDecl, C.getLocationContext()); assert(ErrnoR && "Memory region should exist for the 'errno' variable."); -State = State->set(ErrnoR); -State = -errno_modeling::setErrnoValue(State, C, 0, errno_modeling::Irrelevant); -C.addTransition(State); - } else if (ErrnoDecl) { -assert(isa(ErrnoDecl) && "Invalid errno location function."); -// There is a function that returns the location of 'errno'. -// We must create a memory region for it in system space. -// Currently a symbolic region is used with an artifical symbol. -// FIXME: It is better to have a custom (new) kind of MemRegion for such -// cases. + } else { +// The 'errno' location is accessed via an internal getter function, so +// create a new symbolic memory region that can be used as the return value +// of that function. NagyDonat wrote: Good point, I fixed it. https://github.com/llvm/llvm-project/pull/91531 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Refactor recognition of the errno getter functions (PR #91531)
https://github.com/NagyDonat updated https://github.com/llvm/llvm-project/pull/91531 From 07dc4dd5c60c8a04637cce686b379e195deb5b67 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Don=C3=A1t=20Nagy?= Date: Wed, 8 May 2024 20:01:57 +0200 Subject: [PATCH 1/4] [analyzer] Refactor recognition of the errno getter functions There are many environments where `errno` is a macro that expands to something like `(*__errno())` (different standard library implementations use different names instead of "__errno"). In these environments the ErrnoModeling checker creates a symbolic region which will be used to represent the return value of this "get the location of errno" function. Previously this symbol was only created when the checker was able to find the declaration of the "get the location of errno" function; but this commit eliminates the complex logic that was responsible for this and always creates the symbolic region when `errno` is not available as a "regular" global variable. This significantly simplifies a code and only introduces a minimal performance reduction (one extra symbol) in the unlikely case when `errno` is not declared (neither as a variable nor as a function), but the `ErrnoModeling` checker is enabled. In addition to this simplification, this commit specifies that the `CallDescription`s for the "get the location of errno" functions are matched in `CDM::CLibrary` mode. (This was my original goal, but I was sidetracked by resolving a FIXME above the `CallDescriptionSet` in `ErrnoModeling.cpp`.) This change is very close to being NFC, but it fixes weird corner cases like the handling of a C++ method that happens to be named "__errno()" (previously it could've been recognized as an errno location getter function). --- .../StaticAnalyzer/Checkers/ErrnoChecker.cpp | 2 +- .../StaticAnalyzer/Checkers/ErrnoModeling.cpp | 127 ++ .../StaticAnalyzer/Checkers/ErrnoModeling.h | 9 +- clang/test/Analysis/memory-model.cpp | 18 +-- 4 files changed, 53 insertions(+), 103 deletions(-) diff --git a/clang/lib/StaticAnalyzer/Checkers/ErrnoChecker.cpp b/clang/lib/StaticAnalyzer/Checkers/ErrnoChecker.cpp index 18e718e085536..72fd6781a7561 100644 --- a/clang/lib/StaticAnalyzer/Checkers/ErrnoChecker.cpp +++ b/clang/lib/StaticAnalyzer/Checkers/ErrnoChecker.cpp @@ -205,7 +205,7 @@ void ErrnoChecker::checkPreCall(const CallEvent , // Probably 'strerror'? if (CallF->isExternC() && CallF->isGlobal() && C.getSourceManager().isInSystemHeader(CallF->getLocation()) && - !isErrno(CallF)) { + !isErrnoLocationCall(Call)) { if (getErrnoState(C.getState()) == MustBeChecked) { std::optional ErrnoLoc = getErrnoLoc(C.getState()); assert(ErrnoLoc && "ErrnoLoc should exist if an errno state is set."); diff --git a/clang/lib/StaticAnalyzer/Checkers/ErrnoModeling.cpp b/clang/lib/StaticAnalyzer/Checkers/ErrnoModeling.cpp index 1b34ea0e056e5..0612cd4c87248 100644 --- a/clang/lib/StaticAnalyzer/Checkers/ErrnoModeling.cpp +++ b/clang/lib/StaticAnalyzer/Checkers/ErrnoModeling.cpp @@ -39,10 +39,15 @@ namespace { // Name of the "errno" variable. // FIXME: Is there a system where it is not called "errno" but is a variable? const char *ErrnoVarName = "errno"; + // Names of functions that return a location of the "errno" value. // FIXME: Are there other similar function names? -const char *ErrnoLocationFuncNames[] = {"__errno_location", "___errno", -"__errno", "_errno", "__error"}; +CallDescriptionSet ErrnoLocationCalls{ +{CDM::SimpleFunc, {"__errno_location"}, 0, 0}, +{CDM::SimpleFunc, {"___errno"}, 0, 0}, +{CDM::SimpleFunc, {"__errno"}, 0, 0}, +{CDM::SimpleFunc, {"_errno"}, 0, 0}, +{CDM::SimpleFunc, {"__error"}, 0, 0}}; class ErrnoModeling : public Checker, check::BeginFunction, @@ -54,16 +59,10 @@ class ErrnoModeling void checkLiveSymbols(ProgramStateRef State, SymbolReaper ) const; bool evalCall(const CallEvent , CheckerContext ) const; - // The declaration of an "errno" variable or "errno location" function. - mutable const Decl *ErrnoDecl = nullptr; - private: - // FIXME: Names from `ErrnoLocationFuncNames` are used to build this set. - CallDescriptionSet ErrnoLocationCalls{{{"__errno_location"}, 0, 0}, -{{"___errno"}, 0, 0}, -{{"__errno"}, 0, 0}, -{{"_errno"}, 0, 0}, -{{"__error"}, 0, 0}}; + // The declaration of an "errno" variable on systems where errno is + // represented by a variable (and not a function that queries its location). + mutable const Decl *ErrnoDecl = nullptr; }; } // namespace @@ -74,9 +73,13 @@ REGISTER_TRAIT_WITH_PROGRAMSTATE(ErrnoRegion, const MemRegion *) REGISTER_TRAIT_WITH_PROGRAMSTATE(ErrnoState, errno_modeling::ErrnoCheckState) -/// Search for a variable called "errno" in the AST.
[clang] [analyzer] Refactor recognition of the errno getter functions (PR #91531)
@@ -136,53 +100,48 @@ void ErrnoModeling::checkBeginFunction(CheckerContext ) const { ASTContext = C.getASTContext(); ProgramStateRef State = C.getState(); - if (const auto *ErrnoVar = dyn_cast_or_null(ErrnoDecl)) { -// There is an external 'errno' variable. -// Use its memory region. -// The memory region for an 'errno'-like variable is allocated in system -// space by MemRegionManager. -const MemRegion *ErrnoR = -State->getRegion(ErrnoVar, C.getLocationContext()); + const MemRegion *ErrnoR; + + if (ErrnoDecl) { +// There is an external 'errno' variable, so we can simply use the memory +// region that's associated with it. +ErrnoR = State->getRegion(ErrnoDecl, C.getLocationContext()); assert(ErrnoR && "Memory region should exist for the 'errno' variable."); -State = State->set(ErrnoR); -State = -errno_modeling::setErrnoValue(State, C, 0, errno_modeling::Irrelevant); -C.addTransition(State); - } else if (ErrnoDecl) { -assert(isa(ErrnoDecl) && "Invalid errno location function."); -// There is a function that returns the location of 'errno'. -// We must create a memory region for it in system space. -// Currently a symbolic region is used with an artifical symbol. -// FIXME: It is better to have a custom (new) kind of MemRegion for such -// cases. + } else { +// The 'errno' location is accessed via an internal getter function, so +// create a new symbolic memory region that can be used as the return value +// of that function. balazske wrote: We should mention that this case happens even if there is no errno getter function at all https://github.com/llvm/llvm-project/pull/91531 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Refactor recognition of the errno getter functions (PR #91531)
https://github.com/NagyDonat updated https://github.com/llvm/llvm-project/pull/91531 From 07dc4dd5c60c8a04637cce686b379e195deb5b67 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Don=C3=A1t=20Nagy?= Date: Wed, 8 May 2024 20:01:57 +0200 Subject: [PATCH 1/3] [analyzer] Refactor recognition of the errno getter functions There are many environments where `errno` is a macro that expands to something like `(*__errno())` (different standard library implementations use different names instead of "__errno"). In these environments the ErrnoModeling checker creates a symbolic region which will be used to represent the return value of this "get the location of errno" function. Previously this symbol was only created when the checker was able to find the declaration of the "get the location of errno" function; but this commit eliminates the complex logic that was responsible for this and always creates the symbolic region when `errno` is not available as a "regular" global variable. This significantly simplifies a code and only introduces a minimal performance reduction (one extra symbol) in the unlikely case when `errno` is not declared (neither as a variable nor as a function), but the `ErrnoModeling` checker is enabled. In addition to this simplification, this commit specifies that the `CallDescription`s for the "get the location of errno" functions are matched in `CDM::CLibrary` mode. (This was my original goal, but I was sidetracked by resolving a FIXME above the `CallDescriptionSet` in `ErrnoModeling.cpp`.) This change is very close to being NFC, but it fixes weird corner cases like the handling of a C++ method that happens to be named "__errno()" (previously it could've been recognized as an errno location getter function). --- .../StaticAnalyzer/Checkers/ErrnoChecker.cpp | 2 +- .../StaticAnalyzer/Checkers/ErrnoModeling.cpp | 127 ++ .../StaticAnalyzer/Checkers/ErrnoModeling.h | 9 +- clang/test/Analysis/memory-model.cpp | 18 +-- 4 files changed, 53 insertions(+), 103 deletions(-) diff --git a/clang/lib/StaticAnalyzer/Checkers/ErrnoChecker.cpp b/clang/lib/StaticAnalyzer/Checkers/ErrnoChecker.cpp index 18e718e085536..72fd6781a7561 100644 --- a/clang/lib/StaticAnalyzer/Checkers/ErrnoChecker.cpp +++ b/clang/lib/StaticAnalyzer/Checkers/ErrnoChecker.cpp @@ -205,7 +205,7 @@ void ErrnoChecker::checkPreCall(const CallEvent , // Probably 'strerror'? if (CallF->isExternC() && CallF->isGlobal() && C.getSourceManager().isInSystemHeader(CallF->getLocation()) && - !isErrno(CallF)) { + !isErrnoLocationCall(Call)) { if (getErrnoState(C.getState()) == MustBeChecked) { std::optional ErrnoLoc = getErrnoLoc(C.getState()); assert(ErrnoLoc && "ErrnoLoc should exist if an errno state is set."); diff --git a/clang/lib/StaticAnalyzer/Checkers/ErrnoModeling.cpp b/clang/lib/StaticAnalyzer/Checkers/ErrnoModeling.cpp index 1b34ea0e056e5..0612cd4c87248 100644 --- a/clang/lib/StaticAnalyzer/Checkers/ErrnoModeling.cpp +++ b/clang/lib/StaticAnalyzer/Checkers/ErrnoModeling.cpp @@ -39,10 +39,15 @@ namespace { // Name of the "errno" variable. // FIXME: Is there a system where it is not called "errno" but is a variable? const char *ErrnoVarName = "errno"; + // Names of functions that return a location of the "errno" value. // FIXME: Are there other similar function names? -const char *ErrnoLocationFuncNames[] = {"__errno_location", "___errno", -"__errno", "_errno", "__error"}; +CallDescriptionSet ErrnoLocationCalls{ +{CDM::SimpleFunc, {"__errno_location"}, 0, 0}, +{CDM::SimpleFunc, {"___errno"}, 0, 0}, +{CDM::SimpleFunc, {"__errno"}, 0, 0}, +{CDM::SimpleFunc, {"_errno"}, 0, 0}, +{CDM::SimpleFunc, {"__error"}, 0, 0}}; class ErrnoModeling : public Checker, check::BeginFunction, @@ -54,16 +59,10 @@ class ErrnoModeling void checkLiveSymbols(ProgramStateRef State, SymbolReaper ) const; bool evalCall(const CallEvent , CheckerContext ) const; - // The declaration of an "errno" variable or "errno location" function. - mutable const Decl *ErrnoDecl = nullptr; - private: - // FIXME: Names from `ErrnoLocationFuncNames` are used to build this set. - CallDescriptionSet ErrnoLocationCalls{{{"__errno_location"}, 0, 0}, -{{"___errno"}, 0, 0}, -{{"__errno"}, 0, 0}, -{{"_errno"}, 0, 0}, -{{"__error"}, 0, 0}}; + // The declaration of an "errno" variable on systems where errno is + // represented by a variable (and not a function that queries its location). + mutable const Decl *ErrnoDecl = nullptr; }; } // namespace @@ -74,9 +73,13 @@ REGISTER_TRAIT_WITH_PROGRAMSTATE(ErrnoRegion, const MemRegion *) REGISTER_TRAIT_WITH_PROGRAMSTATE(ErrnoState, errno_modeling::ErrnoCheckState) -/// Search for a variable called "errno" in the AST.
[clang] [analyzer] Refactor recognition of the errno getter functions (PR #91531)
@@ -74,9 +73,13 @@ REGISTER_TRAIT_WITH_PROGRAMSTATE(ErrnoRegion, const MemRegion *) REGISTER_TRAIT_WITH_PROGRAMSTATE(ErrnoState, errno_modeling::ErrnoCheckState) -/// Search for a variable called "errno" in the AST. -/// Return nullptr if not found. -static const VarDecl *getErrnoVar(ASTContext ) { +void ErrnoModeling::checkASTDecl(const TranslationUnitDecl *D, + AnalysisManager , BugReporter ) const { + // Try to find the declaration of the external variable `int errno;`. + // There are also C library implementations, where the `errno` location is + // accessed via a function that returns its address; in those environments + // this callback does nothing. balazske wrote: ```suggestion // this callback has no effect. ``` https://github.com/llvm/llvm-project/pull/91531 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Refactor recognition of the errno getter functions (PR #91531)
@@ -71,12 +71,9 @@ ProgramStateRef setErrnoState(ProgramStateRef State, ErrnoCheckState EState); /// Clear state of errno (make it irrelevant). ProgramStateRef clearErrnoState(ProgramStateRef State); -/// Determine if a `Decl` node related to 'errno'. -/// This is true if the declaration is the errno variable or a function -/// that returns a pointer to the 'errno' value (usually the 'errno' macro is -/// defined with this function). \p D is not required to be a canonical -/// declaration. -bool isErrno(const Decl *D); +/// Determine if `Call` is a call to a "magical" function that returns the +/// location of `errno` (in environments where errno is accessed this way). +bool isErrnoLocationCall(const CallEvent ); NagyDonat wrote: Yes, I see that `isErrno` was intended for a different behavior; but its variable declaration handling "side" became unused, so I deleted that and modified the remaining function-handling case to accept a call instead of a declaration. https://github.com/llvm/llvm-project/pull/91531 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Refactor recognition of the errno getter functions (PR #91531)
NagyDonat wrote: Thanks for the review, I updated my commit! https://github.com/llvm/llvm-project/pull/91531 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Refactor recognition of the errno getter functions (PR #91531)
@@ -54,16 +59,10 @@ class ErrnoModeling void checkLiveSymbols(ProgramStateRef State, SymbolReaper ) const; bool evalCall(const CallEvent , CheckerContext ) const; - // The declaration of an "errno" variable or "errno location" function. - mutable const Decl *ErrnoDecl = nullptr; - private: - // FIXME: Names from `ErrnoLocationFuncNames` are used to build this set. - CallDescriptionSet ErrnoLocationCalls{{{"__errno_location"}, 0, 0}, -{{"___errno"}, 0, 0}, -{{"__errno"}, 0, 0}, -{{"_errno"}, 0, 0}, -{{"__error"}, 0, 0}}; + // The declaration of an "errno" variable on systems where errno is + // represented by a variable (and not a function that queries its location). + mutable const Decl *ErrnoDecl = nullptr; NagyDonat wrote: Good point! https://github.com/llvm/llvm-project/pull/91531 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Refactor recognition of the errno getter functions (PR #91531)
@@ -39,10 +39,15 @@ namespace { // Name of the "errno" variable. // FIXME: Is there a system where it is not called "errno" but is a variable? const char *ErrnoVarName = "errno"; + // Names of functions that return a location of the "errno" value. // FIXME: Are there other similar function names? -const char *ErrnoLocationFuncNames[] = {"__errno_location", "___errno", -"__errno", "_errno", "__error"}; +CallDescriptionSet ErrnoLocationCalls{ +{CDM::SimpleFunc, {"__errno_location"}, 0, 0}, +{CDM::SimpleFunc, {"___errno"}, 0, 0}, +{CDM::SimpleFunc, {"__errno"}, 0, 0}, +{CDM::SimpleFunc, {"_errno"}, 0, 0}, +{CDM::SimpleFunc, {"__error"}, 0, 0}}; NagyDonat wrote: I updated them to `CLibrary` because it's slightly better for this usecase (it won't match functions within a user-defined C++ namespace). By the way, note that the `CLibrary` matching mode intentionally **doesn't check whether the function was declared within a system header**, because "being in a system header" is a very fragile property -- consider e.g. a case when somebody tries to submit a preprocessed source file for bug reproduction and the declaration becomes part of the (tweaked, minified) source file. I also suspect that there might be some unusual/internal functions (like these ones) whose declaration has no location, so it isn't in a system header... https://github.com/llvm/llvm-project/pull/91531 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Refactor recognition of the errno getter functions (PR #91531)
https://github.com/NagyDonat updated https://github.com/llvm/llvm-project/pull/91531 From 07dc4dd5c60c8a04637cce686b379e195deb5b67 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Don=C3=A1t=20Nagy?= Date: Wed, 8 May 2024 20:01:57 +0200 Subject: [PATCH 1/2] [analyzer] Refactor recognition of the errno getter functions There are many environments where `errno` is a macro that expands to something like `(*__errno())` (different standard library implementations use different names instead of "__errno"). In these environments the ErrnoModeling checker creates a symbolic region which will be used to represent the return value of this "get the location of errno" function. Previously this symbol was only created when the checker was able to find the declaration of the "get the location of errno" function; but this commit eliminates the complex logic that was responsible for this and always creates the symbolic region when `errno` is not available as a "regular" global variable. This significantly simplifies a code and only introduces a minimal performance reduction (one extra symbol) in the unlikely case when `errno` is not declared (neither as a variable nor as a function), but the `ErrnoModeling` checker is enabled. In addition to this simplification, this commit specifies that the `CallDescription`s for the "get the location of errno" functions are matched in `CDM::CLibrary` mode. (This was my original goal, but I was sidetracked by resolving a FIXME above the `CallDescriptionSet` in `ErrnoModeling.cpp`.) This change is very close to being NFC, but it fixes weird corner cases like the handling of a C++ method that happens to be named "__errno()" (previously it could've been recognized as an errno location getter function). --- .../StaticAnalyzer/Checkers/ErrnoChecker.cpp | 2 +- .../StaticAnalyzer/Checkers/ErrnoModeling.cpp | 127 ++ .../StaticAnalyzer/Checkers/ErrnoModeling.h | 9 +- clang/test/Analysis/memory-model.cpp | 18 +-- 4 files changed, 53 insertions(+), 103 deletions(-) diff --git a/clang/lib/StaticAnalyzer/Checkers/ErrnoChecker.cpp b/clang/lib/StaticAnalyzer/Checkers/ErrnoChecker.cpp index 18e718e085536..72fd6781a7561 100644 --- a/clang/lib/StaticAnalyzer/Checkers/ErrnoChecker.cpp +++ b/clang/lib/StaticAnalyzer/Checkers/ErrnoChecker.cpp @@ -205,7 +205,7 @@ void ErrnoChecker::checkPreCall(const CallEvent , // Probably 'strerror'? if (CallF->isExternC() && CallF->isGlobal() && C.getSourceManager().isInSystemHeader(CallF->getLocation()) && - !isErrno(CallF)) { + !isErrnoLocationCall(Call)) { if (getErrnoState(C.getState()) == MustBeChecked) { std::optional ErrnoLoc = getErrnoLoc(C.getState()); assert(ErrnoLoc && "ErrnoLoc should exist if an errno state is set."); diff --git a/clang/lib/StaticAnalyzer/Checkers/ErrnoModeling.cpp b/clang/lib/StaticAnalyzer/Checkers/ErrnoModeling.cpp index 1b34ea0e056e5..0612cd4c87248 100644 --- a/clang/lib/StaticAnalyzer/Checkers/ErrnoModeling.cpp +++ b/clang/lib/StaticAnalyzer/Checkers/ErrnoModeling.cpp @@ -39,10 +39,15 @@ namespace { // Name of the "errno" variable. // FIXME: Is there a system where it is not called "errno" but is a variable? const char *ErrnoVarName = "errno"; + // Names of functions that return a location of the "errno" value. // FIXME: Are there other similar function names? -const char *ErrnoLocationFuncNames[] = {"__errno_location", "___errno", -"__errno", "_errno", "__error"}; +CallDescriptionSet ErrnoLocationCalls{ +{CDM::SimpleFunc, {"__errno_location"}, 0, 0}, +{CDM::SimpleFunc, {"___errno"}, 0, 0}, +{CDM::SimpleFunc, {"__errno"}, 0, 0}, +{CDM::SimpleFunc, {"_errno"}, 0, 0}, +{CDM::SimpleFunc, {"__error"}, 0, 0}}; class ErrnoModeling : public Checker, check::BeginFunction, @@ -54,16 +59,10 @@ class ErrnoModeling void checkLiveSymbols(ProgramStateRef State, SymbolReaper ) const; bool evalCall(const CallEvent , CheckerContext ) const; - // The declaration of an "errno" variable or "errno location" function. - mutable const Decl *ErrnoDecl = nullptr; - private: - // FIXME: Names from `ErrnoLocationFuncNames` are used to build this set. - CallDescriptionSet ErrnoLocationCalls{{{"__errno_location"}, 0, 0}, -{{"___errno"}, 0, 0}, -{{"__errno"}, 0, 0}, -{{"_errno"}, 0, 0}, -{{"__error"}, 0, 0}}; + // The declaration of an "errno" variable on systems where errno is + // represented by a variable (and not a function that queries its location). + mutable const Decl *ErrnoDecl = nullptr; }; } // namespace @@ -74,9 +73,13 @@ REGISTER_TRAIT_WITH_PROGRAMSTATE(ErrnoRegion, const MemRegion *) REGISTER_TRAIT_WITH_PROGRAMSTATE(ErrnoState, errno_modeling::ErrnoCheckState) -/// Search for a variable called "errno" in the AST.
[clang] [analyzer] Refactor recognition of the errno getter functions (PR #91531)
@@ -71,12 +71,9 @@ ProgramStateRef setErrnoState(ProgramStateRef State, ErrnoCheckState EState); /// Clear state of errno (make it irrelevant). ProgramStateRef clearErrnoState(ProgramStateRef State); -/// Determine if a `Decl` node related to 'errno'. -/// This is true if the declaration is the errno variable or a function -/// that returns a pointer to the 'errno' value (usually the 'errno' macro is -/// defined with this function). \p D is not required to be a canonical -/// declaration. -bool isErrno(const Decl *D); +/// Determine if `Call` is a call to a "magical" function that returns the balazske wrote: I do not like word "magical" here (and at another comments) (some people may only guess what this means), it can be "hidden" or "internal" but can be just removed. https://github.com/llvm/llvm-project/pull/91531 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Refactor recognition of the errno getter functions (PR #91531)
@@ -54,16 +59,10 @@ class ErrnoModeling void checkLiveSymbols(ProgramStateRef State, SymbolReaper ) const; bool evalCall(const CallEvent , CheckerContext ) const; - // The declaration of an "errno" variable or "errno location" function. - mutable const Decl *ErrnoDecl = nullptr; - private: - // FIXME: Names from `ErrnoLocationFuncNames` are used to build this set. - CallDescriptionSet ErrnoLocationCalls{{{"__errno_location"}, 0, 0}, -{{"___errno"}, 0, 0}, -{{"__errno"}, 0, 0}, -{{"_errno"}, 0, 0}, -{{"__error"}, 0, 0}}; + // The declaration of an "errno" variable on systems where errno is + // represented by a variable (and not a function that queries its location). + mutable const Decl *ErrnoDecl = nullptr; balazske wrote: This can be changed to `VarDecl`. https://github.com/llvm/llvm-project/pull/91531 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Refactor recognition of the errno getter functions (PR #91531)
@@ -71,12 +71,9 @@ ProgramStateRef setErrnoState(ProgramStateRef State, ErrnoCheckState EState); /// Clear state of errno (make it irrelevant). ProgramStateRef clearErrnoState(ProgramStateRef State); -/// Determine if a `Decl` node related to 'errno'. -/// This is true if the declaration is the errno variable or a function -/// that returns a pointer to the 'errno' value (usually the 'errno' macro is -/// defined with this function). \p D is not required to be a canonical -/// declaration. -bool isErrno(const Decl *D); +/// Determine if `Call` is a call to a "magical" function that returns the +/// location of `errno` (in environments where errno is accessed this way). +bool isErrnoLocationCall(const CallEvent ); balazske wrote: Intent of this function was to determine if the passed `Decl` is the "errno" value (independent of how it is defined). This is now changed, but not a problem because all uses are for the function case. https://github.com/llvm/llvm-project/pull/91531 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Refactor recognition of the errno getter functions (PR #91531)
@@ -39,10 +39,15 @@ namespace { // Name of the "errno" variable. // FIXME: Is there a system where it is not called "errno" but is a variable? const char *ErrnoVarName = "errno"; + // Names of functions that return a location of the "errno" value. // FIXME: Are there other similar function names? -const char *ErrnoLocationFuncNames[] = {"__errno_location", "___errno", -"__errno", "_errno", "__error"}; +CallDescriptionSet ErrnoLocationCalls{ +{CDM::SimpleFunc, {"__errno_location"}, 0, 0}, +{CDM::SimpleFunc, {"___errno"}, 0, 0}, +{CDM::SimpleFunc, {"__errno"}, 0, 0}, +{CDM::SimpleFunc, {"_errno"}, 0, 0}, +{CDM::SimpleFunc, {"__error"}, 0, 0}}; balazske wrote: Should these be `CLibrary`? Probably not, because these are already internal function names hidden behind a macro ("errno" is the non-internal name). But only global functions should match that are in a system header. https://github.com/llvm/llvm-project/pull/91531 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [analyzer] Refactor recognition of the errno getter functions (PR #91531)
llvmbot wrote: @llvm/pr-subscribers-clang-static-analyzer-1 Author: Donát Nagy (NagyDonat) Changes There are many environments where `errno` is a macro that expands to something like `(*__errno())` (different standard library implementations use different names instead of "__errno"). In these environments the ErrnoModeling checker creates a symbolic region which will be used to represent the return value of this "get the location of errno" function. Previously this symbol was only created when the checker was able to find the declaration of the "get the location of errno" function; but this commit eliminates the complex logic that was responsible for this and always creates the symbolic region when `errno` is not available as a "regular" global variable. This significantly simplifies a code and only introduces a minimal performance reduction (one extra symbol) in the unlikely case when `errno` is not declared (neither as a variable nor as a function), but the `ErrnoModeling` checker is enabled. In addition to this simplification, this commit specifies that the `CallDescription`s for the "get the location of errno" functions are matched in `CDM::CLibrary` mode. (This was my original goal, but I was sidetracked by resolving a FIXME above the `CallDescriptionSet` in `ErrnoModeling.cpp`.) This change is very close to being NFC, but it fixes weird corner cases like the handling of a C++ method that happens to be named "__errno()" (previously it could've been recognized as an errno location getter function). --- Full diff: https://github.com/llvm/llvm-project/pull/91531.diff 4 Files Affected: - (modified) clang/lib/StaticAnalyzer/Checkers/ErrnoChecker.cpp (+1-1) - (modified) clang/lib/StaticAnalyzer/Checkers/ErrnoModeling.cpp (+40-87) - (modified) clang/lib/StaticAnalyzer/Checkers/ErrnoModeling.h (+3-6) - (modified) clang/test/Analysis/memory-model.cpp (+9-9) ``diff diff --git a/clang/lib/StaticAnalyzer/Checkers/ErrnoChecker.cpp b/clang/lib/StaticAnalyzer/Checkers/ErrnoChecker.cpp index 18e718e085536..72fd6781a7561 100644 --- a/clang/lib/StaticAnalyzer/Checkers/ErrnoChecker.cpp +++ b/clang/lib/StaticAnalyzer/Checkers/ErrnoChecker.cpp @@ -205,7 +205,7 @@ void ErrnoChecker::checkPreCall(const CallEvent , // Probably 'strerror'? if (CallF->isExternC() && CallF->isGlobal() && C.getSourceManager().isInSystemHeader(CallF->getLocation()) && - !isErrno(CallF)) { + !isErrnoLocationCall(Call)) { if (getErrnoState(C.getState()) == MustBeChecked) { std::optional ErrnoLoc = getErrnoLoc(C.getState()); assert(ErrnoLoc && "ErrnoLoc should exist if an errno state is set."); diff --git a/clang/lib/StaticAnalyzer/Checkers/ErrnoModeling.cpp b/clang/lib/StaticAnalyzer/Checkers/ErrnoModeling.cpp index 1b34ea0e056e5..0612cd4c87248 100644 --- a/clang/lib/StaticAnalyzer/Checkers/ErrnoModeling.cpp +++ b/clang/lib/StaticAnalyzer/Checkers/ErrnoModeling.cpp @@ -39,10 +39,15 @@ namespace { // Name of the "errno" variable. // FIXME: Is there a system where it is not called "errno" but is a variable? const char *ErrnoVarName = "errno"; + // Names of functions that return a location of the "errno" value. // FIXME: Are there other similar function names? -const char *ErrnoLocationFuncNames[] = {"__errno_location", "___errno", -"__errno", "_errno", "__error"}; +CallDescriptionSet ErrnoLocationCalls{ +{CDM::SimpleFunc, {"__errno_location"}, 0, 0}, +{CDM::SimpleFunc, {"___errno"}, 0, 0}, +{CDM::SimpleFunc, {"__errno"}, 0, 0}, +{CDM::SimpleFunc, {"_errno"}, 0, 0}, +{CDM::SimpleFunc, {"__error"}, 0, 0}}; class ErrnoModeling : public Checker, check::BeginFunction, @@ -54,16 +59,10 @@ class ErrnoModeling void checkLiveSymbols(ProgramStateRef State, SymbolReaper ) const; bool evalCall(const CallEvent , CheckerContext ) const; - // The declaration of an "errno" variable or "errno location" function. - mutable const Decl *ErrnoDecl = nullptr; - private: - // FIXME: Names from `ErrnoLocationFuncNames` are used to build this set. - CallDescriptionSet ErrnoLocationCalls{{{"__errno_location"}, 0, 0}, -{{"___errno"}, 0, 0}, -{{"__errno"}, 0, 0}, -{{"_errno"}, 0, 0}, -{{"__error"}, 0, 0}}; + // The declaration of an "errno" variable on systems where errno is + // represented by a variable (and not a function that queries its location). + mutable const Decl *ErrnoDecl = nullptr; }; } // namespace @@ -74,9 +73,13 @@ REGISTER_TRAIT_WITH_PROGRAMSTATE(ErrnoRegion, const MemRegion *) REGISTER_TRAIT_WITH_PROGRAMSTATE(ErrnoState, errno_modeling::ErrnoCheckState) -/// Search for a variable called "errno" in the AST. -/// Return nullptr if not found. -static const VarDecl *getErrnoVar(ASTContext ) { +void
[clang] [analyzer] Refactor recognition of the errno getter functions (PR #91531)
https://github.com/NagyDonat created https://github.com/llvm/llvm-project/pull/91531 There are many environments where `errno` is a macro that expands to something like `(*__errno())` (different standard library implementations use different names instead of "__errno"). In these environments the ErrnoModeling checker creates a symbolic region which will be used to represent the return value of this "get the location of errno" function. Previously this symbol was only created when the checker was able to find the declaration of the "get the location of errno" function; but this commit eliminates the complex logic that was responsible for this and always creates the symbolic region when `errno` is not available as a "regular" global variable. This significantly simplifies a code and only introduces a minimal performance reduction (one extra symbol) in the unlikely case when `errno` is not declared (neither as a variable nor as a function), but the `ErrnoModeling` checker is enabled. In addition to this simplification, this commit specifies that the `CallDescription`s for the "get the location of errno" functions are matched in `CDM::CLibrary` mode. (This was my original goal, but I was sidetracked by resolving a FIXME above the `CallDescriptionSet` in `ErrnoModeling.cpp`.) This change is very close to being NFC, but it fixes weird corner cases like the handling of a C++ method that happens to be named "__errno()" (previously it could've been recognized as an errno location getter function). From 07dc4dd5c60c8a04637cce686b379e195deb5b67 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Don=C3=A1t=20Nagy?= Date: Wed, 8 May 2024 20:01:57 +0200 Subject: [PATCH] [analyzer] Refactor recognition of the errno getter functions There are many environments where `errno` is a macro that expands to something like `(*__errno())` (different standard library implementations use different names instead of "__errno"). In these environments the ErrnoModeling checker creates a symbolic region which will be used to represent the return value of this "get the location of errno" function. Previously this symbol was only created when the checker was able to find the declaration of the "get the location of errno" function; but this commit eliminates the complex logic that was responsible for this and always creates the symbolic region when `errno` is not available as a "regular" global variable. This significantly simplifies a code and only introduces a minimal performance reduction (one extra symbol) in the unlikely case when `errno` is not declared (neither as a variable nor as a function), but the `ErrnoModeling` checker is enabled. In addition to this simplification, this commit specifies that the `CallDescription`s for the "get the location of errno" functions are matched in `CDM::CLibrary` mode. (This was my original goal, but I was sidetracked by resolving a FIXME above the `CallDescriptionSet` in `ErrnoModeling.cpp`.) This change is very close to being NFC, but it fixes weird corner cases like the handling of a C++ method that happens to be named "__errno()" (previously it could've been recognized as an errno location getter function). --- .../StaticAnalyzer/Checkers/ErrnoChecker.cpp | 2 +- .../StaticAnalyzer/Checkers/ErrnoModeling.cpp | 127 ++ .../StaticAnalyzer/Checkers/ErrnoModeling.h | 9 +- clang/test/Analysis/memory-model.cpp | 18 +-- 4 files changed, 53 insertions(+), 103 deletions(-) diff --git a/clang/lib/StaticAnalyzer/Checkers/ErrnoChecker.cpp b/clang/lib/StaticAnalyzer/Checkers/ErrnoChecker.cpp index 18e718e085536..72fd6781a7561 100644 --- a/clang/lib/StaticAnalyzer/Checkers/ErrnoChecker.cpp +++ b/clang/lib/StaticAnalyzer/Checkers/ErrnoChecker.cpp @@ -205,7 +205,7 @@ void ErrnoChecker::checkPreCall(const CallEvent , // Probably 'strerror'? if (CallF->isExternC() && CallF->isGlobal() && C.getSourceManager().isInSystemHeader(CallF->getLocation()) && - !isErrno(CallF)) { + !isErrnoLocationCall(Call)) { if (getErrnoState(C.getState()) == MustBeChecked) { std::optional ErrnoLoc = getErrnoLoc(C.getState()); assert(ErrnoLoc && "ErrnoLoc should exist if an errno state is set."); diff --git a/clang/lib/StaticAnalyzer/Checkers/ErrnoModeling.cpp b/clang/lib/StaticAnalyzer/Checkers/ErrnoModeling.cpp index 1b34ea0e056e5..0612cd4c87248 100644 --- a/clang/lib/StaticAnalyzer/Checkers/ErrnoModeling.cpp +++ b/clang/lib/StaticAnalyzer/Checkers/ErrnoModeling.cpp @@ -39,10 +39,15 @@ namespace { // Name of the "errno" variable. // FIXME: Is there a system where it is not called "errno" but is a variable? const char *ErrnoVarName = "errno"; + // Names of functions that return a location of the "errno" value. // FIXME: Are there other similar function names? -const char *ErrnoLocationFuncNames[] = {"__errno_location", "___errno", -"__errno", "_errno", "__error"}; +CallDescriptionSet