[PATCH] D30534: [analyzer] When creating a temporary object copy, properly copy the value into it.
This revision was automatically updated to reflect the committed changes. Closed by commit rL298924: [analyzer] When creating a temporary object, properly copy the value into it. (authored by dergachev). Changed prior to commit: https://reviews.llvm.org/D30534?vs=92618&id=93250#toc Repository: rL LLVM https://reviews.llvm.org/D30534 Files: cfe/trunk/include/clang/StaticAnalyzer/Core/PathSensitive/ExprEngine.h cfe/trunk/lib/StaticAnalyzer/Checkers/AnalysisOrderChecker.cpp cfe/trunk/lib/StaticAnalyzer/Core/ExprEngine.cpp cfe/trunk/test/Analysis/temporaries-callback-order.cpp cfe/trunk/test/Analysis/temporaries.cpp Index: cfe/trunk/lib/StaticAnalyzer/Core/ExprEngine.cpp === --- cfe/trunk/lib/StaticAnalyzer/Core/ExprEngine.cpp +++ cfe/trunk/lib/StaticAnalyzer/Core/ExprEngine.cpp @@ -182,19 +182,25 @@ ProgramStateRef ExprEngine::createTemporaryRegionIfNeeded(ProgramStateRef State, const LocationContext *LC, - const Expr *Ex, + const Expr *InitWithAdjustments, const Expr *Result) { - SVal V = State->getSVal(Ex, LC); + // FIXME: This function is a hack that works around the quirky AST + // we're often having with respect to C++ temporaries. If only we modelled + // the actual execution order of statements properly in the CFG, + // all the hassle with adjustments would not be necessary, + // and perhaps the whole function would be removed. + SVal InitValWithAdjustments = State->getSVal(InitWithAdjustments, LC); if (!Result) { // If we don't have an explicit result expression, we're in "if needed" // mode. Only create a region if the current value is a NonLoc. -if (!V.getAs()) +if (!InitValWithAdjustments.getAs()) return State; -Result = Ex; +Result = InitWithAdjustments; } else { // We need to create a region no matter what. For sanity, make sure we don't // try to stuff a Loc into a non-pointer temporary region. -assert(!V.getAs() || Loc::isLocType(Result->getType()) || +assert(!InitValWithAdjustments.getAs() || + Loc::isLocType(Result->getType()) || Result->getType()->isMemberPointerType()); } @@ -226,7 +232,8 @@ SmallVector CommaLHSs; SmallVector Adjustments; - const Expr *Init = Ex->skipRValueSubobjectAdjustments(CommaLHSs, Adjustments); + const Expr *Init = InitWithAdjustments->skipRValueSubobjectAdjustments( + CommaLHSs, Adjustments); const TypedValueRegion *TR = nullptr; if (const MaterializeTemporaryExpr *MT = @@ -241,6 +248,7 @@ TR = MRMgr.getCXXTempObjectRegion(Init, LC); SVal Reg = loc::MemRegionVal(TR); + SVal BaseReg = Reg; // Make the necessary adjustments to obtain the sub-object. for (auto I = Adjustments.rbegin(), E = Adjustments.rend(); I != E; ++I) { @@ -254,19 +262,47 @@ break; case SubobjectAdjustment::MemberPointerAdjustment: // FIXME: Unimplemented. - State->bindDefault(Reg, UnknownVal(), LC); + State = State->bindDefault(Reg, UnknownVal(), LC); return State; } } - // Try to recover some path sensitivity in case we couldn't compute the value. - if (V.isUnknown()) -V = getSValBuilder().conjureSymbolVal(Result, LC, TR->getValueType(), - currBldrCtx->blockCount()); - // Bind the value of the expression to the sub-object region, and then bind - // the sub-object region to our expression. - State = State->bindLoc(Reg, V, LC); + // What remains is to copy the value of the object to the new region. + // FIXME: In other words, what we should always do is copy value of the + // Init expression (which corresponds to the bigger object) to the whole + // temporary region TR. However, this value is often no longer present + // in the Environment. If it has disappeared, we instead invalidate TR. + // Still, what we can do is assign the value of expression Ex (which + // corresponds to the sub-object) to the TR's sub-region Reg. At least, + // values inside Reg would be correct. + SVal InitVal = State->getSVal(Init, LC); + if (InitVal.isUnknown()) { +InitVal = getSValBuilder().conjureSymbolVal(Result, LC, Init->getType(), +currBldrCtx->blockCount()); +State = State->bindLoc(BaseReg.castAs(), InitVal, LC, false); + +// Then we'd need to take the value that certainly exists and bind it over. +if (InitValWithAdjustments.isUnknown()) { + // Try to recover some path sensitivity in case we couldn't + // compute the value. + InitValWithAdjustments = getSValBuilder().conjureSymbolVal( + Result, LC, InitWithAdjustments->getType(), + currBldrCtx->blockCount()); +} +State = +State->bindLoc(Reg.castAs(), InitValWithAdjustme
[PATCH] D30534: [analyzer] When creating a temporary object copy, properly copy the value into it.
NoQ added inline comments. Comment at: lib/StaticAnalyzer/Core/ExprEngine.cpp:281 + // Try to recover some path sensitivity in case we couldn't compute the value. + if (ExV.isUnknown()) +ExV = getSValBuilder().conjureSymbolVal(Result, LC, Ex->getType(), a.sidorin wrote: > Should we do all these operations with ExV/Reg if the InitV is known? There > is a FIXME but I think it is related to all this code, not to the bindLoc > only. And what happens if we remove this code? My bad: i overwrote `SVal` of `Result` above, and `Result` might have been equal to `Init`, which caused conflict between `InitV` and `ExV`. https://reviews.llvm.org/D30534 ___ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[PATCH] D30534: [analyzer] When creating a temporary object copy, properly copy the value into it.
NoQ updated this revision to Diff 92618. NoQ marked 4 inline comments as done. NoQ added a comment. Fix review comments! Add a callback order test to ensure that `checkRegionChanges` is called as many (or rather as few) times as necessary. https://reviews.llvm.org/D30534 Files: include/clang/StaticAnalyzer/Core/PathSensitive/ExprEngine.h lib/StaticAnalyzer/Checkers/AnalysisOrderChecker.cpp lib/StaticAnalyzer/Core/ExprEngine.cpp test/Analysis/temporaries-callback-order.cpp test/Analysis/temporaries.cpp Index: test/Analysis/temporaries.cpp === --- test/Analysis/temporaries.cpp +++ test/Analysis/temporaries.cpp @@ -503,3 +503,30 @@ }); } } + +namespace CopyToTemporaryCorrectly { +class Super { +public: + void m() { +mImpl(); + } + virtual void mImpl() = 0; +}; +class Sub : public Super { +public: + Sub(const int &p) : j(p) {} + virtual void mImpl() override { +// Used to be undefined pointer dereference because we didn't copy +// the subclass data (j) to the temporary object properly. +(void)(j + 1); // no-warning +if (j != 22) { + clang_analyzer_warnIfReached(); // no-warning +} + } + const int &j; +}; +void run() { + int i = 22; + Sub(i).m(); +} +} Index: test/Analysis/temporaries-callback-order.cpp === --- /dev/null +++ test/Analysis/temporaries-callback-order.cpp @@ -0,0 +1,36 @@ +// RUN: %clang_cc1 -analyze -analyzer-checker=debug.AnalysisOrder -analyzer-config debug.AnalysisOrder:Bind=true -analyzer-config debug.AnalysisOrder:RegionChanges=true %s 2>&1 | FileCheck %s + +struct Super { + virtual void m(); +}; +struct Sub : Super { + virtual void m() {} +}; + +void testTemporaries() { + // This triggers RegionChanges twice: + // - Once for zero-initialization of the structure. + // - Once for creating a temporary region and copying the structure there. + // FIXME: This code shouldn't really produce the extra temporary, however + // that's how we behave for now. + Sub().m(); +} + +void seeIfCheckBindWorks() { + // This should trigger checkBind. The rest of the code shouldn't. + // This also triggers checkRegionChanges after that. + // Note that this function is analyzed first, so the messages would be on top. + int x = 1; +} + +// seeIfCheckBindWorks(): +// CHECK: Bind +// CHECK-NEXT: RegionChanges + +// testTemporaries(): +// CHECK-NEXT: RegionChanges +// CHECK-NEXT: RegionChanges + +// Make sure there's no further output. +// CHECK-NOT: Bind +// CHECK-NOT: RegionChanges Index: lib/StaticAnalyzer/Core/ExprEngine.cpp === --- lib/StaticAnalyzer/Core/ExprEngine.cpp +++ lib/StaticAnalyzer/Core/ExprEngine.cpp @@ -182,19 +182,20 @@ ProgramStateRef ExprEngine::createTemporaryRegionIfNeeded(ProgramStateRef State, const LocationContext *LC, - const Expr *Ex, + const Expr *InitWithAdjustments, const Expr *Result) { - SVal V = State->getSVal(Ex, LC); + SVal InitValWithAdjustments = State->getSVal(InitWithAdjustments, LC); if (!Result) { // If we don't have an explicit result expression, we're in "if needed" // mode. Only create a region if the current value is a NonLoc. -if (!V.getAs()) +if (!InitValWithAdjustments.getAs()) return State; -Result = Ex; +Result = InitWithAdjustments; } else { // We need to create a region no matter what. For sanity, make sure we don't // try to stuff a Loc into a non-pointer temporary region. -assert(!V.getAs() || Loc::isLocType(Result->getType()) || +assert(!InitValWithAdjustments.getAs() || + Loc::isLocType(Result->getType()) || Result->getType()->isMemberPointerType()); } @@ -226,7 +227,8 @@ SmallVector CommaLHSs; SmallVector Adjustments; - const Expr *Init = Ex->skipRValueSubobjectAdjustments(CommaLHSs, Adjustments); + const Expr *Init = InitWithAdjustments->skipRValueSubobjectAdjustments( + CommaLHSs, Adjustments); const TypedValueRegion *TR = nullptr; if (const MaterializeTemporaryExpr *MT = @@ -241,6 +243,7 @@ TR = MRMgr.getCXXTempObjectRegion(Init, LC); SVal Reg = loc::MemRegionVal(TR); + SVal BaseReg = Reg; // Make the necessary adjustments to obtain the sub-object. for (auto I = Adjustments.rbegin(), E = Adjustments.rend(); I != E; ++I) { @@ -254,19 +257,47 @@ break; case SubobjectAdjustment::MemberPointerAdjustment: // FIXME: Unimplemented. - State->bindDefault(Reg, UnknownVal(), LC); + State = State->bindDefault(Reg, UnknownVal(), LC); return State; } } - // Try to recover some path sensitivity in case we couldn't compute the value. - if (V.isUnknown()) -V = getSValB
[PATCH] D30534: [analyzer] When creating a temporary object copy, properly copy the value into it.
NoQ planned changes to this revision. NoQ added inline comments. Comment at: lib/StaticAnalyzer/Core/ExprEngine.cpp:286 + // FIXME: Why do we need to do that if WipeV was known to begin with? + State = State->bindLoc(Reg, ExV, LC); + a.sidorin wrote: > If I understand correcly, if we call `bindLoc()`, we call > `checkRegionChanges()` callbacks. And if we `bindLoc()` twice, we call them > twice too. Is this what we want here? This is actually an excellent question. https://reviews.llvm.org/D30534 ___ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[PATCH] D30534: [analyzer] When creating a temporary object copy, properly copy the value into it.
a.sidorin added a comment. Hi Artem! Thank you for this patch. It looks very promising, but I have some questions and remarks. Comment at: lib/StaticAnalyzer/Core/ExprEngine.cpp:187 const Expr *Result) { - SVal V = State->getSVal(Ex, LC); + SVal ExV = State->getSVal(Ex, LC); if (!Result) { If we are touching names, should we rename Ex to InitWithAdjustments (or smth like this) and ExV correspondingly? Comment at: lib/StaticAnalyzer/Core/ExprEngine.cpp:281 + // Try to recover some path sensitivity in case we couldn't compute the value. + if (ExV.isUnknown()) +ExV = getSValBuilder().conjureSymbolVal(Result, LC, Ex->getType(), Should we do all these operations with ExV/Reg if the InitV is known? There is a FIXME but I think it is related to all this code, not to the bindLoc only. And what happens if we remove this code? Comment at: lib/StaticAnalyzer/Core/ExprEngine.cpp:285 + + // FIXME: Why do we need to do that if WipeV was known to begin with? + State = State->bindLoc(Reg, ExV, LC); Seems like WipeV in comment should be InitV? Comment at: lib/StaticAnalyzer/Core/ExprEngine.cpp:286 + // FIXME: Why do we need to do that if WipeV was known to begin with? + State = State->bindLoc(Reg, ExV, LC); + If I understand correcly, if we call `bindLoc()`, we call `checkRegionChanges()` callbacks. And if we `bindLoc()` twice, we call them twice too. Is this what we want here? https://reviews.llvm.org/D30534 ___ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[PATCH] D30534: [analyzer] When creating a temporary object copy, properly copy the value into it.
NoQ created this revision. The test case provided demonstrates a regression due to my earlier attempt to fix temporaries in https://reviews.llvm.org/D26839. Neither before nor after, we did not properly copy the symbolic rvalue of the object to the newly created memory region; in different ways though. In the test case: 1. there are two classes - `Super` (smaller) and `Sub` (bigger); 2. a temporary object of `Sub` is created; 3. a `Sub`-specific field not present in `Super` is initialized during construction; 4. the temporary `Sub`-class object is casted to `Super` to call method `m()` that is declared in `Super`; 5. a temporary region is retroactively created to represent this-region during method call; 6. value of the `Super`-object is copied over the c++ base-object region; 7. the rest of the temporary remains uninitialized; 8. `m()` calls a virtual method `mImpl()` that resolves to the definition in `Sub`; 9. `mImpl()` uses the `Sub`-specific field initialized on step 3 but not copied on step 6, as explained on step 7; 10. a false warning regarding usage of uninitialized value is thrown. The problem (apart from step 5 which is the reason for all the hassle, but needs AST fixes) is of course on step 6, where we should have just copied the whole `Sub` object. But - surprise! - the value of this object is no longer there in the Environment, because the cast on step 4 was already successfully modeled. I'm uncertain of the proper solution. We could probably prevent objects from disappearing from the Environment upon unchecked-derived-to-base casts, but that'd require performance evaluations on C++-rich codebases. For now, i propose to invalidate the `Sub`-object before binding the `Super`-object. This way at least we don't produce false accusations. There's also one open question for the case when the value wasn't removed from the environment, but we still need to bind `Super` - i'm seeing failing tests when i disable binding `Super`, but i didn't investigate them yet. https://reviews.llvm.org/D30534 Files: lib/StaticAnalyzer/Core/ExprEngine.cpp test/Analysis/temporaries.cpp Index: test/Analysis/temporaries.cpp === --- test/Analysis/temporaries.cpp +++ test/Analysis/temporaries.cpp @@ -503,3 +503,30 @@ }); } } + +namespace CopyToTemporaryCorrectly { +class Super { +public: + void m() { +mImpl(); + } + virtual void mImpl() = 0; +}; +class Sub : public Super { +public: + Sub(const int &p) : j(p) {} + virtual void mImpl() override { +// Used to be undefined pointer dereference because we didn't copy +// the subclass data (j) to the temporary object properly. +(void)(j + 1); // no-warning +if (j != 22) { + clang_analyzer_warnIfReached(); // no-warning +} + } + const int &j; +}; +void run() { + int i = 22; + Sub(i).m(); +} +} Index: lib/StaticAnalyzer/Core/ExprEngine.cpp === --- lib/StaticAnalyzer/Core/ExprEngine.cpp +++ lib/StaticAnalyzer/Core/ExprEngine.cpp @@ -184,17 +184,17 @@ const LocationContext *LC, const Expr *Ex, const Expr *Result) { - SVal V = State->getSVal(Ex, LC); + SVal ExV = State->getSVal(Ex, LC); if (!Result) { // If we don't have an explicit result expression, we're in "if needed" // mode. Only create a region if the current value is a NonLoc. -if (!V.getAs()) +if (!ExV.getAs()) return State; Result = Ex; } else { // We need to create a region no matter what. For sanity, make sure we don't // try to stuff a Loc into a non-pointer temporary region. -assert(!V.getAs() || Loc::isLocType(Result->getType()) || +assert(!ExV.getAs() || Loc::isLocType(Result->getType()) || Result->getType()->isMemberPointerType()); } @@ -259,14 +259,32 @@ } } - // Try to recover some path sensitivity in case we couldn't compute the value. - if (V.isUnknown()) -V = getSValBuilder().conjureSymbolVal(Result, LC, TR->getValueType(), - currBldrCtx->blockCount()); - // Bind the value of the expression to the sub-object region, and then bind - // the sub-object region to our expression. - State = State->bindLoc(Reg, V, LC); + // The result expression would now point to the correct sub-region of the + // newly created temporary region. State = State->BindExpr(Result, LC, Reg); + + // What remains is to copy the value of the object to the new region. + // FIXME: In other words, what we should always do is copy value of the + // Init expression (which corresponds to the bigger object) to the whole + // temporary region TR. However, this value is often no longer present + // in the Environment. If it has disappeared, we instead invalidate TR. + // Still,