NoQ updated this revision to Diff 116542. NoQ added a comment. @dcoughlin: You're right, my reasoning and understanding was not correct, and your explanation is much more clear. My code still makes sense to me though, so i updated the comments to match. And moved the unusual logic for the lvalue-to-rvalue cast unwrap to the bottom of the function.
Essentially, `getDerefExpr()` tries its best to explain where the null pointer comes from, on the syntactic level, by unpacking expressions that wrap the expression from which the pointer "originates", where "originates" is understood in a vague manner defined only by what the callers of this utility function expect to see. This leaves us with a few kinds of expressions that we need to unwrap, and we shouldn't touch other expressions, leaving pattern-matching over them to the caller. There are other facilities that work on non-syntactic level, like your example where we might jump from function expression to return statement within the callee if the callee was inlined, or how `getNilReceiver()` function unwraps `ObjCMessageExpr` to the `self` pointer *iff* it `self` was `nil` during symbolic execution (which implies that the whole message expression is `nil`). So that's right - lvalue-to-rvalue casts are not "the whole point", but merely an example of an expression kind that we do not have a right to unwrap. In fact, the callers still expect us to unwrap it once, but immediately stop afterwards. This inconsistency should probably be fixed, but `getDerefExpr` is used by checkers, and current code in checkers seems clean and concise enough. I had a look at other cast kinds in `OperationKinds.def`, and all of them seemed as if they're worth unwrapping, apart from, of course, `CK_LValueToRValue`. @xazax.hun: So, i guess, if the cast is earlier than we'd expect, then we'd just fail to unwrap something. So we won't find where the pointer comes from, and give up prematurely on our pattern-matching, while looking at roughly the same expression/value, plus-minus offsets. It sounds better than the case when the cast is //later// than we'd expect, where we may accidentally unwrap wrong stuff and start tracking a completely unrelated value, so i hope it shouldn't be a problem. Thanks for sharing the concern - the AST is huge and very few people possess really deep understanding of it, so any curious findings about it are really nice to share. https://reviews.llvm.org/D37023 Files: lib/StaticAnalyzer/Core/BugReporterVisitors.cpp test/Analysis/null-deref-path-notes.m
Index: test/Analysis/null-deref-path-notes.m =================================================================== --- test/Analysis/null-deref-path-notes.m +++ test/Analysis/null-deref-path-notes.m @@ -50,6 +50,23 @@ *p = 1; // expected-warning{{Dereference of null pointer}} expected-note{{Dereference of null pointer}} } +@interface WithArrayPtr +- (void) useArray; +@end + +@implementation WithArrayPtr { +@public int *p; +} +- (void)useArray { + p[1] = 2; // expected-warning{{Array access (via ivar 'p') results in a null pointer dereference}} + // expected-note@-1{{Array access (via ivar 'p') results in a null pointer dereference}} +} +@end + +void testWithArrayPtr(WithArrayPtr *w) { + w->p = 0; // expected-note{{Null pointer value stored to 'p'}} + [w useArray]; // expected-note{{Calling 'useArray'}} +} // CHECK: <key>diagnostics</key> // CHECK-NEXT: <array> @@ -801,4 +818,227 @@ // CHECK-NEXT: <key>file</key><integer>0</integer> // CHECK-NEXT: </dict> // CHECK-NEXT: </dict> +// CHECK-NEXT: <dict> +// CHECK-NEXT: <key>path</key> +// CHECK-NEXT: <array> +// CHECK-NEXT: <dict> +// CHECK-NEXT: <key>kind</key><string>event</string> +// CHECK-NEXT: <key>location</key> +// CHECK-NEXT: <dict> +// CHECK-NEXT: <key>line</key><integer>67</integer> +// CHECK-NEXT: <key>col</key><integer>3</integer> +// CHECK-NEXT: <key>file</key><integer>0</integer> +// CHECK-NEXT: </dict> +// CHECK-NEXT: <key>ranges</key> +// CHECK-NEXT: <array> +// CHECK-NEXT: <array> +// CHECK-NEXT: <dict> +// CHECK-NEXT: <key>line</key><integer>67</integer> +// CHECK-NEXT: <key>col</key><integer>3</integer> +// CHECK-NEXT: <key>file</key><integer>0</integer> +// CHECK-NEXT: </dict> +// CHECK-NEXT: <dict> +// CHECK-NEXT: <key>line</key><integer>67</integer> +// CHECK-NEXT: <key>col</key><integer>10</integer> +// CHECK-NEXT: <key>file</key><integer>0</integer> +// CHECK-NEXT: </dict> +// CHECK-NEXT: </array> +// CHECK-NEXT: </array> +// CHECK-NEXT: <key>depth</key><integer>0</integer> +// CHECK-NEXT: <key>extended_message</key> +// CHECK-NEXT: <string>Null pointer value stored to 'p'</string> +// CHECK-NEXT: <key>message</key> +// CHECK-NEXT: <string>Null pointer value stored to 'p'</string> +// CHECK-NEXT: </dict> +// CHECK-NEXT: <dict> +// CHECK-NEXT: <key>kind</key><string>control</string> +// CHECK-NEXT: <key>edges</key> +// CHECK-NEXT: <array> +// CHECK-NEXT: <dict> +// CHECK-NEXT: <key>start</key> +// CHECK-NEXT: <array> +// CHECK-NEXT: <dict> +// CHECK-NEXT: <key>line</key><integer>67</integer> +// CHECK-NEXT: <key>col</key><integer>3</integer> +// CHECK-NEXT: <key>file</key><integer>0</integer> +// CHECK-NEXT: </dict> +// CHECK-NEXT: <dict> +// CHECK-NEXT: <key>line</key><integer>67</integer> +// CHECK-NEXT: <key>col</key><integer>3</integer> +// CHECK-NEXT: <key>file</key><integer>0</integer> +// CHECK-NEXT: </dict> +// CHECK-NEXT: </array> +// CHECK-NEXT: <key>end</key> +// CHECK-NEXT: <array> +// CHECK-NEXT: <dict> +// CHECK-NEXT: <key>line</key><integer>68</integer> +// CHECK-NEXT: <key>col</key><integer>3</integer> +// CHECK-NEXT: <key>file</key><integer>0</integer> +// CHECK-NEXT: </dict> +// CHECK-NEXT: <dict> +// CHECK-NEXT: <key>line</key><integer>68</integer> +// CHECK-NEXT: <key>col</key><integer>3</integer> +// CHECK-NEXT: <key>file</key><integer>0</integer> +// CHECK-NEXT: </dict> +// CHECK-NEXT: </array> +// CHECK-NEXT: </dict> +// CHECK-NEXT: </array> +// CHECK-NEXT: </dict> +// CHECK-NEXT: <dict> +// CHECK-NEXT: <key>kind</key><string>event</string> +// CHECK-NEXT: <key>location</key> +// CHECK-NEXT: <dict> +// CHECK-NEXT: <key>line</key><integer>68</integer> +// CHECK-NEXT: <key>col</key><integer>3</integer> +// CHECK-NEXT: <key>file</key><integer>0</integer> +// CHECK-NEXT: </dict> +// CHECK-NEXT: <key>ranges</key> +// CHECK-NEXT: <array> +// CHECK-NEXT: <array> +// CHECK-NEXT: <dict> +// CHECK-NEXT: <key>line</key><integer>68</integer> +// CHECK-NEXT: <key>col</key><integer>3</integer> +// CHECK-NEXT: <key>file</key><integer>0</integer> +// CHECK-NEXT: </dict> +// CHECK-NEXT: <dict> +// CHECK-NEXT: <key>line</key><integer>68</integer> +// CHECK-NEXT: <key>col</key><integer>14</integer> +// CHECK-NEXT: <key>file</key><integer>0</integer> +// CHECK-NEXT: </dict> +// CHECK-NEXT: </array> +// CHECK-NEXT: </array> +// CHECK-NEXT: <key>depth</key><integer>0</integer> +// CHECK-NEXT: <key>extended_message</key> +// CHECK-NEXT: <string>Calling 'useArray'</string> +// CHECK-NEXT: <key>message</key> +// CHECK-NEXT: <string>Calling 'useArray'</string> +// CHECK-NEXT: </dict> +// CHECK-NEXT: <dict> +// CHECK-NEXT: <key>kind</key><string>event</string> +// CHECK-NEXT: <key>location</key> +// CHECK-NEXT: <dict> +// CHECK-NEXT: <key>line</key><integer>60</integer> +// CHECK-NEXT: <key>col</key><integer>1</integer> +// CHECK-NEXT: <key>file</key><integer>0</integer> +// CHECK-NEXT: </dict> +// CHECK-NEXT: <key>depth</key><integer>1</integer> +// CHECK-NEXT: <key>extended_message</key> +// CHECK-NEXT: <string>Entered call from 'testWithArrayPtr'</string> +// CHECK-NEXT: <key>message</key> +// CHECK-NEXT: <string>Entered call from 'testWithArrayPtr'</string> +// CHECK-NEXT: </dict> +// CHECK-NEXT: <dict> +// CHECK-NEXT: <key>kind</key><string>control</string> +// CHECK-NEXT: <key>edges</key> +// CHECK-NEXT: <array> +// CHECK-NEXT: <dict> +// CHECK-NEXT: <key>start</key> +// CHECK-NEXT: <array> +// CHECK-NEXT: <dict> +// CHECK-NEXT: <key>line</key><integer>60</integer> +// CHECK-NEXT: <key>col</key><integer>1</integer> +// CHECK-NEXT: <key>file</key><integer>0</integer> +// CHECK-NEXT: </dict> +// CHECK-NEXT: <dict> +// CHECK-NEXT: <key>line</key><integer>60</integer> +// CHECK-NEXT: <key>col</key><integer>1</integer> +// CHECK-NEXT: <key>file</key><integer>0</integer> +// CHECK-NEXT: </dict> +// CHECK-NEXT: </array> +// CHECK-NEXT: <key>end</key> +// CHECK-NEXT: <array> +// CHECK-NEXT: <dict> +// CHECK-NEXT: <key>line</key><integer>61</integer> +// CHECK-NEXT: <key>col</key><integer>3</integer> +// CHECK-NEXT: <key>file</key><integer>0</integer> +// CHECK-NEXT: </dict> +// CHECK-NEXT: <dict> +// CHECK-NEXT: <key>line</key><integer>61</integer> +// CHECK-NEXT: <key>col</key><integer>3</integer> +// CHECK-NEXT: <key>file</key><integer>0</integer> +// CHECK-NEXT: </dict> +// CHECK-NEXT: </array> +// CHECK-NEXT: </dict> +// CHECK-NEXT: </array> +// CHECK-NEXT: </dict> +// CHECK-NEXT: <dict> +// CHECK-NEXT: <key>kind</key><string>control</string> +// CHECK-NEXT: <key>edges</key> +// CHECK-NEXT: <array> +// CHECK-NEXT: <dict> +// CHECK-NEXT: <key>start</key> +// CHECK-NEXT: <array> +// CHECK-NEXT: <dict> +// CHECK-NEXT: <key>line</key><integer>61</integer> +// CHECK-NEXT: <key>col</key><integer>3</integer> +// CHECK-NEXT: <key>file</key><integer>0</integer> +// CHECK-NEXT: </dict> +// CHECK-NEXT: <dict> +// CHECK-NEXT: <key>line</key><integer>61</integer> +// CHECK-NEXT: <key>col</key><integer>3</integer> +// CHECK-NEXT: <key>file</key><integer>0</integer> +// CHECK-NEXT: </dict> +// CHECK-NEXT: </array> +// CHECK-NEXT: <key>end</key> +// CHECK-NEXT: <array> +// CHECK-NEXT: <dict> +// CHECK-NEXT: <key>line</key><integer>61</integer> +// CHECK-NEXT: <key>col</key><integer>8</integer> +// CHECK-NEXT: <key>file</key><integer>0</integer> +// CHECK-NEXT: </dict> +// CHECK-NEXT: <dict> +// CHECK-NEXT: <key>line</key><integer>61</integer> +// CHECK-NEXT: <key>col</key><integer>8</integer> +// CHECK-NEXT: <key>file</key><integer>0</integer> +// CHECK-NEXT: </dict> +// CHECK-NEXT: </array> +// CHECK-NEXT: </dict> +// CHECK-NEXT: </array> +// CHECK-NEXT: </dict> +// CHECK-NEXT: <dict> +// CHECK-NEXT: <key>kind</key><string>event</string> +// CHECK-NEXT: <key>location</key> +// CHECK-NEXT: <dict> +// CHECK-NEXT: <key>line</key><integer>61</integer> +// CHECK-NEXT: <key>col</key><integer>8</integer> +// CHECK-NEXT: <key>file</key><integer>0</integer> +// CHECK-NEXT: </dict> +// CHECK-NEXT: <key>ranges</key> +// CHECK-NEXT: <array> +// CHECK-NEXT: <array> +// CHECK-NEXT: <dict> +// CHECK-NEXT: <key>line</key><integer>61</integer> +// CHECK-NEXT: <key>col</key><integer>3</integer> +// CHECK-NEXT: <key>file</key><integer>0</integer> +// CHECK-NEXT: </dict> +// CHECK-NEXT: <dict> +// CHECK-NEXT: <key>line</key><integer>61</integer> +// CHECK-NEXT: <key>col</key><integer>3</integer> +// CHECK-NEXT: <key>file</key><integer>0</integer> +// CHECK-NEXT: </dict> +// CHECK-NEXT: </array> +// CHECK-NEXT: </array> +// CHECK-NEXT: <key>depth</key><integer>1</integer> +// CHECK-NEXT: <key>extended_message</key> +// CHECK-NEXT: <string>Array access (via ivar 'p') results in a null pointer dereference</string> +// CHECK-NEXT: <key>message</key> +// CHECK-NEXT: <string>Array access (via ivar 'p') results in a null pointer dereference</string> +// CHECK-NEXT: </dict> +// CHECK-NEXT: </array> +// CHECK-NEXT: <key>description</key><string>Array access (via ivar 'p') results in a null pointer dereference</string> +// CHECK-NEXT: <key>category</key><string>Logic error</string> +// CHECK-NEXT: <key>type</key><string>Dereference of null pointer</string> +// CHECK-NEXT: <key>check_name</key><string>core.NullDereference</string> +// CHECK-NEXT: <!-- This hash is experimental and going to change! --> +// CHECK-NEXT: <key>issue_hash_content_of_line_in_context</key><string>fb0ad1e4e3090d9834d542eb54bc9d2e</string> +// CHECK-NEXT: <key>issue_context_kind</key><string>Objective-C method</string> +// CHECK-NEXT: <key>issue_context</key><string>useArray</string> +// CHECK-NEXT: <key>issue_hash_function_offset</key><string>1</string> +// CHECK-NEXT: <key>location</key> +// CHECK-NEXT: <dict> +// CHECK-NEXT: <key>line</key><integer>61</integer> +// CHECK-NEXT: <key>col</key><integer>8</integer> +// CHECK-NEXT: <key>file</key><integer>0</integer> +// CHECK-NEXT: </dict> +// CHECK-NEXT: </dict> // CHECK-NEXT: </array> Index: lib/StaticAnalyzer/Core/BugReporterVisitors.cpp =================================================================== --- lib/StaticAnalyzer/Core/BugReporterVisitors.cpp +++ lib/StaticAnalyzer/Core/BugReporterVisitors.cpp @@ -42,48 +42,68 @@ return false; } +/// Given that expression S represents a pointer that would be dereferenced, +/// try to find a sub-expression from which the pointer came from. +/// This is used for tracking down origins of a null or undefined value: +/// "this is null because that is null because that is null" etc. +/// We wipe away field and element offsets because they merely add offsets. +/// We also wipe away all casts except lvalue-to-rvalue casts, because the +/// latter represent an actual pointer dereference; however, we remove +/// the final lvalue-to-rvalue cast before returning from this function +/// because it demonstrates more clearly from where the pointer rvalue was +/// loaded. Examples: +/// x->y.z ==> x (lvalue) +/// foo()->y.z ==> foo() (rvalue) const Expr *bugreporter::getDerefExpr(const Stmt *S) { - // Pattern match for a few useful cases: - // a[0], p->f, *p const Expr *E = dyn_cast<Expr>(S); if (!E) return nullptr; - E = E->IgnoreParenCasts(); while (true) { - if (const BinaryOperator *B = dyn_cast<BinaryOperator>(E)) { - assert(B->isAssignmentOp()); - E = B->getLHS()->IgnoreParenCasts(); - continue; - } - else if (const UnaryOperator *U = dyn_cast<UnaryOperator>(E)) { - if (U->getOpcode() == UO_Deref) - return U->getSubExpr()->IgnoreParenCasts(); - } - else if (const MemberExpr *ME = dyn_cast<MemberExpr>(E)) { - if (ME->isImplicitAccess()) { - return ME; - } else if (ME->isArrow() || isDeclRefExprToReference(ME->getBase())) { - return ME->getBase()->IgnoreParenCasts(); + if (const CastExpr *CE = dyn_cast<CastExpr>(E)) { + if (CE->getCastKind() == CK_LValueToRValue) { + // This cast represents the load we're looking for. + break; + } + E = CE->getSubExpr(); + } else if (isa<BinaryOperator>(E)) { + // Probably more arithmetic can be pattern-matched here, + // but for now give up. + break; + } else if (const UnaryOperator *U = dyn_cast<UnaryOperator>(E)) { + if (U->getOpcode() == UO_Deref) { + // Operators '*' and '&' don't actually mean anything. + // We look at casts instead. + E = U->getSubExpr(); } else { - // If we have a member expr with a dot, the base must have been - // dereferenced. - return getDerefExpr(ME->getBase()); + // Probably more arithmetic can be pattern-matched here, + // but for now give up. + break; } } - else if (const ObjCIvarRefExpr *IvarRef = dyn_cast<ObjCIvarRefExpr>(E)) { - return IvarRef->getBase()->IgnoreParenCasts(); - } - else if (const ArraySubscriptExpr *AE = dyn_cast<ArraySubscriptExpr>(E)) { - return getDerefExpr(AE->getBase()); - } - else if (isa<DeclRefExpr>(E)) { - return E; + // Pattern match for a few useful cases: a[0], p->f, *p etc. + else if (const MemberExpr *ME = dyn_cast<MemberExpr>(E)) { + E = ME->getBase(); + } else if (const ObjCIvarRefExpr *IvarRef = dyn_cast<ObjCIvarRefExpr>(E)) { + E = IvarRef->getBase(); + } else if (const ArraySubscriptExpr *AE = dyn_cast<ArraySubscriptExpr>(E)) { + E = AE->getBase(); + } else if (const ParenExpr *PE = dyn_cast<ParenExpr>(E)) { + E = PE->getSubExpr(); + } else { + // Other arbitrary stuff. + break; } - break; } - return nullptr; + // Special case: remove the final lvalue-to-rvalue cast, but do not recurse + // deeper into the sub-expression. This way we return the lvalue from which + // our pointer rvalue was loaded. + if (const ImplicitCastExpr *CE = dyn_cast<ImplicitCastExpr>(E)) + if (CE->getCastKind() == CK_LValueToRValue) + E = CE->getSubExpr(); + + return E; } const Stmt *bugreporter::GetDenomExpr(const ExplodedNode *N) {
_______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits