NoQ updated this revision to Diff 116542.
NoQ added a comment.

@dcoughlin: You're right, my reasoning and understanding was not correct, and 
your explanation is much more clear. My code still makes sense to me though, so 
i updated the comments to match. And moved the unusual logic for the 
lvalue-to-rvalue cast unwrap to the bottom of the function.

Essentially, `getDerefExpr()` tries its best to explain where the null pointer 
comes from, on the syntactic level, by unpacking expressions that wrap the 
expression from which the pointer "originates", where "originates" is 
understood in a vague manner defined only by what the callers of this utility 
function expect to see. This leaves us with a few kinds of expressions that we 
need to unwrap, and we shouldn't touch other expressions, leaving 
pattern-matching over them to the caller.

There are other facilities that work on non-syntactic level, like your example 
where we might jump from function expression to return statement within the 
callee if the callee was inlined, or how `getNilReceiver()` function unwraps 
`ObjCMessageExpr` to the `self` pointer *iff* it `self` was `nil` during 
symbolic execution (which implies that the whole message expression is `nil`).

So that's right - lvalue-to-rvalue casts are not "the whole point", but merely 
an example of an expression kind that we do not have a right to unwrap. In 
fact, the callers still expect us to unwrap it once, but immediately stop 
afterwards. This inconsistency should probably be fixed, but `getDerefExpr` is 
used by checkers, and current code in checkers seems clean and concise enough.

I had a look at other cast kinds in `OperationKinds.def`, and all of them 
seemed as if they're worth unwrapping, apart from, of course, 
`CK_LValueToRValue`.

@xazax.hun: So, i guess, if the cast is earlier than we'd expect, then we'd 
just fail to unwrap something. So we won't find where the pointer comes from, 
and give up prematurely on our pattern-matching, while looking at roughly the 
same expression/value, plus-minus offsets. It sounds better than the case when 
the cast is //later// than we'd expect, where we may accidentally unwrap wrong 
stuff and start tracking a completely unrelated value, so i hope it shouldn't 
be a problem. Thanks for sharing the concern - the AST is huge and very few 
people possess really deep understanding of it, so any curious findings about 
it are really nice to share.


https://reviews.llvm.org/D37023

Files:
  lib/StaticAnalyzer/Core/BugReporterVisitors.cpp
  test/Analysis/null-deref-path-notes.m

Index: test/Analysis/null-deref-path-notes.m
===================================================================
--- test/Analysis/null-deref-path-notes.m
+++ test/Analysis/null-deref-path-notes.m
@@ -50,6 +50,23 @@
   *p = 1; // expected-warning{{Dereference of null pointer}} expected-note{{Dereference of null pointer}}
 }
 
+@interface WithArrayPtr
+- (void) useArray;
+@end
+
+@implementation WithArrayPtr {
+@public int *p;
+}
+- (void)useArray {
+  p[1] = 2; // expected-warning{{Array access (via ivar 'p') results in a null pointer dereference}}
+            // expected-note@-1{{Array access (via ivar 'p') results in a null pointer dereference}}
+}
+@end
+
+void testWithArrayPtr(WithArrayPtr *w) {
+  w->p = 0; // expected-note{{Null pointer value stored to 'p'}}
+  [w useArray]; // expected-note{{Calling 'useArray'}}
+}
 
 // CHECK:  <key>diagnostics</key>
 // CHECK-NEXT:  <array>
@@ -801,4 +818,227 @@
 // CHECK-NEXT:    <key>file</key><integer>0</integer>
 // CHECK-NEXT:   </dict>
 // CHECK-NEXT:   </dict>
+// CHECK-NEXT:   <dict>
+// CHECK-NEXT:    <key>path</key>
+// CHECK-NEXT:    <array>
+// CHECK-NEXT:     <dict>
+// CHECK-NEXT:      <key>kind</key><string>event</string>
+// CHECK-NEXT:      <key>location</key>
+// CHECK-NEXT:      <dict>
+// CHECK-NEXT:       <key>line</key><integer>67</integer>
+// CHECK-NEXT:       <key>col</key><integer>3</integer>
+// CHECK-NEXT:       <key>file</key><integer>0</integer>
+// CHECK-NEXT:      </dict>
+// CHECK-NEXT:      <key>ranges</key>
+// CHECK-NEXT:      <array>
+// CHECK-NEXT:        <array>
+// CHECK-NEXT:         <dict>
+// CHECK-NEXT:          <key>line</key><integer>67</integer>
+// CHECK-NEXT:          <key>col</key><integer>3</integer>
+// CHECK-NEXT:          <key>file</key><integer>0</integer>
+// CHECK-NEXT:         </dict>
+// CHECK-NEXT:         <dict>
+// CHECK-NEXT:          <key>line</key><integer>67</integer>
+// CHECK-NEXT:          <key>col</key><integer>10</integer>
+// CHECK-NEXT:          <key>file</key><integer>0</integer>
+// CHECK-NEXT:         </dict>
+// CHECK-NEXT:        </array>
+// CHECK-NEXT:      </array>
+// CHECK-NEXT:      <key>depth</key><integer>0</integer>
+// CHECK-NEXT:      <key>extended_message</key>
+// CHECK-NEXT:      <string>Null pointer value stored to &apos;p&apos;</string>
+// CHECK-NEXT:      <key>message</key>
+// CHECK-NEXT:      <string>Null pointer value stored to &apos;p&apos;</string>
+// CHECK-NEXT:     </dict>
+// CHECK-NEXT:     <dict>
+// CHECK-NEXT:      <key>kind</key><string>control</string>
+// CHECK-NEXT:      <key>edges</key>
+// CHECK-NEXT:       <array>
+// CHECK-NEXT:        <dict>
+// CHECK-NEXT:         <key>start</key>
+// CHECK-NEXT:          <array>
+// CHECK-NEXT:           <dict>
+// CHECK-NEXT:            <key>line</key><integer>67</integer>
+// CHECK-NEXT:            <key>col</key><integer>3</integer>
+// CHECK-NEXT:            <key>file</key><integer>0</integer>
+// CHECK-NEXT:           </dict>
+// CHECK-NEXT:           <dict>
+// CHECK-NEXT:            <key>line</key><integer>67</integer>
+// CHECK-NEXT:            <key>col</key><integer>3</integer>
+// CHECK-NEXT:            <key>file</key><integer>0</integer>
+// CHECK-NEXT:           </dict>
+// CHECK-NEXT:          </array>
+// CHECK-NEXT:         <key>end</key>
+// CHECK-NEXT:          <array>
+// CHECK-NEXT:           <dict>
+// CHECK-NEXT:            <key>line</key><integer>68</integer>
+// CHECK-NEXT:            <key>col</key><integer>3</integer>
+// CHECK-NEXT:            <key>file</key><integer>0</integer>
+// CHECK-NEXT:           </dict>
+// CHECK-NEXT:           <dict>
+// CHECK-NEXT:            <key>line</key><integer>68</integer>
+// CHECK-NEXT:            <key>col</key><integer>3</integer>
+// CHECK-NEXT:            <key>file</key><integer>0</integer>
+// CHECK-NEXT:           </dict>
+// CHECK-NEXT:          </array>
+// CHECK-NEXT:        </dict>
+// CHECK-NEXT:       </array>
+// CHECK-NEXT:     </dict>
+// CHECK-NEXT:     <dict>
+// CHECK-NEXT:      <key>kind</key><string>event</string>
+// CHECK-NEXT:      <key>location</key>
+// CHECK-NEXT:      <dict>
+// CHECK-NEXT:       <key>line</key><integer>68</integer>
+// CHECK-NEXT:       <key>col</key><integer>3</integer>
+// CHECK-NEXT:       <key>file</key><integer>0</integer>
+// CHECK-NEXT:      </dict>
+// CHECK-NEXT:      <key>ranges</key>
+// CHECK-NEXT:      <array>
+// CHECK-NEXT:        <array>
+// CHECK-NEXT:         <dict>
+// CHECK-NEXT:          <key>line</key><integer>68</integer>
+// CHECK-NEXT:          <key>col</key><integer>3</integer>
+// CHECK-NEXT:          <key>file</key><integer>0</integer>
+// CHECK-NEXT:         </dict>
+// CHECK-NEXT:         <dict>
+// CHECK-NEXT:          <key>line</key><integer>68</integer>
+// CHECK-NEXT:          <key>col</key><integer>14</integer>
+// CHECK-NEXT:          <key>file</key><integer>0</integer>
+// CHECK-NEXT:         </dict>
+// CHECK-NEXT:        </array>
+// CHECK-NEXT:      </array>
+// CHECK-NEXT:      <key>depth</key><integer>0</integer>
+// CHECK-NEXT:      <key>extended_message</key>
+// CHECK-NEXT:      <string>Calling &apos;useArray&apos;</string>
+// CHECK-NEXT:      <key>message</key>
+// CHECK-NEXT:      <string>Calling &apos;useArray&apos;</string>
+// CHECK-NEXT:     </dict>
+// CHECK-NEXT:     <dict>
+// CHECK-NEXT:      <key>kind</key><string>event</string>
+// CHECK-NEXT:      <key>location</key>
+// CHECK-NEXT:      <dict>
+// CHECK-NEXT:       <key>line</key><integer>60</integer>
+// CHECK-NEXT:       <key>col</key><integer>1</integer>
+// CHECK-NEXT:       <key>file</key><integer>0</integer>
+// CHECK-NEXT:      </dict>
+// CHECK-NEXT:      <key>depth</key><integer>1</integer>
+// CHECK-NEXT:      <key>extended_message</key>
+// CHECK-NEXT:      <string>Entered call from &apos;testWithArrayPtr&apos;</string>
+// CHECK-NEXT:      <key>message</key>
+// CHECK-NEXT:      <string>Entered call from &apos;testWithArrayPtr&apos;</string>
+// CHECK-NEXT:     </dict>
+// CHECK-NEXT:     <dict>
+// CHECK-NEXT:      <key>kind</key><string>control</string>
+// CHECK-NEXT:      <key>edges</key>
+// CHECK-NEXT:       <array>
+// CHECK-NEXT:        <dict>
+// CHECK-NEXT:         <key>start</key>
+// CHECK-NEXT:          <array>
+// CHECK-NEXT:           <dict>
+// CHECK-NEXT:            <key>line</key><integer>60</integer>
+// CHECK-NEXT:            <key>col</key><integer>1</integer>
+// CHECK-NEXT:            <key>file</key><integer>0</integer>
+// CHECK-NEXT:           </dict>
+// CHECK-NEXT:           <dict>
+// CHECK-NEXT:            <key>line</key><integer>60</integer>
+// CHECK-NEXT:            <key>col</key><integer>1</integer>
+// CHECK-NEXT:            <key>file</key><integer>0</integer>
+// CHECK-NEXT:           </dict>
+// CHECK-NEXT:          </array>
+// CHECK-NEXT:         <key>end</key>
+// CHECK-NEXT:          <array>
+// CHECK-NEXT:           <dict>
+// CHECK-NEXT:            <key>line</key><integer>61</integer>
+// CHECK-NEXT:            <key>col</key><integer>3</integer>
+// CHECK-NEXT:            <key>file</key><integer>0</integer>
+// CHECK-NEXT:           </dict>
+// CHECK-NEXT:           <dict>
+// CHECK-NEXT:            <key>line</key><integer>61</integer>
+// CHECK-NEXT:            <key>col</key><integer>3</integer>
+// CHECK-NEXT:            <key>file</key><integer>0</integer>
+// CHECK-NEXT:           </dict>
+// CHECK-NEXT:          </array>
+// CHECK-NEXT:        </dict>
+// CHECK-NEXT:       </array>
+// CHECK-NEXT:     </dict>
+// CHECK-NEXT:     <dict>
+// CHECK-NEXT:      <key>kind</key><string>control</string>
+// CHECK-NEXT:      <key>edges</key>
+// CHECK-NEXT:       <array>
+// CHECK-NEXT:        <dict>
+// CHECK-NEXT:         <key>start</key>
+// CHECK-NEXT:          <array>
+// CHECK-NEXT:           <dict>
+// CHECK-NEXT:            <key>line</key><integer>61</integer>
+// CHECK-NEXT:            <key>col</key><integer>3</integer>
+// CHECK-NEXT:            <key>file</key><integer>0</integer>
+// CHECK-NEXT:           </dict>
+// CHECK-NEXT:           <dict>
+// CHECK-NEXT:            <key>line</key><integer>61</integer>
+// CHECK-NEXT:            <key>col</key><integer>3</integer>
+// CHECK-NEXT:            <key>file</key><integer>0</integer>
+// CHECK-NEXT:           </dict>
+// CHECK-NEXT:          </array>
+// CHECK-NEXT:         <key>end</key>
+// CHECK-NEXT:          <array>
+// CHECK-NEXT:           <dict>
+// CHECK-NEXT:            <key>line</key><integer>61</integer>
+// CHECK-NEXT:            <key>col</key><integer>8</integer>
+// CHECK-NEXT:            <key>file</key><integer>0</integer>
+// CHECK-NEXT:           </dict>
+// CHECK-NEXT:           <dict>
+// CHECK-NEXT:            <key>line</key><integer>61</integer>
+// CHECK-NEXT:            <key>col</key><integer>8</integer>
+// CHECK-NEXT:            <key>file</key><integer>0</integer>
+// CHECK-NEXT:           </dict>
+// CHECK-NEXT:          </array>
+// CHECK-NEXT:        </dict>
+// CHECK-NEXT:       </array>
+// CHECK-NEXT:     </dict>
+// CHECK-NEXT:     <dict>
+// CHECK-NEXT:      <key>kind</key><string>event</string>
+// CHECK-NEXT:      <key>location</key>
+// CHECK-NEXT:      <dict>
+// CHECK-NEXT:       <key>line</key><integer>61</integer>
+// CHECK-NEXT:       <key>col</key><integer>8</integer>
+// CHECK-NEXT:       <key>file</key><integer>0</integer>
+// CHECK-NEXT:      </dict>
+// CHECK-NEXT:      <key>ranges</key>
+// CHECK-NEXT:      <array>
+// CHECK-NEXT:        <array>
+// CHECK-NEXT:         <dict>
+// CHECK-NEXT:          <key>line</key><integer>61</integer>
+// CHECK-NEXT:          <key>col</key><integer>3</integer>
+// CHECK-NEXT:          <key>file</key><integer>0</integer>
+// CHECK-NEXT:         </dict>
+// CHECK-NEXT:         <dict>
+// CHECK-NEXT:          <key>line</key><integer>61</integer>
+// CHECK-NEXT:          <key>col</key><integer>3</integer>
+// CHECK-NEXT:          <key>file</key><integer>0</integer>
+// CHECK-NEXT:         </dict>
+// CHECK-NEXT:        </array>
+// CHECK-NEXT:      </array>
+// CHECK-NEXT:      <key>depth</key><integer>1</integer>
+// CHECK-NEXT:      <key>extended_message</key>
+// CHECK-NEXT:      <string>Array access (via ivar &apos;p&apos;) results in a null pointer dereference</string>
+// CHECK-NEXT:      <key>message</key>
+// CHECK-NEXT:      <string>Array access (via ivar &apos;p&apos;) results in a null pointer dereference</string>
+// CHECK-NEXT:     </dict>
+// CHECK-NEXT:    </array>
+// CHECK-NEXT:    <key>description</key><string>Array access (via ivar &apos;p&apos;) results in a null pointer dereference</string>
+// CHECK-NEXT:    <key>category</key><string>Logic error</string>
+// CHECK-NEXT:    <key>type</key><string>Dereference of null pointer</string>
+// CHECK-NEXT:    <key>check_name</key><string>core.NullDereference</string>
+// CHECK-NEXT:    <!-- This hash is experimental and going to change! -->
+// CHECK-NEXT:    <key>issue_hash_content_of_line_in_context</key><string>fb0ad1e4e3090d9834d542eb54bc9d2e</string>
+// CHECK-NEXT:   <key>issue_context_kind</key><string>Objective-C method</string>
+// CHECK-NEXT:   <key>issue_context</key><string>useArray</string>
+// CHECK-NEXT:   <key>issue_hash_function_offset</key><string>1</string>
+// CHECK-NEXT:   <key>location</key>
+// CHECK-NEXT:   <dict>
+// CHECK-NEXT:    <key>line</key><integer>61</integer>
+// CHECK-NEXT:    <key>col</key><integer>8</integer>
+// CHECK-NEXT:    <key>file</key><integer>0</integer>
+// CHECK-NEXT:   </dict>
+// CHECK-NEXT:   </dict>
 // CHECK-NEXT:  </array>
Index: lib/StaticAnalyzer/Core/BugReporterVisitors.cpp
===================================================================
--- lib/StaticAnalyzer/Core/BugReporterVisitors.cpp
+++ lib/StaticAnalyzer/Core/BugReporterVisitors.cpp
@@ -42,48 +42,68 @@
   return false;
 }
 
+/// Given that expression S represents a pointer that would be dereferenced,
+/// try to find a sub-expression from which the pointer came from.
+/// This is used for tracking down origins of a null or undefined value:
+/// "this is null because that is null because that is null" etc.
+/// We wipe away field and element offsets because they merely add offsets.
+/// We also wipe away all casts except lvalue-to-rvalue casts, because the
+/// latter represent an actual pointer dereference; however, we remove
+/// the final lvalue-to-rvalue cast before returning from this function
+/// because it demonstrates more clearly from where the pointer rvalue was
+/// loaded. Examples:
+///   x->y.z      ==>  x (lvalue)
+///   foo()->y.z  ==>  foo() (rvalue)
 const Expr *bugreporter::getDerefExpr(const Stmt *S) {
-  // Pattern match for a few useful cases:
-  //   a[0], p->f, *p
   const Expr *E = dyn_cast<Expr>(S);
   if (!E)
     return nullptr;
-  E = E->IgnoreParenCasts();
 
   while (true) {
-    if (const BinaryOperator *B = dyn_cast<BinaryOperator>(E)) {
-      assert(B->isAssignmentOp());
-      E = B->getLHS()->IgnoreParenCasts();
-      continue;
-    }
-    else if (const UnaryOperator *U = dyn_cast<UnaryOperator>(E)) {
-      if (U->getOpcode() == UO_Deref)
-        return U->getSubExpr()->IgnoreParenCasts();
-    }
-    else if (const MemberExpr *ME = dyn_cast<MemberExpr>(E)) {
-      if (ME->isImplicitAccess()) {
-        return ME;
-      } else if (ME->isArrow() || isDeclRefExprToReference(ME->getBase())) {
-        return ME->getBase()->IgnoreParenCasts();
+    if (const CastExpr *CE = dyn_cast<CastExpr>(E)) {
+      if (CE->getCastKind() == CK_LValueToRValue) {
+        // This cast represents the load we're looking for.
+        break;
+      }
+      E = CE->getSubExpr();
+    } else if (isa<BinaryOperator>(E)) {
+      // Probably more arithmetic can be pattern-matched here,
+      // but for now give up.
+      break;
+    } else if (const UnaryOperator *U = dyn_cast<UnaryOperator>(E)) {
+      if (U->getOpcode() == UO_Deref) {
+        // Operators '*' and '&' don't actually mean anything.
+        // We look at casts instead.
+        E = U->getSubExpr();
       } else {
-        // If we have a member expr with a dot, the base must have been
-        // dereferenced.
-        return getDerefExpr(ME->getBase());
+        // Probably more arithmetic can be pattern-matched here,
+        // but for now give up.
+        break;
       }
     }
-    else if (const ObjCIvarRefExpr *IvarRef = dyn_cast<ObjCIvarRefExpr>(E)) {
-      return IvarRef->getBase()->IgnoreParenCasts();
-    }
-    else if (const ArraySubscriptExpr *AE = dyn_cast<ArraySubscriptExpr>(E)) {
-      return getDerefExpr(AE->getBase());
-    }
-    else if (isa<DeclRefExpr>(E)) {
-      return E;
+    // Pattern match for a few useful cases: a[0], p->f, *p etc.
+    else if (const MemberExpr *ME = dyn_cast<MemberExpr>(E)) {
+      E = ME->getBase();
+    } else if (const ObjCIvarRefExpr *IvarRef = dyn_cast<ObjCIvarRefExpr>(E)) {
+      E = IvarRef->getBase();
+    } else if (const ArraySubscriptExpr *AE = dyn_cast<ArraySubscriptExpr>(E)) {
+      E = AE->getBase();
+    } else if (const ParenExpr *PE = dyn_cast<ParenExpr>(E)) {
+      E = PE->getSubExpr();
+    } else {
+      // Other arbitrary stuff.
+      break;
     }
-    break;
   }
 
-  return nullptr;
+  // Special case: remove the final lvalue-to-rvalue cast, but do not recurse
+  // deeper into the sub-expression. This way we return the lvalue from which
+  // our pointer rvalue was loaded.
+  if (const ImplicitCastExpr *CE = dyn_cast<ImplicitCastExpr>(E))
+    if (CE->getCastKind() == CK_LValueToRValue)
+      E = CE->getSubExpr();
+
+  return E;
 }
 
 const Stmt *bugreporter::GetDenomExpr(const ExplodedNode *N) {
_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to