[PATCH] D122231: [clang][dataflow] Add support for `value_or` in a comparison.

Yitzhak Mandelbaum via Phabricator via cfe-commits Tue, 29 Mar 2022 08:15:53 -0700

ymandel marked 7 inline comments as done.
ymandel added inline comments.


================
Comment at: 
clang/lib/Analysis/FlowSensitive/Models/UncheckedOptionalAccessModel.cpp:142
+  // `opt.value_or(nullptr) != nullptr` and `opt.value_or(0) != 0`. Ideally,
+  // we'd support this pattern for any expression, but the AST does not have a
+  // generic expression comparison facility, so we specialize to common cases
----------------
xazax.hun wrote:
> Yeah, I think Clang is in a very sad state in this regard. We have a lot of 
> half done facilities littered all over the codebase, including:
> https://github.com/llvm/llvm-project/blob/main/clang/include/clang/Analysis/CloneDetection.h
> https://github.com/llvm/llvm-project/blob/main/clang/lib/StaticAnalyzer/Checkers/IdenticalExprChecker.cpp#L306
> https://github.com/llvm/llvm-project/blob/main/clang-tools-extra/clang-tidy/misc/RedundantExpressionCheck.cpp#L60
> 
> 
Right. I've added a FIXME. I think an elegant solution in this case would be a 
relational matcher for built-in types.


================
Comment at: 
clang/lib/Analysis/FlowSensitive/Models/UncheckedOptionalAccessModel.cpp:147
+                         anyOf(ComparesToSame(cxxNullPtrLiteralExpr()),
+                               ComparesToSame(integerLiteral(equals(0)))));
+}
----------------
xazax.hun wrote:
> I wonder if we want to add `""` to support `opt.value_or("") != ""`. Not sure 
> how frequent would this be over the empty call.
Makes sense -- done.


================
Comment at: 
clang/lib/Analysis/FlowSensitive/Models/UncheckedOptionalAccessModel.cpp:270
+  // needed.
+  BoolValue &ComparisonValue = MakeValue(Env, *HasValueVal);
+  auto *ComparisonExprLoc =
----------------
sgatev wrote:
> ymandel wrote:
> > sgatev wrote:
> > > ymandel wrote:
> > > > ymandel wrote:
> > > > > xazax.hun wrote:
> > > > > > xazax.hun wrote:
> > > > > > > ymandel wrote:
> > > > > > > > xazax.hun wrote:
> > > > > > > > > Is this the right way to initialize `ComparisonValue`?
> > > > > > > > > 
> > > > > > > > > Considering the expression: `opt.value_or(nullptr) != nullptr`
> > > > > > > > > * When `has_value == false`, `opt.value_or(nullptr)` will 
> > > > > > > > > return `nullptr`, so `!=` evaluates to false. This case seems 
> > > > > > > > > to check out.
> > > > > > > > > * However, when `has_value == true`, `opt` might still hold 
> > > > > > > > > an `nullptr` and `!=` could still evaluate to false. 
> > > > > > > > Thanks for digging into this. I think it's correct, but helpful 
> > > > > > > > to step through:
> > > > > > > > 
> > > > > > > > Its correctness depends on `MakeValue`, so I'll focus on that 
> > > > > > > > in particular. For the `nullptr` case, we'll get:
> > > > > > > > ```
> > > > > > > > HasValueVal && ContentsNotEqX
> > > > > > > > ```
> > > > > > > > So, when `has_value == true`, this basically reduces to 
> > > > > > > > `ContentsNotEqX`. Since that's an atom, the result is 
> > > > > > > > indeterminate, which I believe is the desired outcome.
> > > > > > > > 
> > > > > > > > WDYT?  Also, even if I've convinced you, please let me know how 
> > > > > > > > i can improve the comments. For that matter, would `MakeValue` 
> > > > > > > > be better with a more specific name, like "MakePredicate" or 
> > > > > > > > somesuch?
> > > > > > > I think what confuses me is that we do something different for 
> > > > > > > the 3 cases. You convinced me that `HasValueVal && 
> > > > > > > ContentsNotEqX` is correct. But we only do this for one branch 
> > > > > > > out of the 3.  What is the reason for that?
> > > > > > Oh, never mind. Yeah, I think changing `MakeValue` to 
> > > > > > `MakePredicate` would make this a bit clearer. After a second read 
> > > > > > now I understand better what is going on.
> > > > > Just to be clear: the three cases you mean are lines 273-283, or 
> > > > > something else?
> > > > and never mind my question, then (I rpelied before I saw your updated). 
> > > > I'll change the name and add comments.
> > > Can you elaborate on the three cases on lines 273-283? Why not simply do
> > > 
> > > ```
> > > auto &ComparisonExprLoc = Env.createStorageLocation(*ComparisonExpr);
> > > Env.setStorageLocation(ComparisonExpr, ComparisonExprLoc);
> > > Env.setValue(ComparisonExprLoc, ComparisonValue);
> > > ```
> > > Can you elaborate on the three cases on lines 273-283? Why not simply do
> > > 
> > > ```
> > > auto &ComparisonExprLoc = Env.createStorageLocation(*ComparisonExpr);
> > > Env.setStorageLocation(ComparisonExpr, ComparisonExprLoc);
> > > Env.setValue(ComparisonExprLoc, ComparisonValue);
> > > ```
> > 
> > for the second case: I think we should drop it -- I don't see a reason to 
> > maintain the previous value (if there is any). It might be a good idea for 
> > compositionality, but we're not doing that anywhere else, so it doesn't 
> > make sense here.
> > 
> > for the first and third case: I assumed that if the expression already has 
> > a location, we'd want to reuse it. But, based on your question, I take it 
> > that's incorrect?
> > 
> Dropping the second case makes sense to me.
> 
> For the rest, `createStorageLocation` returns a stable storage location so 
> the snippet above should be sufficient. However, `setStorageLocation` will 
> fail if we try calling it again with the same expression, even if it's called 
> with the same storage location. What do you think about making 
> `setStorageLocation` not fail if it's called with the same arguments?
for the code here -- I don't think that there's a case where there's not value 
associated with a comparison, yet there is a loc, so I think that snippet is 
fine.

for the general case, I think that `setStorageLocation` should be be no 
stricter than  `createStorageLocation`. it seems strange for the set operation 
to fail when the create does not (since "set" is more commonly a repeatable 
operation). Otherwise, no strong opinion as long as we document the behavior.


================
Comment at: 
clang/lib/Analysis/FlowSensitive/Models/UncheckedOptionalAccessModel.cpp:279
+                 cast_or_null<BoolValue>(Env.getValue(*ComparisonExprLoc))) {
+    Env.setValue(*ComparisonExprLoc,
+                 Env.makeAnd(*CurrentValue, ComparisonValue));
----------------
sgatev wrote:
> ymandel wrote:
> > ymandel wrote:
> > > xazax.hun wrote:
> > > > I am still wondering a bit about this case. 
> > > > 
> > > > We generate: `HasValueVal and ContentsNotEqX and CurrentValue`.'
> > > > I wonder if we want: `HasValueVal  and (ContentsNotEqX  <=> 
> > > > CurrentValue)` instead?  Or even `HasValueVal  and CurrentValue`?
> > > I don't think that the iff version is right, but `HasValueVal  and 
> > > CurrentValue` could be. My concern is that we're not guaranteed that 
> > > `CurrentValue` is populated. And, even if we were, it doesn't feel quite 
> > > right. Assuming its a high fidelity model, we get (logically): 
> > > `HasValue(opt) and Ne(ValueOr(opt,X),X)`. Then, when negated (say, on an 
> > > else branch) we get `not(HasValue(opt)) or not(Ne(ValueOr(opt,X),X))` 
> > > which is equivalent to `not(HasValue(opt)) or Eq(ValueOr(opt,X),X)`. 
> > > While  true, it seems redundant, since the first clause should be 
> > > derivable from the second (assuming an interpretatable semantics to the 
> > > `ValueOr` predicate).
> > > 
> > > Regardless, it might be better to step back and figure out how this 
> > > should be done systematically. I'll try to come back with a proposal on 
> > > that.
> > > Regardless, it might be better to step back and figure out how this 
> > > should be done systematically. I'll try to come back with a proposal on 
> > > that.
> > 
> > Here's what I have: in general, we're aiming for all models to be a sound 
> > (over) approximation of reality. That is what we're doing here as well. 
> > Yet, that poses a problem for the interpretation of the boolean not 
> > operator. If its operand is an overapproximation, then I believe the naive 
> > approach gives you an under approximation. That's the problem we're hitting 
> > when reasoning about the negation.
> > 
> > I'm not sure how to handle this. Stanislav -- have we dealt with this issue 
> > before?
> > 
> > That said, if we go back to the previous approach, of adding the 
> > information to the path condition, I think we avoid this problem, since the 
> > path conditions don't get negated.  To Gabor's earlier point:
> > > There is an implication in the reverse direction as well. In case we know 
> > > the optional is empty, we can prune one of the branches from the 
> > > analysis. Is it possible to implement that with the current status of the 
> > > framework?
> > I think is covered by the condition we're adding. Namely:
> > ```
> > ExprValue => has_value
> > ```
> > where `ExprValue` is the truth value of the boolean expression.
> > 
> > So, the implication in the reverse direction is:
> > ```
> > !has_value => !ExprValue
> > ```
> > that is, if we know the optional doesn't hold a value, then we know that 
> > `opt.value_or(X) = X`
> > 
> > But, that implication is the contrapositive of our own, so I think it's 
> > already implicitly covered by adding the single implication. Does that 
> > sound right?
> I'm not following where `Env.makeAnd(*CurrentValue, ComparisonValue)` comes 
> from so I'd question whether it's sound or not. I would have expected to see 
> something like `ExprValue => has_value` (which I believe was the case in the 
> first iteration) and I see no issues with the contrapositive. If you have `x 
> => y` and `not y` in the flow condition, you'll be able to infer that `not x` 
> is true (assuming no other statements for `x`). How we use this to prune 
> branches from the analysis is a question of its own.
I think the new version resolves this?


================
Comment at: 
clang/lib/Analysis/FlowSensitive/Models/UncheckedOptionalAccessModel.cpp:295
+        // atom.
+        BoolValue &OptionalHoldsEmptyString = Env.makeAtomicBoolValue();
+        return Env.makeOr(Env.makeAnd(HasValueVal, OptionalHoldsEmptyString),
----------------
xazax.hun wrote:
> Not related to this PR, but I think in the future we will want to associate 
> names to the values to make debugging easier (or maybe to generate really 
> nice error messages).
Good idea. I've noted that (to myself) as a todo to add a FIXME or somesuch. 


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D122231/new/

https://reviews.llvm.org/D122231

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D122231: [clang][dataflow] Add support for `value_or` in a comparison.

Reply via email to