[PATCH] D126960: [clang][sema] Unary not boolean to int conversion

Richard Smith - zygoloid via Phabricator via cfe-commits Wed, 15 Jun 2022 13:25:19 -0700

rsmith added a comment.

Thank you for the patch! This is certainly an improvement but I think there are 
still some cases where we compute the wrong range for `~` with this patch 
applied.




================
Comment at: clang/lib/Sema/SemaChecking.cpp:12328
       return IntRange::forValueOfType(C, GetExprType(E));
 
+    case UO_Not:
----------------
aaron.ballman wrote:
> erichkeane wrote:
> > Richard mentions UO_PreInc, UO_PreDec, UO_Minus, and UO_Not.  What about 
> > the other 3?
> `UO_Plus` also.
Taking those in turn:
- `UO_PreInc` and `UO_PreDec` can only produce values in the range of the 
lvalue operand; we'll never have a more precise range for an lvalue than the 
range from its type unless it's a bit-field, in which case we *do* want to take 
the bit-field's width into account. So I think the current code is actually 
correct for those two -- looking at the range of the operand gives the right 
result even for bit-fields, whereas looking at the type of the operand would 
give a less precise result in that case.
- `UO_Minus` is [not giving the right answer](https://godbolt.org/z/8veKnrv7v). 
I think we should model `-expr` in exactly the same way we model `0 - expr` -- 
see the code for `BO_Sub` above.
- `UO_Plus` seems OK with the current code: the range of values of `+expr` is 
the same as the range of values of `expr`, even if a promotion happens.

In any case, I don't think we should have a `default` here that returns the 
width of the operand. Instead, I think the `default` should return the range 
for the type of `E`.


================
Comment at: clang/lib/Sema/SemaChecking.cpp:12334
+
+      LLVM_FALLTHROUGH;
+
----------------
I don't think falling through and picking up the expression range of the 
operand is ever correct for `~`. [For example](https://godbolt.org/z/oeE3G6r6d):

```
bool f(char c) {
  return ~c > 0x10000;
}
```

... produces a bogus tautological comparison warning. (This is not 
tautological: the `~` operator will map negative `char`s to `int`s in the range 
[2^31 - 2^7, 2^31) and non-negative `char`s to `int`s in the range [-2^31, 
-2^31+2^7), so this is equivalent to `c < 0`.)

Also, using `MaxWidth` here is conservatively correct but isn't precise; for 
example, we should still warn on:

```
bool f(bool b) {
  return ~b > 0x1'0000'0000LL;
}
```

... because `~b` always fits in an `int`, but I think `MaxWidth` here will be 
64 so we won't warn. (It'd be nice to warn even on `~b > 0` but I don't think 
we can do that without some major changes to how `IntRange` is represented.)

The true result range of `~n` will be something like [-2^N, -2^N + 2^M) u [2^N 
- 2^M, 2^N) if the input is signed. `IntRange` can't represent a range with a 
hole in the middle like that, but it can represent [-2^N, 2^N), which seems 
like the least wrong answer that we can give -- that is notably also the entire 
range of the type of the `~` expression. We get a contiguous range [-2^N, -2^N 
+ 2^M) if the input is known non-negative, but `IntRange` still can't represent 
that, and so that doesn't help us make our result any more precise than using 
the entire range of the type of `E`, unfortunately. So I think `UO_Not` should 
never look at its subexpression -- the best result we can give is 
`IntRange::forValueOfType(C, GetExprType(E))`, and that's what we should use.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D126960/new/

https://reviews.llvm.org/D126960

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D126960: [clang][sema] Unary not boolean to int conversion

Reply via email to