aaron.ballman added subscribers: dexonsmith, rjmccall.
aaron.ballman added a comment.

In D132568#3747598 <https://reviews.llvm.org/D132568#3747598>, @nickdesaulniers 
wrote:

> In D132568#3746551 <https://reviews.llvm.org/D132568#3746551>, @aaron.ballman 
> wrote:
>
>> Thank you for the patch, but it'd be really helpful to me as a reviewer if 
>> you and @nickdesaulniers could coordinate so there's only one patch trying 
>> to address #57102 instead of two competing patches (I'm happy to review 
>> whichever patch comes out). From a quick look over the test case changes, 
>> this patch looks like it's more in line with how I'd expect to resolve that 
>> issue compared to D132266 <https://reviews.llvm.org/D132266>.
>
> The addition to clang/test/Sema/format-strings.c looks like it will resolve 
> Linus' complaints in https://github.com/llvm/llvm-project/issues/57102; I'm 
> happy to abandon D132266 <https://reviews.llvm.org/D132266> in preference of 
> this.

Thanks, Nick!

I asked some questions about `wchar_t` behavior in the thread, but I sort of 
wonder if we should handle those concerns separately (if changes are needed) as 
it looks more like missing functionality than promotion-related behavior. We 
have other changes we'll need in this area anyway for C23 because of 
https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2630.pdf, 
https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2680.pdf, and 
https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2763.pdf so there's no need 
to cram it all into one patch.



================
Comment at: clang/include/clang/AST/FormatString.h:263-264
     NoMatch = 0,
     /// The conversion specifier and the argument type are compatible. For
     /// instance, "%d" and _Bool.
     Match = 1,
----------------



================
Comment at: clang/lib/AST/FormatString.cpp:367
+        case BuiltinType::Char_U:
+        case BuiltinType::Bool:
+          return Match;
----------------
Isn't this a match promotion as well? e.g. `printf("%hhd", (_Bool)0);` where 
the `_Bool` is promoted to `int` but then cast back down to a char type (which 
seems rather unlikely to be a type confusion issue).


================
Comment at: clang/lib/AST/FormatString.cpp:378
+          case BuiltinType::Short:
+          case BuiltinType::UShort:
+            return NoMatchPromotionTypeConfusion;
----------------
In C, `wchar_t` is an integer type and that underlying type might be `int` or 
it might be promotable to `int`. Should we handle it here as a type confused 
promotion match?


================
Comment at: clang/lib/AST/FormatString.cpp:401
+      if (const auto *BT = argTy->getAs<BuiltinType>()) {
+        if (!Ptr) {
+          switch (BT->getKind()) {
----------------
It's a bit strange that we have two switches over the same `BT->getKind()` and 
the only difference is `!Ptr`; would it be easier to read if we combined the 
two switches into one and had logic in the individual cases for `Ptr` vs not 
`Ptr`?


================
Comment at: clang/lib/AST/FormatString.cpp:408
+            if (T == C.SignedCharTy || T == C.UnsignedCharTy ||
+                T == C.ShortTy || T == C.UnsignedShortTy)
+              return MatchPromotion;
----------------
`wchar_t`?


================
Comment at: clang/lib/AST/FormatString.cpp:413-414
+          case BuiltinType::UShort:
+            if (T == C.SignedCharTy || T == C.UnsignedCharTy || T == C.IntTy ||
+                T == C.UnsignedIntTy)
+              return NoMatchPromotionTypeConfusion;
----------------
I agree that `printf("%hhd", (short)0);` is potentially type confused, but why 
`printf("%d", (short)0);`?

FWIW, I think that `printf("%lc", (short)0));` is potentially type confused as 
well.


================
Comment at: clang/lib/AST/FormatString.cpp:422
+          case BuiltinType::Bool:
+            // Don't warn printf("%hd", [char])
+            // https://reviews.llvm.org/D66186
----------------
The comment is talking about passing a `char` (promoted to `int`) then printed 
as a `short` but the check is looking for the type to be printed as an `int`; 
that doesn't seem like a type confusion situation so much as a match due to 
promotion. Should the test below be looking for a short type instead of an int 
type?


================
Comment at: clang/lib/AST/FormatString.cpp:429
+          case BuiltinType::WChar_S:
+            return NoMatchPromotionTypeConfusion;
+          }
----------------
Why is this automatically type confusion regardless of the type specified by 
the format string?


================
Comment at: clang/lib/Sema/SemaChecking.cpp:10114
   } else if (const CharacterLiteral *CL = dyn_cast<CharacterLiteral>(E)) {
     // Special case for 'a', which has type 'int' in C.
     // Note, however, that we do /not/ want to treat multibyte constants like
----------------
Do we need a similar special case for `L'a'` in C, which has type `wchar_t` 
(which is a typedef to an integer type)?


================
Comment at: clang/test/FixIt/format.m:40
 
-void test_object_correction (id x) {  
+void test_object_correction (id x) {
   NSLog(@"%d", x); // expected-warning{{format specifies type 'int' but the 
argument has type 'id'}}
----------------
It looks like some whitespace only changes snuck in.


================
Comment at: clang/test/FixIt/format.m:173-176
-  NSLog(@"%hhd", 'a'); // expected-warning{{format specifies type 'char' but 
the argument has type 'int'}}
-  // CHECK: fix-it:"{{.*}}":{[[@LINE-1]]:11-[[@LINE-1]]:15}:"%d"
-
-  NSLog(@"%hhu", 'a'); // expected-warning{{format specifies type 'unsigned 
char' but the argument has type 'int'}}
----------------
Should we leave the test but remove the expected warnings and fix-it comments 
instead?


================
Comment at: clang/test/FixIt/format.m:195-208
-  NSLog(@"%C", 0x260300);  // expected-warning{{format specifies type 
'unichar' (aka 'unsigned short') but the argument has type 'int'}}
-  // CHECK-NOT: fix-it:"{{.*}}":{[[@LINE-1]]:11-[[@LINE-1]]:13}:"%d"
-  // CHECK: fix-it:"{{.*}}":{[[@LINE-2]]:16-[[@LINE-2]]:16}:"(unsigned short)"
 
   typedef unsigned short unichar;
-  
-  NSLog(@"%C", 0x260300);  // expected-warning{{format specifies type 
'unichar' (aka 'unsigned short') but the argument has type 'int'}}
----------------
Hmmm, did we intend to change the behavior here? `%C` is an extension, so it's 
not clear to me if we wanted to modify this behavior or not. CC @rjmccall and 
@dexonsmith for some ObjC opinions.


================
Comment at: clang/test/FixIt/format.mm:14-25
-  NSLog(@"%C", 0x260300);  // expected-warning{{format specifies type 
'unichar' (aka 'unsigned short') but the argument has type 'int'}}
-  // CHECK-NOT: fix-it:"{{.*}}":{[[@LINE-1]]:11-[[@LINE-1]]:13}:"%d"
-  // CHECK: fix-it:"{{.*}}":{[[@LINE-2]]:16-[[@LINE-2]]:16}:"(unsigned short)"
-
   typedef unsigned short unichar;
 
   NSLog(@"%C", wchar_data);  // expected-warning{{format specifies type 
'unichar' (aka 'unsigned short') but the argument has type 'wchar_t'}}
----------------
Same questions here as above.


================
Comment at: clang/test/Sema/format-strings-scanf.c:15
                       unsigned char : (signed char)0))
-typedef __SSIZE_TYPE__ ssize_t;                         
+typedef __SSIZE_TYPE__ ssize_t;
 
----------------
Looks like more whitespace-only changes snuck in.


================
Comment at: clang/test/Sema/format-strings-scanf.c:265-266
+  // char & uchar
+  scanf("%hhd", &sc); // no-warning
+  scanf("%hhd", &uc); // no-warning
+  scanf("%hd", &sc); // expected-warning{{format specifies type 'short *' but 
the argument has type 'signed char *'}}
----------------
WG14 hasn't made an official determination, but one test we should add is:
```
// FIXME: does this match what the C committee allows or should it be 
pedantically warned on?
char c;
void *vp;
scanf("%hhd", &c); // Pedantic warning?
scanf("%hhd", vp); // Pedantic warning?
```
I asked on the reflectors whether folks wanted me to file an NB comment about 
this and the answers have been mixed so far. I'll most likely still file an NB 
comment to get the committee officially on record though.


================
Comment at: clang/test/Sema/format-strings-scanf.c:289-290
+  // ill-formed floats
+  scanf("%hf", // expected-warning{{length modifier 'h' results in undefined 
behavior or no effect with 'f' conversion specifier}}
+  &sc); // Is this a clang bug ?
+
----------------
Is what a clang bug?

Also, what is with the "or no effect" in that wording? "This could cause major 
issues or nothing bad at all might happen" is a very odd way to word a 
diagnostic. :-D (Maybe something to look into improving in a later patch.)


================
Comment at: clang/test/Sema/format-strings-scanf.c:271-272
+  scanf("%llx", &i); // expected-warning{{format specifies type 'unsigned long 
long *' but the argument has type 'int *'}}
+  scanf("%hf", // expected-warning{{length modifier 'h' results in undefined 
behavior or no effect with 'f' conversion specifier}}
+  &sc); // Is this a clang bug ?
+  scanf("%s", i); // expected-warning{{format specifies type 'char *' but the 
argument has type 'int'}}
----------------
nickdesaulniers wrote:
> Hmm...
@nickdesaulniers -- are you thinking this might be a pedantic warning because 
it's always valid to cast to a pointer to char and write into it? e.g.,
```
short s;
char *cp = (char *)&s;
*cp = 0; // Not UB
```


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D132568/new/

https://reviews.llvm.org/D132568

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to