================
@@ -286,7 +286,33 @@ 
clang::analyze_format_string::ParseLengthModifier(FormatSpecifier &FS,
       lmKind = LengthModifier::AsInt3264;
       break;
     case 'w':
-      lmKind = LengthModifier::AsWide; ++I; break;
+      ++I;
+      if (I == E) return false;
+      if (*I == 'f') {
+        lmKind = LengthModifier::AsWideFast;
+        ++I;
+      } else {
+        lmKind = LengthModifier::AsWide;
+      }
+
+      if (I == E) return false;
+      int s = 0;
+      while (unsigned(*I - '0') <= 9) {
+        s = 10 * s + unsigned(*I - '0');
+        ++I;
+      }
+
+      // s == 0 is MSVCRT case, like l but only for c, C, s, S, or Z on windows
+      // s != 0 for b, d, i, o, u, x, or X when a size followed(like 8, 16, 32 or 64)
+      if (s != 0) {
+        std::set<int> supported_list {8, 16, 32, 64};
----------------
AaronBallman wrote:

Thanks for the feedback! I did some research of my own to see what the 
landscape looks like and here's what I found.

1) Support for these types came in 
https://github.com/llvm/llvm-project/commit/55c9877b664c1bc6614ad588f376ef41fe6ab4ca
 with no discussion on the patch 
(https://lists.llvm.org/pipermail/cfe-commits/Week-of-Mon-20091109/023777.html) 
before it landed. Post-commit review said it looked good 
(https://lists.llvm.org/pipermail/cfe-commits/Week-of-Mon-20091116/024219.html), 
and no justification was given beyond wanting a generalized algorithm.

2) There is hardware which supports 48-bit ints 
(https://en.wikipedia.org/wiki/48-bit_computing), 36-bit ints 
(https://en.wikipedia.org/wiki/36-bit_computing), 24-bit ints 
(https://en.wikipedia.org/wiki/24-bit_computing), etc. A lot of it is older 
hardware, but as examples, x86-64 uses a 48-bit addressing scheme, OpenCL 
supports intrinsics for 24-bit multiplication, and Unisys was selling a 36-bit 
system as of 2015...

3) There's no evidence that people are using stdint.h to serve those needs. For 
example: 
https://sourcegraph.com/search?q=context:global+int48_t+-file:.*clang.*+-file:.*stdint.*+-file:.*int48_t.*&patternType=standard&sm=1&groupBy=repo
 shows that most uses are either a custom C++ data type, use of an underlying 
non-basic int type, or bit-fields. I don't think I spotted a single use of 
`int48_t` that came from stdint.h. 
https://sourcegraph.com/search?q=context:global+int24_t+-file:.*clang.*+-file:.*stdint.*+-file:.*int24_t.*&patternType=standard&sm=1&groupBy=repo
 shows similarly for `int24_t`.

4) The code which defines the underlying macros used by stdint.h is here: 
https://github.com/llvm/llvm-project/blob/29a0f3ec2b47630ce229953fe7250e741b6c10b6/clang/lib/Frontend/InitPreprocessor.cpp#L220
 and it uses `TargetInfo::getTypeWidth()` to determine the width generically. 
We have no targets 
(https://github.com/llvm/llvm-project/tree/main/clang/lib/Basic/Targets) which 
define bit widths other than 8, 16, 32, 64, or 128. So Clang itself doesn't 
support those odd types yet.
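
To illustrate point 4, here is a minimal sketch of how a width-generic scheme behaves. It is not the code in InitPreprocessor.cpp: `ToyTarget` and `exactWidthMacros` are hypothetical names made up for this example, and the `__INT<N>_TYPE__` spelling is only modeled on the predefined macros. The point is that odd-width macros can only appear if some target reports an odd width.

```cpp
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a target description. This is NOT Clang's
// TargetInfo, only a toy holding per-type bit widths.
struct ToyTarget {
  unsigned CharWidth = 8;
  unsigned ShortWidth = 16;
  unsigned IntWidth = 32;
  unsigned LongWidth = 64;
  unsigned LongLongWidth = 64;
};

// Builds a map from macro name to type name, one entry per distinct width.
// The macro name is derived from whatever width the target reports, so the
// function itself has no fixed list of widths.
std::map<std::string, std::string> exactWidthMacros(const ToyTarget &T) {
  const std::vector<std::pair<unsigned, std::string>> Types = {
      {T.CharWidth, "signed char"},
      {T.ShortWidth, "short"},
      {T.IntWidth, "int"},
      {T.LongWidth, "long int"},
      {T.LongLongWidth, "long long int"},
  };

  std::map<std::string, std::string> Macros;
  for (const auto &Entry : Types) {
    const std::string Name =
        "__INT" + std::to_string(Entry.first) + "_TYPE__";
    // insert() leaves an existing key untouched, so the first type listed
    // for a given width wins.
    Macros.insert(std::make_pair(Name, Entry.second));
  }
  return Macros;
}
```

With the default widths this yields entries for 8, 16, 32, and 64 only. Setting `IntWidth = 24` on a `ToyTarget` is the only way to get a `__INT24_TYPE__` entry out of it.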

Based on this and your details above, I think we should ignore these odd types 
for format strings on the assumption they're just not used. As for removing 
them from `stdint.h`, I think that is reasonable to explore given that we have 
no targets defining those widths (and so we should never be defining those 
macros to begin with); the algorithm remains general, we just drop the extra 
cruft from stdint.h. I don't think this should cause any issues.
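
For the format string side, a rough standalone sketch of what "ignore the odd types" could look like is below. It is not the patch's code and does not use Clang's `LengthModifier`: `WidthModifier` and `parseWidthModifier` are hypothetical names, and treating `w` with no digits as valid is an assumption taken from the MSVCRT comment in the hunk above.

```cpp
#include <cstddef>
#include <string>

// Hypothetical result type for this sketch; not Clang's LengthModifier.
struct WidthModifier {
  bool Fast = false;   // true for "wf", false for plain "w"
  unsigned Width = 0;  // 0 means no explicit width followed the modifier
};

// Parses "w" or "wf", optionally followed by a decimal width, at the start
// of Spec. Returns false if Spec does not start with 'w' or if a width is
// present but is not one of 8, 16, 32, or 64. Out is written only on success.
bool parseWidthModifier(const std::string &Spec, WidthModifier &Out) {
  std::size_t I = 0;
  if (Spec.empty() || Spec[I] != 'w')
    return false;
  ++I;

  WidthModifier Result;
  if (I < Spec.size() && Spec[I] == 'f') {
    Result.Fast = true;
    ++I;
  }

  bool SawDigit = false;
  while (I < Spec.size() && Spec[I] >= '0' && Spec[I] <= '9') {
    SawDigit = true;
    Result.Width = 10 * Result.Width + unsigned(Spec[I] - '0');
    if (Result.Width > 64)
      return false; // Wider than anything accepted; also prevents overflow.
    ++I;
  }

  if (SawDigit) {
    // A fixed switch is enough for four values; no container is needed.
    switch (Result.Width) {
    case 8:
    case 16:
    case 32:
    case 64:
      break;
    default:
      return false;
    }
  }

  Out = Result;
  return true;
}
```

Under these assumptions `"w32d"` and `"wf64x"` are accepted, while `"w24d"` and `"w48d"` are rejected. Bounds are checked before every read, so a string that ends right after `w` is handled without reading past the end.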

CC @jrtc27 @jyknight for some extra opinions.

https://github.com/llvm/llvm-project/pull/71771
