================ @@ -286,7 +286,33 @@ clang::analyze_format_string::ParseLengthModifier(FormatSpecifier &FS, lmKind = LengthModifier::AsInt3264; break; case 'w': - lmKind = LengthModifier::AsWide; ++I; break; + ++I; + if (I == E) return false; + if (*I == 'f') { + lmKind = LengthModifier::AsWideFast; + ++I; + } else { + lmKind = LengthModifier::AsWide; + } + + if (I == E) return false; + int s = 0; + while (unsigned(*I - '0') <= 9) { + s = 10 * s + unsigned(*I - '0'); + ++I; + } + + // s == 0 is MSVCRT case, like l but only for c, C, s, S, or Z on windows + // s != 0 for b, d, i, o, u, x, or X when a size followed(like 8, 16, 32 or 64) + if (s != 0) { + std::set<int> supported_list {8, 16, 32, 64}; ---------------- AaronBallman wrote:
Thanks for the feedback! I did some research of my own to see what the landscape looks like and here's what I found. 1) Support for these types came in https://github.com/llvm/llvm-project/commit/55c9877b664c1bc6614ad588f376ef41fe6ab4ca and there was no discussion on the patch (https://lists.llvm.org/pipermail/cfe-commits/Week-of-Mon-20091109/023777.html) before it landed with post-commit review saying it looked good (https://lists.llvm.org/pipermail/cfe-commits/Week-of-Mon-20091116/024219.html) and no justification beyond wanting a generalized algorithm. 2) There is hardware which supports 48-bit ints (https://en.wikipedia.org/wiki/48-bit_computing), 36-bit ints (https://en.wikipedia.org/wiki/36-bit_computing), 24-bit ints (https://en.wikipedia.org/wiki/24-bit_computing), etc. A lot of it is older hardware, but as examples, x86-64 uses a 48-bit addressing scheme, OpenCL supports intrinsics for 24-bit multiplication, UniSys has a 36-bit system they were selling as of 2015... 3) There's not evidence people are using stdint.h to serve those needs. For example: https://sourcegraph.com/search?q=context:global+int48_t+-file:.*clang.*+-file:.*stdint.*+-file:.*int48_t.*&patternType=standard&sm=1&groupBy=repo shows that most uses are either a custom C++ data type, use of an underlying non-basic int type, or bit-fields. I don't think I spotted a single use of `int48_t` that came from stdint.h. https://sourcegraph.com/search?q=context:global+int24_t+-file:.*clang.*+-file:.*stdint.*+-file:.*int24_t.*&patternType=standard&sm=1&groupBy=repo shows similarly for `int24_t`. 4) The code which defines the underlying macros used by stdint.h is here: https://github.com/llvm/llvm-project/blob/29a0f3ec2b47630ce229953fe7250e741b6c10b6/clang/lib/Frontend/InitPreprocessor.cpp#L220 and it uses `TargetInfo::getTypeWidth()` to determine the width generically. We have no targets (https://github.com/llvm/llvm-project/tree/main/clang/lib/Basic/Targets) which define bit widths other than 8, 16, 32, 64, or 128. So Clang itself doesn't support those odd types yet. Based on this and your details above, I think we should ignore these odd types for format strings on the assumption they're just not used. As for removing them from `stdint.h`, I think that is reasonable to explore given that we have no targets defining those widths (and so we should never be defining those macros to begin with); the algorithm remains general, we just drop the extra cruft from stdint.h. I don't think this should cause any issues. CC @jrtc27 @jyknight for some extra opinions. https://github.com/llvm/llvm-project/pull/71771 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits