[clang] isUncountedPtr should take QualType as an argument. (PR #110213)
@@ -190,11 +190,7 @@ class UncountedLocalVarsChecker if (shouldSkipVarDecl(V)) return; -const auto *ArgType = V->getType().getTypePtr(); -if (!ArgType) haoNoQ wrote: Some of these null checks may still be necessary (with `QualType.isNull()`). https://github.com/llvm/llvm-project/pull/110213 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [alpha.webkit.NoUncheckedPtrMemberChecker] Introduce member variable checker for CheckedPtr/CheckedRef (PR #108352)
@@ -1771,6 +1771,10 @@ def UncountedLambdaCapturesChecker : Checker<"UncountedLambdaCapturesChecker">, let ParentPackage = WebKitAlpha in { +def NoUncheckedPtrMemberChecker : Checker<"NoUncheckedPtrMemberChecker">, + HelpText<"Check for no unchecked member variables.">, + Documentation; haoNoQ wrote: We can actually write a tiny bit of documentation that simply points the users to https://github.com/WebKit/WebKit/wiki/Safer-CPP-Guidelines (?) https://github.com/llvm/llvm-project/pull/108352 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [alpha.webkit.NoUncheckedPtrMemberChecker] Introduce member variable checker for CheckedPtr/CheckedRef (PR #108352)
@@ -0,0 +1,53 @@ +// RUN: %clang_analyze_cc1 -analyzer-checker=alpha.webkit.NoUncheckedPtrMemberChecker -verify %s + +#include "mock-types.h" +#include "mock-system-header.h" haoNoQ wrote: Do you need to include this one everywhere? Isn't it just a tiny test to confirm that some specific checker doesn't do something stupid in a system header? https://github.com/llvm/llvm-project/pull/108352 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [alpha.webkit.NoUncheckedPtrMemberChecker] Introduce member variable checker for CheckedPtr/CheckedRef (PR #108352)
@@ -134,10 +137,10 @@ class NoUncountedMemberChecker Os << " in "; printQuotedQualifiedName(Os, ClassCXXRD); Os << " is a " - << (isa(MemberType) ? "raw pointer" : "reference") - << " to ref-countable type "; + << (isa(MemberType) ? "raw pointer" : "reference") << " to " + << typeName() << " "; printQuotedQualifiedName(Os, MemberCXXRD); haoNoQ wrote: Have you tried a simple `Os << MemberCXXRD` here? I think it even adds quotes automatically when you do that. But I don't remember if it works for arbitrary streams or only for clang warnings streams, so this might be completely misleading. https://github.com/llvm/llvm-project/pull/108352 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [alpha.webkit.NoUncheckedPtrMemberChecker] Introduce member variable checker for CheckedPtr/CheckedRef (PR #108352)
@@ -146,13 +149,67 @@ class NoUncountedMemberChecker BR->emitReport(std::move(Report)); haoNoQ wrote: `setDeclWithIssue()` goes into a different PR right? https://github.com/llvm/llvm-project/pull/108352 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [alpha.webkit.NoUncheckedPtrMemberChecker] Introduce member variable checker for CheckedPtr/CheckedRef (PR #108352)
https://github.com/haoNoQ edited https://github.com/llvm/llvm-project/pull/108352 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [alpha.webkit.NoUncheckedPtrMemberChecker] Introduce member variable checker for CheckedPtr/CheckedRef (PR #108352)
@@ -146,13 +149,67 @@ class NoUncountedMemberChecker BR->emitReport(std::move(Report)); } }; + +class NoUncountedMemberChecker final : public RawPtrRefMemberChecker { haoNoQ wrote: Yes this is a perfectly valid way to reuse code here! https://github.com/llvm/llvm-project/pull/108352 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [alpha.webkit.NoUncheckedPtrMemberChecker] Introduce member variable checker for CheckedPtr/CheckedRef (PR #108352)
https://github.com/haoNoQ approved this pull request. LGTM!! I've got nitpicks but none of them are substantial enough to block. We've figured out the ObjC thing offline right? https://github.com/llvm/llvm-project/pull/108352 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [alpha.webkit.NoUncheckedPtrMemberChecker] Introduce member variable checker for CheckedPtr/CheckedRef (PR #108352)
@@ -53,48 +53,49 @@ hasPublicMethodInBase(const CXXBaseSpecifier *Base, const char *NameToMatch) { return hasPublicMethodInBaseClass(R, NameToMatch) ? R : nullptr; } -std::optional isRefCountable(const CXXRecordDecl* R) -{ +std::optional isSmartPtrCompatible(const CXXRecordDecl *R, + const char *IncMethodName, + const char *DecMethodName) { haoNoQ wrote: `StringRef`? Gotta start somewhere. Your static strings are implicitly convertible to that. https://github.com/llvm/llvm-project/pull/108352 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [alpha.webkit.UncountedCallArgsChecker] Use canonical type (PR #109393)
@@ -102,12 +102,13 @@ class UncountedCallArgsChecker // if ((*P)->hasAttr()) // continue; -const auto *ArgType = (*P)->getType().getTypePtrOrNull(); -if (!ArgType) +QualType ArgType = (*P)->getType().getCanonicalType(); +const auto *TypePtr = ArgType.getTypePtrOrNull(); haoNoQ wrote: `getTypePtrOrNull()` is unnecessary most of the time because you can do all the same operations on `QualType` directly, thanks to the overloaded `QualType::operator->()`. I think `isUncountedPtr()` should simply accept a `QualType` directly. https://github.com/llvm/llvm-project/pull/109393 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [alpha.webkit.UncountedCallArgsChecker] Use canonical type (PR #109393)
https://github.com/haoNoQ approved this pull request. Ah classic! LGTM! https://github.com/llvm/llvm-project/pull/109393 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [alpha.webkit.UncountedCallArgsChecker] Use canonical type (PR #109393)
https://github.com/haoNoQ edited https://github.com/llvm/llvm-project/pull/109393 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] WebKit Checkers should set DeclWithIssue. (PR #109389)
https://github.com/haoNoQ approved this pull request. LGTM thank you so much! https://github.com/llvm/llvm-project/pull/109389 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] WebKit Checkers should set DeclWithIssue. (PR #109389)
@@ -134,18 +135,25 @@ class UncountedLocalVarsChecker bool shouldVisitTemplateInstantiations() const { return true; } bool shouldVisitImplicitCode() const { return false; } + bool TraverseDecl(Decl *D) { +llvm::SaveAndRestore SavedDecl(DeclWithIssue); +if (D && isa(D)) haoNoQ wrote: You can avoid this if-statement by overriding the `TraverseFunctionDecl` method instead. Though, on the other hand, you might want to include `ObjCMethodDecl` too, in case you run into any ObjC code. (It doesn't inherit from `FunctionDecl`.) Which is slightly less convenient to do if you split it into `Traverse...` methods. So ultimately up to you^^ https://github.com/llvm/llvm-project/pull/109389 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] WebKit Checkers should set DeclWithIssue. (PR #109389)
@@ -56,12 +62,16 @@ class UncountedCallArgsChecker bool TraverseClassTemplateDecl(ClassTemplateDecl *Decl) { if (isRefType(safeGetName(Decl))) return true; -return RecursiveASTVisitor::TraverseClassTemplateDecl( -Decl); +return Base::TraverseClassTemplateDecl(Decl); + } + + bool TraverseDecl(Decl *D) { haoNoQ wrote: With this blanket "let's catch every decl" code it may catch a few very weird decls here and there (eg `VarDecl`, `BlockDecl`, etc). I don't know what the exact consequences of that look. They're probably benign. But it's probably easier to stick to `FunctionDecl` and `ObjCMethodDecl`. https://github.com/llvm/llvm-project/pull/109389 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] WebKit Checkers should set DeclWithIssue. (PR #109389)
https://github.com/haoNoQ edited https://github.com/llvm/llvm-project/pull/109389 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [-Wunsafe-buffer-usage] Fix a bug and suppress libc warnings for C files (PR #109496)
@@ -784,12 +786,12 @@ AST_MATCHER_P(CallExpr, hasUnsafePrintfStringArg, return false; // possibly some user-defined printf function ASTContext &Ctx = Finder->getASTContext(); - QualType FristParmTy = FD->getParamDecl(0)->getType(); + QualType FirstParmTy = FD->getParamDecl(0)->getType(); - if (!FristParmTy->isPointerType()) + if (!FirstParmTy->isPointerType()) return false; // possibly some user-defined printf function - QualType FirstPteTy = (cast(FristParmTy))->getPointeeType(); + QualType FirstPteTy = FirstParmTy->getAs()->getPointeeType(); haoNoQ wrote: `castAs`? https://github.com/llvm/llvm-project/pull/109496 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [-Wunsafe-buffer-usage] Fix a bug and suppress libc warnings for C files (PR #109496)
https://github.com/haoNoQ approved this pull request. LGTM! https://github.com/llvm/llvm-project/pull/109496 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [-Wunsafe-buffer-usage] Fix a bug and suppress libc warnings for C files (PR #109496)
https://github.com/haoNoQ edited https://github.com/llvm/llvm-project/pull/109496 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [webkit.RefCntblBaseVirtualDtor] ThreadSafeRefCounted still generates warnings (PR #108656)
@@ -119,6 +119,11 @@ template ensureOnMainThread([this] { delete static_cast(this); }); +} else if constexpr (destructionThread == DestructionThread::MainRunLoop) { +auto deleteThis = [this] { haoNoQ wrote: (That situation is also fun because `Function<>` can probably be overwritten, so `getInit()` may not necessarily be the correct value of the variable.) https://github.com/llvm/llvm-project/pull/108656 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [webkit.RefCntblBaseVirtualDtor] ThreadSafeRefCounted still generates warnings (PR #108656)
@@ -84,13 +84,22 @@ class DerefFuncDeleteExprVisitor E = E->IgnoreParenCasts(); if (auto *TempE = dyn_cast(E)) E = TempE->getSubExpr(); +E = E->IgnoreParenCasts(); +if (auto *Ref = dyn_cast(E)) { + if (auto *Decl = Ref->getDecl()) { +if (auto *VD = dyn_cast(Decl)) haoNoQ wrote: You can combine the two ifs with `dyn_cast_or_null`. Though, I also don't think a `DeclRefExpr`'s decl can be null in the first place. (If only we could, you know, annotate that somehow...) https://github.com/llvm/llvm-project/pull/108656 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [webkit.RefCntblBaseVirtualDtor] ThreadSafeRefCounted still generates warnings (PR #108656)
https://github.com/haoNoQ edited https://github.com/llvm/llvm-project/pull/108656 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [webkit.RefCntblBaseVirtualDtor] ThreadSafeRefCounted still generates warnings (PR #108656)
@@ -119,6 +119,11 @@ template ensureOnMainThread([this] { delete static_cast(this); }); +} else if constexpr (destructionThread == DestructionThread::MainRunLoop) { +auto deleteThis = [this] { haoNoQ wrote: Can this variable be declared as `Function<>` instead? (It's probably unlikely. But the AST may be different if they do that.) https://github.com/llvm/llvm-project/pull/108656 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [webkit.RefCntblBaseVirtualDtor] ThreadSafeRefCounted still generates warnings (PR #108656)
@@ -84,13 +84,22 @@ class DerefFuncDeleteExprVisitor E = E->IgnoreParenCasts(); if (auto *TempE = dyn_cast(E)) E = TempE->getSubExpr(); +E = E->IgnoreParenCasts(); +if (auto *Ref = dyn_cast(E)) { + if (auto *Decl = Ref->getDecl()) { +if (auto *VD = dyn_cast(Decl)) + return VisitLabmdaArgument(VD->getInit()); haoNoQ wrote: Looks like there's a typo here (and everywhere else). ```suggestion return VisitLambdaArgument(VD->getInit()); ``` https://github.com/llvm/llvm-project/pull/108656 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [webkit.RefCntblBaseVirtualDtor] ThreadSafeRefCounted still generates warnings (PR #108656)
https://github.com/haoNoQ approved this pull request. LGTM! https://github.com/llvm/llvm-project/pull/108656 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [alpha.webkit.UncountedCallArgsChecker] Add support for Objective-C++ property access (PR #108669)
haoNoQ wrote: I'm also somewhat terrified of returning C++ objects from ObjC methods by value, ever since I learned that when you call an ObjC method on a nil it returns a _zero-initialized_ object without calling a constructor on it. https://github.com/llvm/llvm-project/pull/108669 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [alpha.webkit.UncountedCallArgsChecker] Add support for Objective-C++ property access (PR #108669)
https://github.com/haoNoQ approved this pull request. Yeah sounds about right! The property syntax is confusing, I don't have enough expertise to confirm that `getResultExpr()` is the right solution so I think we should keep trying and incrementally figuring out what's going on. Setters probably look very different but also you probably don't need to support them anyway. https://github.com/llvm/llvm-project/pull/108669 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [WebKit Checkers] Allow "singleton" suffix to be camelCased. (PR #108257)
https://github.com/haoNoQ approved this pull request. LGTM! https://github.com/llvm/llvm-project/pull/108257 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [alpha.webkit.UncountedCallArgsChecker] Allow protector functions in Objective-C++ (PR #108184)
@@ -143,6 +143,16 @@ bool isReturnValueRefCounted(const clang::FunctionDecl *F) { return false; } +std::optional isUncounted(const QualType T) { + if (auto *Subst = dyn_cast(T)) { +if (auto *Decl = Subst->getAssociatedDecl()) { + if (isRefType(safeGetName(Decl))) +return false; +} + } + return isUncounted(T->getAsCXXRecordDecl()); +} + std::optional isUncounted(const CXXRecordDecl* Class) haoNoQ wrote: Should we force every checker to use the new function now that we know about this cornercase? https://github.com/llvm/llvm-project/pull/108184 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [alpha.webkit.UncountedCallArgsChecker] Allow protector functions in Objective-C++ (PR #108184)
https://github.com/haoNoQ approved this pull request. Aha ok LGTM! https://github.com/llvm/llvm-project/pull/108184 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [alpha.webkit.UncountedCallArgsChecker] Allow protector functions in Objective-C++ (PR #108184)
https://github.com/haoNoQ edited https://github.com/llvm/llvm-project/pull/108184 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [alpha.webkit.UncountedCallArgsChecker] Allow protector functions in Objective-C++ (PR #108184)
https://github.com/haoNoQ edited https://github.com/llvm/llvm-project/pull/108184 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [alpha.webkit.UncountedCallArgsChecker] Allow protector functions in Objective-C++ (PR #108184)
https://github.com/haoNoQ edited https://github.com/llvm/llvm-project/pull/108184 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [alpha.webkit.UncountedCallArgsChecker] Allow protector functions in Objective-C++ (PR #108184)
@@ -143,6 +143,16 @@ bool isReturnValueRefCounted(const clang::FunctionDecl *F) { return false; } +std::optional isUncounted(const clang::QualType T) { haoNoQ wrote: `clang::` is redundant because you're in `using namespace clang`. https://github.com/llvm/llvm-project/pull/108184 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [WebKit Static Analyzer] Treat WTFReportBacktrace as a trivial function. (PR #108167)
https://github.com/haoNoQ approved this pull request. https://github.com/llvm/llvm-project/pull/108167 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [webkit.RefCntblBaseVirtualDtor] Make ThreadSafeRefCounted not generate warnings (PR #107676)
https://github.com/haoNoQ approved this pull request. Aha great LGTM! https://github.com/llvm/llvm-project/pull/107676 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [webkit.RefCntblBaseVirtualDtor] Make ThreadSafeRefCounted not generate warnings (PR #107676)
@@ -67,6 +68,48 @@ class DerefFuncDeleteExprVisitor const Decl *D = CE->getCalleeDecl(); if (D && D->hasBody()) return VisitBody(D->getBody()); +else { + auto name = safeGetName(D); + if (name == "ensureOnMainThread" || name == "ensureOnMainRunLoop") { +for (unsigned i = 0; i < CE->getNumArgs(); ++i) { + auto *Arg = CE->getArg(i); + if (FindLabmda(Arg)) +return true; +} + } +} +return false; + } + + bool FindLabmda(const Expr *E) { +while (E) { + if (auto *TempE = dyn_cast(E)) { haoNoQ wrote: Aha great! `Expr->IgnoreParenCasts()` skips all of these automatically, recursively too, except `CXXBindTemporaryExpr`. But IIRC `CXXBindTemporaryExpr` doesn't need to be skipped recursively anyway, it's a one-time thing that always appears immediately before the constructor. https://github.com/llvm/llvm-project/pull/107676 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [webkit.RefCntblBaseVirtualDtor] Make ThreadSafeRefCounted not generate warnings (PR #107676)
https://github.com/haoNoQ edited https://github.com/llvm/llvm-project/pull/107676 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [webkit.RefCntblBaseVirtualDtor] Make ThreadSafeRefCounted not generate warnings (PR #107676)
@@ -67,6 +68,15 @@ class DerefFuncDeleteExprVisitor const Decl *D = CE->getCalleeDecl(); if (D && D->hasBody()) return VisitBody(D->getBody()); +else if (!VisitLambdaBody) { + for (unsigned i = 0; i < CE->getNumArgs(); ++i) { +auto *Arg = CE->getArg(i); +VisitLambdaBody = true; +auto Restore = llvm::make_scope_exit([&] { VisitLambdaBody = false; }); +if (VisitChildren(Arg)) haoNoQ wrote: You can probably avoid statefulness here if you directly pattern-match the argument as a lambda (with the help of a few manual `dyn_cast`s) and then invoke `Visit()` directly on that lambda's body. It'd also make `VisitLambdaExpr()` unnecessary. It's usually not a good idea to search for a nested object at arbitrary nesting depths. When you do that, you lose your ability to confirm the exact relationship between the current object and the nested object (in our case, "the lambda is passed to the opaque call as an argument" as opposed to "the lambda is lexically nested into an argument in any shape or form"). The only time you actually want to ignore that relationship is in the very top-level visitor that simply looks for all matching code patterns in the entire translation unit no matter where they're written in the translation unit. It typically doesn't make a lot of sense in the middle of the pattern. https://github.com/llvm/llvm-project/pull/107676 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [webkit.RefCntblBaseVirtualDtor] Make ThreadSafeRefCounted not generate warnings (PR #107676)
https://github.com/haoNoQ commented: Aha makes sense! Looks like you're putting no restrictions on what the opaque function is. This may cause some false negatives but it's probably ultimately ok, but it might be a good idea to confirm. https://github.com/llvm/llvm-project/pull/107676 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [-Wunsafe-buffer-usage] Warning Libc functions (PR #101583)
haoNoQ wrote: `-Wunsafe-buffer-usage` is theoretically possible to use in C but it involves a lot of `#pragma clang unsafe_buffer_usage` to annotate and encapsulate every unsafe buffer operation. So it's impractical but we aren't disabling it because that'd be an unnecessary restriction and it won't be possible for the user to enable it back if we do that. And if we turn it into a separate warning flag, it won't really eliminate the problem for users who enable `-Weverything`. Which is kind of the point of `-Weverything`. Configurations that pass `-Weverything` can never truly be supported. If you use `-Weverything` then every new warning implemented in clang will potentially annoy you. `-Weverything` is intended for discovering new warnings to enable explicitly. It's not designed for keeping it turned on on a permanent basis. https://github.com/llvm/llvm-project/pull/101583 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [-Wunsafe-buffer-usage] Warning Libc functions (PR #101583)
@@ -443,6 +443,426 @@ AST_MATCHER(ArraySubscriptExpr, isSafeArraySubscript) { return false; } +AST_MATCHER_P(CallExpr, hasNumArgs, unsigned, Num) { + return Node.getNumArgs() == Num; +} + +namespace libc_func_matchers { +// Under `libc_func_matchers`, define a set of matchers that match unsafe +// functions in libc and unsafe calls to them. + +// A tiny parser to strip off common prefix and suffix of libc function names +// in real code. +// +// Given a function name, `matchName` returns `CoreName` according to the +// following grammar: +// +// LibcName := CoreName | CoreName + "_s" +// MatchingName := "__builtin_" + LibcName | +// "__builtin___" + LibcName + "_chk" | +// "__asan_" + LibcName +// +struct LibcFunNamePrefixSuffixParser { haoNoQ wrote: This struct has no members. Should we turn it into a namespace? https://github.com/llvm/llvm-project/pull/101583 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [-Wunsafe-buffer-usage] Warning Libc functions (PR #101583)
https://github.com/haoNoQ edited https://github.com/llvm/llvm-project/pull/101583 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [-Wunsafe-buffer-usage] Warning Libc functions (PR #101583)
https://github.com/haoNoQ approved this pull request. I think this is good to go, LGTM!! https://github.com/llvm/llvm-project/pull/101583 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] Revert "[LinkerWrapper] Extend with usual pass options (#96704)" (PR #102226)
haoNoQ wrote: #106439 https://github.com/llvm/llvm-project/pull/102226 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] Revert "[LinkerWrapper] Extend with usual pass options (#96704)" (PR #102226)
haoNoQ wrote: Oh that's just a whitespace change. Would you like me to make a manual PR or were you looking into it anyway? https://github.com/llvm/llvm-project/pull/102226 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] Revert "[LinkerWrapper] Extend with usual pass options (#96704)" (PR #102226)
haoNoQ wrote: /cherry-pick 030ee841a9c9fbbd6e7c001e751737381da01f7b https://github.com/llvm/llvm-project/pull/102226 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] Revert "[LinkerWrapper] Extend with usual pass options (#96704)" (PR #102226)
haoNoQ wrote: /cherry-pick 3ddd7a6df3ef85cbfe3f5fc0294817638275d4df https://github.com/llvm/llvm-project/pull/102226 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] Revert "[LinkerWrapper] Extend with usual pass options (#96704)" (PR #102226)
https://github.com/haoNoQ milestoned https://github.com/llvm/llvm-project/pull/102226 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] Revert "[LinkerWrapper] Extend with usual pass options (#96704)" (PR #102226)
haoNoQ wrote: Yeah I think we really gotta cherry-pick this revert, because failing tests on release branches are somewhat scary (cf. https://green.lab.llvm.org/job/llvm.org/job/clang-stage1-RA-release-branch/69/console). I don't know how it's usually done but let me try to push some buttons, it's kind of semi-automatic I think. https://github.com/llvm/llvm-project/pull/102226 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [-Wunsafe-buffer-usage] Warning Libc functions (PR #101583)
@@ -443,6 +448,368 @@ AST_MATCHER(ArraySubscriptExpr, isSafeArraySubscript) { return false; } +namespace libc_fun_disjoint_inner_matchers { +// `libc_fun_disjoint_inner_matchers` covers a set of matchers that match +// disjoint node sets. They all take a `CoreName`, which is the substring of a +// function name after `ignoreLibcPrefixAndSuffix`. They are suppose to be used +// as an inner matcher of `ignoreLibcPrefixAndSuffix` to deal with different +// libc function calls. + +// Matches a function call node such that +// 1. It's name, after stripping off predefined prefix and suffix, is +// `CoreName`; and +// 2. `CoreName` or `CoreName[str/wcs]` is one of the `PredefinedNames`, which +// is a set of libc function names. +// +// Note: For predefined prefix and suffix, see `ignoreLibcPrefixAndSuffix`. +// And, the notation `CoreName[str/wcs]` means a new name obtained from replace +// string "wcs" with "str" in `CoreName`. +// +// Also note, the set of predefined function names does not include `printf` +// functions, they are checked exclusively with other matchers below. +// Maintaining the invariant that all matchers under +// `libc_fun_disjoint_inner_matchers` are disjoint. +AST_MATCHER_P(CallExpr, predefinedUnsafeLibcFunCall, StringRef, CoreName) { + static const std::set PredefinedNames{ + // numeric conversion: + "atof", + "atoi", + "atol", + "atoll", + "strtol", + "strtoll", + "strtoul", + "strtoull", + "strtof", + "strtod", + "strtold", + "strtoimax", + "strtoumax", + // "strfromf", "strfromd", "strfroml", // C23? + // string manipulation: + "strcpy", + "strncpy", + "strlcpy", + "strcat", + "strncat", + "strlcat", + "strxfrm", + "strdup", + "strndup", + // string examination: + "strlen", + "strnlen", + "strcmp", + "strncmp", + "stricmp", + "strcasecmp", + "strcoll", + "strchr", + "strrchr", + "strspn", + "strcspn", + "strpbrk", + "strstr", + "strtok", + // "mem-" functions + "memchr", + "wmemchr", + "memcmp", + "wmemcmp", + "memcpy", + "memccpy", + "mempcpy", + "wmemcpy", + "memmove", + "wmemmove", + "memset", + "wmemset", + // IO: + "fread", + "fwrite", + "fgets", + "fgetws", + "gets", + "fputs", + "fputws", + "puts", + // others + "strerror_s", + "strerror_r", + "bcopy", + "bzero", + "bsearch", + "qsort", + }; + // This is safe: strlen("hello"). We don't want to be noisy on this case. + auto isSafeStrlen = [&Node](StringRef Name) -> bool { +return Name == "strlen" && Node.getNumArgs() == 1 && + isa(Node.getArg(0)->IgnoreParenImpCasts()); + }; + + // Match predefined names: + if (PredefinedNames.find(CoreName.str()) != PredefinedNames.end()) +return !isSafeStrlen(CoreName); + + std::string NameWCS = CoreName.str(); + size_t WcsPos = NameWCS.find("wcs"); + + while (WcsPos != std::string::npos) { +NameWCS[WcsPos++] = 's'; +NameWCS[WcsPos++] = 't'; +NameWCS[WcsPos++] = 'r'; +WcsPos = NameWCS.find("wcs", WcsPos); + } + if (PredefinedNames.find(NameWCS) != PredefinedNames.end()) +return !isSafeStrlen(NameWCS); + // All `scanf` functions are unsafe (including `sscanf`, `vsscanf`, etc.. They + // all should end with "scanf"): + return CoreName.ends_with("scanf"); +} + +// Match a call to one of the `-printf` functions taking `va_list`. We cannot +// check safety for these functions so they should be changed to their +// non-va_list versions. +AST_MATCHER_P(CallExpr, unsafeVaListPrintfs, StringRef, CoreName) { + StringRef Name = CoreName; + + if (!Name.ends_with("printf")) +return false; // neither printf nor scanf + return Name.starts_with("v"); +} + +// Matches a call to one of the `-sprintf` functions (excluding the ones with +// va_list) as they are always unsafe and should be changed to corresponding +// `-snprintf`s. +AST_MATCHER_P(CallExpr, unsafeSprintfs, StringRef, CoreName) { + StringRef Name = CoreName; + + if (!Name.ends_with("printf") || Name.starts_with("v")) +return false; + + StringRef Prefix = Name.drop_back(6); + + if (Prefix.ends_with("w")) +Prefix = Prefix.drop_back(1); + return Prefix == "s"; +} + +// A pointer type expression is known to be null-terminated, if it has the +// form: E.c_str(), for any expression E of `std::string` type. +static bool isNullTermPointer(const Expr *Ptr) { + if (isa(Ptr->IgnoreParenImpCasts())) +return true; + if (isa(Ptr->IgnoreParenImpCasts())) +return true; + if (auto *MCE = dyn_cast(Ptr->IgnoreParenImpCasts())) { +const CXXMethodDecl *MD = MCE->getMethodDecl(); +const CXXRecordDecl *RD = MCE->getRecordDecl()->getCanonicalDecl(); + +if (MD && RD && RD->isInStdNamespace(
[clang] [-Wunsafe-buffer-usage] Warning Libc functions (PR #101583)
@@ -0,0 +1,101 @@ +// RUN: %clang_cc1 -std=c++20 -Wno-all -Wunsafe-buffer-usage \ +// RUN:-verify %s + +typedef struct {} FILE; +void memcpy(); +void __asan_memcpy(); +void strcpy(); +void strcpy_s(); +void wcscpy_s(); +unsigned strlen( const char* str ); +int fprintf( FILE* stream, const char* format, ... ); +int printf( const char* format, ... ); +int sprintf( char* buffer, const char* format, ... ); +int swprintf( char* buffer, const char* format, ... ); +int snprintf( char* buffer, unsigned buf_size, const char* format, ... ); +int snwprintf( char* buffer, unsigned buf_size, const char* format, ... ); +int snwprintf_s( char* buffer, unsigned buf_size, const char* format, ... ); +int vsnprintf( char* buffer, unsigned buf_size, const char* format, ... ); +int sscanf_s(const char * buffer, const char * format, ...); +int sscanf(const char * buffer, const char * format, ... ); + +namespace std { + template< class InputIt, class OutputIt > + OutputIt copy( InputIt first, InputIt last, +OutputIt d_first ); + + struct iterator{}; + template + struct span { +T * ptr; +T * data(); +unsigned size_bytes(); +unsigned size(); +iterator begin() const noexcept; +iterator end() const noexcept; + }; + + template + struct basic_string { +T* p; +T *c_str(); +T *data(); +unsigned size_bytes(); + }; + + typedef basic_string string; + typedef basic_string wstring; + + // C function under std: + void memcpy(); + void strcpy(); +} + +void f(char * p, char * q, std::span s, std::span s2) { + memcpy(); // expected-warning{{function 'memcpy' introduces unsafe buffer access}} + std::memcpy(); // expected-warning{{function 'memcpy' introduces unsafe buffer access}} + __builtin_memcpy(p, q, 64); // expected-warning{{function '__builtin_memcpy' introduces unsafe buffer access}} + __builtin___memcpy_chk(p, q, 8, 64); // expected-warning{{function '__builtin___memcpy_chk' introduces unsafe buffer access}} + __asan_memcpy(); // expected-warning{{function '__asan_memcpy' introduces unsafe buffer access}} + strcpy(); // expected-warning{{function 'strcpy' introduces unsafe buffer access}} + std::strcpy(); // expected-warning{{function 'strcpy' introduces unsafe buffer access}} + strcpy_s(); // expected-warning{{function 'strcpy_s' introduces unsafe buffer access}} + wcscpy_s(); // expected-warning{{function 'wcscpy_s' introduces unsafe buffer access}} + + + /* Test printfs */ + fprintf((FILE*)p, "%s%d", p, *p); // expected-warning{{function 'fprintf' introduces unsafe buffer access}} expected-note{{use 'std::string::c_str' or string literal as string pointer to guarantee null-termination}} + printf("%s%d", p, *p); // expected-warning{{function 'printf' introduces unsafe buffer access}} expected-note{{use 'std::string::c_str' or string literal as string pointer to guarantee null-termination}} + sprintf(q, "%s%d", "hello", *p); // expected-warning{{function 'sprintf' introduces unsafe buffer access}} expected-note{{change to 'snprintf' for explicit bounds checking}} + swprintf(q, "%s%d", "hello", *p); // expected-warning{{function 'swprintf' introduces unsafe buffer access}} expected-note{{change to 'snprintf' for explicit bounds checking}} + snprintf(q, 10, "%s%d", "hello", *p); // expected-warning{{function 'snprintf' introduces unsafe buffer access}} expected-note{{buffer pointer and size may not match}} haoNoQ wrote: Do we (and/or should we) support ``` char arr[10]; snprintf(arr, sizeof(arr), ...); ``` ? https://github.com/llvm/llvm-project/pull/101583 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [-Wunsafe-buffer-usage] Warning Libc functions (PR #101583)
@@ -443,6 +449,396 @@ AST_MATCHER(ArraySubscriptExpr, isSafeArraySubscript) { return false; } +AST_MATCHER_P(CallExpr, hasNumArgs, unsigned, Num) { + return Node.getNumArgs() == Num; +} + +namespace libc_func_matchers { +// Under `libc_func_matchers`, define a set of matchers that match unsafe +// functions in libc and unsafe calls to them. + +// A tiny parser to strip off common prefix and suffix of libc function names +// in real code. +// +// Given a function name, `matchName` returns `CoreName` according to the +// following grammar: +// +// LibcName := CoreName | CoreName + "_s" +// MatchingName := "__builtin_" + LibcName | +// "__builtin___" + LibcName + "_chk" | +// "__asan_" + LibcName +// +struct LibcFunNamePrefixSuffixParser { + StringRef matchName(StringRef FunName, bool isBuiltin) { +// Try to match __builtin_: +if (isBuiltin && FunName.starts_with("__builtin_")) + // Then either it is __builtin_LibcName or __builtin___LibcName_chk or + // no match: + return matchLibcNameOrBuiltinChk( + FunName.drop_front(10 /* truncate "__builtin_" */)); +// Try to match __asan_: +if (FunName.starts_with("__asan_")) + return matchLibcName(FunName.drop_front(7 /* truncate of "__asan_" */)); +return matchLibcName(FunName); + } + + // Parameter `Name` is the substring after stripping off the prefix + // "__builtin_". + StringRef matchLibcNameOrBuiltinChk(StringRef Name) { +if (Name.starts_with("__") && Name.ends_with("_chk")) + return matchLibcName( + Name.drop_front(2).drop_back(4) /* truncate "__" and "_chk" */); +return matchLibcName(Name); + } + + StringRef matchLibcName(StringRef Name) { +if (Name.ends_with("_s")) + return Name.drop_back(2 /* truncate "_s" */); +return Name; + } +}; + +// A pointer type expression is known to be null-terminated, if it has the +// form: E.c_str(), for any expression E of `std::string` type. +static bool isNullTermPointer(const Expr *Ptr) { + if (isa(Ptr->IgnoreParenImpCasts())) +return true; + if (isa(Ptr->IgnoreParenImpCasts())) +return true; + if (auto *MCE = dyn_cast(Ptr->IgnoreParenImpCasts())) { +const CXXMethodDecl *MD = MCE->getMethodDecl(); +const CXXRecordDecl *RD = MCE->getRecordDecl()->getCanonicalDecl(); + +if (MD && RD && RD->isInStdNamespace()) + if (MD->getName() == "c_str" && RD->getName() == "basic_string") +return true; + } + return false; +} + +// Return true iff at least one of following cases holds: +// 1. Format string is a literal and there is an unsafe pointer argument +// corresponding to an `s` specifier; +// 2. Format string is not a literal and there is least an unsafe pointer +// argument (including the formatter argument). +static bool hasUnsafeFormatOrSArg(const CallExpr *Call, unsigned FmtArgIdx, + ASTContext &Ctx, bool isKprintf = false) { + class StringFormatStringHandler + : public analyze_format_string::FormatStringHandler { +const CallExpr *Call; +unsigned FmtArgIdx; + + public: +StringFormatStringHandler(const CallExpr *Call, unsigned FmtArgIdx) +: Call(Call), FmtArgIdx(FmtArgIdx) {} + +bool HandlePrintfSpecifier(const analyze_printf::PrintfSpecifier &FS, + const char *startSpecifier, + unsigned specifierLen, + const TargetInfo &Target) override { + if (FS.getConversionSpecifier().getKind() == + analyze_printf::PrintfConversionSpecifier::sArg) { +unsigned ArgIdx = FS.getPositionalArgIndex() + FmtArgIdx; + +if (0 < ArgIdx && ArgIdx < Call->getNumArgs()) + if (!isNullTermPointer(Call->getArg(ArgIdx))) +return false; // stop parsing + } + return true; // continue parsing +} + }; + + const Expr *Fmt = Call->getArg(FmtArgIdx); + + if (auto *SL = dyn_cast(Fmt->IgnoreParenImpCasts())) { +StringRef FmtStr = SL->getString(); +StringFormatStringHandler Handler(Call, FmtArgIdx); + +return analyze_format_string::ParsePrintfString( +Handler, FmtStr.begin(), FmtStr.end(), Ctx.getLangOpts(), +Ctx.getTargetInfo(), isKprintf); + } + // If format is not a string literal, we cannot analyze the format string. + // In this case, this call is considered unsafe if at least one argument + // (including the format argument) is unsafe pointer. + return llvm::any_of( + llvm::make_range(Call->arg_begin() + FmtArgIdx, Call->arg_end()), + [](const Expr *Arg) { +return Arg->getType()->isPointerType() && !isNullTermPointer(Arg); + }); +} + +// Matches a FunctionDecl node such that +// 1. It's name, after stripping off predefined prefix and suffix, is +// `CoreName`; and +// 2. `CoreName` or `CoreName[str/wcs]` is one of the `PredefinedNames`, which +// is a set of lib
[clang] [-Wunsafe-buffer-usage] Warning Libc functions (PR #101583)
https://github.com/haoNoQ commented: Mostly LGTM! I don't have major concerns. https://github.com/llvm/llvm-project/pull/101583 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [-Wunsafe-buffer-usage] Warning Libc functions (PR #101583)
@@ -12383,6 +12383,13 @@ def warn_unsafe_buffer_operation : Warning< "%select{unsafe pointer operation|unsafe pointer arithmetic|" "unsafe buffer access|function introduces unsafe buffer manipulation|unsafe invocation of span::data}0">, InGroup, DefaultIgnore; +def warn_unsafe_buffer_libc_call : Warning< + "function %0 introduces unsafe buffer access">, + InGroup, DefaultIgnore; +def note_unsafe_buffer_printf_call : Note< + "%select{| change to 'snprintf' for explicit bounds checking | buffer pointer and size may not match" + "| use 'std::string::c_str' or string literal as string pointer to guarantee null-termination" haoNoQ wrote: Also if `-Wunsafe-buffer-usage` is enabled in C (for any obscure reason), would we still recommend `c_str` here? Maybe this recommendation should be guarded by the "emit suggestions" flag? https://github.com/llvm/llvm-project/pull/101583 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [-Wunsafe-buffer-usage] Warning Libc functions (PR #101583)
@@ -12383,6 +12383,13 @@ def warn_unsafe_buffer_operation : Warning< "%select{unsafe pointer operation|unsafe pointer arithmetic|" "unsafe buffer access|function introduces unsafe buffer manipulation|unsafe invocation of span::data}0">, InGroup, DefaultIgnore; +def warn_unsafe_buffer_libc_call : Warning< + "function %0 introduces unsafe buffer access">, + InGroup, DefaultIgnore; +def note_unsafe_buffer_printf_call : Note< + "%select{| change to 'snprintf' for explicit bounds checking | buffer pointer and size may not match" + "| use 'std::string::c_str' or string literal as string pointer to guarantee null-termination" haoNoQ wrote: I think for this warning in particular, it's valuable to point the user to the specific `%s` argument. To, at least, make sure that they know we don't mean the snprintf's target string or something. https://github.com/llvm/llvm-project/pull/101583 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [-Wunsafe-buffer-usage] Warning Libc functions (PR #101583)
@@ -1025,6 +1421,92 @@ class DataInvocationGadget : public WarningGadget { DeclUseList getClaimedVarUseSites() const override { return {}; } }; +class UnsafeLibcFunctionCallGadget : public WarningGadget { + const CallExpr *const Call; + constexpr static const char *const Tag = "UnsafeLibcFunctionCall"; + // Extra tags for additional information: + constexpr static const char *const UnsafeSprintfTag = + "UnsafeLibcFunctionCall_sprintf"; + constexpr static const char *const UnsafeSizedByTag = + "UnsafeLibcFunctionCall_sized_by"; + constexpr static const char *const UnsafeStringTag = + "UnsafeLibcFunctionCall_string"; + constexpr static const char *const UnsafeVaListTag = + "UnsafeLibcFunctionCall_va_list"; + + enum UnsafeKind { +OTHERS = 0, // no specific information, the callee function is unsafe +SPRINTF = 1, // never call `-sprintf`s, call `-snprintf`s instead. +SIZED_BY = 2, // a pair of function arguments have "__sized_by" relation but + // they do not conform to safe patterns +STRING = 3, // an argument is a pointer-to-char-as-string but does not + // guarantee null-termination +VA_LIST = 4, // one of the `-printf`s function that take va_list, which is + // considered unsafe as it is not compile-time check + } WarnedFunKind = OTHERS; + +public: + UnsafeLibcFunctionCallGadget(const MatchFinder::MatchResult &Result) + : WarningGadget(Kind::UnsafeLibcFunctionCall), +Call(Result.Nodes.getNodeAs(Tag)) { +if (Result.Nodes.getNodeAs(UnsafeSprintfTag)) + WarnedFunKind = SPRINTF; +else if (Result.Nodes.getNodeAs(UnsafeStringTag)) + WarnedFunKind = STRING; +else if (Result.Nodes.getNodeAs(UnsafeSizedByTag)) + WarnedFunKind = SIZED_BY; +else if (Result.Nodes.getNodeAs(UnsafeVaListTag)) + WarnedFunKind = VA_LIST; + } + + static Matcher matcher() { +return stmt( +stmt( +anyOf( +// Match a call to a predefined unsafe libc function (unless the +// call has a sole string literal argument): +callExpr(callee(functionDecl( + libc_func_matchers::isPredefinedUnsafeLibcFunc())), + unless(allOf(hasArgument(0, expr(stringLiteral())), + hasNumArgs(1, +// Match a call to one of the `v*printf` functions taking +// va-list, which cannot be checked at compile-time: +callExpr(callee(functionDecl( + libc_func_matchers::isUnsafeVaListPrintfFunc( +.bind(UnsafeVaListTag), haoNoQ wrote: It doesn't look like you care which node carries the tag, only that the tag exists in the match result, right? So if you attach the `.bind()` to `functionDecl()` instead, would it help you simplify the code like this?: ``` callExpr(callee(anyOf( functionDecl(...), functionDecl(...).bind(UnsafeVaListTag), ... ))) ``` https://github.com/llvm/llvm-project/pull/101583 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [-Wunsafe-buffer-usage] Warning Libc functions (PR #101583)
@@ -443,6 +449,396 @@ AST_MATCHER(ArraySubscriptExpr, isSafeArraySubscript) { return false; } +AST_MATCHER_P(CallExpr, hasNumArgs, unsigned, Num) { + return Node.getNumArgs() == Num; +} + +namespace libc_func_matchers { +// Under `libc_func_matchers`, define a set of matchers that match unsafe +// functions in libc and unsafe calls to them. + +// A tiny parser to strip off common prefix and suffix of libc function names +// in real code. +// +// Given a function name, `matchName` returns `CoreName` according to the +// following grammar: +// +// LibcName := CoreName | CoreName + "_s" +// MatchingName := "__builtin_" + LibcName | +// "__builtin___" + LibcName + "_chk" | +// "__asan_" + LibcName +// +struct LibcFunNamePrefixSuffixParser { + StringRef matchName(StringRef FunName, bool isBuiltin) { +// Try to match __builtin_: +if (isBuiltin && FunName.starts_with("__builtin_")) + // Then either it is __builtin_LibcName or __builtin___LibcName_chk or + // no match: + return matchLibcNameOrBuiltinChk( + FunName.drop_front(10 /* truncate "__builtin_" */)); +// Try to match __asan_: +if (FunName.starts_with("__asan_")) + return matchLibcName(FunName.drop_front(7 /* truncate of "__asan_" */)); +return matchLibcName(FunName); + } + + // Parameter `Name` is the substring after stripping off the prefix + // "__builtin_". + StringRef matchLibcNameOrBuiltinChk(StringRef Name) { +if (Name.starts_with("__") && Name.ends_with("_chk")) + return matchLibcName( + Name.drop_front(2).drop_back(4) /* truncate "__" and "_chk" */); +return matchLibcName(Name); + } + + StringRef matchLibcName(StringRef Name) { +if (Name.ends_with("_s")) + return Name.drop_back(2 /* truncate "_s" */); +return Name; + } +}; + +// A pointer type expression is known to be null-terminated, if it has the +// form: E.c_str(), for any expression E of `std::string` type. +static bool isNullTermPointer(const Expr *Ptr) { + if (isa(Ptr->IgnoreParenImpCasts())) +return true; + if (isa(Ptr->IgnoreParenImpCasts())) +return true; + if (auto *MCE = dyn_cast(Ptr->IgnoreParenImpCasts())) { +const CXXMethodDecl *MD = MCE->getMethodDecl(); +const CXXRecordDecl *RD = MCE->getRecordDecl()->getCanonicalDecl(); + +if (MD && RD && RD->isInStdNamespace()) + if (MD->getName() == "c_str" && RD->getName() == "basic_string") +return true; + } + return false; +} + +// Return true iff at least one of following cases holds: +// 1. Format string is a literal and there is an unsafe pointer argument +// corresponding to an `s` specifier; +// 2. Format string is not a literal and there is least an unsafe pointer +// argument (including the formatter argument). +static bool hasUnsafeFormatOrSArg(const CallExpr *Call, unsigned FmtArgIdx, + ASTContext &Ctx, bool isKprintf = false) { + class StringFormatStringHandler + : public analyze_format_string::FormatStringHandler { +const CallExpr *Call; +unsigned FmtArgIdx; + + public: +StringFormatStringHandler(const CallExpr *Call, unsigned FmtArgIdx) +: Call(Call), FmtArgIdx(FmtArgIdx) {} + +bool HandlePrintfSpecifier(const analyze_printf::PrintfSpecifier &FS, + const char *startSpecifier, + unsigned specifierLen, + const TargetInfo &Target) override { + if (FS.getConversionSpecifier().getKind() == + analyze_printf::PrintfConversionSpecifier::sArg) { +unsigned ArgIdx = FS.getPositionalArgIndex() + FmtArgIdx; + +if (0 < ArgIdx && ArgIdx < Call->getNumArgs()) + if (!isNullTermPointer(Call->getArg(ArgIdx))) +return false; // stop parsing haoNoQ wrote: So it doesn't just stop parsing, it also changes the return value of `ParsePrintfString()` from `false` to `true` right? This is how you get away without storing the intermediate answer anywhere in the visitor. I think it's a valid way to use it but it probably deserves a slightly more verbose comment. https://github.com/llvm/llvm-project/pull/101583 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [-Wunsafe-buffer-usage] Warning Libc functions (PR #101583)
https://github.com/haoNoQ edited https://github.com/llvm/llvm-project/pull/101583 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [llvm][clang] Move RewriterBuffer to ADT. (PR #99770)
https://github.com/haoNoQ approved this pull request. Works for me! IIUC nobody touched this code in a while, and that's probably because it's basically perfect for everyone's existing purposes. So I think moving it to ADT is appropriate. https://github.com/llvm/llvm-project/pull/99770 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [-Wunsafe-buffer-usage] Fix a small bug recently found (PR #102953)
https://github.com/haoNoQ approved this pull request. LGTM! https://github.com/llvm/llvm-project/pull/102953 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [-Wunsafe-buffer-usage] Warning Libc functions (PR #101583)
https://github.com/haoNoQ edited https://github.com/llvm/llvm-project/pull/101583 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [-Wunsafe-buffer-usage] Fix a small bug recently found (PR #102953)
https://github.com/haoNoQ edited https://github.com/llvm/llvm-project/pull/102953 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] Warning Libc functions (PR #101583)
@@ -443,6 +448,368 @@ AST_MATCHER(ArraySubscriptExpr, isSafeArraySubscript) { return false; } +namespace libc_fun_disjoint_inner_matchers { +// `libc_fun_disjoint_inner_matchers` covers a set of matchers that match +// disjoint node sets. They all take a `CoreName`, which is the substring of a +// function name after `ignoreLibcPrefixAndSuffix`. They are suppose to be used +// as an inner matcher of `ignoreLibcPrefixAndSuffix` to deal with different +// libc function calls. + +// Matches a function call node such that +// 1. It's name, after stripping off predefined prefix and suffix, is +// `CoreName`; and +// 2. `CoreName` or `CoreName[str/wcs]` is one of the `PredefinedNames`, which +// is a set of libc function names. +// +// Note: For predefined prefix and suffix, see `ignoreLibcPrefixAndSuffix`. +// And, the notation `CoreName[str/wcs]` means a new name obtained from replace +// string "wcs" with "str" in `CoreName`. +// +// Also note, the set of predefined function names does not include `printf` +// functions, they are checked exclusively with other matchers below. +// Maintaining the invariant that all matchers under +// `libc_fun_disjoint_inner_matchers` are disjoint. +AST_MATCHER_P(CallExpr, predefinedUnsafeLibcFunCall, StringRef, CoreName) { + static const std::set PredefinedNames{ + // numeric conversion: + "atof", + "atoi", + "atol", + "atoll", + "strtol", + "strtoll", + "strtoul", + "strtoull", + "strtof", + "strtod", + "strtold", + "strtoimax", + "strtoumax", + // "strfromf", "strfromd", "strfroml", // C23? + // string manipulation: + "strcpy", + "strncpy", + "strlcpy", + "strcat", + "strncat", + "strlcat", + "strxfrm", + "strdup", + "strndup", + // string examination: + "strlen", + "strnlen", + "strcmp", + "strncmp", + "stricmp", + "strcasecmp", + "strcoll", + "strchr", + "strrchr", + "strspn", + "strcspn", + "strpbrk", + "strstr", + "strtok", + // "mem-" functions + "memchr", + "wmemchr", + "memcmp", + "wmemcmp", + "memcpy", + "memccpy", + "mempcpy", + "wmemcpy", + "memmove", + "wmemmove", + "memset", + "wmemset", + // IO: + "fread", + "fwrite", + "fgets", + "fgetws", + "gets", + "fputs", + "fputws", + "puts", + // others + "strerror_s", + "strerror_r", + "bcopy", + "bzero", + "bsearch", + "qsort", + }; + // This is safe: strlen("hello"). We don't want to be noisy on this case. + auto isSafeStrlen = [&Node](StringRef Name) -> bool { +return Name == "strlen" && Node.getNumArgs() == 1 && + isa(Node.getArg(0)->IgnoreParenImpCasts()); + }; + + // Match predefined names: + if (PredefinedNames.find(CoreName.str()) != PredefinedNames.end()) haoNoQ wrote: AFAICT, `.str()` is an O(N) copy of the string. Which isn't a lot but `llvm::StringSet` would probably work much better here given how it works with `StringRef`s directly. https://github.com/llvm/llvm-project/pull/101583 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] Warning Libc functions (PR #101583)
@@ -443,6 +448,368 @@ AST_MATCHER(ArraySubscriptExpr, isSafeArraySubscript) { return false; } +namespace libc_fun_disjoint_inner_matchers { +// `libc_fun_disjoint_inner_matchers` covers a set of matchers that match +// disjoint node sets. They all take a `CoreName`, which is the substring of a +// function name after `ignoreLibcPrefixAndSuffix`. They are suppose to be used +// as an inner matcher of `ignoreLibcPrefixAndSuffix` to deal with different +// libc function calls. + +// Matches a function call node such that +// 1. It's name, after stripping off predefined prefix and suffix, is +// `CoreName`; and +// 2. `CoreName` or `CoreName[str/wcs]` is one of the `PredefinedNames`, which +// is a set of libc function names. +// +// Note: For predefined prefix and suffix, see `ignoreLibcPrefixAndSuffix`. +// And, the notation `CoreName[str/wcs]` means a new name obtained from replace +// string "wcs" with "str" in `CoreName`. +// +// Also note, the set of predefined function names does not include `printf` +// functions, they are checked exclusively with other matchers below. +// Maintaining the invariant that all matchers under +// `libc_fun_disjoint_inner_matchers` are disjoint. +AST_MATCHER_P(CallExpr, predefinedUnsafeLibcFunCall, StringRef, CoreName) { + static const std::set PredefinedNames{ + // numeric conversion: + "atof", + "atoi", + "atol", + "atoll", + "strtol", + "strtoll", + "strtoul", + "strtoull", + "strtof", + "strtod", + "strtold", + "strtoimax", + "strtoumax", + // "strfromf", "strfromd", "strfroml", // C23? + // string manipulation: + "strcpy", + "strncpy", + "strlcpy", + "strcat", + "strncat", + "strlcat", + "strxfrm", + "strdup", + "strndup", + // string examination: + "strlen", + "strnlen", + "strcmp", + "strncmp", + "stricmp", + "strcasecmp", + "strcoll", + "strchr", + "strrchr", + "strspn", + "strcspn", + "strpbrk", + "strstr", + "strtok", + // "mem-" functions + "memchr", + "wmemchr", + "memcmp", + "wmemcmp", + "memcpy", + "memccpy", + "mempcpy", + "wmemcpy", + "memmove", + "wmemmove", + "memset", + "wmemset", + // IO: + "fread", + "fwrite", + "fgets", + "fgetws", + "gets", + "fputs", + "fputws", + "puts", + // others + "strerror_s", + "strerror_r", + "bcopy", + "bzero", + "bsearch", + "qsort", + }; + // This is safe: strlen("hello"). We don't want to be noisy on this case. + auto isSafeStrlen = [&Node](StringRef Name) -> bool { +return Name == "strlen" && Node.getNumArgs() == 1 && + isa(Node.getArg(0)->IgnoreParenImpCasts()); + }; + + // Match predefined names: + if (PredefinedNames.find(CoreName.str()) != PredefinedNames.end()) +return !isSafeStrlen(CoreName); + + std::string NameWCS = CoreName.str(); + size_t WcsPos = NameWCS.find("wcs"); + + while (WcsPos != std::string::npos) { +NameWCS[WcsPos++] = 's'; +NameWCS[WcsPos++] = 't'; +NameWCS[WcsPos++] = 'r'; +WcsPos = NameWCS.find("wcs", WcsPos); + } + if (PredefinedNames.find(NameWCS) != PredefinedNames.end()) +return !isSafeStrlen(NameWCS); + // All `scanf` functions are unsafe (including `sscanf`, `vsscanf`, etc.. They + // all should end with "scanf"): + return CoreName.ends_with("scanf"); +} + +// Match a call to one of the `-printf` functions taking `va_list`. We cannot +// check safety for these functions so they should be changed to their +// non-va_list versions. +AST_MATCHER_P(CallExpr, unsafeVaListPrintfs, StringRef, CoreName) { + StringRef Name = CoreName; + + if (!Name.ends_with("printf")) +return false; // neither printf nor scanf + return Name.starts_with("v"); +} + +// Matches a call to one of the `-sprintf` functions (excluding the ones with +// va_list) as they are always unsafe and should be changed to corresponding +// `-snprintf`s. +AST_MATCHER_P(CallExpr, unsafeSprintfs, StringRef, CoreName) { + StringRef Name = CoreName; + + if (!Name.ends_with("printf") || Name.starts_with("v")) +return false; + + StringRef Prefix = Name.drop_back(6); + + if (Prefix.ends_with("w")) +Prefix = Prefix.drop_back(1); + return Prefix == "s"; +} + +// A pointer type expression is known to be null-terminated, if it has the +// form: E.c_str(), for any expression E of `std::string` type. +static bool isNullTermPointer(const Expr *Ptr) { + if (isa(Ptr->IgnoreParenImpCasts())) +return true; + if (isa(Ptr->IgnoreParenImpCasts())) +return true; + if (auto *MCE = dyn_cast(Ptr->IgnoreParenImpCasts())) { +const CXXMethodDecl *MD = MCE->getMethodDecl(); +const CXXRecordDecl *RD = MCE->getRecordDecl()->getCanonicalDecl(); + +if (MD && RD && RD->isInStdNamespace(
[clang] Warning Libc functions (PR #101583)
@@ -443,6 +448,368 @@ AST_MATCHER(ArraySubscriptExpr, isSafeArraySubscript) { return false; } +namespace libc_fun_disjoint_inner_matchers { +// `libc_fun_disjoint_inner_matchers` covers a set of matchers that match +// disjoint node sets. They all take a `CoreName`, which is the substring of a +// function name after `ignoreLibcPrefixAndSuffix`. They are suppose to be used +// as an inner matcher of `ignoreLibcPrefixAndSuffix` to deal with different +// libc function calls. + +// Matches a function call node such that +// 1. It's name, after stripping off predefined prefix and suffix, is +// `CoreName`; and +// 2. `CoreName` or `CoreName[str/wcs]` is one of the `PredefinedNames`, which +// is a set of libc function names. +// +// Note: For predefined prefix and suffix, see `ignoreLibcPrefixAndSuffix`. +// And, the notation `CoreName[str/wcs]` means a new name obtained from replace +// string "wcs" with "str" in `CoreName`. +// +// Also note, the set of predefined function names does not include `printf` +// functions, they are checked exclusively with other matchers below. +// Maintaining the invariant that all matchers under +// `libc_fun_disjoint_inner_matchers` are disjoint. +AST_MATCHER_P(CallExpr, predefinedUnsafeLibcFunCall, StringRef, CoreName) { + static const std::set PredefinedNames{ + // numeric conversion: + "atof", + "atoi", + "atol", + "atoll", + "strtol", + "strtoll", + "strtoul", + "strtoull", + "strtof", + "strtod", + "strtold", + "strtoimax", + "strtoumax", + // "strfromf", "strfromd", "strfroml", // C23? + // string manipulation: + "strcpy", + "strncpy", + "strlcpy", + "strcat", + "strncat", + "strlcat", + "strxfrm", + "strdup", + "strndup", + // string examination: + "strlen", + "strnlen", + "strcmp", + "strncmp", + "stricmp", + "strcasecmp", + "strcoll", + "strchr", + "strrchr", + "strspn", + "strcspn", + "strpbrk", + "strstr", + "strtok", + // "mem-" functions + "memchr", + "wmemchr", + "memcmp", + "wmemcmp", + "memcpy", + "memccpy", + "mempcpy", + "wmemcpy", + "memmove", + "wmemmove", + "memset", + "wmemset", + // IO: + "fread", + "fwrite", + "fgets", + "fgetws", + "gets", + "fputs", + "fputws", + "puts", + // others + "strerror_s", + "strerror_r", + "bcopy", + "bzero", + "bsearch", + "qsort", + }; + // This is safe: strlen("hello"). We don't want to be noisy on this case. + auto isSafeStrlen = [&Node](StringRef Name) -> bool { +return Name == "strlen" && Node.getNumArgs() == 1 && + isa(Node.getArg(0)->IgnoreParenImpCasts()); + }; + + // Match predefined names: + if (PredefinedNames.find(CoreName.str()) != PredefinedNames.end()) +return !isSafeStrlen(CoreName); + + std::string NameWCS = CoreName.str(); + size_t WcsPos = NameWCS.find("wcs"); + + while (WcsPos != std::string::npos) { +NameWCS[WcsPos++] = 's'; +NameWCS[WcsPos++] = 't'; +NameWCS[WcsPos++] = 'r'; +WcsPos = NameWCS.find("wcs", WcsPos); + } + if (PredefinedNames.find(NameWCS) != PredefinedNames.end()) +return !isSafeStrlen(NameWCS); + // All `scanf` functions are unsafe (including `sscanf`, `vsscanf`, etc.. They + // all should end with "scanf"): + return CoreName.ends_with("scanf"); +} + +// Match a call to one of the `-printf` functions taking `va_list`. We cannot +// check safety for these functions so they should be changed to their +// non-va_list versions. +AST_MATCHER_P(CallExpr, unsafeVaListPrintfs, StringRef, CoreName) { + StringRef Name = CoreName; + + if (!Name.ends_with("printf")) +return false; // neither printf nor scanf + return Name.starts_with("v"); +} + +// Matches a call to one of the `-sprintf` functions (excluding the ones with +// va_list) as they are always unsafe and should be changed to corresponding +// `-snprintf`s. +AST_MATCHER_P(CallExpr, unsafeSprintfs, StringRef, CoreName) { + StringRef Name = CoreName; + + if (!Name.ends_with("printf") || Name.starts_with("v")) +return false; + + StringRef Prefix = Name.drop_back(6); + + if (Prefix.ends_with("w")) +Prefix = Prefix.drop_back(1); + return Prefix == "s"; +} + +// A pointer type expression is known to be null-terminated, if it has the +// form: E.c_str(), for any expression E of `std::string` type. +static bool isNullTermPointer(const Expr *Ptr) { + if (isa(Ptr->IgnoreParenImpCasts())) +return true; + if (isa(Ptr->IgnoreParenImpCasts())) haoNoQ wrote: I just looked up `PredefinedExpr` on Doxygen and was very confused about what did they mean by "A predefined identifier such as __func__". Turns out they meant `__func__` but it got markdown'd 😄 Yeah nice one, th
[clang] Warning Libc functions (PR #101583)
@@ -2292,6 +2292,18 @@ class UnsafeBufferUsageReporter : public UnsafeBufferUsageHandler { } } + void handleUnsafeLibcCall(const CallExpr *Call, unsigned PrintfInfo, +ASTContext &Ctx) override { +// We have checked that there is a direct callee with an identifier name: +StringRef Name = Call->getDirectCallee()->getName(); + +S.Diag(Call->getBeginLoc(), diag::warn_unsafe_buffer_libc_call) +<< Name << Call->getSourceRange(); haoNoQ wrote: Just pass `const Decl *D = Call->getDirectCallee()` directly into `S.Diag()`. It even automatically adds quotes around the name this way. https://github.com/llvm/llvm-project/pull/101583 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] Warning Libc functions (PR #101583)
@@ -443,6 +448,368 @@ AST_MATCHER(ArraySubscriptExpr, isSafeArraySubscript) { return false; } +namespace libc_fun_disjoint_inner_matchers { +// `libc_fun_disjoint_inner_matchers` covers a set of matchers that match +// disjoint node sets. They all take a `CoreName`, which is the substring of a +// function name after `ignoreLibcPrefixAndSuffix`. They are suppose to be used +// as an inner matcher of `ignoreLibcPrefixAndSuffix` to deal with different +// libc function calls. + +// Matches a function call node such that +// 1. It's name, after stripping off predefined prefix and suffix, is +// `CoreName`; and +// 2. `CoreName` or `CoreName[str/wcs]` is one of the `PredefinedNames`, which +// is a set of libc function names. +// +// Note: For predefined prefix and suffix, see `ignoreLibcPrefixAndSuffix`. +// And, the notation `CoreName[str/wcs]` means a new name obtained from replace +// string "wcs" with "str" in `CoreName`. +// +// Also note, the set of predefined function names does not include `printf` +// functions, they are checked exclusively with other matchers below. +// Maintaining the invariant that all matchers under +// `libc_fun_disjoint_inner_matchers` are disjoint. +AST_MATCHER_P(CallExpr, predefinedUnsafeLibcFunCall, StringRef, CoreName) { + static const std::set PredefinedNames{ + // numeric conversion: + "atof", + "atoi", + "atol", + "atoll", + "strtol", + "strtoll", + "strtoul", + "strtoull", + "strtof", + "strtod", + "strtold", + "strtoimax", + "strtoumax", + // "strfromf", "strfromd", "strfroml", // C23? + // string manipulation: + "strcpy", + "strncpy", + "strlcpy", + "strcat", + "strncat", + "strlcat", + "strxfrm", + "strdup", + "strndup", + // string examination: + "strlen", + "strnlen", + "strcmp", + "strncmp", + "stricmp", + "strcasecmp", + "strcoll", + "strchr", + "strrchr", + "strspn", + "strcspn", + "strpbrk", + "strstr", + "strtok", + // "mem-" functions + "memchr", + "wmemchr", + "memcmp", + "wmemcmp", + "memcpy", + "memccpy", + "mempcpy", + "wmemcpy", + "memmove", + "wmemmove", + "memset", + "wmemset", + // IO: + "fread", + "fwrite", + "fgets", + "fgetws", + "gets", + "fputs", + "fputws", + "puts", + // others + "strerror_s", + "strerror_r", + "bcopy", + "bzero", + "bsearch", + "qsort", + }; + // This is safe: strlen("hello"). We don't want to be noisy on this case. + auto isSafeStrlen = [&Node](StringRef Name) -> bool { +return Name == "strlen" && Node.getNumArgs() == 1 && + isa(Node.getArg(0)->IgnoreParenImpCasts()); + }; + + // Match predefined names: + if (PredefinedNames.find(CoreName.str()) != PredefinedNames.end()) +return !isSafeStrlen(CoreName); + + std::string NameWCS = CoreName.str(); + size_t WcsPos = NameWCS.find("wcs"); + + while (WcsPos != std::string::npos) { +NameWCS[WcsPos++] = 's'; +NameWCS[WcsPos++] = 't'; +NameWCS[WcsPos++] = 'r'; +WcsPos = NameWCS.find("wcs", WcsPos); + } + if (PredefinedNames.find(NameWCS) != PredefinedNames.end()) +return !isSafeStrlen(NameWCS); + // All `scanf` functions are unsafe (including `sscanf`, `vsscanf`, etc.. They + // all should end with "scanf"): + return CoreName.ends_with("scanf"); +} + +// Match a call to one of the `-printf` functions taking `va_list`. We cannot +// check safety for these functions so they should be changed to their +// non-va_list versions. +AST_MATCHER_P(CallExpr, unsafeVaListPrintfs, StringRef, CoreName) { + StringRef Name = CoreName; + + if (!Name.ends_with("printf")) +return false; // neither printf nor scanf + return Name.starts_with("v"); +} + +// Matches a call to one of the `-sprintf` functions (excluding the ones with +// va_list) as they are always unsafe and should be changed to corresponding +// `-snprintf`s. +AST_MATCHER_P(CallExpr, unsafeSprintfs, StringRef, CoreName) { + StringRef Name = CoreName; + + if (!Name.ends_with("printf") || Name.starts_with("v")) +return false; + + StringRef Prefix = Name.drop_back(6); + + if (Prefix.ends_with("w")) +Prefix = Prefix.drop_back(1); + return Prefix == "s"; +} + +// A pointer type expression is known to be null-terminated, if it has the +// form: E.c_str(), for any expression E of `std::string` type. +static bool isNullTermPointer(const Expr *Ptr) { + if (isa(Ptr->IgnoreParenImpCasts())) +return true; + if (isa(Ptr->IgnoreParenImpCasts())) +return true; + if (auto *MCE = dyn_cast(Ptr->IgnoreParenImpCasts())) { +const CXXMethodDecl *MD = MCE->getMethodDecl(); +const CXXRecordDecl *RD = MCE->getRecordDecl()->getCanonicalDecl(); + +if (MD && RD && RD->isInStdNamespace(
[clang] Warning Libc functions (PR #101583)
@@ -483,6 +483,34 @@ bool clang::analyze_format_string::ParseFormatStringHasSArg(const char *I, return false; } +unsigned clang::analyze_format_string::ParseFormatStringFirstSArgIndex( +const char *&I, const char *E, unsigned ArgIndex, const LangOptions &LO, +const TargetInfo &Target) { + unsigned argIndex = ArgIndex; + + // Keep looking for a %s format specifier until we have exhausted the string. + FormatStringHandler H; + while (I != E) { +const PrintfSpecifierResult &FSR = +ParsePrintfSpecifier(H, I, E, argIndex, LO, Target, false, false); +// Did a fail-stop error of any kind occur when parsing the specifier? +// If so, don't do any more processing. +if (FSR.shouldStop()) haoNoQ wrote: Is this any easier than simply overriding `HandlePrintfSpecifier()` and `HandleScanfSpecifier()` in `FormatStringHandler` to tell you everything you need in a simple mostly-statically-typed manner? https://github.com/llvm/llvm-project/pull/101583 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] Warning Libc functions (PR #101583)
@@ -443,6 +448,368 @@ AST_MATCHER(ArraySubscriptExpr, isSafeArraySubscript) { return false; } +namespace libc_fun_disjoint_inner_matchers { +// `libc_fun_disjoint_inner_matchers` covers a set of matchers that match +// disjoint node sets. They all take a `CoreName`, which is the substring of a +// function name after `ignoreLibcPrefixAndSuffix`. They are suppose to be used +// as an inner matcher of `ignoreLibcPrefixAndSuffix` to deal with different +// libc function calls. + +// Matches a function call node such that +// 1. It's name, after stripping off predefined prefix and suffix, is +// `CoreName`; and +// 2. `CoreName` or `CoreName[str/wcs]` is one of the `PredefinedNames`, which +// is a set of libc function names. +// +// Note: For predefined prefix and suffix, see `ignoreLibcPrefixAndSuffix`. +// And, the notation `CoreName[str/wcs]` means a new name obtained from replace +// string "wcs" with "str" in `CoreName`. +// +// Also note, the set of predefined function names does not include `printf` +// functions, they are checked exclusively with other matchers below. +// Maintaining the invariant that all matchers under +// `libc_fun_disjoint_inner_matchers` are disjoint. +AST_MATCHER_P(CallExpr, predefinedUnsafeLibcFunCall, StringRef, CoreName) { + static const std::set PredefinedNames{ + // numeric conversion: + "atof", + "atoi", + "atol", + "atoll", + "strtol", + "strtoll", + "strtoul", + "strtoull", + "strtof", + "strtod", + "strtold", + "strtoimax", + "strtoumax", + // "strfromf", "strfromd", "strfroml", // C23? + // string manipulation: + "strcpy", + "strncpy", + "strlcpy", + "strcat", + "strncat", + "strlcat", + "strxfrm", + "strdup", + "strndup", + // string examination: + "strlen", + "strnlen", + "strcmp", + "strncmp", + "stricmp", + "strcasecmp", + "strcoll", + "strchr", + "strrchr", + "strspn", + "strcspn", + "strpbrk", + "strstr", + "strtok", + // "mem-" functions + "memchr", + "wmemchr", + "memcmp", + "wmemcmp", + "memcpy", + "memccpy", + "mempcpy", + "wmemcpy", + "memmove", + "wmemmove", + "memset", + "wmemset", + // IO: + "fread", + "fwrite", + "fgets", + "fgetws", + "gets", + "fputs", + "fputws", + "puts", + // others + "strerror_s", + "strerror_r", + "bcopy", + "bzero", + "bsearch", + "qsort", + }; + // This is safe: strlen("hello"). We don't want to be noisy on this case. + auto isSafeStrlen = [&Node](StringRef Name) -> bool { +return Name == "strlen" && Node.getNumArgs() == 1 && + isa(Node.getArg(0)->IgnoreParenImpCasts()); + }; + + // Match predefined names: + if (PredefinedNames.find(CoreName.str()) != PredefinedNames.end()) +return !isSafeStrlen(CoreName); + + std::string NameWCS = CoreName.str(); + size_t WcsPos = NameWCS.find("wcs"); + + while (WcsPos != std::string::npos) { +NameWCS[WcsPos++] = 's'; +NameWCS[WcsPos++] = 't'; +NameWCS[WcsPos++] = 'r'; +WcsPos = NameWCS.find("wcs", WcsPos); + } + if (PredefinedNames.find(NameWCS) != PredefinedNames.end()) +return !isSafeStrlen(NameWCS); + // All `scanf` functions are unsafe (including `sscanf`, `vsscanf`, etc.. They + // all should end with "scanf"): + return CoreName.ends_with("scanf"); +} + +// Match a call to one of the `-printf` functions taking `va_list`. We cannot +// check safety for these functions so they should be changed to their +// non-va_list versions. +AST_MATCHER_P(CallExpr, unsafeVaListPrintfs, StringRef, CoreName) { + StringRef Name = CoreName; + + if (!Name.ends_with("printf")) +return false; // neither printf nor scanf + return Name.starts_with("v"); +} + +// Matches a call to one of the `-sprintf` functions (excluding the ones with +// va_list) as they are always unsafe and should be changed to corresponding +// `-snprintf`s. +AST_MATCHER_P(CallExpr, unsafeSprintfs, StringRef, CoreName) { + StringRef Name = CoreName; + + if (!Name.ends_with("printf") || Name.starts_with("v")) +return false; + + StringRef Prefix = Name.drop_back(6); + + if (Prefix.ends_with("w")) +Prefix = Prefix.drop_back(1); + return Prefix == "s"; +} + +// A pointer type expression is known to be null-terminated, if it has the +// form: E.c_str(), for any expression E of `std::string` type. +static bool isNullTermPointer(const Expr *Ptr) { + if (isa(Ptr->IgnoreParenImpCasts())) +return true; + if (isa(Ptr->IgnoreParenImpCasts())) +return true; + if (auto *MCE = dyn_cast(Ptr->IgnoreParenImpCasts())) { +const CXXMethodDecl *MD = MCE->getMethodDecl(); +const CXXRecordDecl *RD = MCE->getRecordDecl()->getCanonicalDecl(); + +if (MD && RD && RD->isInStdNamespace(
[clang] Warning Libc functions (PR #101583)
@@ -783,6 +783,18 @@ bool ParsePrintfString(FormatStringHandler &H, bool ParseFormatStringHasSArg(const char *beg, const char *end, const LangOptions &LO, const TargetInfo &Target); +/// Parse C format string and return index (relative to `ArgIndex`) of the first +/// found `s` specifier. Return 0 if not found. +/// \param I The start of the C format string; Updated to the first unparsed +/// position upon return. +/// \param E The end of the C format string; +/// \param ArgIndex The argument index of the last found `s` specifier; Or the +/// argument index of the formatter in initial case. +unsigned ParseFormatStringFirstSArgIndex(const char *&I, const char *E, haoNoQ wrote: The `*&` part is probably going to be very confusing to the users of this utility function. Maybe just use the handler directly in our code? https://github.com/llvm/llvm-project/pull/101583 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] Warning Libc functions (PR #101583)
@@ -443,6 +448,368 @@ AST_MATCHER(ArraySubscriptExpr, isSafeArraySubscript) { return false; } +namespace libc_fun_disjoint_inner_matchers { +// `libc_fun_disjoint_inner_matchers` covers a set of matchers that match +// disjoint node sets. They all take a `CoreName`, which is the substring of a +// function name after `ignoreLibcPrefixAndSuffix`. They are suppose to be used +// as an inner matcher of `ignoreLibcPrefixAndSuffix` to deal with different +// libc function calls. + +// Matches a function call node such that +// 1. It's name, after stripping off predefined prefix and suffix, is +// `CoreName`; and +// 2. `CoreName` or `CoreName[str/wcs]` is one of the `PredefinedNames`, which +// is a set of libc function names. +// +// Note: For predefined prefix and suffix, see `ignoreLibcPrefixAndSuffix`. +// And, the notation `CoreName[str/wcs]` means a new name obtained from replace +// string "wcs" with "str" in `CoreName`. +// +// Also note, the set of predefined function names does not include `printf` +// functions, they are checked exclusively with other matchers below. +// Maintaining the invariant that all matchers under +// `libc_fun_disjoint_inner_matchers` are disjoint. +AST_MATCHER_P(CallExpr, predefinedUnsafeLibcFunCall, StringRef, CoreName) { + static const std::set PredefinedNames{ haoNoQ wrote: We can't have nice things because of https://llvm.org/docs/CodingStandards.html#do-not-use-static-constructors - if we need this to store this as a set (and I think we do because of the sheer size of it), we have to store it as a raw pointer and lazy-initialize it on first use. https://github.com/llvm/llvm-project/pull/101583 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [analyzer] Delete `alpha.security.MallocOverflow` (PR #103059)
haoNoQ wrote: > Instead of actually tracking the symbolic values and the known constraints on > them, this checker blindly gropes the AST and uses heuristics like "this > variable was seen in a comparison operator expression that is not a loop > condition, so it's probably not too large" (which was improved in a separate > commit to at least ignore comparison operators that appear after the actual > `malloc()` call). Yeah this should either be a "taint analysis" thing. Or a coding-convention thing that only works when the users are provided with a clear alternative, such as "please always use overflow-checked builtins when computing the size for malloc". https://github.com/llvm/llvm-project/pull/103059 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [analyzer] Delete `alpha.security.MallocOverflow` (PR #103059)
@@ -1039,10 +1039,6 @@ def ArrayBoundCheckerV2 : Checker<"ArrayBoundV2">, HelpText<"Warn about buffer overflows (newer checker)">, Documentation; -def MallocOverflowSecurityChecker : Checker<"MallocOverflow">, haoNoQ wrote: We could keep this for a while to avoid breaking people's configurations, but given that the checker was alpha we're probably not strictly obliged to do that. https://github.com/llvm/llvm-project/pull/103059 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [analyzer] Delete `alpha.security.MallocOverflow` (PR #103059)
https://github.com/haoNoQ approved this pull request. I too think this checker can be safely deleted. To the best of my knowledge, nobody is using it. https://github.com/llvm/llvm-project/pull/103059 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [analyzer] Delete `alpha.security.MallocOverflow` (PR #103059)
@@ -1,40 +0,0 @@ -// RUN: %clang_analyze_cc1 -triple x86_64-unknown-unknown -analyzer-checker=alpha.security.MallocOverflow,unix -verify %s -// RUN: %clang_analyze_cc1 -triple x86_64-unknown-unknown -analyzer-checker=alpha.security.MallocOverflow,unix,optin.portability -DPORTABILITY -verify %s haoNoQ wrote: This test was enabling a few other checkers (including a few path-sensitive checkers), so it could have acted as a "true negatives" test for those checkers, so I think it may be slightly useful if we preserve it. Quite unlikely though. https://github.com/llvm/llvm-project/pull/103059 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [analyzer] Delete `alpha.security.MallocOverflow` (PR #103059)
https://github.com/haoNoQ edited https://github.com/llvm/llvm-project/pull/103059 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [attributes][-Wunsafe-buffer-usage] Support adding unsafe_buffer_usage attribute to struct fields (PR #101585)
@@ -6835,6 +6837,31 @@ the proper solution would be to create a different function (possibly an overload of ``baz()``) that accepts a safe container like ``bar()``, and then use the attribute on the original ``baz()`` to help the users update their code to use the new function. + +Attribute attached to fields: + +The attribute should only be attached to struct fields, if the fields can not be +updated to a safe type with bounds check, such as std::span. In other words, the +buffers prone to unsafe accesses should always be updated to use safe containers/views +and attaching the attribute must be last resort when such an update is infeasible. + +The attribute can be placed on individual fields or a set of them as shown below. +.. code-block:: c++ + + struct A { +[[clang::unsafe_buffer_usage]] +int *ptr1; + +[[clang::unsafe_buffer_usage]] +int *ptr2, buf[10]; + +[[clang::unsafe_buffer_usage]] +size_t sz; + }; + +Here, every read/write to the fields ptr1, ptr2, buf and sz will trigger a warning that the +field has been explcitly marked as unsafe due to unsafe-buffer operations. + haoNoQ wrote: Discussed offline - what I proposed is "a" solution but very much not necessarily "the" solution so we probably shouldn't outright recommend it in a document as low-level as compiler documentation. https://github.com/llvm/llvm-project/pull/101585 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [attributes][-Wunsafe-buffer-usage] Support adding unsafe_buffer_usage attribute to struct fields (PR #101585)
https://github.com/haoNoQ approved this pull request. LGTM!! https://github.com/llvm/llvm-project/pull/101585 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [attributes][-Wunsafe-buffer-usage] Support adding unsafe_buffer_usage attribute to struct fields (PR #101585)
https://github.com/haoNoQ edited https://github.com/llvm/llvm-project/pull/101585 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [Safe Buffers] Fix a small bug recently found (PR #102953)
@@ -404,7 +404,7 @@ AST_MATCHER(CXXConstructExpr, isSafeSpanTwoParamConstruct) { if (Arg0Ty->isConstantArrayType()) { const APSInt ConstArrSize = -APSInt(cast(Arg0Ty)->getSize()); +APSInt(cast(Arg0Ty.getCanonicalType())->getSize()); haoNoQ wrote: There's this whole `ASTContext::getAsArrayType()` thing (et al.): > This is a member of > [ASTContext](https://clang.llvm.org/doxygen/classclang_1_1ASTContext.html) > because this may need to do some amount of canonicalization, e.g. to move > type qualifiers into the element type. I.e. such "canonicalization" is stronger than a simple `.getCanonicalType()` and it sounds like we're supposed to do that consistently. So we probably should use `ASTContext::getAsConstantArrayType()` in such cases. https://github.com/llvm/llvm-project/pull/102953 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [Safe Buffers] Fix a small bug recently found (PR #102953)
haoNoQ wrote: Ah classic! Forgotten `.getCanonicalType()`s always get me. https://github.com/llvm/llvm-project/pull/102953 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] Warning Libc functions (PR #101583)
@@ -443,6 +447,314 @@ AST_MATCHER(ArraySubscriptExpr, isSafeArraySubscript) { return false; } +AST_MATCHER(CallExpr, isUnsafeLibcFunctionCall) { + static const std::set PredefinedNames{ + // numeric conversion: + "atof", + "atoi", + "atol", + "atoll", + "strtol", + "strtoll", + "strtoul", + "strtoull", + "strtof", + "strtod", + "strtold", + "strtoimax", + "strtoumax", + // "strfromf", "strfromd", "strfroml", // C23? + // string manipulation: + "strcpy", + "strncpy", + "strlcpy", + "strcat", + "strncat", + "strlcat", + "strxfrm", + "strdup", + "strndup", + // string examination: + "strlen", + "strnlen", + "strcmp", + "strncmp", + "stricmp", + "strcasecmp", + "strcoll", + "strchr", + "strrchr", + "strspn", + "strcspn", + "strpbrk", + "strstr", + "strtok", + // "mem-" functions + "memchr", + "wmemchr", + "memcmp", + "wmemcmp", + "memcpy", + "memccpy", + "mempcpy", + "wmemcpy", + "memmove", + "wmemmove", + "memset", + "wmemset", + // IO: + "fread", + "fwrite", + "fgets", + "fgetws", + "gets", + "fputs", + "fputws", + "puts", + // others + "strerror_s", + "strerror_r", + "bcopy", + "bzero", + "bsearch", + "qsort", + }; + + // A tiny name parser for unsafe libc function names with additional + // checks for `printf`s: + struct FuncNameMatch { +const CallExpr *const Call; +ASTContext &Ctx; +enum ResultKind { + NO, // no match + YES, // matched a name that is not a member of the printf family + SPRINTF, // matched `sprintf` + OTHER_PRINTF, // matched a printf function that is not `sprintf` +}; +FuncNameMatch(const CallExpr *Call, ASTContext &Ctx) +: Call(Call), Ctx(Ctx) {} + +// For a name `S` in `PredefinedNames` or a member of the printf/scanf +// family, define matching function names with `S` by the grammar below: +// +// CoreName := S | S["wcs"/"str"] +// LibcName := CoreName | CoreName + "_s" +// MatchingName := "__builtin_" + LibcName | +// "__builtin___" + LibcName + "_chk" | +// "__asan_" + LibcName +// +// (Note S["wcs"/"str"] means substitute "str" with "wcs" in S.) +ResultKind matchName(StringRef FunName, bool isBuiltin) { + // Try to match __builtin_: + if (isBuiltin && FunName.starts_with("__builtin_")) +// Then either it is __builtin_LibcName or __builtin___LibcName_chk or +// no match: +return matchLibcNameOrBuiltinChk( +FunName.drop_front(10 /* truncate "__builtin_" */)); + // Try to match __asan_: + if (FunName.starts_with("__asan_")) +return matchLibcName(FunName.drop_front(7 /* truncate of "__asan_" */)); + return matchLibcName(FunName); +} + +// Parameter `Name` is the substring after stripping off the prefix +// "__builtin_". +ResultKind matchLibcNameOrBuiltinChk(StringRef Name) { + if (Name.starts_with("__") && Name.ends_with("_chk")) +return matchLibcName( +Name.drop_front(2).drop_back(4) /* truncate "__" and "_chk" */); + return matchLibcName(Name); +} + +ResultKind matchLibcName(StringRef Name) { + if (Name.ends_with("_s")) +return matchCoreName(Name.drop_back(2 /* truncate "_s" */)); + return matchCoreName(Name); +} + +ResultKind matchCoreName(StringRef Name) { + if (PredefinedNames.find(Name.str()) != PredefinedNames.end()) +return !isSafeStrlen(Name) ? YES + : NO; // match unless it's a safe strlen call + + std::string NameWCS = Name.str(); + size_t WcsPos = NameWCS.find("wcs"); + + while (WcsPos != std::string::npos) { +NameWCS[WcsPos++] = 's'; +NameWCS[WcsPos++] = 't'; +NameWCS[WcsPos++] = 'r'; +WcsPos = NameWCS.find("wcs", WcsPos); + } + if (PredefinedNames.find(NameWCS) != PredefinedNames.end()) +return !isSafeStrlen(NameWCS) ? YES : NO; + return matchPrintfOrScanfFamily(Name); +} + +ResultKind matchPrintfOrScanfFamily(StringRef Name) { + if (Name.ends_with("scanf")) +return YES; // simply warn about scanf functions + if (!Name.ends_with("printf")) +return NO; // neither printf nor scanf + if (Name.starts_with("v")) +// cannot check args for va_list, so `vprintf`s are treated as regular +// unsafe libc calls: +return YES; + + // Truncate "printf", focus on prefixes. There are different possible + // name prefixes: "k", "f", "s", "sn", "fw", ..., "snw". We strip off the + // 'w' and handle printfs di
[clang] Warning Libc functions (PR #101583)
@@ -443,6 +448,260 @@ AST_MATCHER(ArraySubscriptExpr, isSafeArraySubscript) { return false; } +AST_MATCHER(CallExpr, isUnsafeLibcFunctionCall) { + static const std::set PredefinedNames{ + // numeric conversion: + "atof", + "atoi", + "atol", + "atoll", + "strtol", + "strtoll", + "strtoul", + "strtoull", + "strtof", + "strtod", + "strtold", + "strtoimax", + "strtoumax", + // "strfromf", "strfromd", "strfroml", // C23? + // string manipulation: + "strcpy", + "strncpy", + "strlcpy", + "strcat", + "strncat", + "strlcat", + "strxfrm", + "strdup", + "strndup", + // string examination: + "strlen", + "strnlen", + "strcmp", + "strncmp", + "stricmp", + "strcasecmp", + "strcoll", + "strchr", + "strrchr", + "strspn", + "strcspn", + "strpbrk", + "strstr", + "strtok", + // "mem-" functions + "memchr", + "wmemchr", + "memcmp", + "wmemcmp", + "memcpy", + "memccpy", + "mempcpy", + "wmemcpy", + "memmove", + "wmemmove", + "memset", + "wmemset", + // IO: + "fread", + "fwrite", + "fgets", + "fgetws", + "gets", + "fputs", + "fputws", + "puts", + // others + "strerror_s", + "strerror_r", + "bcopy", + "bzero", + "bsearch", + "qsort", + }; + + // A tiny name parser for unsafe libc function names with additional + // checks for `printf`s: + struct FuncNameMatch { +const CallExpr *const Call; +FuncNameMatch(const CallExpr *Call) : Call(Call) {} + +// For a name `S` in `PredefinedNames` or a member of the printf/scanf +// family, define matching function names with `S` by the grammar below: +// +// CoreName := S | S["wcs"/"str"] +// LibcName := CoreName | CoreName + "_s" +// MatchingName := "__builtin_" + LibcName | +// "__builtin___" + LibcName + "_chk" | +// "__asan_" + LibcName +// +// (Note S["wcs"/"str"] means substitute "str" with "wcs" in S.) +bool matchName(StringRef FunName) { + // Try to match __builtin_: + if (FunName.starts_with("__builtin_")) +// Then either it is __builtin_LibcName or __builtin___LibcName_chk or +// no match: +return matchLibcNameOrBuiltinChk( +FunName.drop_front(10 /* truncate "__builtin_" */)); + // Try to match __asan_: + if (FunName.starts_with("__asan_")) +return matchLibcName(FunName.drop_front(7 /* truncate of "__asan_" */)); + return matchLibcName(FunName); +} + +// Parameter `Name` is the substring after stripping off the prefix +// "__builtin_". +bool matchLibcNameOrBuiltinChk(StringRef Name) { + if (Name.starts_with("__") && Name.ends_with("_chk")) +return matchLibcName( +Name.drop_front(2).drop_back(4) /* truncate "__" and "_chk" */); + return matchLibcName(Name); +} + +bool matchLibcName(StringRef Name) { + if (Name.ends_with("_s")) +return matchCoreName(Name.drop_back(2 /* truncate "_s" */)); + return matchCoreName(Name); +} + +bool matchCoreName(StringRef Name) { + if (PredefinedNames.find(Name.str()) != PredefinedNames.end()) +return !isSafeStrlen(Name); // match unless it's a safe strlen call + + std::string NameWCS = Name.str(); + size_t WcsPos = NameWCS.find("wcs"); + + while (WcsPos != std::string::npos) { +NameWCS[WcsPos++] = 's'; +NameWCS[WcsPos++] = 't'; +NameWCS[WcsPos++] = 'r'; +WcsPos = NameWCS.find("wcs", WcsPos); + } + if (PredefinedNames.find(NameWCS) != PredefinedNames.end()) +return !isSafeStrlen(NameWCS); + return matchPrintfOrScanfFamily(Name); +} + +bool matchPrintfOrScanfFamily(StringRef Name) { + if (Name.ends_with("scanf")) +return true; // simply warn about scanf functions + if (!Name.ends_with("printf")) +return false; // neither printf nor scanf + if (Name.starts_with("v")) +// cannot check args for va_list, so `vprintf`s are treated as regular +// unsafe libc calls: +return true; + + // Truncate "printf", focus on prefixes. There are different possible + // name prefixes: "k", "f", "s", "sn", "fw", ..., "snw". We strip off the + // 'w' and handle printfs differently by "k", "f", "s", "sn" or no prefix: + StringRef Prefix = Name.drop_back(6); + + if (Prefix.ends_with("w")) +Prefix = Prefix.drop_back(1); + return isUnsafePrintf(Prefix); +} + +// A pointer type expression is known to be null-terminated, if it has the +// form: E.c_str(), for any expression E of `std::string` type. +static bool isNullTermPointer(const Expr *Ptr) { + if (i
[clang] [attributes][-Wunsafe-buffer-usage] Support adding unsafe_buffer_usage attribute to struct fields (PR #101585)
@@ -0,0 +1,180 @@ +// RUN: %clang_cc1 -std=c++20 -Wunsafe-buffer-usage \ +// RUN:-fsafe-buffer-usage-suggestions -verify %s + +using size_t = __typeof(sizeof(int)); + +namespace std { + class type_info; + class bad_cast; + class bad_typeid; + + template class span { + + private: +T *elements; +size_t size_; + + public: +span(T *, size_t){} + +constexpr T* data() const noexcept { + return elements; +} + +constexpr size_t size() const noexcept { + return size_; +} + + }; +} + +struct A { +[[clang::unsafe_buffer_usage]] +int *ptr; + +size_t sz; +}; + +struct B { + A a; + + [[clang::unsafe_buffer_usage]] + int buf[]; +}; + +struct D { + [[clang::unsafe_buffer_usage]] + int *ptr, *ptr2; + + [[clang::unsafe_buffer_usage]] + int buf[10]; + + size_t sz; + +}; + +void foo(int *ptr); + +void foo_safe(std::span sp); + +int* test_atribute_struct(A a) { + int b = *(a.ptr); //expected-warning{{field 'ptr' prone to unsafe buffer manipulation}} + a.sz++; + // expected-warning@+1{{unsafe pointer arithmetic}} + return a.ptr++; //expected-warning{{field 'ptr' prone to unsafe buffer manipulation}} +} + +void test_attribute_field_deref_chain(B b) { + int *ptr = b.a.ptr;//expected-warning{{field 'ptr' prone to unsafe buffer manipulation}} + foo(b.buf); //expected-warning{{field 'buf' prone to unsafe buffer manipulation}} +} + +void test_safe_writes(std::span sp) { + A a; + // TODO: We should not warn for safe assignments from hardened types + a.ptr = sp.data(); //expected-warning{{field 'ptr' prone to unsafe buffer manipulation}} + a.sz = sp.size(); + + a.ptr = nullptr; // expected-warning{{field 'ptr' prone to unsafe buffer manipulation}} +} + +void test_safe_reads(A a, A b) { + //expected-warning@+1{{the two-parameter std::span construction is unsafe as it can introduce mismatch between buffer size and the bound information}} + std::span sp {a.ptr, a.sz}; //expected-warning{{field 'ptr' prone to unsafe buffer manipulation}} + + // expected-warning@+1 3{{field 'ptr' prone to unsafe buffer manipulation}} + if(a.ptr != nullptr && a.ptr != b.ptr) { +foo_safe(sp); + } + +} + +void test_attribute_multiple_fields (D d) { + int *p =d.ptr; //expected-warning{{field 'ptr' prone to unsafe buffer manipulation}} + p = d.ptr2; //expected-warning{{field 'ptr2' prone to unsafe buffer manipulation}} + + p = d.buf; //expected-warning{{field 'buf' prone to unsafe buffer manipulation}} + + int v = d.buf[0]; //expected-warning{{field 'buf' prone to unsafe buffer manipulation}} + + //expected-warning@+1{{unsafe buffer access}} + v = d.buf[5]; //expected-warning{{field 'buf' prone to unsafe buffer manipulation}} +} + +template +struct TemplateArray { + [[clang::unsafe_buffer_usage]] + T *buf; + + [[clang::unsafe_buffer_usage]] + size_t sz; +}; + + +void test_struct_template (TemplateArray t) { + int *p = t.buf; //expected-warning{{field 'buf' prone to unsafe buffer manipulation}} + size_t s = t.sz; //expected-warning{{field 'sz' prone to unsafe buffer manipulation}} +} + +class R { + [[clang::unsafe_buffer_usage]] + int *array; + + public: + int* getArray() { + return array; //expected-warning{{field 'array' prone to unsafe buffer manipulation}} + } + + void setArray(int *arr) { + array = arr; //expected-warning{{field 'array' prone to unsafe buffer manipulation}} + } +}; + +template +class Q { + [[clang::unsafe_buffer_usage]] + P *array; + + public: + P* getArray() { + return array; //expected-warning{{field 'array' prone to unsafe buffer manipulation}} + } + + void setArray(P *arr) { + array = arr; //expected-warning{{field 'array' prone to unsafe buffer manipulation}} + } +}; + +void test_class_template(Q q) { + q.getArray(); + q.setArray(nullptr); +} + +struct AnonFields { + struct { + [[clang::unsafe_buffer_usage]] + int a; + }; +}; + +void test_anon_fields(AnonFields anon) { + int val = anon.a; //expected-warning{{field 'a' prone to unsafe buffer manipulation}} haoNoQ wrote: In this case the field isn't anonymous, only the parent field is anonymous. I wonder if this works too: ``` struct AnonFields { [[clang::unsafe_buffer_usage]] struct { int a; }; }; ``` But this would probably cause the attribute to try to get attached to the struct decl, not to the field decl? So I'm not really aware of a realistic way to trigger this kind of problem. But in case of functions this is going to be way more common. Eg., every overloaded operator is basically an anonymous function. ObjC methods are named but their names aren't simple identifiers so this is another source of crashes. Blocks... exist. https://github.com/llvm/llvm-project/pull/101585 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [attributes][-Wunsafe-buffer-usage] Support adding unsafe_buffer_usage attribute to struct fields (PR #101585)
@@ -926,22 +926,27 @@ class CArrayToPtrAssignmentGadget : public FixableGadget { /// A call of a function or method that performs unchecked buffer operations /// over one of its pointer parameters. class UnsafeBufferUsageAttrGadget : public WarningGadget { - constexpr static const char *const OpTag = "call_expr"; - const CallExpr *Op; + constexpr static const char *const OpTag = "attr_expr"; + const Expr *Op; public: UnsafeBufferUsageAttrGadget(const MatchFinder::MatchResult &Result) : WarningGadget(Kind::UnsafeBufferUsageAttr), -Op(Result.Nodes.getNodeAs(OpTag)) {} +Op(Result.Nodes.getNodeAs(OpTag)) {} static bool classof(const Gadget *G) { return G->getKind() == Kind::UnsafeBufferUsageAttr; } static Matcher matcher() { +auto HasUnsafeFielDecl = +member(fieldDecl(hasAttr(attr::UnsafeBufferUsage))); + auto HasUnsafeFnDecl = callee(functionDecl(hasAttr(attr::UnsafeBufferUsage))); -return stmt(callExpr(HasUnsafeFnDecl).bind(OpTag)); + +return stmt(expr(anyOf(callExpr(HasUnsafeFnDecl).bind(OpTag), haoNoQ wrote: I suspect that `stmt(expr(...))` is redundant. You probably need at most one of those. https://github.com/llvm/llvm-project/pull/101585 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [attributes][-Wunsafe-buffer-usage] Support adding unsafe_buffer_usage attribute to struct fields (PR #101585)
@@ -6835,6 +6837,31 @@ the proper solution would be to create a different function (possibly an overload of ``baz()``) that accepts a safe container like ``bar()``, and then use the attribute on the original ``baz()`` to help the users update their code to use the new function. + +Attribute attached to fields: + +The attribute should only be attached to struct fields, if the fields can not be +updated to a safe type with bounds check, such as std::span. In other words, the +buffers prone to unsafe accesses should always be updated to use safe containers/views +and attaching the attribute must be last resort when such an update is infeasible. + +The attribute can be placed on individual fields or a set of them as shown below. +.. code-block:: c++ + + struct A { +[[clang::unsafe_buffer_usage]] +int *ptr1; + +[[clang::unsafe_buffer_usage]] +int *ptr2, buf[10]; haoNoQ wrote: Nice, I didn't know it worked this way! https://github.com/llvm/llvm-project/pull/101585 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [attributes][-Wunsafe-buffer-usage] Support adding unsafe_buffer_usage attribute to struct fields (PR #101585)
@@ -6835,6 +6837,31 @@ the proper solution would be to create a different function (possibly an overload of ``baz()``) that accepts a safe container like ``bar()``, and then use the attribute on the original ``baz()`` to help the users update their code to use the new function. + +Attribute attached to fields: + +The attribute should only be attached to struct fields, if the fields can not be +updated to a safe type with bounds check, such as std::span. In other words, the +buffers prone to unsafe accesses should always be updated to use safe containers/views +and attaching the attribute must be last resort when such an update is infeasible. + +The attribute can be placed on individual fields or a set of them as shown below. +.. code-block:: c++ + + struct A { +[[clang::unsafe_buffer_usage]] +int *ptr1; + +[[clang::unsafe_buffer_usage]] +int *ptr2, buf[10]; + +[[clang::unsafe_buffer_usage]] +size_t sz; + }; + +Here, every read/write to the fields ptr1, ptr2, buf and sz will trigger a warning that the +field has been explcitly marked as unsafe due to unsafe-buffer operations. + haoNoQ wrote: I feel the need to add something constructive here, to give people a way to eliminate the warning. > "Together with adding the attribute, it is recommended to provide a safe way > of accessing these structs. Under our assumption that the fields cannot be > edited, or even made private, for compatibility reasons, one possible > solution is to provide safe `span`-based accessor methods to these fields, > then use the attribute to encourage users to use those methods, without > breaking compatibility if they don't: > ``` > struct A { > [[clang::unsafe_buffer_usage]] > int *ptr; > [[clang::unsafe_buffer_usage]] > size_t sz; > > std::span getPtrAsSpan() { > #pragma clang unsafe_buffer_usage begin > return std::span{ptr, sz}; > #pragma clang unsafe_buffer_usage end > } > > void setPtrFromSpan(std::span sp) { > #pragma clang unsafe_buffer_usage begin > ptr = sp.data(); > sz = sp.size(); > #pragma clang unsafe_buffer_usage begin > } > } > ``` > ". https://github.com/llvm/llvm-project/pull/101585 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [attributes][-Wunsafe-buffer-usage] Support adding unsafe_buffer_usage attribute to struct fields (PR #101585)
https://github.com/haoNoQ commented: Ok technical part looks good! https://github.com/llvm/llvm-project/pull/101585 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [attributes][-Wunsafe-buffer-usage] Support adding unsafe_buffer_usage attribute to struct fields (PR #101585)
https://github.com/haoNoQ edited https://github.com/llvm/llvm-project/pull/101585 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [attributes][-Wunsafe-buffer-usage] Support adding unsafe_buffer_usage attribute to struct fields (PR #101585)
https://github.com/haoNoQ edited https://github.com/llvm/llvm-project/pull/101585 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [attributes][-Wunsafe-buffer-usage] Support adding unsafe_buffer_usage attribute to struct fields (PR #101585)
@@ -927,21 +927,28 @@ class CArrayToPtrAssignmentGadget : public FixableGadget { /// over one of its pointer parameters. class UnsafeBufferUsageAttrGadget : public WarningGadget { constexpr static const char *const OpTag = "call_expr"; - const CallExpr *Op; + const Expr *Op; public: UnsafeBufferUsageAttrGadget(const MatchFinder::MatchResult &Result) : WarningGadget(Kind::UnsafeBufferUsageAttr), -Op(Result.Nodes.getNodeAs(OpTag)) {} +Op(Result.Nodes.getNodeAs(OpTag)) {} static bool classof(const Gadget *G) { return G->getKind() == Kind::UnsafeBufferUsageAttr; } static Matcher matcher() { +auto HasUnsafeFielDecl = +member(fieldDecl(allOf( + anyOf(hasPointerType(), hasArrayType()), haoNoQ wrote: > Doesn't the absence of -Wunsafe-buffer-usage warning give more or less the > same signal as the additional warning would provide? > I suggest we don't implement the extra warning. The purpose of implementing the extra warning/error would have been to avoid users writing code that will later be declared unsupported. The user may think "hmm there's a false negative, let me report a bug about the false negative but keep the code". Then they suddenly get an error later. This appears to be a strong tradition with attributes in Clang: the initial patch supports a restricted scope of use cases, contains a warning/error for uses beyond that scope, then later the scope gets expanded to more and more supported use cases later. We're expected to actively ban all uses we don't immediately support. But it sounds like we're taking the "big hammer" approach anyway so yeah the warning is no longer necessary. Every use case is now a supported use case. I'm slightly worried that with the "big hammer" approach we'll lose the opportunity to implement the "small hammer" approach in the future. Eg. we no longer have a way to suppress the warning when a pointer field is only used for the purposes of comparing to null, or when the size field is only read but never written. Like, we could implement that, but that would result in a "confusing hammer" that can be either big or small depending on obscure details of circumstantial evidence. So if we ultimately realize we need a smaller hammer - which is quite likely - we'll probably need to build a separate attribute. But I think I'm ok with that. This is in line with the rest of our work: we're mostly in the business of shipping big hammers. https://github.com/llvm/llvm-project/pull/101585 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [attributes][-Wunsafe-buffer-usage] Support adding unsafe_buffer_usage attribute to struct fields (PR #101585)
@@ -2261,6 +2262,12 @@ class UnsafeBufferUsageReporter : public UnsafeBufferUsageHandler { // note_unsafe_buffer_operation doesn't have this mode yet. assert(!IsRelatedToDecl && "Not implemented yet!"); MsgParam = 3; + } else if (isa(Operation)) { +// note_unsafe_buffer_operation doesn't have this mode yet. +assert(!IsRelatedToDecl && "Not implemented yet!"); +auto ME = dyn_cast(Operation); +name = ME->getMemberDecl()->getName(); haoNoQ wrote: > Perhaps that needs to be a separate PR? It shouldn't block this PR but it'd indeed be very nice to have either way. https://github.com/llvm/llvm-project/pull/101585 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] Warning Libc functions (PR #101583)
https://github.com/haoNoQ edited https://github.com/llvm/llvm-project/pull/101583 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] Warning Libc functions (PR #101583)
https://github.com/haoNoQ commented: Ooo that's a lot of functions! First round of comments, will try to look at the next commit tomorrow. https://github.com/llvm/llvm-project/pull/101583 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [attributes][-Wunsafe-buffer-usage] Support adding unsafe_buffer_usage attribute to struct fields (PR #101585)
@@ -2261,6 +2262,12 @@ class UnsafeBufferUsageReporter : public UnsafeBufferUsageHandler { // note_unsafe_buffer_operation doesn't have this mode yet. assert(!IsRelatedToDecl && "Not implemented yet!"); MsgParam = 3; + } else if (isa(Operation)) { +// note_unsafe_buffer_operation doesn't have this mode yet. +assert(!IsRelatedToDecl && "Not implemented yet!"); +auto ME = dyn_cast(Operation); +name = ME->getMemberDecl()->getName(); haoNoQ wrote: I suspect that this one will crash in case of anonymous fields. That said, I also don't think it's even possible to put an attribute on an anonymous field, so we're probably fine. However, with diagnostics specifically, the tradition isn't to put `name` into the `Diag()` stream; the tradition is to put the decl itself, directly, by pointer into the Diag() stream! I.e. ```c++ const Decl *D = ME->getMemberDecl(); ... S.Diag(Loc, diag::warn_unsafe_buffer_operation) << MsgParam << D << Range; ``` This way not only it reacts to anonymous objects correctly (I think), but also it automatically puts quotes around the name, which is something we currently forget to do :) https://github.com/llvm/llvm-project/pull/101585 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [attributes][-Wunsafe-buffer-usage] Support adding unsafe_buffer_usage attribute to struct fields (PR #101585)
@@ -959,12 +966,12 @@ class UnsafeBufferUsageAttrGadget : public WarningGadget { /// perform buffer operations that depend on the correctness of the parameters. class UnsafeBufferUsageCtorAttrGadget : public WarningGadget { constexpr static const char *const OpTag = "cxx_construct_expr"; - const CXXConstructExpr *Op; + const Expr *Op; haoNoQ wrote: Is this change still necessary? I think it was included by accident. https://github.com/llvm/llvm-project/pull/101585 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [attributes][-Wunsafe-buffer-usage] Support adding unsafe_buffer_usage attribute to struct fields (PR #101585)
https://github.com/haoNoQ edited https://github.com/llvm/llvm-project/pull/101585 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [attributes][-Wunsafe-buffer-usage] Support adding unsafe_buffer_usage attribute to struct fields (PR #101585)
https://github.com/haoNoQ approved this pull request. Aha great, this is exactly how I imagined it would look like! We might need some more boilerplate though. +Erich for attributes. https://github.com/llvm/llvm-project/pull/101585 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [attributes][-Wunsafe-buffer-usage] Support adding unsafe_buffer_usage attribute to struct fields (PR #101585)
https://github.com/haoNoQ edited https://github.com/llvm/llvm-project/pull/101585 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits