[clang] Warning for unsafe invocation of span::data (PR #75650)
https://github.com/ziqingluo-90 approved this pull request. LGTM! https://github.com/llvm/llvm-project/pull/75650 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] Warning for unsafe invocation of span::data (PR #75650)
https://github.com/malavikasamak updated https://github.com/llvm/llvm-project/pull/75650 >From 809bf6f4237f634feaeb7e5b0b88be3a2e4de455 Mon Sep 17 00:00:00 2001 From: MalavikaSamak Date: Fri, 15 Dec 2023 11:40:55 -0800 Subject: [PATCH 1/3] While refactoring projects to eradicate unsafe buffer accesses using -Wunsafe-buffer-usage, there maybe accidental re-introduction of new OutOfBound accesses into the code bases. One such case is invoking span::data() method on a span variable to retrieve a pointer, which is then cast to a larger type and dereferenced. Such dereferences can introduce OutOfBound accesses. To address this, a new WarningGadget is being introduced to warn against such invocations. --- .../Analysis/Analyses/UnsafeBufferUsage.h | 2 +- .../Analyses/UnsafeBufferUsageGadgets.def | 1 + .../clang/Basic/DiagnosticSemaKinds.td| 2 +- clang/lib/Analysis/UnsafeBufferUsage.cpp | 37 - clang/lib/Sema/AnalysisBasedWarnings.cpp | 19 ++- ...e-buffer-usage-warning-data-invocation.cpp | 133 ++ 6 files changed, 186 insertions(+), 8 deletions(-) create mode 100644 clang/test/SemaCXX/warn-unsafe-buffer-usage-warning-data-invocation.cpp diff --git a/clang/include/clang/Analysis/Analyses/UnsafeBufferUsage.h b/clang/include/clang/Analysis/Analyses/UnsafeBufferUsage.h index 8a2d56668e32f9..b28f2c6b99c50e 100644 --- a/clang/include/clang/Analysis/Analyses/UnsafeBufferUsage.h +++ b/clang/include/clang/Analysis/Analyses/UnsafeBufferUsage.h @@ -66,7 +66,7 @@ class UnsafeBufferUsageHandler { /// Invoked when an unsafe operation over raw pointers is found. virtual void handleUnsafeOperation(const Stmt *Operation, - bool IsRelatedToDecl) = 0; + bool IsRelatedToDecl, ASTContext ) = 0; /// Invoked when a fix is suggested against a variable. This function groups /// all variables that must be fixed together (i.e their types must be changed diff --git a/clang/include/clang/Analysis/Analyses/UnsafeBufferUsageGadgets.def b/clang/include/clang/Analysis/Analyses/UnsafeBufferUsageGadgets.def index 757ee452ced748..c9766168836510 100644 --- a/clang/include/clang/Analysis/Analyses/UnsafeBufferUsageGadgets.def +++ b/clang/include/clang/Analysis/Analyses/UnsafeBufferUsageGadgets.def @@ -30,6 +30,7 @@ WARNING_GADGET(Decrement) WARNING_GADGET(ArraySubscript) WARNING_GADGET(PointerArithmetic) WARNING_GADGET(UnsafeBufferUsageAttr) +WARNING_GADGET(DataInvocation) FIXABLE_GADGET(ULCArraySubscript) // `DRE[any]` in an Unspecified Lvalue Context FIXABLE_GADGET(DerefSimplePtrArithFixable) FIXABLE_GADGET(PointerDereference) diff --git a/clang/include/clang/Basic/DiagnosticSemaKinds.td b/clang/include/clang/Basic/DiagnosticSemaKinds.td index 94e97a891baedc..9038dd879f54ae 100644 --- a/clang/include/clang/Basic/DiagnosticSemaKinds.td +++ b/clang/include/clang/Basic/DiagnosticSemaKinds.td @@ -12058,7 +12058,7 @@ def warn_unsafe_buffer_variable : Warning< InGroup, DefaultIgnore; def warn_unsafe_buffer_operation : Warning< "%select{unsafe pointer operation|unsafe pointer arithmetic|" - "unsafe buffer access|function introduces unsafe buffer manipulation}0">, + "unsafe buffer access|function introduces unsafe buffer manipulation|unsafe invocation of span::data}0">, InGroup, DefaultIgnore; def note_unsafe_buffer_operation : Note< "used%select{| in pointer arithmetic| in buffer access}0 here">; diff --git a/clang/lib/Analysis/UnsafeBufferUsage.cpp b/clang/lib/Analysis/UnsafeBufferUsage.cpp index 70eec1cee57f8e..e4eca9939b10d5 100644 --- a/clang/lib/Analysis/UnsafeBufferUsage.cpp +++ b/clang/lib/Analysis/UnsafeBufferUsage.cpp @@ -721,6 +721,33 @@ class UnsafeBufferUsageAttrGadget : public WarningGadget { DeclUseList getClaimedVarUseSites() const override { return {}; } }; +// Warning gadget for unsafe invocation of span::data method. +// Triggers when the pointer returned by the invocation is immediately +// cast to a larger type. + +class DataInvocationGadget : public WarningGadget { + constexpr static const char *const OpTag = "data_invocation_expr"; + const ExplicitCastExpr *Op; + + public: + DataInvocationGadget(const MatchFinder::MatchResult ) + : WarningGadget(Kind::DataInvocation), +Op(Result.Nodes.getNodeAs(OpTag)) {} + + static bool classof(const Gadget *G) { +return G->getKind() == Kind::DataInvocation; + } + + static Matcher matcher() { +return stmt( +explicitCastExpr(has(cxxMemberCallExpr(callee( + cxxMethodDecl(hasName("data")).bind(OpTag)); + } + const Stmt *getBaseStmt() const override { return Op; } + + DeclUseList getClaimedVarUseSites() const override { return {}; } +}; + // Represents expressions of the form `DRE[*]` in the Unspecified Lvalue // Context (see `isInUnspecifiedLvalueContext`). // Note here `[]` is the built-in subscript operator. @@ -2657,8 +2684,8 @@ void
[clang] Warning for unsafe invocation of span::data (PR #75650)
https://github.com/t-rasmud approved this pull request. LGTM! https://github.com/llvm/llvm-project/pull/75650 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] Warning for unsafe invocation of span::data (PR #75650)
https://github.com/malavikasamak updated https://github.com/llvm/llvm-project/pull/75650 >From 809bf6f4237f634feaeb7e5b0b88be3a2e4de455 Mon Sep 17 00:00:00 2001 From: MalavikaSamak Date: Fri, 15 Dec 2023 11:40:55 -0800 Subject: [PATCH 1/2] While refactoring projects to eradicate unsafe buffer accesses using -Wunsafe-buffer-usage, there maybe accidental re-introduction of new OutOfBound accesses into the code bases. One such case is invoking span::data() method on a span variable to retrieve a pointer, which is then cast to a larger type and dereferenced. Such dereferences can introduce OutOfBound accesses. To address this, a new WarningGadget is being introduced to warn against such invocations. --- .../Analysis/Analyses/UnsafeBufferUsage.h | 2 +- .../Analyses/UnsafeBufferUsageGadgets.def | 1 + .../clang/Basic/DiagnosticSemaKinds.td| 2 +- clang/lib/Analysis/UnsafeBufferUsage.cpp | 37 - clang/lib/Sema/AnalysisBasedWarnings.cpp | 19 ++- ...e-buffer-usage-warning-data-invocation.cpp | 133 ++ 6 files changed, 186 insertions(+), 8 deletions(-) create mode 100644 clang/test/SemaCXX/warn-unsafe-buffer-usage-warning-data-invocation.cpp diff --git a/clang/include/clang/Analysis/Analyses/UnsafeBufferUsage.h b/clang/include/clang/Analysis/Analyses/UnsafeBufferUsage.h index 8a2d56668e32f9..b28f2c6b99c50e 100644 --- a/clang/include/clang/Analysis/Analyses/UnsafeBufferUsage.h +++ b/clang/include/clang/Analysis/Analyses/UnsafeBufferUsage.h @@ -66,7 +66,7 @@ class UnsafeBufferUsageHandler { /// Invoked when an unsafe operation over raw pointers is found. virtual void handleUnsafeOperation(const Stmt *Operation, - bool IsRelatedToDecl) = 0; + bool IsRelatedToDecl, ASTContext ) = 0; /// Invoked when a fix is suggested against a variable. This function groups /// all variables that must be fixed together (i.e their types must be changed diff --git a/clang/include/clang/Analysis/Analyses/UnsafeBufferUsageGadgets.def b/clang/include/clang/Analysis/Analyses/UnsafeBufferUsageGadgets.def index 757ee452ced748..c9766168836510 100644 --- a/clang/include/clang/Analysis/Analyses/UnsafeBufferUsageGadgets.def +++ b/clang/include/clang/Analysis/Analyses/UnsafeBufferUsageGadgets.def @@ -30,6 +30,7 @@ WARNING_GADGET(Decrement) WARNING_GADGET(ArraySubscript) WARNING_GADGET(PointerArithmetic) WARNING_GADGET(UnsafeBufferUsageAttr) +WARNING_GADGET(DataInvocation) FIXABLE_GADGET(ULCArraySubscript) // `DRE[any]` in an Unspecified Lvalue Context FIXABLE_GADGET(DerefSimplePtrArithFixable) FIXABLE_GADGET(PointerDereference) diff --git a/clang/include/clang/Basic/DiagnosticSemaKinds.td b/clang/include/clang/Basic/DiagnosticSemaKinds.td index 94e97a891baedc..9038dd879f54ae 100644 --- a/clang/include/clang/Basic/DiagnosticSemaKinds.td +++ b/clang/include/clang/Basic/DiagnosticSemaKinds.td @@ -12058,7 +12058,7 @@ def warn_unsafe_buffer_variable : Warning< InGroup, DefaultIgnore; def warn_unsafe_buffer_operation : Warning< "%select{unsafe pointer operation|unsafe pointer arithmetic|" - "unsafe buffer access|function introduces unsafe buffer manipulation}0">, + "unsafe buffer access|function introduces unsafe buffer manipulation|unsafe invocation of span::data}0">, InGroup, DefaultIgnore; def note_unsafe_buffer_operation : Note< "used%select{| in pointer arithmetic| in buffer access}0 here">; diff --git a/clang/lib/Analysis/UnsafeBufferUsage.cpp b/clang/lib/Analysis/UnsafeBufferUsage.cpp index 70eec1cee57f8e..e4eca9939b10d5 100644 --- a/clang/lib/Analysis/UnsafeBufferUsage.cpp +++ b/clang/lib/Analysis/UnsafeBufferUsage.cpp @@ -721,6 +721,33 @@ class UnsafeBufferUsageAttrGadget : public WarningGadget { DeclUseList getClaimedVarUseSites() const override { return {}; } }; +// Warning gadget for unsafe invocation of span::data method. +// Triggers when the pointer returned by the invocation is immediately +// cast to a larger type. + +class DataInvocationGadget : public WarningGadget { + constexpr static const char *const OpTag = "data_invocation_expr"; + const ExplicitCastExpr *Op; + + public: + DataInvocationGadget(const MatchFinder::MatchResult ) + : WarningGadget(Kind::DataInvocation), +Op(Result.Nodes.getNodeAs(OpTag)) {} + + static bool classof(const Gadget *G) { +return G->getKind() == Kind::DataInvocation; + } + + static Matcher matcher() { +return stmt( +explicitCastExpr(has(cxxMemberCallExpr(callee( + cxxMethodDecl(hasName("data")).bind(OpTag)); + } + const Stmt *getBaseStmt() const override { return Op; } + + DeclUseList getClaimedVarUseSites() const override { return {}; } +}; + // Represents expressions of the form `DRE[*]` in the Unspecified Lvalue // Context (see `isInUnspecifiedLvalueContext`). // Note here `[]` is the built-in subscript operator. @@ -2657,8 +2684,8 @@ void
[clang] Warning for unsafe invocation of span::data (PR #75650)
https://github.com/malavikasamak edited https://github.com/llvm/llvm-project/pull/75650 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] Warning for unsafe invocation of span::data (PR #75650)
@@ -721,6 +721,33 @@ class UnsafeBufferUsageAttrGadget : public WarningGadget { DeclUseList getClaimedVarUseSites() const override { return {}; } }; +// Warning gadget for unsafe invocation of span::data method. +// Triggers when the pointer returned by the invocation is immediately +// cast to a larger type. + +class DataInvocationGadget : public WarningGadget { + constexpr static const char *const OpTag = "data_invocation_expr"; malavikasamak wrote: Thank you. Will fix this. https://github.com/llvm/llvm-project/pull/75650 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] Warning for unsafe invocation of span::data (PR #75650)
@@ -2261,6 +2261,21 @@ class UnsafeBufferUsageReporter : public UnsafeBufferUsageHandler { // note_unsafe_buffer_operation doesn't have this mode yet. assert(!IsRelatedToDecl && "Not implemented yet!"); MsgParam = 3; + } else if (const auto *ECE = dyn_cast(Operation)) { +QualType destType = ECE->getType(); +const uint64_t dSize = Ctx.getTypeSize(destType.getTypePtr()->getPointeeType()); +if(const auto *CE =dyn_cast(ECE->getSubExpr())) { + + if(CE->getRecordDecl()->getQualifiedNameAsString().compare("std::span")) malavikasamak wrote: Hmmm. I think it shouldn't be an issue as we are getting the qualified name. However, I will add a test case to check this. https://github.com/llvm/llvm-project/pull/75650 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] Warning for unsafe invocation of span::data (PR #75650)
https://github.com/malavikasamak edited https://github.com/llvm/llvm-project/pull/75650 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] Warning for unsafe invocation of span::data (PR #75650)
@@ -721,6 +721,33 @@ class UnsafeBufferUsageAttrGadget : public WarningGadget { DeclUseList getClaimedVarUseSites() const override { return {}; } }; +// Warning gadget for unsafe invocation of span::data method. +// Triggers when the pointer returned by the invocation is immediately +// cast to a larger type. + +class DataInvocationGadget : public WarningGadget { + constexpr static const char *const OpTag = "data_invocation_expr"; + const ExplicitCastExpr *Op; + + public: + DataInvocationGadget(const MatchFinder::MatchResult ) + : WarningGadget(Kind::DataInvocation), +Op(Result.Nodes.getNodeAs(OpTag)) {} + + static bool classof(const Gadget *G) { +return G->getKind() == Kind::DataInvocation; + } + + static Matcher matcher() { +return stmt( +explicitCastExpr(has(cxxMemberCallExpr(callee( + cxxMethodDecl(hasName("data")).bind(OpTag)); malavikasamak wrote: Yes. It matches all method invocations with name "data". However, we wouldn't be warning on all of them as the "handleUnsafeOperation" checks if the method belongs to std::span class. It maybe better to move it to the matcher as you point out, so there is less overlap between gadgets. https://github.com/llvm/llvm-project/pull/75650 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] Warning for unsafe invocation of span::data (PR #75650)
@@ -2261,6 +2261,21 @@ class UnsafeBufferUsageReporter : public UnsafeBufferUsageHandler { // note_unsafe_buffer_operation doesn't have this mode yet. assert(!IsRelatedToDecl && "Not implemented yet!"); MsgParam = 3; + } else if (const auto *ECE = dyn_cast(Operation)) { +QualType destType = ECE->getType(); +const uint64_t dSize = Ctx.getTypeSize(destType.getTypePtr()->getPointeeType()); +if(const auto *CE =dyn_cast(ECE->getSubExpr())) { + + if(CE->getRecordDecl()->getQualifiedNameAsString().compare("std::span")) ziqingluo-90 wrote: I'm a tiny bit skeptical that if the qualified name is always `std::span`. Maybe we can have a test like this: ``` using namespace std; void f(span buf) { BigTy *p = (BigTy *) buf.data(); } ``` https://github.com/llvm/llvm-project/pull/75650 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] Warning for unsafe invocation of span::data (PR #75650)
@@ -721,6 +721,33 @@ class UnsafeBufferUsageAttrGadget : public WarningGadget { DeclUseList getClaimedVarUseSites() const override { return {}; } }; +// Warning gadget for unsafe invocation of span::data method. +// Triggers when the pointer returned by the invocation is immediately +// cast to a larger type. + +class DataInvocationGadget : public WarningGadget { + constexpr static const char *const OpTag = "data_invocation_expr"; ziqingluo-90 wrote: a nitpick: I noticed most tag strings use camel cases. We better be consistent on this? https://github.com/llvm/llvm-project/pull/75650 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] Warning for unsafe invocation of span::data (PR #75650)
https://github.com/t-rasmud requested changes to this pull request. https://github.com/llvm/llvm-project/pull/75650 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] Warning for unsafe invocation of span::data (PR #75650)
https://github.com/t-rasmud edited https://github.com/llvm/llvm-project/pull/75650 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] Warning for unsafe invocation of span::data (PR #75650)
@@ -721,6 +721,33 @@ class UnsafeBufferUsageAttrGadget : public WarningGadget { DeclUseList getClaimedVarUseSites() const override { return {}; } }; +// Warning gadget for unsafe invocation of span::data method. +// Triggers when the pointer returned by the invocation is immediately +// cast to a larger type. + +class DataInvocationGadget : public WarningGadget { + constexpr static const char *const OpTag = "data_invocation_expr"; + const ExplicitCastExpr *Op; + + public: + DataInvocationGadget(const MatchFinder::MatchResult ) + : WarningGadget(Kind::DataInvocation), +Op(Result.Nodes.getNodeAs(OpTag)) {} + + static bool classof(const Gadget *G) { +return G->getKind() == Kind::DataInvocation; + } + + static Matcher matcher() { +return stmt( +explicitCastExpr(has(cxxMemberCallExpr(callee( + cxxMethodDecl(hasName("data")).bind(OpTag)); t-rasmud wrote: Will this also match on user defined functions called "data"? I think something like `cxxMethodDecl(ofClass(hasName("span")))` might be needed to match `span.data()` alone (but I might be wrong). In any case, maybe have a test case with a user defined function called "data" which'll show the matcher matches only on `span.data()`. https://github.com/llvm/llvm-project/pull/75650 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] Warning for unsafe invocation of span::data (PR #75650)
https://github.com/malavikasamak edited https://github.com/llvm/llvm-project/pull/75650 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits