[clang] [Clang] Implement the 'counted_by' attribute (PR #76348)

2024-01-08 Thread Nick Desaulniers via cfe-commits


@@ -365,6 +365,10 @@ class DefaultFilterCCC final : public 
CorrectionCandidateCallback {
 template 
 class DeclFilterCCC final : public CorrectionCandidateCallback {
 public:
+  explicit DeclFilterCCC(const IdentifierInfo *Typo = nullptr,
+ NestedNameSpecifier *TypoNNS = nullptr)

nickdesaulniers wrote:

I don't think you need `explicit` if the constructor accepts more than 1 
argument.

https://github.com/llvm/llvm-project/pull/76348
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Clang] Implement the 'counted_by' attribute (PR #76348)

2024-01-08 Thread Nick Desaulniers via cfe-commits


@@ -818,6 +819,189 @@ CodeGenFunction::evaluateOrEmitBuiltinObjectSize(const 
Expr *E, unsigned Type,
   return ConstantInt::get(ResType, ObjectSize, /*isSigned=*/true);
 }
 
+const FieldDecl *CodeGenFunction::FindFlexibleArrayMemberField(
+ASTContext &Ctx, const RecordDecl *RD, uint64_t &Offset) {
+  const LangOptions::StrictFlexArraysLevelKind StrictFlexArraysLevel =
+  getLangOpts().getStrictFlexArraysLevel();
+  unsigned FieldNo = 0;
+
+  for (const Decl *D : RD->decls()) {
+if (const auto *Field = dyn_cast(D);
+Field && Decl::isFlexibleArrayMemberLike(
+ Ctx, Field, Field->getType(), StrictFlexArraysLevel,
+ /*IgnoreTemplateOrMacroSubstitution=*/true)) {
+  const ASTRecordLayout &Layout = Ctx.getASTRecordLayout(RD);
+  Offset += Layout.getFieldOffset(FieldNo);
+  return Field;
+}
+
+if (const auto *Record = dyn_cast(D))
+  if (const FieldDecl *Field =
+  FindFlexibleArrayMemberField(Ctx, Record, Offset)) {
+const ASTRecordLayout &Layout = Ctx.getASTRecordLayout(RD);
+Offset += Layout.getFieldOffset(FieldNo);
+return Field;
+  }
+
+if (isa(D))
+  ++FieldNo;
+  }
+
+  return nullptr;
+}

nickdesaulniers wrote:

`FindFieldInTopLevelOrAnonymousStruct` looks somewhat similar.  Any chance to 
reuse code more between the two?

https://github.com/llvm/llvm-project/pull/76348
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Clang] Implement the 'counted_by' attribute (PR #76348)

2024-01-08 Thread Nick Desaulniers via cfe-commits


@@ -850,6 +1034,13 @@ CodeGenFunction::emitBuiltinObjectSize(const Expr *E, 
unsigned Type,
 }
   }
 
+  if (IsDynamic) {
+// Emit special code for a flexible array member with the "counted_by"
+// attribute.
+if (Value *V = emitFlexibleArrayMemberSize(E, Type, ResType))
+  return V;

nickdesaulniers wrote:

Should `emitFlexibleArrayMemberSize` be made infallible (via asserts rather 
than early returns of `nullptr`)?

https://github.com/llvm/llvm-project/pull/76348
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Clang] Implement the 'counted_by' attribute (PR #76348)

2024-01-08 Thread Bill Wendling via cfe-commits


@@ -365,6 +365,10 @@ class DefaultFilterCCC final : public 
CorrectionCandidateCallback {
 template 
 class DeclFilterCCC final : public CorrectionCandidateCallback {
 public:
+  explicit DeclFilterCCC(const IdentifierInfo *Typo = nullptr,
+ NestedNameSpecifier *TypoNNS = nullptr)

bwendling wrote:

I was copying off of the other `CCC` classes. Perhaps it's because both 
arguments have default values?

Segfault seems to agree 
(https://stackoverflow.com/questions/4467142/why-is-explicit-allowed-for-default-constructors-and-constructors-with-2-or-more):

> One reason where it's needed is, if you have default arguments for the first 
> parameter. The constructor becomes a default constructor, but can still be 
> used as converting constructor

```
struct A {
  explicit A(int = 0); // added it to a default constructor
};
```


https://github.com/llvm/llvm-project/pull/76348
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Clang] Implement the 'counted_by' attribute (PR #76348)

2024-01-08 Thread Bill Wendling via cfe-commits


@@ -818,6 +819,189 @@ CodeGenFunction::evaluateOrEmitBuiltinObjectSize(const 
Expr *E, unsigned Type,
   return ConstantInt::get(ResType, ObjectSize, /*isSigned=*/true);
 }
 
+const FieldDecl *CodeGenFunction::FindFlexibleArrayMemberField(
+ASTContext &Ctx, const RecordDecl *RD, uint64_t &Offset) {
+  const LangOptions::StrictFlexArraysLevelKind StrictFlexArraysLevel =
+  getLangOpts().getStrictFlexArraysLevel();
+  unsigned FieldNo = 0;
+
+  for (const Decl *D : RD->decls()) {
+if (const auto *Field = dyn_cast(D);
+Field && Decl::isFlexibleArrayMemberLike(
+ Ctx, Field, Field->getType(), StrictFlexArraysLevel,
+ /*IgnoreTemplateOrMacroSubstitution=*/true)) {
+  const ASTRecordLayout &Layout = Ctx.getASTRecordLayout(RD);
+  Offset += Layout.getFieldOffset(FieldNo);
+  return Field;
+}
+
+if (const auto *Record = dyn_cast(D))
+  if (const FieldDecl *Field =
+  FindFlexibleArrayMemberField(Ctx, Record, Offset)) {
+const ASTRecordLayout &Layout = Ctx.getASTRecordLayout(RD);
+Offset += Layout.getFieldOffset(FieldNo);
+return Field;
+  }
+
+if (isa(D))
+  ++FieldNo;
+  }
+
+  return nullptr;
+}

bwendling wrote:

I thought about that, and they're just differently enough to not be too 
compatible. The best(?) we could do would be to have a generic "deep" iterator 
through all of the fields in a struct. (As it is, the iterators are only for 
the surface level.)

In fact, I think there should be methods added to `ASTRecordLayout` / 
`CGRecordLayout` to iterate through more than just the first-level decls. I'd 
like to do that as a follow-up commit though.

https://github.com/llvm/llvm-project/pull/76348
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Clang] Implement the 'counted_by' attribute (PR #76348)

2024-01-08 Thread Bill Wendling via cfe-commits


@@ -850,6 +1034,13 @@ CodeGenFunction::emitBuiltinObjectSize(const Expr *E, 
unsigned Type,
 }
   }
 
+  if (IsDynamic) {
+// Emit special code for a flexible array member with the "counted_by"
+// attribute.
+if (Value *V = emitFlexibleArrayMemberSize(E, Type, ResType))
+  return V;

bwendling wrote:

Over the weekend, I thought about emitting a `-1 / 0` in the cases where we 
weren't able to generate the BDOS calculation. Thoughts?

https://github.com/llvm/llvm-project/pull/76348
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Clang] Implement the 'counted_by' attribute (PR #76348)

2024-01-08 Thread Yeoul Na via cfe-commits

https://github.com/rapidsna edited 
https://github.com/llvm/llvm-project/pull/76348
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Clang] Implement the 'counted_by' attribute (PR #76348)

2024-01-08 Thread Yeoul Na via cfe-commits


@@ -1155,15 +1159,14 @@ const FieldDecl 
*CodeGenFunction::FindCountedByField(const FieldDecl *FD) {
 return nullptr;
 
   auto GetNonAnonStructOrUnion = [](const RecordDecl *RD) {
-while (RD && !RD->getDeclName())
-  if (const auto *R = dyn_cast(RD->getDeclContext()))
-RD = R;
-  else
+while (RD && RD->isAnonymousStructOrUnion()) {
+  const auto *R = dyn_cast(RD->getDeclContext());
+  if (!R)

rapidsna wrote:

Can `!R` condition be actually reachable? IOW, can an anonymous struct or union 
has a parent that is not a RecordDecl?

https://github.com/llvm/llvm-project/pull/76348
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Clang] Implement the 'counted_by' attribute (PR #76348)

2024-01-08 Thread Yeoul Na via cfe-commits


@@ -818,6 +819,189 @@ CodeGenFunction::evaluateOrEmitBuiltinObjectSize(const 
Expr *E, unsigned Type,
   return ConstantInt::get(ResType, ObjectSize, /*isSigned=*/true);
 }
 
+const FieldDecl *CodeGenFunction::FindFlexibleArrayMemberField(
+ASTContext &Ctx, const RecordDecl *RD, uint64_t &Offset) {
+  const LangOptions::StrictFlexArraysLevelKind StrictFlexArraysLevel =
+  getLangOpts().getStrictFlexArraysLevel();
+  unsigned FieldNo = 0;
+
+  for (const Decl *D : RD->decls()) {
+if (const auto *Field = dyn_cast(D);
+Field && Decl::isFlexibleArrayMemberLike(
+ Ctx, Field, Field->getType(), StrictFlexArraysLevel,
+ /*IgnoreTemplateOrMacroSubstitution=*/true)) {
+  const ASTRecordLayout &Layout = Ctx.getASTRecordLayout(RD);
+  Offset += Layout.getFieldOffset(FieldNo);
+  return Field;
+}
+
+if (const auto *Record = dyn_cast(D))
+  if (const FieldDecl *Field =
+  FindFlexibleArrayMemberField(Ctx, Record, Offset)) {
+const ASTRecordLayout &Layout = Ctx.getASTRecordLayout(RD);
+Offset += Layout.getFieldOffset(FieldNo);
+return Field;
+  }
+
+if (isa(D))
+  ++FieldNo;
+  }
+
+  return nullptr;
+}
+
+llvm::Value *
+CodeGenFunction::emitFlexibleArrayMemberSize(const Expr *E, unsigned Type,
+ llvm::IntegerType *ResType) {
+  // The code generated here calculates the size of a struct with a flexible
+  // array member that uses the counted_by attribute. There are two instances
+  // we handle:
+  //
+  //   struct s {
+  // unsigned long flags;
+  // int count;
+  // int array[] __attribute__((counted_by(count)));
+  //   }
+  //
+  //   1) bdos of the flexible array itself:
+  //
+  // __builtin_dynamic_object_size(p->array, 1) ==
+  // p->count * sizeof(*p->array)
+  //
+  //   2) bdos of a pointer into the flexible array:
+  //
+  // __builtin_dynamic_object_size(&p->array[42], 1) ==
+  // (p->count - 42) * sizeof(*p->array)
+  //
+  //   2) bdos of the whole struct, including the flexible array:
+  //
+  // __builtin_dynamic_object_size(p, 1) ==
+  //max(sizeof(struct s),
+  //offsetof(struct s, array) + p->count * sizeof(*p->array))
+  //
+  ASTContext &Ctx = getContext();
+  const Expr *Base = E->IgnoreParenImpCasts();
+  const Expr *Idx = nullptr;
+
+  if (const auto *UO = dyn_cast(Base);
+  UO && UO->getOpcode() == UO_AddrOf) {
+Expr *SubExpr = UO->getSubExpr()->IgnoreParenImpCasts();
+if (const auto *ASE = dyn_cast(SubExpr)) {
+  Base = ASE->getBase()->IgnoreParenImpCasts();
+  Idx = ASE->getIdx()->IgnoreParenImpCasts();
+
+  if (const auto *IL = dyn_cast(Idx)) {
+int64_t Val = IL->getValue().getSExtValue();
+if (Val < 0)
+  // __bdos returns 0 for negative indexes into an array in a struct.
+  return getDefaultBuiltinObjectSizeResult(Type, ResType);
+
+if (Val == 0)
+  // The index is 0, so we don't need to take it into account.
+  Idx = nullptr;
+  }
+} else {
+  // Potential pointer to another element in the struct.
+  Base = SubExpr;
+}
+  }
+
+  // Get the flexible array member Decl.
+  const RecordDecl *OuterRD = nullptr;
+  if (const auto *ME = dyn_cast(Base)) {
+// Check if \p Base is referencing the FAM itself.
+const ValueDecl *VD = ME->getMemberDecl();
+OuterRD = VD->getDeclContext()->getOuterLexicalRecordContext();
+  } else if (const auto *DRE = dyn_cast(Base)) {
+// Check if we're pointing to the whole struct.
+QualType Ty = DRE->getDecl()->getType();
+if (Ty->isPointerType())
+  Ty = Ty->getPointeeType();
+OuterRD = Ty->getAsRecordDecl();
+  }
+
+  if (!OuterRD)
+return nullptr;
+
+  uint64_t Offset = 0;
+  const FieldDecl *FAMDecl = FindFlexibleArrayMemberField(Ctx, OuterRD, 
Offset);
+  Offset = Ctx.toCharUnitsFromBits(Offset).getQuantity();
+
+  if (!FAMDecl || !FAMDecl->hasAttr())
+// No flexible array member found or it doesn't have the "counted_by"
+// attribute.
+return nullptr;
+
+  const FieldDecl *CountedByFD = FindCountedByField(FAMDecl);
+  if (!CountedByFD)
+// Can't find the field referenced by the "counted_by" attribute.
+return nullptr;
+
+  // Build a load of the counted_by field.
+  bool IsSigned = CountedByFD->getType()->isSignedIntegerType();
+  Value *CountedByInst = EmitCountedByFieldExpr(Base, FAMDecl, CountedByFD);
+  if (!CountedByInst)
+return nullptr;
+
+  llvm::Type *CountedByTy = CountedByInst->getType();
+
+  // Build a load of the index and subtract it from the count.
+  Value *IdxInst = nullptr;
+  if (Idx) {
+bool IdxSigned = Idx->getType()->isSignedIntegerType();
+IdxInst = EmitAnyExprToTemp(Idx).getScalarVal();
+IdxInst = IdxSigned ? Builder.CreateSExtOrTrunc(IdxInst, CountedByTy)
+: Builder.Create

[clang] [Clang] Implement the 'counted_by' attribute (PR #76348)

2024-01-08 Thread Yeoul Na via cfe-commits


@@ -944,22 +951,259 @@ static llvm::Value 
*getArrayIndexingBound(CodeGenFunction &CGF,
   return nullptr;
 }
 
+namespace {
+
+/// \p StructAccessBase returns the base \p Expr of a field access. It returns
+/// either a \p DeclRefExpr, representing the base pointer to the struct, i.e.:
+///
+/// p in p-> a.b.c
+///
+/// or a \p MemberExpr, if the \p MemberExpr has the \p RecordDecl we're
+/// looking for:
+///
+/// struct s {
+///   struct s *ptr;
+///   int count;
+///   char array[] __attribute__((counted_by(count)));
+/// };
+///
+/// If we have an expression like \p p->ptr->array[index], we want the
+/// \p MemberExpr for \p p->ptr instead of \p p.
+class StructAccessBase
+: public ConstStmtVisitor {
+  const RecordDecl *ExpectedRD;
+
+  bool IsExpectedRecordDecl(const Expr *E) const {
+QualType Ty = E->getType();
+if (Ty->isPointerType())
+  Ty = Ty->getPointeeType();
+return ExpectedRD == Ty->getAsRecordDecl();
+  }
+
+public:
+  StructAccessBase(const RecordDecl *ExpectedRD) : ExpectedRD(ExpectedRD) {}
+
+  
//======//
+  //Visitor Methods
+  
//======//
+
+  // NOTE: If we build C++ support for counted_by, then we'll have to handle
+  // horrors like this:
+  //
+  // struct S {
+  //   int x, y;
+  //   int blah[] __attribute__((counted_by(x)));
+  // } s;
+  //
+  // int foo(int index, int val) {
+  //   int (S::*IHatePMDs)[] = &S::blah;
+  //   (s.*IHatePMDs)[index] = val;
+  // }
+
+  const Expr *Visit(const Expr *E) {
+return ConstStmtVisitor::Visit(E);
+  }
+
+  const Expr *VisitStmt(const Stmt *S) { return nullptr; }
+
+  // These are the types we expect to return (in order of most to least
+  // likely):
+  //
+  //   1. DeclRefExpr - This is the expression for the base of the structure.
+  //  It's exactly what we want to build an access to the \p counted_by
+  //  field.
+  //   2. MemberExpr - This is the expression that has the same \p RecordDecl
+  //  as the flexble array member's lexical enclosing \p RecordDecl. This
+  //  allows us to catch things like: "p->p->array"
+  //   3. CompoundLiteralExpr - This is for people who create something
+  //  heretical like (struct foo has a flexible array member):
+  //
+  //(struct foo){ 1, 2 }.blah[idx];
+  const Expr *VisitDeclRefExpr(const DeclRefExpr *E) {
+return IsExpectedRecordDecl(E) ? E : nullptr;
+  }
+  const Expr *VisitMemberExpr(const MemberExpr *E) {
+if (IsExpectedRecordDecl(E) && E->isArrow())
+  return E;
+const Expr *Res = Visit(E->getBase());
+return !Res && IsExpectedRecordDecl(E) ? E : Res;
+  }
+  const Expr *VisitCompoundLiteralExpr(const CompoundLiteralExpr *E) {
+return IsExpectedRecordDecl(E) ? E : nullptr;
+  }
+  const Expr *VisitCallExpr(const CallExpr *E) {
+return IsExpectedRecordDecl(E) ? E : nullptr;
+  }
+
+  const Expr *VisitArraySubscriptExpr(const ArraySubscriptExpr *E) {
+if (IsExpectedRecordDecl(E))
+  return E;
+return Visit(E->getBase());
+  }
+  const Expr *VisitCastExpr(const CastExpr *E) {
+return Visit(E->getSubExpr());
+  }
+  const Expr *VisitParenExpr(const ParenExpr *E) {
+return Visit(E->getSubExpr());
+  }
+  const Expr *VisitUnaryAddrOf(const UnaryOperator *E) {
+return Visit(E->getSubExpr());
+  }
+  const Expr *VisitUnaryDeref(const UnaryOperator *E) {
+return Visit(E->getSubExpr());
+  }
+};
+
+} // end anonymous namespace
+
+using RecIndicesTy =
+SmallVector, 8>;
+
+static bool getGEPIndicesToField(CodeGenFunction &CGF, const RecordDecl *RD,
+ const FieldDecl *FD, RecIndicesTy &Indices) {
+  const CGRecordLayout &Layout = CGF.CGM.getTypes().getCGRecordLayout(RD);
+  int64_t FieldNo = -1;
+  for (const Decl *D : RD->decls()) {
+if (const auto *Field = dyn_cast(D)) {
+  FieldNo = Layout.getLLVMFieldNo(Field);
+  if (FD == Field) {
+Indices.emplace_back(std::make_pair(RD, 
CGF.Builder.getInt32(FieldNo)));
+return true;
+  }
+}
+
+if (const auto *Record = dyn_cast(D)) {
+  ++FieldNo;
+  if (getGEPIndicesToField(CGF, Record, FD, Indices)) {
+Indices.emplace_back(std::make_pair(RD, 
CGF.Builder.getInt32(FieldNo)));
+return true;
+  }
+}
+  }
+
+  return false;
+}
+
+llvm::Value *CodeGenFunction::EmitCountedByFieldExpr(
+const Expr *Base, const FieldDecl *FAMDecl, const FieldDecl *CountDecl) {
+  // This method is typically called in contexts where we can't generate
+  // side-effects, like in __builtin_dynamic_object_size. When finding
+  // expressions, only choose those that have either already been emitted or
+  // can be loaded without side-effects.
+  const RecordDecl *RD = 
CountDecl->getParent()->getOuterLexicalRecordContext();
+
+  // F

[clang] [Clang] Implement the 'counted_by' attribute (PR #76348)

2024-01-08 Thread Yeoul Na via cfe-commits


@@ -944,22 +951,259 @@ static llvm::Value 
*getArrayIndexingBound(CodeGenFunction &CGF,
   return nullptr;
 }
 
+namespace {
+
+/// \p StructAccessBase returns the base \p Expr of a field access. It returns
+/// either a \p DeclRefExpr, representing the base pointer to the struct, i.e.:
+///
+/// p in p-> a.b.c
+///
+/// or a \p MemberExpr, if the \p MemberExpr has the \p RecordDecl we're
+/// looking for:
+///
+/// struct s {
+///   struct s *ptr;
+///   int count;
+///   char array[] __attribute__((counted_by(count)));
+/// };
+///
+/// If we have an expression like \p p->ptr->array[index], we want the
+/// \p MemberExpr for \p p->ptr instead of \p p.
+class StructAccessBase
+: public ConstStmtVisitor {
+  const RecordDecl *ExpectedRD;
+
+  bool IsExpectedRecordDecl(const Expr *E) const {
+QualType Ty = E->getType();
+if (Ty->isPointerType())
+  Ty = Ty->getPointeeType();
+return ExpectedRD == Ty->getAsRecordDecl();
+  }
+
+public:
+  StructAccessBase(const RecordDecl *ExpectedRD) : ExpectedRD(ExpectedRD) {}
+
+  
//======//
+  //Visitor Methods
+  
//======//
+
+  // NOTE: If we build C++ support for counted_by, then we'll have to handle
+  // horrors like this:
+  //
+  // struct S {
+  //   int x, y;
+  //   int blah[] __attribute__((counted_by(x)));
+  // } s;
+  //
+  // int foo(int index, int val) {
+  //   int (S::*IHatePMDs)[] = &S::blah;
+  //   (s.*IHatePMDs)[index] = val;
+  // }
+
+  const Expr *Visit(const Expr *E) {
+return ConstStmtVisitor::Visit(E);
+  }
+
+  const Expr *VisitStmt(const Stmt *S) { return nullptr; }
+
+  // These are the types we expect to return (in order of most to least
+  // likely):
+  //
+  //   1. DeclRefExpr - This is the expression for the base of the structure.
+  //  It's exactly what we want to build an access to the \p counted_by
+  //  field.
+  //   2. MemberExpr - This is the expression that has the same \p RecordDecl
+  //  as the flexble array member's lexical enclosing \p RecordDecl. This
+  //  allows us to catch things like: "p->p->array"
+  //   3. CompoundLiteralExpr - This is for people who create something
+  //  heretical like (struct foo has a flexible array member):
+  //
+  //(struct foo){ 1, 2 }.blah[idx];
+  const Expr *VisitDeclRefExpr(const DeclRefExpr *E) {
+return IsExpectedRecordDecl(E) ? E : nullptr;
+  }
+  const Expr *VisitMemberExpr(const MemberExpr *E) {
+if (IsExpectedRecordDecl(E) && E->isArrow())
+  return E;
+const Expr *Res = Visit(E->getBase());
+return !Res && IsExpectedRecordDecl(E) ? E : Res;
+  }
+  const Expr *VisitCompoundLiteralExpr(const CompoundLiteralExpr *E) {
+return IsExpectedRecordDecl(E) ? E : nullptr;
+  }
+  const Expr *VisitCallExpr(const CallExpr *E) {
+return IsExpectedRecordDecl(E) ? E : nullptr;
+  }
+
+  const Expr *VisitArraySubscriptExpr(const ArraySubscriptExpr *E) {
+if (IsExpectedRecordDecl(E))
+  return E;
+return Visit(E->getBase());
+  }
+  const Expr *VisitCastExpr(const CastExpr *E) {
+return Visit(E->getSubExpr());
+  }
+  const Expr *VisitParenExpr(const ParenExpr *E) {
+return Visit(E->getSubExpr());
+  }
+  const Expr *VisitUnaryAddrOf(const UnaryOperator *E) {
+return Visit(E->getSubExpr());
+  }
+  const Expr *VisitUnaryDeref(const UnaryOperator *E) {
+return Visit(E->getSubExpr());
+  }
+};
+
+} // end anonymous namespace
+
+using RecIndicesTy =
+SmallVector, 8>;
+
+static bool getGEPIndicesToField(CodeGenFunction &CGF, const RecordDecl *RD,
+ const FieldDecl *FD, RecIndicesTy &Indices) {
+  const CGRecordLayout &Layout = CGF.CGM.getTypes().getCGRecordLayout(RD);
+  int64_t FieldNo = -1;
+  for (const Decl *D : RD->decls()) {
+if (const auto *Field = dyn_cast(D)) {
+  FieldNo = Layout.getLLVMFieldNo(Field);
+  if (FD == Field) {
+Indices.emplace_back(std::make_pair(RD, 
CGF.Builder.getInt32(FieldNo)));
+return true;
+  }
+}
+
+if (const auto *Record = dyn_cast(D)) {
+  ++FieldNo;
+  if (getGEPIndicesToField(CGF, Record, FD, Indices)) {
+Indices.emplace_back(std::make_pair(RD, 
CGF.Builder.getInt32(FieldNo)));
+return true;
+  }
+}
+  }
+
+  return false;
+}
+
+llvm::Value *CodeGenFunction::EmitCountedByFieldExpr(
+const Expr *Base, const FieldDecl *FAMDecl, const FieldDecl *CountDecl) {
+  // This method is typically called in contexts where we can't generate
+  // side-effects, like in __builtin_dynamic_object_size. When finding
+  // expressions, only choose those that have either already been emitted or
+  // can be loaded without side-effects.
+  const RecordDecl *RD = 
CountDecl->getParent()->getOuterLexicalRecordContext();
+
+  // F

[clang] [Clang] Implement the 'counted_by' attribute (PR #76348)

2024-01-08 Thread Yeoul Na via cfe-commits


@@ -818,6 +819,189 @@ CodeGenFunction::evaluateOrEmitBuiltinObjectSize(const 
Expr *E, unsigned Type,
   return ConstantInt::get(ResType, ObjectSize, /*isSigned=*/true);
 }
 
+const FieldDecl *CodeGenFunction::FindFlexibleArrayMemberField(
+ASTContext &Ctx, const RecordDecl *RD, uint64_t &Offset) {
+  const LangOptions::StrictFlexArraysLevelKind StrictFlexArraysLevel =
+  getLangOpts().getStrictFlexArraysLevel();
+  unsigned FieldNo = 0;
+
+  for (const Decl *D : RD->decls()) {
+if (const auto *Field = dyn_cast(D);
+Field && Decl::isFlexibleArrayMemberLike(
+ Ctx, Field, Field->getType(), StrictFlexArraysLevel,
+ /*IgnoreTemplateOrMacroSubstitution=*/true)) {
+  const ASTRecordLayout &Layout = Ctx.getASTRecordLayout(RD);
+  Offset += Layout.getFieldOffset(FieldNo);
+  return Field;
+}
+
+if (const auto *Record = dyn_cast(D))
+  if (const FieldDecl *Field =
+  FindFlexibleArrayMemberField(Ctx, Record, Offset)) {
+const ASTRecordLayout &Layout = Ctx.getASTRecordLayout(RD);
+Offset += Layout.getFieldOffset(FieldNo);
+return Field;
+  }
+
+if (isa(D))
+  ++FieldNo;
+  }
+
+  return nullptr;
+}
+
+llvm::Value *
+CodeGenFunction::emitFlexibleArrayMemberSize(const Expr *E, unsigned Type,
+ llvm::IntegerType *ResType) {
+  // The code generated here calculates the size of a struct with a flexible
+  // array member that uses the counted_by attribute. There are two instances
+  // we handle:
+  //
+  //   struct s {
+  // unsigned long flags;
+  // int count;
+  // int array[] __attribute__((counted_by(count)));
+  //   }
+  //
+  //   1) bdos of the flexible array itself:
+  //
+  // __builtin_dynamic_object_size(p->array, 1) ==
+  // p->count * sizeof(*p->array)
+  //
+  //   2) bdos of a pointer into the flexible array:
+  //
+  // __builtin_dynamic_object_size(&p->array[42], 1) ==
+  // (p->count - 42) * sizeof(*p->array)
+  //
+  //   2) bdos of the whole struct, including the flexible array:
+  //
+  // __builtin_dynamic_object_size(p, 1) ==
+  //max(sizeof(struct s),
+  //offsetof(struct s, array) + p->count * sizeof(*p->array))
+  //
+  ASTContext &Ctx = getContext();
+  const Expr *Base = E->IgnoreParenImpCasts();
+  const Expr *Idx = nullptr;
+
+  if (const auto *UO = dyn_cast(Base);
+  UO && UO->getOpcode() == UO_AddrOf) {
+Expr *SubExpr = UO->getSubExpr()->IgnoreParenImpCasts();
+if (const auto *ASE = dyn_cast(SubExpr)) {
+  Base = ASE->getBase()->IgnoreParenImpCasts();
+  Idx = ASE->getIdx()->IgnoreParenImpCasts();
+
+  if (const auto *IL = dyn_cast(Idx)) {
+int64_t Val = IL->getValue().getSExtValue();
+if (Val < 0)
+  // __bdos returns 0 for negative indexes into an array in a struct.
+  return getDefaultBuiltinObjectSizeResult(Type, ResType);
+
+if (Val == 0)
+  // The index is 0, so we don't need to take it into account.
+  Idx = nullptr;
+  }
+} else {
+  // Potential pointer to another element in the struct.
+  Base = SubExpr;
+}
+  }
+
+  // Get the flexible array member Decl.
+  const RecordDecl *OuterRD = nullptr;
+  if (const auto *ME = dyn_cast(Base)) {
+// Check if \p Base is referencing the FAM itself.
+const ValueDecl *VD = ME->getMemberDecl();
+OuterRD = VD->getDeclContext()->getOuterLexicalRecordContext();
+  } else if (const auto *DRE = dyn_cast(Base)) {
+// Check if we're pointing to the whole struct.
+QualType Ty = DRE->getDecl()->getType();
+if (Ty->isPointerType())
+  Ty = Ty->getPointeeType();
+OuterRD = Ty->getAsRecordDecl();
+  }
+
+  if (!OuterRD)
+return nullptr;
+
+  uint64_t Offset = 0;
+  const FieldDecl *FAMDecl = FindFlexibleArrayMemberField(Ctx, OuterRD, 
Offset);
+  Offset = Ctx.toCharUnitsFromBits(Offset).getQuantity();
+
+  if (!FAMDecl || !FAMDecl->hasAttr())
+// No flexible array member found or it doesn't have the "counted_by"
+// attribute.
+return nullptr;
+
+  const FieldDecl *CountedByFD = FindCountedByField(FAMDecl);
+  if (!CountedByFD)
+// Can't find the field referenced by the "counted_by" attribute.
+return nullptr;
+
+  // Build a load of the counted_by field.
+  bool IsSigned = CountedByFD->getType()->isSignedIntegerType();
+  Value *CountedByInst = EmitCountedByFieldExpr(Base, FAMDecl, CountedByFD);
+  if (!CountedByInst)
+return nullptr;
+
+  llvm::Type *CountedByTy = CountedByInst->getType();
+
+  // Build a load of the index and subtract it from the count.
+  Value *IdxInst = nullptr;
+  if (Idx) {
+bool IdxSigned = Idx->getType()->isSignedIntegerType();
+IdxInst = EmitAnyExprToTemp(Idx).getScalarVal();
+IdxInst = IdxSigned ? Builder.CreateSExtOrTrunc(IdxInst, CountedByTy)
+: Builder.Create

[clang] [Clang] Implement the 'counted_by' attribute (PR #76348)

2024-01-08 Thread Yeoul Na via cfe-commits


@@ -818,6 +819,189 @@ CodeGenFunction::evaluateOrEmitBuiltinObjectSize(const 
Expr *E, unsigned Type,
   return ConstantInt::get(ResType, ObjectSize, /*isSigned=*/true);
 }
 
+const FieldDecl *CodeGenFunction::FindFlexibleArrayMemberField(
+ASTContext &Ctx, const RecordDecl *RD, uint64_t &Offset) {
+  const LangOptions::StrictFlexArraysLevelKind StrictFlexArraysLevel =
+  getLangOpts().getStrictFlexArraysLevel();
+  unsigned FieldNo = 0;
+
+  for (const Decl *D : RD->decls()) {
+if (const auto *Field = dyn_cast(D);
+Field && Decl::isFlexibleArrayMemberLike(
+ Ctx, Field, Field->getType(), StrictFlexArraysLevel,
+ /*IgnoreTemplateOrMacroSubstitution=*/true)) {
+  const ASTRecordLayout &Layout = Ctx.getASTRecordLayout(RD);
+  Offset += Layout.getFieldOffset(FieldNo);
+  return Field;
+}
+
+if (const auto *Record = dyn_cast(D))
+  if (const FieldDecl *Field =
+  FindFlexibleArrayMemberField(Ctx, Record, Offset)) {
+const ASTRecordLayout &Layout = Ctx.getASTRecordLayout(RD);
+Offset += Layout.getFieldOffset(FieldNo);
+return Field;
+  }
+
+if (isa(D))
+  ++FieldNo;
+  }
+
+  return nullptr;
+}
+
+llvm::Value *
+CodeGenFunction::emitFlexibleArrayMemberSize(const Expr *E, unsigned Type,
+ llvm::IntegerType *ResType) {
+  // The code generated here calculates the size of a struct with a flexible
+  // array member that uses the counted_by attribute. There are two instances
+  // we handle:
+  //
+  //   struct s {
+  // unsigned long flags;
+  // int count;
+  // int array[] __attribute__((counted_by(count)));
+  //   }
+  //
+  //   1) bdos of the flexible array itself:
+  //
+  // __builtin_dynamic_object_size(p->array, 1) ==
+  // p->count * sizeof(*p->array)
+  //
+  //   2) bdos of a pointer into the flexible array:
+  //
+  // __builtin_dynamic_object_size(&p->array[42], 1) ==
+  // (p->count - 42) * sizeof(*p->array)
+  //
+  //   2) bdos of the whole struct, including the flexible array:
+  //
+  // __builtin_dynamic_object_size(p, 1) ==
+  //max(sizeof(struct s),
+  //offsetof(struct s, array) + p->count * sizeof(*p->array))
+  //
+  ASTContext &Ctx = getContext();
+  const Expr *Base = E->IgnoreParenImpCasts();
+  const Expr *Idx = nullptr;
+
+  if (const auto *UO = dyn_cast(Base);
+  UO && UO->getOpcode() == UO_AddrOf) {
+Expr *SubExpr = UO->getSubExpr()->IgnoreParenImpCasts();
+if (const auto *ASE = dyn_cast(SubExpr)) {
+  Base = ASE->getBase()->IgnoreParenImpCasts();
+  Idx = ASE->getIdx()->IgnoreParenImpCasts();
+
+  if (const auto *IL = dyn_cast(Idx)) {
+int64_t Val = IL->getValue().getSExtValue();
+if (Val < 0)
+  // __bdos returns 0 for negative indexes into an array in a struct.
+  return getDefaultBuiltinObjectSizeResult(Type, ResType);
+
+if (Val == 0)
+  // The index is 0, so we don't need to take it into account.
+  Idx = nullptr;
+  }
+} else {
+  // Potential pointer to another element in the struct.
+  Base = SubExpr;
+}
+  }
+
+  // Get the flexible array member Decl.
+  const RecordDecl *OuterRD = nullptr;
+  if (const auto *ME = dyn_cast(Base)) {
+// Check if \p Base is referencing the FAM itself.
+const ValueDecl *VD = ME->getMemberDecl();
+OuterRD = VD->getDeclContext()->getOuterLexicalRecordContext();
+  } else if (const auto *DRE = dyn_cast(Base)) {
+// Check if we're pointing to the whole struct.
+QualType Ty = DRE->getDecl()->getType();
+if (Ty->isPointerType())
+  Ty = Ty->getPointeeType();
+OuterRD = Ty->getAsRecordDecl();
+  }
+
+  if (!OuterRD)
+return nullptr;
+
+  uint64_t Offset = 0;
+  const FieldDecl *FAMDecl = FindFlexibleArrayMemberField(Ctx, OuterRD, 
Offset);
+  Offset = Ctx.toCharUnitsFromBits(Offset).getQuantity();
+
+  if (!FAMDecl || !FAMDecl->hasAttr())
+// No flexible array member found or it doesn't have the "counted_by"
+// attribute.
+return nullptr;
+
+  const FieldDecl *CountedByFD = FindCountedByField(FAMDecl);
+  if (!CountedByFD)
+// Can't find the field referenced by the "counted_by" attribute.
+return nullptr;
+
+  // Build a load of the counted_by field.
+  bool IsSigned = CountedByFD->getType()->isSignedIntegerType();
+  Value *CountedByInst = EmitCountedByFieldExpr(Base, FAMDecl, CountedByFD);
+  if (!CountedByInst)
+return nullptr;
+
+  llvm::Type *CountedByTy = CountedByInst->getType();
+
+  // Build a load of the index and subtract it from the count.
+  Value *IdxInst = nullptr;
+  if (Idx) {
+bool IdxSigned = Idx->getType()->isSignedIntegerType();
+IdxInst = EmitAnyExprToTemp(Idx).getScalarVal();
+IdxInst = IdxSigned ? Builder.CreateSExtOrTrunc(IdxInst, CountedByTy)

rapidsna wrote:

I'm not

[clang] [Clang] Implement the 'counted_by' attribute (PR #76348)

2024-01-08 Thread Yeoul Na via cfe-commits


@@ -944,22 +951,262 @@ static llvm::Value 
*getArrayIndexingBound(CodeGenFunction &CGF,
   return nullptr;
 }
 
+namespace {
+
+/// \p StructAccessBase returns the base \p Expr of a field access. It returns
+/// either a \p DeclRefExpr, representing the base pointer to the struct, i.e.:
+///
+/// p in p-> a.b.c
+///
+/// or a \p MemberExpr, if the \p MemberExpr has the \p RecordDecl we're
+/// looking for:
+///
+/// struct s {
+///   struct s *ptr;
+///   int count;
+///   char array[] __attribute__((counted_by(count)));
+/// };
+///
+/// If we have an expression like \p p->ptr->array[index], we want the
+/// \p MemberExpr for \p p->ptr instead of \p p.
+class StructAccessBase
+: public ConstStmtVisitor {
+  const RecordDecl *ExpectedRD;
+
+  bool IsExpectedRecordDecl(const Expr *E) const {
+QualType Ty = E->getType();
+if (Ty->isPointerType())
+  Ty = Ty->getPointeeType();
+return ExpectedRD == Ty->getAsRecordDecl();
+  }
+
+public:
+  StructAccessBase(const RecordDecl *ExpectedRD) : ExpectedRD(ExpectedRD) {}
+
+  
//======//
+  //Visitor Methods
+  
//======//
+
+  // NOTE: If we build C++ support for counted_by, then we'll have to handle
+  // horrors like this:
+  //
+  // struct S {
+  //   int x, y;
+  //   int blah[] __attribute__((counted_by(x)));
+  // } s;
+  //
+  // int foo(int index, int val) {
+  //   int (S::*IHatePMDs)[] = &S::blah;
+  //   (s.*IHatePMDs)[index] = val;
+  // }
+
+  const Expr *Visit(const Expr *E) {
+return ConstStmtVisitor::Visit(E);
+  }
+
+  const Expr *VisitStmt(const Stmt *S) { return nullptr; }
+
+  // These are the types we expect to return (in order of most to least
+  // likely):
+  //
+  //   1. DeclRefExpr - This is the expression for the base of the structure.
+  //  It's exactly what we want to build an access to the \p counted_by
+  //  field.
+  //   2. MemberExpr - This is the expression that has the same \p RecordDecl
+  //  as the flexble array member's lexical enclosing \p RecordDecl. This
+  //  allows us to catch things like: "p->p->array"
+  //   3. CompoundLiteralExpr - This is for people who create something
+  //  heretical like (struct foo has a flexible array member):
+  //
+  //(struct foo){ 1, 2 }.blah[idx];
+  const Expr *VisitDeclRefExpr(const DeclRefExpr *E) {
+return IsExpectedRecordDecl(E) ? E : nullptr;
+  }
+  const Expr *VisitMemberExpr(const MemberExpr *E) {
+if (IsExpectedRecordDecl(E) && E->isArrow())
+  return E;
+const Expr *Res = Visit(E->getBase());
+return !Res && IsExpectedRecordDecl(E) ? E : Res;
+  }
+  const Expr *VisitCompoundLiteralExpr(const CompoundLiteralExpr *E) {
+return IsExpectedRecordDecl(E) ? E : nullptr;
+  }
+  const Expr *VisitCallExpr(const CallExpr *E) {

rapidsna wrote:

Do we need to handle calls?

https://github.com/llvm/llvm-project/pull/76348
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Clang] Implement the 'counted_by' attribute (PR #76348)

2024-01-08 Thread Yeoul Na via cfe-commits


@@ -944,22 +951,259 @@ static llvm::Value 
*getArrayIndexingBound(CodeGenFunction &CGF,
   return nullptr;
 }
 
+namespace {
+
+/// \p StructAccessBase returns the base \p Expr of a field access. It returns
+/// either a \p DeclRefExpr, representing the base pointer to the struct, i.e.:
+///
+/// p in p-> a.b.c
+///
+/// or a \p MemberExpr, if the \p MemberExpr has the \p RecordDecl we're
+/// looking for:
+///
+/// struct s {
+///   struct s *ptr;
+///   int count;
+///   char array[] __attribute__((counted_by(count)));
+/// };
+///
+/// If we have an expression like \p p->ptr->array[index], we want the
+/// \p MemberExpr for \p p->ptr instead of \p p.
+class StructAccessBase
+: public ConstStmtVisitor {
+  const RecordDecl *ExpectedRD;
+
+  bool IsExpectedRecordDecl(const Expr *E) const {
+QualType Ty = E->getType();
+if (Ty->isPointerType())
+  Ty = Ty->getPointeeType();
+return ExpectedRD == Ty->getAsRecordDecl();
+  }
+
+public:
+  StructAccessBase(const RecordDecl *ExpectedRD) : ExpectedRD(ExpectedRD) {}
+
+  
//======//
+  //Visitor Methods
+  
//======//
+
+  // NOTE: If we build C++ support for counted_by, then we'll have to handle
+  // horrors like this:
+  //
+  // struct S {
+  //   int x, y;
+  //   int blah[] __attribute__((counted_by(x)));
+  // } s;
+  //
+  // int foo(int index, int val) {
+  //   int (S::*IHatePMDs)[] = &S::blah;
+  //   (s.*IHatePMDs)[index] = val;
+  // }
+
+  const Expr *Visit(const Expr *E) {
+return ConstStmtVisitor::Visit(E);
+  }
+
+  const Expr *VisitStmt(const Stmt *S) { return nullptr; }
+
+  // These are the types we expect to return (in order of most to least
+  // likely):
+  //
+  //   1. DeclRefExpr - This is the expression for the base of the structure.
+  //  It's exactly what we want to build an access to the \p counted_by
+  //  field.
+  //   2. MemberExpr - This is the expression that has the same \p RecordDecl
+  //  as the flexble array member's lexical enclosing \p RecordDecl. This
+  //  allows us to catch things like: "p->p->array"
+  //   3. CompoundLiteralExpr - This is for people who create something
+  //  heretical like (struct foo has a flexible array member):
+  //
+  //(struct foo){ 1, 2 }.blah[idx];
+  const Expr *VisitDeclRefExpr(const DeclRefExpr *E) {
+return IsExpectedRecordDecl(E) ? E : nullptr;
+  }
+  const Expr *VisitMemberExpr(const MemberExpr *E) {
+if (IsExpectedRecordDecl(E) && E->isArrow())
+  return E;
+const Expr *Res = Visit(E->getBase());
+return !Res && IsExpectedRecordDecl(E) ? E : Res;
+  }
+  const Expr *VisitCompoundLiteralExpr(const CompoundLiteralExpr *E) {
+return IsExpectedRecordDecl(E) ? E : nullptr;
+  }
+  const Expr *VisitCallExpr(const CallExpr *E) {
+return IsExpectedRecordDecl(E) ? E : nullptr;
+  }
+
+  const Expr *VisitArraySubscriptExpr(const ArraySubscriptExpr *E) {
+if (IsExpectedRecordDecl(E))
+  return E;
+return Visit(E->getBase());
+  }
+  const Expr *VisitCastExpr(const CastExpr *E) {
+return Visit(E->getSubExpr());
+  }
+  const Expr *VisitParenExpr(const ParenExpr *E) {
+return Visit(E->getSubExpr());
+  }
+  const Expr *VisitUnaryAddrOf(const UnaryOperator *E) {
+return Visit(E->getSubExpr());
+  }
+  const Expr *VisitUnaryDeref(const UnaryOperator *E) {
+return Visit(E->getSubExpr());
+  }
+};
+
+} // end anonymous namespace
+
+using RecIndicesTy =
+SmallVector, 8>;
+
+static bool getGEPIndicesToField(CodeGenFunction &CGF, const RecordDecl *RD,
+ const FieldDecl *FD, RecIndicesTy &Indices) {
+  const CGRecordLayout &Layout = CGF.CGM.getTypes().getCGRecordLayout(RD);
+  int64_t FieldNo = -1;
+  for (const Decl *D : RD->decls()) {
+if (const auto *Field = dyn_cast(D)) {
+  FieldNo = Layout.getLLVMFieldNo(Field);
+  if (FD == Field) {
+Indices.emplace_back(std::make_pair(RD, 
CGF.Builder.getInt32(FieldNo)));
+return true;
+  }
+}
+
+if (const auto *Record = dyn_cast(D)) {
+  ++FieldNo;
+  if (getGEPIndicesToField(CGF, Record, FD, Indices)) {
+Indices.emplace_back(std::make_pair(RD, 
CGF.Builder.getInt32(FieldNo)));
+return true;
+  }
+}
+  }
+
+  return false;
+}
+
+llvm::Value *CodeGenFunction::EmitCountedByFieldExpr(
+const Expr *Base, const FieldDecl *FAMDecl, const FieldDecl *CountDecl) {
+  // This method is typically called in contexts where we can't generate
+  // side-effects, like in __builtin_dynamic_object_size. When finding
+  // expressions, only choose those that have either already been emitted or
+  // can be loaded without side-effects.
+  const RecordDecl *RD = 
CountDecl->getParent()->getOuterLexicalRecordContext();
+
+  // F

[clang] [Clang] Implement the 'counted_by' attribute (PR #76348)

2024-01-08 Thread Yeoul Na via cfe-commits

https://github.com/rapidsna commented:

I left some comments. Looking good overall but `LocalDeclMap` doesn't seem to 
serve the purpose for this PR. And I'm not sure the code block using it would 
be actually necessary. Please see my inlined review.

https://github.com/llvm/llvm-project/pull/76348
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Clang] Implement the 'counted_by' attribute (PR #76348)

2024-01-08 Thread Bill Wendling via cfe-commits


@@ -944,22 +951,259 @@ static llvm::Value 
*getArrayIndexingBound(CodeGenFunction &CGF,
   return nullptr;
 }
 
+namespace {
+
+/// \p StructAccessBase returns the base \p Expr of a field access. It returns
+/// either a \p DeclRefExpr, representing the base pointer to the struct, i.e.:
+///
+/// p in p-> a.b.c
+///
+/// or a \p MemberExpr, if the \p MemberExpr has the \p RecordDecl we're
+/// looking for:
+///
+/// struct s {
+///   struct s *ptr;
+///   int count;
+///   char array[] __attribute__((counted_by(count)));
+/// };
+///
+/// If we have an expression like \p p->ptr->array[index], we want the
+/// \p MemberExpr for \p p->ptr instead of \p p.
+class StructAccessBase
+: public ConstStmtVisitor {
+  const RecordDecl *ExpectedRD;
+
+  bool IsExpectedRecordDecl(const Expr *E) const {
+QualType Ty = E->getType();
+if (Ty->isPointerType())
+  Ty = Ty->getPointeeType();
+return ExpectedRD == Ty->getAsRecordDecl();
+  }
+
+public:
+  StructAccessBase(const RecordDecl *ExpectedRD) : ExpectedRD(ExpectedRD) {}
+
+  
//======//
+  //Visitor Methods
+  
//======//
+
+  // NOTE: If we build C++ support for counted_by, then we'll have to handle
+  // horrors like this:
+  //
+  // struct S {
+  //   int x, y;
+  //   int blah[] __attribute__((counted_by(x)));
+  // } s;
+  //
+  // int foo(int index, int val) {
+  //   int (S::*IHatePMDs)[] = &S::blah;
+  //   (s.*IHatePMDs)[index] = val;
+  // }
+
+  const Expr *Visit(const Expr *E) {
+return ConstStmtVisitor::Visit(E);
+  }
+
+  const Expr *VisitStmt(const Stmt *S) { return nullptr; }
+
+  // These are the types we expect to return (in order of most to least
+  // likely):
+  //
+  //   1. DeclRefExpr - This is the expression for the base of the structure.
+  //  It's exactly what we want to build an access to the \p counted_by
+  //  field.
+  //   2. MemberExpr - This is the expression that has the same \p RecordDecl
+  //  as the flexble array member's lexical enclosing \p RecordDecl. This
+  //  allows us to catch things like: "p->p->array"
+  //   3. CompoundLiteralExpr - This is for people who create something
+  //  heretical like (struct foo has a flexible array member):
+  //
+  //(struct foo){ 1, 2 }.blah[idx];
+  const Expr *VisitDeclRefExpr(const DeclRefExpr *E) {
+return IsExpectedRecordDecl(E) ? E : nullptr;
+  }
+  const Expr *VisitMemberExpr(const MemberExpr *E) {
+if (IsExpectedRecordDecl(E) && E->isArrow())
+  return E;
+const Expr *Res = Visit(E->getBase());
+return !Res && IsExpectedRecordDecl(E) ? E : Res;
+  }
+  const Expr *VisitCompoundLiteralExpr(const CompoundLiteralExpr *E) {
+return IsExpectedRecordDecl(E) ? E : nullptr;
+  }
+  const Expr *VisitCallExpr(const CallExpr *E) {
+return IsExpectedRecordDecl(E) ? E : nullptr;
+  }
+
+  const Expr *VisitArraySubscriptExpr(const ArraySubscriptExpr *E) {
+if (IsExpectedRecordDecl(E))
+  return E;
+return Visit(E->getBase());
+  }
+  const Expr *VisitCastExpr(const CastExpr *E) {
+return Visit(E->getSubExpr());
+  }
+  const Expr *VisitParenExpr(const ParenExpr *E) {
+return Visit(E->getSubExpr());
+  }
+  const Expr *VisitUnaryAddrOf(const UnaryOperator *E) {
+return Visit(E->getSubExpr());
+  }
+  const Expr *VisitUnaryDeref(const UnaryOperator *E) {
+return Visit(E->getSubExpr());
+  }
+};
+
+} // end anonymous namespace
+
+using RecIndicesTy =
+SmallVector, 8>;
+
+static bool getGEPIndicesToField(CodeGenFunction &CGF, const RecordDecl *RD,
+ const FieldDecl *FD, RecIndicesTy &Indices) {
+  const CGRecordLayout &Layout = CGF.CGM.getTypes().getCGRecordLayout(RD);
+  int64_t FieldNo = -1;
+  for (const Decl *D : RD->decls()) {
+if (const auto *Field = dyn_cast(D)) {
+  FieldNo = Layout.getLLVMFieldNo(Field);
+  if (FD == Field) {
+Indices.emplace_back(std::make_pair(RD, 
CGF.Builder.getInt32(FieldNo)));
+return true;
+  }
+}
+
+if (const auto *Record = dyn_cast(D)) {
+  ++FieldNo;
+  if (getGEPIndicesToField(CGF, Record, FD, Indices)) {
+Indices.emplace_back(std::make_pair(RD, 
CGF.Builder.getInt32(FieldNo)));
+return true;
+  }
+}
+  }
+
+  return false;
+}
+
+llvm::Value *CodeGenFunction::EmitCountedByFieldExpr(
+const Expr *Base, const FieldDecl *FAMDecl, const FieldDecl *CountDecl) {
+  // This method is typically called in contexts where we can't generate
+  // side-effects, like in __builtin_dynamic_object_size. When finding
+  // expressions, only choose those that have either already been emitted or
+  // can be loaded without side-effects.
+  const RecordDecl *RD = 
CountDecl->getParent()->getOuterLexicalRecordContext();
+
+  // F

[clang] [Clang] Implement the 'counted_by' attribute (PR #76348)

2024-01-08 Thread Bill Wendling via cfe-commits


@@ -1155,15 +1159,14 @@ const FieldDecl 
*CodeGenFunction::FindCountedByField(const FieldDecl *FD) {
 return nullptr;
 
   auto GetNonAnonStructOrUnion = [](const RecordDecl *RD) {
-while (RD && !RD->getDeclName())
-  if (const auto *R = dyn_cast(RD->getDeclContext()))
-RD = R;
-  else
+while (RD && RD->isAnonymousStructOrUnion()) {
+  const auto *R = dyn_cast(RD->getDeclContext());
+  if (!R)

bwendling wrote:

I think you could have:

```
int foo(void) {
  struct {
int a, b;
  } x;

  ...
}
```

Though in the case of a struct with a flexible array member, probably not. Then 
again, you might be able to do this:

```
int foo(void) {
  struct {
int count;
int array[] __counted_by(count);
  } x = { 3 };
  ...
}
```

I just didn't want to make any assumptions. I can make this an assert if you'd 
like.

https://github.com/llvm/llvm-project/pull/76348
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Clang] Implement the 'counted_by' attribute (PR #76348)

2024-01-08 Thread Bill Wendling via cfe-commits


@@ -818,6 +819,189 @@ CodeGenFunction::evaluateOrEmitBuiltinObjectSize(const 
Expr *E, unsigned Type,
   return ConstantInt::get(ResType, ObjectSize, /*isSigned=*/true);
 }
 
+const FieldDecl *CodeGenFunction::FindFlexibleArrayMemberField(
+ASTContext &Ctx, const RecordDecl *RD, uint64_t &Offset) {
+  const LangOptions::StrictFlexArraysLevelKind StrictFlexArraysLevel =
+  getLangOpts().getStrictFlexArraysLevel();
+  unsigned FieldNo = 0;
+
+  for (const Decl *D : RD->decls()) {
+if (const auto *Field = dyn_cast(D);
+Field && Decl::isFlexibleArrayMemberLike(
+ Ctx, Field, Field->getType(), StrictFlexArraysLevel,
+ /*IgnoreTemplateOrMacroSubstitution=*/true)) {
+  const ASTRecordLayout &Layout = Ctx.getASTRecordLayout(RD);
+  Offset += Layout.getFieldOffset(FieldNo);
+  return Field;
+}
+
+if (const auto *Record = dyn_cast(D))
+  if (const FieldDecl *Field =
+  FindFlexibleArrayMemberField(Ctx, Record, Offset)) {
+const ASTRecordLayout &Layout = Ctx.getASTRecordLayout(RD);
+Offset += Layout.getFieldOffset(FieldNo);
+return Field;
+  }
+
+if (isa(D))
+  ++FieldNo;
+  }
+
+  return nullptr;
+}
+
+llvm::Value *
+CodeGenFunction::emitFlexibleArrayMemberSize(const Expr *E, unsigned Type,
+ llvm::IntegerType *ResType) {
+  // The code generated here calculates the size of a struct with a flexible
+  // array member that uses the counted_by attribute. There are two instances
+  // we handle:
+  //
+  //   struct s {
+  // unsigned long flags;
+  // int count;
+  // int array[] __attribute__((counted_by(count)));
+  //   }
+  //
+  //   1) bdos of the flexible array itself:
+  //
+  // __builtin_dynamic_object_size(p->array, 1) ==
+  // p->count * sizeof(*p->array)
+  //
+  //   2) bdos of a pointer into the flexible array:
+  //
+  // __builtin_dynamic_object_size(&p->array[42], 1) ==
+  // (p->count - 42) * sizeof(*p->array)
+  //
+  //   2) bdos of the whole struct, including the flexible array:
+  //
+  // __builtin_dynamic_object_size(p, 1) ==
+  //max(sizeof(struct s),
+  //offsetof(struct s, array) + p->count * sizeof(*p->array))
+  //
+  ASTContext &Ctx = getContext();
+  const Expr *Base = E->IgnoreParenImpCasts();
+  const Expr *Idx = nullptr;
+
+  if (const auto *UO = dyn_cast(Base);
+  UO && UO->getOpcode() == UO_AddrOf) {
+Expr *SubExpr = UO->getSubExpr()->IgnoreParenImpCasts();
+if (const auto *ASE = dyn_cast(SubExpr)) {
+  Base = ASE->getBase()->IgnoreParenImpCasts();
+  Idx = ASE->getIdx()->IgnoreParenImpCasts();
+
+  if (const auto *IL = dyn_cast(Idx)) {
+int64_t Val = IL->getValue().getSExtValue();
+if (Val < 0)
+  // __bdos returns 0 for negative indexes into an array in a struct.
+  return getDefaultBuiltinObjectSizeResult(Type, ResType);
+
+if (Val == 0)
+  // The index is 0, so we don't need to take it into account.
+  Idx = nullptr;
+  }
+} else {
+  // Potential pointer to another element in the struct.
+  Base = SubExpr;
+}
+  }
+
+  // Get the flexible array member Decl.
+  const RecordDecl *OuterRD = nullptr;
+  if (const auto *ME = dyn_cast(Base)) {
+// Check if \p Base is referencing the FAM itself.
+const ValueDecl *VD = ME->getMemberDecl();
+OuterRD = VD->getDeclContext()->getOuterLexicalRecordContext();
+  } else if (const auto *DRE = dyn_cast(Base)) {
+// Check if we're pointing to the whole struct.
+QualType Ty = DRE->getDecl()->getType();
+if (Ty->isPointerType())
+  Ty = Ty->getPointeeType();
+OuterRD = Ty->getAsRecordDecl();
+  }
+
+  if (!OuterRD)
+return nullptr;
+
+  uint64_t Offset = 0;
+  const FieldDecl *FAMDecl = FindFlexibleArrayMemberField(Ctx, OuterRD, 
Offset);
+  Offset = Ctx.toCharUnitsFromBits(Offset).getQuantity();
+
+  if (!FAMDecl || !FAMDecl->hasAttr())
+// No flexible array member found or it doesn't have the "counted_by"
+// attribute.
+return nullptr;
+
+  const FieldDecl *CountedByFD = FindCountedByField(FAMDecl);
+  if (!CountedByFD)
+// Can't find the field referenced by the "counted_by" attribute.
+return nullptr;
+
+  // Build a load of the counted_by field.
+  bool IsSigned = CountedByFD->getType()->isSignedIntegerType();
+  Value *CountedByInst = EmitCountedByFieldExpr(Base, FAMDecl, CountedByFD);
+  if (!CountedByInst)
+return nullptr;
+
+  llvm::Type *CountedByTy = CountedByInst->getType();
+
+  // Build a load of the index and subtract it from the count.
+  Value *IdxInst = nullptr;
+  if (Idx) {
+bool IdxSigned = Idx->getType()->isSignedIntegerType();
+IdxInst = EmitAnyExprToTemp(Idx).getScalarVal();
+IdxInst = IdxSigned ? Builder.CreateSExtOrTrunc(IdxInst, CountedByTy)
+: Builder.Create

[clang] [Clang] Implement the 'counted_by' attribute (PR #76348)

2024-01-08 Thread Bill Wendling via cfe-commits


@@ -818,6 +819,189 @@ CodeGenFunction::evaluateOrEmitBuiltinObjectSize(const 
Expr *E, unsigned Type,
   return ConstantInt::get(ResType, ObjectSize, /*isSigned=*/true);
 }
 
+const FieldDecl *CodeGenFunction::FindFlexibleArrayMemberField(
+ASTContext &Ctx, const RecordDecl *RD, uint64_t &Offset) {
+  const LangOptions::StrictFlexArraysLevelKind StrictFlexArraysLevel =
+  getLangOpts().getStrictFlexArraysLevel();
+  unsigned FieldNo = 0;
+
+  for (const Decl *D : RD->decls()) {
+if (const auto *Field = dyn_cast(D);
+Field && Decl::isFlexibleArrayMemberLike(
+ Ctx, Field, Field->getType(), StrictFlexArraysLevel,
+ /*IgnoreTemplateOrMacroSubstitution=*/true)) {
+  const ASTRecordLayout &Layout = Ctx.getASTRecordLayout(RD);
+  Offset += Layout.getFieldOffset(FieldNo);
+  return Field;
+}
+
+if (const auto *Record = dyn_cast(D))
+  if (const FieldDecl *Field =
+  FindFlexibleArrayMemberField(Ctx, Record, Offset)) {
+const ASTRecordLayout &Layout = Ctx.getASTRecordLayout(RD);
+Offset += Layout.getFieldOffset(FieldNo);
+return Field;
+  }
+
+if (isa(D))
+  ++FieldNo;
+  }
+
+  return nullptr;
+}
+
+llvm::Value *
+CodeGenFunction::emitFlexibleArrayMemberSize(const Expr *E, unsigned Type,
+ llvm::IntegerType *ResType) {
+  // The code generated here calculates the size of a struct with a flexible
+  // array member that uses the counted_by attribute. There are two instances
+  // we handle:
+  //
+  //   struct s {
+  // unsigned long flags;
+  // int count;
+  // int array[] __attribute__((counted_by(count)));
+  //   }
+  //
+  //   1) bdos of the flexible array itself:
+  //
+  // __builtin_dynamic_object_size(p->array, 1) ==
+  // p->count * sizeof(*p->array)
+  //
+  //   2) bdos of a pointer into the flexible array:
+  //
+  // __builtin_dynamic_object_size(&p->array[42], 1) ==
+  // (p->count - 42) * sizeof(*p->array)
+  //
+  //   2) bdos of the whole struct, including the flexible array:
+  //
+  // __builtin_dynamic_object_size(p, 1) ==
+  //max(sizeof(struct s),
+  //offsetof(struct s, array) + p->count * sizeof(*p->array))
+  //
+  ASTContext &Ctx = getContext();
+  const Expr *Base = E->IgnoreParenImpCasts();
+  const Expr *Idx = nullptr;
+
+  if (const auto *UO = dyn_cast(Base);
+  UO && UO->getOpcode() == UO_AddrOf) {
+Expr *SubExpr = UO->getSubExpr()->IgnoreParenImpCasts();
+if (const auto *ASE = dyn_cast(SubExpr)) {
+  Base = ASE->getBase()->IgnoreParenImpCasts();
+  Idx = ASE->getIdx()->IgnoreParenImpCasts();
+
+  if (const auto *IL = dyn_cast(Idx)) {
+int64_t Val = IL->getValue().getSExtValue();
+if (Val < 0)
+  // __bdos returns 0 for negative indexes into an array in a struct.
+  return getDefaultBuiltinObjectSizeResult(Type, ResType);
+
+if (Val == 0)
+  // The index is 0, so we don't need to take it into account.
+  Idx = nullptr;
+  }
+} else {
+  // Potential pointer to another element in the struct.
+  Base = SubExpr;
+}
+  }
+
+  // Get the flexible array member Decl.
+  const RecordDecl *OuterRD = nullptr;
+  if (const auto *ME = dyn_cast(Base)) {
+// Check if \p Base is referencing the FAM itself.
+const ValueDecl *VD = ME->getMemberDecl();
+OuterRD = VD->getDeclContext()->getOuterLexicalRecordContext();
+  } else if (const auto *DRE = dyn_cast(Base)) {
+// Check if we're pointing to the whole struct.
+QualType Ty = DRE->getDecl()->getType();
+if (Ty->isPointerType())
+  Ty = Ty->getPointeeType();
+OuterRD = Ty->getAsRecordDecl();
+  }
+
+  if (!OuterRD)
+return nullptr;
+
+  uint64_t Offset = 0;
+  const FieldDecl *FAMDecl = FindFlexibleArrayMemberField(Ctx, OuterRD, 
Offset);
+  Offset = Ctx.toCharUnitsFromBits(Offset).getQuantity();
+
+  if (!FAMDecl || !FAMDecl->hasAttr())
+// No flexible array member found or it doesn't have the "counted_by"
+// attribute.
+return nullptr;
+
+  const FieldDecl *CountedByFD = FindCountedByField(FAMDecl);
+  if (!CountedByFD)
+// Can't find the field referenced by the "counted_by" attribute.
+return nullptr;
+
+  // Build a load of the counted_by field.
+  bool IsSigned = CountedByFD->getType()->isSignedIntegerType();
+  Value *CountedByInst = EmitCountedByFieldExpr(Base, FAMDecl, CountedByFD);
+  if (!CountedByInst)
+return nullptr;
+
+  llvm::Type *CountedByTy = CountedByInst->getType();
+
+  // Build a load of the index and subtract it from the count.
+  Value *IdxInst = nullptr;
+  if (Idx) {
+bool IdxSigned = Idx->getType()->isSignedIntegerType();
+IdxInst = EmitAnyExprToTemp(Idx).getScalarVal();
+IdxInst = IdxSigned ? Builder.CreateSExtOrTrunc(IdxInst, CountedByTy)

bwendling wrote:

Is the

[clang] [Clang] Implement the 'counted_by' attribute (PR #76348)

2024-01-08 Thread Bill Wendling via cfe-commits


@@ -944,22 +951,259 @@ static llvm::Value 
*getArrayIndexingBound(CodeGenFunction &CGF,
   return nullptr;
 }
 
+namespace {
+
+/// \p StructAccessBase returns the base \p Expr of a field access. It returns
+/// either a \p DeclRefExpr, representing the base pointer to the struct, i.e.:
+///
+/// p in p-> a.b.c
+///
+/// or a \p MemberExpr, if the \p MemberExpr has the \p RecordDecl we're
+/// looking for:
+///
+/// struct s {
+///   struct s *ptr;
+///   int count;
+///   char array[] __attribute__((counted_by(count)));
+/// };
+///
+/// If we have an expression like \p p->ptr->array[index], we want the
+/// \p MemberExpr for \p p->ptr instead of \p p.
+class StructAccessBase
+: public ConstStmtVisitor {
+  const RecordDecl *ExpectedRD;
+
+  bool IsExpectedRecordDecl(const Expr *E) const {
+QualType Ty = E->getType();
+if (Ty->isPointerType())
+  Ty = Ty->getPointeeType();
+return ExpectedRD == Ty->getAsRecordDecl();
+  }
+
+public:
+  StructAccessBase(const RecordDecl *ExpectedRD) : ExpectedRD(ExpectedRD) {}
+
+  
//======//
+  //Visitor Methods
+  
//======//
+
+  // NOTE: If we build C++ support for counted_by, then we'll have to handle
+  // horrors like this:
+  //
+  // struct S {
+  //   int x, y;
+  //   int blah[] __attribute__((counted_by(x)));
+  // } s;
+  //
+  // int foo(int index, int val) {
+  //   int (S::*IHatePMDs)[] = &S::blah;
+  //   (s.*IHatePMDs)[index] = val;
+  // }
+
+  const Expr *Visit(const Expr *E) {
+return ConstStmtVisitor::Visit(E);
+  }
+
+  const Expr *VisitStmt(const Stmt *S) { return nullptr; }
+
+  // These are the types we expect to return (in order of most to least
+  // likely):
+  //
+  //   1. DeclRefExpr - This is the expression for the base of the structure.
+  //  It's exactly what we want to build an access to the \p counted_by
+  //  field.
+  //   2. MemberExpr - This is the expression that has the same \p RecordDecl
+  //  as the flexble array member's lexical enclosing \p RecordDecl. This
+  //  allows us to catch things like: "p->p->array"
+  //   3. CompoundLiteralExpr - This is for people who create something
+  //  heretical like (struct foo has a flexible array member):
+  //
+  //(struct foo){ 1, 2 }.blah[idx];
+  const Expr *VisitDeclRefExpr(const DeclRefExpr *E) {
+return IsExpectedRecordDecl(E) ? E : nullptr;
+  }
+  const Expr *VisitMemberExpr(const MemberExpr *E) {
+if (IsExpectedRecordDecl(E) && E->isArrow())
+  return E;
+const Expr *Res = Visit(E->getBase());
+return !Res && IsExpectedRecordDecl(E) ? E : Res;
+  }
+  const Expr *VisitCompoundLiteralExpr(const CompoundLiteralExpr *E) {
+return IsExpectedRecordDecl(E) ? E : nullptr;
+  }
+  const Expr *VisitCallExpr(const CallExpr *E) {
+return IsExpectedRecordDecl(E) ? E : nullptr;
+  }
+
+  const Expr *VisitArraySubscriptExpr(const ArraySubscriptExpr *E) {
+if (IsExpectedRecordDecl(E))
+  return E;
+return Visit(E->getBase());
+  }
+  const Expr *VisitCastExpr(const CastExpr *E) {
+return Visit(E->getSubExpr());
+  }
+  const Expr *VisitParenExpr(const ParenExpr *E) {
+return Visit(E->getSubExpr());
+  }
+  const Expr *VisitUnaryAddrOf(const UnaryOperator *E) {
+return Visit(E->getSubExpr());
+  }
+  const Expr *VisitUnaryDeref(const UnaryOperator *E) {
+return Visit(E->getSubExpr());
+  }
+};
+
+} // end anonymous namespace
+
+using RecIndicesTy =
+SmallVector, 8>;
+
+static bool getGEPIndicesToField(CodeGenFunction &CGF, const RecordDecl *RD,
+ const FieldDecl *FD, RecIndicesTy &Indices) {
+  const CGRecordLayout &Layout = CGF.CGM.getTypes().getCGRecordLayout(RD);
+  int64_t FieldNo = -1;
+  for (const Decl *D : RD->decls()) {
+if (const auto *Field = dyn_cast(D)) {
+  FieldNo = Layout.getLLVMFieldNo(Field);
+  if (FD == Field) {
+Indices.emplace_back(std::make_pair(RD, 
CGF.Builder.getInt32(FieldNo)));
+return true;
+  }
+}
+
+if (const auto *Record = dyn_cast(D)) {
+  ++FieldNo;
+  if (getGEPIndicesToField(CGF, Record, FD, Indices)) {
+Indices.emplace_back(std::make_pair(RD, 
CGF.Builder.getInt32(FieldNo)));
+return true;
+  }
+}
+  }
+
+  return false;
+}
+
+llvm::Value *CodeGenFunction::EmitCountedByFieldExpr(
+const Expr *Base, const FieldDecl *FAMDecl, const FieldDecl *CountDecl) {
+  // This method is typically called in contexts where we can't generate
+  // side-effects, like in __builtin_dynamic_object_size. When finding
+  // expressions, only choose those that have either already been emitted or
+  // can be loaded without side-effects.
+  const RecordDecl *RD = 
CountDecl->getParent()->getOuterLexicalRecordContext();
+
+  // F

[clang] [Clang] Implement the 'counted_by' attribute (PR #76348)

2024-01-08 Thread Bill Wendling via cfe-commits


@@ -944,22 +951,262 @@ static llvm::Value 
*getArrayIndexingBound(CodeGenFunction &CGF,
   return nullptr;
 }
 
+namespace {
+
+/// \p StructAccessBase returns the base \p Expr of a field access. It returns
+/// either a \p DeclRefExpr, representing the base pointer to the struct, i.e.:
+///
+/// p in p-> a.b.c
+///
+/// or a \p MemberExpr, if the \p MemberExpr has the \p RecordDecl we're
+/// looking for:
+///
+/// struct s {
+///   struct s *ptr;
+///   int count;
+///   char array[] __attribute__((counted_by(count)));
+/// };
+///
+/// If we have an expression like \p p->ptr->array[index], we want the
+/// \p MemberExpr for \p p->ptr instead of \p p.
+class StructAccessBase
+: public ConstStmtVisitor {
+  const RecordDecl *ExpectedRD;
+
+  bool IsExpectedRecordDecl(const Expr *E) const {
+QualType Ty = E->getType();
+if (Ty->isPointerType())
+  Ty = Ty->getPointeeType();
+return ExpectedRD == Ty->getAsRecordDecl();
+  }
+
+public:
+  StructAccessBase(const RecordDecl *ExpectedRD) : ExpectedRD(ExpectedRD) {}
+
+  
//======//
+  //Visitor Methods
+  
//======//
+
+  // NOTE: If we build C++ support for counted_by, then we'll have to handle
+  // horrors like this:
+  //
+  // struct S {
+  //   int x, y;
+  //   int blah[] __attribute__((counted_by(x)));
+  // } s;
+  //
+  // int foo(int index, int val) {
+  //   int (S::*IHatePMDs)[] = &S::blah;
+  //   (s.*IHatePMDs)[index] = val;
+  // }
+
+  const Expr *Visit(const Expr *E) {
+return ConstStmtVisitor::Visit(E);
+  }
+
+  const Expr *VisitStmt(const Stmt *S) { return nullptr; }
+
+  // These are the types we expect to return (in order of most to least
+  // likely):
+  //
+  //   1. DeclRefExpr - This is the expression for the base of the structure.
+  //  It's exactly what we want to build an access to the \p counted_by
+  //  field.
+  //   2. MemberExpr - This is the expression that has the same \p RecordDecl
+  //  as the flexble array member's lexical enclosing \p RecordDecl. This
+  //  allows us to catch things like: "p->p->array"
+  //   3. CompoundLiteralExpr - This is for people who create something
+  //  heretical like (struct foo has a flexible array member):
+  //
+  //(struct foo){ 1, 2 }.blah[idx];
+  const Expr *VisitDeclRefExpr(const DeclRefExpr *E) {
+return IsExpectedRecordDecl(E) ? E : nullptr;
+  }
+  const Expr *VisitMemberExpr(const MemberExpr *E) {
+if (IsExpectedRecordDecl(E) && E->isArrow())
+  return E;
+const Expr *Res = Visit(E->getBase());
+return !Res && IsExpectedRecordDecl(E) ? E : Res;
+  }
+  const Expr *VisitCompoundLiteralExpr(const CompoundLiteralExpr *E) {
+return IsExpectedRecordDecl(E) ? E : nullptr;
+  }
+  const Expr *VisitCallExpr(const CallExpr *E) {

bwendling wrote:

I have a test where I made a call and just wanted to make sure we can handle it:

```
struct annotated {
  struct annotated *ann_array[10];
  unsigned long flags;
  struct {
struct {
  int count;
  int array[] __counted_by(count);
};
  };
};

struct annotated * noinline test1_call(struct annotated *p) {
  if (test1_count++) {
__builtin_printf("already called\n");
__builtin_trap();
  }
  return &p[test1_idx];
}

void noinline test1(struct annotated *p, int index) {
  test1_count = 0;
  test1_call(p)->array[index] = __builtin_dynamic_object_size(p->array, 1);
}
```

https://github.com/llvm/llvm-project/pull/76348
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Clang] Implement the 'counted_by' attribute (PR #76348)

2024-01-08 Thread Bill Wendling via cfe-commits


@@ -944,22 +951,259 @@ static llvm::Value 
*getArrayIndexingBound(CodeGenFunction &CGF,
   return nullptr;
 }
 
+namespace {
+
+/// \p StructAccessBase returns the base \p Expr of a field access. It returns
+/// either a \p DeclRefExpr, representing the base pointer to the struct, i.e.:
+///
+/// p in p-> a.b.c
+///
+/// or a \p MemberExpr, if the \p MemberExpr has the \p RecordDecl we're
+/// looking for:
+///
+/// struct s {
+///   struct s *ptr;
+///   int count;
+///   char array[] __attribute__((counted_by(count)));
+/// };
+///
+/// If we have an expression like \p p->ptr->array[index], we want the
+/// \p MemberExpr for \p p->ptr instead of \p p.
+class StructAccessBase
+: public ConstStmtVisitor {
+  const RecordDecl *ExpectedRD;
+
+  bool IsExpectedRecordDecl(const Expr *E) const {
+QualType Ty = E->getType();
+if (Ty->isPointerType())
+  Ty = Ty->getPointeeType();
+return ExpectedRD == Ty->getAsRecordDecl();
+  }
+
+public:
+  StructAccessBase(const RecordDecl *ExpectedRD) : ExpectedRD(ExpectedRD) {}
+
+  
//======//
+  //Visitor Methods
+  
//======//
+
+  // NOTE: If we build C++ support for counted_by, then we'll have to handle
+  // horrors like this:
+  //
+  // struct S {
+  //   int x, y;
+  //   int blah[] __attribute__((counted_by(x)));
+  // } s;
+  //
+  // int foo(int index, int val) {
+  //   int (S::*IHatePMDs)[] = &S::blah;
+  //   (s.*IHatePMDs)[index] = val;
+  // }
+
+  const Expr *Visit(const Expr *E) {
+return ConstStmtVisitor::Visit(E);
+  }
+
+  const Expr *VisitStmt(const Stmt *S) { return nullptr; }
+
+  // These are the types we expect to return (in order of most to least
+  // likely):
+  //
+  //   1. DeclRefExpr - This is the expression for the base of the structure.
+  //  It's exactly what we want to build an access to the \p counted_by
+  //  field.
+  //   2. MemberExpr - This is the expression that has the same \p RecordDecl
+  //  as the flexble array member's lexical enclosing \p RecordDecl. This
+  //  allows us to catch things like: "p->p->array"
+  //   3. CompoundLiteralExpr - This is for people who create something
+  //  heretical like (struct foo has a flexible array member):
+  //
+  //(struct foo){ 1, 2 }.blah[idx];
+  const Expr *VisitDeclRefExpr(const DeclRefExpr *E) {
+return IsExpectedRecordDecl(E) ? E : nullptr;
+  }
+  const Expr *VisitMemberExpr(const MemberExpr *E) {
+if (IsExpectedRecordDecl(E) && E->isArrow())
+  return E;
+const Expr *Res = Visit(E->getBase());
+return !Res && IsExpectedRecordDecl(E) ? E : Res;
+  }
+  const Expr *VisitCompoundLiteralExpr(const CompoundLiteralExpr *E) {
+return IsExpectedRecordDecl(E) ? E : nullptr;
+  }
+  const Expr *VisitCallExpr(const CallExpr *E) {
+return IsExpectedRecordDecl(E) ? E : nullptr;
+  }
+
+  const Expr *VisitArraySubscriptExpr(const ArraySubscriptExpr *E) {
+if (IsExpectedRecordDecl(E))
+  return E;
+return Visit(E->getBase());
+  }
+  const Expr *VisitCastExpr(const CastExpr *E) {
+return Visit(E->getSubExpr());
+  }
+  const Expr *VisitParenExpr(const ParenExpr *E) {
+return Visit(E->getSubExpr());
+  }
+  const Expr *VisitUnaryAddrOf(const UnaryOperator *E) {
+return Visit(E->getSubExpr());
+  }
+  const Expr *VisitUnaryDeref(const UnaryOperator *E) {
+return Visit(E->getSubExpr());
+  }
+};
+
+} // end anonymous namespace
+
+using RecIndicesTy =
+SmallVector, 8>;
+
+static bool getGEPIndicesToField(CodeGenFunction &CGF, const RecordDecl *RD,
+ const FieldDecl *FD, RecIndicesTy &Indices) {
+  const CGRecordLayout &Layout = CGF.CGM.getTypes().getCGRecordLayout(RD);
+  int64_t FieldNo = -1;
+  for (const Decl *D : RD->decls()) {
+if (const auto *Field = dyn_cast(D)) {
+  FieldNo = Layout.getLLVMFieldNo(Field);
+  if (FD == Field) {
+Indices.emplace_back(std::make_pair(RD, 
CGF.Builder.getInt32(FieldNo)));
+return true;
+  }
+}
+
+if (const auto *Record = dyn_cast(D)) {
+  ++FieldNo;
+  if (getGEPIndicesToField(CGF, Record, FD, Indices)) {
+Indices.emplace_back(std::make_pair(RD, 
CGF.Builder.getInt32(FieldNo)));
+return true;
+  }
+}
+  }
+
+  return false;
+}
+
+llvm::Value *CodeGenFunction::EmitCountedByFieldExpr(
+const Expr *Base, const FieldDecl *FAMDecl, const FieldDecl *CountDecl) {
+  // This method is typically called in contexts where we can't generate
+  // side-effects, like in __builtin_dynamic_object_size. When finding
+  // expressions, only choose those that have either already been emitted or
+  // can be loaded without side-effects.
+  const RecordDecl *RD = 
CountDecl->getParent()->getOuterLexicalRecordContext();
+
+  // F

[clang] [Clang] Implement the 'counted_by' attribute (PR #76348)

2024-01-08 Thread Bill Wendling via cfe-commits


@@ -818,6 +819,189 @@ CodeGenFunction::evaluateOrEmitBuiltinObjectSize(const 
Expr *E, unsigned Type,
   return ConstantInt::get(ResType, ObjectSize, /*isSigned=*/true);
 }
 
+const FieldDecl *CodeGenFunction::FindFlexibleArrayMemberField(
+ASTContext &Ctx, const RecordDecl *RD, uint64_t &Offset) {
+  const LangOptions::StrictFlexArraysLevelKind StrictFlexArraysLevel =
+  getLangOpts().getStrictFlexArraysLevel();
+  unsigned FieldNo = 0;
+
+  for (const Decl *D : RD->decls()) {
+if (const auto *Field = dyn_cast(D);
+Field && Decl::isFlexibleArrayMemberLike(
+ Ctx, Field, Field->getType(), StrictFlexArraysLevel,
+ /*IgnoreTemplateOrMacroSubstitution=*/true)) {
+  const ASTRecordLayout &Layout = Ctx.getASTRecordLayout(RD);
+  Offset += Layout.getFieldOffset(FieldNo);
+  return Field;
+}
+
+if (const auto *Record = dyn_cast(D))
+  if (const FieldDecl *Field =
+  FindFlexibleArrayMemberField(Ctx, Record, Offset)) {
+const ASTRecordLayout &Layout = Ctx.getASTRecordLayout(RD);
+Offset += Layout.getFieldOffset(FieldNo);
+return Field;
+  }
+
+if (isa(D))
+  ++FieldNo;
+  }
+
+  return nullptr;
+}
+
+llvm::Value *
+CodeGenFunction::emitFlexibleArrayMemberSize(const Expr *E, unsigned Type,
+ llvm::IntegerType *ResType) {
+  // The code generated here calculates the size of a struct with a flexible
+  // array member that uses the counted_by attribute. There are two instances
+  // we handle:
+  //
+  //   struct s {
+  // unsigned long flags;
+  // int count;
+  // int array[] __attribute__((counted_by(count)));
+  //   }
+  //
+  //   1) bdos of the flexible array itself:
+  //
+  // __builtin_dynamic_object_size(p->array, 1) ==
+  // p->count * sizeof(*p->array)
+  //
+  //   2) bdos of a pointer into the flexible array:
+  //
+  // __builtin_dynamic_object_size(&p->array[42], 1) ==
+  // (p->count - 42) * sizeof(*p->array)
+  //
+  //   2) bdos of the whole struct, including the flexible array:
+  //
+  // __builtin_dynamic_object_size(p, 1) ==
+  //max(sizeof(struct s),
+  //offsetof(struct s, array) + p->count * sizeof(*p->array))
+  //
+  ASTContext &Ctx = getContext();
+  const Expr *Base = E->IgnoreParenImpCasts();
+  const Expr *Idx = nullptr;
+
+  if (const auto *UO = dyn_cast(Base);
+  UO && UO->getOpcode() == UO_AddrOf) {
+Expr *SubExpr = UO->getSubExpr()->IgnoreParenImpCasts();
+if (const auto *ASE = dyn_cast(SubExpr)) {
+  Base = ASE->getBase()->IgnoreParenImpCasts();
+  Idx = ASE->getIdx()->IgnoreParenImpCasts();
+
+  if (const auto *IL = dyn_cast(Idx)) {
+int64_t Val = IL->getValue().getSExtValue();
+if (Val < 0)
+  // __bdos returns 0 for negative indexes into an array in a struct.
+  return getDefaultBuiltinObjectSizeResult(Type, ResType);
+
+if (Val == 0)
+  // The index is 0, so we don't need to take it into account.
+  Idx = nullptr;
+  }
+} else {
+  // Potential pointer to another element in the struct.
+  Base = SubExpr;
+}
+  }
+
+  // Get the flexible array member Decl.
+  const RecordDecl *OuterRD = nullptr;
+  if (const auto *ME = dyn_cast(Base)) {
+// Check if \p Base is referencing the FAM itself.
+const ValueDecl *VD = ME->getMemberDecl();
+OuterRD = VD->getDeclContext()->getOuterLexicalRecordContext();
+  } else if (const auto *DRE = dyn_cast(Base)) {
+// Check if we're pointing to the whole struct.
+QualType Ty = DRE->getDecl()->getType();
+if (Ty->isPointerType())
+  Ty = Ty->getPointeeType();
+OuterRD = Ty->getAsRecordDecl();
+  }
+
+  if (!OuterRD)
+return nullptr;
+
+  uint64_t Offset = 0;
+  const FieldDecl *FAMDecl = FindFlexibleArrayMemberField(Ctx, OuterRD, 
Offset);
+  Offset = Ctx.toCharUnitsFromBits(Offset).getQuantity();
+
+  if (!FAMDecl || !FAMDecl->hasAttr())
+// No flexible array member found or it doesn't have the "counted_by"
+// attribute.
+return nullptr;
+
+  const FieldDecl *CountedByFD = FindCountedByField(FAMDecl);
+  if (!CountedByFD)
+// Can't find the field referenced by the "counted_by" attribute.
+return nullptr;
+
+  // Build a load of the counted_by field.
+  bool IsSigned = CountedByFD->getType()->isSignedIntegerType();
+  Value *CountedByInst = EmitCountedByFieldExpr(Base, FAMDecl, CountedByFD);
+  if (!CountedByInst)
+return nullptr;
+
+  llvm::Type *CountedByTy = CountedByInst->getType();
+
+  // Build a load of the index and subtract it from the count.
+  Value *IdxInst = nullptr;
+  if (Idx) {
+bool IdxSigned = Idx->getType()->isSignedIntegerType();
+IdxInst = EmitAnyExprToTemp(Idx).getScalarVal();
+IdxInst = IdxSigned ? Builder.CreateSExtOrTrunc(IdxInst, CountedByTy)
+: Builder.Create

[clang] [Clang] Implement the 'counted_by' attribute (PR #76348)

2024-01-08 Thread Bill Wendling via cfe-commits


@@ -818,6 +819,189 @@ CodeGenFunction::evaluateOrEmitBuiltinObjectSize(const 
Expr *E, unsigned Type,
   return ConstantInt::get(ResType, ObjectSize, /*isSigned=*/true);
 }
 
+const FieldDecl *CodeGenFunction::FindFlexibleArrayMemberField(
+ASTContext &Ctx, const RecordDecl *RD, uint64_t &Offset) {
+  const LangOptions::StrictFlexArraysLevelKind StrictFlexArraysLevel =
+  getLangOpts().getStrictFlexArraysLevel();
+  unsigned FieldNo = 0;
+
+  for (const Decl *D : RD->decls()) {
+if (const auto *Field = dyn_cast(D);
+Field && Decl::isFlexibleArrayMemberLike(
+ Ctx, Field, Field->getType(), StrictFlexArraysLevel,
+ /*IgnoreTemplateOrMacroSubstitution=*/true)) {
+  const ASTRecordLayout &Layout = Ctx.getASTRecordLayout(RD);
+  Offset += Layout.getFieldOffset(FieldNo);
+  return Field;
+}
+
+if (const auto *Record = dyn_cast(D))
+  if (const FieldDecl *Field =
+  FindFlexibleArrayMemberField(Ctx, Record, Offset)) {
+const ASTRecordLayout &Layout = Ctx.getASTRecordLayout(RD);
+Offset += Layout.getFieldOffset(FieldNo);
+return Field;
+  }
+
+if (isa(D))
+  ++FieldNo;
+  }
+
+  return nullptr;
+}
+
+llvm::Value *
+CodeGenFunction::emitFlexibleArrayMemberSize(const Expr *E, unsigned Type,
+ llvm::IntegerType *ResType) {
+  // The code generated here calculates the size of a struct with a flexible
+  // array member that uses the counted_by attribute. There are two instances
+  // we handle:
+  //
+  //   struct s {
+  // unsigned long flags;
+  // int count;
+  // int array[] __attribute__((counted_by(count)));
+  //   }
+  //
+  //   1) bdos of the flexible array itself:
+  //
+  // __builtin_dynamic_object_size(p->array, 1) ==
+  // p->count * sizeof(*p->array)
+  //
+  //   2) bdos of a pointer into the flexible array:
+  //
+  // __builtin_dynamic_object_size(&p->array[42], 1) ==
+  // (p->count - 42) * sizeof(*p->array)
+  //
+  //   2) bdos of the whole struct, including the flexible array:
+  //
+  // __builtin_dynamic_object_size(p, 1) ==
+  //max(sizeof(struct s),
+  //offsetof(struct s, array) + p->count * sizeof(*p->array))
+  //
+  ASTContext &Ctx = getContext();
+  const Expr *Base = E->IgnoreParenImpCasts();
+  const Expr *Idx = nullptr;
+
+  if (const auto *UO = dyn_cast(Base);
+  UO && UO->getOpcode() == UO_AddrOf) {
+Expr *SubExpr = UO->getSubExpr()->IgnoreParenImpCasts();
+if (const auto *ASE = dyn_cast(SubExpr)) {
+  Base = ASE->getBase()->IgnoreParenImpCasts();
+  Idx = ASE->getIdx()->IgnoreParenImpCasts();
+
+  if (const auto *IL = dyn_cast(Idx)) {
+int64_t Val = IL->getValue().getSExtValue();
+if (Val < 0)
+  // __bdos returns 0 for negative indexes into an array in a struct.
+  return getDefaultBuiltinObjectSizeResult(Type, ResType);
+
+if (Val == 0)
+  // The index is 0, so we don't need to take it into account.
+  Idx = nullptr;
+  }
+} else {
+  // Potential pointer to another element in the struct.
+  Base = SubExpr;
+}
+  }
+
+  // Get the flexible array member Decl.
+  const RecordDecl *OuterRD = nullptr;
+  if (const auto *ME = dyn_cast(Base)) {
+// Check if \p Base is referencing the FAM itself.
+const ValueDecl *VD = ME->getMemberDecl();
+OuterRD = VD->getDeclContext()->getOuterLexicalRecordContext();
+  } else if (const auto *DRE = dyn_cast(Base)) {
+// Check if we're pointing to the whole struct.
+QualType Ty = DRE->getDecl()->getType();
+if (Ty->isPointerType())
+  Ty = Ty->getPointeeType();
+OuterRD = Ty->getAsRecordDecl();
+  }
+
+  if (!OuterRD)
+return nullptr;
+
+  uint64_t Offset = 0;
+  const FieldDecl *FAMDecl = FindFlexibleArrayMemberField(Ctx, OuterRD, 
Offset);
+  Offset = Ctx.toCharUnitsFromBits(Offset).getQuantity();
+
+  if (!FAMDecl || !FAMDecl->hasAttr())
+// No flexible array member found or it doesn't have the "counted_by"
+// attribute.
+return nullptr;
+
+  const FieldDecl *CountedByFD = FindCountedByField(FAMDecl);
+  if (!CountedByFD)
+// Can't find the field referenced by the "counted_by" attribute.
+return nullptr;
+
+  // Build a load of the counted_by field.
+  bool IsSigned = CountedByFD->getType()->isSignedIntegerType();
+  Value *CountedByInst = EmitCountedByFieldExpr(Base, FAMDecl, CountedByFD);
+  if (!CountedByInst)
+return nullptr;
+
+  llvm::Type *CountedByTy = CountedByInst->getType();
+
+  // Build a load of the index and subtract it from the count.
+  Value *IdxInst = nullptr;
+  if (Idx) {
+bool IdxSigned = Idx->getType()->isSignedIntegerType();
+IdxInst = EmitAnyExprToTemp(Idx).getScalarVal();
+IdxInst = IdxSigned ? Builder.CreateSExtOrTrunc(IdxInst, CountedByTy)
+: Builder.Create

[clang] [Clang] Implement the 'counted_by' attribute (PR #76348)

2024-01-08 Thread Kees Cook via cfe-commits

kees wrote:

Possibly due to bug #72032 , I can get this tree to crash using the latest 
`array-bounds.c` test from
https://github.com/kees/kernel-tools/tree/trunk/fortify

Specifically:

```
struct anon_struct {
unsigned long flags;
long count;
int array[] __counted_by(count);
};

struct composite {
unsigned stuff;
struct annotated inner;
};

static struct composite * noinline alloc_composite(int index)
{
struct composite *p;

p = malloc(sizeof(*p) + index * sizeof(*p->inner.array));
p->inner.count = index;

return p;
}

   struct composite *c;
   c = alloc_composite(index);
   ... actions on c->inner.array ...
```

```
3.  array-bounds.c:363:1 : Generating code 
for declaration 'counted_by_seen_by_bdos'
4.  array-bounds.c:405:2 : LLVM IR 
generation of compound statement ('{}')
...
 #4 0x556574d5b858 
clang::CodeGen::CodeGenTBAA::getAccessInfo(clang::QualType)
 #5 0x55657489e25d 
clang::CodeGen::CodeGenModule::getTBAAAccessInfo(clang::QualType)
 #6 0x5565748a9c20 
clang::CodeGen::CodeGenModule::getNaturalTypeAlignment(clang::QualType, 
clang::CodeGen::LValueBaseInfo*, clang::CodeGen::TBAAAccessInfo*, bool)
 #7 0x5565749fc4a2 EmitPointerWithAlignment(clang::Expr const*, 
clang::CodeGen::LValueBaseInfo*, clang::CodeGen::TBAAAccessInfo*, 
clang::CodeGen::KnownNonNull_t, clang::CodeGen::CodeGenFunction&) CGExpr.cpp:0:0
 #8 0x5565749f94bd 
clang::CodeGen::CodeGenFunction::EmitCountedByFieldExpr(clang::Expr const*, 
clang::FieldDecl const*, clang::FieldDecl const*)
```



https://github.com/llvm/llvm-project/pull/76348
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Clang] Implement the 'counted_by' attribute (PR #76348)

2024-01-08 Thread Yeoul Na via cfe-commits


@@ -944,22 +951,259 @@ static llvm::Value 
*getArrayIndexingBound(CodeGenFunction &CGF,
   return nullptr;
 }
 
+namespace {
+
+/// \p StructAccessBase returns the base \p Expr of a field access. It returns
+/// either a \p DeclRefExpr, representing the base pointer to the struct, i.e.:
+///
+/// p in p-> a.b.c
+///
+/// or a \p MemberExpr, if the \p MemberExpr has the \p RecordDecl we're
+/// looking for:
+///
+/// struct s {
+///   struct s *ptr;
+///   int count;
+///   char array[] __attribute__((counted_by(count)));
+/// };
+///
+/// If we have an expression like \p p->ptr->array[index], we want the
+/// \p MemberExpr for \p p->ptr instead of \p p.
+class StructAccessBase
+: public ConstStmtVisitor {
+  const RecordDecl *ExpectedRD;
+
+  bool IsExpectedRecordDecl(const Expr *E) const {
+QualType Ty = E->getType();
+if (Ty->isPointerType())
+  Ty = Ty->getPointeeType();
+return ExpectedRD == Ty->getAsRecordDecl();
+  }
+
+public:
+  StructAccessBase(const RecordDecl *ExpectedRD) : ExpectedRD(ExpectedRD) {}
+
+  
//======//
+  //Visitor Methods
+  
//======//
+
+  // NOTE: If we build C++ support for counted_by, then we'll have to handle
+  // horrors like this:
+  //
+  // struct S {
+  //   int x, y;
+  //   int blah[] __attribute__((counted_by(x)));
+  // } s;
+  //
+  // int foo(int index, int val) {
+  //   int (S::*IHatePMDs)[] = &S::blah;
+  //   (s.*IHatePMDs)[index] = val;
+  // }
+
+  const Expr *Visit(const Expr *E) {
+return ConstStmtVisitor::Visit(E);
+  }
+
+  const Expr *VisitStmt(const Stmt *S) { return nullptr; }
+
+  // These are the types we expect to return (in order of most to least
+  // likely):
+  //
+  //   1. DeclRefExpr - This is the expression for the base of the structure.
+  //  It's exactly what we want to build an access to the \p counted_by
+  //  field.
+  //   2. MemberExpr - This is the expression that has the same \p RecordDecl
+  //  as the flexble array member's lexical enclosing \p RecordDecl. This
+  //  allows us to catch things like: "p->p->array"
+  //   3. CompoundLiteralExpr - This is for people who create something
+  //  heretical like (struct foo has a flexible array member):
+  //
+  //(struct foo){ 1, 2 }.blah[idx];
+  const Expr *VisitDeclRefExpr(const DeclRefExpr *E) {
+return IsExpectedRecordDecl(E) ? E : nullptr;
+  }
+  const Expr *VisitMemberExpr(const MemberExpr *E) {
+if (IsExpectedRecordDecl(E) && E->isArrow())
+  return E;
+const Expr *Res = Visit(E->getBase());
+return !Res && IsExpectedRecordDecl(E) ? E : Res;
+  }
+  const Expr *VisitCompoundLiteralExpr(const CompoundLiteralExpr *E) {
+return IsExpectedRecordDecl(E) ? E : nullptr;
+  }
+  const Expr *VisitCallExpr(const CallExpr *E) {
+return IsExpectedRecordDecl(E) ? E : nullptr;
+  }
+
+  const Expr *VisitArraySubscriptExpr(const ArraySubscriptExpr *E) {
+if (IsExpectedRecordDecl(E))
+  return E;
+return Visit(E->getBase());
+  }
+  const Expr *VisitCastExpr(const CastExpr *E) {
+return Visit(E->getSubExpr());
+  }
+  const Expr *VisitParenExpr(const ParenExpr *E) {
+return Visit(E->getSubExpr());
+  }
+  const Expr *VisitUnaryAddrOf(const UnaryOperator *E) {
+return Visit(E->getSubExpr());
+  }
+  const Expr *VisitUnaryDeref(const UnaryOperator *E) {
+return Visit(E->getSubExpr());
+  }
+};
+
+} // end anonymous namespace
+
+using RecIndicesTy =
+SmallVector, 8>;
+
+static bool getGEPIndicesToField(CodeGenFunction &CGF, const RecordDecl *RD,
+ const FieldDecl *FD, RecIndicesTy &Indices) {
+  const CGRecordLayout &Layout = CGF.CGM.getTypes().getCGRecordLayout(RD);
+  int64_t FieldNo = -1;
+  for (const Decl *D : RD->decls()) {
+if (const auto *Field = dyn_cast(D)) {
+  FieldNo = Layout.getLLVMFieldNo(Field);
+  if (FD == Field) {
+Indices.emplace_back(std::make_pair(RD, 
CGF.Builder.getInt32(FieldNo)));
+return true;
+  }
+}
+
+if (const auto *Record = dyn_cast(D)) {
+  ++FieldNo;
+  if (getGEPIndicesToField(CGF, Record, FD, Indices)) {
+Indices.emplace_back(std::make_pair(RD, 
CGF.Builder.getInt32(FieldNo)));
+return true;
+  }
+}
+  }
+
+  return false;
+}
+
+llvm::Value *CodeGenFunction::EmitCountedByFieldExpr(
+const Expr *Base, const FieldDecl *FAMDecl, const FieldDecl *CountDecl) {
+  // This method is typically called in contexts where we can't generate
+  // side-effects, like in __builtin_dynamic_object_size. When finding
+  // expressions, only choose those that have either already been emitted or
+  // can be loaded without side-effects.
+  const RecordDecl *RD = 
CountDecl->getParent()->getOuterLexicalRecordContext();
+
+  // F

[clang] [Clang] Implement the 'counted_by' attribute (PR #76348)

2024-01-08 Thread Yeoul Na via cfe-commits


@@ -818,6 +819,189 @@ CodeGenFunction::evaluateOrEmitBuiltinObjectSize(const 
Expr *E, unsigned Type,
   return ConstantInt::get(ResType, ObjectSize, /*isSigned=*/true);
 }
 
+const FieldDecl *CodeGenFunction::FindFlexibleArrayMemberField(
+ASTContext &Ctx, const RecordDecl *RD, uint64_t &Offset) {
+  const LangOptions::StrictFlexArraysLevelKind StrictFlexArraysLevel =
+  getLangOpts().getStrictFlexArraysLevel();
+  unsigned FieldNo = 0;
+
+  for (const Decl *D : RD->decls()) {
+if (const auto *Field = dyn_cast(D);
+Field && Decl::isFlexibleArrayMemberLike(
+ Ctx, Field, Field->getType(), StrictFlexArraysLevel,
+ /*IgnoreTemplateOrMacroSubstitution=*/true)) {
+  const ASTRecordLayout &Layout = Ctx.getASTRecordLayout(RD);
+  Offset += Layout.getFieldOffset(FieldNo);
+  return Field;
+}
+
+if (const auto *Record = dyn_cast(D))
+  if (const FieldDecl *Field =
+  FindFlexibleArrayMemberField(Ctx, Record, Offset)) {
+const ASTRecordLayout &Layout = Ctx.getASTRecordLayout(RD);
+Offset += Layout.getFieldOffset(FieldNo);
+return Field;
+  }
+
+if (isa(D))
+  ++FieldNo;
+  }
+
+  return nullptr;
+}
+
+llvm::Value *
+CodeGenFunction::emitFlexibleArrayMemberSize(const Expr *E, unsigned Type,
+ llvm::IntegerType *ResType) {
+  // The code generated here calculates the size of a struct with a flexible
+  // array member that uses the counted_by attribute. There are two instances
+  // we handle:
+  //
+  //   struct s {
+  // unsigned long flags;
+  // int count;
+  // int array[] __attribute__((counted_by(count)));
+  //   }
+  //
+  //   1) bdos of the flexible array itself:
+  //
+  // __builtin_dynamic_object_size(p->array, 1) ==
+  // p->count * sizeof(*p->array)
+  //
+  //   2) bdos of a pointer into the flexible array:
+  //
+  // __builtin_dynamic_object_size(&p->array[42], 1) ==
+  // (p->count - 42) * sizeof(*p->array)
+  //
+  //   2) bdos of the whole struct, including the flexible array:
+  //
+  // __builtin_dynamic_object_size(p, 1) ==
+  //max(sizeof(struct s),
+  //offsetof(struct s, array) + p->count * sizeof(*p->array))
+  //
+  ASTContext &Ctx = getContext();
+  const Expr *Base = E->IgnoreParenImpCasts();
+  const Expr *Idx = nullptr;
+
+  if (const auto *UO = dyn_cast(Base);
+  UO && UO->getOpcode() == UO_AddrOf) {
+Expr *SubExpr = UO->getSubExpr()->IgnoreParenImpCasts();
+if (const auto *ASE = dyn_cast(SubExpr)) {
+  Base = ASE->getBase()->IgnoreParenImpCasts();
+  Idx = ASE->getIdx()->IgnoreParenImpCasts();
+
+  if (const auto *IL = dyn_cast(Idx)) {
+int64_t Val = IL->getValue().getSExtValue();
+if (Val < 0)
+  // __bdos returns 0 for negative indexes into an array in a struct.
+  return getDefaultBuiltinObjectSizeResult(Type, ResType);
+
+if (Val == 0)
+  // The index is 0, so we don't need to take it into account.
+  Idx = nullptr;
+  }
+} else {
+  // Potential pointer to another element in the struct.
+  Base = SubExpr;
+}
+  }
+
+  // Get the flexible array member Decl.
+  const RecordDecl *OuterRD = nullptr;
+  if (const auto *ME = dyn_cast(Base)) {
+// Check if \p Base is referencing the FAM itself.
+const ValueDecl *VD = ME->getMemberDecl();
+OuterRD = VD->getDeclContext()->getOuterLexicalRecordContext();
+  } else if (const auto *DRE = dyn_cast(Base)) {
+// Check if we're pointing to the whole struct.
+QualType Ty = DRE->getDecl()->getType();
+if (Ty->isPointerType())
+  Ty = Ty->getPointeeType();
+OuterRD = Ty->getAsRecordDecl();
+  }
+
+  if (!OuterRD)
+return nullptr;
+
+  uint64_t Offset = 0;
+  const FieldDecl *FAMDecl = FindFlexibleArrayMemberField(Ctx, OuterRD, 
Offset);
+  Offset = Ctx.toCharUnitsFromBits(Offset).getQuantity();
+
+  if (!FAMDecl || !FAMDecl->hasAttr())
+// No flexible array member found or it doesn't have the "counted_by"
+// attribute.
+return nullptr;
+
+  const FieldDecl *CountedByFD = FindCountedByField(FAMDecl);
+  if (!CountedByFD)
+// Can't find the field referenced by the "counted_by" attribute.
+return nullptr;
+
+  // Build a load of the counted_by field.
+  bool IsSigned = CountedByFD->getType()->isSignedIntegerType();
+  Value *CountedByInst = EmitCountedByFieldExpr(Base, FAMDecl, CountedByFD);
+  if (!CountedByInst)
+return nullptr;
+
+  llvm::Type *CountedByTy = CountedByInst->getType();
+
+  // Build a load of the index and subtract it from the count.
+  Value *IdxInst = nullptr;
+  if (Idx) {
+bool IdxSigned = Idx->getType()->isSignedIntegerType();
+IdxInst = EmitAnyExprToTemp(Idx).getScalarVal();
+IdxInst = IdxSigned ? Builder.CreateSExtOrTrunc(IdxInst, CountedByTy)
+: Builder.Create

[clang] [Clang] Implement the 'counted_by' attribute (PR #76348)

2024-01-08 Thread Yeoul Na via cfe-commits


@@ -818,6 +819,189 @@ CodeGenFunction::evaluateOrEmitBuiltinObjectSize(const 
Expr *E, unsigned Type,
   return ConstantInt::get(ResType, ObjectSize, /*isSigned=*/true);
 }
 
+const FieldDecl *CodeGenFunction::FindFlexibleArrayMemberField(
+ASTContext &Ctx, const RecordDecl *RD, uint64_t &Offset) {
+  const LangOptions::StrictFlexArraysLevelKind StrictFlexArraysLevel =
+  getLangOpts().getStrictFlexArraysLevel();
+  unsigned FieldNo = 0;
+
+  for (const Decl *D : RD->decls()) {
+if (const auto *Field = dyn_cast(D);
+Field && Decl::isFlexibleArrayMemberLike(
+ Ctx, Field, Field->getType(), StrictFlexArraysLevel,
+ /*IgnoreTemplateOrMacroSubstitution=*/true)) {
+  const ASTRecordLayout &Layout = Ctx.getASTRecordLayout(RD);
+  Offset += Layout.getFieldOffset(FieldNo);
+  return Field;
+}
+
+if (const auto *Record = dyn_cast(D))
+  if (const FieldDecl *Field =
+  FindFlexibleArrayMemberField(Ctx, Record, Offset)) {
+const ASTRecordLayout &Layout = Ctx.getASTRecordLayout(RD);
+Offset += Layout.getFieldOffset(FieldNo);
+return Field;
+  }
+
+if (isa(D))
+  ++FieldNo;
+  }
+
+  return nullptr;
+}
+
+llvm::Value *
+CodeGenFunction::emitFlexibleArrayMemberSize(const Expr *E, unsigned Type,
+ llvm::IntegerType *ResType) {
+  // The code generated here calculates the size of a struct with a flexible
+  // array member that uses the counted_by attribute. There are two instances
+  // we handle:
+  //
+  //   struct s {
+  // unsigned long flags;
+  // int count;
+  // int array[] __attribute__((counted_by(count)));
+  //   }
+  //
+  //   1) bdos of the flexible array itself:
+  //
+  // __builtin_dynamic_object_size(p->array, 1) ==
+  // p->count * sizeof(*p->array)
+  //
+  //   2) bdos of a pointer into the flexible array:
+  //
+  // __builtin_dynamic_object_size(&p->array[42], 1) ==
+  // (p->count - 42) * sizeof(*p->array)
+  //
+  //   2) bdos of the whole struct, including the flexible array:
+  //
+  // __builtin_dynamic_object_size(p, 1) ==
+  //max(sizeof(struct s),
+  //offsetof(struct s, array) + p->count * sizeof(*p->array))
+  //
+  ASTContext &Ctx = getContext();
+  const Expr *Base = E->IgnoreParenImpCasts();
+  const Expr *Idx = nullptr;
+
+  if (const auto *UO = dyn_cast(Base);
+  UO && UO->getOpcode() == UO_AddrOf) {
+Expr *SubExpr = UO->getSubExpr()->IgnoreParenImpCasts();
+if (const auto *ASE = dyn_cast(SubExpr)) {
+  Base = ASE->getBase()->IgnoreParenImpCasts();
+  Idx = ASE->getIdx()->IgnoreParenImpCasts();
+
+  if (const auto *IL = dyn_cast(Idx)) {
+int64_t Val = IL->getValue().getSExtValue();
+if (Val < 0)
+  // __bdos returns 0 for negative indexes into an array in a struct.
+  return getDefaultBuiltinObjectSizeResult(Type, ResType);
+
+if (Val == 0)
+  // The index is 0, so we don't need to take it into account.
+  Idx = nullptr;
+  }
+} else {
+  // Potential pointer to another element in the struct.
+  Base = SubExpr;
+}
+  }
+
+  // Get the flexible array member Decl.
+  const RecordDecl *OuterRD = nullptr;
+  if (const auto *ME = dyn_cast(Base)) {
+// Check if \p Base is referencing the FAM itself.
+const ValueDecl *VD = ME->getMemberDecl();
+OuterRD = VD->getDeclContext()->getOuterLexicalRecordContext();
+  } else if (const auto *DRE = dyn_cast(Base)) {
+// Check if we're pointing to the whole struct.
+QualType Ty = DRE->getDecl()->getType();
+if (Ty->isPointerType())
+  Ty = Ty->getPointeeType();
+OuterRD = Ty->getAsRecordDecl();
+  }
+
+  if (!OuterRD)
+return nullptr;
+
+  uint64_t Offset = 0;
+  const FieldDecl *FAMDecl = FindFlexibleArrayMemberField(Ctx, OuterRD, 
Offset);
+  Offset = Ctx.toCharUnitsFromBits(Offset).getQuantity();
+
+  if (!FAMDecl || !FAMDecl->hasAttr())
+// No flexible array member found or it doesn't have the "counted_by"
+// attribute.
+return nullptr;
+
+  const FieldDecl *CountedByFD = FindCountedByField(FAMDecl);
+  if (!CountedByFD)
+// Can't find the field referenced by the "counted_by" attribute.
+return nullptr;
+
+  // Build a load of the counted_by field.
+  bool IsSigned = CountedByFD->getType()->isSignedIntegerType();
+  Value *CountedByInst = EmitCountedByFieldExpr(Base, FAMDecl, CountedByFD);
+  if (!CountedByInst)
+return nullptr;
+
+  llvm::Type *CountedByTy = CountedByInst->getType();
+
+  // Build a load of the index and subtract it from the count.
+  Value *IdxInst = nullptr;
+  if (Idx) {
+bool IdxSigned = Idx->getType()->isSignedIntegerType();
+IdxInst = EmitAnyExprToTemp(Idx).getScalarVal();
+IdxInst = IdxSigned ? Builder.CreateSExtOrTrunc(IdxInst, CountedByTy)

rapidsna wrote:

Maybe w

[clang] [Clang] Implement the 'counted_by' attribute (PR #76348)

2024-01-08 Thread Yeoul Na via cfe-commits


@@ -944,22 +951,262 @@ static llvm::Value 
*getArrayIndexingBound(CodeGenFunction &CGF,
   return nullptr;
 }
 
+namespace {
+
+/// \p StructAccessBase returns the base \p Expr of a field access. It returns
+/// either a \p DeclRefExpr, representing the base pointer to the struct, i.e.:
+///
+/// p in p-> a.b.c
+///
+/// or a \p MemberExpr, if the \p MemberExpr has the \p RecordDecl we're
+/// looking for:
+///
+/// struct s {
+///   struct s *ptr;
+///   int count;
+///   char array[] __attribute__((counted_by(count)));
+/// };
+///
+/// If we have an expression like \p p->ptr->array[index], we want the
+/// \p MemberExpr for \p p->ptr instead of \p p.
+class StructAccessBase
+: public ConstStmtVisitor {
+  const RecordDecl *ExpectedRD;
+
+  bool IsExpectedRecordDecl(const Expr *E) const {
+QualType Ty = E->getType();
+if (Ty->isPointerType())
+  Ty = Ty->getPointeeType();
+return ExpectedRD == Ty->getAsRecordDecl();
+  }
+
+public:
+  StructAccessBase(const RecordDecl *ExpectedRD) : ExpectedRD(ExpectedRD) {}
+
+  
//======//
+  //Visitor Methods
+  
//======//
+
+  // NOTE: If we build C++ support for counted_by, then we'll have to handle
+  // horrors like this:
+  //
+  // struct S {
+  //   int x, y;
+  //   int blah[] __attribute__((counted_by(x)));
+  // } s;
+  //
+  // int foo(int index, int val) {
+  //   int (S::*IHatePMDs)[] = &S::blah;
+  //   (s.*IHatePMDs)[index] = val;
+  // }
+
+  const Expr *Visit(const Expr *E) {
+return ConstStmtVisitor::Visit(E);
+  }
+
+  const Expr *VisitStmt(const Stmt *S) { return nullptr; }
+
+  // These are the types we expect to return (in order of most to least
+  // likely):
+  //
+  //   1. DeclRefExpr - This is the expression for the base of the structure.
+  //  It's exactly what we want to build an access to the \p counted_by
+  //  field.
+  //   2. MemberExpr - This is the expression that has the same \p RecordDecl
+  //  as the flexble array member's lexical enclosing \p RecordDecl. This
+  //  allows us to catch things like: "p->p->array"
+  //   3. CompoundLiteralExpr - This is for people who create something
+  //  heretical like (struct foo has a flexible array member):
+  //
+  //(struct foo){ 1, 2 }.blah[idx];
+  const Expr *VisitDeclRefExpr(const DeclRefExpr *E) {
+return IsExpectedRecordDecl(E) ? E : nullptr;
+  }
+  const Expr *VisitMemberExpr(const MemberExpr *E) {
+if (IsExpectedRecordDecl(E) && E->isArrow())
+  return E;
+const Expr *Res = Visit(E->getBase());
+return !Res && IsExpectedRecordDecl(E) ? E : Res;
+  }
+  const Expr *VisitCompoundLiteralExpr(const CompoundLiteralExpr *E) {
+return IsExpectedRecordDecl(E) ? E : nullptr;
+  }
+  const Expr *VisitCallExpr(const CallExpr *E) {

rapidsna wrote:

Oh, I see. So `VisitCall` here is to handle the case for array-bounds checks.

https://github.com/llvm/llvm-project/pull/76348
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Clang] Implement the 'counted_by' attribute (PR #76348)

2024-01-08 Thread Bill Wendling via cfe-commits

bwendling wrote:

@kees Should be fixed now.

https://github.com/llvm/llvm-project/pull/76348
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Clang] Implement the 'counted_by' attribute (PR #76348)

2024-01-08 Thread Bill Wendling via cfe-commits


@@ -944,22 +951,259 @@ static llvm::Value 
*getArrayIndexingBound(CodeGenFunction &CGF,
   return nullptr;
 }
 
+namespace {
+
+/// \p StructAccessBase returns the base \p Expr of a field access. It returns
+/// either a \p DeclRefExpr, representing the base pointer to the struct, i.e.:
+///
+/// p in p-> a.b.c
+///
+/// or a \p MemberExpr, if the \p MemberExpr has the \p RecordDecl we're
+/// looking for:
+///
+/// struct s {
+///   struct s *ptr;
+///   int count;
+///   char array[] __attribute__((counted_by(count)));
+/// };
+///
+/// If we have an expression like \p p->ptr->array[index], we want the
+/// \p MemberExpr for \p p->ptr instead of \p p.
+class StructAccessBase
+: public ConstStmtVisitor {
+  const RecordDecl *ExpectedRD;
+
+  bool IsExpectedRecordDecl(const Expr *E) const {
+QualType Ty = E->getType();
+if (Ty->isPointerType())
+  Ty = Ty->getPointeeType();
+return ExpectedRD == Ty->getAsRecordDecl();
+  }
+
+public:
+  StructAccessBase(const RecordDecl *ExpectedRD) : ExpectedRD(ExpectedRD) {}
+
+  
//======//
+  //Visitor Methods
+  
//======//
+
+  // NOTE: If we build C++ support for counted_by, then we'll have to handle
+  // horrors like this:
+  //
+  // struct S {
+  //   int x, y;
+  //   int blah[] __attribute__((counted_by(x)));
+  // } s;
+  //
+  // int foo(int index, int val) {
+  //   int (S::*IHatePMDs)[] = &S::blah;
+  //   (s.*IHatePMDs)[index] = val;
+  // }
+
+  const Expr *Visit(const Expr *E) {
+return ConstStmtVisitor::Visit(E);
+  }
+
+  const Expr *VisitStmt(const Stmt *S) { return nullptr; }
+
+  // These are the types we expect to return (in order of most to least
+  // likely):
+  //
+  //   1. DeclRefExpr - This is the expression for the base of the structure.
+  //  It's exactly what we want to build an access to the \p counted_by
+  //  field.
+  //   2. MemberExpr - This is the expression that has the same \p RecordDecl
+  //  as the flexble array member's lexical enclosing \p RecordDecl. This
+  //  allows us to catch things like: "p->p->array"
+  //   3. CompoundLiteralExpr - This is for people who create something
+  //  heretical like (struct foo has a flexible array member):
+  //
+  //(struct foo){ 1, 2 }.blah[idx];
+  const Expr *VisitDeclRefExpr(const DeclRefExpr *E) {
+return IsExpectedRecordDecl(E) ? E : nullptr;
+  }
+  const Expr *VisitMemberExpr(const MemberExpr *E) {
+if (IsExpectedRecordDecl(E) && E->isArrow())
+  return E;
+const Expr *Res = Visit(E->getBase());
+return !Res && IsExpectedRecordDecl(E) ? E : Res;
+  }
+  const Expr *VisitCompoundLiteralExpr(const CompoundLiteralExpr *E) {
+return IsExpectedRecordDecl(E) ? E : nullptr;
+  }
+  const Expr *VisitCallExpr(const CallExpr *E) {
+return IsExpectedRecordDecl(E) ? E : nullptr;
+  }
+
+  const Expr *VisitArraySubscriptExpr(const ArraySubscriptExpr *E) {
+if (IsExpectedRecordDecl(E))
+  return E;
+return Visit(E->getBase());
+  }
+  const Expr *VisitCastExpr(const CastExpr *E) {
+return Visit(E->getSubExpr());
+  }
+  const Expr *VisitParenExpr(const ParenExpr *E) {
+return Visit(E->getSubExpr());
+  }
+  const Expr *VisitUnaryAddrOf(const UnaryOperator *E) {
+return Visit(E->getSubExpr());
+  }
+  const Expr *VisitUnaryDeref(const UnaryOperator *E) {
+return Visit(E->getSubExpr());
+  }
+};
+
+} // end anonymous namespace
+
+using RecIndicesTy =
+SmallVector, 8>;
+
+static bool getGEPIndicesToField(CodeGenFunction &CGF, const RecordDecl *RD,
+ const FieldDecl *FD, RecIndicesTy &Indices) {
+  const CGRecordLayout &Layout = CGF.CGM.getTypes().getCGRecordLayout(RD);
+  int64_t FieldNo = -1;
+  for (const Decl *D : RD->decls()) {
+if (const auto *Field = dyn_cast(D)) {
+  FieldNo = Layout.getLLVMFieldNo(Field);
+  if (FD == Field) {
+Indices.emplace_back(std::make_pair(RD, 
CGF.Builder.getInt32(FieldNo)));
+return true;
+  }
+}
+
+if (const auto *Record = dyn_cast(D)) {
+  ++FieldNo;
+  if (getGEPIndicesToField(CGF, Record, FD, Indices)) {
+Indices.emplace_back(std::make_pair(RD, 
CGF.Builder.getInt32(FieldNo)));
+return true;
+  }
+}
+  }
+
+  return false;
+}
+
+llvm::Value *CodeGenFunction::EmitCountedByFieldExpr(
+const Expr *Base, const FieldDecl *FAMDecl, const FieldDecl *CountDecl) {
+  // This method is typically called in contexts where we can't generate
+  // side-effects, like in __builtin_dynamic_object_size. When finding
+  // expressions, only choose those that have either already been emitted or
+  // can be loaded without side-effects.
+  const RecordDecl *RD = 
CountDecl->getParent()->getOuterLexicalRecordContext();
+
+  // F

[clang] [Clang] Implement the 'counted_by' attribute (PR #76348)

2024-01-08 Thread Bill Wendling via cfe-commits


@@ -818,6 +819,189 @@ CodeGenFunction::evaluateOrEmitBuiltinObjectSize(const 
Expr *E, unsigned Type,
   return ConstantInt::get(ResType, ObjectSize, /*isSigned=*/true);
 }
 
+const FieldDecl *CodeGenFunction::FindFlexibleArrayMemberField(
+ASTContext &Ctx, const RecordDecl *RD, uint64_t &Offset) {
+  const LangOptions::StrictFlexArraysLevelKind StrictFlexArraysLevel =
+  getLangOpts().getStrictFlexArraysLevel();
+  unsigned FieldNo = 0;
+
+  for (const Decl *D : RD->decls()) {
+if (const auto *Field = dyn_cast(D);
+Field && Decl::isFlexibleArrayMemberLike(
+ Ctx, Field, Field->getType(), StrictFlexArraysLevel,
+ /*IgnoreTemplateOrMacroSubstitution=*/true)) {
+  const ASTRecordLayout &Layout = Ctx.getASTRecordLayout(RD);
+  Offset += Layout.getFieldOffset(FieldNo);
+  return Field;
+}
+
+if (const auto *Record = dyn_cast(D))
+  if (const FieldDecl *Field =
+  FindFlexibleArrayMemberField(Ctx, Record, Offset)) {
+const ASTRecordLayout &Layout = Ctx.getASTRecordLayout(RD);
+Offset += Layout.getFieldOffset(FieldNo);
+return Field;
+  }
+
+if (isa(D))
+  ++FieldNo;
+  }
+
+  return nullptr;
+}
+
+llvm::Value *
+CodeGenFunction::emitFlexibleArrayMemberSize(const Expr *E, unsigned Type,
+ llvm::IntegerType *ResType) {
+  // The code generated here calculates the size of a struct with a flexible
+  // array member that uses the counted_by attribute. There are two instances
+  // we handle:
+  //
+  //   struct s {
+  // unsigned long flags;
+  // int count;
+  // int array[] __attribute__((counted_by(count)));
+  //   }
+  //
+  //   1) bdos of the flexible array itself:
+  //
+  // __builtin_dynamic_object_size(p->array, 1) ==
+  // p->count * sizeof(*p->array)
+  //
+  //   2) bdos of a pointer into the flexible array:
+  //
+  // __builtin_dynamic_object_size(&p->array[42], 1) ==
+  // (p->count - 42) * sizeof(*p->array)
+  //
+  //   2) bdos of the whole struct, including the flexible array:
+  //
+  // __builtin_dynamic_object_size(p, 1) ==
+  //max(sizeof(struct s),
+  //offsetof(struct s, array) + p->count * sizeof(*p->array))
+  //
+  ASTContext &Ctx = getContext();
+  const Expr *Base = E->IgnoreParenImpCasts();
+  const Expr *Idx = nullptr;
+
+  if (const auto *UO = dyn_cast(Base);
+  UO && UO->getOpcode() == UO_AddrOf) {
+Expr *SubExpr = UO->getSubExpr()->IgnoreParenImpCasts();
+if (const auto *ASE = dyn_cast(SubExpr)) {
+  Base = ASE->getBase()->IgnoreParenImpCasts();
+  Idx = ASE->getIdx()->IgnoreParenImpCasts();
+
+  if (const auto *IL = dyn_cast(Idx)) {
+int64_t Val = IL->getValue().getSExtValue();
+if (Val < 0)
+  // __bdos returns 0 for negative indexes into an array in a struct.
+  return getDefaultBuiltinObjectSizeResult(Type, ResType);
+
+if (Val == 0)
+  // The index is 0, so we don't need to take it into account.
+  Idx = nullptr;
+  }
+} else {
+  // Potential pointer to another element in the struct.
+  Base = SubExpr;
+}
+  }
+
+  // Get the flexible array member Decl.
+  const RecordDecl *OuterRD = nullptr;
+  if (const auto *ME = dyn_cast(Base)) {
+// Check if \p Base is referencing the FAM itself.
+const ValueDecl *VD = ME->getMemberDecl();
+OuterRD = VD->getDeclContext()->getOuterLexicalRecordContext();
+  } else if (const auto *DRE = dyn_cast(Base)) {
+// Check if we're pointing to the whole struct.
+QualType Ty = DRE->getDecl()->getType();
+if (Ty->isPointerType())
+  Ty = Ty->getPointeeType();
+OuterRD = Ty->getAsRecordDecl();
+  }
+
+  if (!OuterRD)
+return nullptr;
+
+  uint64_t Offset = 0;
+  const FieldDecl *FAMDecl = FindFlexibleArrayMemberField(Ctx, OuterRD, 
Offset);
+  Offset = Ctx.toCharUnitsFromBits(Offset).getQuantity();
+
+  if (!FAMDecl || !FAMDecl->hasAttr())
+// No flexible array member found or it doesn't have the "counted_by"
+// attribute.
+return nullptr;
+
+  const FieldDecl *CountedByFD = FindCountedByField(FAMDecl);
+  if (!CountedByFD)
+// Can't find the field referenced by the "counted_by" attribute.
+return nullptr;
+
+  // Build a load of the counted_by field.
+  bool IsSigned = CountedByFD->getType()->isSignedIntegerType();
+  Value *CountedByInst = EmitCountedByFieldExpr(Base, FAMDecl, CountedByFD);
+  if (!CountedByInst)
+return nullptr;
+
+  llvm::Type *CountedByTy = CountedByInst->getType();
+
+  // Build a load of the index and subtract it from the count.
+  Value *IdxInst = nullptr;
+  if (Idx) {
+bool IdxSigned = Idx->getType()->isSignedIntegerType();
+IdxInst = EmitAnyExprToTemp(Idx).getScalarVal();
+IdxInst = IdxSigned ? Builder.CreateSExtOrTrunc(IdxInst, CountedByTy)

bwendling wrote:

I coul

[clang] [Clang] Implement the 'counted_by' attribute (PR #76348)

2024-01-08 Thread Yeoul Na via cfe-commits


@@ -944,22 +951,259 @@ static llvm::Value 
*getArrayIndexingBound(CodeGenFunction &CGF,
   return nullptr;
 }
 
+namespace {
+
+/// \p StructAccessBase returns the base \p Expr of a field access. It returns
+/// either a \p DeclRefExpr, representing the base pointer to the struct, i.e.:
+///
+/// p in p-> a.b.c
+///
+/// or a \p MemberExpr, if the \p MemberExpr has the \p RecordDecl we're
+/// looking for:
+///
+/// struct s {
+///   struct s *ptr;
+///   int count;
+///   char array[] __attribute__((counted_by(count)));
+/// };
+///
+/// If we have an expression like \p p->ptr->array[index], we want the
+/// \p MemberExpr for \p p->ptr instead of \p p.
+class StructAccessBase
+: public ConstStmtVisitor {
+  const RecordDecl *ExpectedRD;
+
+  bool IsExpectedRecordDecl(const Expr *E) const {
+QualType Ty = E->getType();
+if (Ty->isPointerType())
+  Ty = Ty->getPointeeType();
+return ExpectedRD == Ty->getAsRecordDecl();
+  }
+
+public:
+  StructAccessBase(const RecordDecl *ExpectedRD) : ExpectedRD(ExpectedRD) {}
+
+  
//======//
+  //Visitor Methods
+  
//======//
+
+  // NOTE: If we build C++ support for counted_by, then we'll have to handle
+  // horrors like this:
+  //
+  // struct S {
+  //   int x, y;
+  //   int blah[] __attribute__((counted_by(x)));
+  // } s;
+  //
+  // int foo(int index, int val) {
+  //   int (S::*IHatePMDs)[] = &S::blah;
+  //   (s.*IHatePMDs)[index] = val;
+  // }
+
+  const Expr *Visit(const Expr *E) {
+return ConstStmtVisitor::Visit(E);
+  }
+
+  const Expr *VisitStmt(const Stmt *S) { return nullptr; }
+
+  // These are the types we expect to return (in order of most to least
+  // likely):
+  //
+  //   1. DeclRefExpr - This is the expression for the base of the structure.
+  //  It's exactly what we want to build an access to the \p counted_by
+  //  field.
+  //   2. MemberExpr - This is the expression that has the same \p RecordDecl
+  //  as the flexble array member's lexical enclosing \p RecordDecl. This
+  //  allows us to catch things like: "p->p->array"
+  //   3. CompoundLiteralExpr - This is for people who create something
+  //  heretical like (struct foo has a flexible array member):
+  //
+  //(struct foo){ 1, 2 }.blah[idx];
+  const Expr *VisitDeclRefExpr(const DeclRefExpr *E) {
+return IsExpectedRecordDecl(E) ? E : nullptr;
+  }
+  const Expr *VisitMemberExpr(const MemberExpr *E) {
+if (IsExpectedRecordDecl(E) && E->isArrow())
+  return E;
+const Expr *Res = Visit(E->getBase());
+return !Res && IsExpectedRecordDecl(E) ? E : Res;
+  }
+  const Expr *VisitCompoundLiteralExpr(const CompoundLiteralExpr *E) {
+return IsExpectedRecordDecl(E) ? E : nullptr;
+  }
+  const Expr *VisitCallExpr(const CallExpr *E) {
+return IsExpectedRecordDecl(E) ? E : nullptr;
+  }
+
+  const Expr *VisitArraySubscriptExpr(const ArraySubscriptExpr *E) {
+if (IsExpectedRecordDecl(E))
+  return E;
+return Visit(E->getBase());
+  }
+  const Expr *VisitCastExpr(const CastExpr *E) {
+return Visit(E->getSubExpr());
+  }
+  const Expr *VisitParenExpr(const ParenExpr *E) {
+return Visit(E->getSubExpr());
+  }
+  const Expr *VisitUnaryAddrOf(const UnaryOperator *E) {
+return Visit(E->getSubExpr());
+  }
+  const Expr *VisitUnaryDeref(const UnaryOperator *E) {
+return Visit(E->getSubExpr());
+  }
+};
+
+} // end anonymous namespace
+
+using RecIndicesTy =
+SmallVector, 8>;
+
+static bool getGEPIndicesToField(CodeGenFunction &CGF, const RecordDecl *RD,
+ const FieldDecl *FD, RecIndicesTy &Indices) {
+  const CGRecordLayout &Layout = CGF.CGM.getTypes().getCGRecordLayout(RD);
+  int64_t FieldNo = -1;
+  for (const Decl *D : RD->decls()) {
+if (const auto *Field = dyn_cast(D)) {
+  FieldNo = Layout.getLLVMFieldNo(Field);
+  if (FD == Field) {
+Indices.emplace_back(std::make_pair(RD, 
CGF.Builder.getInt32(FieldNo)));
+return true;
+  }
+}
+
+if (const auto *Record = dyn_cast(D)) {
+  ++FieldNo;
+  if (getGEPIndicesToField(CGF, Record, FD, Indices)) {
+Indices.emplace_back(std::make_pair(RD, 
CGF.Builder.getInt32(FieldNo)));
+return true;
+  }
+}
+  }
+
+  return false;
+}
+
+llvm::Value *CodeGenFunction::EmitCountedByFieldExpr(
+const Expr *Base, const FieldDecl *FAMDecl, const FieldDecl *CountDecl) {
+  // This method is typically called in contexts where we can't generate
+  // side-effects, like in __builtin_dynamic_object_size. When finding
+  // expressions, only choose those that have either already been emitted or
+  // can be loaded without side-effects.
+  const RecordDecl *RD = 
CountDecl->getParent()->getOuterLexicalRecordContext();
+
+  // F

[clang] [Clang] Implement the 'counted_by' attribute (PR #76348)

2024-01-08 Thread Yeoul Na via cfe-commits


@@ -944,22 +951,259 @@ static llvm::Value 
*getArrayIndexingBound(CodeGenFunction &CGF,
   return nullptr;
 }
 
+namespace {
+
+/// \p StructAccessBase returns the base \p Expr of a field access. It returns
+/// either a \p DeclRefExpr, representing the base pointer to the struct, i.e.:
+///
+/// p in p-> a.b.c
+///
+/// or a \p MemberExpr, if the \p MemberExpr has the \p RecordDecl we're
+/// looking for:
+///
+/// struct s {
+///   struct s *ptr;
+///   int count;
+///   char array[] __attribute__((counted_by(count)));
+/// };
+///
+/// If we have an expression like \p p->ptr->array[index], we want the
+/// \p MemberExpr for \p p->ptr instead of \p p.
+class StructAccessBase
+: public ConstStmtVisitor {
+  const RecordDecl *ExpectedRD;
+
+  bool IsExpectedRecordDecl(const Expr *E) const {
+QualType Ty = E->getType();
+if (Ty->isPointerType())
+  Ty = Ty->getPointeeType();
+return ExpectedRD == Ty->getAsRecordDecl();
+  }
+
+public:
+  StructAccessBase(const RecordDecl *ExpectedRD) : ExpectedRD(ExpectedRD) {}
+
+  
//======//
+  //Visitor Methods
+  
//======//
+
+  // NOTE: If we build C++ support for counted_by, then we'll have to handle
+  // horrors like this:
+  //
+  // struct S {
+  //   int x, y;
+  //   int blah[] __attribute__((counted_by(x)));
+  // } s;
+  //
+  // int foo(int index, int val) {
+  //   int (S::*IHatePMDs)[] = &S::blah;
+  //   (s.*IHatePMDs)[index] = val;
+  // }
+
+  const Expr *Visit(const Expr *E) {
+return ConstStmtVisitor::Visit(E);
+  }
+
+  const Expr *VisitStmt(const Stmt *S) { return nullptr; }
+
+  // These are the types we expect to return (in order of most to least
+  // likely):
+  //
+  //   1. DeclRefExpr - This is the expression for the base of the structure.
+  //  It's exactly what we want to build an access to the \p counted_by
+  //  field.
+  //   2. MemberExpr - This is the expression that has the same \p RecordDecl
+  //  as the flexble array member's lexical enclosing \p RecordDecl. This
+  //  allows us to catch things like: "p->p->array"
+  //   3. CompoundLiteralExpr - This is for people who create something
+  //  heretical like (struct foo has a flexible array member):
+  //
+  //(struct foo){ 1, 2 }.blah[idx];
+  const Expr *VisitDeclRefExpr(const DeclRefExpr *E) {
+return IsExpectedRecordDecl(E) ? E : nullptr;
+  }
+  const Expr *VisitMemberExpr(const MemberExpr *E) {
+if (IsExpectedRecordDecl(E) && E->isArrow())
+  return E;
+const Expr *Res = Visit(E->getBase());
+return !Res && IsExpectedRecordDecl(E) ? E : Res;
+  }
+  const Expr *VisitCompoundLiteralExpr(const CompoundLiteralExpr *E) {
+return IsExpectedRecordDecl(E) ? E : nullptr;
+  }
+  const Expr *VisitCallExpr(const CallExpr *E) {
+return IsExpectedRecordDecl(E) ? E : nullptr;
+  }
+
+  const Expr *VisitArraySubscriptExpr(const ArraySubscriptExpr *E) {
+if (IsExpectedRecordDecl(E))
+  return E;
+return Visit(E->getBase());
+  }
+  const Expr *VisitCastExpr(const CastExpr *E) {
+return Visit(E->getSubExpr());
+  }
+  const Expr *VisitParenExpr(const ParenExpr *E) {
+return Visit(E->getSubExpr());
+  }
+  const Expr *VisitUnaryAddrOf(const UnaryOperator *E) {
+return Visit(E->getSubExpr());
+  }
+  const Expr *VisitUnaryDeref(const UnaryOperator *E) {
+return Visit(E->getSubExpr());
+  }
+};
+
+} // end anonymous namespace
+
+using RecIndicesTy =
+SmallVector, 8>;
+
+static bool getGEPIndicesToField(CodeGenFunction &CGF, const RecordDecl *RD,
+ const FieldDecl *FD, RecIndicesTy &Indices) {
+  const CGRecordLayout &Layout = CGF.CGM.getTypes().getCGRecordLayout(RD);
+  int64_t FieldNo = -1;
+  for (const Decl *D : RD->decls()) {
+if (const auto *Field = dyn_cast(D)) {
+  FieldNo = Layout.getLLVMFieldNo(Field);
+  if (FD == Field) {
+Indices.emplace_back(std::make_pair(RD, 
CGF.Builder.getInt32(FieldNo)));
+return true;
+  }
+}
+
+if (const auto *Record = dyn_cast(D)) {
+  ++FieldNo;
+  if (getGEPIndicesToField(CGF, Record, FD, Indices)) {
+Indices.emplace_back(std::make_pair(RD, 
CGF.Builder.getInt32(FieldNo)));
+return true;
+  }
+}
+  }
+
+  return false;
+}
+
+llvm::Value *CodeGenFunction::EmitCountedByFieldExpr(
+const Expr *Base, const FieldDecl *FAMDecl, const FieldDecl *CountDecl) {
+  // This method is typically called in contexts where we can't generate
+  // side-effects, like in __builtin_dynamic_object_size. When finding
+  // expressions, only choose those that have either already been emitted or
+  // can be loaded without side-effects.
+  const RecordDecl *RD = 
CountDecl->getParent()->getOuterLexicalRecordContext();
+
+  // F

[clang] [Clang] Implement the 'counted_by' attribute (PR #76348)

2024-01-09 Thread Kees Cook via cfe-commits

kees wrote:

Thanks! The update fixes the anon struct issue I hit. I've found one more 
issue, though this appears to be a miscalculation with a pathological `count` 
value (i.e. `count` is signed type and contains a negative value):

```
struct annotated {
unsigned long flags;
int count;
int array __counted_by(count);
};

static struct annotated * noinline alloc_annotated(int index)
{
struct annotated *p;

p = malloc(sizeof(*p) + index * sizeof(*p->array));
p->count = index;

return p;
}

...
   struct annotated *a;

  c = alloc_annotated(index);
   c->count = -1;
   printf("%zu\n", __builtin_dynamic_object_size(p->array, 1));
```

This prints a wrapped calculation instead of the expected "0".

https://github.com/llvm/llvm-project/pull/76348
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Clang] Implement the 'counted_by' attribute (PR #76348)

2024-01-09 Thread Yeoul Na via cfe-commits

rapidsna wrote:

> This prints a wrapped calculation instead of the expected "0".

Shouldn't `getDefaultBuiltinObjectSizeResult` return `-1`? Have you tried `-2` 
to see if it's returning the negative count value or it happens to be equal to 
the default value?

https://github.com/llvm/llvm-project/pull/76348
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Clang] Implement the 'counted_by' attribute (PR #76348)

2024-01-09 Thread Kees Cook via cfe-commits

kees wrote:

> > This prints a wrapped calculation instead of the expected "0".
> 
> Shouldn't `getDefaultBuiltinObjectSizeResult` return `-1`? Have you tried 
> `-2` to see if it's returning the negative count value or it happens to be 
> equal to the default value?

Well, maybe it's not wrapped, but it should absolutely return 0, not -1. The 
return value of -1 means "I don't know how large this is". In this case we _do_ 
know, but it's nonsense, so we must return 0 so that anything checking lengths 
will not write anything to the array.

https://github.com/llvm/llvm-project/pull/76348
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Clang] Implement the 'counted_by' attribute (PR #76348)

2024-01-09 Thread Yeoul Na via cfe-commits

rapidsna wrote:

> The return value of -1 means "I don't know how large this is". 

@kees Oh, I see. I did not know such the convention. Is it documented somewhere?

https://github.com/llvm/llvm-project/pull/76348
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Clang] Implement the 'counted_by' attribute (PR #76348)

2024-01-09 Thread Kees Cook via cfe-commits

kees wrote:

> > but the value is nonsense, so we must return 0 so that anything checking 
> > lengths will not write anything to the array.
> 
> @kees Oh, I see. I did not know such the convention but it makes sense. Is it 
> documented somewhere?

This is new territory (having a multiplier for finding size that may be 
negative), so there's nothing to document it beyond FORTIFY users needing to 
maintain safe checks. The only safe size to return for "impossible size" is 0 
in this case, otherwise a confused state (negative `count`) can lead to FORTIFY 
bypasses.

https://github.com/llvm/llvm-project/pull/76348
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Clang] Implement the 'counted_by' attribute (PR #76348)

2024-01-09 Thread Bill Wendling via cfe-commits

bwendling wrote:

@kees, the issue should be fixed with the newest push.

https://github.com/llvm/llvm-project/pull/76348
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Clang] Implement the 'counted_by' attribute (PR #76348)

2024-01-09 Thread Kees Cook via cfe-commits

kees wrote:

> @kees, the issue should be fixed with the newest push.

Nice! This is so extremely close. It fixed all but 1 instance in the torture 
testing. The remaining seems to be union order sensitive. O_o This arrangement 
still reports non-zero for the `int count_bytes` one, unless I move it below 
the `signed char count_ints` counter:

```
#include 
#include 
#include 
#include 

#define noinline__attribute__((__noinline__))
#define __counted_by(member)__attribute__((__counted_by__(member)))

#define DECLARE_BOUNDED_FLEX_ARRAY(COUNT_TYPE, COUNT, TYPE, NAME)   \
struct {\
COUNT_TYPE COUNT;   \
TYPE NAME[] __counted_by(COUNT);\
}

#define DECLARE_FLEX_ARRAY(TYPE, NAME)  \
struct {\
struct { } __empty_ ## NAME;\
TYPE NAME[];\
}

struct multi {
unsigned long flags;
union {
/* count member type intentionally mismatched to induce padding 
*/
DECLARE_BOUNDED_FLEX_ARRAY(int, count_bytes, unsigned char, 
bytes);
DECLARE_BOUNDED_FLEX_ARRAY(signed char,  count_ints,  unsigned 
char, ints);
DECLARE_FLEX_ARRAY(unsigned char, unsafe);
};
};

/* Helper to hide the allocation size by using a leaf function. */
static struct multi * noinline alloc_multi_ints(int index)
{
struct multi *p;

p = malloc(sizeof(*p) + index * sizeof(*p->ints));
p->count_ints = index;

return p;
}

/* Helper to hide the allocation size by using a leaf function. */
static struct multi * noinline alloc_multi_bytes(int index)
{
struct multi *p;

p = malloc(sizeof(*p) + index * sizeof(*p->bytes));
p->count_bytes = index;

return p;
}

int main(void)
{
struct multi *mi, *mb;
volatile int negative = -3;
volatile int index = 10;

mi = alloc_multi_ints(index);
mi->count_ints = negative;
printf("%zu\n", __builtin_dynamic_object_size(mi->ints, 1));

mb = alloc_multi_bytes(index);
mb->count_bytes = negative;
printf("%zu\n", __builtin_dynamic_object_size(mb->bytes, 1));

return 0;
}
```

Returns:

```
253
0
```

https://github.com/llvm/llvm-project/pull/76348
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Clang] Implement the 'counted_by' attribute (PR #76348)

2024-01-09 Thread Bill Wendling via cfe-commits

bwendling wrote:

> > @kees, the issue should be fixed with the newest push.
> 
> Nice! This is so extremely close. It fixed all but 1 instance in the torture 
> testing. The remaining seems to be union order sensitive. O_o This 
> arrangement still reports non-zero for the `int count_bytes` one, unless I 
> move it below the `signed char count_ints` counter:

This is due to it being a union. There's more than one field with a 
`counted_by` attribute, and we just use the first one we see...and importantly 
it's type. This obviously isn't correct and I can fix it. However, it does 
bring up an issue that I *don't* know how to fix. If you have something like:

```
printf("%zu\n", __builtin_dynamic_object_size(mi, 1));
```

we don't know out of hand whether to use the `count_bytes` or `count_ints` 
field to get the object size. I suggest in that case we should probably throw 
up our hands and give up. Thoughts?

https://github.com/llvm/llvm-project/pull/76348
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Clang] Implement the 'counted_by' attribute (PR #76348)

2024-01-09 Thread Kees Cook via cfe-commits

kees wrote:

Accessing `mi->ints` is unambiguous (it would use the declared `count_ints`) 
but I'm fine to wait and fix that later. The flex array union is gloriously 
rare in the kernel. As for whole object, I say pick smallest from all available 
under bdos(x, 1) and largest for bdos(x, 0), or just say 0. Again, corner case 
of a corner case.

https://github.com/llvm/llvm-project/pull/76348
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Clang] Implement the 'counted_by' attribute (PR #76348)

2024-01-09 Thread Bill Wendling via cfe-commits

bwendling wrote:

Like I said, I can fix the `mi->ints` issue. It's when we're trying to ask for 
the size of the whole struct with more than one `__counted_by` attribute that 
we don't be able to handle.

https://github.com/llvm/llvm-project/pull/76348
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Clang] Implement the 'counted_by' attribute (PR #76348)

2024-01-09 Thread via cfe-commits

github-actions[bot] wrote:




:warning: C/C++ code formatter, clang-format found issues in your code. 
:warning:



You can test this locally with the following command:


``bash
git-clang-format --diff 061e4f24b24a3b59d73a94dc6f2f0d21a2b7beac 
557604301ab98d89b5ddf6d857d85326d7d3da3d -- 
clang/test/CodeGen/attr-counted-by.c clang/test/Sema/attr-counted-by.c 
clang/include/clang/AST/DeclBase.h clang/include/clang/Sema/Sema.h 
clang/include/clang/Sema/TypoCorrection.h clang/lib/AST/ASTImporter.cpp 
clang/lib/AST/DeclBase.cpp clang/lib/AST/Expr.cpp 
clang/lib/CodeGen/CGBuiltin.cpp clang/lib/CodeGen/CGExpr.cpp 
clang/lib/CodeGen/CodeGenFunction.h clang/lib/Sema/SemaDecl.cpp 
clang/lib/Sema/SemaDeclAttr.cpp clang/lib/Sema/SemaExpr.cpp 
clang/test/CodeGen/bounds-checking.c
``





View the diff from clang-format here.


``diff
diff --git a/clang/lib/CodeGen/CGBuiltin.cpp b/clang/lib/CodeGen/CGBuiltin.cpp
index 836ccf3409..b8277b28a7 100644
--- a/clang/lib/CodeGen/CGBuiltin.cpp
+++ b/clang/lib/CodeGen/CGBuiltin.cpp
@@ -830,8 +830,8 @@ const FieldDecl 
*CodeGenFunction::FindFlexibleArrayMemberField(
 if (const auto *Field = dyn_cast(D);
 Field && (Name.empty() || Field->getNameAsString() == Name) &&
 Decl::isFlexibleArrayMemberLike(
- Ctx, Field, Field->getType(), StrictFlexArraysLevel,
- /*IgnoreTemplateOrMacroSubstitution=*/true)) {
+Ctx, Field, Field->getType(), StrictFlexArraysLevel,
+/*IgnoreTemplateOrMacroSubstitution=*/true)) {
   const ASTRecordLayout &Layout = Ctx.getASTRecordLayout(RD);
   Offset += Layout.getFieldOffset(FieldNo);
   return Field;
@@ -970,7 +970,8 @@ CodeGenFunction::emitFlexibleArrayMemberSize(const Expr *E, 
unsigned Type,
 return nullptr;
 
   uint64_t Offset = 0;
-  const FieldDecl *FAMDecl = FindFlexibleArrayMemberField(Ctx, OuterRD, 
FAMName, Offset);
+  const FieldDecl *FAMDecl =
+  FindFlexibleArrayMemberField(Ctx, OuterRD, FAMName, Offset);
   Offset = Ctx.toCharUnitsFromBits(Offset).getQuantity();
 
   if (!FAMDecl || !FAMDecl->hasAttr())

``




https://github.com/llvm/llvm-project/pull/76348
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Clang] Implement the 'counted_by' attribute (PR #76348)

2024-01-10 Thread Kees Cook via cfe-commits

kees wrote:

Thanks! It's fixed for me now. I continue to be highly unlucky, though, and 
keep finding weird stuff. It seems to segfault just parsing libc headers under 
`-D_FORTIFY_SOURCE=3` (but not `=2` or lower):

```
0.  Program arguments: 
/srv/built-compilers/llvm/counted_by/install/bin/clang -Wall -O2 
-fstrict-flex-arrays=3 -U_FORTIFY_SOURCE -D_FORTIFY_SOURCE=3 -fsanitize=bounds 
-fsanitize=object-size -fsanitize-undefined-trap-on-error -c -o array-bounds.o 
array-bounds.c  
  
1.   parser at end of file 

2.  Per-file LLVM IR generation 

3.  /usr/include/x86_64-linux-gnu/bits/string_fortified.h:57:1 
: 
Generating code for declaration 'memset'
  
... 
 #4 0x55ef8b3ac144 clang::DeclContext::decls_begin() const  
   
 #5 0x55ef892afcad CountCountedByAttrs(clang::RecordDecl const*) 
CGBuiltin.cpp:0:0  
```

In my header at line 57, I see:

```
__fortify_function void *
__NTH (memset (void *__dest, int __ch, size_t __len))
{
  return __builtin___memset_chk (__dest, __ch, __len,
 __glibc_objsize0 (__dest));
}
```

where `__glibc_objsize0` is:

```
sys/cdefs.h:# define __glibc_objsize0(__o) __builtin_dynamic_object_size (__o, 
0)
```

I assume something weird is happening with `bdos` here.

https://github.com/llvm/llvm-project/pull/76348
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Clang] Implement the 'counted_by' attribute (PR #76348)

2024-01-10 Thread Bill Wendling via cfe-commits

bwendling wrote:

@kees Could you supply a pre-processed file and command line? (It should have 
been emitted when you got the backtrace.)

https://github.com/llvm/llvm-project/pull/76348
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Clang] Implement the 'counted_by' attribute (PR #76348)

2024-01-10 Thread Bill Wendling via cfe-commits

bwendling wrote:

Never mind. I think I found the issue anyway...

https://github.com/llvm/llvm-project/pull/76348
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Clang] Implement the 'counted_by' attribute (PR #76348)

2024-01-10 Thread Kees Cook via cfe-commits

kees wrote:

That's it! Everything works. :) Ship it!

https://github.com/llvm/llvm-project/pull/76348
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Clang] Implement the 'counted_by' attribute (PR #76348)

2024-01-10 Thread Kees Cook via cfe-commits

https://github.com/kees approved this pull request.

I can't speak to the LLVM code changes, but behaviorally, this passes all the 
torture tests I'd expect it to (kernel-tools/fortify/array-bounds.c and full 
kernel builds with its hundreds of `counted_by` annotations).

https://github.com/llvm/llvm-project/pull/76348
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Clang] Implement the 'counted_by' attribute (PR #76348)

2024-01-10 Thread Bill Wendling via cfe-commits

bwendling wrote:

@rapidsna & @nickdesaulniers: Do you have anything to add?

https://github.com/llvm/llvm-project/pull/76348
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Clang] Implement the 'counted_by' attribute (PR #76348)

2024-01-10 Thread Yeoul Na via cfe-commits

https://github.com/rapidsna approved this pull request.

> Do you have anything to add?
LGTM

https://github.com/llvm/llvm-project/pull/76348
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Clang] Implement the 'counted_by' attribute (PR #76348)

2024-01-10 Thread Yeoul Na via cfe-commits

https://github.com/rapidsna edited 
https://github.com/llvm/llvm-project/pull/76348
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Clang] Implement the 'counted_by' attribute (PR #76348)

2024-01-10 Thread Bill Wendling via cfe-commits

bwendling wrote:

Thanks everyone for their insightful feedback!!

https://github.com/llvm/llvm-project/pull/76348
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Clang] Implement the 'counted_by' attribute (PR #76348)

2024-01-10 Thread Bill Wendling via cfe-commits

https://github.com/bwendling closed 
https://github.com/llvm/llvm-project/pull/76348
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Clang] Implement the 'counted_by' attribute (PR #76348)

2024-01-10 Thread via cfe-commits

dyung wrote:

@bwendling the test you added `CodeGen/attr-counted-by.c` appears to be failing 
on quite a few bots, can you take a look?

Some failing bots:
- https://lab.llvm.org/buildbot/#/builders/164/builds/48660
- https://lab.llvm.org/buildbot/#/builders/188/builds/40261
- https://lab.llvm.org/buildbot/#/builders/245/builds/18987
- https://lab.llvm.org/buildbot/#/builders/258/builds/12107
- https://lab.llvm.org/buildbot/#/builders/249/builds/13974
- https://lab.llvm.org/buildbot/#/builders/230/builds/23480
- https://lab.llvm.org/buildbot/#/builders/124/builds/9378
- https://lab.llvm.org/buildbot/#/builders/67/builds/14012
- https://lab.llvm.org/buildbot/#/builders/98/builds/31297
- https://lab.llvm.org/buildbot/#/builders/247/builds/13055
- https://lab.llvm.org/buildbot/#/builders/139/builds/56868
- https://lab.llvm.org/buildbot/#/builders/216/builds/32771
- https://lab.llvm.org/buildbot/#/builders/275/builds/3146
- https://lab.llvm.org/buildbot/#/builders/272/builds/5208
- https://lab.llvm.org/buildbot/#/builders/271/builds/2686

https://github.com/llvm/llvm-project/pull/76348
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Clang] Implement the 'counted_by' attribute (PR #76348)

2024-01-10 Thread Nico Weber via cfe-commits

nico wrote:

Reverted in 2dce77201c0c6b541a53aa7a09ec06e7561e8f74 for now.

https://github.com/llvm/llvm-project/pull/76348
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Clang] Implement the 'counted_by' attribute (PR #76348)

2024-01-10 Thread Bill Wendling via cfe-commits

bwendling wrote:

Resubmitted with fix. Sorry about the failures:

```
To https://github.com/llvm/llvm-project.git
   e8790027b169..164f85db876e  main -> main
```

https://github.com/llvm/llvm-project/pull/76348
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Clang] Implement the 'counted_by' attribute (PR #76348)

2023-12-24 Thread via cfe-commits

llvmbot wrote:




@llvm/pr-subscribers-clang

Author: Bill Wendling (bwendling)


Changes

The 'counted_by' attribute is used on flexible array members. The
argument for the attribute is the name of the field member holding the
count of elements in the flexible array. This information is used to
improve the results of the array bound sanitizer and the
'__builtin_dynamic_object_size' builtin. The 'count' field member must
be within the same non-anonymous, enclosing struct as the flexible array
member. For example:

  struct bar;
  struct foo {
int count;
struct inner {
  struct {
int count; /* The 'count' referenced by 'counted_by' */
  };
  struct {
/* ... */
struct bar *array[] __attribute__((counted_by(count)));
  };
} baz;
  };

This example specifies that the flexible array member 'array' has the
number of elements allocated for it in 'count':

  struct bar;
  struct foo {
size_t count;
 /* ... */
struct bar *array[] __attribute__((counted_by(count)));
  };

This establishes a relationship between 'array' and 'count';
specifically that 'p->array' must have *at least* 'p->count' number of
elements available. It's the user's responsibility to ensure that this
relationship is maintained throughout changes to the structure.

In the following, the allocated array erroneously has fewer elements
than what's specified by 'p->count'. This would result in an
out-of-bounds access not not being detected:

  struct foo *p;

  void foo_alloc(size_t count) {
p = malloc(MAX(sizeof(struct foo),
   offsetof(struct foo, array[0]) + count *
   sizeof(struct bar *)));
p->count = count + 42;
  }

The next example updates 'p->count', breaking the relationship
requirement that 'p->array' must have at least 'p->count' number of
elements available:

  void use_foo(int index, int val) {
p->count += 42;
p->array[index] = val; /* The sanitizer can't properly check this access 
*/
  }

In this example, an update to 'p->count' maintains the relationship
requirement:

  void use_foo(int index, int val) {
if (p->count == 0)
  return;
--p->count;
p->array[index] = val;
  }

---

Patch is 183.31 KiB, truncated to 20.00 KiB below, full version: 
https://github.com/llvm/llvm-project/pull/76348.diff


20 Files Affected:

- (modified) clang/docs/ReleaseNotes.rst (+5) 
- (modified) clang/include/clang/AST/DeclBase.h (+10) 
- (modified) clang/include/clang/Basic/Attr.td (+18) 
- (modified) clang/include/clang/Basic/AttrDocs.td (+78) 
- (modified) clang/include/clang/Basic/DiagnosticSemaKinds.td (+13) 
- (modified) clang/include/clang/Sema/Sema.h (+3) 
- (modified) clang/include/clang/Sema/TypoCorrection.h (+8-4) 
- (modified) clang/lib/AST/ASTImporter.cpp (+13) 
- (modified) clang/lib/AST/DeclBase.cpp (+73-1) 
- (modified) clang/lib/AST/Expr.cpp (+10-73) 
- (modified) clang/lib/CodeGen/CGBuiltin.cpp (+191) 
- (modified) clang/lib/CodeGen/CGExpr.cpp (+345-6) 
- (modified) clang/lib/CodeGen/CodeGenFunction.h (+21) 
- (modified) clang/lib/Sema/SemaDecl.cpp (+6) 
- (modified) clang/lib/Sema/SemaDeclAttr.cpp (+133) 
- (modified) clang/lib/Sema/SemaExpr.cpp (+9-7) 
- (added) clang/test/CodeGen/attr-counted-by.c (+1815) 
- (modified) clang/test/CodeGen/bounds-checking.c (+9-1) 
- (modified) clang/test/Misc/pragma-attribute-supported-attributes-list.test 
(+1) 
- (added) clang/test/Sema/attr-counted-by.c (+64) 


``diff
diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index ee211c16a48ac8..cfd89e32c3d0b9 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -202,6 +202,11 @@ C Language Changes
 - Enums will now be represented in TBAA metadata using their actual underlying
   integer type. Previously they were treated as chars, which meant they could
   alias with all other types.
+- Clang now supports the C-only attribute ``counted_by``. When applied to a
+  struct's flexible array member, it points to the struct field that holds the
+  number of elements in the flexible array member. This information can improve
+  the results of the array bound sanitizer and the
+  ``__builtin_dynamic_object_size`` builtin.
 
 C23 Feature Support
 ^^^
diff --git a/clang/include/clang/AST/DeclBase.h 
b/clang/include/clang/AST/DeclBase.h
index 10dcbdb262d84e..5b1038582bc674 100644
--- a/clang/include/clang/AST/DeclBase.h
+++ b/clang/include/clang/AST/DeclBase.h
@@ -19,6 +19,7 @@
 #include "clang/AST/SelectorLocationsKind.h"
 #include "clang/Basic/IdentifierTable.h"
 #include "clang/Basic/LLVM.h"
+#include "clang/Basic/LangOptions.h"
 #include "clang/Basic/SourceLocation.h"
 #include "clang/Basic/Specifiers.h"
 #include "llvm/ADT/ArrayRef.h"
@@ -488,6 +489,15 @@ class alignas(8) Decl {
   // Return true if this is a FileContext Decl.
   bool isFileContextDecl() const;
 
+  /// Whether it resembles a flexible array member. This is a static member
+  /// because we want

[clang] [Clang] Implement the 'counted_by' attribute (PR #76348)

2023-12-26 Thread Bill Wendling via cfe-commits

https://github.com/bwendling edited 
https://github.com/llvm/llvm-project/pull/76348
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Clang] Implement the 'counted_by' attribute (PR #76348)

2023-12-27 Thread Bill Wendling via cfe-commits

bwendling wrote:

Some (hopefully) helpful comments on this admittedly very large patch.

General Comment:

I pulled back on allowing the `count` to be in a non-anonymous struct that 
doesn't contain the flexible array. So this is no longer allowed:

```
struct count_not_in_substruct {
  unsigned long flags;
  unsigned char count; // expected-note {{field 'count' declared here}}
  struct A {
int dummy;
int array[] __counted_by(count); // expected-error {{'counted_by' field 
'count' isn't within the same struct as the flexible array}}
  } a;
};
```

This should align with @rapidsna's bounds checking work and get rid of the 
"different behaviors for a pointer and non-pointer" that we had in the previous 
PR (https://github.com/llvm/llvm-project/pull/73730#discussion_r1430672564).

SEMA:

The SEMA code is a bit different from the previous PR. Most notably is that I'm 
not using the SEMA "lookup" utility the same way. There's no clear way to 
prevent lookup from climbing from the current scope up through the top-level 
scope. This leads to false negatives, duplicate warnings, and other horrors. 
Also, the `Scope` for each sub-structure is deleted by the time the entire 
structure's parsing is completed, making checking the entire struct after 
parsing more difficult. And the `isAnonymousStructOrUnion` flag isn't set 
during SEMA.

All of these combined to make the use of the "lookup" utility very frustrating. 
There are probably many improvements that could be made to the lookup, but I'd 
prefer to do those in a follow-up patch.

CodeGen:

One observation: the `getNonTransparentContext()` (sp?) method doesn't appear 
to work. In particular, if called on an anonymous struct, it will return that 
struct and not the enclosing "non-transparent" one. I may be misunderstanding 
what's meant by non-transparent in this case.

There are two ways that the `counted_by` attribute are used during code 
generation: for sanitizer checks, and with `__builtin_dynamic_object_size()`.

* Sanitizer check: This uses @rjmccall's suggestion 
(https://github.com/llvm/llvm-project/pull/73730#discussion_r1427627732) of 
calculating a byte offset from the flexible array to the "count." It works very 
well (barring some issues with negative offsets being represented by an 
unsigned `~1U` value). The only bit of ugliness was determining the offsets of 
fields that aren't in the top-level of a `RecordDecl`. (I'm curious why this 
hasn't come up before, but I digress.) I created `getFieldOffsetInBits()` to do 
this. It's not quite a hack, but could probably be a bit more elegant. (For 
instance, there could be a `RecordLayout` method indicating if a `FieldDecl` 
exists in the `RecordDecl`, but that wouldn't take sub-`RecordDecls` into 
account, and so on, and so on...)

* `__builtin_dynamic_object_size()`: Side-effects aren't allowed with this 
builtin. So we can work only with pointers that have either already been 
emitted or that have no side-effects. The "visitor" tries to find the "base" of 
the expression. If it's a `DeclRefExpr`, we should be able to emit a load of 
that and continue. If not, we look at `LocalDeclMap` to see if the base is 
already available. If not, we check to see if it has side-effects and emit it 
if it doesn't. Assuming one of those three happen, we grab the indices to the 
`count` and build a series of `GEPs` to them (which seems to be how Clang 
generates GEPs to nested fields). Again, there didn't appear to be an existing 
method to get these indices. (I assume that the general case isn't as easy as 
when working with C structures.)

So that's basically it. I have a program on my machine that compiles most of 
the examples in the `test/CodeGen/attr-counted-by.c` and it works well. I don't 
know if there's a unittest like place that runs examples such as these, but I'd 
be happy to include that test there.

Thank you all for taking the time to review this!

Share and enjoy!

https://github.com/llvm/llvm-project/pull/76348
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Clang] Implement the 'counted_by' attribute (PR #76348)

2024-01-04 Thread Bill Wendling via cfe-commits

bwendling wrote:

Pinging for after-holidays visibility. :-)

https://github.com/llvm/llvm-project/pull/76348
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Clang] Implement the 'counted_by' attribute (PR #76348)

2024-01-19 Thread Sean McBride via cfe-commits

seanm wrote:

@bwendling is there any plan / possibility for simple expressions (with no side 
effects)?  Like:

```c
struct libusb_bos_dev_capability_descriptor {
uint8_t  bLength;
uint8_t  bDescriptorType;
uint8_t  bDevCapabilityType;
uint8_t  dev_capability_data[] __attribute__((counted_by(bLength - 3)));
};
```


https://github.com/llvm/llvm-project/pull/76348
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Clang] Implement the 'counted_by' attribute (PR #76348)

2024-01-22 Thread Bill Wendling via cfe-commits

bwendling wrote:

> @bwendling is there any plan / possibility for simple expressions (with no 
> side effects)? Like:
> 
> ```c
> struct libusb_bos_dev_capability_descriptor {
>   uint8_t  bLength;
>   uint8_t  bDescriptorType;
>   uint8_t  bDevCapabilityType;
>   uint8_t  dev_capability_data[] __attribute__((counted_by(bLength - 3)));
> };
> ```

Not right now. Apple is in the process of expanding the bounds checking code 
beyond checking only flexible array members. We will most likely follow their 
lead on `constexpr`-like expressions. But the utility of such a feature would 
need to be explored.

https://github.com/llvm/llvm-project/pull/76348
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Clang] Implement the 'counted_by' attribute (PR #76348)

2024-01-22 Thread Sean McBride via cfe-commits

seanm wrote:

@bwendling thanks for your reply. No idea how representative it is of other 
projects, but 3 of the 6 flexible arrays in 
[libusb](https://github.com/libusb/libusb) structures have these kinds of 
non-trivial counts.

The GCC folks seem to have 
[considered](https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108896) adding such 
support, but it's not clear what, if anything, they decided.

https://github.com/llvm/llvm-project/pull/76348
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits