lebedev.ri created this revision.
lebedev.ri added reviewers: aaron.ballman, rsmith, efriedma, vsk, filcab, 
vitalybuka.
lebedev.ri added projects: LLVM, clang, Sanitizers.
Herald added subscribers: jansvoboda11, dexonsmith, dang.
lebedev.ri requested review of this revision.
Herald added subscribers: Sanitizers, cfe-commits.

This is somewhat related to the following RFC by @rjmccall:
https://lists.llvm.org/pipermail/llvm-dev/2016-January/094012.html

> C 6.3.2.3p7 (N1548) says:
>
>   A pointer to an object type may be converted to a pointer to a
>   different object type. If the resulting pointer is not correctly
>   aligned) for the referenced type, the behavior is undefined.
>
> C++ [expr.reinterpret.cast]p7 (N4527) defines pointer conversions in terms
> of conversions from void*:
>
>   An object pointer can be explicitly converted to an object pointer
>   of a different type. When a prvalue v of object pointer type is
>   converted to the object pointer type “pointer to cv T”, the result
>   is static_cast<cv T*>(static_cast<cv void*>(v)).
>
> C++ [expr.static.cast]p13 says of conversions from void*:
>
>   A prvalue of type “pointer to cv1 void” can be converted to a
>   prvalue of type “pointer to cv2 T” .... If the original pointer value
>   represents the address A of a byte in memory and A satisfies the alignment
>   requirement of T, then the resulting pointer value represents the same
>   address as the original pointer value, that is, A. The result of any
>   other such pointer conversion is unspecified.

Currently clang does not optimize based on the fact that the pointers
passed into the function must be appropriately aligned for their pointee type.
We now have a motivation to change that. Namely, it was estabilished that
not doing so prevents LICM, and it is likely not the only penalized transform.
See D99249 <https://reviews.llvm.org/D99249>.

Now, somehow i don't think we can just start optimizing on that.
Let's first give users tools to weed out the cases where the pointer
passed into function's argument is underaligned.

I've yet to actually see just how much mayhem this causes,
at least on the LLVM stage-2 build.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D100037

Files:
  clang/docs/ClangCommandLineReference.rst
  clang/docs/ReleaseNotes.rst
  clang/docs/UsersManual.rst
  clang/include/clang/Basic/CodeGenOptions.def
  clang/include/clang/Driver/Options.td
  clang/lib/CodeGen/CGExpr.cpp
  clang/lib/CodeGen/CodeGenFunction.cpp
  clang/lib/CodeGen/CodeGenFunction.h
  clang/lib/Driver/ToolChains/Clang.cpp
  clang/test/CodeGen/catch-function-pointer-argument-alignment.cpp
  compiler-rt/lib/ubsan/ubsan_handlers.cpp
  compiler-rt/test/ubsan/TestCases/Pointer/function-pointer-argument.cpp

Index: compiler-rt/test/ubsan/TestCases/Pointer/function-pointer-argument.cpp
===================================================================
--- /dev/null
+++ compiler-rt/test/ubsan/TestCases/Pointer/function-pointer-argument.cpp
@@ -0,0 +1,30 @@
+// RUN: %clang   -x c   -fsanitize=alignment -O0 %s -o %t && %run %t 2>&1 | FileCheck %s --implicit-check-not=" assumption " --implicit-check-not="note:" --implicit-check-not="error:"
+// RUN: %clang   -x c   -fsanitize=alignment -O1 %s -o %t && %run %t 2>&1 | FileCheck %s --implicit-check-not=" assumption " --implicit-check-not="note:" --implicit-check-not="error:"
+// RUN: %clang   -x c   -fsanitize=alignment -O2 %s -o %t && %run %t 2>&1 | FileCheck %s --implicit-check-not=" assumption " --implicit-check-not="note:" --implicit-check-not="error:"
+// RUN: %clang   -x c   -fsanitize=alignment -O3 %s -o %t && %run %t 2>&1 | FileCheck %s --implicit-check-not=" assumption " --implicit-check-not="note:" --implicit-check-not="error:"
+
+// RUN: %clang   -x c++ -fsanitize=alignment -O0 %s -o %t && %run %t 2>&1 | FileCheck %s --implicit-check-not=" assumption " --implicit-check-not="note:" --implicit-check-not="error:"
+// RUN: %clang   -x c++ -fsanitize=alignment -O1 %s -o %t && %run %t 2>&1 | FileCheck %s --implicit-check-not=" assumption " --implicit-check-not="note:" --implicit-check-not="error:"
+// RUN: %clang   -x c++ -fsanitize=alignment -O2 %s -o %t && %run %t 2>&1 | FileCheck %s --implicit-check-not=" assumption " --implicit-check-not="note:" --implicit-check-not="error:"
+// RUN: %clang   -x c++ -fsanitize=alignment -O3 %s -o %t && %run %t 2>&1 | FileCheck %s --implicit-check-not=" assumption " --implicit-check-not="note:" --implicit-check-not="error:"
+
+#include <stdlib.h>
+
+struct S {
+  int k;
+} __attribute__((aligned(8192)));
+
+void take_pointer_to_overaligned_struct(struct S *x) {
+}
+
+int main(int argc, char *argv[]) {
+  char *ptr = (char *)malloc(2 * sizeof(struct S));
+
+  take_pointer_to_overaligned_struct((struct S *)(ptr + 1));
+  // CHECK: {{.*}}function-pointer-argument.cpp:[[@LINE-7]]:51: runtime error: function pointer argument has misaligned address {{.*}} for type 'struct S', which requires 8192 byte alignment
+  // CHECK: 0x{{.*}}: note: pointer points here
+
+  free(ptr);
+
+  return 0;
+}
Index: compiler-rt/lib/ubsan/ubsan_handlers.cpp
===================================================================
--- compiler-rt/lib/ubsan/ubsan_handlers.cpp
+++ compiler-rt/lib/ubsan/ubsan_handlers.cpp
@@ -73,15 +73,25 @@
   TCK_NonnullAssign,
   /// Checking the operand of a dynamic_cast or a typeid expression.  Must be
   /// null or an object within its lifetime.
-  TCK_DynamicOperation
+  TCK_DynamicOperation,
+  /// Checking the alignment of the function's pointer argument.
+  TCK_FunctionPointerArgument
 };
 
-const char *TypeCheckKinds[] = {
-    "load of", "store to", "reference binding to", "member access within",
-    "member call on", "constructor call on", "downcast of", "downcast of",
-    "upcast of", "cast to virtual base of", "_Nonnull binding to",
-    "dynamic operation on"};
-}
+const char *TypeCheckKinds[] = {"load of",
+                                "store to",
+                                "reference binding to",
+                                "member access within",
+                                "member call on",
+                                "constructor call on",
+                                "downcast of",
+                                "downcast of",
+                                "upcast of",
+                                "cast to virtual base of",
+                                "_Nonnull binding to",
+                                "dynamic operation on",
+                                "function pointer argument has"};
+} // namespace __ubsan
 
 static void handleTypeMismatchImpl(TypeMismatchData *Data, ValueHandle Pointer,
                                    ReportOptions Opts) {
Index: clang/test/CodeGen/catch-function-pointer-argument-alignment.cpp
===================================================================
--- /dev/null
+++ clang/test/CodeGen/catch-function-pointer-argument-alignment.cpp
@@ -0,0 +1,29 @@
+// RUN: %clang_cc1                                                                                 -emit-llvm %s -o - -triple x86_64-linux-gnu | FileCheck %s -implicit-check-not="call"
+// RUN: %clang_cc1 -fstrict-pointer-alignment                                                      -emit-llvm %s -o - -triple x86_64-linux-gnu | FileCheck %s -implicit-check-not="call"
+// RUN: %clang_cc1 -fstrict-pointer-alignment -fsanitize=alignment -fno-sanitize-recover=alignment -emit-llvm %s -o - -triple x86_64-linux-gnu | FileCheck %s -implicit-check-not="call" --check-prefixes=CHECK,CHECK-SANITIZE,CHECK-SANITIZE-ANYRECOVER,CHECK-SANITIZE-NORECOVER,CHECK-SANITIZE-UNREACHABLE
+// RUN: %clang_cc1 -fstrict-pointer-alignment -fsanitize=alignment -fsanitize-recover=alignment    -emit-llvm %s -o - -triple x86_64-linux-gnu | FileCheck %s -implicit-check-not="call" --check-prefixes=CHECK,CHECK-SANITIZE,CHECK-SANITIZE-ANYRECOVER,CHECK-SANITIZE-RECOVER
+// RUN: %clang_cc1 -fstrict-pointer-alignment -fsanitize=alignment -fsanitize-trap=alignment       -emit-llvm %s -o - -triple x86_64-linux-gnu | FileCheck %s -implicit-check-not="call" --check-prefixes=CHECK,CHECK-SANITIZE,CHECK-SANITIZE-TRAP,CHECK-SANITIZE-UNREACHABLE
+
+// CHECK-SANITIZE-ANYRECOVER: @[[INT:.*]] = {{.*}} c"'int'\00" }
+// CHECK-SANITIZE-ANYRECOVER: @[[LINE_100_POINTER_ARGUMENT:.*]] = {{.*}}, i32 100, i32 16 }, {{.*}}* @[[INT]], i8 2, i8 12 }
+
+#line 100
+void func(int *data) {
+  // CHECK:                           define{{.*}} void @{{.*}}(i32* %[[DATA:.*]])
+  // CHECK-NEXT:                      [[ENTRY:.*]]:
+  // CHECK-NEXT:                        %[[DATA_ADDR:.*]] = alloca i32*, align 8
+  // CHECK-NEXT:                        store i32* %[[DATA]], i32** %[[DATA_ADDR]], align 8
+  // CHECK-SANITIZE-NEXT:               %[[PTRINT:.*]] = ptrtoint i32* %[[DATA]] to i64
+  // CHECK-SANITIZE-NEXT:               %[[MASKEDPTR:.*]] = and i64 %[[PTRINT]], 3
+  // CHECK-SANITIZE-NEXT:               %[[MASKCOND:.*]] = icmp eq i64 %[[MASKEDPTR]], 0
+  // CHECK-SANITIZE-NEXT:               br i1 %[[MASKCOND]], label %[[CONT:.*]], label %[[HANDLER_TYPE_MISMATCH:[^,]+]],{{.*}} !nosanitize
+
+  // CHECK-SANITIZE:                  [[HANDLER_TYPE_MISMATCH]]:
+  // CHECK-SANITIZE-NORECOVER-NEXT:     call void @__ubsan_handle_type_mismatch_v1_abort(i8* bitcast ({ {{{.*}}}, {{{.*}}}*, {{.*}}, {{.*}} }* @[[LINE_100_POINTER_ARGUMENT]] to i8*), i64 %[[PTRINT]]){{.*}}, !nosanitize
+  // CHECK-SANITIZE-RECOVER-NEXT:       call void @__ubsan_handle_type_mismatch_v1(i8* bitcast ({ {{{.*}}}, {{{.*}}}*, {{.*}}, {{.*}} }* @[[LINE_100_POINTER_ARGUMENT]] to i8*), i64 %[[PTRINT]]){{.*}}, !nosanitize
+  // CHECK-SANITIZE-TRAP-NEXT:          call void @llvm.ubsantrap(i8 22){{.*}}, !nosanitize
+  // CHECK-SANITIZE-UNREACHABLE-NEXT:   unreachable, !nosanitize
+
+  // CHECK-SANITIZE:                  [[CONT]]:
+  // CHECK:                             ret void
+}
Index: clang/lib/Driver/ToolChains/Clang.cpp
===================================================================
--- clang/lib/Driver/ToolChains/Clang.cpp
+++ clang/lib/Driver/ToolChains/Clang.cpp
@@ -4872,6 +4872,10 @@
                    options::OPT_fno_force_emit_vtables,
                    false))
     CmdArgs.push_back("-fforce-emit-vtables");
+  if (Args.hasFlag(options::OPT_fstrict_pointer_alignment,
+                   options::OPT_fno_strict_pointer_alignment,
+                   !RawTriple.isOSDarwin()))
+    CmdArgs.push_back("-fstrict-pointer-alignment");
   if (!Args.hasFlag(options::OPT_foptimize_sibling_calls,
                     options::OPT_fno_optimize_sibling_calls))
     CmdArgs.push_back("-mdisable-tail-calls");
Index: clang/lib/CodeGen/CodeGenFunction.h
===================================================================
--- clang/lib/CodeGen/CodeGenFunction.h
+++ clang/lib/CodeGen/CodeGenFunction.h
@@ -2898,7 +2898,9 @@
     TCK_NonnullAssign,
     /// Checking the operand of a dynamic_cast or a typeid expression.  Must be
     /// null or an object within its lifetime.
-    TCK_DynamicOperation
+    TCK_DynamicOperation,
+    /// Checking the alignment of the function's pointer argument.
+    TCK_FunctionPointerArgument
   };
 
   /// Determine whether the pointer type check \p TCK permits null pointers.
Index: clang/lib/CodeGen/CodeGenFunction.cpp
===================================================================
--- clang/lib/CodeGen/CodeGenFunction.cpp
+++ clang/lib/CodeGen/CodeGenFunction.cpp
@@ -1156,6 +1156,32 @@
     }
   }
 
+  if (CGM.getCodeGenOpts().StrictPointerAlignment &&
+      SanOpts.has(SanitizerKind::Alignment)) {
+    // If we are optimizing based on the strict pointer alignment rules,
+    // check that each pointer argument is aligned appropriately.
+    for (size_t Arg = 0; Arg != Args.size(); ++Arg) {
+      const VarDecl *VD = Args[Arg];
+
+      const auto *PtrTy = VD->getType()->getAs<PointerType>();
+      if (!PtrTy)
+        continue;
+
+      QualType PTy = PtrTy->getPointeeType();
+      if (!PTy->isObjectType())
+        continue;
+
+      SanitizerSet SkippedChecks;
+      SkippedChecks.set(SanitizerKind::Null, true);
+      SkippedChecks.set(SanitizerKind::ObjectSize, true);
+      SkippedChecks.set(SanitizerKind::Vptr, true);
+
+      EmitTypeCheck(TCK_FunctionPointerArgument, VD->getLocation(),
+                    Fn->getArg(Arg), PTy, /*Alignment=*/CharUnits(),
+                    SkippedChecks);
+    }
+  }
+
   // If any of the arguments have a variably modified type, make sure to
   // emit the type size.
   for (FunctionArgList::const_iterator i = Args.begin(), e = Args.end();
Index: clang/lib/CodeGen/CGExpr.cpp
===================================================================
--- clang/lib/CodeGen/CGExpr.cpp
+++ clang/lib/CodeGen/CGExpr.cpp
@@ -656,7 +656,8 @@
 
 bool CodeGenFunction::isNullPointerAllowed(TypeCheckKind TCK) {
   return TCK == TCK_DowncastPointer || TCK == TCK_Upcast ||
-         TCK == TCK_UpcastToVirtualBase || TCK == TCK_DynamicOperation;
+         TCK == TCK_UpcastToVirtualBase || TCK == TCK_DynamicOperation ||
+         TCK == TCK_FunctionPointerArgument;
 }
 
 bool CodeGenFunction::isVptrCheckRequired(TypeCheckKind TCK, QualType Ty) {
Index: clang/include/clang/Driver/Options.td
===================================================================
--- clang/include/clang/Driver/Options.td
+++ clang/include/clang/Driver/Options.td
@@ -2446,6 +2446,11 @@
   PosFlag<SetTrue, [CC1Option], "Enable optimizations based on the strict rules for"
             " overwriting polymorphic C++ objects">,
   NegFlag<SetFalse>>;
+defm strict_pointer_alignment : BoolFOption<"strict-pointer-alignment",
+  CodeGenOpts<"StrictPointerAlignment">, DefaultFalse,
+  NegFlag<SetFalse, [CC1Option], "Disable">,
+  PosFlag<SetTrue,  [CC1Option], "Enable">,
+  BothFlags<[], " optimizations based on the strict rules for pointer alignmen">>;
 def fstrict_overflow : Flag<["-"], "fstrict-overflow">, Group<f_Group>;
 def fsyntax_only : Flag<["-"], "fsyntax-only">,
   Flags<[NoXarchOption,CoreOption,CC1Option,FC1Option]>, Group<Action_Group>;
Index: clang/include/clang/Basic/CodeGenOptions.def
===================================================================
--- clang/include/clang/Basic/CodeGenOptions.def
+++ clang/include/clang/Basic/CodeGenOptions.def
@@ -261,6 +261,7 @@
 CODEGENOPT(FineGrainedBitfieldAccesses, 1, 0) ///< Enable fine-grained bitfield accesses.
 CODEGENOPT(StrictEnums       , 1, 0) ///< Optimize based on strict enum definition.
 CODEGENOPT(StrictVTablePointers, 1, 0) ///< Optimize based on the strict vtable pointers
+CODEGENOPT(StrictPointerAlignment, 1, 1) ///< Optimize based on the strict pointer alignment rules
 CODEGENOPT(TimePasses        , 1, 0) ///< Set when -ftime-report or -ftime-report= is enabled.
 CODEGENOPT(TimePassesPerRun  , 1, 0) ///< Set when -ftime-report=per-pass-run is enabled.
 CODEGENOPT(TimeTrace         , 1, 0) ///< Set when -ftime-trace is enabled.
Index: clang/docs/UsersManual.rst
===================================================================
--- clang/docs/UsersManual.rst
+++ clang/docs/UsersManual.rst
@@ -1669,6 +1669,12 @@
    modules where it isn't necessary. It causes more inline virtual functions
    to be emitted.
 
+.. option:: -fstrict-pointer-alignment
+
+   Enable optimizations based on the strict rules for pointer alignment.
+   This currently only affects function pointer arguments, and only controls
+   UBSan behaviour, no optimizations are currently performed based on it.
+
 .. option:: -fno-assume-sane-operator-new
 
    Don't assume that the C++'s new operator is sane.
Index: clang/docs/ReleaseNotes.rst
===================================================================
--- clang/docs/ReleaseNotes.rst
+++ clang/docs/ReleaseNotes.rst
@@ -46,7 +46,16 @@
 Major New Features
 ------------------
 
-- ...
+- Passing underaligned pointers as function arguments is undefined behaviour.
+  Previously, clang was not making any optimizations based on that,
+  however, there's interest in doing so now.
+  A new ``-fstrict-pointer-alignment`` flag was introduces (default on), which
+  will guard these new optimizations in the future, and allow disabling them.
+  UBSan's ``-fsanitize=alignment`` was enhanced to catch the cases where
+  an underaligned pointer was passed as a function argument,
+  to allow codebases to be cleaned up in preparation for this optimization,
+  to avoid miscompiles.
+
 
 Improvements to Clang's diagnostics
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
@@ -231,6 +240,11 @@
 Undefined Behavior Sanitizer (UBSan)
 ------------------------------------
 
+- ``-fsanitize=alignment`` was enhanced to catch the cases where
+  an underaligned pointer was passed as a function argument.
+  In the future clang will optimize based on that UB.
+
+
 Core Analysis Improvements
 ==========================
 
Index: clang/docs/ClangCommandLineReference.rst
===================================================================
--- clang/docs/ClangCommandLineReference.rst
+++ clang/docs/ClangCommandLineReference.rst
@@ -2222,6 +2222,10 @@
 
 Enable optimizations based on the strict rules for overwriting polymorphic C++ objects
 
+.. option:: -fstrict-pointer-alignment, -fno-strict-pointer-alignment
+
+Enable optimizations based on the strict rules for pointer alignment
+
 .. option:: -fstruct-path-tbaa, -fno-struct-path-tbaa
 
 .. option:: -fsymbol-partition=<arg>
_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
  • [PATCH] D100037: ... Roman Lebedev via Phabricator via cfe-commits

Reply via email to