https://github.com/gbMattN updated 
https://github.com/llvm/llvm-project/pull/123595

>From 807c2c8be0517cbb1b9db890f48baeb6f226ba2f Mon Sep 17 00:00:00 2001
From: gbMattN <matthew.n...@sony.com>
Date: Mon, 20 Jan 2025 11:02:06 +0000
Subject: [PATCH 1/6] [TySan] Add initial documentation

---
 clang/docs/TypeSanitizer.rst | 152 +++++++++++++++++++++++++++++++++++
 1 file changed, 152 insertions(+)
 create mode 100644 clang/docs/TypeSanitizer.rst

diff --git a/clang/docs/TypeSanitizer.rst b/clang/docs/TypeSanitizer.rst
new file mode 100644
index 00000000000000..6b320f3bb1773d
--- /dev/null
+++ b/clang/docs/TypeSanitizer.rst
@@ -0,0 +1,152 @@
+================
+TypeSanitizer
+================
+
+.. contents::
+   :local:
+
+Introduction
+============
+
+TypeSanitizer is a detector for strict type aliasing violations. It consists of a compiler
+instrumentation module and a run-time library. The tool detects violations such as the use
+of an illegally cast pointer, or misuse of a union.
+
+The violations TypeSanitizer catches may cause the compiler to emit incorrect code.
+
+Typical slowdown introduced by TypeSanitizer is about **4x** [[CHECK THIS]]. Typical memory overhead introduced by TypeSanitizer is about **9x**.
+
+How to build
+============
+
+Build LLVM/Clang with `CMake <https://llvm.org/docs/CMake.html>`_ and enable
+the ``compiler-rt`` runtime. An example CMake configuration that will allow
+for the use/testing of TypeSanitizer:
+
+.. code-block:: console
+
+   $ cmake -DCMAKE_BUILD_TYPE=Release -DLLVM_ENABLE_PROJECTS="clang" -DLLVM_ENABLE_RUNTIMES="compiler-rt" <path to source>/llvm
+
+Usage
+=====
+
+Compile and link your program with ``-fsanitize=type`` flag.  The
+TypeSanitizer run-time library should be linked to the final executable, so
+make sure to use ``clang`` (not ``ld``) for the final link step. To
+get a reasonable performance add ``-O1`` or higher
+(`This may currently lead to false-negatives <https://github.com/llvm/llvm-project/issues/120855>`).
+TypeSanitizer by default doesn't print the full stack trace on error messages. Use ``TYSAN_OPTIONS=print_stacktrace=1``
+to print the full trace. To get nicer stack traces in error messages add ``-fno-omit-frame-pointer`` and
+``-g``.  To get perfect stack traces you may need to disable inlining (just use ``-O1``) and tail call elimination
+(``-fno-optimize-sibling-calls``).
+
+.. code-block:: console
+
+    % cat example_AliasViolation.c
+    int main(int argc, char **argv) {
+      int x = 100;
+      float *y = (float*)&x;
+      *y += 2.0f;          // Strict aliasing violation
+      return 0;
+    }
+
+    # Compile and link
+    % clang -g -fsanitize=type example_AliasViolation.c
+
+If a strict aliasing violation is detected, the program will print an error message to stderr.
+The program won't terminate, which will allow you to detect many strict aliasing violations in one
+run.
+
+.. code-block:: console
+    % ./a.out
+    ==1375532==ERROR: TypeSanitizer: type-aliasing-violation on address 0x7ffeebf1a72c (pc 0x5b3b1145ff41 bp 0x7ffeebf1a660 sp 0x7ffeebf19e08 tid 1375532)
+    READ of size 4 at 0x7ffeebf1a72c with type float accesses an existing object of type int
+        #0 0x5b3b1145ff40 in main example_AliasViolation.c:4:10
+
+    ==1375532==ERROR: TypeSanitizer: type-aliasing-violation on address 0x7ffeebf1a72c (pc 0x5b3b1146008a bp 0x7ffeebf1a660 sp 0x7ffeebf19e08 tid 1375532)
+    WRITE of size 4 at 0x7ffeebf1a72c with type float accesses an existing object of type int
+        #0 0x5b3b11460089 in main example_AliasViolation.c:4:10
+
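One common way to remove a violation like the one reported above is to copy the object representation with ``memcpy`` instead of dereferencing a cast pointer. The sketch below is illustrative only and is not part of TypeSanitizer or this patch; the helper names ``float_bits`` and ``float_from_bits`` are hypothetical, and it assumes ``float`` is a 32-bit IEEE 754 type.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: return the bit pattern of a float without
   reading the float object through an incompatible pointer type. */
static uint32_t float_bits(float f) {
  uint32_t bits;
  memcpy(&bits, &f, sizeof bits); /* byte-wise copy, no typed access to f */
  return bits;
}

/* Hypothetical helper: the inverse conversion, again via memcpy. */
static float float_from_bits(uint32_t bits) {
  float f;
  memcpy(&f, &bits, sizeof f);
  return f;
}
```

Unlike ``*(float*)&x``, neither helper reads or writes an object through an lvalue of an incompatible type, so a strict aliasing checker has nothing to report for them.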
+Error terminology
+------------------
+
+There are some terms that may appear in TypeSanitizer errors that are derived from TBAA Metadata. This
+section hopes to provide a brief dictionary of these terms.
+
+* ``omnipotent char``: This is a special type which can alias with anything. Its name comes from the C/C++
+  type ``char``.
+* ``type p[x]``: Sometimes a program could generate distinct TBAA metadata that resolve to the same name.
+  To make them unique, they have the character 'p' and a number prepended to their name.
+
+These terms are a result of non-user-facing processes, and not always self-explanatory. There is some
+interest in changing TypeSanitizer in the future to translate these terms before printing them to users.
+
+Sanitizer features
+==================
+
+``__has_feature(type_sanitizer)``
+------------------------------------
+
+In some cases one may need to execute different code depending on whether
+TypeSanitizer is enabled.
+:ref:`\_\_has\_feature <langext-__has_feature-__has_extension>` can be used for
+this purpose.
+
+.. code-block:: c
+
+    #if defined(__has_feature)
+    #  if __has_feature(type_sanitizer)
+    // code that builds only under TypeSanitizer
+    #  endif
+    #endif
+
+``__attribute__((no_sanitize("type")))``
+-----------------------------------------------
+
+Some code you may not want to be instrumented by TypeSanitizer.  One may use the
+function attribute ``no_sanitize("type")`` to disable instrumenting type aliasing.
+Its possible, depending on what happens in non-instrumented code, that instrumented code
+emits false-positives/ false-negatives. This attribute may not be supported by other
+compilers, so we suggest to use it together with ``__has_feature(type_sanitizer)``.
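A minimal sketch of combining the attribute with the feature check, so the same source still builds with compilers that do not know TypeSanitizer. The macro name ``NO_SANITIZE_TYPE`` and the function ``byte_sum`` are hypothetical names chosen for this illustration; they do not come from Clang or from this patch.

```c
#include <stddef.h>

/* NO_SANITIZE_TYPE is a hypothetical macro name chosen for this sketch.
   It expands to the attribute only when TypeSanitizer is enabled. */
#if defined(__has_feature)
#  if __has_feature(type_sanitizer)
#    define NO_SANITIZE_TYPE __attribute__((no_sanitize("type")))
#  endif
#endif
#ifndef NO_SANITIZE_TYPE
#  define NO_SANITIZE_TYPE /* expands to nothing elsewhere */
#endif

/* Example function excluded from TypeSanitizer instrumentation. */
NO_SANITIZE_TYPE
static unsigned byte_sum(const unsigned char *data, size_t len) {
  unsigned sum = 0;
  for (size_t i = 0; i < len; ++i)
    sum += data[i];
  return sum;
}
```

Keeping the ``__has_feature`` test on its own line, nested inside ``defined(__has_feature)``, avoids a preprocessor error on compilers that lack ``__has_feature`` entirely.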
+
+``__attribute__((disable_sanitizer_instrumentation))``
+--------------------------------------------------------
+
+The ``disable_sanitizer_instrumentation`` attribute can be applied to functions
+to prevent all kinds of instrumentation. As a result, it may introduce false
+positives and incorrect stack traces. Therefore, it should be used with care,
+and only if absolutely required; for example for certain code that cannot
+tolerate any instrumentation and resulting side-effects. This attribute
+overrides ``no_sanitize("type")``.
+
+Ignorelist
+----------
+
+TypeSanitizer supports the ``src`` and ``fun`` entity types in
+:doc:`SanitizerSpecialCaseList`, which can be used to suppress aliasing
+violation reports in the specified source files or functions. As with
+other methods of ignoring instrumentation, this can result in false
+positives or false negatives.
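As a rough illustration, an ignorelist using these two entity types might look like the fragment below. The file name, the directory pattern, and the function name are all hypothetical; the entries are written without a named section, which in the special case list format means they apply to every enabled sanitizer.

```text
# tysan_ignorelist.txt (hypothetical file name)
# Suppress reports from any source file under a third_party directory.
src:*/third_party/*
# Suppress reports from this (hypothetical) function.
fun:legacy_parse_header
```

Such a file would be passed to Clang with ``-fsanitize-ignorelist=tysan_ignorelist.txt`` alongside ``-fsanitize=type``.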
+
+Limitations
+-----------
+
+* TypeSanitizer uses more real memory than a native run. It uses 8 bytes of
+  shadow memory for each byte of user memory.
+* There are transformation passes which run before TypeSanitizer. If these 
+  passes optimize out an aliasing violation, TypeSanitizer cannot catch it.
+* Currently, all instrumentation is inlined. This can result in a **15x** 
+  (on average) increase in generated file size, and **3x** to **7x** increase 
+  in compile time. In some documented cases this can cause the compiler to hang.
+  A fix for this is in the last stages of release.
+* Codebases that use unions and struct-initialized variables can see incorrect 
+  results, as TypeSanitizer doesn't yet instrument these reliably.
+
+Current Status
+--------------
+
+TypeSanitizer is brand new, and still in development. There are some known 
+issues, especially in areas where clang doesn't generate valid TBAA metadata. 
+
+We are actively working on enhancing the tool --- stay tuned.  Any help, 
+issues, pull requests, ideas, is more than welcome.

>From 5c9d8f8176ebcf1bd3f1ef49ffb0e685c50d0749 Mon Sep 17 00:00:00 2001
From: gbMattN <matthew.n...@sony.com>
Date: Mon, 20 Jan 2025 11:41:35 +0000
Subject: [PATCH 2/6] Tweaks and edits

---
 clang/docs/TypeSanitizer.rst | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/clang/docs/TypeSanitizer.rst b/clang/docs/TypeSanitizer.rst
index 6b320f3bb1773d..ceb2fca37df904 100644
--- a/clang/docs/TypeSanitizer.rst
+++ b/clang/docs/TypeSanitizer.rst
@@ -33,8 +33,7 @@ Usage
 Compile and link your program with ``-fsanitize=type`` flag.  The
 TypeSanitizer run-time library should be linked to the final executable, so
 make sure to use ``clang`` (not ``ld``) for the final link step. To
-get a reasonable performance add ``-O1`` or higher 
-(`This may currently lead to false-negatives 
<https://github.com/llvm/llvm-project/issues/120855>`). 
+get reasonable performance, add ``-O1`` or higher.
 TypeSanitizer by default doesn't print the full stack trace on error messages. 
Use ``TYSAN_OPTIONS=print_stacktrace=1`` 
 to print the full trace. To get nicer stack traces in error messages add 
``-fno-omit-frame-pointer`` and 
 ``-g``.  To get perfect stack traces you may need to disable inlining (just 
use ``-O1``) and tail call elimination 
@@ -70,8 +69,9 @@ run.
 Error terminology
 ------------------
 
-There are some terms that may appear in TypeSanitizer errors that are derived 
from TBAA Metadata. This 
-section hopes to provide a brief dictionary of these terms.
+There are some terms that may appear in TypeSanitizer errors that are derived from
+`TBAA Metadata <https://llvm.org/docs/LangRef.html#tbaa-metadata>`_. This section provides a
+brief dictionary of these terms.
 
 * ``omnipotent char``: This is a special type which can alias with anything. 
Its name comes from the C/C++ 
   type ``char``.
@@ -105,7 +105,7 @@ this purpose.
 
 Some code you may not want to be instrumented by TypeSanitizer.  One may use 
the
 function attribute ``no_sanitize("type")`` to disable instrumenting type 
aliasing. 
-Its possible, depending on what happens in non-instrumented code, that 
instrumented code 
+It is possible, depending on what happens in non-instrumented code, that 
instrumented code 
 emits false-positives/ false-negatives. This attribute may not be supported by 
other 
 compilers, so we suggest to use it together with 
``__has_feature(type_sanitizer)``.
 
@@ -138,7 +138,7 @@ Limitations
 * Currently, all instrumentation is inlined. This can result in a **15x** 
   (on average) increase in generated file size, and **3x** to **7x** increase 
   in compile time. In some documented cases this can cause the compiler to 
hang.
-  A fix for this is in the last stages of release.
+  There are plans to improve this in the future.
 * Codebases that use unions and struct-initialized variables can see incorrect 
   results, as TypeSanitizer doesn't yet instrument these reliably.
 

>From 3645fc18e198d0642543b002f1853e983dab1b65 Mon Sep 17 00:00:00 2001
From: gbMattN <matthew.n...@sony.com>
Date: Mon, 20 Jan 2025 15:17:54 +0000
Subject: [PATCH 3/6] Fixed error in code block

---
 clang/docs/TypeSanitizer.rst | 1 +
 1 file changed, 1 insertion(+)

diff --git a/clang/docs/TypeSanitizer.rst b/clang/docs/TypeSanitizer.rst
index ceb2fca37df904..20d0fc71775237 100644
--- a/clang/docs/TypeSanitizer.rst
+++ b/clang/docs/TypeSanitizer.rst
@@ -57,6 +57,7 @@ The program won't terminate, which will allow you to detect 
many strict aliasing
 run.
 
 .. code-block:: console
+
     % ./a.out
     ==1375532==ERROR: TypeSanitizer: type-aliasing-violation on address 
0x7ffeebf1a72c (pc 0x5b3b1145ff41 bp 0x7ffeebf1a660 sp 0x7ffeebf19e08 tid 
1375532)
     READ of size 4 at 0x7ffeebf1a72c with type float accesses an existing 
object of type int

>From 3b27cf7b653b52d89d669db7b59f96a0ea719d03 Mon Sep 17 00:00:00 2001
From: gbMattN <matthew.n...@sony.com>
Date: Mon, 20 Jan 2025 15:31:01 +0000
Subject: [PATCH 4/6] Add TySan links to other doc pages

---
 clang/docs/UsersManual.rst | 3 +++
 clang/docs/index.rst       | 1 +
 2 files changed, 4 insertions(+)

diff --git a/clang/docs/UsersManual.rst b/clang/docs/UsersManual.rst
index 260e84910c6f78..a56c9425ebb757 100644
--- a/clang/docs/UsersManual.rst
+++ b/clang/docs/UsersManual.rst
@@ -2103,7 +2103,10 @@ are listed below.
 
       ``-fsanitize=undefined``: :doc:`UndefinedBehaviorSanitizer`,
       a fast and compatible undefined behavior checker.
+   -  .. _opt_fsanitize_type:
 
+      ``-fsanitize=type``: :doc:`TypeSanitizer`, a detector for strict
+      aliasing violations.
    -  ``-fsanitize=dataflow``: :doc:`DataFlowSanitizer`, a general data
       flow analysis.
    -  ``-fsanitize=cfi``: :doc:`control flow integrity <ControlFlowIntegrity>`
diff --git a/clang/docs/index.rst b/clang/docs/index.rst
index cc070059eede5d..26cc08e23a5762 100644
--- a/clang/docs/index.rst
+++ b/clang/docs/index.rst
@@ -35,6 +35,7 @@ Using Clang as a Compiler
    UndefinedBehaviorSanitizer
    DataFlowSanitizer
    LeakSanitizer
+   TypeSanitizer
    RealtimeSanitizer
    SanitizerCoverage
    SanitizerStats

>From 8e3fbe17edbc6a8dd429743a8037b93d51deeb66 Mon Sep 17 00:00:00 2001
From: gbMattN <146744444+gbma...@users.noreply.github.com>
Date: Mon, 20 Jan 2025 16:45:44 +0000
Subject: [PATCH 5/6] Update clang/docs/TypeSanitizer.rst

Co-authored-by: Florian Hahn <f...@fhahn.com>
---
 clang/docs/TypeSanitizer.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/clang/docs/TypeSanitizer.rst b/clang/docs/TypeSanitizer.rst
index 20d0fc71775237..96855d26186ead 100644
--- a/clang/docs/TypeSanitizer.rst
+++ b/clang/docs/TypeSanitizer.rst
@@ -52,7 +52,7 @@ to print the full trace. To get nicer stack traces in error 
messages add ``-fno-
     # Compile and link
     % clang -g -fsanitize=type example_AliasViolation.c
 
-If a strict aliasing violation is detected, the program will print an error 
message to stderr. 
+The program will print an error message to stderr each time a strict aliasing violation is detected.
 The program won't terminate, which will allow you to detect many strict 
aliasing violations in one 
 run.
 

>From b47bb47d5187dfa8507238826bac274f996d25c3 Mon Sep 17 00:00:00 2001
From: gbMattN <matthew.n...@sony.com>
Date: Mon, 20 Jan 2025 17:03:33 +0000
Subject: [PATCH 6/6] Touchups

---
 clang/docs/TypeSanitizer.rst | 21 ++++++++++++---------
 1 file changed, 12 insertions(+), 9 deletions(-)

diff --git a/clang/docs/TypeSanitizer.rst b/clang/docs/TypeSanitizer.rst
index 96855d26186ead..ed68690fafa7ca 100644
--- a/clang/docs/TypeSanitizer.rst
+++ b/clang/docs/TypeSanitizer.rst
@@ -9,12 +9,13 @@ Introduction
 ============
 
 TypeSanitizer is a detector for strict type aliasing violations. It consists 
of a compiler
-instrumentation module and a run-time library. The tool detects violations 
such as the use 
-of an illegally cast pointer, or misuse of a union.
+instrumentation module and a run-time library. The tool detects violations where you access
+memory under a different type than the dynamic type of the object.
 
 The violations TypeSanitizer catches may cause the compiler to emit incorrect 
code.
 
-Typical slowdown introduced by TypeSanitizer is about **4x** [[CHECK THIS]]. 
Typical memory overhead introduced by TypeSanitizer is about **9x**. 
+As TypeSanitizer is still experimental, it can currently have a large impact on runtime speed,
+memory use, and code size.
 
 How to build
 ============
@@ -76,11 +77,11 @@ brief dictionary of these terms.
 
 * ``omnipotent char``: This is a special type which can alias with anything. 
Its name comes from the C/C++ 
   type ``char``.
-* ``type p[x]``: Sometimes a program could generate distinct TBAA metadata 
that resolve to the same name. 
-  To make them unique, they have the character 'p' and a number prepended to 
their name.
+* ``type p[x]``: This signifies a pointer to the type, where ``x`` is the number of indirections needed to reach the final value.
+  As an example, a pointer to a pointer to an integer would be ``type p2 int``.
 
-These terms are a result of non-user-facing processes, and not always 
self-explanatory. There is some 
-interest in changing TypeSanitizer in the future to translate these terms 
before printing them to users.
+TypeSanitizer is still experimental. User-facing error messages should be improved in the future to remove
+references to LLVM IR-specific terms.
 
 Sanitizer features
 ==================
@@ -147,7 +148,9 @@ Current Status
 --------------
 
 TypeSanitizer is brand new, and still in development. There are some known 
-issues, especially in areas where clang doesn't generate valid TBAA metadata. 
+issues, especially in areas where Clang's emitted TBAA data isn't extensive 
+enough for TypeSanitizer's runtime.
 
 We are actively working on enhancing the tool --- stay tuned.  Any help, 
-issues, pull requests, ideas, is more than welcome.
+issues, pull requests, and ideas are more than welcome. You can find the
+`issue tracker here <https://github.com/llvm/llvm-project/issues?q=is%3Aissue%20state%3Aopen%20TySan%20label%3Acompiler-rt%3Atysan>`_.
