Re: [PATCH 3/X] [libsanitizer] Add option to bootstrap using HWASAN

Martin Liška Thu, 21 Nov 2019 05:06:15 -0800

On 11/20/19 4:45 PM, Matthew Malcomson wrote:

On 20/11/2019 14:33, Martin Liška wrote:

On 11/13/19 4:24 PM, Matthew Malcomson wrote:

On 12/11/2019 12:08, Martin Liška wrote:

On 11/11/19 5:03 PM, Matthew Malcomson wrote:

Ah!
My apologies -- I sent up a series with a few documentation mistakes.


b) Marking 'ptr' and 'mem' in the dump sounds like a good idea to me.
      Exactly how I'm not sure -- maybe with a colourscheme?  Do you have a
      marking in mind?


Libsanitizer is capable of using colors for report printing.
I can help with that and come up with a patch for upstream.


      Uninitialised shadow space has the zero tag.  However, there are a few
      extra details that help in understanding these traces:

      On the stack, zero is both uninitialized and "the background" (i.e.
      the tag for anything not specially instrumented, like register spills
      and parameters passed on the stack).
      However, accessible tagged objects can be given a zero tag.


Question here would be if we should use non-zero tags here? Maybe related
to my comment about skipping of HWASAN_STACK_BACKGROUND tag?


Unfortunately we can't skip the zero tag at compile time when using a
random frame tag.  This is because we don't know at compile time what
the random frame tag will be.

On each entry to a frame a "base tag" is generated randomly at runtime.
Each local object in the frame has a compile-time offset from this random
tag; that offset is what gets calculated in `hwasan_increment_tag`.
The tag assigned to a local object is the runtime random frame tag plus
the compile-time constant offset.
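
As a minimal sketch of that rule (the helper name and signature below are
hypothetical, not taken from the patch), assuming tags are 8 bits wide so
the addition wraps modulo 256:

```c
#include <stdint.h>

/* Hypothetical helper, for illustration only: the tag of a local object
   is the frame's runtime base tag plus the object's compile-time offset.
   Tags are assumed to be 8 bits wide, so the sum wraps modulo 256.  */
static uint8_t
local_object_tag (uint8_t frame_base_tag, uint8_t offset)
{
  return (uint8_t) (frame_base_tag + offset);
}
```

With a base tag of 0xfe, the object at offset 2 ends up with tag zero.
Since the base tag is only known at runtime, the compiler cannot tell in
advance which offset will land on zero.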


Ah, ok, I see.



I could avoid HWASAN_STACK_BACKGROUND as a tag when the parameter
`hwasan-random-frame-tag` is false, since then there is no runtime
random base tag (instead I start with zero).


Yes, I would recommend that approach.


I'll be happy to add that in if you'd like -- I decided against it since
it would only matter when a function has 256 or more variables, but I
flip-flopped on the decision a few times.
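
One possible way to skip the background tag when the base tag is fixed is
sketched below.  This is an assumption about how it could be done, not the
code in the patch; the function name and the macro are hypothetical, with
the macro standing in for HWASAN_STACK_BACKGROUND (zero, as described in
this thread).

```c
#include <stdint.h>

/* Stands in for HWASAN_STACK_BACKGROUND; zero per the discussion.  */
#define STACK_BACKGROUND_TAG 0

/* Hypothetical helper: tag for the n-th (0-based) tagged local object in
   a frame whose base tag is fixed at compile time.  It cycles through
   the 255 values 1..255 and so never yields STACK_BACKGROUND_TAG.  */
static uint8_t
nth_fixed_frame_tag (unsigned n)
{
  return (uint8_t) (n % 255 + 1);
}
```

Only a frame with more tagged objects than there are non-background tag
values ever reuses a tag, which is why the skip matters only for very
large functions.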

      We allow this to avoid runtime checks when incrementing the random
      frame tag to get the tag for a new local variable.
      We can easily avoid the zero tag at compile-time if we don't use a
      random tag for each frame.  I had this in development at one point
      and found it quite useful for verification.  I already have an option
      to disable random tags for each frame that this ability could go
      under.
      I don't believe (but am not 100% certain) this option is in LLVM.

      On the heap uninitialised is tag zero, but memory that has been
      `free`d is given a random tag, so non-zero in a dump does not mean a
      valid object.

c) Is there an alternate notation you have in mind?
      I would guess the "dots" are there to say "this granule has no
      short-tag information", and I'm not sure what would be a better
      way to demonstrate that.


Now I've got it. A dot means that the top byte of a pointer equals zero.
Right?



Ah!
I think I never described the "short-tag" functionality, and the fact
it's in the debug output is getting confusing.

This will also be part of answering your question "c)", and question
"h", in the other email
https://gcc.gnu.org/ml/gcc-patches/2019-11/msg01950.html .


----------

The main tagging behaviour as described has a natural limitation.
Invalid accesses that do not cross a 16 byte boundary are not caught,
since each shadow-memory tag applies to a 16 byte chunk.

To account for this, HWASAN has a "short-tag" functionality.
This functionality was introduced in llvm-svn revision 365551.

Usually, a shadow-memory byte records the *tag* that is valid for access
to the relevant 16 byte granule in normal memory.
When using short-tags, if an object fills only part of a 16 byte granule
in normal memory, the corresponding shadow-memory byte stores the
*length (in bytes) into this granule that is valid*.
The *tag* is then stored in the last byte of the 16 byte granule in
normal memory.
(We know that last byte is unused, since this is a "short" granule tag).


Now, checking a memory access consists of two parts.
1) A normal tag comparison.
2) A fallback in the tag-mismatch case.
     This fallback checks if the accessing pointer is accessing fewer bytes
     into the granule than the length given in shadow-memory.
     Then if that's the case it also checks the pointer's tag matches the
     last byte in this 16 byte granule.
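
The two-part check can be modelled roughly as follows.  This is a
simplified sketch, not the libsanitizer implementation: the function name
and its parameters are hypothetical, and the access is assumed to lie
within a single granule.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define GRANULE_SIZE 16

/* Hypothetical model of one check.  `granule` points at the 16 bytes of
   normal memory, `shadow` is the granule's shadow-memory byte, `ptr_tag`
   is the tag carried by the pointer, `offset` is where the access starts
   within the granule and `size` is the number of bytes accessed.  */
static bool
access_ok (const uint8_t *granule, uint8_t shadow, uint8_t ptr_tag,
           size_t offset, size_t size)
{
  /* 1) Normal tag comparison.  */
  if (ptr_tag == shadow)
    return true;

  /* 2) Fallback: the shadow byte must be a valid short length (1..15) ...  */
  if (shadow < 1 || shadow > GRANULE_SIZE - 1)
    return false;

  /* ... the access must stay within that many bytes of the granule ...  */
  if (offset + size > shadow)
    return false;

  /* ... and the real tag, kept in the granule's last byte, must match.  */
  return granule[GRANULE_SIZE - 1] == ptr_tag;
}
```

For a granule whose shadow byte is 5 and whose last byte holds tag 0x2a, a
4 byte access at offset 0 through a pointer tagged 0x2a passes, while the
same access at offset 4 fails.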


That is a little difficult to explain clearly in text, so I apologise if
the above doesn't make sense.


I've got the technique; it's quite a nice trick, I would say. However, it makes
the sanitizer even more complicated.



----------

The hwasan error-reporting output lists both memory tags *and* short-tags.
These are the two sections under the titles of "Memory tags ..." and
"Tags for short granules ...".

The first printed section shows what is stored in shadow-memory.
This is usually the tag, but can be a length if using "short" tags.
The second section contains the "last byte of a granule" for every
granule whose shadow-memory byte could be a length.

This is why the majority of the "Memory tags ..." section is zero
(uninitialized).
The majority of the "Tags for short granules ..." section is dots to
represent that this granule can't be a "short" granule.
Hwasan knows those granules can't be "short" granules since their
corresponding byte in shadow memory is not a valid length for a
short-granule interpretation.
Valid lengths are in the range 1 to 15.
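
The rule for choosing between a dot and a byte value can be sketched as
below.  The function names are hypothetical and the two-character cell
format is an assumption made for illustration; the real report layout may
differ.

```c
#include <stdint.h>
#include <stdio.h>

/* A shadow byte can only be read as a short-granule length if it lies in
   the range 1..15.  */
static int
shadow_is_possible_length (uint8_t shadow)
{
  return shadow >= 1 && shadow <= 15;
}

/* Hypothetical formatter for one cell of the "Tags for short granules"
   section: the granule's last byte in hex when the shadow byte could be
   a length, dots otherwise.  `buf` must hold at least 3 bytes.  */
static void
format_short_granule_cell (char buf[3], uint8_t shadow, uint8_t last_byte)
{
  if (shadow_is_possible_length (shadow))
    snprintf (buf, 3, "%02x", (unsigned) last_byte);
  else
    snprintf (buf, 3, "..");
}
```

A shadow byte of 0 or 16 therefore gives dots, while a shadow byte of 5
gives whatever the granule's last byte happens to contain.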

It is up to the user to disambiguate the two possibilities in the output.


I've got it.


----------

I have not implemented setting up short-tags for the stack (and do not
intend to for GCC 10).


Which makes perfect sense; it's an initial implementation.


This explains your question "h)" -- the testcase you found accesses a
stack-allocated buffer.
The access is outside the buffer, but not outside the 16 byte granule
that object is in.  Hence without short-tags this can not be detected by
hwasan.
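
To make the granule arithmetic concrete, here is a small sketch (the
helper name is hypothetical and the addresses in the note after it are
made up):

```c
#include <stdbool.h>
#include <stdint.h>

#define GRANULE_SIZE 16

/* Two addresses share a shadow-memory tag exactly when they fall in the
   same 16 byte granule.  */
static bool
same_granule (uintptr_t a, uintptr_t b)
{
  return a / GRANULE_SIZE == b / GRANULE_SIZE;
}
```

For a 10 byte buffer placed at the 16 byte aligned address 0x1000, an
access to byte 12 (address 0x100c) is out of bounds but still in the
buffer's granule, so the plain tag comparison passes.  An access to byte
16 (address 0x1010) falls in the next granule and can be caught.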

It also explains your question "c)": there are no granules that could be
interpreted as "short" because GCC doesn't yet set any granules as "short".

----------

Note: You will likely see some stack error-reports that do include
"short" tag information.  This is not because the compiler has generated
short tag information, but because the tags that have been generated
could be interpreted as valid short tags.
This could cause some rare false-passes, but hwasan is already a
probabilistic sanitizer.

Note 2: Adding short-tags later is backwards-compatible -- especially
since I have not added inline tests yet.
The compatibility story for adding short-tags is:
- If you generate short-tags you must have short-tag checking.
- Having short-tag checking without generating short-tags can add
    rare false-passes.


Martin


Note 3: short-tags is not a feature in MTE.


d) I agree, an address offset annotation on each line of the shadow
      memory sounds useful.


I can come up with an upstream patch as well.

Thank you,
Martin


Cheers,
MM


Thanks,
Martin


I'm attaching the entire updated patch series (with the other
documentation fixes in it too) and the fixed patch for just this part in
case you just want to compile and test right now.
