On Tue, 14 May 2024, Kees Cook wrote:

> On Tue, May 14, 2024 at 02:17:16PM +0000, Qing Zhao wrote:
> > The current major issue with the warning is:  the constant index value 4
> > is not in the source code, it’s a compiler generated intermediate value
> > (even though it’s a correct value -:)). Such warning messages confuse
> > the end-users with information that cannot be connected directly to the
> > source code.
> 
> Right -- this "4" comes from -fsanitize=array-bounds (in "warn but
> continue" mode).
> 
> Now, the minimized PoC shows a situation that triggers the situation, but
> I think it's worth looking at the original code that caused this false
> positive:
> 
>       for (i = 0; i < sg->num_entries; i++) {
>               gce = &sg->gce[i];
> 
> 
> The issue here is that sg->num_entries has already been bounds-checked
> in a separate function. As a result, the value range tracking for "i"
> here is unbounded.
> 
> Enabling -fsanitize=array-bounds means the sg->gce[i] access gets
> instrumented, and suddenly "i" gains an implicit range, induced by the
> sanitizer.
> 
> (I would point out that this is very similar to the problems we've had
> with -fsanitize=shift[1][2]: the sanitizer induces a belief about a
> given variable's range this isn't true.)
> 
> Now, there is an argument to be made that the original code should be
> doing:
> 
>       for (i = 0; i < 4 && i < sg->num_entries; i++) {
> 
> But this is:
> 
> a) logically redundant (Linux maintainers don't tend to like duplicating
>    their range checking)
> 
> b) a very simple case
> 
> The point of the sanitizers is to catch "impossible" situations at
> run-time for the cases where some value may end up out of range. Having
> it _induce_ a range on the resulting code makes no sense.
> 
> Could we, perhaps, have sanitizer code not influence the value range
> tracking? That continues to look like the root cause for these things.

The sanitizer code adds checks that are not distinguishable from
user code exactly because we want value-range analysis to eventually
elide even (redundant) sanitizer checks.

I think the fix for the source when there's a-priori knowledge
of sg->num_entries is to assert that knowledge through language
features or when using C through GNU extensions like assert()
using __builtin_unreachable ().  That also serves documentation
purposes "this code expects sg->num_entries to be bounds-checked".

To me it doesn't make much sense to mix sanitizing of array
accesses and at the same time do -Warray-bound diagnostics.

Note I tried to teach jump threading to be less aggressive
threading paths to exceptional situations (I think the
sanitizer runtime calls are at least marked unlikely), but
the comment was that even those are very much desired but
I can't remember the details.  This was as part of PR111515
but I think I've run into this earlier when trying to
improve back_threader_profitability::possibly_profitable_path_p.
There we have

  /* Threading is profitable if the path duplicated is hot but also
     in a case we separate cold path from hot path and permit optimization
     of the hot path later.  Be on the agressive side here. In some testcases,
     as in PR 78407 this leads to noticeable improvements.  */

here we have

  if (A)
    unlikely();
  B;
  if (A)
    unlikely ();

and we choose to perform that path separation which optimizes
the not exceptional path which automatically separates the
exceptional path as well.

IMO that sanitizer mode that continues running is bad - it makes
the compiler aware of undefined behavior and make the code run
into it with open eyes.  You get what you asked for.

Richard.

Reply via email to