Richard Biener <rguent...@suse.de> writes:
> On Mon, 18 Jan 2021, Jan Hubicka wrote:
>
>> > This is a repost of:
>> > 
>> >   https://gcc.gnu.org/pipermail/gcc-patches/2020-February/539763.html
>> > 
>> > which was initially posted during stage 4.  (And yeah, I only just
>> > missed stage 4 again.)
>> > 
>> > IMO it would be better to fix the bug directly (as the patch tries
>> > to do) instead of wait for a more thorough redesign of this area.
>> > See the end of:
>> > 
>> >   https://gcc.gnu.org/pipermail/gcc-patches/2020-February/540002.html
>> > 
>> > for some stats.
>> > 
>> > Honza: Richard said he'd like your opinion on the patch.
>> > 
>> > 
>> > memrefs_conflict_p has a slightly odd structure.  It first checks
>> > whether two addresses based on SYMBOL_REFs refer to the same object,
>> > with a tristate result:
>> > 
>> >       int cmp = compare_base_symbol_refs (x,y);
>> > 
>> > If the addresses do refer to the same object, we can use offset-based 
>> > checks:
>> > 
>> >       /* If both decls are the same, decide by offsets.  */
>> >       if (cmp == 1)
>> >         return offset_overlap_p (c, xsize, ysize);
>> > 
>> > But then, apart from the special case of forced address alignment,
>> > we use an offset-based check even if we don't know whether the
>> > addresses refer to the same object:
>> > 
>> >       /* Assume a potential overlap for symbolic addresses that went
>> >     through alignment adjustments (i.e., that have negative
>> >     sizes), because we can't know how far they are from each
>> >     other.  */
>> >       if (maybe_lt (xsize, 0) || maybe_lt (ysize, 0))
>> >    return -1;
>> >       /* If decls are different or we know by offsets that there is no 
>> > overlap,
>> >     we win.  */
>> >       if (!cmp || !offset_overlap_p (c, xsize, ysize))
>> >    return 0;
>> > 
>> > This somewhat contradicts:
>> > 
>> >   /* In general we assume that memory locations pointed to by different 
>> > labels
>> >      may overlap in undefined ways.  */
>> 
>> I suppose it is becuase the code above check for SYMBOL_REF and not
>> label (that is probably about jumptables and constpool injected into
>> text segment).
>> 
>> I assume this is also bit result of GCC not being very systematic about
>> aliases.  Sometimes it assumes that two different symbols do not point
>> to same object while in other cases it is worried about aliases.
>> 
>> I see that anchors are special since they point to "same object" with
>> different offests.
>> > 
>> > at the end of compare_base_symbol_refs.  In other words, we're taking -1
>> > to mean that either (a) the symbols are equal (via aliasing) or (b) the
>> > references access non-overlapping objects.
>> 
>> I for symbol refs yes, I think so.
>> > 
>> > But even assuming that's true for normal symbols, it doesn't cope
>> > correctly with section anchors.  If a symbol X at ANCHOR+OFFSET is
>> > preemptible, either (a) X = ANCHOR+OFFSET (rather than the X = ANCHOR
>> > assumed above) or (b) X and ANCHOR reference non-overlapping objects.
>> > 
>> > And an offset-based comparison makes no sense for an anchor symbol
>> > vs. a bare symbol with no decl.  If the bare symbol is allowed to
>> > alias other symbols then it can surely alias any symbol in the
>> > anchor's block, so there are multiple anchor offsets that might
>> > induce an alias.
>> > 
>> > This patch therefore replaces the current tristate:
>> > 
>> >   - known equal
>> >   - known independent (two accesses can't alias)
>> >   - equal or independent
>> > 
>> > with:
>> > 
>> >   - known distance apart
>> >   - known independent (two accesses can't alias)
>> >   - known distance apart or independent
>> >   - don't know
>> > 
>> > For safety, the patch puts all bare symbols in the "don't know"
>> > category.  If that turns out to be too conservative, we at least
>> > need that behaviour for combinations involving a bare symbol
>> > and a section anchor.  However, bare symbols should be relatively
>> > rare these days.
>> 
>> Well, in tree-ssa code we do assume these to be either disjoint objects
>> or equal (in decl_refs_may_alias_p that continues in case
>> compare_base_decls is -1).  I am not sure if we win much by threating
>> them differently on RTL level. I would preffer staying consistent here.

Yeah, I see your point.  My concern here was that the fallback case
applies to SYMBOL_REFs without decls, which might not have been visible
at the tree-ssa level.  E.g. they might be ABI-defined symbols that have
no known relation to source-level constructs.

E.g. the small-data base symbol _gp on MIPS points at a fixed offset
from the start of the small-data area (0x7ff0 IIRC).  If the target
generated rtl code that used _gp directly, we could wrongly assume
that _gp+X can't alias BASE+Y when X != Y, even though the real test
for small-data BASEs would be whether X + 0x7ff0 != Y.

I don't think that could occur in tree-ssa.  No valid C code would
be able to refer directly to _gp in this way.

On the other hand, I don't have a specific example of where this does
go wrong, it's just a feeling that it might.  I can drop it if you
think that's better.

> So that's because if an alias (via alias attribute) is not visible
> then it can be assumed to not exist.  Which means the bug is that
> with section anchors we do not know which variables can be refered
> to via the specific anchor?

No, we do know that.

> (if there's something like a "specific"
> anchor)  That looks like the actual defect to me?  I see we have
> block_symbol and object_block which may have all the data needed
> in case accesses are somehow well-constrained?

Right.  And the patch does take advantage of that information.
E.g. the existing:

      if (SYMBOL_REF_BLOCK (x_base) != SYMBOL_REF_BLOCK (y_base))
        return 0;

says that symbols can't alias if they're known to belong to different
blocks.  And if symbols are in the same block, the behaviour of the
patch is to adjust the relative offsets to account for the positions
of the symbols in the block.

The problem is just that “unequal offsets imply no alias” doesn't
hold for section anchors.  The offset of the symbol from an anchor
has to be taken into account.  If we have:

  (symbol_ref X)
  (plus (symbol_ref ANCHOR) Y) == unpreempted X

then ignoring Y gives two false results, a false positive and a false
negative:

(a) the existing code might wrongly assume that an access to X+0 could
    alias an access to ANCHOR+0, because the offsets are equal.

(b) the existing code would wrongly assume that an access to X+0 can't
    alias an access to ANCHOR+Y, because the offsets are unequal.

The relative offset has to be adjusted by Y first, before applying
the “unequal offsets imply no alias” rule.

>> Otheriwse the patch looks good to me.
>
> So let's go with it?  It looks like for decl vs. section anchor we
> can identify the offset of the decl in the anchor block and thus
> determine a offset adjustment necessary to perform an offset based
> check, no?

Yeah, that's what the patch is trying to do.

Thanks,
Richard

Reply via email to