On 1/14/21 12:43 AM, Richard Biener wrote:
On Wed, 13 Jan 2021, Jakub Jelinek wrote:

Hi!

The following patch doesn't actually fix the print_mem_ref bugs, I've kept
it for now as broken as it was, but at least improves the cases where
we can unambiguously map back MEM[&something + off] into some particular
reference (e.g. something.foo[1].bar etc.).
In the distant past I think we were folding such MEM_REFs back to
COMPONENT_REFs and ARRAY_REFs, but we've stopped doing that.

Yeah, because it has different semantics - *(((int *)t + 3)
accesses an int object while t.u.b accesses a 't' object from the TBAA
perspective.

  But for
diagnostics that is what the user actually want to see IMHO.
So on the attached testcase, instead of printing what is in left column
it prints what is in right column:
((int*)t) + 3           t.u.b
((int*)t) + 6           t.u.e.i
((int*)t) + 8           t.v
s + 1                   s[1]

so while that's "nice" in general, for TBAA diagnostics it might actually
be misleading.

I wonder whether we absolutely need to print a C expression here.
We could print, instead of *((int *)t + 3), "access to a memory
object of type 'int' at offset 12 bytes from 't'", thus explain
in plain english.

That said, *((int *)t + 3) is exactly what the access is,
semantically.  There's the case of a mismatch of the access type
and the TBAA type which we cannot write down in C terms but maybe
we want to have a builtin for this?  __builtin_access (ptr, lvalue-type,
tbaa-type)?

Of course, print_mem_ref needs to be also fixed to avoid printing the
nonsense it is printing right now, t is a structure type, so it can't be
cast to int* in C and in C++ only using some user operator, and
the result of what it printed is a pointer, while the uninitialized reads
are int.

I was hoping Martin would fix that, but given his comment in the PR I think
I'll fix it myself tomorrow.

Anyway, this patch is useful on its own.  Bootstrapped/regtested on
x86_64-linux and i686-linux, ok for trunk?

In the light of Martins patch this is probably reasonable but still
the general direction is wrong (which is why I didn't approve Martins
original patch).  I'm also somewhat disappointed we're breaking this
so late in the cycle.

So am I.  I didn't test this change as exhaustively as I could and
(in light of the poor test coverage) should have.  That's my bad.
FWIW, I did do it for the first patch (by instrumenting GCC and
formatting every MEM_REF it came across), but it didn't occur to
me to do it this time around.  I have now completed this testing
(it found one more ICE elsewhere that I'll fix soon).

That said, as I mentioned in the patch submission, most middle end
warnings don't format MEM_REFs.  -Wuninitialized is an outlier here.
Most middle end warnings about invalid accesses print the decl or
allocation call with either a type or size of the access, followed
by either an array index or a byte offset in to the target of
the access.  For consistency I'd like to converge most middle end
warnings on the same style (formatted by the same code).  When
that happens, the MEM_REF format should become much less important,
so I wouldn't invest too much effort into perfecting it.

Martin

Reply via email to