On 5/18/22 00:26, Richard Biener wrote:
On Tue, 17 May 2022, Martin Sebor wrote:

On 5/16/22 03:16, Richard Biener wrote:
The following tries to correct get_origin_and_offset_r not handling
non-constant sizes of array elements in ARRAY_REFs and non-constant
offsets of COMPONENT_REFs.  It isn't exactly clear how such failures
should be treated in this API and existing handling isn't consistent
here either.  The following applies two different variants, treating
non-constant array sizes like non-constant array indices and
treating non-constant offsets of COMPONENT_REFs by terminating
the recursion (not sure what that means to the callers).

Basically the code failed to use component_ref_field_offset and
array_ref_element_size and instead relies on inappropriate
helpers (that shouldn't exist in the first place ...).  The code
is also not safe-guarded against overflows in the final offset/size
computations but I'm not trying to rectify that.

Martin - can you comment on how the API should handle such
situations?

It looks like the -Wrestrict warning here ignores offsets equal to
HOST_WIDE_INT_MIN so presumably setting dst_offset (via *fldoff) to
that should avoid it.  Or maybe to HWI_MAX as it does for variable
offsets.

Can you suggest wording for the function comment as to how it handles
the case when offset or size cannot be determined exactly?   The
comment currently only suggests that the caller possibly cannot
trust fldsize or off when the function returns NULL but the actual
implementation differs from that.




It also looks like the function only handles constant offsets and
sizes, and I have a vague recollection of enhancing it to work with
ranges.  That should avoid the overflow problem too.

So the correct thing is to return NULL?

No, I don't think so.  The recursive get_origin_and_offset_r() assumes
its own invocations never return null (the one place it does that should
probably be moved to the nonrecursive caller).


Is the patch OK as-is?

It's an improvement but it's not complete as the following also ICEs
(albeit somewhere else):

void* f (void);

void g (int n)
{
  struct {
    char a[n], b[];
  } *p = f ();

  __builtin_sprintf (p->b, "%s", p->a);
}

With the ICE fixed the warning triggers.  That's not ideal but it's
unavoidable given the IR (I believe I mentioned this caveat some time
back).  This is the same as for:

  struct {
    char a[8], b[8];
  } *p = f ();

  __builtin_sprintf (&p->b[n], "%s", p->a);

because the IR looks more or less the same for &p->a[n] as it is for
&p->b[n].

As said, I'm not sure how the caller interprets
the result and how it can distinguish the exact vs. non-exact cases
or what a "conservative" inexact answer would be.

The warning triggers in both the certain cases and the inexact
ones like the one above when an overlap cannot be ruled out.  To
differentiate the two it's phrased as "may overlap".  The handling
is in maybe_warn_overlap().


Please help properly documenting this API.

I can spend some time in the next few days to page it all in, see
if I can clean it up a bit in addition to fixing the ICEs and
improve the comment.  Let me know if you have a different
preference.

Martin


Thanks,
Richard.

Martin


Bootstrapped and tested on x86_64-unknown-linux-gnu.

OK for trunk and branches?

Thanks,
Richard.

2022-05-16  Richard Biener  <rguent...@suse.de>

  PR middle-end/105604
  * gimple-ssa-sprintf.cc (get_origin_and_offset_r):
  Handle non-constant ARRAY_REF element size and non-constant
  COMPONENT_REF field offset.

        * gcc.dg/torture/pr105604.c: New testcase.
---
   gcc/gimple-ssa-sprintf.cc               | 14 +++++++++++---
   gcc/testsuite/gcc.dg/torture/pr105604.c | 24 ++++++++++++++++++++++++
   2 files changed, 35 insertions(+), 3 deletions(-)
   create mode 100644 gcc/testsuite/gcc.dg/torture/pr105604.c

diff --git a/gcc/gimple-ssa-sprintf.cc b/gcc/gimple-ssa-sprintf.cc
index c93f12f90b5..14e215ce69c 100644
--- a/gcc/gimple-ssa-sprintf.cc
+++ b/gcc/gimple-ssa-sprintf.cc
@@ -2312,14 +2312,16 @@ get_origin_and_offset_r (tree x, HOST_WIDE_INT
*fldoff, HOST_WIDE_INT *fldsize,
    HOST_WIDE_INT idx = (tree_fits_uhwi_p (offset)
           ? tree_to_uhwi (offset) : HOST_WIDE_INT_MAX);
   +    tree elsz = array_ref_element_size (x);
    tree eltype = TREE_TYPE (x);
    if (TREE_CODE (eltype) == INTEGER_TYPE)
      {
        if (off)
          *off = idx;
          }
-       else if (idx < HOST_WIDE_INT_MAX)
-         *fldoff += idx * int_size_in_bytes (eltype);
+       else if (idx < HOST_WIDE_INT_MAX
+                && tree_fits_shwi_p (elsz))
+         *fldoff += idx * tree_to_shwi (elsz);
    else
      *fldoff = idx;
   @@ -2350,8 +2352,14 @@ get_origin_and_offset_r (tree x, HOST_WIDE_INT
*fldoff, HOST_WIDE_INT *fldsize,
case COMPONENT_REF:
         {
+       tree foff = component_ref_field_offset (x);
        tree fld = TREE_OPERAND (x, 1);
-       *fldoff += int_byte_position (fld);
+       if (!tree_fits_shwi_p (foff)
+           || !tree_fits_shwi_p (DECL_FIELD_BIT_OFFSET (fld)))
+         return x;
+       *fldoff += (tree_to_shwi (foff)
+                   + (tree_to_shwi (DECL_FIELD_BIT_OFFSET (fld))
+                      / BITS_PER_UNIT));
get_origin_and_offset_r (fld, fldoff, fldsize, off);
    x = TREE_OPERAND (x, 0);
diff --git a/gcc/testsuite/gcc.dg/torture/pr105604.c
b/gcc/testsuite/gcc.dg/torture/pr105604.c
new file mode 100644
index 00000000000..b002251df10
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/torture/pr105604.c
@@ -0,0 +1,24 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-Wall" } */
+
+struct {
+  long users;
+  long size;
+  char *data;
+} * main_trans;
+void *main___trans_tmp_1;
+int sprintf(char *, char *, ...);
+int main() {
+  int users = 0;
+  struct {
+    long users;
+    long size;
+    char *data;
+    int links[users];
+    char buf[];
+  } *trans = trans;
+  trans->data = trans->buf;
+  main___trans_tmp_1 = trans;
+  main_trans = main___trans_tmp_1;
+  sprintf(main_trans->data, "test");
+}





Reply via email to