https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91310

            Bug ID: 91310
           Summary: Read overflow generated by character array assignment
                    to self
           Product: gcc
           Version: 9.1.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: fortran
          Assignee: unassigned at gcc dot gnu.org
          Reporter: foreese at gcc dot gnu.org
  Target Milestone: ---

The following simple test cases read_overflow1.for and read_overflow2.for
generate warnings when compiled with -O2 (in gfortran 9.1.0) which allude to
the bug which is a read overflow generated from a character array assignment
within a single variable where the bounds of the LHS and RHS are known.

> $ cat read_overflow1.for
>       subroutine s(len)
>         implicit none
>         character str*512
>         integer(4) :: offset
>         integer(4), intent(in) :: len
> 
>         offset = 4
>         str(1:) = str(offset:len+3)
>         print *, str
>       end subroutine
> $ diff read_overflow1.for read_overflow2.for
> 3c3
> <         character str*512
> ---
> >         character str(512)
> $ gfortran -c -O0 read_overflow1.for # no warning
> $ gfortran -c -O2 read_overflow1.for
> read_overflow1.for:9:0:
> 
>     9 |           str(1:) = str(offset:len+3)
>       | 
> Warning: ‘__builtin_memmove’ reading 512 bytes from a region of size 509 
> [-Wstringop-overflow=]
> $ gfortran -c -O0 read_overflow2.for # no warning
> $ gfortran -c -O2 read_overflow2.for
> read_overflow2.for:9:0:
> 
>     9 |           str(1:) = str(offset:len+3)
>       | 
> Warning: iteration 509 invokes undefined behavior 
> [-Waggressive-loop-optimizations]
> read_overflow2.for:9:0: note: within this loop

The issue can be seen in the trees dumped by -fdump-tree-original and
-fdump-tree-optimized, both at -O0 and -O2 for both test cases. The
intermediate code generated is equivalent for both optimization levels. For
read_overflow1.for it boils down to the following:

> _read_length = len - 3;
> if (_read_length < 512) {
>   __builtin_memmove(&str, &str+3, _read_length);
>   __builtin_memset(&str, ' ', 512 - _read_length);
> } else {
>   __builtin_memmove(&str, &str+3, 512);
> }

The trees for read_overflow2.for are similar but are expanded into a loop
equivalent to the following:

> idx = 0;
> while (idx < 512) {
>   str[idx][1] = str[idx + 3][1];
>   ++idx;
> }

Clearly, code is generated to avoid overflow when _writing_ to str, but there
is no code generated to avoid overflow when _reading_ from str, though both the
source and destination sizes are known. In fact if len is > 509, the last three
characters would result in a read overflow of the str character array and
garbage would be written to the end of the array.

The warnings are only issued in -O2 because the intermediate code is simplified
enough for the overflow to be obvious.

A workaround from the user perspective is to replace the assignment with two
explicitly bounded assignments:

> s(1:len) = s(offset:len+3)
> s(len+1:) = ' '

In this case no overflow can occur from the generated code.


AFAICT the standard does not specify what to do when an assignment occurs among
array sections of differing sizes, so it is up to the compiler to be sensible.
One might argue it is not against the standard to ignore the source size and
always write the destination size -- however, I rather think of it as blatant
negligence on the compiler's part in this case, since the bounds of both are
known and it already tries to avoid overflow of one operation (write) but not
the other (read). If the behavior is intentional and the code _should_ result
in an overflow, then a consistent and obvious warning should be issued for all
optimization levels - but I don't think that it should.

Reply via email to