On 7/19/19 4:04 PM, Martin Sebor wrote: > On targets with permissive alignment requirements GCC sometimes > lowers stores of short (between two and 16 bytes), power-of-two > char sequences to single integer stores of the corresponding > width. This happens for sequences of ordinary character stores > as well as for some calls to memcpy. > > However, the strlen pass is only prepared to handle character > stores and not those of wider integers. As a result, the strlen > optimization tends to get defeated in cases when it could benefit > the most: very short strings. I counted 1544 instances where > the strlen optimization was disabled in a GCC build on x86_64 > due to this sort of early store merging, and over two thousand > in a build of the Linux kernel. > > In addition, -Wstringop-overflow only considers calls to string > functions and is ineffective against past-the-end accesses by > these merged multibyte stores. > > To improve the effectiveness of both the optimization as well > as the warning the attached patch enhances the strlen pass to > consider these wide accesses. Since the infrastructure for doing > this is already in place (strlen can compute multibyte accesses > via MEM_REFs of character arrays), the enhancement isn't very > intrusive. It relies on the native_encode_expr function to > determine the encoding of an expression and its "length". > > I tested the patch on x86_64. I expect the tests the patch > adds will need some adjustment for big-endian and strictly > aligned targets. > > Martin > > gcc-91183.diff > > PR tree-optimization/91183 - strlen of a strcpy result with a conditional > source not folded > PR tree-optimization/86688 - missing -Wstringop-overflow using a non-string > local array in strnlen with excessive bound > > gcc/ChangeLog: > > PR tree-optimization/91183 > PR tree-optimization/86688 > * builtins.c (compute_objsize): Handle MEM_REF. > * tree-ssa-strlen.c (class ssa_name_limit_t): New. > (get_min_string_length): Remove. > (count_nonzero_bytes): New function. > (handle_char_store): Rename... > (handle_store): to this. Handle multibyte stores via integer types. > (strlen_check_and_optimize_stmt): Adjust conditional and the called > function name. > > gcc/testsuite/ChangeLog: > > PR tree-optimization/91183 > PR tree-optimization/86688 > * gcc.dg/attr-nonstring-2.c: Remove xfails. > * gcc.dg/strlenopt-70.c: New test. > * gcc.dg/strlenopt-71.c: New test. > * gcc.dg/strlenopt-72.c: New test. > * gcc.dg/strlenopt-8.c: Remove xfails. >
> + > if (TREE_CODE (dest) != ADDR_EXPR) > return NULL_TREE; > > diff --git a/gcc/tree-ssa-strlen.c b/gcc/tree-ssa-strlen.c > index 88b6bd7869e..ca1aca3ed9e 100644 > --- a/gcc/tree-ssa-strlen.c > +++ b/gcc/tree-ssa-strlen.c > + > +/* If the SSA_NAME has already been "seen" return a positive value. > + Otherwise add it to VISITED. If the SSA_NAME limit has been > + reached, return a negative value. Otherwise return zero. */ > + > +int ssa_name_limit_t::next_ssa_name (tree ssa_name) Nit. Return type on a different line than the function's name. The point behind that convention is to make it easier to search for a function's definition. > +/* Determine the minimum and maximum number of leading non-zero bytes > + in the representation of EXP and set LENRANGE[0] and LENRANGE[1] > + to each. Set LENRANGE[2] to the total number of bytes in > + the representation. Set *NULTREM if the representation contains > + a zero byte, and set *ALLNUL if all the bytes are zero. Avoid > + recursing deeper than the limits in SNLIM allow. Return true > + on success and false otherwise. */ S/NULTREM/NULTERM/ > > if (si != NULL) > { > - int cmp = compare_nonzero_chars (si, offset); > - gcc_assert (offset == 0 || cmp >= 0); > - if (storing_all_zeros_p && cmp == 0 && si->full_string_p) > + /* Set to 1 if the string described by SI is being written into > + before the terminating nul (if it has one), to zero if the nul > + is being overwritten but not beyond, or negative otherwise. */ > + int store_b4_nul[2]; Umm "store_b4_nul"? Are you really trying to save 3 characters in the name by writing it this way? :-) You've got two entries in the array, but the comment reads as if there's just one element. What is the difference between the two array elements? I didn't see anything terribly concerning so far, but I do want to look at handle_store again after the comment is fixed so that I'm sure I'm interpreting things correctly. Jeff