On 08/16/18 19:35, Jeff Law wrote: > On 08/15/2018 12:51 PM, Bernd Edlinger wrote: > [ snip -- comment attribution is likely lost... ] > >>>>> >>>>> 2018-08-14 Bernd Edlinger <bernd.edlin...@hotmail.de> >>>>> >>>>> * builtins.c (c_strlen): Add new parameter eltsize. >>>>> * builtins.h (c_strlen): Adjust prototype. >>>>> * expr.c (string_constant): Add new parameter mem_size. >>>>> * expr.h (string_constant): Adjust protoype. >>>>> * gimple-fold.c (get_range_strlen): Add new parameter eltsize. >>>>> * gimple-fold.h (get_range_strlen): Adjust prototype. >>>>> * gimple-ssa-sprintf.c (get_string_length): Add new parameter >>>>> eltsize. >>>>> (format_string): Call get_string_length with eltsize. >>>>> >>>>> 2018-08-14 Bernd Edlinger <bernd.edlin...@hotmail.de> >>>>> >>>>> * gcc.dg/strlenopt-49.c: Adjust test case. >>>>> * gcc.dg/tree-ssa/builtin-sprintf-warn-18.c: Likewise. >>>> >>>> Your patch appears to pass in an ELTSIZE which is used to determine how >>>> to count. This has some advantages in that the routine can be used to >>>> count bytes or other elements such as 2 or 4 byte wide characters. > [ ... ] > So yesterday I sent a message where I was totally offbase with the > intent of this patch. After further discussions and re-review ISTM the > real intent here is to restore prior behavior of counting bytes to the > clients outside the sprintf pass. Terribly sorry for that and the > conclusions I drew as a result. > >
no problem at all. I really appreciate the time you take to review this patch. And I see how confusing this must be, but I think some things went in a wrong direction for quite some time. > > >>> >>> Parameterizing the function to return either the number of >>> bytes or the number of elements makes sense as an enhancement. >>> It makes less sense (and could be the source of bugs) to let >>> callers pass in anything else. A boolean flag to toggle >>> between the two alternatives would make the API narrower >>> and safer to use. It doesn't seem necessary to fix any >>> bugs. > No bugs that we're currently aware of. But ensuring that routines that > expect to be counting bytes are counting bytes is a good thing. If some > pass (say forwprop) changed the char * argument to an array of something > other than chars then we'd likely end up giving a wrong result. > > I don't mind making the interface narrower with a boolean or an enum to > specify how we count rather than having the caller pass in a type. > > Yes, the caller _must_ know what to expect. The function should not try to be smarter than the caller. >>> >>>> Your testsuite change appears to remove the xfails in strlenopt-49 which >>>> is usually OK. So no concerns there. > So it turns out we don't really need any of the testsuite changes anyway > if we make this patch independent of the 86711/86714 patches. And > breaking that dependency is IMHO a good thing. > > [ Another snip. ] > >> Anyway, I thought your patch would convert dir.specifier == 'S' >> into dir.modifier == FMT_LEN_l, since %S is like %ls. >> >> the xfail will go away if I change this: >> >> get_string_length (arg, dir.modifier == FMT_LEN_l ? 4 : 1); >> >> to: > >> get_string_length (arg, dir.specifier == 'S' dir.modifier == FMT_LEN_l ? 4 : >> 1); > Presumably "||" is missing here. > Yes, and on another line, I confirmed that it works, and removes the xfail. >> >> >> That should be easy, I will change the patch to match your patch in that >> area. >> >> >> The other xfail with memcmp needs to be fixed separately. >> The issue is, callers of string_constant that do not pass the mem_size >> parameter may access up to TREE_STRING_LENGTH byte, therefore >> returning longer strings is not appropriate: the string constant >> contains 9 bytes, mind the terminating nul. > If we make the patch to add the size parameter to c_strlen independent > of the changes related to 86711/86714, then we don't need to xfail or > change the testsuite in any way shape or form. > > My overall thinking is to carve out this patch so that it's independent > of anything else. That's actually easy to do after addressing two very > minor issues. > I can give it a try, but I think that 86711/86714 is only a manifestation of the design issues, and it is probably impossible to fix the design without fixing 86711/86714 at the same time. It is likely not the only one IMHO. They are due to string_constant, c_strlen and c_getstr returning data that is invalid in some corner cases, like over-length string constants. These were previously only happening rarely, but due to the recent enhancements, much more corner cases were added, and some of them trigger those issues now. > We can then add Martin's patch to fix handling of wide chars in sprintf > (86853). > Yes, I think that seems likely an independent issue. If done properly it will probably not even be a merge conflict. > -- > > This overall plan addresses the need to be able to tell c_strlen how to > count and reverts its behavior for callers outside of the sprintf > warning pass to historical (pre-gcc7) behavior of counting bytes by > default. It does this without losing any warnings and allow us to move > forward with Martin's more complete fix for 86853. > Yes, I have already started to write a similar patch on c_getstr, which seems to fix the other xfail. The issue with c_getstr is similar, some call it with one parameter, those expect a zero terminated string, that needs to be valid. Others use a two parameter overload, which retuns the memory length, those use memcmp or memchr, and need no zero termination. But I start to think that I should merge all three patches into one, to avoid unnecessary confusion. Bernd. > With those two patches out of the way we can then dig further into the > issues around 86711/86714. > > > jeff > > >