On 08/25/2018 12:32 AM, Bernd Edlinger wrote: > On 08/25/18 01:54, Jeff Law wrote: >> On 08/24/2018 11:26 AM, Bernd Edlinger wrote: >>> On 08/24/18 18:51, Jeff Law wrote: >>>>> Well, this is broken for wide character strings. >>>>> but I hope we can get rid of STRING_CST which are >>>>> not explicitly null terminated. >>> >>> I am afraid that is not going to happen. >>> Maybe we can get STRING_CST that are never longer >>> than the TYPE_UNIT_SIZE, but c_strlen and c_getstr >>> need to take care that the string is zero-terminated. >>> >>> string_constant, should not promise the string is zero terminated. >>> But instead it can promise that: >>> 1) the STRING_CST is valid up to TREE_STRING_LENGTH >>> 2) mem_size is >= TREE_STRING_LENGTH >>> 3) memory between TREE_STRING_LENGTH and mem_size is ZERO. >>> >>> It will not guarantee anything about zero termination any more. >> Interesting because those conditions would be sufficient to deal with a >> regression I stumbled over after fixing Martin's patch to not assume >> that all STRING_CSTs are NUL terminated. >> >> But I need to think about this a bit more. Essentially the question >> we'd need to ask is whether or not these are sufficient in general or >> just in specific cases. >> >> I tend to think they're not sufficient in general. If a string returned >> by string_constant that didn't have a terminating NUL, but which did >> pass the tests above were ultimately passed to the runtime's str* >> routines, then the call may run off the end of the string. We'd like to >> be able to warn for that. >> >> So ISTM those rules are only valid in contexts where we know the result >> isn't going to be passed to str* and friends within the C library. >> >> I do think they're sufficient to avoid problems with the >> tree-ssa-forwprop code we've looked at. So what may make the most sense >> is to have that routine indicate it's willing to accept unterminated >> strings, then check the conditions above before optimizing the code. >> > > There are not too many callers of string_constant. > Not all need zero termination. Right. And in retrospect we probably should have avoided default parameter overloads and just fixed the callers. But that can be a follow-up.
> > But I think if the are interested in zero-termination > they should simply call c_strlen or c_getstr. Perhaps. > > >>> >>> In the end, the best approach might be to either merge my patch >>> with Martins, or step-wise, first fixing wrong code, and then >>> implementing warnings without fixing wrong code. >> Unsure at this time. I've been working with both. I suspect that if we >> went with yours that we'd then turn around and layer Martin's on top of >> it because of the desire to signal to callers that we have an >> unterminated string and have the callers take appropriate action. Which >> begs the question of whether or not we just go with Martin's -- ie, is >> there really any value in using both. I haven't seen indications there >> is value in that approach, but I'm still poking at things. >> > > Well, ya call it "layer one patch over the other" > I call it "incremental improvements". It is (of course) a case by case basis. The way I try to look at these things is to ask whether or not the first patch under consideration would have any value/purpose after the second patch was installed. If so, then it may make sense to include both. If not, then we really just want one patch. Jeff