On 10/02/2016 03:25 PM, Kinkie wrote: > On Fri, Sep 30, 2016 at 6:03 PM, Alex Rousskov > <rouss...@measurement-factory.com> wrote: >> On 09/29/2016 09:19 PM, Amos Jeffries wrote: >>> On 30/09/2016 5:03 a.m., Alex Rousskov wrote: >>>> Should we remove the increment to make concurrent c_str() calls safe? >> >>> The reason it exists remember is to prevent other SBuf sharing that >>> storage MemBuf from thinking they can append to the storage oer top of >>> the terminator. >> >> Yes, I know, but that increment not only prevents problems but _causes_ >> them as well: The increment may force the SBuf to realloc, which may >> result in deallocation of the original storage still pointed to by the >> earlier c_str() result. >> >> >>> Given that we have no easy knowledge of how the c_str() >>> is going to be used by the recipient code we cannot guarantee lack of >>> terminator is safe. >> >> Yes, with or without the increment, Squid may crash or misbehave when >> buggy code appends to SBuf while a previous c_str() result is still in >> use. The lack of increment is not safe. The presence of the increment is >> not safe. The problems are different, but they exist with or without the >> increment. >> >> >>> Time to push SBuf usage forward and aggressively remove the need for >>> c_str()? >> >> The need for c_str() will never go away because libraries outside of >> Squid control need c-strings. And, due to the lack of proper QA, any >> "aggressive push" affecting lots of Squid code is a recipe for a >> disaster IMO. >> >> The question is which c_str() implementation is safer? The one that may >> crash Squid when called twice for the same SBuf in the same expression >> or the one that may crash Squid when appending to the same SBuf in the >> same expression? There are already two known cases of the former. There >> are currently no known cases of the latter. >> >> >> Overall, I know of three primary ways to implement c_str(): >> >> 1. Always 0-terminate SBuf-used storage. Safest implementation possible, >> but has serious negative performance overheads, especially for tokenizing. > > If this is the decision, might as well drop sbuf altogether in favor > of std::string (or possibly std::string + custom mempools-friendly > Allocator).
You are probably right, although std::string may have other funny things going that we might want to avoid/control. >> 2. Terminate the used storage at the time of c_str(). >> >> 2a. Increment the used storage size. Current implementation. >> Leads to problems discussed in this thread. >> >> 2b. Do not increment the used storage size. >> Leads to other problems discussed in this thread. >> >> 3. Allocate and store a new c-string as needed, copying from the >> storage. Safest implementation possible but has serious negative >> performance overheads. These overheads affect different use cases than >> those of #1. > If we are aware of the underlying issue, there's no reason not to save > the temporary to a naked char* and use that, as long as the SBuf is > not touched it'll work fine. Save to a temporary variable outside SBuf? That is not a solution because we are trying to address the cases where the developer forgot to do that or could not really tell that doing that is necessary. Anything from trivial/direct foo(buf.c_str(), buf.c_str()); to complex/hidden foo(buf, buf.c_str()) which calls bar(buf1.c_str(), cstr2); >> Needless to say, we can add clever code to reduce risks associated with >> some implementations. For example (these are just rough sketches/ideas): >> >> a) Appending code in #2b can check whether it is about to overwrite the >> 0-terminator added by a previous c_str() call and, if yes, preserve the >> old storage (we can preserve one such storage or maintain a global FIFO >> queue). >> >> b) C-string allocating code in #4 can maintain a global FIFO queue of >> previously allocated but now supposedly unused c-strings to minimize the >> risk of de-allocating a still used c-string. >> >> c) C-string allocating code in #4 can delay allocation until it might be >> needed, similar to how (a) decides that the 0-terminator is in danger. >> >> >> Going forward, do you think we should: >> >> i. Keep using #2a and rewrite the known double-c_str() code. >> ii. Make double-c_str() code safe (i.e., switch to #2b). >> iii. Add code to make the existing #2 implementation safer. >> iiii. Switch to a completely different implementation like #1 or #4. > v. push SBuf as suggested by Amos and document this specific corner > case so to avoid it I do not think "push SBuf" is a comparable alternative. "Pushing SBuf" is not something we can safely do in a reasonable time. It is and remains a long-term goal, but we need to address the problem "now". The offered alternatives can be implemented in a matter of hours or weeks. "Pushing SBuf" safely will probably take a year and will not eliminate all c_str() calls anyway so the same problems/alternatives will remain even after the "push" is over. And the problematic cases are already documented -- documentation does not help much when the API itself invites errors like concurrent c_str() calls. Developers will continue to misuse c_str() and reviewers will continue to miss those bugs. Alex. _______________________________________________ squid-dev mailing list squid-dev@lists.squid-cache.org http://lists.squid-cache.org/listinfo/squid-dev