Re: [RFC] Time to talk about StringNG merge again?

Amos Jeffries Thu, 04 Jul 2013 23:33:50 -0700

Snipped out the Done bits.

On 5/07/2013 5:51 p.m., Kinkie wrote:

On Tue, Apr 16, 2013 at 12:49 AM, Alex Rousskov
<rouss...@measurement-factory.com> wrote:

Hi Kinkie,


     Here is my review of
https://code.launchpad.net/~squid/squid/stringng-cherrypick r12753. I
found approximately three bugs. I am also concerned about size_type
type, after starting to use MemBlob which has the same problem. The rest
is polishing.

Ok, thanks. I have some doubts about size_type as well, I am a bit
worried about touching it at this stage in development, but it may be
better now than never.

     if (InitialStore == NULL) {
         static char lowPrototype[] = "";
         InitialStore = new MemBlob(lowPrototype, 0);
     }

If possible, please use "new MemBlob(0)" and remove lowPrototype.

void
SBuf::reserveCapacity(size_type minCapacity)
{
     Must(0 <= minCapacity && minCapacity <= maxSize);
     reserveSpace(minCapacity-length());
}

This does not look right. For example, if minCapacity is smaller than
length(), we should do nothing, but will throw instead. I think this
method should look something like this:

     Must(0 <= minCapacity && minCapacity <= maxSize);
     cow(minCapacity);

     // we're not concerned about RefCounts here,
     // the store knows the last-used portion. If
     // it's available, we're effectively claiming ownership
     // of it. If it's not, we need to go away (realloc)
     if (store_->canAppend(off_+len_, minSpace)) {
         debugs(24, 7, "not growing");
         return;
     }

That comment is probably stale because canAppend() is currently not
related to ownership of the buffer: It will happily return true for
shared buffers. AFAICT, reserveSpace() should be implemented using
something like this:

     Must(0 <= minSpace && minSpace <= maxSize - length());
     reserveCapacity(length() + minSpace);

The comment is badly worded but the code is correct: canAppend() tells
us if the end of our buffer is at the end of the used portion of the
MemBlob, and there is enough space in the unused MemBlob portion to
contain the data we plan to use. If the buffer is shared, this
guarantee may become stale at the next append operation to any SBuf
sharing the same MemBlob.

A decision needs to be made if these two methods have different
purposes or are to be simply different expressions of the same
underlying concept (as it is now).
IMO, thinking about it, it does make sense that reserveSpace also
guarantees that after the call the MemBlob will be exclusively owned
by this SBuf (aka, blindly cow()), while reserveCapacity can mostly be
meant as an optimization, to hint the SBuf memory management system
that an append operation of a known size is about to happen, so that
only one cow is performed.
What do you think?


Sounds good.

The above two changes imply that reserve*() methods ensure
single-ownership of the buffer by the caller. This is not something
documentation of those methods currently guarantees, but I think it is
logical to assume that anybody calling one of these two methods wants to
modify the string and, hence, will need exclusive control over that
string. I cannot think of any reason to call those methods otherwise.

See above.

     typedef int32_t size_type;

Would it be better to make this unsigned? I do not think string sizes
can go negative. std::string is using an unsigned type, right? FWIW,
whenever I try to use MemBlob that has a signed size_type as well, I am
forced to do casts that would not exist otherwise. I think both types
should be changed to reflect what they are representing (a size of a
buffer).

I have the same doubt. I'm just a bit afraid to change this late. Will
try to do as an isolated change. So far this is NOT done.

On another note, would it be better to use MemBlob::size_type instead of
int32_t?

currently they're both int32_t. I believe it's more readable with the
extra level of indirection, at the cost of having to manually keep
them in sync.
Added a comment to highlight this fact.

So long as SBuf is backed by MemBlob defining SBuf::size_type to beMemBlob::size_type makes more sense. One who needs to know can alwayslook up both (probably should). The compiler will tell about signednesserrors in new/modified code very quickly.

The problem will be converting from signed to unsigned at this stage. Ithink get the rest to a milestone we can agree on merging beforeattempting please. Then if everthing goes sour in the change we canstill revert and merge with a TODO about perfecting the polish.When you do the change ensure you have a latest GCC or clang you canuse, they will be far more pedantic about signedness comparisons anditerator ranges etc where the issues are most likey to occur.

SBuf::append(const std::string &str, SBuf::size_type pos, SBuf::size_type n)
{
     return append(str.data(), pos, n); //bounds checked in append()
}

This will break when n is npos because str.data() is not 0-terminated.
The low-lever c-string append you call here assumes that the string
parameter is 0-terminated if n is npos. Please add the corresponding
missing test case.

This led to restructuring the append operation for C-string, which is
now missing the pos parameter, just like std::string does.

Also, I suggest splitting this into two methods, one with a required
(first?) SBufCaseSensitive parameter and one without it. This will allow
callers to specify n without specifying isCaseSensitive and vice versa.
The shorter, inlined method will simply call the longer one with
caseSensitive, of course, so the implementation will not require more
code or performance overhead.

Ok. I'm calling the two shorthand versions cmp and casecmp respectively
(please let me know if you'd prefer the naming-convention compliant
caseCmp instead)


Please. We need semantic similarity to std::string not symbolic.

Code is as before at lp:~squid/squid/stringng-cherrypick, ready and in
sync with trunk.
test-suite runs OK on eu and ubuntu raring.

I'll be on holiday in a couple of days for a couple of weeks.
--
     /kinkie

Re: [RFC] Time to talk about StringNG merge again?

Reply via email to