On 18/04/2014 6:54 a.m., Alex Rousskov wrote:
> On 04/17/2014 04:43 AM, Amos Jeffries wrote:
>> On 17/04/2014 7:23 a.m., Alex Rousskov wrote:
>>> Any reasonable/popular implementation selected by the developer. This is
>>> a one-time check done by the developer, not an automated check done
>>> during Squid build. Sorry I was not clear about that.
> 
>> So glibc: a do-while loop scanning word-by-word with individual
>> byte-by-byte loop (unrolled) over the bytes in each word.
> 
> Yeah (if by "word" you mean "4 characters"). The important part here is
> that they _are_ trying to optimize the scan using low level tricks such
> as partial loop unrolling. That makes the cloning approach more
> attractive than if they did not, but I am not sure it makes it
> attractive enough. I am still OK with the hand-rolled loops direction,
> as your latest patch suggests.

Yes I did mean that by "word", as in the 32-bit/4-byte CPU or memory
'word' they seem to be optimizing for.

I opted for 'hand rolling' by copying the simple(r) strncasecmp() code
loop for both our cases instead of the strncmp() optimized code.

> 
>> The cloning mechanism uses strlen() internally. So no benefit, but extra
>> malloc+free costs.
> 
> AFAICT, cloning should not call strlen(). We are cloning the SBuf object
> ("this") which has a known length. We are not cloning the c-string. The
> only purpose of cloning is null termination of "this" -- SBuf::c_str()
> is not a constant method. Terminating "this" is enough to use the
> standard strncmp().
> 

The c_str() part of the cloning is where the allocation cycle is added
to provide the guarantee that terminator and resulting (ab)uses of the
c_str() value are okay. Which is really worse than strlen() IMO.


> 
>> Patch with hand-rolled scanner attached.
> 
> 
>> +    // 0-length comparison is always true regardless of buffer states
>> +    if (!n || (!length() && *s=='\0')) {
>> +        ++stats.compareFast;
>> +        return 0;
>> +    }
>> +
>> +    // N-length compare MUST provide a non-NULL C-string pointer
>> +    assert(s);
> 
> The assertion appears too late to save us from dereferencing NULL s in
> the if-statement code above. I suggest carefully removing that overly
> complicated if-statement (adjusting the rest of the code accordingly).
> 
>     assert(!n || s);
>     ... adjusted loops go here ...
> 
> I do not insist on removing the if-statement, but if you keep it, please
> adjust it so that it asserts instead of dereferencing NULL s from broken
> callers.

That "" vs "" case you had me throw in fails the unit tests if the check
for 0-length strings. Good thing you did. :-)

I can move the check below the assert so it becomes
(EXHIBIT A):

    // 0-length comparison is always true regardless of buffer states
    if (!n) {
        ++stats.compareFast;
        return 0;
    }

    // N-length compare MUST provide a non-NULL C-string pointer
    assert(s);

    // when this is a 0-length string, no need for any complexity.
    if (!length()) {
        ++stats.compareFast;
        return '\0' - *s;
    }

This will also catch the all compares when the SBuf is empty string
regardess of the length of s. Which was the "left" overread bug you
mentioned below.


> 
>> +    size_type byteCount = min(length(), n);
> 
> Using npos in min() makes me uncomfortable but should work because
> size_type is unsigned. If you keep this code, please add a comment. For
> example:
> 
>   // n may be npos, but we treat that as a huge positive value
> 

Done.

> 
>> +        while ((rv = *left - *right++) == 0) {
> 
> "left" may point beyond allocated memory if length() is zero but *s is not.
> 

I was under the impression SBuf can never have a NULL buffer. But
anyway, this is resolved by the above change preventing this whole area
of code being reached.


> 
>> +    if (length() < n)
>> +        return '\0' - *right;
> 
> "right" might also point beyond allocated memory by now -- it was
> unconditionally incremented in the loops and I do not think the non-zero
> byteCount check protects us from reaching this code in some cases.
> 

1) The only case which can reach this dereference is when byteCount
reached 0.

2) "left" was also unconditionally incremented in the if-statement
condition before breaking out of the loop at the point byteCount reached 0.

3) we have not seen *right=='\0' yet. Because we only just reached EOS
on left OR byteCount would not have been decremented to 0 yet.

So we are safe in testing *right. Assuming that it is either
0-terminated or n is smaller than its endpoint. If those are not true we
will die in same way system strn*() do and even strlen() will not help
us with that one.


> 
>> +    size_type byteCount = min(length(), n);
> ...
>> +        while ((rv = *left - *right++) == 0) {
>> +            if (*left++ == '\0' || --byteCount == 0)
>> +                break;
> 
> byteCount may underflow here (become huge) if it starts as zero because
> length() is zero.

Solved by the earlier !length() check.

> 
> 
>> +        while ((rv = *left - *right++) == 0) {
>> +            if (*left++ == '\0' || --byteCount == 0)
>> +                break;
> 
> This loop in conjunction with mind boggling after-loop checks is just
> too smart for the human brains IMHO. This level of complexity creates a
> constant trickle of bugs. Please consider a more straightforward
> approach that is nearly as efficient. Here is a sketch:
> 

I'm sorry they boggle your mind. I have added comments (below) to
clarify what/why each exists.

> {
>   assert(!n || s);
>   ...
>   const char *left = buf();
>   leftCount = length();
>   const char *right = s; // or just rename s to right?
>   // n may be npos, but we treat that as a huge positive value
>   while (n) {
>     const c1 = leftCount-- ? *left++ : '\0';
>     const c2 = *right++;
>     if (!c1 || c1 != c2) // covers !c2 as well
>         return c1 - c2;
>     n--;
>   }
>   // all n characters were identical (and none was \0)
>   return 0;
> }
> 
> The above requires some polishing (such as adding c1/2 types and a
> case-insensitive loop) but, AFAICT, the only added inefficiency here is
> the leftCount--? expression. I think it is worth the gain of much easier
> to understand code.
> 
> However, I cannot insist that you switch to this simpler approach. If
> you prefer to keep fixing the more complex code, I will do my best to
> find the time to keep reviewing it.

I see you are suggesting the FreeBSD libc strncmp() method
 http://fxr.watson.org/fxr/source/string/strncmp.c?v=FREEBSD6-LIBC

I was following the GNU libc strncase() method
  http://www.ic.unicamp.br/~islene/2s2008-mo806/libc/string/strncase.c

>  while ((result = TOLOWER (*p1) - TOLOWER (*p2++)) == 0)
>    if (*p1++ == '\0' || --n == 0)
>      break;
>
>  return result;


The only new part is the if-statement(s) after looping. I have combined
them and added comments to clarify.

They/it are equivalent with your (leftCount-- ? *left++ : '\0'), but the
extra test is run only once per call instead of once per character.

I have merged the two into one with documentation to clarify
(EXHIBIT B):

    // If we stopped scanning because we reached the end of buf()
    // pretend we have a 0-terminator there to compare.
    // NP: the loop already incremented "right" ready for this
    if (!byteCount && length() < n)
        return '\0' - *right;

    // If we found a difference within the scan area,
    // or we found a '\0',
    // or all n characters were identical (and none was \0).
    return rv;



If you are okay with the new bits marked (EXHIBIT A) and (EXHIBIT B) I
think we are done. These have already been checked against all unit
tests and passed.

Amos

Reply via email to