Bruno Haible <bruno <at> clisp.org> writes:

> > Eric Blake wrote:
> > http://www.alphalinux.org/archives/axp-list/March2001/0337.shtml
> >
> > It looks like the bug is alpha-specific in memchr
>
> I don't think it is a bug. memchr could also be implemented by doing
> a backwards search and still be conforming to ISO C99 and POSIX:
It could, but this is different from (and slower than) all traditional
implementations. In other words, while it might not violate the
standards as currently written, it does violate quality of
implementation. But let's get glibc involved in the discussion, to get
their opinion.

> > in anything else that
> > uses memchr to search for a trailing \0 with a length longer than
> > the allocated memory
>
> It is such uses of memchr() that are buggy. When you call
> memchr(s,c,n), you must guarantee read access for the bytes at
> s, ..., s+n-1.

I think the standards are ambiguous on this point. POSIX states only
"locate the first occurrence of c (converted to an unsigned char) in
the initial n bytes (each interpreted as unsigned char) of the object
pointed to by s." It does not state anything about what happens if c
occurs within the object but the object is smaller than n. Do you have
anything further from the standards that would resolve this ambiguity?

> > > it will manifest itself with gnulib's strstr
>
> Then gnulib's and glibc's strstr is buggy.

If your statement that memchr must not be used with an overestimated
size is true, then yes, gnulib's and glibc's strstr is indeed buggy
(and since they are pretty much the same implementation, I only need to
develop one patch, since it was my implementation, then push it to both
places). But I'm still not convinced that the bug is in strstr rather
than memchr.

> Yes, this would be buggy as well. strchr or strlen needs to be used
> instead, if the amount of allocated memory is unknown.

But that defeats the purpose of using memchr - the strstr code in
question uses memchr to do a bounded-length search for the end of the
string, intentionally terminating the search at the length of the
needle if the haystack goes on beyond that. That argues for strnlen,
not strlen. At one point, you wrote the gnulib strnlen1 module, which
provides exactly the semantics I want in strstr. But how did you do it?
With memchr, and the assumption that memchr works in ascending order.
In other words, you are telling me that the very reason you wrote
strnlen1 (that is, to get a faster bounded search of an unknown-size
object for a trailing NUL than is possible via any other string.h
primitive) is invalidated if you argue that memchr cannot be guaranteed
to work if it can't read all n bytes.

I guess I should also file an aardvark with the Austin Group to get a
ruling on whether it makes sense to require memchr to work in ascending
order when n is potentially larger than the allocated size. After all,
this is similar to the requirement already in place for %.*s in printf,
where it is explicitly stated that "If the precision is not specified
or is greater than the size of the array, the application shall ensure
that the array contains a null byte". Many existing printf
implementations use memchr to implement this requirement, and they
would be rendered non-compliant if memchr could not be used in this
manner.

And even if the Austin Group rules against my request, my point still
remains that there is potentially a lot of existing code that assumes
ascending memchr semantics on unknown-sized objects, where it is easier
to fix that code by fixing memchr than to fix all clients to use a
slower interface.

-- 
Eric Blake
