On Wed, Feb 16, 2005 at 01:34:16PM -0700, Mike Brown wrote:
> time. But I would also hope that it would be smart enough to know that it 
> doesn't need to look past the 2nd character in 'not the xyz' when it is 
> searching for 'not there' (due to the lengths of the sequences).

Assuming stringobject.c:string_contains is the right function, the
code looks like this:

        size = PyString_GET_SIZE(el);
        rhs = PyString_AS_STRING(el);
        lhs = PyString_AS_STRING(a);

        /* optimize for a single character */
        if (size == 1)
                return memchr(lhs, *rhs, PyString_GET_SIZE(a)) != NULL;

        end = lhs + (PyString_GET_SIZE(a) - size);
        while (lhs <= end) {
                if (memcmp(lhs++, rhs, size) == 0)
                        return 1;
        }

So it's doing a zillion memcmp()s.  I don't think there's a more
efficient way to do this with ANSI C; memmem() is a GNU extension that
searches for blocks of memory.  Perhaps saving some memcmps by writing

         if ((*lhs  == *rhs) && memcmp(lhs++, rhs, size) == 0)

would help.

--amk

_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Reply via email to