On Apr 3, 8:19 am, George Sakkis <[EMAIL PROTECTED]> wrote:
> On Apr 3, 8:03 am, Jeff <[EMAIL PROTECTED]> wrote:
>
> > def foo(sample, strings):
> > for s in strings:
> > if sample in s:
> > return True
> > return False
>
> > This was an order of magnitude faster for me than using str.find or
> > str.index. That was finding rare words in the entire word-list (w/
> > duplicates) of War and Peace.
>
> If you test against the same substrings over and over again, an
> alternative would be to build a regular expression:
>
> import re
> search = re.compile('|'.join(re.escape(x)
> for x in substrings)).search
> p = search(somestring)
> if p is not None:
> print 'Found', p.group()
>
> George
That would be an enormous regular expression and eat a lot of memory.
But over an enormous number of substrings, it would be O(log n),
rather than O(n).
--
http://mail.python.org/mailman/listinfo/python-list