Paul Watson wrote:
> Here is a better one that counts, and not just detects, the substring. This
> is -much- faster than using mmap; especially for a large file that may cause
> paging to start. Using mmap can be -very- slow.
>
> <ss = pattern, be = len(ss) - 1>
> ...
> b = fp.read(blocksize)
> count = 0
> while len(b) > be:
>     count += b.count(ss)
>     b = b[-be:] + fp.read(blocksize)
> ...

In cases where that one wins and blocksize is big, this should do even
better, because it never copies a whole block just to prepend the tail of
the previous one; only the be characters on either side of each block
boundary are joined and searched:

    ...
    block = fp.read(blocksize)
    count = 0
    while len(block) > be:
        count += block.count(ss)
        lead = block[-be :]
        block = fp.read(blocksize)
        count += (lead + block[: be]).count(ss)
    ...

--
-Scott David Daniels
[EMAIL PROTECTED]
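For anyone who wants to try the block-wise count, here is a minimal self-contained sketch of the loop above. The function name `count_in_blocks`, the default block size, and the two argument checks are assumptions added for illustration and are not part of the original post. The sketch also takes `lead` from an explicit index, since `block[-be:]` returns the whole block when `be` is 0 (a one-character pattern) and would count every match twice. It assumes `fp.read(blocksize)` returns a short block only at end of file, which holds for regular files and in-memory streams but not necessarily for pipes or sockets.

```python
import io


def count_in_blocks(fp, ss, blocksize=1 << 16):
    """Count occurrences of ss in the stream fp, reading it block by block.

    Hypothetical helper: the name, default block size, and argument checks
    are illustrative assumptions, not taken from the original post.
    """
    if not ss:
        raise ValueError("pattern must be non-empty")
    be = len(ss) - 1
    if blocksize <= be:
        # A block must be able to hold the whole pattern, or the loop never runs.
        raise ValueError("blocksize must be at least len(ss)")

    count = 0
    block = fp.read(blocksize)
    while len(block) > be:
        # Matches that lie entirely inside the current block.
        count += block.count(ss)
        # Last `be` items of this block. An explicit start index is used
        # because block[-be:] would be the whole block when be == 0.
        lead = block[len(block) - be:]
        block = fp.read(blocksize)
        # Matches that straddle the boundary between the two blocks.
        count += (lead + block[:be]).count(ss)
    return count


if __name__ == "__main__":
    data = b"xxabxxab"
    print(count_in_blocks(io.BytesIO(data), b"ab", blocksize=3))
```

One caveat: `count()` counts non-overlapping matches within each piece it is given, so for a pattern that can overlap itself (such as `aa` in `aaa`) the total may differ from calling `count()` once on the whole data. For patterns that cannot overlap themselves the two agree.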