Re: Scanning a file

Scott David Daniels Sat, 29 Oct 2005 14:15:46 -0700

Paul Watson wrote:


> Here is a better one that counts, and not just detects, the substring.  This 
> is -much- faster than using mmap; especially for a large file that may cause 
> paging to start.  Using mmap can be -very- slow.
> 
> <ss = pattern, be = len(ss) - 1>
> ...
> b = fp.read(blocksize)
> count = 0
> while len(b) > be:
>     count += b.count(ss)
>     b = b[-be:] + fp.read(blocksize)
> ...
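
In cases where that one wins and blocksize is big, this should do
even better. It never copies a whole block just to glue the tail of
the previous one onto it; only the be characters on each side of a
block boundary are joined and searched a second time: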
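     ...
     block = fp.read(blocksize)
     count = 0
     while len(block) > be:
         # Matches that lie entirely inside this block.
         count += block.count(ss)
         # Last be characters of the block. Written with an explicit
         # start index because block[-0:] would be the whole block.
         lead = block[len(block) - be:]
         block = fp.read(blocksize)
         # Matches that straddle the boundary between the two blocks.
         count += (lead + block[:be]).count(ss)
     ...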
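
For anyone who wants to try it, here is the same loop as a complete
function. This is only a sketch: the name count_in_stream, the default
blocksize, and the two argument checks are my assumptions, not part of
the snippets above. It assumes fp.read() returns a short block only at
end of file, and that blocksize is larger than be.

```python
import io


def count_in_stream(fp, ss, blocksize=1 << 16):
    """Count occurrences of ss in fp, reading blocksize items at a time.

    fp and ss must agree in type: a binary-mode file with a bytes
    pattern, or a text-mode file with a str pattern.
    """
    if not ss:
        raise ValueError("pattern must not be empty")
    be = len(ss) - 1
    if blocksize <= be:
        # Hypothetical guard: with smaller blocks the loop would never run.
        raise ValueError("blocksize must be at least len(ss)")

    count = 0
    block = fp.read(blocksize)
    while len(block) > be:
        # Matches that lie entirely inside this block.
        count += block.count(ss)
        # Last be items of the block; empty when be == 0.
        lead = block[len(block) - be:]
        block = fp.read(blocksize)
        # Matches that straddle the boundary between the two blocks.
        count += (lead + block[:be]).count(ss)
    return count


if __name__ == "__main__":
    sample = io.BytesIO(b"spam and eggs and spam")
    print(count_in_stream(sample, b"spam", blocksize=8))
```

One caveat: count() counts non-overlapping matches, so for a pattern
that can overlap itself the total may differ from a single count()
over the whole file. With ss = 'aa', the text 'aaaa', and blocksize 2,
the loop reports 3 where 'aaaa'.count('aa') gives 2.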
-- 
-Scott David Daniels
[EMAIL PROTECTED]
-- 
http://mail.python.org/mailman/listinfo/python-list
