Also, since you're already writing the hits to a file, there's no need to
print them to the screen as well.  That should shave off some time,
especially if you have a lot of hits.
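
As a rough illustration, here is a minimal sketch that writes matching lines
straight to the output without echoing them to the console. The function name
copy_hits and the sample data are made up for this example, and io.StringIO
stands in for the real input and output files. It is written for Python 3.

```python
import io

def copy_hits(src, dst, needle):
    """Write every line of src that contains needle to dst; return the hit count."""
    count = 0
    for line in src:
        if needle in line:
            dst.write(line)  # written to the file only, no per-hit print()
            count += 1
    return count

# Hypothetical sample data; real code would pass open file objects instead.
src = io.StringIO("id 1001 ok\nno match\nid 1001 again\n")
dst = io.StringIO()
n_hits = copy_hits(src, dst, "1001")
print("%d hits written" % n_hits)  # one summary line at the end
```

A single summary line at the end still tells you the run worked, without the
cost of printing every hit.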

On 6/26/07, Kent Johnson <[EMAIL PROTECTED]> wrote:

Robert Hicks wrote:
> idList only has about 129 id numbers in it.

That is quite a few times to be searching each line of the file. Try
using a regular expression search instead, like this:

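import re
# Escape each id so it is matched literally, then join them into one pattern
regex = re.compile('|'.join(re.escape(i) for i in idList))
for line in f2:
    if regex.search(line):
        pass  # process a hit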

A simple test shows that to be about 25 times faster.

Searching for each of 100 id strings in another string:
In [6]: import timeit
In [9]: setup = "import re; import string; ids=[str(i) for i in range(1000, 1100)]; line=string.letters"
In [10]: timeit.Timer('for i in ids: i in line', setup).timeit()
Out[10]: 15.298269987106323

Build a regular expression to match all the ids and use that to search:
In [11]: setup2=setup + ";regex=re.compile('|'.join(ids))"
In [12]: timeit.Timer('regex.search(line)', setup2).timeit()
Out[12]: 0.58947491645812988

In [15]: _10 / _12
Out[15]: 25.95236804820507
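
To tie the pieces together, here is a minimal sketch of the regex approach
wrapped in a reusable function, written for Python 3. The names
build_id_regex and find_hits and the sample ids are invented for the example.
re.escape is added on the assumption that the ids should match literally
rather than as patterns.

```python
import re

def build_id_regex(id_list):
    """Compile one alternation pattern that matches any id in id_list."""
    # re.escape keeps characters such as '.' from acting as regex operators.
    return re.compile("|".join(re.escape(i) for i in id_list))

def find_hits(lines, id_list):
    """Return the lines that contain at least one of the ids."""
    regex = build_id_regex(id_list)
    return [line for line in lines if regex.search(line)]

# Hypothetical ids and input lines, for illustration only.
id_list = ["1001", "1042", "A.7"]
lines = [
    "order 1042 shipped\n",
    "nothing here\n",
    "ref AX7 pending\n",
    "ref A.7 done\n",
]
hits = find_hits(lines, id_list)
```

Note that an empty id list produces an empty pattern, which matches every
line, so real code may want to check for that case first.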

> I am running it straight from a Linux console. I thought about buffering
> but I am not sure how Python handles that.

I don't think the console should be buffered.

> Do you know if Python has a "slower" startup time than Perl? That could
> be part of it though I suspect the buffering thing more.

I don't know if it is slower than Perl but it doesn't take a few seconds
on my computer. How long does it take you to get to the interpreter
prompt if you just start Python? You could put a simple print at the
start of your program to see when it starts executing.
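
If you would rather measure than eyeball it, a small sketch like the following
times how long it takes to launch a fresh interpreter that does nothing. It is
written for Python 3, and the figure it reports includes the cost of spawning
the process, so treat it as a rough upper bound rather than an exact startup
time.

```python
import subprocess
import sys
import time

# Time a child interpreter that starts up, runs an empty statement, and exits.
start = time.perf_counter()
subprocess.run([sys.executable, "-c", "pass"], check=True)
elapsed = time.perf_counter() - start

print("interpreter startup took about %.3f seconds" % elapsed)
```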

Kent
_______________________________________________
Tutor maillist  -  Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor
