Hi,

Use a suffix tree. First make yourself a suffix tree of your thousand files
and the use it.
This is a classical problem for that kind of structure.

Just search "suffix tree" or "suffix tree python" on google to find a
definition and an implementation.

(Also Jon Bentley's "Programming Pearls" is a great book to read)

Regards

Francis Girard

2008/6/18 brad <[EMAIL PROTECTED]>:

> Just wondering if anyone has ever solved this efficiently... not looking
> for specific solutions tho... just ideas.
>
> I have one thousand words and one thousand files. I need to read the files
> to see if some of the words are in the files. I can stop reading a file once
> I find 10 of the words in it. It's easy for me to do this with a few dozen
> words, but a thousand words is too large for an RE and too inefficient to
> loop, etc. Any suggestions?
>
> Thanks
> --
> http://mail.python.org/mailman/listinfo/python-list
>
--
http://mail.python.org/mailman/listinfo/python-list

Reply via email to