On Jun 18, 11:01 pm, Kris Kennaway <[EMAIL PROTECTED]> wrote:
> Calvin Spealman wrote:
> > Upload, wait, and google them.
> >
> > Seriously tho, aside from using a real indexer, I would build a set of
> > the words I'm looking for, and then loop over each file, looping over
> > the words and doing quick checks for containment in the set. If so, add
> > to a dict of file names to list of words found until the list hits
> > length 10. I don't think that would be a complicated solution and it
> > shouldn't be terrible at performance.
> >
> > If you need to run this more than once, use an indexer.
> >
> > If you only need to use it once, use an indexer, so you learn how for
> > next time.
>
> If you can't use an indexer, and performance matters, evaluate using
> grep and a shell script. Seriously.
>
> grep is a couple of orders of magnitude faster at pattern matching
> strings in files (and especially regexps) than Python is. Even if you
> are invoking grep multiple times it is still likely to be faster than a
> "maximally efficient" single pass over the file in Python. This
> realization was disappointing to me :)
>
> Kris
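For reference, this is roughly how I read Calvin's set-based suggestion -- a minimal
sketch, where the word list and file names are just placeholders and "words" means
whitespace-separated tokens:

    # Sketch of the set-containment approach: a set of target words, a loop
    # over each file, and a dict of file name -> words found, capped at 10.
    words = {"foo", "bar", "baz"}        # placeholder search terms
    files = ["a.txt", "b.txt"]           # placeholder file list
    wanted = 10                          # stop collecting once this many are found

    found = {}                           # file name -> list of words found
    for name in files:
        hits = []
        with open(name) as f:
            for line in f:
                for token in line.split():
                    if token in words and token not in hits:
                        hits.append(token)
                if len(hits) >= wanted:
                    break
        if hits:
            found[name] = hits[:wanted]

    print(found)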
Alternatively, if you don't feel like writing shell scripts, you can write a Python program which auto-generates the desired shell script that utilizes grep. E.g. use Python to generate the file list which is passed to grep as arguments. ;-P
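As a rough sketch of that idea (a variant that skips the intermediate script and
calls grep directly from Python), assuming a reasonably modern Python with
subprocess.run, grep on the PATH, and placeholder search terms and directory:

    import os
    import subprocess

    words = ["foo", "bar", "baz"]          # placeholder search terms
    root = "/path/to/files"                # placeholder directory to scan

    # Build the file list in Python rather than in shell.
    files = [os.path.join(dirpath, name)
             for dirpath, _, names in os.walk(root)
             for name in names]

    # Let grep do the matching: -l lists matching file names, -F treats the
    # terms as fixed strings, and each -e adds another pattern.
    cmd = ["grep", "-lF"]
    for w in words:
        cmd += ["-e", w]
    cmd += files

    result = subprocess.run(cmd, capture_output=True, text=True)
    print(result.stdout)

Note that -l with several -e patterns reports files containing *any* of the terms,
so finding files that contain some minimum number of them still needs a little
post-processing on the Python side, and a huge file list would have to be chunked
(or fed through xargs) to stay under the argument-length limit.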