Tim N. van der Leeuw wrote:

> This is basically the same idea as what I tried to describe in my
> previous post but without any samples.
> I wonder if it's more efficient to create a new list using a
> list-comprehension, and checking each entry against the 'wanted' set,
> or to create a new set which is the intersection of set 'wanted' and
> the iterable of all matches...
>
> Your sample code would then look like this:
>
>>>> import re
>>>> r = re.compile(r"\w+")
>>>> file_content = "foo bar-baz ignored foo()"
>>>> wanted = set(["foo", "bar", "baz"])
>>>> found = wanted.intersection(name for name in r.findall(file_content))
Just

found = wanted.intersection(r.findall(file_content))

>>>> print found
> set(['baz', 'foo', 'bar'])
>>>>
>
> Anyone who has an idea what is faster? (This dataset is so limited that
> it doesn't make sense to do any performance-tests with it)

I guess that your approach would be a bit faster, though most of the time
will be spent on IO anyway. The result would be slightly different, and
again yours (without duplicates) seems more useful. However, I'm not sure
whether the OP would rather stop at the first match, or needs a match
object rather than just the text. In that case:

matches = (m for m in r.finditer(file_content) if m.group(0) in wanted)

Peter
--
http://mail.python.org/mailman/listinfo/python-list
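[A sketch pulling the thread's variants together, in modern Python 3
syntax (the thread itself uses Python 2's print statement). The sample
data is taken from the quoted session; the printed positions assume
that exact string.]

```python
import re

# Example data from the thread; tiny and illustrative only, not a benchmark.
r = re.compile(r"\w+")
file_content = "foo bar-baz ignored foo()"
wanted = set(["foo", "bar", "baz"])

# Set intersection: no duplicates, arbitrary order.
found = wanted.intersection(r.findall(file_content))
print(sorted(found))        # ['bar', 'baz', 'foo']

# List comprehension: keeps duplicates and source order.
found_list = [name for name in r.findall(file_content) if name in wanted]
print(found_list)           # ['foo', 'bar', 'baz', 'foo']

# Generator of match objects, for when the caller needs positions
# (or wants to stop early) rather than just the matched text.
matches = (m for m in r.finditer(file_content) if m.group(0) in wanted)
for m in matches:
    print(m.group(0), m.start())
```

Note that the generator is lazy: if the OP only needs to know whether any
wanted name occurs, `next(matches, None)` stops scanning at the first hit.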