doug shawhan wrote:
> I have a series of lists to compare with a list of exclusionary terms.
>
> junkList =["interchange", "ifferen", "thru"]
>
> The comparison lists have one or more elements, which may or may not
> contain the junkList elements somewhere within:
>
> l = ["My skull hurts", "Drive the thruway", "Interchangability is not my
> forte"]
>
> ... output would be
>
> ["My skull hurts"]
>
> I have used list comprehension to match complete elements, how can I do
> a partial match?

One way is to use a helper function to do the test:

  In [1]: junkList = ["interchange", "ifferen", "thru"]

  In [2]: lst = ["My skull hurts", "Drive the thruway", "Interchangability is not my forte"]

  In [3]: def hasJunk(s):
     ...:     for junk in junkList:
     ...:         if junk in s:
     ...:             return True
     ...:     return False
     ...:

  In [4]: [ s for s in lst if not hasJunk(s) ]
  Out[4]: ['My skull hurts', 'Interchangability is not my forte']

Hmm, I guess spelling counts :-) "Interchangability" does not contain "interchange", so it survives the filter. You might also want to make this case-insensitive by taking s.lower() in hasJunk().

Another way is to make a regular expression that matches all the junk:

  In [7]: import re

Escape the junk in case it has any re-special chars:

  In [9]: allJunk = '|'.join(re.escape(junk) for junk in junkList)

  In [10]: allJunk
  Out[10]: 'interchange|ifferen|thru'

You could compile with re.IGNORECASE to make case-insensitive matches. Spelling still counts though ;)

  In [11]: junkRe = re.compile(allJunk)

  In [13]: [ s for s in lst if not junkRe.search(s) ]
  Out[13]: ['My skull hurts', 'Interchangability is not my forte']

My guess is the re version will be faster, at least if you don't count the compile, but only testing will tell for sure:

  In [14]: import timeit

  In [18]: timeit.Timer(setup='from __main__ import hasJunk,lst', stmt='[ s for s in lst if not hasJunk(s) ]').timeit()
  Out[18]: 11.921303685244915

  In [19]: timeit.Timer(setup='from __main__ import junkRe,lst', stmt='[ s for s in lst if not junkRe.search(s) ]').timeit()
  Out[19]: 8.3083201915327223

So for this data using re is a little faster: about 8.3 seconds versus 11.9 seconds for timeit's default of one million runs. Test with real data to be sure!

Kent
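
P.S. The two case-insensitive variants mentioned above could look roughly like the sketch below. The names hasJunkNoCase and junkReNoCase, and the extra "INTERCHANGE ahead" test string, are made up for illustration and are not part of the session above.

```python
import re

junkList = ["interchange", "ifferen", "thru"]
# The last element is an invented example that only matches when case is ignored.
lst = ["My skull hurts", "Drive the thruway",
       "Interchangability is not my forte", "INTERCHANGE ahead"]

def hasJunkNoCase(s):
    # Hypothetical variant of hasJunk(): lower-case both sides before the
    # substring test so that "INTERCHANGE" matches "interchange".
    s = s.lower()
    return any(junk.lower() in s for junk in junkList)

# Hypothetical variant of junkRe: same escaped alternation, compiled
# with re.IGNORECASE.
junkReNoCase = re.compile('|'.join(re.escape(junk) for junk in junkList),
                          re.IGNORECASE)

by_helper = [s for s in lst if not hasJunkNoCase(s)]
by_regex = [s for s in lst if not junkReNoCase.search(s)]
print(by_helper)
print(by_regex)
```

Both lists drop "Drive the thruway" and "INTERCHANGE ahead". The misspelled "Interchangability" still gets through, since neither approach does fuzzy matching.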