On Tue, 14 Jul 2009 02:06:04 -0300, Gabriel Genellina wrote:

>> Matt, how many words are you looking for, in how long a string ?
>> Were you able to time any( substr in long_string ) against re.compile
>> ( "|".join( list_items )) ?
> 
> There is a known algorithm to solve specifically this problem  
> (Aho-Corasick), a good implementation should perform better than R.E. (and  
> better than the gen.expr. with the advantage of returning WHICH string  
> matched)

Aho-Corasick has the advantage of being linear in the length of the
patterns, so the setup may be faster than re.compile(). The actual
searching won't necessarily be any faster (assuming optimal
implementations; I don't know how safe that assumption is).

-- 
http://mail.python.org/mailman/listinfo/python-list

Reply via email to