On Mon, 3 Aug 2009, Ronen Hod wrote:

> I encountered serious performance issues so I had to do a quick fix
> for my needs. It is not thread-safe, it has assumptions regarding the
> number of states (which are sufficient for me), and I am not 100% sure
> that I understand all the implications (maybe it can be optimized
> further). My application as a whole runs 4x faster now, and I assume
> that pcre_dfa_exec() is ~10x faster on my data. Attached is the code
> with the changes. Enable/Disable them using: "#define
> CRESCENDO_CHECK_FOR_DUPLICATES".

I have now studied your patch, and I understand how you are getting a 
speed-up. Unfortunately, I cannot install the patch in PCRE because of
the problems you mention: it is not thread-safe and it has assumptions 
about the number of states.

I will try to think about other ways of speeding up the duplicate 
checking that are thread-safe and do not make any assumptions, though I 
have a feeling that this will not be easy.

In the meantime, I hope you have read Jeffrey Friedl's book "Mastering 
Regular Expressions" and made sure that the patterns you are using are 
as optimised as possible.

Thanks for posting your code and bringing this area of performance to my 
attention.

Philip

-- 
Philip Hazel

-- 
## List details at http://lists.exim.org/mailman/listinfo/pcre-dev 
