Hi,

.*? is a "lazy" match, whereas .* is a "greedy" match. The lazy one will
match as small a string as possible, so after every character it consumes
the engine has to check whether the rest of the pattern matches, which
means a LOT of backtracking. Getting rid of these is going to help a lot.
(It's pleasantly confusing that the lazy match is more expensive. :-))
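
To make the difference concrete, here is a minimal sketch using Python's re module. The sample string is made up for illustration and is not taken from any w3af test page.

```python
import re

# Hypothetical sample input, chosen only to show the two behaviours.
html = '<script>a()</script><script>b()</script>'

# Greedy: .* grabs as much as it can, then backs off to the LAST </script>.
greedy = re.search(r'<script>.*</script>', html).group(0)

# Lazy: .*? grabs as little as it can, so it stops at the FIRST </script>.
lazy = re.search(r'<script>.*?</script>', html).group(0)

print(greedy)
print(lazy)
```

The greedy pattern returns the whole string, while the lazy one returns only the first script element, so the two are not interchangeable.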

Tom Conner

On Wed, Apr 14, 2010 at 3:25 AM, Martin Holst Swende <mar...@swende.se> wrote:

> Hi,
>
> I noticed that some greppers took an extremely long time to run for certain
> input. Two of them in particular almost appeared to hang when I ran them:
> ajax and svnusers. In ajax.py, the following regexp is used:
>
>        regex_string = '< *?script.*?>.*?'
>        regex_string += '(XMLHttpRequest|eval\(\)|ActiveXObject\("Msxml2.XMLHTTP"\)|'
>        regex_string += 'ActiveXObject\("Microsoft.XMLHTTP"\))'
>        regex_string += '.*?</ *?script *?>'
>
> This is a very 'loose' regexp with a lot of wildcards, so for certain
> pages it effectively becomes a ReDoS. I suggest changing it to check
> only for the calls. Also, it looks like the eval construct matches only
> the literal "eval()", not "eval(foo)". Something like this should work
> if we want to catch any use of eval:
>
>        regex_string = '(XMLHttpRequest|eval\(|ActiveXObject\("Msxml2.XMLHTTP"\)|'
>        regex_string += 'ActiveXObject\("Microsoft.XMLHTTP"\))'
>
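
A quick way to check the eval point is to run both alternations against the same input. This is only a sketch: the two patterns are copied from the message and written as raw strings, and the response body is a hypothetical example.

```python
import re

# Call-only patterns from the message, without the <script> wrapper.
old_calls = (r'(XMLHttpRequest|eval\(\)|ActiveXObject\("Msxml2.XMLHTTP"\)|'
             r'ActiveXObject\("Microsoft.XMLHTTP"\))')
new_calls = (r'(XMLHttpRequest|eval\(|ActiveXObject\("Msxml2.XMLHTTP"\)|'
             r'ActiveXObject\("Microsoft.XMLHTTP"\))')

# Hypothetical response body: eval is called WITH an argument.
body = '<script>var x = eval(userInput);</script>'

old_match = re.search(old_calls, body)  # needs the literal "eval()"
new_match = re.search(new_calls, body)  # needs only "eval("

print(old_match)
print(new_match.group(1))
```

With this input the old alternation finds nothing, and the new one reports the "eval(" call.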
> svnusers.py contains the following:
>    regex = '\$.*?: .*? .*? \d{4}[-/]\d{1,2}[-/]\d{1,2}'
>    regex += ' \d{1,2}:\d{1,2}:\d{1,2}.*? (.*?) (Exp )?\$'
>
> This can be improved by replacing the wildcards with stricter matches and
> removing the optional (Exp )? at the end.
> However, it seems to me that the following regexp would work and be much
> quicker:
>        regex  = "date:.*author:\W(\w+);"
>
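
For the first option (stricter matches in place of the wildcards), something along these lines might work. It is only a sketch: the tightened pattern and both keyword lines are my own assumptions and have not been run against real w3af responses.

```python
import re

# Original svnusers.py pattern as quoted above, written as raw strings.
original = (r'\$.*?: .*? .*? \d{4}[-/]\d{1,2}[-/]\d{1,2}'
            r' \d{1,2}:\d{1,2}:\d{1,2}.*? (.*?) (Exp )?\$')

# Hypothetical tightened variant: each .*? becomes a run of non-space
# characters, so a field can never swallow the spaces between fields.
tightened = (r'\$\w+: \S+ \S+ \d{4}[-/]\d{1,2}[-/]\d{1,2}'
             r' \d{1,2}:\d{1,2}:\d{1,2}\S* (\S+) (Exp )?\$')

# Made-up keyword lines in the two common layouts (assumed, not from w3af).
samples = [
    '$Id: file.py 148 2010-04-14 03:25:00Z martin $',
    '$Id: file.c,v 1.1 2003/01/01 12:00:00 user Exp $',
]

# Group 1 is the user name in both patterns.
users_original = [re.search(original, s).group(1) for s in samples]
users_tightened = [re.search(tightened, s).group(1) for s in samples]

print(users_original)
print(users_tightened)
```

Both patterns pull the same user name out of these two lines. The point of the tightened one is that \S+ cannot run across spaces, which leaves the engine far fewer ways to retry on input that does not match.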
> Additionally, both of them contain the construction ".*?", which is
> strange. Unless I am missing something special about Python regexps,
> this should be ".*", as * means zero or more times and ? means optional,
> that is, zero or one time.
>
> Regards,
> Martin Holst Swende
>
>
> ------------------------------------------------------------------------------
> Download Intel® Parallel Studio Eval
> Try the new software tools for yourself. Speed compiling, find bugs
> proactively, and fine-tune applications for parallel performance.
> See why Intel Parallel Studio got high marks during beta.
> http://p.sf.net/sfu/intel-sw-dev
> _______________________________________________
> W3af-develop mailing list
> W3af-develop@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/w3af-develop
>