--- Yakov Lerner <[EMAIL PROTECTED]> wrote:
> I sometimes use a regex that finds paragraphs
> containing given words w1, w2, ... in any order (I define a "paragraph"
> as text separated by blank lines, \n\n).
>
> I use a pattern like this (two-word example, w1 and w2, but easily
> expandable to N words):
> /\c\(.\|.\n\)*\<w1\>\&\(.\|.\n\)*\<w2\>
> (with :set maxmempattern=20000)
> This works, but the search is unbelievably slow on big files.
>
> My question is: is there a rewrite of this regex that works faster?
>
> To see how slow this is, here is a test case:
> 1. wget http://www.vmunix.com/~gabor/c/draft.html
> # this is a ~1.3 MB file.
> 2. vim draft.html
> 3. /\c\(.\|.\n\)*\<w1\>\&\(.\|.\n\)*\<w2\>
Try this pattern:
/\c\n\zs\%(\%(.\|.\n\)\{-}\<international\>\&\%(.\|.\n\)\{-}\<although\>\)
The \n\zs at the start means a match can only begin at the start of a line, so
the pattern matches at most once per line, and it uses the non-greedy \{-}
instead of * to cut down on backtracking. That search finishes in 30 seconds
(on a dual 1.8 GHz G5). You won't need to tweak maxmempattern either.
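
If you want more than two words, the pattern can be assembled mechanically. Here is a rough Python sketch that builds the same kind of pattern for any list of words. The function name vim_all_words_pattern is made up for illustration, and it assumes the words are plain words with no characters that are special in a Vim pattern:

```python
def vim_all_words_pattern(words):
    """Build a Vim search pattern (without the leading '/') that matches
    at the start of a line from which every word can be reached."""
    # One branch per word: a non-greedy run of characters (which may cross
    # line breaks via ".\n"), followed by the word as a whole word.
    # Assumption: each word is plain text with no Vim regex metacharacters.
    branches = [r"\%(.\|.\n\)\{-}\<" + w + r"\>" for w in words]
    # \& requires every branch to match at the same starting position.
    return r"\c\n\zs\%(" + r"\&".join(branches) + r"\)"


# Hypothetical usage: print a search command to paste into Vim.
print("/" + vim_all_words_pattern(["international", "although"]))
```

For the two words above this should give the same pattern as the one I posted; adding a third word just adds another \& branch.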
regards,
Peter