I recently posted a question regarding a regex performance issue I was having.  
I've "resolved" the issue and thought my lessons learned might be helpful to 
someone else.

I am extracting lines from a text file with the regex task. (The lines fall
between two input string values.) I found that if the source file is 50k or
so, the search takes 10-15 seconds. For source files of 500k, the search
takes up to 1.5 hours. I don't know why this is happening (code is below).


I have replaced the regex code with a foreach that reads and
evaluates each line.  It processes the same 500k files in 10-15 seconds.


Regex code:
        <loadfile file="${inputfilename}" property="fileContent" />
        <regex pattern="^[\s\S]*${startstring}(?'matchedLines'[\s\S]*)[\s\S]*${endstring}"
               input="${fileContent}" />
        <echo message="${matchedLines}" />
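
A possible explanation, offered as a guess rather than a diagnosis: the
pattern places three greedy [\s\S]* runs next to each other, so whenever
${endstring} does not match at the position first tried, the engine has to
retry many ways of dividing the same text between those runs. The work can
then grow roughly with the square of the file size or worse, which would fit
the jump from seconds to hours. The sketch below is untested. It assumes that
startstring and endstring contain no regex metacharacters, and it captures
the text between the first start string and the first end string after it,
which is not exactly what the pattern above captures.

```xml
        <loadfile file="${inputfilename}" property="fileContent" />
        <!-- Sketch only: a single lazy [\s\S]*? between the two markers,
             with no leading or trailing [\s\S]* runs. -->
        <regex pattern="${startstring}(?'matchedLines'[\s\S]*?)${endstring}"
               input="${fileContent}" />
        <echo message="${matchedLines}" />
```

Even if this turns out faster, the file is still loaded into one property,
so the foreach version may remain the safer choice for large files.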

Foreach code:
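        <!-- Start with capturing off so the test inside the loop never
             reads an undefined property. -->
        <property name="captureLines" value="false"/>
        <foreach item="Line" in="${inputfilename}" property="lineContent">
            <!-- Stop capturing before the end line is echoed. -->
            <if test="${string::contains(lineContent,endstring)}">
                <property name="captureLines" value="false"/>
            </if>
            <if test="${captureLines == 'true'}">
                <echo message="${lineContent}" />
            </if>
            <!-- Start capturing after the start line, so it is not echoed. -->
            <if test="${string::contains(lineContent,startstring)}">
                <property name="captureLines" value="true"/>
            </if>
        </foreach>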

Curt



