I recently posted a question regarding a regex performance issue I was having.
I've "resolved" the issue and thought my lessons learned might be helpful to
someone else.
I am extracting lines from a text file with the regex task. (The lines fall
between two input string values.) With a source file of 50k or so, the
search took 10-15 seconds. With source files of 500k, it took up to
1.5 hours. I don't know why the time grows so sharply (code is below).
I have replaced the regex code with a foreach that reads and
evaluates each line. It processes the same 500k files in 10-15 seconds.
Regex code:
<loadfile file="${inputfilename}" property="fileContent" />
<regex
pattern="^[\s\S]*${startstring}(?'matchedLines'[\s\S]*)[\s\S]*${endstring}"
input="${fileContent}"/>
<echo message="${matchedLines}" />
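A possible cause, offered as a guess rather than a diagnosis: the pattern
contains three greedy [\s\S]* runs, two of them adjacent, so when the engine
backtracks it can split the same text among them in many ways. A minimal
sketch of a lazy variant follows. It drops the leading and trailing runs and
is untested against the 500k files. It also stops at the first ${endstring}
after the first ${startstring}, where the greedy pattern favors the last
occurrence of each. Like the original, it assumes neither string contains
regex metacharacters.

```xml
<loadfile file="${inputfilename}" property="fileContent" />
<!-- Sketch only: one lazy run ([\s\S]*?) replaces the three greedy runs. -->
<!-- Assumption: startstring and endstring hold plain text, not regex syntax. -->
<regex
    pattern="${startstring}(?'matchedLines'[\s\S]*?)${endstring}"
    input="${fileContent}"/>
<echo message="${matchedLines}" />
```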
Foreach code:
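<!-- captureLines must exist before the loop tests it; start with capture off. -->
<property name="captureLines" value="false"/>
<foreach item="Line" in="${inputfilename}" property="lineContent">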
<if test="${string::contains(lineContent,endstring)}">
<property name="captureLines" value="false"/>
</if>
<if test="${captureLines == 'true'}">
<echo message="${lineContent}" />
</if>
<if test="${string::contains(lineContent,startstring)}">
<property name="captureLines" value="true"/>
</if>
</foreach>
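The regex version left its result in the matchedLines property, while the
loop above only echoes each line. If the result is needed as a property, a
sketch along these lines might do. Reusing the name matchedLines and the two
initial values are my assumptions, and the property grows with every appended
line, so it is not timed here.

```xml
<!-- Sketch: collect the captured lines into one property instead of echoing. -->
<property name="captureLines" value="false"/>
<property name="matchedLines" value=""/>
<foreach item="Line" in="${inputfilename}" property="lineContent">
    <!-- Stop capturing when the end marker shows up (that line is excluded). -->
    <if test="${string::contains(lineContent, endstring)}">
        <property name="captureLines" value="false"/>
    </if>
    <!-- Append the current line plus a newline while capture is on. -->
    <if test="${captureLines == 'true'}">
        <property name="matchedLines"
                  value="${matchedLines}${lineContent}${environment::newline()}"/>
    </if>
    <!-- Start capturing on the line after the start marker. -->
    <if test="${string::contains(lineContent, startstring)}">
        <property name="captureLines" value="true"/>
    </if>
</foreach>
<echo message="${matchedLines}" />
```

As in the loop above, the order of the three checks means the lines holding
the start and end strings themselves are left out.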
Curt
_______________________________________________
NAnt-users mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nant-users