RE: Help with speeding up a string replace on a large file.
In my tests, checking each piece of text with FindNoCase before calling ReplaceNoCase speeds the process up, because FindNoCase is substantially quicker than ReplaceNoCase when the sub-string doesn't exist in the original text. Using the code below, the whole process completes in around 15 seconds. It's not sub 10 seconds, but it is pretty close, and certainly better than 3 to 5 minutes.

The example file I used was 1.3 MB, consisting of 18875 lines (an Undernet channel dump), so the loop runs 18875 times:

    MyQuery = QueryNew("NewLine");
    QueryAddRow(MyQuery, CFX_ReadLn.LineCount);
    for (counter = 1; counter LTE CFX_ReadLn.LineCount; counter = counter + 1) {
        tmpLine = ReadLn.Line[counter];
        // Only call ReplaceNoCase when the cheaper FindNoCase finds the text
        if (FindNoCase(" Text to replace ", tmpLine))
            tmpLine = ReplaceNoCase(tmpLine, " Text to replace ", "My new text", "ALL");
        if (FindNoCase(" Different stuff to replace ", tmpLine))
            tmpLine = ReplaceNoCase(tmpLine, " Different stuff to replace ", "Some other text", "ALL");
        QuerySetCell(MyQuery, "NewLine", tmpLine, counter);
    }

The benchmarks I get are as follows (all times are in ms):

    A - ReadLn
    B - Loop
    C - Query2File

                     A        B      C
    Iteration 1    131    14490    311
    Iteration 2    130    14312    300
    Iteration 3    130    14352    310
    Iteration 4    130    14501    311
    Iteration 5    130    14500    331
    Average        130    14431    313

That's roughly 0.79 ms per line (about 14.9 seconds in total over 18875 lines). If I run the code without the FindNoCase statements, the whole process takes on average 9 seconds longer to complete.

Paul
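For anyone who wants to try the same guard outside CF, here is a minimal Java sketch of the idea: run a cheap case-insensitive search first and only do the replace when the target is present. The class and method names are made up for illustration, and whether the guard pays off in Java the way it does with FindNoCase/ReplaceNoCase is an assumption that would need benchmarking.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of CF or of the CFX tags mentioned above.
public final class GuardedReplace {

    // Case-insensitive "contains" check, standing in for FindNoCase().
    public static boolean containsNoCase(String text, String target) {
        int last = text.length() - target.length();
        for (int i = 0; i <= last; i++) {
            if (text.regionMatches(true, i, target, 0, target.length())) {
                return true;
            }
        }
        return false;
    }

    // Case-insensitive replace of every occurrence, standing in for
    // ReplaceNoCase(..., "ALL"). The target is treated as literal text.
    public static String replaceNoCase(String text, String target, String replacement) {
        return Pattern.compile(Pattern.quote(target), Pattern.CASE_INSENSITIVE)
                .matcher(text)
                .replaceAll(Matcher.quoteReplacement(replacement));
    }

    // Only pay for the replace when the search finds the target.
    public static String guardedReplace(String line, String target, String replacement) {
        if (target.isEmpty() || !containsNoCase(line, target)) {
            return line; // nothing to do, return the line untouched
        }
        return replaceNoCase(line, target, replacement);
    }

    public static void main(String[] args) {
        System.out.println(guardedReplace("some TEXT TO REPLACE here", "text to replace", "My new text"));
        System.out.println(guardedReplace("nothing to change", "text to replace", "My new text"));
    }
}
```

The guard only helps when most lines do not contain the target, which matches the case described above where the sub-string is usually absent.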
Re: Help with speeding up a string replace on a large file.
I think another option would be ReplaceList(). Although the function itself loops sequentially, maybe there's a gain to be had by using a native function rather than a hand-crafted loop, but I dunno. It'd be easy to benchmark, anyway.

(On second thought, ReplaceList is implicitly scoped to "all," which seems inefficient in this case, and my whole idea is probably crap. ;-)

    // String with placeholders
    inStr = "Your name is {firstName} {midInit} {lastName}";
    // Placeholders
    varNameList = "{firstName},{midInit},{lastName}";
    // Replacement values
    varValList = "#q.firstName#,#q.midInit#,#q.lastName#";
    // Perform replacement
    outStr = ReplaceList(inStr, varNameList, varValList);

Jamie

On Tue, 10 Feb 2004 16:24:22 -0700, in cf-talk you wrote:

>We have a case where we have 30+ instances of text in an RTF document that need to be replaced with specific database values. The routine we have does work, but it's pretty slow. In one case, we have a 1+ meg file, that we need to do the 30 replacements on via the REPLACENOCASE function. This file is taking approx 3-5 minutes to open. We've explored using REREPLACE instead, but are not seeing any noticeable speed improvement.
>
>The algorithm we have is something like so:
>
>1. read file into a memory variable via the CFFile tag with action=""
>2. Loop through our elements to replace
>3. do a replace on the file in memory for the current element
>4. End loop
>5. Write the new file out to a temporary file (which we manage through another process).
>
>We've used the GetTickCount function to narrow down exactly where the delay is, and it's in the loop - not on the read and write commands.
>
>I'm sure there's a better way to do this - we're looking for a sub 10 second response time, if possible. Any suggestions?
>
>Shawn
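To make the "loops sequentially" point concrete, here is a rough Java sketch of what a ReplaceList-style call amounts to. It assumes ReplaceList is case-sensitive, applies the pairs one after another to the running result, and treats a missing value as an empty string; that reading of the CF function is an assumption, and the class, method, and sample values are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of sequential list replacement; not CF's implementation.
public final class ReplaceListSketch {

    public static String replaceList(String text, List<String> targets, List<String> values) {
        String result = text;
        for (int i = 0; i < targets.size(); i++) {
            String target = targets.get(i);
            if (target.isEmpty()) {
                continue; // skip empty placeholders
            }
            // Assumption: a placeholder without a matching value is removed.
            String value = i < values.size() ? values.get(i) : "";
            // Literal, case-sensitive replacement of every occurrence;
            // each pass scans the whole string again.
            result = result.replace(target, value);
        }
        return result;
    }

    public static void main(String[] args) {
        String inStr = "Your name is {firstName} {midInit} {lastName}";
        List<String> names = Arrays.asList("{firstName}", "{midInit}", "{lastName}");
        List<String> vals = Arrays.asList("Ada", "M", "Lovelace"); // made-up sample values
        System.out.println(replaceList(inStr, names, vals));
    }
}
```

Each placeholder still costs one full pass over the text, so with 30 placeholders on a 1+ meg file this is 30 scans, the same order of work as the hand-written loop.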
RE: Help with speeding up a string replace on a large file.
If you're on a Windows platform, I have some CFX tags that may be of help. Just experimenting, I can alter a 1.3 MB file in under 4 seconds with the following code.

    text", "ALL"))>

Paul

From: Jim McAtee [mailto:[EMAIL PROTECTED]]
Sent: Wednesday, 11 February 2004 19:50
To: CF-Talk
Subject: Re: Help with speeding up a string replace on a large file.

If you don't need the case-insensitivity of ReplaceNoCase(), you might try ReplaceList() instead. This will eliminate the loop. Not sure how much you'll gain in speed, but letting CF loop over the elements should be faster than doing it yourself.
Re: Help with speeding up a string replace on a large file.
If you don't need the case-insensitivity of ReplaceNoCase(), you might try ReplaceList() instead. This will eliminate the loop. Not sure how much you'll gain in speed, but letting CF loop over the elements should be faster than doing it yourself.
RE: Help with speeding up a string replace on a large file.
1) Use Replace() instead of ReplaceNoCase(), as it's much faster.
2) Recreate your loop and replacing, etc., with cfscript instead of tags.
3) For your elements to replace, if you are using a list and list functions like listGetAt(), etc., switch to an array.

After that, if you still need more speed, you can do it directly in Java for another minor gain.

-nathan strutz
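As one possible shape for the "directly in Java" route, the sketch below does all the replacements in a single pass over the text instead of one pass per element, by building one case-insensitive alternation from every target. This is a swapped-in technique, not something anyone in the thread has tested; the class name, method name, and sample data are hypothetical. Note that, unlike a sequential loop, text produced by one replacement is never rescanned for the other targets.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical single-pass, case-insensitive multi-replace.
public final class SinglePassReplace {

    public static String replaceAllNoCase(String text, Map<String, String> replacements) {
        // Index replacement values by lower-cased target, skipping empty targets.
        Map<String, String> byLowerKey = new HashMap<>();
        for (Map.Entry<String, String> e : replacements.entrySet()) {
            if (!e.getKey().isEmpty()) {
                byLowerKey.put(e.getKey().toLowerCase(Locale.ROOT), e.getValue());
            }
        }
        if (byLowerKey.isEmpty()) {
            return text;
        }

        // One alternation of all literal targets; longer targets first so a
        // target that is a prefix of another does not shadow it.
        String alternation = byLowerKey.keySet().stream()
                .sorted((a, b) -> Integer.compare(b.length(), a.length()))
                .map(Pattern::quote)
                .collect(Collectors.joining("|"));
        Matcher m = Pattern.compile(alternation, Pattern.CASE_INSENSITIVE).matcher(text);

        // Walk the text once, swapping in the value for whichever target matched.
        StringBuffer out = new StringBuffer(text.length());
        while (m.find()) {
            String value = byLowerKey.get(m.group().toLowerCase(Locale.ROOT));
            m.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> values = new LinkedHashMap<>();
        values.put("{firstName}", "Ada");      // made-up sample data
        values.put("{lastName}", "Lovelace");
        System.out.println(replaceAllNoCase("Dear {FIRSTNAME} {lastName}", values));
    }
}
```

With 30+ elements the text is scanned once rather than 30 times, which is where any gain would come from; how large that gain is on a 1+ meg RTF file would have to be measured.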
RE: Help with speeding up a string replace on a large file.
With a 1 meg+ file, you want it sub 10 seconds? I think you're hoping for far too much...

Consider that the OS has to pass that file to CF, which, for a file of that size, is going to take more than 10 seconds anyway. Try opening a 1 meg+ file in Notepad: it'll take a LOT more than 10 seconds just to do that...

Also, normally, a Replace() will be faster than a REReplace(), so stick with that.
Help with speeding up a string replace on a large file.
We have a case where we have 30+ instances of text in an RTF document that need to be replaced with specific database values. The routine we have does work, but it's pretty slow. In one case, we have a 1+ meg file that we need to do the 30 replacements on via the REPLACENOCASE function. This file is taking approx 3-5 minutes to open. We've explored using REREPLACE instead, but are not seeing any noticeable speed improvement.

The algorithm we have is something like so:

1. read file into a memory variable via the CFFile tag with action=""
2. Loop through our elements to replace
3. do a replace on the file in memory for the current element
4. End loop
5. Write the new file out to a temporary file (which we manage through another process).

We've used the GetTickCount function to narrow down exactly where the delay is, and it's in the loop - not on the read and write commands.

I'm sure there's a better way to do this - we're looking for a sub 10 second response time, if possible. Any suggestions?

Shawn