RE: Help with speeding up a string replace on a large file.

2004-02-11 Thread Paul Vernon
In my tests, checking each piece of text with FindNoCase before calling
ReplaceNoCase speeds the process up, because FindNoCase is substantially
quicker than ReplaceNoCase when the sub-string doesn't exist in the
original text. Using the code below, the whole process completes in
around 15 seconds. It's not sub 10 seconds, but it is pretty close, and
certainly better than 3 to 5 minutes.

 
The example file I used was 1.3 MB, consisting of 18875 lines (an Undernet
channel dump), so the loop runs 18875 times...
 
  
  
   MyQuery = QueryNew("NewLine");
   QueryAddRow(MyQuery, CFX_ReadLn.LineCount);
   for (counter = 1; counter LTE CFX_ReadLn.LineCount; counter = counter + 1)
   {
      tmpLine = ReadLn.Line[counter];

      // Only call ReplaceNoCase when the cheaper FindNoCase finds a match
      if (FindNoCase(" Text to replace ", tmpLine))
         tmpLine = ReplaceNoCase(tmpLine, " Text to replace ", "My new text", "ALL");

      if (FindNoCase(" Different stuff to replace ", tmpLine))
         tmpLine = ReplaceNoCase(tmpLine, " Different stuff to replace ", "Some other text", "ALL");

      QuerySetCell(MyQuery, "NewLine", tmpLine, counter);
   }
  
   

The benchmarks I get are as follows (all times are in ms)

 
A - ReadLn
B - Loop
C - Query2File

 
               A      B        C
Iteration 1    131    14490    311
Iteration 2    130    14312    300
Iteration 3    130    14352    310
Iteration 4    130    14501    311
Iteration 5    130    14500    331

Average        130    14431    313

 
That's roughly 0.79 milliseconds per line (about 14874 ms in total across 18875 lines).

 
If I run the code without the FindNoCase statements, the whole process takes
on average 9 seconds longer to complete.
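
A rough way to reproduce this comparison without the CFX tags is sketched below. The sample lines, the line count, and the variable names are made up for illustration; only FindNoCase, ReplaceNoCase, and the idea of timing with GetTickCount come from the thread. Actual timings will depend on the server and on how often the search text really occurs.

```cfml
<cfscript>
// Hypothetical test data: 5000 lines that do NOT contain the search text,
// which is the case where the FindNoCase guard is reported to help most.
lines = ArrayNew(1);
for (i = 1; i LTE 5000; i = i + 1) {
    ArrayAppend(lines, "Line #i# of a made-up channel dump");
}
lineCount = ArrayLen(lines);
needle = " Text to replace ";

// Pass 1: always call ReplaceNoCase.
startTick = GetTickCount();
for (i = 1; i LTE lineCount; i = i + 1) {
    tmpLine = ReplaceNoCase(lines[i], needle, "My new text", "ALL");
}
unguardedMs = GetTickCount() - startTick;

// Pass 2: call ReplaceNoCase only when FindNoCase reports a match.
startTick = GetTickCount();
for (i = 1; i LTE lineCount; i = i + 1) {
    tmpLine = lines[i];
    if (FindNoCase(needle, tmpLine)) {
        tmpLine = ReplaceNoCase(tmpLine, needle, "My new text", "ALL");
    }
}
guardedMs = GetTickCount() - startTick;

WriteOutput("Unguarded: #unguardedMs# ms; guarded: #guardedMs# ms");
</cfscript>
```

If most lines do contain the search text, the guard adds an extra scan per line, so it is worth measuring against representative data.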

 
Paul




Re: Help with speeding up a string replace on a large file.

2004-02-11 Thread Jamie Jackson
I think another option would be ReplaceList(). Although the function
itself loops sequentially, maybe there's a gain to be had by using a
native function rather than a hand-crafted loop, but I dunno. It'd be
easy to benchmark, anyway. 

(On second thought, ReplaceList is implicitly scoped to "all," which
seems inefficient in this case, and my whole idea is probably crap.
;-)

// String with placeholders
inStr = "Your name is {firstName} {midInit} {lastName}";
// Placeholders
varNameList = "{firstName},{midInit},{lastName}";
// Replacement Values
varValList = "#q.firstName#,#q.midInit#,#q.lastName#";
// Perform replacement
outStr = ReplaceList(inStr, varNameList, varValList);

Jamie

On Tue, 10 Feb 2004 16:24:22 -0700, in cf-talk you wrote:

>We have a case where we have 30+ instances of text in an RTF document that need to be replaced with specific database values.  The routine we have does work, but it's pretty slow.  In one case, we have a 1+ meg file, that we need to do the 30 replacements on via the REPLACENOCASE function.  This file is taking approx 3-5 minutes to open.  We've explored using REREPLACE instead, but are not seeing any noticeable speed improvement.
>
>The algorithm we have is something like so:
>
>1. read file into a memory variable via the CFFile tag with action="">
>2. Loop through our elements to replace
>3.    do a replace on the file in memory for the current element
>4. End loop
>5. Write the new file out to a temporary file (which we manage through another process).
>
>We've used the GetTickCount function to narrow down exactly where the delay is, and it's in the loop - not on the read and write commands.
>
>I'm sure there's a better way to do this - we're looking for a sub 10 second response time, if possible.  Any suggestions?
>
>Shawn
>
>
>




RE: Help with speeding up a string replace on a large file.

2004-02-11 Thread Paul Vernon
If you're on a Windows platform, I have some CFX tags that may be of help.
Just experimenting, I can alter a 1.3 MB file in under 4 seconds with the
following code.

 
  
  
  
   
   
[code not preserved in the archive; only the fragment  text", "ALL"))>  survives]
  
   

Paul


From: Jim McAtee [mailto:[EMAIL PROTECTED] 
Sent: Wednesday, 11 February 2004 19:50
To: CF-Talk
Subject: Re: Help with speeding up a string replace on a large file.

If you don't need the case-insensitivity of ReplaceNoCase(), you might try
ReplaceList() instead.  This will eliminate the loop.  Not sure how much
you'll gain in speed, but letting CF loop over the elements should be faster
than doing it yourself.

- Original Message - 
From: "Shawn Grover" <[EMAIL PROTECTED]>
To: "CF-Talk" <[EMAIL PROTECTED]>
Sent: Tuesday, February 10, 2004 4:24 PM
Subject: Help with speeding up a string replace on a large file.

> We have a case where we have 30+ instances of text in an RTF document that
> need to be replaced with specific database values.  The routine we have does
> work, but it's pretty slow.  In one case, we have a 1+ meg file, that we need
> to do the 30 replacements on via the REPLACENOCASE function.  This file is
> taking approx 3-5 minutes to open.  We've explored using REREPLACE instead,
> but are not seeing any noticeable speed improvement.
>
> The algorithm we have is something like so:
>
> 1. read file into a memory variable via the CFFile tag with action="">
> 2. Loop through our elements to replace
> 3.    do a replace on the file in memory for the current element
> 4. End loop
> 5. Write the new file out to a temporary file (which we manage through
> another process).
>
> We've used the GetTickCount function to narrow down exactly where the delay
> is, and it's in the loop - not on the read and write commands.
>
> I'm sure there's a better way to do this - we're looking for a sub 10
> second response time, if possible.  Any suggestions?




Re: Help with speeding up a string replace on a large file.

2004-02-11 Thread Jim McAtee
If you don't need the case-insensitivity of ReplaceNoCase(), you might try
ReplaceList() instead.  This will eliminate the loop.  Not sure how much
you'll gain in speed, but letting CF loop over the elements should be faster
than doing it yourself.
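
As a small illustration of the case-sensitivity point, the sketch below uses a made-up template string and made-up values (none of these names come from the original routine). It assumes the default comma delimiter for both lists.

```cfml
<cfscript>
// Hypothetical template text and values, for illustration only.
inStr = "Dear {Name}, your order {ID} has shipped.";

// ReplaceList is case-sensitive: "{ID}" is found, but "{name}" does not
// match "{Name}", so that placeholder is left untouched.
outStr = ReplaceList(inStr, "{name},{ID}", "Alice,42");

WriteOutput(outStr);
</cfscript>
```

Because both arguments are comma-delimited lists, a database value that itself contains a comma would be split into two list elements and shift the remaining replacements, so this approach probably needs the values checked or escaped first.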

- Original Message - 
From: "Shawn Grover" <[EMAIL PROTECTED]>
To: "CF-Talk" <[EMAIL PROTECTED]>
Sent: Tuesday, February 10, 2004 4:24 PM
Subject: Help with speeding up a string replace on a large file.

> We have a case where we have 30+ instances of text in an RTF document that
> need to be replaced with specific database values.  The routine we have does
> work, but it's pretty slow.  In one case, we have a 1+ meg file, that we need
> to do the 30 replacements on via the REPLACENOCASE function.  This file is
> taking approx 3-5 minutes to open.  We've explored using REREPLACE instead,
> but are not seeing any noticeable speed improvement.
>
> The algorithm we have is something like so:
>
> 1. read file into a memory variable via the CFFile tag with action="">
> 2. Loop through our elements to replace
> 3.    do a replace on the file in memory for the current element
> 4. End loop
> 5. Write the new file out to a temporary file (which we manage through
> another process).
>
> We've used the GetTickCount function to narrow down exactly where the delay
> is, and it's in the loop - not on the read and write commands.
>
> I'm sure there's a better way to do this - we're looking for a sub 10
> second response time, if possible.  Any suggestions?




RE: Help with speeding up a string replace on a large file.

2004-02-11 Thread Nathan Strutz
1) Use Replace() instead of ReplaceNoCase(), as it's much faster
2) Rewrite your loop and the replacing in cfscript instead of tags
3) If your elements to replace are in a list that you read with list
functions like ListGetAt(), switch to an array

After that, if you still need more speed, you can do it directly in Java for
another minor gain.
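
A minimal sketch of the three suggestions combined might look like the following. The placeholders, values, and variable names are hypothetical, and ListToArray is used only once, up front, to build the arrays; the loop itself indexes arrays rather than calling list functions on every pass.

```cfml
<cfscript>
// Hypothetical placeholders and replacement values in two parallel arrays.
findArr  = ListToArray("{firstName},{midInit},{lastName}");
valueArr = ListToArray("Pat,Q,Example");

// Hypothetical document text standing in for the file contents.
docText = "Your name is {firstName} {midInit} {lastName}";

pairCount = ArrayLen(findArr);
for (i = 1; i LTE pairCount; i = i + 1) {
    // Case-sensitive Replace() with scope "ALL" instead of ReplaceNoCase().
    docText = Replace(docText, findArr[i], valueArr[i], "ALL");
}

WriteOutput(docText);
</cfscript>
```

This only works if the placeholders in the document always use the same case as the search strings, which is the trade-off of dropping ReplaceNoCase().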

-nathan strutz

  -Original Message-
  From: Shawn Grover [mailto:[EMAIL PROTECTED]
  Sent: Tuesday, February 10, 2004 4:24 PM
  To: CF-Talk
  Subject: Help with speeding up a string replace on a large file.

  We have a case where we have 30+ instances of text in an RTF document that
need to be replaced with specific database values.  The routine we have does
work, but it's pretty slow.  In one case, we have a 1+ meg file, that we
need to do the 30 replacements on via the REPLACENOCASE function.  This file
is taking approx 3-5 minutes to open.  We've explored using REREPLACE
instead, but are not seeing any noticeable speed improvement.

  The algorithm we have is something like so:

  1. read file into a memory variable via the CFFile tag with action="">
  2. Loop through our elements to replace
  3.    do a replace on the file in memory for the current element
  4. End loop
  5. Write the new file out to a temporary file (which we manage through
another process).

  We've used the GetTickCount function to narrow down exactly where the
delay is, and it's in the loop - not on the read and write commands.

  I'm sure there's a better way to do this - we're looking for a sub 10
second response time, if possible.  Any suggestions?

  Shawn




RE: Help with speeding up a string replace on a large file.

2004-02-11 Thread Philip Arnold
With a 1meg+ file, you want it sub 10 seconds?

I think you're hoping for far too much... Consider that the OS has to
pass that file to CF, which, for a file of that size, is going to take
more than 10 seconds anyway

Try opening a 1meg+ file in Notepad - it'll take a LOT more than 10
seconds just to do that...

Also, normally, a Replace() will be faster than a REReplace(), so stick
with that

> We have a case where we have 30+ instances of text in an RTF 
> document that need to be replaced with specific database 
> values.  The routine we have does work, but it's pretty slow. 
>  In one case, we have a 1+ meg file, that we need to do the 
> 30 replacements on via the REPLACENOCASE function.  This file 
> is taking approx 3-5 minutes to open.  We've explored using 
> REREPLACE instead, but are not seeing any noticeable speed improvement.
> 
> The algorithm we have is something like so:
> 
> 1. read file into a memory variable via the CFFile tag with 
> action="" 
> 2. Loop through our elements to replace
> 3.    do a replace on the file in memory for the current element
> 4. End loop
> 5. Write the new file out to a temporary file (which we 
> manage through another process).
> 
> We've used the GetTickCount function to narrow down exactly 
> where the delay is, and it's in the loop - not on the read 
> and write commands.
> 
> I'm sure there's a better way to do this - we're looking for 
> a sub 10 second response time, if possible.  Any suggestions?




Help with speeding up a string replace on a large file.

2004-02-11 Thread Shawn Grover
We have a case where we have 30+ instances of text in an RTF document that need to be replaced with specific database values.  The routine we have does work, but it's pretty slow.  In one case, we have a 1+ meg file, that we need to do the 30 replacements on via the REPLACENOCASE function.  This file is taking approx 3-5 minutes to open.  We've explored using REREPLACE instead, but are not seeing any noticeable speed improvement.

The algorithm we have is something like so:

1. read file into a memory variable via the CFFile tag with action="">
2. Loop through our elements to replace
3.    do a replace on the file in memory for the current element
4. End loop
5. Write the new file out to a temporary file (which we manage through another process).

We've used the GetTickCount function to narrow down exactly where the delay is, and it's in the loop - not on the read and write commands.
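
For readers who want to reproduce the measurement, the five steps and the GetTickCount timing could be sketched roughly as follows. This is not the actual routine: the file paths, placeholders, and values are hypothetical, a tiny sample file is written first so the sketch runs on its own, and action="read" is an assumption because the action value is missing from the post.

```cfml
<!--- Create a small sample file so the sketch is runnable (hypothetical path and content). --->
<cfset srcFile = GetTempDirectory() & "replace_sample.rtf">
<cfset outFile = GetTempDirectory() & "replace_sample_out.rtf">
<cffile action="write" file="#srcFile#" output="Dear {FirstName} {LastName}, ref {OrderID}">

<!--- Step 1: read the file into a variable (action="read" is assumed). --->
<cfset tickRead = GetTickCount()>
<cffile action="read" file="#srcFile#" variable="docText">
<cfset readMs = GetTickCount() - tickRead>

<!--- Steps 2-4: one ReplaceNoCase over the whole text per element. --->
<cfscript>
findArr  = ListToArray("{firstname},{lastname},{orderid}");
valueArr = ListToArray("Pat,Example,1001");
elementCount = ArrayLen(findArr);

tickLoop = GetTickCount();
for (i = 1; i LTE elementCount; i = i + 1) {
    docText = ReplaceNoCase(docText, findArr[i], valueArr[i], "ALL");
}
loopMs = GetTickCount() - tickLoop;
</cfscript>

<!--- Step 5: write the result out to a temporary file. --->
<cfset tickWrite = GetTickCount()>
<cffile action="write" file="#outFile#" output="#docText#">
<cfset writeMs = GetTickCount() - tickWrite>

<cfoutput>read: #readMs# ms, replace loop: #loopMs# ms, write: #writeMs# ms</cfoutput>
```

Timing each step separately like this is what isolates the replace loop from the read and write.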

I'm sure there's a better way to do this - we're looking for a sub 10 second response time, if possible.  Any suggestions?

Shawn