All right... I tweaked a little more outside of email.
If you want each line counted only once, even when "mystic_mouse" occurs several times on it, uncomment the line:
"add offset(return, thisChunk, theOffset) to theOffset"

That line skips ahead to the end of the current line whenever a match is found, so the search resumes on the next line.

This should run faster than my previous attempts:

on startup
  ## initialize variables: try adjusting chunkSize
  put "/gig/tmp/log/access_log" into the_file
  put ($1*1024*1024) into chunkSize ## $1 is the chunk size in MB
  put 0 into counter
  put FALSE into isEOF
 
  open file the_file for read
 
  repeat until (isEOF = TRUE)
    ## read the next chunk of chunkSize characters, check if we are at the end of the file
    read from file the_file for chunkSize
    put it into thisChunk
    put (the result = "eof") into isEOF
   
    ## count the number of matches in this chunk
    ## theOffset is the number of characters to skip before searching
    put 0 into theOffset
    repeat
      get offset("mystic_mouse", thisChunk, theOffset)
      if (it = 0) then exit repeat
      ## only count once a match has actually been found
      add 1 to counter
      ## skip past the 12-character match (it is relative to the skip)
      put theOffset + it + 11 into theOffset
      ## add offset(return, thisChunk, theOffset) to theOffset
    end repeat
   
  end repeat
 
  close file the_file

  put counter
end startup
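
If the skip arithmetic in the inner loop is hard to follow, here is a small sketch of the same idea as a standalone function you can try on a short string. The names countMatches, pNeedle, pText, pOncePerLine and tSample are made up for illustration and are not part of the script above. It relies on offset() returning a position relative to the characters skipped, and 0 when there is no further match.

```livecode
function countMatches pNeedle, pText, pOncePerLine
  put 0 into tCount
  put 0 into tSkip ## number of characters to skip before searching
  repeat
    get offset(pNeedle, pText, tSkip)
    if it = 0 then exit repeat
    add 1 to tCount
    ## move the skip to the last character of the match just found
    put tSkip + it + (the length of pNeedle) - 1 into tSkip
    if pOncePerLine is true then
      ## jump to the end of the current line, if there is one
      add offset(return, pText, tSkip) to tSkip
    end if
  end repeat
  return tCount
end countMatches
```

For example:

```livecode
put "mystic_mouse a mystic_mouse" & return & "mystic_mouse" into tSample
put countMatches("mystic_mouse", tSample, false) ## every occurrence: 3
put countMatches("mystic_mouse", tSample, true) ## lines with a match: 2
```

Note that neither this sketch nor the script above catches a match that happens to straddle two chunks.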

HTH.
Brian
