All right... I tweaked a little more outside of email.
For accuracy when "mystic_mouse" occurs multiple times on one line, uncomment this line:
"add offset(return, thisChunk, theOffset) to theOffset"
It skips ahead to the next line whenever a match is found, so each line is counted only once.
This should run faster than my previous attempts:
on startup
  ## initialize variables: try adjusting chunkSize (passed in as $1, in MB)
  put "/gig/tmp/log/access_log" into the_file
  put ($1*1024*1024) into chunkSize ## this is for MB
  put 0 into counter
  put FALSE into isEOF
  open file the_file for read
  repeat until (isEOF = TRUE)
    ## read the next chunkSize characters, check if we are at the end of the file
    read from file the_file for chunkSize
    put it into thisChunk
    put (the result = "eof") into isEOF
    ## count the number of matches in this chunk
    put 0 into theOffset
    repeat
      get offset("mystic_mouse", thisChunk, theOffset)
      if (it = 0) then exit repeat
      ## count only real matches, so a chunk with no match adds nothing
      add 1 to counter
      ## skip to the last character of this 12-character match
      put theOffset + it + 11 into theOffset
      ## add offset(return, thisChunk, theOffset) to theOffset
    end repeat
  end repeat
  close file the_file
  put counter
end startup
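
One case the chunked read does not cover is a match that straddles two chunks: each half lands in a different read, so neither chunk contains the whole string. Below is a minimal, untested sketch of one way around that, which carries the last few characters of each chunk over into the next one. The function name countInFile and its parameter names are made up for illustration. It counts occurrences rather than lines, and it assumes the search string is at least two characters long.

```livecode
## Sketch only: countInFile, pFile, pNeedle and pChunkSize are hypothetical names.
function countInFile pFile, pNeedle, pChunkSize
  put 0 into tCount
  put empty into tCarry
  ## a split match can leave at most length - 1 characters in the old chunk
  put length(pNeedle) - 1 into tKeep
  open file pFile for read
  repeat forever
    read from file pFile for pChunkSize
    put (the result = "eof") into tIsEOF
    ## prepend the tail kept from the previous chunk
    put tCarry & it into tChunk
    put 0 into tSkip
    repeat forever
      get offset(pNeedle, tChunk, tSkip)
      if it = 0 then exit repeat
      add 1 to tCount
      ## move tSkip to the last character of this match
      put tSkip + it + tKeep into tSkip
    end repeat
    if tIsEOF then exit repeat
    ## keep the last tKeep characters; too short to hold a whole match,
    ## so nothing already counted can be counted again
    if length(tChunk) > tKeep then
      put char (length(tChunk) - tKeep + 1) to -1 of tChunk into tCarry
    else
      put tChunk into tCarry
    end if
  end repeat
  close file pFile
  return tCount
end countInFile
```

With that in place, the body of the startup handler could shrink to something like:
put countInFile("/gig/tmp/log/access_log", "mystic_mouse", $1*1024*1024)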
HTH.
Brian