Hi Brian!

A file object has two methods for this: tell() returns the current
position in the file, and seek() sets the offset:


>>> f=open(".emacs", "r" )
>>> help(f.tell)
Help on built-in function tell:

tell(...)
    tell() -> current file position, an integer (may be a long integer).

>>> help( f.seek )
Help on built-in function seek:

seek(...)
    seek(offset[, whence]) -> None.  Move to new file position.

    Argument offset is a byte count.  Optional argument whence defaults to
    0 (offset from start of file, offset should be >= 0); other values are 1
    (move relative to current position, positive or negative), and 2 (move
    relative to end of file, usually negative, although many platforms allow
    seeking beyond the end of a file).  If the file is opened in text mode,
    only offsets returned by tell() are legal.  Use of other offsets causes
    undefined behavior.
    Note that not all file objects are seekable.
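

Putting the two together, here is a minimal sketch of the bookmark pattern
you describe. The file names tux.log and tux.offset are my assumptions, not
from your post; opening in binary mode sidesteps the text-mode seek
restriction mentioned in the help text above:

```python
# Illustrative sketch: remember how far we read a growing log file,
# and resume from that byte offset on the next run.
# LOG_PATH / STATE_PATH are assumed names, not anything standard.

LOG_PATH = "tux.log"        # the live log file (assumption)
STATE_PATH = "tux.offset"   # small file holding the last-read offset (assumption)

def load_offset():
    """Return the saved byte offset, or 0 on the first run."""
    try:
        with open(STATE_PATH) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return 0

def save_offset(offset):
    """Persist the offset so the next run can resume there."""
    with open(STATE_PATH, "w") as f:
        f.write(str(offset))

def read_new_lines():
    """Read only the lines appended since the last run."""
    offset = load_offset()
    with open(LOG_PATH, "rb") as f:   # binary mode: any offset is legal
        f.seek(0, 2)                  # whence=2: jump to end of file
        if f.tell() < offset:         # file shrank -> it was rotated
            offset = 0                # start over from the beginning
        f.seek(offset)                # resume at the bookmark
        lines = f.readlines()
        save_offset(f.tell())         # update the bookmark
    return lines
```

Run read_new_lines() from cron every 10 minutes and each call only sees
the bytes written since the previous call; the rotation check resets the
bookmark when the log is replaced by a shorter file.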


HTH Ewald


Brian Gustin wrote:
> Hi. This is one I can't seem to find a solid answer on:
> 
> First, some background:
> I have a log file that I wrote a Python parser for, and it works
> great, but in the interest of saving time and memory, and also to be
> able to read the currently active log file, say every 10 minutes, and
> update the static file, I was trying to find some way to get
> python to do this:
> 
> Open the log file, read lines up to end of file, and *very important* make a
> note of the bytes read, and stash this somewhere (i.e. "mark" the file),
> then handle the parsing of said file until all lines have been read
> and parsed, write the new files, and close the handler.
>   Say, 10 minutes later, for example, the script would then check the
> bytes read, and *very important* start reading the file *from* the
> point it marked (i.e. pick up at the point it bookmarked) and read from
> that point.
> Since the log file will be active (webserver log file) it will have new
> data to be read, but I don't want to have to read through the *entire*
> log file all over again just to get to the new data - I want to be able
> to "bookmark" where the log file was read "up to" last time, and then
> open the file later at that point.
> 
> My current script works well, but only reads the "day old" log file
> (post log rotate), and it does very well, parsing as much as 3 GB in as
> little as 2 minutes if the server isn't heavily loaded when the parser
> runs.   Basically, the webserver runs Tux, which writes a log file for
> *all* domains on a server, and the script takes the Tux log and parses
> it, extracting the domain the log entry is for, and writing a
> new line into the domain's Apache-format CLF log file (this way we are
> able to run awstats on individual domains and get relatively accurate
> stats).
> 
> So.. my question is- is there any way to do what I want ?
> 
> Open a live log file, read lines to x bytes (say 845673231 bytes),
> make a note of this, and 10 minutes later open the same file again *AT*
> 845673232 bytes - starting with the next byte after the bookmarked
> point, read to end of file, and update the bookmark.
> 
> 
> Thanks for any pointers- Advice appreciated.
> Bri!
> 
> _______________________________________________
> Tutor maillist  -  Tutor@python.org
> http://mail.python.org/mailman/listinfo/tutor
> 
> 


-- 
Ing. Ewald Ertl         HartterGruppe                   Phone : +43-3352-33085-558
trinomic Projektmanagement & Informationstechnik GmbH   Fax   : +43-3352-33085-600
Wiener Straße 41                                        mailto:[EMAIL PROTECTED]
A-7400 Oberwart         http://www.trinomic.com         mailto:[EMAIL PROTECTED]
