On Mon, 7 Feb 2005, Reed L. O'Brien wrote:
I want to read the httpd-access.log and remove any oversized log records.
I quickly tossed this script together. I manually mv-ed log to log.bak
and touched a new logfile. Running the following with print i uncommented
does print each line to stdout, but it doesn't write to the appropriate
file...
Hello!
Let's take a look at the program again:
###
import os
srcfile = open('/var/log/httpd-access.log.bak', 'r')
dstfile = open('/var/log/httpd-access.log', 'w')
while 1:
    lines = srcfile.readlines()
    if not lines: break
    for i in lines:
        if len(i) < 2086:
            dstfile.write(i)
srcfile.close()
dstfile.close()
###
a) what am I missing?
b) is there a less expensive way to do it?
Hmmm... I don't see anything offhand that prevents httpd-access.log from
containing the lines you expect. Do you get any error messages, like
permission problems, when you run the program?

Can you show us how you are running the program, and how you are checking
that the resulting file is empty?
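One way to check the result from within Python is sketched below. The helper name describe_file is made up for illustration; it only reports the size in bytes and the number of lines, which is enough to tell an empty file from a filtered one.

```python
import os

def describe_file(path):
    """Return (size_in_bytes, line_count) for the file at `path`.

    `describe_file` is a hypothetical helper, not part of the original script.
    """
    size = os.path.getsize(path)
    count = 0
    with open(path, 'r') as f:
        for _ in f:
            count += 1
    return size, count
```

Calling describe_file('/var/log/httpd-access.log') right after the script finishes should show a nonzero size if any lines were written. If open() or getsize() raises an error instead, the message usually points at a permission or path problem.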
Addressing the question on efficiency and expense: yes. The program at
the moment tries to read all lines into memory at once, and this is
expensive if the file is large. Let's fix this.

In recent versions of Python (2.2 and later), we can modify file-handling
code from:
###
lines = somefile.readlines()
for line in lines:
    ...
###
to this:
###
for line in somefile:
    ...
###
That is, we don't need to extract a list of 'lines' out of a file.
Python allows us to loop directly across a file object. We can find more
details about this in the documentation on "Iterators" (PEP 234):

http://www.python.org/peps/pep-0234.html

Iterators are a good thing to know, since Python's iterators are deeply
rooted in the language design. (Even if they were retroactively
embedded. *grin*)
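To make the iterator idea a bit more concrete, here is a small sketch. It uses io.StringIO as a stand-in for a real log file (an assumption made so the example runs anywhere), and the function name first_line_then_rest is invented for illustration.

```python
import io

def first_line_then_rest(fileobj):
    """Pull one line by hand with next(), then let a loop consume the rest."""
    it = iter(fileobj)        # ask the file-like object for an iterator
    first = next(it)          # fetch a single line on demand
    rest = [line for line in it]  # the loop picks up where next() left off
    return first, rest

# Stand-in for an open log file; a real file object behaves the same way.
sample = io.StringIO("one\ntwo\nthree\n")
first, rest = first_line_then_rest(sample)
```

Only one line is held at a time while iterating, which is why looping over the file directly is cheaper than building the whole list with readlines().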
A few more comments: the while loop appears unnecessary, since on the
second run through the loop, we'll have already read all the lines out of
the file. (I am assuming that nothing is writing to the backup file at
the time.) If the body of a while loop just runs once, we don't need a
loop.

This simplifies the code down to:
###
srcfile = open('/var/log/httpd-access.log.bak', 'r')
dstfile = open('/var/log/httpd-access.log', 'w')
for line in srcfile:
    if len(line) < 2086:
        dstfile.write(line)
srcfile.close()
dstfile.close()
###
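The same filtering can also be wrapped in a function, roughly as sketched here. This assumes a newer Python that supports the with statement, and the name copy_short_lines, its max_len parameter, and the returned counts are all additions for illustration rather than part of the original script.

```python
def copy_short_lines(src_path, dst_path, max_len=2086):
    """Copy lines shorter than `max_len` characters from src_path to dst_path.

    Returns (kept, dropped) so the caller can see what happened.
    """
    kept = dropped = 0
    # Both files are closed automatically, even if an error occurs midway.
    with open(src_path, 'r') as src, open(dst_path, 'w') as dst:
        for line in src:
            if len(line) < max_len:
                dst.write(line)
                kept += 1
            else:
                dropped += 1
    return kept, dropped
```

Printing the returned pair after a call such as copy_short_lines('/var/log/httpd-access.log.bak', '/var/log/httpd-access.log') gives a quick sanity check: if kept is nonzero but the destination still looks empty, the problem probably lies in how the file is being inspected rather than in the loop.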
I don't see anything else here that causes the file writing to fail. If
you can tell us more information on how you're checking the program's
effectiveness, that may give us some more clues.

Best of wishes to you!