On Mon, 7 Feb 2005, Reed L. O'Brien wrote:
> I want to read the httpd-access.log and remove any oversized log records
>
> I quickly tossed this script together.  I manually mv-ed log to log.bak
> and touched a new logfile.
>
> running the following with print i uncommented does print each line to
> stdout.  but it doesn't write to the appropriate file...

Hello!

Let's take a look at the program again:

###
import os
srcfile = open('/var/log/httpd-access.log.bak', 'r')
dstfile = open('/var/log/httpd-access.log', 'w')
while 1:
    lines = srcfile.readlines()
    if not lines: break
    for i in lines:
        if len(i) < 2086:
            dstfile.write(i)
srcfile.close()
dstfile.close()
###

> a) what am I missing?
> b) is there a less expensive way to do it?

Hmmm... I don't see anything offhand that prevents httpd-access.log from
containing the lines you expect.  Do you get any error messages, like
permission problems, when you run the program?  Can you show us how you
are running the program, and how you are checking that the resulting file
is empty?

Addressing the question on efficiency and expense: yes.  The program at
the moment tries to read all lines into memory at once, and this is
expensive if the file is large.  Let's fix this.

In recent versions of Python (2.2 and later), we can modify file-handling
code from:

###
lines = somefile.readlines()
for line in lines:
    ...
###

to this:

###
for line in somefile:
    ...
###

That is, we don't need to extract a list of 'lines' out of a file.
Python allows us to loop directly across a file object.  We can find more
details about this in the documentation on "Iterators" (PEP 234):

    http://www.python.org/peps/pep-0234.html

Iterators are a good thing to know, since Python's iterators are deeply
rooted in the language design.  (Even if they were retroactively
embedded.  *grin*)

A few more comments: the while loop appears unnecessary, since on the
second run through the loop, we'll have already read all the lines out of
the file.  (I am assuming that nothing is writing to the backup file at
the time.)
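
To make the point about the second trip through the while loop concrete,
here is a small sketch.  It is written for a current Python, and the
temporary file and the helper name read_twice are made up for
illustration; they stand in for the real log so the example can run
anywhere.

```python
import os
import tempfile

def read_twice(path):
    """Call readlines() twice on one open file and return both results."""
    with open(path, 'r') as f:
        first = f.readlines()    # consumes the whole file
        second = f.readlines()   # already at end of file: nothing left
    return first, second

# Hypothetical stand-in for the log file: one short line, one long line.
fd, sample_path = tempfile.mkstemp(suffix='.log')
with os.fdopen(fd, 'w') as sample:
    sample.write('short line\n' + 'x' * 3000 + '\n')

first, second = read_twice(sample_path)
os.remove(sample_path)

print(len(first))   # number of lines seen by the first call
print(second)       # what the second call returned
```

The second call hands back an empty list, which is exactly what makes
"if not lines: break" fire on the second pass.
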
If the body of a while loop just runs once, we don't need a loop.  This
simplifies the code down to:

###
srcfile = open('/var/log/httpd-access.log.bak', 'r')
dstfile = open('/var/log/httpd-access.log', 'w')
for line in srcfile:
    if len(line) < 2086:
        dstfile.write(line)
srcfile.close()
dstfile.close()
###

I don't see anything else here that causes the file writing to fail.  If
you can tell us more information on how you're checking the program's
effectiveness, that may give us some more clues.

Best of wishes to you!

_______________________________________________
Tutor maillist  -  Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor