Chad Maine wrote:

For files of any substance, it's much better (cheaper and faster) to read
one line at a time into memory for parsing:


fileObject = open("myfile.log")
while True:
    line = fileObject.readline()
    if not line:
        break
    # parse the line here
fileObject.close()

Using xreadlines() has the same effect with more natural syntax. While the readlines() method reads an entire file at once, xreadlines() uses some behind-the-scenes magic to *look* like it's reading everything but actually only holds a line at a time in memory.
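As a rough sketch of the same line-at-a-time idea: in current Python, iterating over the file object itself behaves this way, and xreadlines() no longer exists in Python 3. The function name count_nonblank_lines and the "parse" step below are placeholders for illustration, not anything from the original script.

```python
def count_nonblank_lines(path):
    """Read a file one line at a time and count the non-blank lines."""
    count = 0
    with open(path) as file_object:
        # The file object yields one line per iteration, so the whole
        # file is never loaded into memory at once.
        for line in file_object:
            if line.strip():  # placeholder for real parsing
                count += 1
    return count
```

The with statement also closes the file for you once the loop is done.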

# Go through each line of the logfile
for line in Contents:
    # Split the string to isolate IP address -- changed from [0] on Apache
    Ip = line.split()[2]

# Ensure length is proper
if 6 < len(Ip) <= 15:
    # Increase by 1 if Ip exists; else set hit count = 1
    IpHitListing[Ip] = IpHitListing.get(Ip, 0) + 1
return IpHitListing

Here, you're iterating through each line of the logfile and assigning the third field to Ip, but nothing else happens inside the loop. Only *after* the loop has read every line of the file do you add the contents of Ip to your hit-list, and by then Ip holds just the value from the final line. I'd imagine that if you check, you'll discover that the only IP address being reported is the one from the *last* line of the logfile, rather than one entry for every address in the file. If you indent the 'if 6 < len(Ip) <= 15:' segment (the comment, the test, and the assignment below it) by another tabstop, that'll put it inside your for loop, and that should give you the behavior that you want.
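For what it's worth, here is a minimal sketch of how the corrected loop might look once the length check is moved inside it. The function name count_ip_hits, the field_index parameter, and the short-line guard are my own additions for illustration; the index 2 and the length test come from your code.

```python
def count_ip_hits(lines, field_index=2):
    """Count hits per IP address from an iterable of log lines."""
    ip_hit_listing = {}
    for line in lines:
        fields = line.split()
        if len(fields) <= field_index:
            continue  # added safeguard: skip blank or short lines
        ip = fields[field_index]
        # The length check now runs once per line, inside the loop
        if 6 < len(ip) <= 15:
            # Increase by 1 if ip exists; else set hit count = 1
            ip_hit_listing[ip] = ip_hit_listing.get(ip, 0) + 1
    return ip_hit_listing
```

Since it only needs something it can loop over, you could pass it an open file object directly, which keeps the line-at-a-time behavior discussed above.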

Jeff Shannon
Technician/Programmer
Credit International


_______________________________________________
ActivePython mailing list
