Hello all, I took recipe 7.5 in the Python Cookbook (http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/65251) as a start towards getting some info about hits against our webcache, Squid. I think I'm missing something simple (or fundamental :-\ )
(I'd post a comment on the recipe, but the webpage is currently refusing my ASPN membership for this function.)

A line of the Squid log looks like this:

    1040217940.137 31 24.120.240.228 TCP_MISS/200 936 GET http://XXX.YYY.ZZZ.WWW/Emp/Careers/Images/midMenu_06.gif - DIRECT/XXX.YYY.ZZZ.WWW image/gif

which places the IP address in the 3rd field, unlike the Apache log. However, the method:

    Contents = open(logfile_pathname,"r").xreadlines()

grabs all of the data in one big line: when I parse the data, I only get the first entry.

I'm developing with ActivePython-2.2 win32all build 148, although this is intended to run cross-platform on an OpenBSD box. Here are the results when run on the command line:

    C:\etc>CalculateSquidIpHits.py access.log
    {'140.247.117.79': 1}

It grabs the first entry only. I looked at using:

    file_object = open(logfile_pathname)
    Contents = list(file_object)

but had the same result, only slower (and it might lock up on a large logfile, as the recipe suggests).

Thanks for any assistance! (I have LP, PP, PPonWin32, and the PC, so I'll read some more -- but I learn best by doing ...)
Here's my modified code from the recipe:

    def CalculateSquidIpHits(logfile_pathname):
        # Make a dictionary to store IP addresses and their hit counts
        # and read the contents of the log file line by line
        IpHitListing = {}
        Contents = open(logfile_pathname, "r").xreadlines()
        #file_object = open(logfile_pathname)
        #Contents = list(file_object)
        # Go through each line of the logfile
        for line in Contents:
            # Split the string to isolate IP address -- changed from [0] on Apache
            Ip = line.split()[2]
            # Ensure length is proper
            if 6 < len(Ip) <= 15:
                # Increase by 1 if Ip exists; else set hit count = 1
                IpHitListing[Ip] = IpHitListing.get(Ip, 0) + 1
        return IpHitListing

    def main():
        import sys
        if len(sys.argv)>=2:
            HitsDictionary = CalculateSquidIpHits(sys.argv[1])
            print HitsDictionary
        else:
            print "Usage: CalculateSquidIpHits [Logfile]"

    if __name__ == '__main__':
        main()

***************************
* Adam Getchell [EMAIL PROTECTED]
* System Architect/Programmer (530) 752-1584
* Human Resources Information Systems http://www.hr.ucdavis.edu/
***************************
"Invincibility is in oneself, vulnerability in the opponent." -- Sun Tzu
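
For comparison, here is a minimal sketch of the same counting loop in current Python 3. It is not the recipe's code, and the function names count_squid_ip_hits and count_squid_ip_hits_in_file are made up for illustration. It rests on two unconfirmed guesses about why only one entry comes back: either the return statement ended up indented inside the for loop, or the log's line endings are not being recognized. The sketch keeps the return outside the loop and relies on Python 3's default text mode, which treats \n, \r\n and \r all as line endings.

```python
from collections import Counter


def count_squid_ip_hits(lines):
    """Count hits per client IP from an iterable of Squid access.log lines."""
    hits = Counter()
    for line in lines:
        fields = line.split()
        # In the sample log line, the client address is the third field.
        if len(fields) < 3:
            continue  # skip blank or truncated lines
        ip = fields[2]
        # Same rough length check as the posted code:
        # "1.1.1.1" is 7 characters, "255.255.255.255" is 15.
        if 6 < len(ip) <= 15:
            hits[ip] += 1
    # Return only after every line has been seen. A return indented
    # inside the loop would hand back the result after the first line.
    return dict(hits)


def count_squid_ip_hits_in_file(logfile_pathname):
    """Open the log and count hits, reading one line at a time."""
    # Iterating over the file object is lazy, so a large log is not
    # loaded into memory all at once.
    with open(logfile_pathname, "r") as log:
        return count_squid_ip_hits(log)
```

Calling count_squid_ip_hits_in_file("access.log") should return a dictionary mapping each client IP to its hit count. Splitting the counting from the file handling also makes it easy to try the loop on a few hand-written lines first.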