Hello all,

I took recipe 7.5 in the Python Cookbook
(http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/65251) as a start
towards getting some info about hits against our webcache, Squid. I think
I'm missing something simple (or fundamental :-\ )

(I'd post a comment on the recipe, but the webpage is refusing my ASPN
membership currently for this function.)

A line of squid log looks like this:

1040217940.137     31 24.120.240.228 TCP_MISS/200 936 GET
http://XXX.YYY.ZZZ.WWW/Emp/Careers/Images/midMenu_06.gif -
DIRECT/XXX.YYY.ZZZ.WWW image/gif

This places the IP address in the 3rd field, unlike the Apache log, where it
is the first.
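To make the field position concrete, here is a minimal sketch that splits the sample line above on whitespace. The names sample_line, fields, and client_ip are only illustrative, and the sample is rejoined onto one line, on the assumption that the mail client wrapped it.

```python
# Sample Squid log entry from above, rejoined onto a single line
# (assumption: the line breaks in the post come from mail wrapping).
sample_line = (
    "1040217940.137     31 24.120.240.228 TCP_MISS/200 936 GET "
    "http://XXX.YYY.ZZZ.WWW/Emp/Careers/Images/midMenu_06.gif - "
    "DIRECT/XXX.YYY.ZZZ.WWW image/gif"
)

# split() with no arguments splits on any run of whitespace,
# so the padded elapsed-time column does not produce empty fields.
fields = sample_line.split()

# Index 2 is the third field: the client IP address.
client_ip = fields[2]
```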

However, the method:

Contents = open(logfile_pathname,"r").xreadlines()

Grabs all of the data in one big line: when I parse the data, I only get the
first entry.

I'm developing with ActivePython-2.2 win32all build 148, although this is
intended to run cross-platform on an OpenBSD box. Here are the results when
run on the command line:

C:\etc>CalculateSquidIpHits.py access.log
{'140.247.117.79': 1}

It grabs the first entry only. I looked at using:

file_object = open(logfile_pathname)
Contents = list(file_object)

But I had the same result, only slower (and it might lock up on a large
logfile, as the recipe suggests).
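One way to test the "one big line" theory might be to count how many lines the file iteration actually yields. This is only a sketch: count_lines is a hypothetical helper, not part of the recipe.

```python
def count_lines(logfile_pathname):
    # Hypothetical helper: count the lines that iterating over the file yields.
    count = 0
    log_file = open(logfile_pathname, "r")
    try:
        for line in log_file:
            count += 1
    finally:
        # Close the file even if reading fails part-way through
        log_file.close()
    return count
```

If this reports 1 for a log with many entries, the line endings would be the likely suspect; if it reports the expected number, the file is being read line by line and the problem probably lies in the loop that follows.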

Thanks for any assistance! (I have LP, PP, PPonWin32, and the PC, so I'll
read some more -- but I learn best by doing ...)

Here's my modified code from the recipe:

def CalculateSquidIpHits(logfile_pathname):
    # Make a dictionary to store IP addresses and their hit counts
    # and read the contents of the log file line by line
    IpHitListing = {}
    Contents = open(logfile_pathname, "r").xreadlines()
    #file_object = open(logfile_pathname)
    #Contents = list(file_object)

    # Go through each line of the logfile
    for line in Contents:
        # Split the string to isolate IP address -- changed from [0] on Apache
        Ip = line.split()[2]
          

    # Ensure length is proper
    if 6 < len(Ip) <= 15:
        # Increase by 1 if Ip exists; else set hit count = 1
        IpHitListing[Ip] = IpHitListing.get(Ip, 0) + 1
    return IpHitListing

def main():
    import sys
    if len(sys.argv)>=2:
        HitsDictionary = CalculateSquidIpHits(sys.argv[1])
        print HitsDictionary
    else:
        print "Usage: CalculateSquidIpHits [Logfile]"
        
if __name__ == '__main__':
    main()
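For comparison, here is a minimal sketch of how the counting loop might look with the length check indented into the loop body, so that it runs once per line rather than once after the loop has finished. As posted, the check sits outside the for loop, which would leave only the last line's address counted. The function name count_squid_ip_hits and the shortened sample lines are hypothetical illustrations, not real log data, and the function takes any iterable of lines so it can be tried without a log file.

```python
def count_squid_ip_hits(lines):
    # Hypothetical variant of the function above: takes an iterable of
    # log lines instead of a pathname.
    hits = {}
    for line in lines:
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or truncated lines
        ip = fields[2]
        # The length check is inside the loop, so every line is counted
        if 6 < len(ip) <= 15:
            hits[ip] = hits.get(ip, 0) + 1
    return hits

# Made-up, shortened lines in the Squid field order (illustration only)
sample_lines = [
    "1040217940.137 31 24.120.240.228 TCP_MISS/200 936 GET http://XXX/a.gif",
    "1040217941.002 12 140.247.117.79 TCP_HIT/200 512 GET http://XXX/b.gif",
    "1040217942.450 20 24.120.240.228 TCP_MISS/200 700 GET http://XXX/c.gif",
]
hits = count_squid_ip_hits(sample_lines)
```

Passing an open file object works the same way, since iterating over a file yields one line at a time.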


***************************     
* Adam Getchell                                 [EMAIL PROTECTED]
* System Architect/Programmer                   (530) 752-1584
* Human Resources Information Systems   http://www.hr.ucdavis.edu/
***************************     
"Invincibility is in oneself, vulnerability in the opponent." -- Sun Tzu
