Re: Parse a log file
On Jan 18, 11:56 pm, Tim Chase wrote:
> kak...@gmail.com wrote:
> > I want to parse a log file with the following format, for
> > example:
> >
> > TIMESTAMP Operation FileName Bytes
> > 12/Jan/2010:16:04:59 +0200 EXISTS sample3.3gp 37151
> > 12/Jan/2010:16:04:59 +0200 EXISTS sample3.3gp 37151
> > 12/Jan/2010:16:04:59 +0200 EXISTS sample3.3gp 37151
> > 12/Jan/2010:16:04:59 +0200 EXISTS sample3.3gp 37151
> > 12/Jan/2010:16:04:59 +0200 EXISTS sample3.3gp 37151
> > 12/Jan/2010:16:05:05 +0200 DELETE sample3.3gp 37151
> >
> > How can I count the operations for a month (e.g. a total of 40
> > operations: 30 EXISTS, 10 DELETE)?
>
> It can be done pretty easily with a regexp to parse the relevant
> bits:
>
>   import re
>
>   # Capture month name, year, and operation from each log line.
>   r = re.compile(r'\d+/([^/]+)/(\d+)\S+\s+\S+\s+(\w+)')
>   stats = {}
>   with open('log.txt') as f:
>       for line in f:
>           m = r.match(line)
>           if m:
>               # Key is the tuple (month, year, operation).
>               stats[m.groups()] = stats.get(m.groups(), 0) + 1
>   print(stats)
>
> This prints out
>
>   {('Jan', '2010', 'EXISTS'): 5, ('Jan', '2010', 'DELETE'): 1}
>
> With the resulting data structure, you can manipulate it to do
> coarser-grained aggregates such as the total operations, or remap
> month-name abbreviations into integers so they could be sorted
> for output.
>
> -tkc

Thank you both so much

Antonis
--
http://mail.python.org/mailman/listinfo/python-list
Re: Parse a log file
kak...@gmail.com wrote:
> I want to parse a log file with the following format, for
> example:
>
> TIMESTAMP Operation FileName Bytes
> 12/Jan/2010:16:04:59 +0200 EXISTS sample3.3gp 37151
> 12/Jan/2010:16:04:59 +0200 EXISTS sample3.3gp 37151
> 12/Jan/2010:16:04:59 +0200 EXISTS sample3.3gp 37151
> 12/Jan/2010:16:04:59 +0200 EXISTS sample3.3gp 37151
> 12/Jan/2010:16:04:59 +0200 EXISTS sample3.3gp 37151
> 12/Jan/2010:16:05:05 +0200 DELETE sample3.3gp 37151
>
> How can I count the operations for a month (e.g. a total of 40
> operations: 30 EXISTS, 10 DELETE)?

It can be done pretty easily with a regexp to parse the relevant
bits:

  import re

  # Capture month name, year, and operation from each log line.
  r = re.compile(r'\d+/([^/]+)/(\d+)\S+\s+\S+\s+(\w+)')
  stats = {}
  with open('log.txt') as f:
      for line in f:
          m = r.match(line)
          if m:
              # Key is the tuple (month, year, operation).
              stats[m.groups()] = stats.get(m.groups(), 0) + 1
  print(stats)

This prints out

  {('Jan', '2010', 'EXISTS'): 5, ('Jan', '2010', 'DELETE'): 1}

With the resulting data structure, you can manipulate it to do
coarser-grained aggregates such as the total operations, or remap
month-name abbreviations into integers so they could be sorted
for output.

-tkc
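For readers on Python 3, here is a minimal sketch of the same regexp approach, extended with the two follow-ups the message mentions: coarser per-month totals and remapping month abbreviations to integers for sorting. It is only one way to do it. The names `count_operations`, `totals_per_month`, `sort_key` and `MONTHS` are made up for illustration, and the month table assumes English abbreviations as in the sample lines.

```python
import re
from collections import Counter

# Same pattern as in the message above: captures month, year, operation.
LINE_RE = re.compile(r'\d+/([^/]+)/(\d+)\S+\s+\S+\s+(\w+)')

# Hypothetical lookup table (assumes English month abbreviations).
MONTHS = {name: number for number, name in enumerate(
    'Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec'.split(), start=1)}


def count_operations(lines):
    """Count matching log lines per (month, year, operation) tuple."""
    stats = Counter()
    for line in lines:
        m = LINE_RE.match(line)
        if m:  # non-matching lines (e.g. a header) are skipped
            stats[m.groups()] += 1
    return stats


def totals_per_month(stats):
    """Collapse the per-operation counts into totals per (year, month number)."""
    totals = Counter()
    for (month, year, _operation), n in stats.items():
        totals[(int(year), MONTHS[month])] += n
    return totals


def sort_key(item):
    """Order (key, count) pairs chronologically, then by operation name."""
    (month, year, operation), _count = item
    return (int(year), MONTHS[month], operation)


# The six sample lines from the original question.
sample = [
    '12/Jan/2010:16:04:59 +0200 EXISTS sample3.3gp 37151',
    '12/Jan/2010:16:04:59 +0200 EXISTS sample3.3gp 37151',
    '12/Jan/2010:16:04:59 +0200 EXISTS sample3.3gp 37151',
    '12/Jan/2010:16:04:59 +0200 EXISTS sample3.3gp 37151',
    '12/Jan/2010:16:04:59 +0200 EXISTS sample3.3gp 37151',
    '12/Jan/2010:16:05:05 +0200 DELETE sample3.3gp 37151',
]

stats = count_operations(sample)
for (month, year, operation), n in sorted(stats.items(), key=sort_key):
    print(year, month, operation, n)
for (year, month_number), n in sorted(totals_per_month(stats).items()):
    print(year, month_number, 'total', n)
```

Because `count_operations` accepts any iterable of lines, an open file object can be passed in place of the sample list. With the six sample lines the counts are 5 EXISTS and 1 DELETE, for a January 2010 total of 6.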
Re: Parse a log file
On Jan 18, 6:52 am, "kak...@gmail.com" wrote:
> Hello to all!
> I want to parse a log file with the following format, for
> example:
>
> TIMESTAMP Operation FileName Bytes
> 12/Jan/2010:16:04:59 +0200 EXISTS sample3.3gp 37151
> 12/Jan/2010:16:04:59 +0200 EXISTS sample3.3gp 37151
> 12/Jan/2010:16:04:59 +0200 EXISTS sample3.3gp 37151
> 12/Jan/2010:16:04:59 +0200 EXISTS sample3.3gp 37151
> 12/Jan/2010:16:04:59 +0200 EXISTS sample3.3gp 37151
> 12/Jan/2010:16:05:05 +0200 DELETE sample3.3gp 37151
>
> How can I count the operations for a month (e.g. a total of 40
> operations: 30 EXISTS, 10 DELETE)?
> Any tips?
>
> Thanks in advance
> Antonis

time.strptime(string[, format])

    Parse a string representing a time according to a format. The
    return value is a struct_time as returned by gmtime() or
    localtime(). The format parameter uses the same directives as
    those used by strftime(); it defaults to "%a %b %d %H:%M:%S %Y"
    which matches the formatting returned by ctime(). If string
    cannot be parsed according to format, or if it has excess data
    after parsing, ValueError is raised. The default values used to
    fill in any missing data when more accurate values cannot be
    inferred are (1900, 1, 1, 0, 0, 0, 0, 1, -1).

>>> import time
>>> ts = '12/Jan/2010:16:04:59 +0200'
>>> time.strptime(ts[:-6], '%d/%b/%Y:%H:%M:%S')
time.struct_time(tm_year=2010, tm_mon=1, tm_mday=12, tm_hour=16,
tm_min=4, tm_sec=59, tm_wday=1, tm_yday=12, tm_isdst=-1)

I leave the conversion of the last six characters (the time zone
offset) as an exercise for the student. :)
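As one possible take on that exercise: on Python 3, `datetime.strptime` accepts a `%z` directive that parses an offset such as `+0200`, so the slice is not needed. This is a sketch under that assumption, not the approach the poster had in mind. Note that `%b` depends on the locale, so it assumes English month abbreviations.

```python
from datetime import datetime, timezone

ts = '12/Jan/2010:16:04:59 +0200'

# %z consumes the "+0200" offset and yields a timezone-aware datetime.
dt = datetime.strptime(ts, '%d/%b/%Y:%H:%M:%S %z')

print(dt.isoformat())                           # 2010-01-12T16:04:59+02:00
print(dt.astimezone(timezone.utc).isoformat())  # 2010-01-12T14:04:59+00:00
print(dt.year, dt.month)                        # 2010 1
```

The `(dt.year, dt.month)` pair can then serve as a dictionary key for counting operations per month.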
Parse a log file
Hello to all!
I want to parse a log file with the following format, for
example:

TIMESTAMP Operation FileName Bytes
12/Jan/2010:16:04:59 +0200 EXISTS sample3.3gp 37151
12/Jan/2010:16:04:59 +0200 EXISTS sample3.3gp 37151
12/Jan/2010:16:04:59 +0200 EXISTS sample3.3gp 37151
12/Jan/2010:16:04:59 +0200 EXISTS sample3.3gp 37151
12/Jan/2010:16:04:59 +0200 EXISTS sample3.3gp 37151
12/Jan/2010:16:05:05 +0200 DELETE sample3.3gp 37151

How can I count the operations for a month (e.g. a total of 40
operations: 30 EXISTS, 10 DELETE)?
Any tips?

Thanks in advance
Antonis