Re: Parse a log file

2010-01-18 Thread kak...@gmail.com
On Jan 18, 11:56 pm, Tim Chase  wrote:
> kak...@gmail.com wrote:
> > I want to parse a log file with the following format for
> > example:
> > TIMESTAMP                   Operation  FileName     Bytes
> > 12/Jan/2010:16:04:59 +0200  EXISTS     sample3.3gp  37151
> > 12/Jan/2010:16:04:59 +0200  EXISTS     sample3.3gp  37151
> > 12/Jan/2010:16:04:59 +0200  EXISTS     sample3.3gp  37151
> > 12/Jan/2010:16:04:59 +0200  EXISTS     sample3.3gp  37151
> > 12/Jan/2010:16:04:59 +0200  EXISTS     sample3.3gp  37151
> > 12/Jan/2010:16:05:05 +0200  DELETE     sample3.3gp  37151
>
> > How can I count the operations for a month (e.g. a total of 40
> > operations: 30 EXISTS, 10 DELETE)?
>
> It can be done pretty easily with a regexp to parse the relevant
> bits:
>
>    import re
>    r = re.compile(r'\d+/([^/]+)/(\d+)\S+\s+\S+\s+(\w+)')
>    stats = {}
>    for line in file('log.txt'):
>      m = r.match(line)
>      if m:
>        stats[m.groups()] = stats.get(m.groups(), 0) + 1
>    print stats
>
> This prints out
>
>    {('Jan', '2010', 'EXISTS'): 5, ('Jan', '2010', 'DELETE'): 1}
>
> With the resulting data structure, you can manipulate it to do
> coarser-grained aggregates such as the total operations, or remap
> month-name abbreviations into integers so they could be sorted
> for output.
>
> -tkc

Thank you both so much

Antonis
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Parse a log file

2010-01-18 Thread Tim Chase

kak...@gmail.com wrote:

> I want to parse a log file with the following format for
> example:
> TIMESTAMP                   Operation  FileName     Bytes
> 12/Jan/2010:16:04:59 +0200  EXISTS     sample3.3gp  37151
> 12/Jan/2010:16:04:59 +0200  EXISTS     sample3.3gp  37151
> 12/Jan/2010:16:04:59 +0200  EXISTS     sample3.3gp  37151
> 12/Jan/2010:16:04:59 +0200  EXISTS     sample3.3gp  37151
> 12/Jan/2010:16:04:59 +0200  EXISTS     sample3.3gp  37151
> 12/Jan/2010:16:05:05 +0200  DELETE     sample3.3gp  37151
>
> How can I count the operations for a month (e.g. a total of 40
> operations: 30 EXISTS, 10 DELETE)?


It can be done pretty easily with a regexp to parse the relevant 
bits:


  import re
  r = re.compile(r'\d+/([^/]+)/(\d+)\S+\s+\S+\s+(\w+)')
  stats = {}
  for line in file('log.txt'):
    m = r.match(line)
    if m:
      stats[m.groups()] = stats.get(m.groups(), 0) + 1
  print stats

This prints out

  {('Jan', '2010', 'EXISTS'): 5, ('Jan', '2010', 'DELETE'): 1}
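
The snippet above is Python 2 (the file() builtin and the print statement). For readers on Python 3, a rough equivalent might look like the sketch below. The regular expression is the one from the post; the names LINE_RE, count_operations and sample are made up for illustration, and collections.Counter is swapped in for the dict.get() idiom.

```python
import re
from collections import Counter

# Same pattern as in the post: captures month name, year and operation.
LINE_RE = re.compile(r'\d+/([^/]+)/(\d+)\S+\s+\S+\s+(\w+)')

def count_operations(lines):
    """Count log lines per (month, year, operation) key."""
    stats = Counter()
    for line in lines:
        m = LINE_RE.match(line)
        if m:  # the header line does not start with digits, so it is skipped
            stats[m.groups()] += 1
    return stats

# Illustrative input: a header plus two lines in the format from the thread.
sample = [
    "TIMESTAMP  Operation  FileName  Bytes",
    "12/Jan/2010:16:04:59 +0200  EXISTS  sample3.3gp  37151",
    "12/Jan/2010:16:05:05 +0200  DELETE  sample3.3gp  37151",
]
print(count_operations(sample))
```

Any iterable of lines works, so an open file object (for example from open('log.txt')) can be passed in place of the list.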


With the resulting data structure, you can manipulate it to do 
coarser-grained aggregates such as the total operations, or remap 
month-name abbreviations into integers so they could be sorted 
for output.
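
One way the coarser-grained aggregates and the month remapping mentioned above might look, as a Python 3 sketch. It assumes English month abbreviations as in the sample log; MONTHS and monthly_totals are hypothetical names, and the December 2009 entry is invented only to show the sort order.

```python
from collections import defaultdict

# Hypothetical lookup table: month abbreviation -> month number.
MONTHS = {name: i for i, name in enumerate(
    "Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec".split(), start=1)}

def monthly_totals(stats):
    """Regroup {(month, year, op): n} as {(year, month_no): {op: n}}."""
    totals = defaultdict(dict)
    for (month, year, op), n in stats.items():
        key = (int(year), MONTHS[month])
        totals[key][op] = totals[key].get(op, 0) + n
    return dict(totals)

stats = {('Jan', '2010', 'EXISTS'): 5, ('Jan', '2010', 'DELETE'): 1,
         ('Dec', '2009', 'EXISTS'): 2}  # last entry is made up for the demo

report = monthly_totals(stats)
# Integer (year, month) keys sort chronologically.
for year, month in sorted(report):
    ops = report[(year, month)]
    print(year, month, "total:", sum(ops.values()), ops)
```

Keeping the per-operation counts in a nested dict means the monthly total is just the sum of its values, so nothing has to be counted twice.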


-tkc




Re: Parse a log file

2010-01-18 Thread samwyse
On Jan 18, 6:52 am, "kak...@gmail.com"  wrote:
> Hello to all!
> I want to parse a log file with the following format for
> example:
> TIMESTAMP                   Operation  FileName     Bytes
> 12/Jan/2010:16:04:59 +0200  EXISTS     sample3.3gp  37151
> 12/Jan/2010:16:04:59 +0200  EXISTS     sample3.3gp  37151
> 12/Jan/2010:16:04:59 +0200  EXISTS     sample3.3gp  37151
> 12/Jan/2010:16:04:59 +0200  EXISTS     sample3.3gp  37151
> 12/Jan/2010:16:04:59 +0200  EXISTS     sample3.3gp  37151
> 12/Jan/2010:16:05:05 +0200  DELETE     sample3.3gp  37151
>
> How can I count the operations for a month (e.g. a total of 40
> operations: 30 EXISTS, 10 DELETE)?
> Any tips?
>
> Thanks in advance
> Antonis

time.strptime(string[, format])
Parse a string representing a time according to a format. The return
value is a struct_time as returned by gmtime() or localtime().

The format parameter uses the same directives as those used by
strftime(); it defaults to "%a %b %d %H:%M:%S %Y" which matches the formatting
returned by ctime(). If string cannot be parsed according to format,
or if it has excess data after parsing, ValueError is raised. The
default values used to fill in any missing data when more accurate
values cannot be inferred are (1900, 1, 1, 0, 0, 0, 0, 1, -1).

>>> import time
>>> ts='12/Jan/2010:16:04:59 +0200'
>>> time.strptime(ts[:-6], '%d/%b/%Y:%H:%M:%S')
time.struct_time(tm_year=2010, tm_mon=1, tm_mday=12, tm_hour=16,
tm_min=4, tm_sec=59, tm_wday=1, tm_yday=12, tm_isdst=-1)

I leave the conversion of the last six characters (the time zone
offset) as an exercise for the student.  :)
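
For anyone who wants to check their answer to that exercise, here is one possible Python 3 sketch. It assumes the offset always has the form +HHMM or -HHMM after a single space, as in the sample log; parse_timestamp is a made-up helper name, not something from the time module.

```python
from datetime import datetime, timedelta, timezone

def parse_timestamp(ts):
    """Parse 'DD/Mon/YYYY:HH:MM:SS +HHMM' into a timezone-aware datetime."""
    stamp, offset = ts.rsplit(' ', 1)          # split off the '+0200' part
    sign = -1 if offset[0] == '-' else 1
    delta = sign * timedelta(hours=int(offset[1:3]),
                             minutes=int(offset[3:5]))
    naive = datetime.strptime(stamp, '%d/%b/%Y:%H:%M:%S')
    return naive.replace(tzinfo=timezone(delta))

dt = parse_timestamp('12/Jan/2010:16:04:59 +0200')
print(dt.isoformat())                           # 2010-01-12T16:04:59+02:00
print(dt.astimezone(timezone.utc).isoformat())  # 2010-01-12T14:04:59+00:00
```

In Python 3, datetime.strptime also accepts a %z directive for offsets of this shape, which may make the manual split unnecessary.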


Parse a log file

2010-01-18 Thread kak...@gmail.com
Hello to all!
I want to parse a log file with the following format for
example:
TIMESTAMP                   Operation  FileName     Bytes
12/Jan/2010:16:04:59 +0200  EXISTS     sample3.3gp  37151
12/Jan/2010:16:04:59 +0200  EXISTS     sample3.3gp  37151
12/Jan/2010:16:04:59 +0200  EXISTS     sample3.3gp  37151
12/Jan/2010:16:04:59 +0200  EXISTS     sample3.3gp  37151
12/Jan/2010:16:04:59 +0200  EXISTS     sample3.3gp  37151
12/Jan/2010:16:05:05 +0200  DELETE     sample3.3gp  37151

How can I count the operations for a month (e.g. a total of 40
operations: 30 EXISTS, 10 DELETE)?
Any tips?

Thanks in advance
Antonis