Simon Mullis wrote:
> Hi All
>
> I'm writing a script to help with analyzing log file timestamps and
> have a very specific question on which I'm momentarily stumped....
>
> I'd like the script to support multiple log file types, so allow a
> strftime format to be passed in as a cli switch (default is %Y-%m-%d
> %H:%M:%S).
>
> When it comes to actually doing the analysis I want to store or discard
> the log entry based on certain criteria. In fact, I only need the log
> line timestamp.
>
> I'd like to do this in one step and therefore not require the user to
> supply a regex as well as a strftime format:
>
>  >>> import datetime
>  >>> p = datetime.datetime.strptime("2008-07-23 12:18:28 this is the
> remainder of the log line that I do not care about", "%Y-%m-%d %H:%M:%S")
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
>   File "/opt/local/lib/python2.5/_strptime.py", line 333, in strptime
>     data_string[found.end():])
> ValueError: unconverted data remains:  this is the remainder of the log
> line that I do not care about
>
>  >>> repr(p)
> NameError: name 'p' is not defined
>
> Clearly the strptime method above can grab the right bits of data but
> the name "p" is never assigned because of the error.
>
> So, my options are:
>
> 1 - Only support one log format.
>
> 2 - Support any log format but require a regex as well as a strftime
> format so I can extract the "timestamp" portion.
>
> 3 - Create another class/method with a lookup table for the strftime
> options that automagically creates the correct regex to extract the
> right string from the log entry... (or is this overly complicated?)
>
> 4 - Override the method above (strptime) to allow what I'm trying to do.
>
> 5 - Some other very clever and elegant solution that I would not ever
> manage to think of myself....
>
>
> Am I making any sense whatsoever?
>
> Thanks
>
> SM
>
> (P.S. The reason I don't want the end user to supply a regex for the
> timestamp in the log entry is that we're already using 2 other regexes as
> cli switches to select the file glob and log line to match....)
>
If the timestamp is always at the start of the line (and I expect it is)
and is always the same length then you could calculate how long the
timestamp is from the format (eg "%Y" matches 4 characters) and use
string slicing.
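
A minimal sketch of the slicing idea, assuming the format only uses
fixed-width directives such as %Y, %m, %d, %H, %M and %S (it would
break with variable-width ones like %B or %A). Rather than a
per-directive length table, it measures the width by formatting an
arbitrary sample date; the function name parse_fixed_width is made up
for illustration:

```python
import datetime

def parse_fixed_width(line, fmt="%Y-%m-%d %H:%M:%S"):
    """Parse a timestamp at the start of line, ignoring the rest.

    Assumption: every directive in fmt renders at a fixed width, so a
    sample date formatted with fmt has the same length as any real
    timestamp in the log.
    """
    # Format an arbitrary date just to measure how wide the timestamp is.
    width = len(datetime.datetime(2000, 1, 1).strftime(fmt))
    # Slice off the timestamp and hand only that part to strptime.
    return datetime.datetime.strptime(line[:width], fmt)
```

With the default format the width comes out as 19 characters, so
parse_fixed_width("2008-07-23 12:18:28 this is the remainder") only
ever shows strptime the first 19 characters of the line.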

If the timestamp isn't a fixed length then generating a regex might be
needed.
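
Generating the regex could look roughly like the sketch below. The
lookup table DIRECTIVE_PATTERNS and both function names are
hypothetical, and only a handful of directives are covered; anything
else raises ValueError rather than guessing. It also assumes the
timestamp sits at the very start of the line:

```python
import datetime
import re

# Hypothetical lookup table mapping strftime directives to regex
# fragments. Deliberately incomplete: extend it as needed.
DIRECTIVE_PATTERNS = {
    "Y": r"\d{4}",
    "m": r"\d{1,2}",
    "d": r"\d{1,2}",
    "H": r"\d{1,2}",
    "M": r"\d{1,2}",
    "S": r"\d{1,2}",
    "b": r"[A-Za-z]{3}",
    "%": r"%",
}

def format_to_regex(fmt):
    """Build a compiled regex that matches text produced by fmt."""
    parts = []
    i = 0
    while i < len(fmt):
        ch = fmt[i]
        if ch == "%" and i + 1 < len(fmt):
            code = fmt[i + 1]
            if code not in DIRECTIVE_PATTERNS:
                raise ValueError("unsupported directive: %" + code)
            parts.append(DIRECTIVE_PATTERNS[code])
            i += 2
        else:
            # Literal characters (dashes, colons, spaces) match themselves.
            parts.append(re.escape(ch))
            i += 1
    return re.compile("".join(parts))

def parse_leading_timestamp(line, fmt):
    """Extract the timestamp at the start of line and parse it with fmt."""
    match = format_to_regex(fmt).match(line)
    if match is None:
        raise ValueError("no timestamp found at start of line")
    # Only the matched prefix goes to strptime, so the rest of the line
    # never triggers "unconverted data remains".
    return datetime.datetime.strptime(match.group(0), fmt)
```

The user still supplies only the strftime format; the regex is derived
from it, so no extra cli switch is needed.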
--
http://mail.python.org/mailman/listinfo/python-list
