Simon Mullis wrote:
> Hi All
>
> I'm writing a script to help with analyzing log file timestamps and
> have a very specific question on which I'm momentarily stumped....
>
> I'd like the script to support multiple log file types, so allow a
> strftime format to be passed in as a cli switch (default is %Y-%m-%d
> %H:%M:%S).
>
> When it comes to actually doing the analysis I want to store or discard
> the log entry based on certain criteria. In fact, I only need the log
> line timestamp.
>
> I'd like to do this in one step and therefore not require the user to
> supply a regex as well as a strftime format:
>
> >>> import datetime
> >>> p = datetime.datetime.strptime("2008-07-23 12:18:28 this is the
> remainder of the log line that I do not care about", "%Y-%m-%d %H:%M:%S")
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
>   File "/opt/local/lib/python2.5/_strptime.py", line 333, in strptime
>     data_string[found.end():])
> ValueError: unconverted data remains: this is the remainder of the log
> line that I do not care about
>
> >>> repr(p)
> NameError: name 'p' is not defined
>
> Clearly the strptime method above can grab the right bits of data but
> the string "p" is not created due to the error.
>
> So, my options are:
>
> 1 - Only support one log format.
>
> 2 - Support any log format but require a regex as well as a strftime
> format so I can extract the "timestamp" portion.
>
> 3 - Create another class/method with a lookup table for the strftime
> options that automagically creates the correct regex to extract the
> right string from the log entry... (or is this overly complicated)
>
> 4 - Override the method above (strptime) to allow what I'm trying to do.
>
> 5 - Some other very clever and elegant solution that I would not ever
> manage to think of myself....
>
>
> Am I making any sense whatsoever?
>
> Thanks
>
> SM
>
> (P.S. The reason I don't want the end user to supply a regex for the
> timestamp in the log-entry is that we're already using 2 other regexes as
> cli switches to select the file glob and log line to match....)
>

If the timestamp is always at the start of the line (and I expect it is)
and is always the same length, then you could calculate how long the
timestamp is from the format (e.g. "%Y" matches 4 characters) and use
string slicing.
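The slicing idea could be sketched roughly as below. This is only an illustration, not a tested recipe: the names FIXED_WIDTHS, timestamp_width and parse_leading_timestamp are made up for this example, and the width table assumes the log writer zero-pads every numeric field (strptime itself would also accept unpadded values, which slicing cannot cope with).

```python
import datetime

# Assumed widths for zero-padded numeric directives (hypothetical table,
# extend as needed). Directives such as %B or %A vary in length and are
# deliberately left out.
FIXED_WIDTHS = {"Y": 4, "y": 2, "m": 2, "d": 2, "j": 3,
                "H": 2, "I": 2, "M": 2, "S": 2, "f": 6}

def timestamp_width(fmt):
    """Return how many characters a fixed-width strftime format occupies."""
    width = 0
    i = 0
    while i < len(fmt):
        if fmt[i] == "%":
            if i + 1 >= len(fmt):
                raise ValueError("format ends with a stray '%'")
            directive = fmt[i + 1]
            if directive == "%":
                width += 1          # "%%" stands for one literal percent sign
            elif directive in FIXED_WIDTHS:
                width += FIXED_WIDTHS[directive]
            else:
                raise ValueError("not a fixed-width directive: %" + directive)
            i += 2
        else:
            width += 1              # literal character such as "-" or ":"
            i += 1
    return width

def parse_leading_timestamp(line, fmt):
    """Parse a timestamp at the start of line and ignore the remainder."""
    return datetime.datetime.strptime(line[:timestamp_width(fmt)], fmt)
```

With the default format from the original message, timestamp_width("%Y-%m-%d %H:%M:%S") comes to 19, so only the first 19 characters of each log line are handed to strptime and the "unconverted data remains" error does not arise.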
If the timestamp isn't a fixed length, then generating a regex might be needed.
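For the variable-length case, generating the regex from the strftime format (roughly option 3 in the quoted message) might look like the sketch below. The lookup table DIRECTIVE_PATTERNS and the helpers format_to_regex and extract_timestamp are hypothetical names invented for this example, and the table covers only a handful of numeric directives; anything else raises an error rather than guessing.

```python
import datetime
import re

# Assumed regex fragment per directive (hypothetical, deliberately small).
DIRECTIVE_PATTERNS = {
    "Y": r"\d{4}", "y": r"\d{2}",
    "m": r"\d{1,2}", "d": r"\d{1,2}", "j": r"\d{1,3}",
    "H": r"\d{1,2}", "I": r"\d{1,2}",
    "M": r"\d{1,2}", "S": r"\d{1,2}", "f": r"\d{1,6}",
}

def format_to_regex(fmt):
    """Build a compiled regex matching text that fits the strftime format."""
    parts = []
    i = 0
    while i < len(fmt):
        if fmt[i] == "%" and i + 1 < len(fmt):
            directive = fmt[i + 1]
            if directive == "%":
                parts.append(re.escape("%"))
            elif directive in DIRECTIVE_PATTERNS:
                parts.append(DIRECTIVE_PATTERNS[directive])
            else:
                raise ValueError("unsupported directive: %" + directive)
            i += 2
        else:
            parts.append(re.escape(fmt[i]))   # literal text in the format
            i += 1
    return re.compile("".join(parts))

def extract_timestamp(line, fmt):
    """Return the first timestamp found in line, or None if there is none."""
    match = format_to_regex(fmt).search(line)
    if match is None:
        return None
    return datetime.datetime.strptime(match.group(0), fmt)
```

Because search() is used rather than match(), the timestamp does not have to sit at the very start of the line. In a real script the compiled regex would presumably be built once per format rather than once per line.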