Given a number of files, say 3, that represent technical support tickets in the same format, I am looking for a way to automatically generate regular expressions for the different fields.
An example of one line from each file:

    Date: 12/30/2008 Room: 457 Building: Main
    Date: 12/31/2008 Room: A21 Building: Annex
    Date: 1/4/2009 Room: L69 Building: Library

The program would then, possibly using Python's difflib module, generate the regular expressions needed to parse out the different fields. In this case it might return a tuple like

    ("^Date:\s+(.*)\s+Room", "Room:\s+(.*)\s+Building", "Building:\s+(.*)\s*$")

that would match each of the fields based on the common data, on the assumption that what doesn't change between the lines is a field label and what does change is the data we are looking for.
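To make the idea concrete, here is a minimal sketch of one way it could work. It assumes fields are separated by whitespace and that a token is a label only if it appears, in the same order, in every sample line. The names common_tokens and infer_pattern are made up for illustration, and it builds a single pattern with one capture group per field rather than a tuple of patterns.

```python
import re
from difflib import SequenceMatcher
from functools import reduce

def common_tokens(a, b):
    """Return the tokens that SequenceMatcher aligns, in order, in both lists."""
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    shared = []
    for block in matcher.get_matching_blocks():
        shared.extend(a[block.a:block.a + block.size])
    return shared

def infer_pattern(lines):
    """Guess a regex: tokens shared by all lines are labels, the rest is data."""
    token_lists = [line.split() for line in lines]
    labels = reduce(common_tokens, token_lists)

    # Walk the first line as a template; each run of non-label tokens
    # becomes one lazy capture group.
    parts = []
    remaining = list(labels)
    in_gap = False
    for token in token_lists[0]:
        if remaining and token == remaining[0]:
            parts.append(re.escape(remaining.pop(0)))
            in_gap = False
        elif not in_gap:
            parts.append("(.+?)")
            in_gap = True
    return "^" + r"\s+".join(parts) + "$"

SAMPLE_LINES = [
    "Date: 12/30/2008 Room: 457 Building: Main",
    "Date: 12/31/2008 Room: A21 Building: Annex",
    "Date: 1/4/2009 Room: L69 Building: Library",
]

pattern = infer_pattern(SAMPLE_LINES)
print(re.match(pattern, SAMPLE_LINES[0]).groups())
# ('12/30/2008', '457', 'Main')
```

One limitation of this assumption: if a value happens to be identical in every sample (for example, all tickets come from the same building), it will be mistaken for a label, so the samples need to vary in every field.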
-- http://mail.python.org/mailman/listinfo/python-list