Dear All,

I know this has come up loads of times before, but I'm stuck on what should be a simple regex problem. I'm trying to pull all the definitions out of a LaTeX document. These are marked

    \begin{defn}
    <TEXT>
    \end{defn}

so I thought I'd write something like this:

    import re

    filename = '/home/acl_home/PhD/CurrentPhD/extensions1_14.8.6.tex'
    infile = open(filename, 'r')

    def_start = "\\begin\{defn\}"
    def_end = "\end{defn}"

    def_start_reg = re.compile(def_start)

    l = 0
    while l < 500:
        line = infile.readline()
        #print l, line
        res = re.search(def_start_reg, line)
        print l, res
        l = l + 1

but it doesn't find any matches (BTW, I know there's a defn tag in that section of the file).

I thought the problem was my regex, but I checked the pattern with an online checker, and also against a small bit of text:

    def_start = "\\begin\{defn\}"
    def_start_reg = re.compile(def_start)

    text = """atom that is grounded. These formulae are useful not only for
    the work on valuation but are also used in later chapters.
    \begin{defn}
    A Patient-ground formula is a formula which contains a grounding of
    $Patient(x)$. The other atoms in the formula may be either ground or
    non-ground.
    \end{defn}
    Having defined our patient ground formulae, we can now use formulae of
    this form to define our patient values."""

    res = re.search(def_start_reg, text)
    print res

and this returns a MatchObject. I'm not sure why there should be any difference between the two cases, but I'm sure it's very simple.

Thanks for any tips,

Matt
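One likely explanation for the difference, offered as a sketch rather than a confirmed diagnosis, is how backslashes are handled in non-raw string literals. In the pattern, "\\b" collapses to the two characters \b, which the regex engine reads as a word boundary. In the triple-quoted test text, "\b" becomes a single backspace character. A line read from the file, by contrast, holds a real backslash followed by "begin". The names post_pattern, raw_pattern, file_line and literal_text below are illustrative and do not come from the original post, and file_line is only an assumption about what the .tex file contains.

```python
import re

# The pattern from the post. In a normal (non-raw) literal, "\\" collapses to
# one backslash, so the regex engine receives \begin\{defn\} and treats the
# leading \b as a word boundary, followed by the literal text "egin{defn}".
# (The braces are written with doubled backslashes here only to avoid
# invalid-escape warnings on newer Pythons; the resulting string is identical.)
post_pattern = re.compile("\\begin\\{defn\\}")

# A raw-string pattern: a literal backslash followed by "begin{defn}".
raw_pattern = re.compile(r"\\begin\{defn\}")

# Assumed content of a line read from the .tex file: a real backslash, then "begin".
file_line = "\\begin{defn} A Patient-ground formula ...\n"

# The in-code test string: "\b" in a non-raw literal is the backspace
# character (0x08), so this text actually holds "\x08egin{defn}".
literal_text = "used in later chapters. \begin{defn} A Patient-ground formula ..."

print(post_pattern.search(file_line) is not None)     # False: "egin" follows "b", no boundary
print(post_pattern.search(literal_text) is not None)  # True: boundary between backspace and "e"
print(raw_pattern.search(file_line) is not None)      # True: literal backslash + "begin{defn}"
print(raw_pattern.search(literal_text) is not None)   # False: the text contains no backslash
```

If this is indeed the cause, writing the pattern as a raw string (r"\\begin\{defn\}") should match lines read from the file, and a test string would need to be raw as well (r"""...""") to mirror the file's contents.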