[EMAIL PROTECTED] wrote:
> I'm parsing a text file to extract word definitions. For example the
> input text file contains the following content:
>
> di.va.gate \'di_--v*-.ga_-t\ vb
> pas.sim \'pas-*m\ adv : here and there : THROUGHOUT
>
> I am trying to obtain words between two literal backslashes (\ .. \). I
> am not able to match words between two literal backslashes using the
> regxp - re.compile(r'\\[^\\]*\\').
>
> Here is my sample script:
>
> import re;

Lose the semicolons ...

>
> #slashPattern = re.compile(re.escape(r'\\[^\\]*\\'));
> pattern = r'\\[^\\]*\\'
> slashPattern = re.compile(pattern);
>
> fdr = file( "parseinput",'r');
> line = fdr.readline();
>

You should upgrade so that you have a modern Python and a modern
tutor[ial] -- then you will be writing:

    for line in fdr:
        do_something_with(line)

> while (line != ""):

Lose the extraneous parentheses ...

>     if (slashPattern.match(line)):

Your main problem is that you should be using the search() method, not
the match() method. match() succeeds only if the pattern matches at the
very start of the string; search() scans the whole string for the first
place where it matches. Your lines start with the headword, not with a
backslash, so match() returns None every time. Read the section on this
topic in the re docs!!

    >>> import re
    >>> pat = re.compile(r'\\[^\\]*\\')
    >>> pat.match(r'abcd \xyz\ pqr')
    >>> pat.search(r'abcd \xyz\ pqr')
    <_sre.SRE_Match object at 0x00AE8988>

>         print line.rstrip() + " <-- matches pattern " + pattern
>     else:
>         print line.rstrip() + " <-- DOES not match pattern " + pattern
>     line = fdr.readline();
> print;
>
>
> ----------
> The output
>
> C:\home\krishna\lang\python>python wsparsetest.py
> python wsparsetest.py
> di.va.gate \'di_--v*-.ga_-t\ vb <-- DOES not match
> pattern \\[^\\]*\\
> pas.sim \'pas-*m\ adv : here and there : THROUGHOUT <-- DOES not match
> pattern \\[^\\]*\\
> -----------
>
> What should I be doing to match those literal backslashes?
>
> Thanks
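
Putting the advice above together, here is a minimal sketch of how the script might look. It is written for Python 3 (the quoted script is Python 2), and a few things in it are my own assumptions rather than anything from the original post: the capturing group added to the pattern so the text between the backslashes can be pulled out, the helper name pronunciation(), and the in-memory sample_lines list standing in for the "parseinput" file.

```python
import re

# Same idea as the pattern in the thread: a backslash, a run of
# non-backslash characters, then a closing backslash. The parentheses
# are an addition (assumption) so the inner text can be captured.
SLASH_PATTERN = re.compile(r'\\([^\\]*)\\')


def pronunciation(line):
    """Return the text between the first pair of backslashes, or None.

    Hypothetical helper, named here only for illustration.
    """
    found = SLASH_PATTERN.search(line)  # search() scans the whole line
    return found.group(1) if found else None


# Stand-in for the lines of the input file (assumption: the real script
# would iterate over the open file object instead).
sample_lines = [
    r"di.va.gate \'di_--v*-.ga_-t\ vb",
    r"pas.sim \'pas-*m\ adv : here and there : THROUGHOUT",
    "no backslashes here",
]

for line in sample_lines:
    # match() is anchored at position 0, so it fails on these lines;
    # search() finds the backslash-delimited part wherever it occurs.
    anchored = SLASH_PATTERN.match(line)
    print(line.rstrip(), "| match():", anchored is not None,
          "| search():", pronunciation(line))
```

If you do want to keep match(), the pattern has to account for everything before the first backslash, for example by prefixing it with `[^\\]*`; search() is the simpler choice here.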