> From: Zdenek Maxa <[EMAIL PROTECTED]>
> Subject: Re: multiline regular expression (replace)
> Date: 29.5.2007 13:46:32
> ----------------------------------------
> [EMAIL PROTECTED] wrote:
> > On May 29, 2:03 am, Zdenek Maxa <[EMAIL PROTECTED]> wrote:
> >
> >> Hi all,
> >>
> >> I would like to perform regular expression replace (e.g. removing
> >> everything from within tags in a XML file) with multiple-line pattern.
> >> How can I do this?
> >>
> >> where = open("filename").read()
> >> multilinePattern = "^<tag> .... <\/tag>$"
> >> re.search(multilinePattern, where, re.MULTILINE)
> >>
> >> Thanks greatly,
> >> Zdenek
> >>
> >
> > Why not use an xml package for working with xml files? I'm sure
> > they'll handle your multiline tags.
> >
> > http://effbot.org/zone/element-index.htm
> > http://codespeak.net/lxml/
> >
> > ~Sean
> >
> >
>
> Hi,
>
> that was merely an example of what I would like to achieve. However, in
> general, is there a way for handling multiline regular expressions in
> Python, using presumably only modules from distribution like re?
>
> Thanks,
> Zdenek
> --
> http://mail.python.org/mailman/listinfo/python-list
>
>
>
There shouldn't be any problem matching multiline strings with re (even
without flags). The problem is more likely in the search pattern itself,
especially the "...." part :-) if you are in fact using dots: by default
"." does not match a newline.
The re.M flag only changes the behaviour of the ^ and $ metacharacters, cf. the
docs:
re.M
MULTILINE
When specified, the pattern character "^" matches at the beginning of the
string and at the beginning of each line (immediately following each newline);
and the pattern character "$" matches at the end of the string and at the end
of each line (immediately preceding each newline). By default, "^" matches only
at the beginning of the string, and "$" only at the end of the string and
immediately before the newline (if any) at the end of the string.
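
As a minimal sketch of what that means in practice (the sample string and the
variable names are made up for illustration), compare the anchors with and
without the flag:

```python
import re

# Hypothetical sample text: three lines, no trailing newline.
text = "first\nsecond\nthird"

# Without re.M, "^" anchors only at the very start of the string.
starts_default = re.findall(r"^\w+", text)

# With re.M, "^" also anchors right after each newline.
starts_multiline = re.findall(r"^\w+", text, re.M)

# "$" behaves analogously at the end of the string / of each line.
ends_default = re.findall(r"\w+$", text)
ends_multiline = re.findall(r"\w+$", text, re.M)

print(starts_default)    # ['first']
print(starts_multiline)  # ['first', 'second', 'third']
print(ends_default)      # ['third']
print(ends_multiline)    # ['first', 'second', 'third']
```

Note that re.M does nothing for "." here; it only moves the anchors.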
You may also want to check the S flag:
re.S
DOTALL
Make the "." special character match any character at all, including a newline;
without this flag, "." will match anything except a newline.
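
Applied to the original question, a rough sketch could look like the
following. The sample string and the tag name are assumptions for
illustration, and the non-greedy ".*?" is my choice so that each element is
handled separately; for real XML a parser is still the safer route.

```python
import re

# Hypothetical input: a <tag> element whose content spans several lines.
where = "<root>\n<tag>\nline one\nline two\n</tag>\n<other>keep</other>\n</root>\n"

# Without re.S the dot cannot cross a newline, so nothing matches.
no_match = re.search(r"<tag>.*?</tag>", where)

# With re.S the dot also matches newlines; ".*?" is non-greedy, so the
# match ends at the first closing </tag>.
pattern = re.compile(r"<tag>.*?</tag>", re.S)
match = pattern.search(where)

# Remove everything inside the tags, keeping the tags themselves.
cleaned = pattern.sub("<tag></tag>", where)

print(no_match)  # None
print(cleaned)
```

Neither "^"/"$" nor re.M is needed for this; re.S alone lets the pattern span
lines.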
See:
http://docs.python.org/lib/node46.html
http://docs.python.org/lib/re-syntax.html
Vlasta