The number of items matching (us::.*?) varies.

When I use re.findall with (us::.*?), I only get a few bare 'us::' strings back.
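
A likely explanation, sketched below with the sample line from the quoted message: a non-greedy group with nothing after it is allowed to match zero characters, so each hit stops right after 'us::'. Adding the trailing ';' to the pattern gives the lazy match something to stop at. The variable names are only for illustration.

```python
import re

# Sample line copied from the quoted message below.
line = """38166 us::Video_Cat::Other; us::Video_Cat::Today Show;
us::VC_Supplier::bc; 1002::ms://bc.wd.net/a275/video/tdy_is.asf;
1003::ms://bc.wd.net/a275/video/tdy_is_.fl;"""

# Nothing follows the lazy .*? so it matches as little as possible: nothing.
without_delimiter = re.findall(r"(us::.*?)", line)

# The ';' after the group forces .*? to grow until the next semicolon.
with_delimiter = re.findall(r"(us::.*?);", line)

print(without_delimiter)
print(with_delimiter)
```

The second call returns one string per item, however many items the line holds.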

Daniel

On 9/16/07, Kent Johnson <[EMAIL PROTECTED]> wrote:
>
> 王超 wrote:
> > yes, but I mean if I have the line like this:
> >
> > line = """38166 us::Video_Cat::Other; us::Video_Cat::Today Show;
> > us::VC_Supplier::bc; 1002::ms://bc.wd.net/a275/video/tdy_is.asf;
> > 1003::ms://bc.wd.net/a275/video/tdy_is_.fl;"""
> >
> > I want to get the part "us::Video_Cat::Other; us::Video_Cat::Today
> > Show; us::VC_Supplier::bc;"
> >
> > but re.compile(r"(us::.*) .*(1002|1003).*$") will get the
> > "1002::ms://bc.wd.net/a275/video/tdy_is.asf;" included in a lazy mode.
>
> Of course, you have asked for all the text up to the end of the string.
>
> Not sure what you mean by lazy mode...
>
> If there will always be three items you could just repeat the relevant
> sections of the re, something like
>
> r'(us::.*?); (us::.*?); (us::.*?);'
>
> or even
>
> r'(us::Video_Cat::.*?); (us::Video_Cat::.*?); (us::VC_Supplier::.*?);'
>
> If the number of items varies then use re.findall() with (us::.*?);
>
> The non-greedy match is not strictly needed in the first case but it is
> in the second.
>
> Kent
>
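
For the fixed-count case, here is a minimal sketch of the repeated-group idea applied to the sample line. One adaptation that is not in the suggestion above: the sample line wraps across several lines, and a literal space in the pattern will not match a line break, so this version uses ;\s* between the groups. The name three_items is only for illustration.

```python
import re

# Sample line copied from the quoted message above.
line = """38166 us::Video_Cat::Other; us::Video_Cat::Today Show;
us::VC_Supplier::bc; 1002::ms://bc.wd.net/a275/video/tdy_is.asf;
1003::ms://bc.wd.net/a275/video/tdy_is_.fl;"""

# Three lazy groups, each ending at the next ';'.
# \s* (an adaptation) lets the separator be a space or a line break.
three_items = re.compile(r"(us::.*?);\s*(us::.*?);\s*(us::.*?);")

match = three_items.search(line)
if match:
    for item in match.groups():
        print(item)
```

This only works when there are exactly three 'us::' items; for a varying number, re.findall with (us::.*?); is the simpler choice.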
_______________________________________________
Tutor maillist  -  [email protected]
http://mail.python.org/mailman/listinfo/tutor
