The number of items - (us::.*?) - varies. When I use re.findall with (us::.*?), only several bare 'us::' strings are extracted.
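
For reference, a minimal sketch of the likely cause, assuming the sample line from the quoted message is joined into a single string (the variable names `bare` and `terminated` are only illustrative). A lazy `.*?` with nothing after it is satisfied by matching zero characters, so each hit stops right after 'us::'. Including the trailing ';' in the pattern, as suggested in the quoted reply, forces the group to run up to the next semicolon.

```python
import re

# Sample data from the quoted message, joined into one line (an assumption).
line = ("38166 us::Video_Cat::Other; us::Video_Cat::Today Show; "
        "us::VC_Supplier::bc; 1002::ms://bc.wd.net/a275/video/tdy_is.asf; "
        "1003::ms://bc.wd.net/a275/video/tdy_is_.fl;")

# Without a terminator the lazy quantifier matches the empty string,
# so only the literal prefix is captured each time.
bare = re.findall(r"(us::.*?)", line)
print(bare)        # ['us::', 'us::', 'us::']

# With the ';' after the group, each match extends to the next semicolon.
terminated = re.findall(r"(us::.*?);", line)
print(terminated)  # ['us::Video_Cat::Other', 'us::Video_Cat::Today Show', 'us::VC_Supplier::bc']
```

Because the 1002:: and 1003:: entries do not contain 'us::', they are skipped no matter how many 'us::' items precede them.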

Daniel

On 9/16/07, Kent Johnson <[EMAIL PROTECTED]> wrote:
>
> 王超 wrote:
> > yes, but I mean if I have the line like this:
> >
> > line = """38166 us::Video_Cat::Other; us::Video_Cat::Today Show;
> > us::VC_Supplier::bc; 1002::ms://bc.wd.net/a275/video/tdy_is.asf;
> > 1003::ms://bc.wd.net/a275/video/tdy_is_.fl;"""
> >
> > I want to get the part "us::MSNVideo_Cat::Other; us::MSNVideo_Cat::Today
> > Show; us::VC_Supplier::Msnbc;"
> >
> > but re.compile(r"(us::.*) .*(1002|1003).*$") will get the
> > "1002::ms://bc.wd.net/a275/video/tdy_is.asf;" included in an lazy mode.
>
> Of course, you have asked for all the text up to the end of the string.
>
> Not sure what you mean by lazy mode...
>
> If there will always be three items you could just repeat the relevant
> sections of the re, something like
>
> r'(us::.*?); (us::.*?); (us::.*?);'
>
> or even
>
> r'(us::Video_Cat::.*?); (us::Video_Cat::.*?); (us::VC_Supplier::.*?);'
>
> If the number of items varies then use re.findall() with (us::.*?);
>
> The non-greedy match is not strictly needed in the first case but it is
> in the second.
>
> Kent
>
_______________________________________________
Tutor maillist - [email protected]
http://mail.python.org/mailman/listinfo/tutor
