[EMAIL PROTECTED] wrote:
> James Stroud wrote:
>
>>[EMAIL PROTECTED] wrote:
>>
>>>hi
>>>suppose i have a string like
>>>
>>>test1?test2t-test3*test4*test5$test6#test7*test8
>>>
>>>how can i construct the regexp to get test3*test4*test5 and
>>>test7*test8, ie, i want to match * and the words before and after?
>>>thanks
>>>
>>
>>
>>py> import re
>>py> s = 'test1?test2t-test3*test4*test5$test6#test7*test8'
>>py> r = re.compile(r'(test\d(?:\*test\d)+)')
>>py> r.findall(s)
>>['test3*test4*test5', 'test7*test8']
>>
>>James
>
>
> thanks !
> I check the regexp doc it says:
> """
> (?:...)
> A non-grouping version of regular parentheses. Matches whatever
> regular expression is inside the parentheses, but the substring matched
> by the group cannot be retrieved after performing a match or referenced
> later in the pattern.
> """
> but i could not understand this : r'(test\d(?:\*test\d)+)'. which
> parenthesis is it referring to? Sorry, could you explain the solution ?
> thanks
>

The outer parentheses are an ordinary (capturing) group. Whatever they match is saved and is accessible from a match object via its group() or groups() methods; it is also what findall() returns when the pattern contains exactly one group. The "\d" part matches a single digit, 0-9. The (?:...) construct is the one the doc passage describes: it groups without capturing, so the "+" applies to the whole "\*test\d" sequence, but what it matches is not remembered for access through group() or groups(). A pattern can also refer back to earlier capturing groups (with \1, \2, and so on), but not to groups written with (?:...).

You may want to note that this is the most specific regular expression that would match your given example.

James
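
P.S. In case it helps, here is a small sketch of the difference between the two kinds of parentheses. The first pattern and the string s are the ones from the thread; the second pattern (r2), the backreference pattern and its sample string are only illustrations I made up for comparison.

```python
import re

s = 'test1?test2t-test3*test4*test5$test6#test7*test8'

# Outer (...) captures; inner (?:...) only groups so that + can repeat it.
r = re.compile(r'(test\d(?:\*test\d)+)')
found = r.findall(s)
print(found)        # ['test3*test4*test5', 'test7*test8']

m = r.search(s)
print(m.group(1))   # test3*test4*test5
print(m.groups())   # ('test3*test4*test5',)  -> only one group is remembered

# Illustration only: make the inner group capturing as well.
# findall() now returns a tuple per match, and the repeated inner
# group keeps just its last repetition.
r2 = re.compile(r'(test\d(\*test\d)+)')
inner = r2.findall(s)
print(inner)        # [('test3*test4*test5', '*test5'), ('test7*test8', '*test8')]

# Illustration only: \1 refers back to what capturing group 1 matched.
# The sample string here is hypothetical, not from the original question.
back = re.search(r'(test\d)\*\1', 'test3*test4*test4')
print(back.group(0))  # test4*test4
```

So with (?:...) you get plain strings back from findall(), which is why it was used in the original answer.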