[EMAIL PROTECTED] wrote:
> James Stroud wrote:
> 
>>[EMAIL PROTECTED] wrote:
>>
>>>hi
>>>suppose i have a string like
>>>
>>>test1?test2t-test3*test4*test5$test6#test7*test8
>>>
>>>how can i construct the regexp to get test3*test4*test5 and
>>>test7*test8, ie, i want to match * and the words before and after?
>>>thanks
>>>
>>
>>
>>py> import re
>>py> s = 'test1?test2t-test3*test4*test5$test6#test7*test8'
>>py> r = re.compile(r'(test\d(?:\*test\d)+)')
>>py> r.findall(s)
>>['test3*test4*test5', 'test7*test8']
>>
>>James
> 
> 
> thanks !
> I check the regexp doc it says:
> """
> (?:...)
>     A non-grouping version of regular parentheses. Matches whatever
> regular expression is inside the parentheses, but the substring matched
> by the group cannot be retrieved after performing a match or referenced
> later in the pattern.
> """
> but i could not understand this : r'(test\d(?:\*test\d)+)'. which
> parenthesis is it referring to? Sorry, could you explain the solution ?
> thanks
> 

The outer parentheses form a capturing group. Whatever it matches is 
saved and accessible from a match object via the group() or groups() 
methods. The "\d" part matches a single digit 0-9. The (?:...) construct 
makes a non-capturing group: it groups "\*test\d" so that the "+" applies 
to the whole thing, but what it matches is not remembered for access 
through group() or groups(). A pattern can also refer back to earlier 
capturing groups, but not to groups written with (?:...).
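
To make the difference concrete, here is a small sketch comparing the
pattern above with a variant where the inner group is an ordinary
capturing one. The variable names are only for illustration.

```python
import re

s = 'test1?test2t-test3*test4*test5$test6#test7*test8'

# Outer (...) captures the whole run; inner (?:...) only groups for the "+".
non_capturing = re.compile(r'(test\d(?:\*test\d)+)')
# Same pattern, but the inner group is a capturing group too.
capturing = re.compile(r'(test\d(\*test\d)+)')

# One capturing group: findall returns a list of strings.
whole_runs = non_capturing.findall(s)
print(whole_runs)   # ['test3*test4*test5', 'test7*test8']

# Two capturing groups: findall returns one tuple per match, and the
# repeated inner group keeps only its last repetition.
pairs = capturing.findall(s)
print(pairs)        # [('test3*test4*test5', '*test5'), ('test7*test8', '*test8')]

# groups() on a match object only reports capturing groups.
first_groups = non_capturing.search(s).groups()
print(first_groups)  # ('test3*test4*test5',)
```

With the non-capturing form, findall() hands back just the strings you
asked for, which is why I used it.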

You may want to note that this regular expression is very specific to 
your given example: it only matches the literal "test" followed by a 
single digit.
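
If your real data is not always "test" plus one digit, a looser pattern
might be closer to what you want. The following is only a sketch, assuming
the "words" are runs of letters, digits and underscores (\w+); the second
input string is made up for illustration.

```python
import re

# One or more word characters, then one or more "*word" repetitions.
# No capturing groups, so findall returns the whole match each time.
general = re.compile(r'\w+(?:\*\w+)+')

s = 'test1?test2t-test3*test4*test5$test6#test7*test8'
runs = general.findall(s)
print(runs)    # ['test3*test4*test5', 'test7*test8']

# Hypothetical input with words that are not "test" + digit.
other = general.findall('foo*bar baz')
print(other)   # ['foo*bar']
```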

James
-- 
http://mail.python.org/mailman/listinfo/python-list
