On Mar 29, 3:22 pm, "aspineux" <[EMAIL PROTECTED]> wrote:
> I want to parse
>
> '[EMAIL PROTECTED]' or '<[EMAIL PROTECTED]>' and get the email address [EMAIL PROTECTED]
>
> the regex is
>
> r'<[EMAIL PROTECTED]>|[EMAIL PROTECTED]'
>
> now, I want to give it a name
>
> r'<(?P<email>[EMAIL PROTECTED])>|(?P<email>[EMAIL PROTECTED])'
>
> sre_constants.error: redefinition of group name 'email' as group 2;
> was group 1
>
> BUT because I use a | , I will get only one group named 'email' !
>
> Any comment ?
>
> PS: I know the solution for this case is to use
> r'(?P<lt><)?(?P<email>[EMAIL PROTECTED])(?(lt)>)'

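For anyone wanting to reproduce the error and the conditional workaround from the PS, here is a minimal sketch. The address pattern is redacted in this archive, so ADDR below is a hypothetical stand-in (an assumption, not the original regex and not a real email validator), and compile_error is just an illustrative helper name.

```python
import re

# Hypothetical stand-in for the redacted address pattern (assumption).
ADDR = r'[^<>\s@]+@[^<>\s@]+'

def compile_error(pattern):
    """Return the re.error message for pattern, or None if it compiles."""
    try:
        re.compile(pattern)
    except re.error as exc:
        return str(exc)
    return None

# Same group name in both alternatives: rejected when the pattern is compiled.
duplicate = r'<(?P<email>' + ADDR + r')>|(?P<email>' + ADDR + r')'
print(compile_error(duplicate))

# Conditional form from the PS: require '>' only if the '<' group matched.
conditional = re.compile(r'(?P<lt><)?(?P<email>' + ADDR + r')(?(lt)>)')
for s in ('user@example.com', '<user@example.com>'):
    print(conditional.search(s).group('email'))
```

The conditional form keeps a single 'email' group, at the cost of a slightly harder-to-read pattern.
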
Use two group names, one for each alternative. If you don't care which
alternative matched, take whichever group is not None, like this:

>>> import re
>>> s1 = '[EMAIL PROTECTED]'
>>> s2 = '<[EMAIL PROTECTED]>'
>>> matchobj = re.search(r'<(?P<email1>[EMAIL PROTECTED])>|(?P<email2>[EMAIL PROTECTED])', s1)
>>> matchobj.groupdict()['email1'] or matchobj.groupdict()['email2']
'[EMAIL PROTECTED]'
>>> matchobj = re.search(r'<(?P<email1>[EMAIL PROTECTED])>|(?P<email2>[EMAIL PROTECTED])', s2)
>>> matchobj.groupdict()['email1'] or matchobj.groupdict()['email2']
'[EMAIL PROTECTED]'
>>>
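
Since the addresses and pattern are redacted in the session above, here is a self-contained sketch of the same two-group idea wrapped in a function. ADDR is a hypothetical stand-in pattern and extract_email is an illustrative name; both are assumptions, not part of the original post.

```python
import re

# Hypothetical stand-in for the redacted address pattern (assumption);
# deliberately simple, not a full email validator.
ADDR = r'[^<>\s@]+@[^<>\s@]+'

# Two distinct group names, one per alternative, so the pattern compiles.
EMAIL_RE = re.compile(r'<(?P<email1>' + ADDR + r')>|(?P<email2>' + ADDR + r')')

def extract_email(text):
    """Return the first address found in text, bracketed or bare, else None."""
    m = EMAIL_RE.search(text)
    if m is None:
        return None
    # Only one alternative can match, so at most one group is not None.
    return m.group('email1') or m.group('email2')

print(extract_email('user@example.com'))
print(extract_email('<user@example.com>'))
print(extract_email('no address here'))
```

The `or` works here because an unmatched named group comes back as None and a matched address is never an empty string under this stand-in pattern.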

- Paddy.

-- 
http://mail.python.org/mailman/listinfo/python-list
