Your improvements work great, thank you. And thank you for the very 
detailed explanations!
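
For anyone who finds this thread in the archive, here is a minimal sketch of the corrected substitution, based on the explanation quoted below. The helper name bbcode_urls_to_html and the sample string are made up for illustration. The sketch also assumes a single capture group (so the backreference is "\1" rather than the "\2" used in the quoted examples); it is one way to write this, not the only one.

```python
import re

# Raw string literals keep "\1" as a backreference instead of chr(1).
# The non-greedy (.*?) stops at the nearest [/URL] rather than the last one.
URL_TAG = re.compile(r"\[URL\](.*?)\[/URL\]", flags=re.I)

# Greedy variant, kept only to show the difference.
GREEDY_URL_TAG = re.compile(r"\[URL\](.*)\[/URL\]", flags=re.I)


def bbcode_urls_to_html(text):
    """Replace every [URL]...[/URL] pair with an HTML anchor (hypothetical helper)."""
    return URL_TAG.sub(r'<a href="\1">\1</a>', text)


sample = "[URL]www.link1.com[/URL] and [url]www.link2.com[/url]"

# Each tag pair becomes its own link.
print(bbcode_urls_to_html(sample))

# The greedy pattern swallows everything between the first and last tag.
print(GREEDY_URL_TAG.sub(r'<a href="\1">\1</a>', sample))
```

Compiling the pattern once is optional; calling re.sub directly with the same pattern string and flags=re.I behaves the same way.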

On Tuesday, March 8, 2016 at 9:41:11 AM UTC+1, Michal Petrucha wrote:
>
> On Mon, Mar 07, 2016 at 05:44:08PM -0800, jorr...@gmail.com wrote:
> > I'm trying to replace *[URL]www.link.com[/URL]* with HTML with this regexp:
> >
> > topic.text = re.sub("(\[URL\])(.*)(\[\/URL\])", '<a href="$2">$2</a>', topic.text, flags=re.I)
> > 
> > But it's giving me the following problems: 
> > 
> >    1. The $2 capture group is only able to be repeated once, so I get 
> >    <a href="www.link.com">$2</a> 
> >    instead of 
> >    <a href="www.link.com">www.link.com</a> 
>
> I have my doubts – if you use the standard Python re library, then the 
> way to refer to captured groups is "\1", "\2", etc., not "$1". When I 
> try the code you posted above, I get the following result (i.e., not 
> even the first occurrence of "$2" gets substituted):: 
>
>     >>> re.sub("(\[URL\])(.*)(\[\/URL\])", '<a href="$2">$2</a>', '[URL]www.link.com[/URL]', flags=re.I)
>     '<a href="$2">$2</a>' 
>
> In order to make the substitution work for a single occurrence of
> [URL]...[/URL], you can use the following, which uses "\2". (Also, when
> writing regular expressions, or other strings that are supposed to
> contain the backslash character, it is a good idea to write them as
> raw string literals, i.e. prefix them with an "r", which I've done
> below. That way, Python won't try to interpret the backslashes as
> escape sequences; otherwise, "\2" would become the character with an
> ASCII value of 2.)::
>
>     >>> re.sub(r"(\[URL\])(.*)(\[\/URL\])", r'<a href="\2">\2</a>', '[URL]www.link.com[/URL]', flags=re.I)
>     '<a href="www.link.com">www.link.com</a>' 
>
> >    2. Only the first *[URL]* is matched. Everything after the first *[/URL]*
> >    is simply deleted...
>
> The solution above gets you halfway there. re.sub will replace all
> matches by default; the problem here is that the "(.*)" part of your
> regex will match everything between the first "[URL]" and the last
> "[/URL]"::
>
>     >>> re.sub(r"(\[URL\])(.*)(\[\/URL\])", r'<a href="\2">\2</a>', '[URL]www.link1.com[/URL][URL]www.link2.com[/URL][URL]www.link3.com[/URL]', flags=re.I)
>     '<a href="www.link1.com[/URL][URL]www.link2.com[/URL][URL]www.link3.com">www.link1.com[/URL][URL]www.link2.com[/URL][URL]www.link3.com</a>'
>
> The reason is that the asterisk operator in a regex is greedy, which 
> means a ".*" will try to match as much as possible. When you use the 
> non-greedy version of the operator (which you get by putting a 
> question mark after the asterisk), you get the result you want:: 
>
>     >>> re.sub(r"(\[URL\])(.*?)(\[\/URL\])", r'<a href="\2">\2</a>', '[URL]www.link1.com[/URL][URL]www.link2.com[/URL][URL]www.link3.com[/URL]', flags=re.I)
>     '<a href="www.link1.com">www.link1.com</a><a href="www.link2.com">www.link2.com</a><a href="www.link3.com">www.link3.com</a>'
>
>
> You can read an explanation of the difference between greedy and 
> non-greedy regular expressions in the Python docs: 
> https://docs.python.org/2/howto/regex.html#greedy-versus-non-greedy 
>
> Good luck, 
>
> Michal 
>
> >     
> > I hope someone can help me with this. I'm using Python 2.7 if it makes a 
> > difference. 
> > 

-- 
You received this message because you are subscribed to the Google Groups 
"Django users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-users+unsubscr...@googlegroups.com.
To post to this group, send email to django-users@googlegroups.com.
Visit this group at https://groups.google.com/group/django-users.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-users/6d3e0a68-ec36-4a7a-bcb5-c57a775e8e59%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
