Urlencode vs. iriencode
What's the difference between the template filters urlencode and iriencode? When should I use one over the other (or use both)? -- You received this message because you are subscribed to the Google Groups "Django users" group. To post to this group, send email to django-us...@googlegroups.com. To unsubscribe from this group, send email to django-users+unsubscr...@googlegroups.com. For more options, visit this group at http://groups.google.com/group/django-users?hl=en.
Re: Urlencode vs. iriencode
Anyone? I haven't found anything that describes the difference (except that one is for URIs and the other for URLs).

On Sep 4, 8:52 am, Jordon Wii wrote:
> What's the difference between the template filters urlencode and
> iriencode? When should I use one over the other (or use both)?
Re: Urlencode vs. iriencode
Here's the code for the two.

iriencode:

    """
    Convert an Internationalized Resource Identifier (IRI) portion to a URI
    portion that is suitable for inclusion in a URL.

    This is the algorithm from section 3.1 of RFC 3987. However, since we are
    assuming input is either UTF-8 or unicode already, we can simplify things a
    little from the full method.

    Returns an ASCII string containing the encoded result.
    """
    # The list of safe characters here is constructed from the "reserved" and
    # "unreserved" characters specified in sections 2.2 and 2.3 of RFC 3986:
    #     reserved    = gen-delims / sub-delims
    #     gen-delims  = ":" / "/" / "?" / "#" / "[" / "]" / "@"
    #     sub-delims  = "!" / "$" / "&" / "'" / "(" / ")"
    #                   / "*" / "+" / "," / ";" / "="
    #     unreserved  = ALPHA / DIGIT / "-" / "." / "_" / "~"
    # Of the unreserved characters, urllib.quote already considers all but
    # the ~ safe.
    # The % character is also added to the list of safe characters here, as the
    # end of section 3.1 of RFC 3987 specifically mentions that % must not be
    # converted.
    if iri is None:
        return iri
    return urllib.quote(smart_str(iri), safe="/#%[]=:;$&()+,!?*@'~")

urlencode:

    """
    A version of Python's urllib.quote() function that can operate on unicode
    strings. The url is first UTF-8 encoded before quoting. The returned string
    can safely be used as part of an argument to a subsequent iri_to_uri() call
    without double-quoting occurring.
    """
    return force_unicode(urllib.quote(smart_str(url), safe='/'))

So iriencode leaves the characters that give a URL its structure alone (hence the longer list of safe characters), while urlencode will encode the entire URL, including the delimiters for any GET arguments and anchors.

As for usage, I haven't encountered any IRIs, but I believe IRIs need to be encoded before inclusion in HTML (i.e. you can't just include the non-ASCII characters in HTML). As for urlencode, its main purpose is for when you're including a URL in a form submission, e.g. the URL to go to after login. urlencode escapes everything that iriencode does, plus the reserved characters and %, but sometimes you might not want it to do that.

On 5 September 2010 08:17, Jordon Wii wrote:
> Anyone? I haven't found anything that describes the difference
> (except that one is for URIs and the other for URLs).
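To make the difference concrete, here is a minimal sketch that applies the two safe-character lists from the code above to the same string. It assumes Python 3, where `urllib.parse.quote` stands in for the `urllib.quote` call in the Django source. The names `iriencode_sketch` and `urlencode_sketch` and the sample URL are made up for illustration; they are not Django's own functions, and Django's extra handling (`smart_str`, `force_unicode`, the `None` check) is left out.

```python
from urllib.parse import quote

# Safe-character lists copied from the two snippets quoted in the thread.
IRI_SAFE = "/#%[]=:;$&()+,!?*@'~"
URL_SAFE = "/"


def iriencode_sketch(value):
    """Approximation of iriencode: URL delimiters and '%' are left untouched."""
    return quote(value, safe=IRI_SAFE)


def urlencode_sketch(value):
    """Approximation of urlencode: everything except '/' and unreserved characters is escaped."""
    return quote(value, safe=URL_SAFE)


# Hypothetical URL containing a query string, a fragment and a non-ASCII character.
full_url = "/search/?q=café au lait&page=2#results"

print(iriencode_sketch(full_url))
# /search/?q=caf%C3%A9%20au%20lait&page=2#results

print(urlencode_sketch(full_url))
# /search/%3Fq%3Dcaf%C3%A9%20au%20lait%26page%3D2%23results

# '%' is in the IRI safe list, so already-escaped input passes through unchanged,
# while the urlencode-style call escapes the '%' again.
print(iriencode_sketch("caf%C3%A9"))  # caf%C3%A9
print(urlencode_sketch("caf%C3%A9"))  # caf%25C3%25A9
```

In this sketch the iriencode-style call keeps the URL usable as a link, while the urlencode-style call turns the whole thing into a single opaque value, which is what you would want when passing one URL as a parameter inside another.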
Re: Urlencode vs. iriencode
Awesome, thank you. So as far as escaping user-entered data for use in URLs...urlencode is best?

On Sep 4, 8:45 pm, Sam Lai wrote:
> urlencode will do everything that iriencode does, but
> sometimes you might not want it to do that.
Re: Urlencode vs. iriencode
On 5 September 2010 16:18, Jordon Wii wrote:
> Awesome, thank you. So as far as escaping user-entered data for use
> in URLs...urlencode is best?

I'd say so. I haven't really found a need for iriencode personally, and from the code, it seems that urlencode does everything iriencode does anyway.
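As a rough illustration of why the stricter escaping suits user-entered data, the sketch below puts the same text into a query string using each safe-character list from the quoted code and then parses it back. It assumes Python 3's `urllib.parse` in place of the `urllib.quote` call in the Django source; `query_string` and the sample input are hypothetical and only stand in for what the template filters would do.

```python
from urllib.parse import parse_qs, quote

# Safe-character lists taken from the two functions quoted earlier in the thread.
URL_SAFE = "/"
IRI_SAFE = "/#%[]=:;$&()+,!?*@'~"


def query_string(user_text, safe):
    """Hypothetical helper: build a one-parameter query string from raw user input."""
    return "q=" + quote(user_text, safe=safe)


user_text = "tom & jerry?"

escaped = query_string(user_text, URL_SAFE)         # urlencode-style escaping
passed_through = query_string(user_text, IRI_SAFE)  # iriencode-style escaping

print(escaped)                   # q=tom%20%26%20jerry%3F
print(parse_qs(escaped))         # {'q': ['tom & jerry?']}

print(passed_through)            # q=tom%20&%20jerry?
print(parse_qs(passed_through))  # the literal '&' splits the value, so the input is not recovered
```

With the urlencode-style list the value survives the round trip; with the iriencode-style list the `&` typed by the user is treated as a parameter separator.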