#22223: reverse() escapes unreserved characters
----------------------------------+--------------------------------------
     Reporter:  erik.van.zijst@…  |                    Owner:  nobody
         Type:  Bug               |                   Status:  new
    Component:  Core (URLs)       |                  Version:  1.6
     Severity:  Normal            |               Resolution:
     Keywords:                    |             Triage Stage:  Unreviewed
    Has patch:  0                 |      Needs documentation:  0
  Needs tests:  0                 |  Patch needs improvement:  0
Easy pickings:  0                 |                    UI/UX:  0
----------------------------------+--------------------------------------

Comment (by aaugustin):

 This change is indeed documented in the release notes. It's a consequence
 of #13260. See [https://code.djangoproject.com/ticket/13260#comment:17 my
 analysis] for details.

 The change suggested here would still preserve the requirements of #13260,
 which was primarily concerned with % characters in variable parts of URLs.

 Now the question is -- what characters do we consider safe? By default
 [http://docs.python.org/2/library/urllib.html#urllib.quote urllib.quote]
 preserves `A-Za-z0-9_.-` and characters defined as safe, which default to
 `/`.
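
 As a quick illustration of that default, here is a minimal sketch. It assumes Python 3, where `urllib.quote` lives at `urllib.parse.quote`; the sample path is made up for the example and is not from the ticket.

```python
from urllib.parse import quote  # Python 3 location of urllib.quote

# Hypothetical sample path, chosen only to exercise the default safe list.
sample = "/a b/x:y@z=1&2"

# Default call: "/" is the only extra safe character.
default_quoted = quote(sample)

# With an empty safe list even "/" gets percent-encoded.
nothing_safe = quote(sample, safe="")

print(default_quoted)
print(nothing_safe)
```

 With the defaults, only letters, digits, `_.-` and `/` survive here; everything else in the sample is percent-encoded.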

 Based on RFC 1738:

 1. {{{ <>"#%{}|\^~[]`}}} are unsafe and must be encoded (that list
 includes the SPACE character).
 2. `;/?:@=&` are reserved and must be encoded unless they are used for
 their special meaning.
 3. `$-_.+!*'(),` are safe and need not be encoded.

 We can certainly put the third set of characters in the safe list.
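
 A minimal sketch of what adding the third set to the safe list could look like, assuming Python 3's `urllib.parse.quote`. The `quote_path` helper and the sample value are hypothetical and are not Django's actual implementation.

```python
from urllib.parse import quote

# Third set from the RFC 1738 list above.
RFC1738_SAFE = "$-_.+!*'(),"

def quote_path(value, extra_safe=RFC1738_SAFE):
    """Hypothetical helper: quote a path, keeping "/" and the third set as-is."""
    return quote(value, safe="/" + extra_safe)

original = "/files/report(final),v1.0+draft!.txt"
with_third_set = quote_path(original)   # third-set characters left untouched
stdlib_default = quote(original)        # parentheses, comma, "+" and "!" encoded

# Unsafe characters such as SPACE and "%" are still encoded.
still_encoded = quote_path("a b%")
```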

 If characters from the second set end up unencoded in URLs generated by
 Django, we start relying on user-agent quirks to re-encode them properly
 in HTTP request lines. However, `/` is part of this list and considered
 safe by the stdlib by default (which may not mean much; the stdlib
 contains many unfortunate API choices).


 Since the path segment always starts with a slash following the host and
 ends at the end of the URL or with one of `?`, `;` or `#` (which is always
 unsafe), we may choose to preserve `/:@=&`, that is, all of the second set
 except for `?` and `;`. If we want to be more careful, we may choose to
 preserve only `/` and `:` because `/` is safe by default and `:` is only
 used to separate the protocol from the remainder of the URL. That would
 resolve your problem.
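
 To compare the two candidate safe lists, here is a rough sketch, again assuming Python 3's `urllib.parse.quote`. The constant names and the sample value are made up for illustration.

```python
from urllib.parse import quote

# Candidate safe lists discussed above (names are hypothetical).
PERMISSIVE_SAFE = "/:@=&"   # second set minus "?" and ";"
CAREFUL_SAFE = "/:"         # only "/" and ":"

# Illustrative value containing every character from the second set.
value = "/user:name@host/k=v&x?y;z"

permissive = quote(value, safe=PERMISSIVE_SAFE)
careful = quote(value, safe=CAREFUL_SAFE)

print(permissive)
print(careful)
```

 In both variants `?` and `;` are encoded, so a reversed path cannot accidentally terminate the path segment.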

 Can you clarify how you came up with `/:@&=+$,`? If you're including some
 characters from the third set above, you should probably include all of
 them.

 The fix should be backported to 1.6.x since it's a regression.

-- 
Ticket URL: <https://code.djangoproject.com/ticket/22223#comment:2>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.
