[issue12910] urrlib.quote quotes too many chars, e.g., '()'

2017-07-10 Thread Mariatta Wijaya

Changes by Mariatta Wijaya :


--
stage:  -> patch review
versions:  -Python 2.7, Python 3.3, Python 3.4, Python 3.5, Python 3.6

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue12910] urrlib.quote quotes too many chars, e.g., '()'

2017-07-04 Thread Jörn Hees

Changes by Jörn Hees :


--
versions: +Python 2.7, Python 3.4, Python 3.5, Python 3.6, Python 3.7




[issue12910] urrlib.quote quotes too many chars, e.g., '()'

2017-07-04 Thread Jörn Hees

Jörn Hees added the comment:

It's been a while... Nowadays I would mostly change the documentation of the 
quote function to point out that it is likely to quote more characters than 
the spec strictly requires. The function has been in place for so long (even in 
py3) that people will rely on the behavior.

I made an attempt to update the docstring accordingly in 
https://github.com/python/cpython/pull/2568


What I think is most confusing is that the current docs mention the reserved 
chars (which are, by the way, definitely wrong with respect to RFC 3986). As one 
can see in the code, the reserved chars don't play any role for quote; what 
matters is the unreserved chars (called _ALWAYS_SAFE 
https://github.com/python/cpython/blob/master/Lib/urllib/parse.py#L716 ).

   unreserved  = ALPHA / DIGIT / "-" / "." / "_" / "~"

The current quote function's approach is to simply quote everything that is not 
in unreserved plus the characters passed via the safe argument.

In that respect it is quite close to the old JavaScript escape() function: 
https://www.w3schools.com/jsref/jsref_escape.asp
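
To illustrate the approach described above, here is a minimal sketch of the 
"quote everything that is not in unreserved + safe" rule. It is not the actual 
CPython implementation; the names UNRESERVED and simple_quote are made up for 
this example, and it assumes UTF-8 encoding and an ASCII-only safe argument.

```python
import string
from urllib.parse import quote

# RFC 3986 "unreserved" set: ALPHA / DIGIT / "-" / "." / "_" / "~"
# (hypothetical name; the stdlib keeps its own set in _ALWAYS_SAFE)
UNRESERVED = frozenset(string.ascii_letters + string.digits + "-._~")


def simple_quote(text, safe="/"):
    """Percent-encode every character that is not unreserved or in `safe`.

    Simplified sketch: always encodes as UTF-8 and assumes `safe` is ASCII.
    """
    keep = UNRESERVED | set(safe)
    out = []
    for ch in text:
        if ch in keep:
            out.append(ch)
        else:
            # One %XX escape per byte of the UTF-8 encoding
            out.extend("%{:02X}".format(b) for b in ch.encode("utf-8"))
    return "".join(out)


# Compare the sketch with the stdlib on a few inputs
for sample in ["()", "a b/c", "key=value&x=1"]:
    print(repr(sample), simple_quote(sample), quote(sample))
```

Note that the reserved characters never appear in the sketch: only the 
unreserved set and the safe argument decide what is left alone.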


quick links
py2.7: https://github.com/python/cpython/blob/2.7/Lib/urllib.py#L1261
py3: https://github.com/python/cpython/blob/master/Lib/urllib/parse.py#L745
RFC3986: https://tools.ietf.org/html/rfc3986#appendix-A

--




[issue12910] urrlib.quote quotes too many chars, e.g., '()'

2017-07-04 Thread Jörn Hees

Changes by Jörn Hees :


--
pull_requests: +2638




[issue12910] urrlib.quote quotes too many chars, e.g., '()'

2017-07-03 Thread Cheryl Sabella

Cheryl Sabella added the comment:

Issue 16285 updated the urllib.parse.quote() reserved list to add '~'.

From the docstring:
def quote(string, safe='/', encoding=None, errors=None):
    """quote('abc def') -> 'abc%20def'

    Each part of a URL, e.g. the path info, the query, etc., has a
    different set of reserved characters that must be quoted.

    RFC 3986 Uniform Resource Identifiers (URI): Generic Syntax lists
    the following reserved characters.

    reserved    = ";" | "/" | "?" | ":" | "@" | "&" | "=" | "+" |
                  "$" | "," | "~"

    Each of these characters is reserved in some component of a URL,
    but not necessarily in all of them.

    Python 3.7 updates from using RFC 2396 to RFC 3986 to quote URL strings.
    Now, "~" is included in the set of reserved characters.


However, looking at RFC3986 (https://tools.ietf.org/html/rfc3986), appendix A 
has the following:

   unreserved  = ALPHA / DIGIT / "-" / "." / "_" / "~"
   reserved    = gen-delims / sub-delims
   gen-delims  = ":" / "/" / "?" / "#" / "[" / "]" / "@"
   sub-delims  = "!" / "$" / "&" / "'" / "(" / ")"
               / "*" / "+" / "," / ";" / "="


Should the missing ones be added or should this issue be closed if they aren't 
going to be added?
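
For reference, the difference between the two lists can be computed directly. 
This is only a sketch: the variable names are made up for the example, and the 
sets are copied by hand from the docstring and from RFC 3986 appendix A as 
quoted above.

```python
# Characters listed as "reserved" in the quote() docstring quoted above
DOCSTRING_RESERVED = set(";/?:@&=+$,~")

# RFC 3986 appendix A: reserved = gen-delims / sub-delims
GEN_DELIMS = set(":/?#[]@")
SUB_DELIMS = set("!$&'()*+,;=")
RFC3986_RESERVED = GEN_DELIMS | SUB_DELIMS

# Reserved per the RFC but absent from the docstring list
missing = sorted(RFC3986_RESERVED - DOCSTRING_RESERVED)
# Listed in the docstring but not reserved per the RFC
extra = sorted(DOCSTRING_RESERVED - RFC3986_RESERVED)

print("missing from docstring:", missing)
print("not reserved in RFC 3986:", extra)
```

By this comparison, "~" is the only docstring entry that RFC 3986 does not 
treat as reserved; there it belongs to the unreserved set.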

Thanks.

--
nosy: +csabella




[issue12910] urrlib.quote quotes too many chars, e.g., '()'

2011-09-08 Thread Ezio Melotti

Changes by Ezio Melotti ezio.melo...@gmail.com:


--
nosy: +ezio.melotti

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue12910
___



[issue12910] urrlib.quote quotes too many chars, e.g., '()'

2011-09-06 Thread Jörn Hees

New submission from Jörn Hees nrej9...@joernhees.de:

urllib.quote('()')
returns '%28%29'

Looking into its code, it tries to follow RFC 2396 (which is good, even though 
it should follow RFC 3986 nowadays), but it doesn't:

http://tools.ietf.org/html/rfc2396 (see Appendix A, p. 27): ( and ) are in 
"mark" and therefore unreserved, so why are they quoted?
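
A small check of which RFC 2396 "mark" characters get escaped. This sketch 
assumes Python 3, where the function lives at urllib.parse.quote (the report 
above is about urllib.quote in Python 2); RFC2396_MARK is a name made up for 
the example, and the exact result can differ between Python versions.

```python
from urllib.parse import quote

# RFC 2396 appendix A: mark = "-" | "_" | "." | "!" | "~" | "*" | "'" | "(" | ")"
RFC2396_MARK = "-_.!~*'()"

# Collect the mark characters that quote() percent-encodes anyway
escaped = [ch for ch in RFC2396_MARK if quote(ch, safe="") != ch]

for ch in escaped:
    print(repr(ch), "->", quote(ch, safe=""))
```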

--
components: Library (Lib)
messages: 143592
nosy: joern
priority: normal
severity: normal
status: open
title: urrlib.quote quotes too many chars, e.g., '()'
type: behavior
versions: Python 2.6, Python 2.7




[issue12910] urrlib.quote quotes too many chars, e.g., '()'

2011-09-06 Thread Senthil Kumaran

Senthil Kumaran sent...@uthcode.com added the comment:

It can aggressively put these chars !~*'() in the safe list.  I will look at 
the history to see whether they were originally present and were removed for 
some reason, or whether they never made it into the list in the first place. 

If we do add them, it should be in 3.3 only (someone could be relying on the 
old behavior).
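
Until the defaults change, callers can get this behavior themselves through the 
existing safe argument. A minimal sketch, assuming Python 3's 
urllib.parse.quote; MARK_EXTRAS and quote_keeping_marks are hypothetical names 
used only for this example.

```python
from urllib.parse import quote

# The characters mentioned above that would be added to the safe list
MARK_EXTRAS = "!~*'()"


def quote_keeping_marks(text, safe="/"):
    """Like quote(), but also leaves the RFC 2396 mark characters alone."""
    return quote(text, safe=safe + MARK_EXTRAS)


print(quote("f(x)!"))                # default behavior escapes ( ) and !
print(quote_keeping_marks("f(x)!"))  # mark characters are left as-is
```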

--
assignee:  -> orsenthil
nosy: +orsenthil
versions: +Python 3.3 -Python 2.6, Python 2.7




[issue12910] urrlib.quote quotes too many chars, e.g., '()'

2011-09-06 Thread Éric Araujo

Changes by Éric Araujo mer...@netwok.org:


--
nosy: +eric.araujo
