[issue3991] urllib.request.urlopen does not handle non-ASCII characters

2019-02-20 Thread Rémi Lapeyre
Change by Rémi Lapeyre: nosy: +remi.lapeyre

[issue3991] urllib.request.urlopen does not handle non-ASCII characters

2019-02-20 Thread Diego Rojas
Change by Diego Rojas: components: +Extension Modules -Library (Lib), Unicode; type: enhancement -> behavior; versions: +Python 3.4, Python 3.5, Python 3.6, Python 3.8

[issue3991] urllib.request.urlopen does not handle non-ASCII characters

2017-01-29 Thread R. David Murray
R. David Murray added the comment: I believe the last time this subject was discussed the conclusion was that we really needed a full IRI module that conformed to the relevant RFCs, and that putting something on pypi would be one way to get there. Someone should research the existing

[issue3991] urllib.request.urlopen does not handle non-ASCII characters

2017-01-28 Thread Martin Panter
Martin Panter added the comment: I’m not really an expert on non-ASCII URLs / IRIs. Maybe it is obvious to other people that this is a good general implementation, but for me to thoroughly review it I would need time to research the relevant RFCs, other implementations, suitability for the

[issue3991] urllib.request.urlopen does not handle non-ASCII characters

2017-01-27 Thread Andreas Åkerlund
Andreas Åkerlund added the comment: Changed the patch after pointers from vadmium. And quote_uri is changed to quote_iri as martin.panter thought it was more appropriate. Added file: http://bugs.python.org/file46440/issue3991_2017-01-27.diff

[issue3991] urllib.request.urlopen does not handle non-ASCII characters

2017-01-18 Thread Martin Panter
Martin Panter added the comment: Issue 9679: Focusses on encoding just the DNS name. Issue 20559: Maybe a duplicate, or opportunity for better documentation or error message as a bug fix? Andreas’s patch just proposes a new function called quote_uri(). It would need documentation. We already

[issue3991] urllib.request.urlopen does not handle non-ASCII characters

2014-05-28 Thread Graham Oliver
Graham Oliver added the comment: Hello, I came across this bug when using 'ā' in a URL. To get around the problem I used the 'URL encoded' version '%C4%81' instead of 'ā'. See this page: http://www.charbase.com/0101-unicode-latin-small-letter-a-with-macron I tried using the 'puny code' for 'ā'
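
A minimal sketch of the workaround described above, assuming the character sits in the path part of the URL and that UTF-8 is the intended encoding (the default for urllib.parse.quote). The variable names are illustrative only.

```python
from urllib.parse import quote, unquote

char = "ā"  # U+0101 LATIN SMALL LETTER A WITH MACRON

# quote() percent-encodes the UTF-8 bytes of the character.
encoded = quote(char)
print(encoded)

# unquote() reverses the escaping, again assuming UTF-8.
print(unquote(encoded) == char)
```

The encoded form is pure ASCII, so it can be placed in the path of a URL passed to urlopen().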

[issue3991] urllib.request.urlopen does not handle non-ASCII characters

2013-02-25 Thread STINNER Victor
Changes by STINNER Victor victor.stin...@gmail.com: nosy: +haypo

[issue3991] urllib.request.urlopen does not handle non-ASCII characters

2013-02-23 Thread Andreas Åkerlund
Andreas Åkerlund added the comment: This is a patch against 3.2 adding urllib.parse.quote_uri. It splits the URI into 5 parts (protocol, authentication, hostname, port and path), then runs urllib.parse.quote on the path and encodes the hostname to punycode if it's not in ASCII. It's not perfect,
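
The approach described above could look roughly like the following. This is a sketch under stated assumptions, not the code from the attached patch: the name quote_iri_sketch is hypothetical, the splitting is delegated to urllib.parse.urlsplit, the hostname is converted with Python's built-in "idna" codec, and the query and fragment are passed through unchanged. IPv6 literal hosts are not handled.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def quote_iri_sketch(iri):
    """Hypothetical helper; NOT the function from the attached patch."""
    parts = urlsplit(iri)

    # Hostname: apply IDNA ("punycode") only when it contains non-ASCII.
    host = parts.hostname or ""
    if not host.isascii():
        host = host.encode("idna").decode("ascii")

    # Rebuild the netloc from authentication, hostname and port.
    netloc = host
    if parts.username is not None:
        userinfo = quote(parts.username, safe="")
        if parts.password is not None:
            userinfo += ":" + quote(parts.password, safe="")
        netloc = userinfo + "@" + netloc
    if parts.port is not None:
        netloc += ":" + str(parts.port)

    # Path: percent-encode as UTF-8; "%" is kept safe so that escapes
    # already present in the input are not quoted a second time.
    path = quote(parts.path, safe="/%")

    # Assumption: query and fragment are left exactly as given.
    return urlunsplit((parts.scheme, netloc, path, parts.query, parts.fragment))


result = quote_iri_sketch("http://bücher.example/ā")
print(result)
```

Keeping "%" safe is a design choice that makes the helper idempotent on already-quoted input, at the cost of not escaping a literal percent sign in the path.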

[issue3991] urllib.request.urlopen does not handle non-ASCII characters

2012-10-02 Thread Ezio Melotti
Changes by Ezio Melotti ezio.melo...@gmail.com: versions: +Python 3.2, Python 3.3, Python 3.4 -Python 3.0

[issue3991] urllib.request.urlopen does not handle non-ASCII characters

2009-04-22 Thread Daniel Diniz
Changes by Daniel Diniz aja...@gmail.com: priority: -> normal

[issue3991] urllib.request.urlopen does not handle non-ASCII characters

2009-02-12 Thread Daniel Diniz
Changes by Daniel Diniz aja...@gmail.com: nosy: +orsenthil

[issue3991] urllib.request.urlopen does not handle non-ASCII characters

2009-02-11 Thread Daniel Diniz
Changes by Daniel Diniz aja...@gmail.com: components: +Library (Lib); keywords: +easy; stage: -> test needed

[issue3991] urllib.request.urlopen does not handle non-ASCII characters

2009-02-08 Thread Daniel Diniz
Daniel Diniz aja...@gmail.com added the comment: I think Toshio's use case is important enough to deserve a fix (patch attached) or a special-cased error message. IMO, newbies trying to fix failures from urlopen may have a hard time figuring out the maze: urlopen -> _opener -> open -> _open ->

[issue3991] urllib.request.urlopen does not handle non-ASCII characters

2009-02-08 Thread Ezio Melotti
Changes by Ezio Melotti ezio.melo...@gmail.com: nosy: +ezio.melotti

[issue3991] urllib.request.urlopen does not handle non-ASCII characters

2008-10-01 Thread Toshio Kuratomi
Toshio Kuratomi added the comment: Oh, that's cool. I've been fine with this being a request for a needed function to quote and unquote full URLs rather than a bug in urlopen(). I think IRIs are a distraction here, though. The RFC for IRIs even says that specifications that

[issue3991] urllib.request.urlopen does not handle non-ASCII characters

2008-09-30 Thread Bill Janssen
Bill Janssen added the comment: It's not immediately clear to me how an auto-quote function can be written; as you say (and as the URI spec points out), you have to take a URL apart before quoting it, and you can't parse an invalid URL, which is what the input is. Best to
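
The point about taking a URL apart before quoting can be illustrated with a small sketch. The example URL is made up, and the piecewise variant assumes the input is at least well-formed enough for urllib.parse.urlsplit to find the path, which is exactly the assumption questioned above.

```python
from urllib.parse import quote, urlsplit, urlunsplit

url = "http://example.com/a b"  # illustrative input with an unescaped space

# Quoting the whole string also escapes the ":" after the scheme.
whole = quote(url)
print(whole)

# Splitting first and quoting only the path keeps the delimiters intact.
parts = urlsplit(url)
piecewise = urlunsplit(parts._replace(path=quote(parts.path)))
print(piecewise)
```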

[issue3991] urllib.request.urlopen does not handle non-ASCII characters

2008-09-30 Thread Toshio Kuratomi
Toshio Kuratomi added the comment: The purpose of such a function would be to take something that is not a valid URI but 1) is a common way of expressing the way to get to the resource and 2) follows certain rules, and turn that into something that is a valid URI. Non-ASCII

[issue3991] urllib.request.urlopen does not handle non-ASCII characters

2008-09-29 Thread Bill Janssen
Bill Janssen added the comment: As I read RFC 2396, 1.5: A URI is a sequence of characters from a very limited set, i.e. the letters of the basic Latin alphabet, digits, and a few special characters. 2.4: Data must be escaped if it does not have a representation using

[issue3991] urllib.request.urlopen does not handle non-ASCII characters

2008-09-29 Thread Toshio Kuratomi
Toshio Kuratomi added the comment: Possibly. This is a change from Python 2.x's urlopen(), which escaped the URL automatically, though. I can see the case for having the user call an escape function themselves instead of having urlopen() perform the escape for them. However,

[issue3991] urllib.request.urlopen does not handle non-ASCII characters

2008-09-28 Thread Toshio Kuratomi
New submission from Toshio Kuratomi: Tested on python-3.0rc1, Linux Fedora 9. I wanted to make sure that Python 3.0 would handle URLs in different encodings. So I created two files on an Apache server, both named ½ñ.html. One of the filenames was encoded in utf-8 and
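
To see why the byte encoding of the filename matters, the following sketch percent-encodes the same name under two encodings. UTF-8 is named in the report; the second encoding is cut off in the snippet, so latin-1 is used here purely as an assumption for illustration.

```python
from urllib.parse import quote

name = "½ñ.html"

# Same characters, different bytes on disk, so different escaped URLs.
utf8_form = quote(name, encoding="utf-8")
latin1_form = quote(name, encoding="latin-1")  # assumed second encoding

print(utf8_form)
print(latin1_form)
```

A server that stores the name as raw bytes will only match the request whose escapes correspond to the bytes actually on disk.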