[issue22231] httplib: unicode url will cause an ascii codec error when combined with a utf-8 string header

2021-06-20 Thread Irit Katriel


Irit Katriel  added the comment:

This looks like a 2.7-only issue.

--
nosy: +iritkatriel
resolution:  -> out of date
stage:  -> resolved
status: open -> closed

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue22231] httplib: unicode url will cause an ascii codec error when combined with a utf-8 string header

2017-01-06 Thread Herman Schistad

Herman Schistad added the comment:

I can confirm that this patch solves the issues I've had where I can submit 
multipart forms provided I have a string URL, but not if it's unicode.

I'm using Python 2.7.12. Applying the patch fixes the issue.

Code which breaks, assuming the file contains binary data:


# -*- encoding: utf-8 -*-
import urllib3

pool_manager = urllib3.PoolManager(num_pools=2)
url = u'http://example.org/form'  # removing the 'u' fixes it
# Read the file in binary mode so the content stays a byte string.
with open('/some/binary/file', 'rb') as f:
    content = f.read()
fields = [
    ('foo', 'something'),
    ('bar', ('/some/binary/file', content, 'application/octet-stream')),
]
pool_manager.request("POST", url, fields=fields, encode_multipart=True,
                     headers={})

--
nosy: +Herman Schistad




[issue22231] httplib: unicode url will cause an ascii codec error when combined with a utf-8 string header

2015-01-03 Thread Bob Chen

Bob Chen added the comment:

Is there any possibility that we could wrap urllib.quote inside httplib? Many 
developers don't know about this utility function. And as I mentioned above, 
they could get a unicode URL from anywhere inside Python, such as an API call, 
without noticing that it is potentially wrong.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue22231
___



[issue22231] httplib: unicode url will cause an ascii codec error when combined with a utf-8 string header

2015-01-03 Thread Bob Chen

Changes by Bob Chen 175818...@qq.com:


Removed file: http://bugs.python.org/file36492/httplib.py.patch




[issue22231] httplib: unicode url will cause an ascii codec error when combined with a utf-8 string header

2015-01-03 Thread Bob Chen

Changes by Bob Chen 175818...@qq.com:


Added file: http://bugs.python.org/file37592/httplib.py.patch




[issue22231] httplib: unicode url will cause an ascii codec error when combined with a utf-8 string header

2015-01-03 Thread Bob Chen

Bob Chen added the comment:

How about this patch?

--




[issue22231] httplib: unicode url will cause an ascii codec error when combined with a utf-8 string header

2015-01-03 Thread Demian Brecht

Demian Brecht added the comment:

utf-8 encoding is only one step in IRI encoding. Correct IRI encoding is 
non-trivial and doesn't fall into the support policy for 2.7 (bug/security fixes). 
I think that the best that can be done for 2.7 is to enhance the documentation 
around HTTPConnection.__init__ (unicode hostnames should be IDNA-encoded with 
the built-in IDNA encoder) and HTTPConnection.request/putrequest noting that 
unicode paths should be IRI encoded, with a link to RFC 3987.
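
The built-in IDNA encoder mentioned here is the `idna` codec. A minimal sketch of encoding a hostname before it is handed to HTTPConnection, written for Python 3 so that it can be run today; the hostname `bücher.example` and the helper name `idna_host` are made up for illustration and appear nowhere in httplib:

```python
def idna_host(hostname):
    """Return an ASCII-only form of a unicode hostname using the idna codec."""
    # The codec converts each non-ASCII label to its "xn--" form and
    # leaves labels that are already ASCII untouched.
    return hostname.encode('idna').decode('ascii')


print(idna_host(u'bücher.example'))   # non-ASCII label becomes an xn-- label
print(idna_host(u'bugs.python.org'))  # already ASCII, comes back unchanged
```

This only covers the host part; the path still needs the separate IRI handling discussed in this thread.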

--




[issue22231] httplib: unicode url will cause an ascii codec error when combined with a utf-8 string header

2015-01-02 Thread Demian Brecht

Demian Brecht added the comment:

A few notes:

1. Unicode hosts are not automatically IDNA-encoded (which they /could/ be 
rather than relying on the programmer to be aware of this), but this really has 
no bearing on this specific issue
2. Unicode paths are not automatically IRI-encoded (see 
https://tools.ietf.org/html/rfc3987#section-3), which should also likely be 
automatically handled when unicode objects are encountered as the path
3. When a single unicode element is contained within a list, string_join will 
defer to PyUnicode_Join.

The problem here is that your pre-joined request elements look like this:

[u'POST http://bugs.python.org/any_url HTTP/1.1', 'Host: bugs.python.org',
 'Accept-Encoding: identity', 'Content-Length: 44',
 'notes: \xe5\x91\xb5\xe5\x91\xb5',
 'Content-type: application/x-www-form-urlencoded', 'Accept: text/plain',
 '', '']

Because there's a unicode object contained in the list at index 0, the entire 
list is converted to unicode, which results in the error when \xe5 is 
encountered by the ascii decoder.
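
The implicit conversion described in this paragraph can be imitated in Python 3, where mixing the two types is normally a TypeError instead. The sketch below is only an emulation of the 2.7 behaviour, not httplib code: `join_like_py2` is a hypothetical helper, and the request list is shortened from the one quoted in this message.

```python
def join_like_py2(parts, sep=b'\r\n'):
    """Emulate Python 2's str.join on a list mixing byte and unicode strings."""
    if not any(isinstance(p, str) for p in parts):
        # All byte strings: a plain byte join, no decoding involved.
        return sep.join(parts)
    # One unicode element is enough: every byte string is decoded as ASCII.
    decoded = [p if isinstance(p, str) else p.decode('ascii') for p in parts]
    return sep.decode('ascii').join(decoded)


buffer = [
    u'POST http://bugs.python.org/any_url HTTP/1.1',  # unicode request line
    b'Host: bugs.python.org',
    b'notes: \xe5\x91\xb5\xe5\x91\xb5',               # utf-8 encoded header
]

try:
    join_like_py2(buffer)
except UnicodeDecodeError as exc:
    print('join failed:', exc)
```

With a byte-string request line in place of the unicode one, the same list joins without any decoding step, which matches the "should be ok" script in the original report.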

The proposed solution won't work as unicode characters are legal (see RFC 3987) 
and will fail should anything outside of the ascii character set be present.

I think that the correct way to solve this issue is to automatically encode 
unicode paths (or IRIs) using urllib.quote, passing the reserved characters 
defined in RFC 3987 as the safe parameter:

>>> urllib.quote(u'/foo/呵/bar'.encode('utf-8'), ':/?#[]@!$\'()*+,;=')
'/foo/%E5%91%B5/bar'
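
For readers on Python 3, where `urllib.quote` moved to `urllib.parse.quote`, a rough equivalent of the call above might look like the following. The helper name `quote_iri_path` and the constant name are made up for illustration, and the safe set is simply copied from the example in this message; as noted elsewhere in the thread, percent-encoding is only one step of full IRI handling.

```python
from urllib.parse import quote

# Reserved characters passed as the "safe" parameter, copied from the
# urllib.quote call shown in this message.
RESERVED_CHARS = ":/?#[]@!$'()*+,;="


def quote_iri_path(path):
    """Percent-encode the non-ASCII characters of a path as utf-8 bytes."""
    # quote() encodes str input as utf-8 by default before escaping it.
    return quote(path, safe=RESERVED_CHARS)


print(quote_iri_path('/foo/呵/bar'))  # /foo/%E5%91%B5/bar
```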

--




[issue22231] httplib: unicode url will cause an ascii codec error when combined with a utf-8 string header

2014-12-24 Thread Éric Araujo

Changes by Éric Araujo mer...@netwok.org:


--
nosy: +orsenthil




[issue22231] httplib: unicode url will cause an ascii codec error when combined with a utf-8 string header

2014-11-23 Thread Bob Chen

Bob Chen added the comment:

Could someone pick this up? It has been a long time...

--




[issue22231] httplib: unicode url will cause an ascii codec error when combined with a utf-8 string header

2014-11-23 Thread Bob Chen

Changes by Bob Chen 175818...@qq.com:


--
type: crash -> behavior




[issue22231] httplib: unicode url will cause an ascii codec error when combined with a utf-8 string header

2014-09-16 Thread Bob Chen

Bob Chen added the comment:

up...

--




[issue22231] httplib: unicode url will cause an ascii codec error when combined with a utf-8 string header

2014-08-28 Thread Bob Chen

Changes by Bob Chen 175818...@qq.com:


--
keywords: +patch
Added file: http://bugs.python.org/file36492/httplib.py.patch




[issue22231] httplib: unicode url will cause an ascii codec error when combined with a utf-8 string header

2014-08-28 Thread Bob Chen

Bob Chen added the comment:

This patch ensures that the URL is not unicode, so the 'join' does not raise an 
error when a utf-8 string follows it.

--




[issue22231] httplib: unicode url will cause an ascii codec error when combined with a utf-8 string header

2014-08-28 Thread STINNER Victor

Changes by STINNER Victor victor.stin...@gmail.com:


--
nosy: +haypo




[issue22231] httplib: unicode url will cause an ascii codec error when combined with a utf-8 string header

2014-08-23 Thread Bob Chen

Bob Chen added the comment:

I personally suggest that httplib convert the URL and other elements to str at 
the beginning of the class init.

--




[issue22231] httplib: unicode url will cause an ascii codec error when combined with a utf-8 string header

2014-08-20 Thread Demian Brecht

Changes by Demian Brecht demianbre...@gmail.com:


--
nosy: +demian.brecht




[issue22231] httplib: unicode url will cause an ascii codec error when combined with a utf-8 string header

2014-08-19 Thread Bob Chen

New submission from Bob Chen:

Try running the two scripts below, and you will understand what I'm talking 
about.

If you specify a URL that happens to be a unicode string (which is quite common 
in Python, because Python processes strings as unicode and you may well get one 
from somewhere else), and your headers contain a utf-8 string encoded from a 
foreign language, like u'呵呵', then the codec error occurs.

  File "/usr/lib/python2.7/httplib.py", line 808, in _send_output
    msg = "\r\n".join(self._buffer)


# -*- encoding: utf-8 -*-
# should fail
import httplib, urllib
params = urllib.urlencode({'@number': 12524, '@type': 'issue',
                           '@action': 'show'})
headers = {"Content-type": "application/x-www-form-urlencoded",
           "Accept": "text/plain", 'notes': u'呵呵'.encode('utf-8')}
conn = httplib.HTTPConnection(u"bugs.python.org")
conn.request("POST", u"http://bugs.python.org/any_url", params, headers)
response = conn.getresponse()
print response.status, response.reason



# -*- encoding: utf-8 -*-
# should be ok
import httplib, urllib
params = urllib.urlencode({'@number': 12524, '@type': 'issue',
                           '@action': 'show'})
headers = {"Content-type": "application/x-www-form-urlencoded",
           "Accept": "text/plain", 'notes': u'呵呵'.encode('utf-8')}
conn = httplib.HTTPConnection(u"bugs.python.org")
conn.request("POST", "http://bugs.python.org/any_url", params, headers)
response = conn.getresponse()
print response.status, response.reason

--
components: Library (Lib)
messages: 225553
nosy: Bob.Chen
priority: normal
severity: normal
status: open
title: httplib: unicode url will cause an ascii codec error when combined with 
a utf-8 string header
type: crash
versions: Python 2.7
