[issue37093] http.client aborts header parsing upon encountering non-ASCII header names

2022-01-21 Thread Dong-hee Na


Change by Dong-hee Na:


--
nosy: +corona10

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com




2019-12-23 Thread Tim Burke


Tim Burke added the comment:

Note that because http.server uses http.client to parse headers [0], this can 
pose a request-smuggling vector depending on how you've designed your system. 
For example, you might have a storage system with a user-facing HTTP server 
that is in charge of

* authenticating and authorizing users,
* determining where data should be stored, and
* proxying the user request to the backend

and a separate (unauthenticated) HTTP server for actually storing that data. If 
the proxy and backend are running different versions of CPython (say, because 
you're trying to upgrade an existing py2 cluster to run on py3), they may 
disagree about where the request begins and ends -- potentially causing the 
backend to process multiple requests, only the first of which was authorized.
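To make the disagreement concrete, here is a toy sketch. Neither function below is CPython's real parser; `lenient_parse`, `strict_parse`, and the header values are made up for illustration, assuming one hop accepts any header name while the other stops at the first non-ASCII name.

```python
# Toy illustration only: neither function is CPython's actual parser.
import re

# Hypothetical "strict" rule: a header name is printable ASCII without ':'.
ASCII_HEADER = re.compile(r'^[\x21-\x39\x3b-\x7e]+:')

RAW_HEADERS = [
    'Host: backend.example',
    's\xc3\xb6me-name: x',   # non-ASCII header name (UTF-8 bytes read as latin-1)
    'Content-Length: 5',
]


def lenient_parse(lines):
    """Accept every 'name: value' line, whatever the name contains."""
    return dict(line.split(': ', 1) for line in lines)


def strict_parse(lines):
    """Stop at the first line whose name is not plain ASCII; drop the rest."""
    parsed = {}
    for line in lines:
        if not ASCII_HEADER.match(line):
            break
        name, value = line.split(': ', 1)
        parsed[name] = value
    return parsed


lenient = lenient_parse(RAW_HEADERS)
strict = strict_parse(RAW_HEADERS)

# The two hops disagree about the body length, and therefore about
# where the next request on the connection starts.
print(lenient.get('Content-Length'))  # 5
print(strict.get('Content-Length'))   # None
```

A hop that never sees Content-Length has to guess where the body ends, which is where the smuggling risk comes from.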

See, for example, https://bugs.launchpad.net/swift/+bug/1840507

For what it's worth, most http server libraries (that I tested; take it with a 
grain of salt) seem to implement their own header parsing. Eventlet was a 
notable exception [1].

[0] https://github.com/python/cpython/blob/v3.8.0/Lib/http/server.py#L336-L337
[1] https://github.com/eventlet/eventlet/pull/574

--





2019-06-03 Thread Tim Burke


Change by Tim Burke:


--
keywords: +patch
pull_requests: +13672
stage: test needed -> patch review
pull_request: https://github.com/python/cpython/pull/13788





2019-05-30 Thread SilentGhost


Change by SilentGhost:


--
components: +Library (Lib)
nosy: +barry, maxking, r.david.murray
stage:  -> test needed
type:  -> behavior
versions:  -Python 3.5, Python 3.6





2019-05-29 Thread Tim Burke

New submission from Tim Burke:

First, spin up a fairly trivial http server:

import wsgiref.simple_server

def app(environ, start_response):
    start_response('200 OK', [
        ('Some-Canonical', 'headers'),
        ('sOme-CRAzY', 'hEaDERs'),
        ('Utf-8-Values', '\xe2\x9c\x94'),
        ('s\xc3\xb6me-UT\xc6\x92-8', 'in the header name'),
        ('some-other', 'random headers'),
    ])
    return [b'Hello, world!\n']

if __name__ == '__main__':
    httpd = wsgiref.simple_server.make_server('', 8000, app)
    while True:
        httpd.handle_request()

Note that this code works equally well on py2 or py3; the interesting bytes on 
the wire are the same on either.
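As a quick py3-only check of that claim, the sketch below encodes the native-str header name and value one byte per code point and reads the result back as UTF-8. It assumes the WSGI server serializes header strings as ISO-8859-1; the variable names are just for illustration.

```python
# Header name and value exactly as written in the server above (native str).
name = 's\xc3\xb6me-UT\xc6\x92-8'
value = '\xe2\x9c\x94'

# On py3, every code point below U+0100 maps to a single byte in ISO-8859-1.
# Assumption: this mirrors how the WSGI server puts header strings on the wire.
wire_name = name.encode('iso-8859-1')
wire_value = value.encode('iso-8859-1')

# Reading those bytes as UTF-8 gives what curl displays.
print(wire_name.decode('utf-8'))   # söme-UTƒ-8
print(wire_value.decode('utf-8'))  # ✔
```

Going the other way (bytes decoded as ISO-8859-1) is what produces the 'â\x9c\x94' value in the py3 client output further down.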

Verify the expected response using an independent tool such as curl:

$ curl -v http://localhost:8000
*   Trying ::1...
* TCP_NODELAY set
* connect to ::1 port 8000 failed: Connection refused
*   Trying 127.0.0.1...
* TCP_NODELAY set
* Connected to localhost (127.0.0.1) port 8000 (#0)
> GET / HTTP/1.1
> Host: localhost:8000
> User-Agent: curl/7.64.0
> Accept: */*
> 
* HTTP 1.0, assume close after body
< HTTP/1.0 200 OK
< Date: Wed, 29 May 2019 23:02:37 GMT
< Server: WSGIServer/0.2 CPython/3.7.3
< Some-Canonical: headers
< sOme-CRAzY: hEaDERs
< Utf-8-Values: ✔
< söme-UTƒ-8: in the header name
< some-other: random headers
< Content-Length: 14
< 
Hello, world!
* Closing connection 0

Check that py2 includes all the same headers:

$ python2 -c 'import pprint, urllib; resp = urllib.urlopen("http://localhost:8000"); pprint.pprint((dict(resp.info().items()), resp.read()))'
({'content-length': '14',
  'date': 'Wed, 29 May 2019 23:03:02 GMT',
  'server': 'WSGIServer/0.2 CPython/3.7.3',
  'some-canonical': 'headers',
  'some-crazy': 'hEaDERs',
  'some-other': 'random headers',
  's\xc3\xb6me-ut\xc6\x92-8': 'in the header name',
  'utf-8-values': '\xe2\x9c\x94'},
 'Hello, world!\n')

But py3 *does not*:

$ python3 -c 'import pprint, urllib.request; resp = urllib.request.urlopen("http://localhost:8000"); pprint.pprint((dict(resp.info().items()), resp.read()))'
({'Date': 'Wed, 29 May 2019 23:04:09 GMT',
  'Server': 'WSGIServer/0.2 CPython/3.7.3',
  'Some-Canonical': 'headers',
  'Utf-8-Values': 'â\x9c\x94',
  'sOme-CRAzY': 'hEaDERs'},
 b'Hello, world!\n')

Instead, it is missing the first header that has a non-ASCII name as well as 
all subsequent headers (even if they are all-ASCII). Interestingly, the 
response body is intact.

This eventually traces back to email.feedparser's expectation that all 
headers conform to RFC 822, and its assumption that anything that *doesn't* 
conform must be part of the body: 
https://github.com/python/cpython/blob/v3.7.3/Lib/email/feedparser.py#L228-L236
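A small sketch of that check, run against the headers from the server above. The pattern is my transcription of the header regex in the linked feedparser.py, so treat it as an assumption and compare it with the email.feedparser shipped with your interpreter; `header_re`, `lines`, and `results` are just local names.

```python
import re

# Assumption: transcribed from the linked feedparser.py (v3.7.3).
# A header line starts with "From ", an all-ASCII name plus ':', or
# whitespace (a continuation line).
header_re = re.compile(r'^(From |[\041-\071\073-\176]*:|[\t ])')

# Header lines as the parser sees them, i.e. already decoded as ISO-8859-1.
lines = [
    'sOme-CRAzY: hEaDERs',
    'Utf-8-Values: \xe2\x9c\x94',                      # non-ASCII value only
    's\xc3\xb6me-UT\xc6\x92-8: in the header name',   # non-ASCII name
    'some-other: random headers',
]

results = [bool(header_re.match(line)) for line in lines]
for matched, line in zip(results, lines):
    print(matched, repr(line))
```

Only the third line fails to match. The last line would match on its own, but because the parser stops treating lines as headers at the first non-match, it is lost along with everything after it.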

However, http.client has *already* determined the boundary between headers and 
body in parse_headers, and sent everything that it thinks is headers to the 
parser: 
https://github.com/python/cpython/blob/v3.7.3/Lib/http/client.py#L193-L214
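The same thing can be poked at without a server by handing the header block straight to http.client.parse_headers. This is only a sketch: the bytes in `raw` are hand-written to mirror the response above, and what gets printed depends on whether the interpreter you run it on is affected.

```python
import http.client
import io

# Hand-written header block mirroring the server response above,
# terminated by the blank line that ends the headers.
raw = (
    b'Some-Canonical: headers\r\n'
    b'sOme-CRAzY: hEaDERs\r\n'
    b'Utf-8-Values: \xe2\x9c\x94\r\n'
    b's\xc3\xb6me-UT\xc6\x92-8: in the header name\r\n'
    b'some-other: random headers\r\n'
    b'Content-Length: 14\r\n'
    b'\r\n'
)

msg = http.client.parse_headers(io.BytesIO(raw))

# Header names the parser recognized.
print(msg.keys())
# Anything the email parser decided was "body" rather than headers.
print(repr(msg.get_payload()))
```

On an affected interpreter I would expect the key list to stop before the non-ASCII name and the remaining header lines to show up in the payload.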

--
messages: 343942
nosy: tburke
priority: normal
severity: normal
status: open
title: http.client aborts header parsing upon encountering non-ASCII header names
versions: Python 3.5, Python 3.6, Python 3.7, Python 3.8, Python 3.9
