On 06/02/2022 15.08, Victor Stinner wrote:
Hi,
I propose to deprecate the urllib module in Python 3.11. It would emit
a DeprecationWarning to warn users, so that they consider better
alternatives like urllib3 or httpx: well-known modules, better
maintained, more secure, support HTTP/2 (httpx), etc.
I don't propose to schedule its removal. Let's discuss the removal in
1 or 2 years.
--
urllib has many abstractions to support a wide range of protocols with
"handlers": HTTP, HTTPS, FTP, "local file", proxy, HTTP
authentication, HTTP Cookie, etc. A simple HTTP request using Basic
Authentication requires 10-20 lines of code, whereas it should be a
single line.
Users (me included) don't like the urllib API, which is too complicated
for common tasks.
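
To illustrate the point about Basic Authentication, here is a minimal
sketch of the handler setup that urllib.request needs before a single
authenticated request can be sent. The helper name
build_basic_auth_opener, the example.invalid URL, and the credentials are
placeholders for illustration only; they are not part of the original
message.

```python
import urllib.request


def build_basic_auth_opener(url, username, password):
    """Return an opener that answers HTTP Basic Authentication challenges.

    Hypothetical helper for illustration; the name is not a urllib API.
    """
    # The password manager maps (realm, URL prefix) to credentials.
    password_mgr = urllib.request.HTTPPasswordMgrWithDefaultRealm()
    # realm=None: use these credentials for any realm under this URL.
    password_mgr.add_password(None, url, username, password)
    # The handler reacts to "401 Unauthorized" by retrying with credentials.
    auth_handler = urllib.request.HTTPBasicAuthHandler(password_mgr)
    # build_opener() chains the handler with the default handlers.
    return urllib.request.build_opener(auth_handler)


# Sending the request would then look like this (not executed here,
# since example.invalid is a placeholder host):
#
#     opener = build_basic_auth_opener(url, "user", "secret")
#     with opener.open(url) as response:
#         body = response.read()
```

Even this stripped-down version needs a password manager, a handler and
an opener, which is the layering of abstractions described above.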
--
[...]
urllib is a package made of 4 parts:
* urllib.request for opening and reading URLs
* urllib.error containing the exceptions raised by urllib.request
* urllib.parse for parsing URLs
* urllib.robotparser for parsing robots.txt files
I propose to deprecate all of them. Maybe the deprecation can be
different for each sub-module?
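
For readers less familiar with the package layout, the following sketch
touches three of the four sub-modules without any network access. The
helper names split_url and allowed_by_robots, the example.invalid host,
and the "demo-bot" user agent are assumptions made up for this example.

```python
import urllib.error
import urllib.parse
import urllib.robotparser


def split_url(url):
    """Split a URL with urllib.parse and decode its query string."""
    parts = urllib.parse.urlsplit(url)
    return parts.scheme, parts.netloc, parts.path, urllib.parse.parse_qs(parts.query)


def allowed_by_robots(robots_lines, user_agent, url):
    """Check a URL against robots.txt rules given as a list of lines."""
    parser = urllib.robotparser.RobotFileParser()
    # parse() takes the lines directly, so no HTTP request is made.
    parser.parse(robots_lines)
    return parser.can_fetch(user_agent, url)


# urllib.error only defines exceptions; HTTPError is a subclass of
# URLError, so catching URLError also covers HTTP status errors.
HTTP_ERROR_IS_URL_ERROR = issubclass(urllib.error.HTTPError, urllib.error.URLError)
```

Only urllib.request is missing here, because it is the part that
actually opens connections.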
Thanks for bringing this topic forward, Victor!
Disclaimer: I proposed the removal of urllib today in Python core's
internal chat.
The urllib package -- and to some degree also the http package -- are
a constant source of security bugs. The code is old and the parsers for
HTTP and URLs don't handle edge cases well. Python core lacks a true
maintainer of the code. To be honest, we have to admit defeat and be
upfront that urllib is not up to the task for this decade. It was designed
and written during a more friendly, less scary time on the internet.
If I had the power and time, then I would replace urllib with a simpler,
reduced HTTP client that uses the platform's HTTP library under the hood
(WinHTTP on Windows, NSURLSession (?) on macOS, Web API for Emscripten,
maybe curl on Linux/BSD). For non-trivial HTTP requests, httpx or
aiohttp are much better suited than urllib.
The second-best option is to reduce the feature set of urllib to core
HTTP (no FTP, proxy, HTTP auth) and do a partial rewrite with stricter,
more standards-conformant parsers for URLs and query strings, and RFC 2822
instead of RFC 822 for headers.
Christian
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/WYVETVHMGRS4CI47GTFY6W7B43YLSJH2/