Strong -1 from me.

urllib.request may not be "best practice", but it's still extremely
useful for simple situations, and urllib.parse is useful for basic
handling of URLs. Yes, the more complex aspects of urllib are better
handled by external packages, but that's not a sufficient argument for
removing the package altogether. There are many situations where
external dependencies are unsuitable. Also, there's quite a lot of
usage of urllib in the stdlib itself - how would you propose to
replace that?
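
To make "simple situations" concrete, here is a rough sketch of the kind
of code I have in mind. The helper names (add_query_params, fetch_text)
and the example.org URLs are made up for illustration; only the
urllib.parse and urllib.request calls themselves are real stdlib APIs.

```python
from urllib.parse import parse_qs, urlencode, urljoin, urlsplit, urlunsplit
from urllib.request import urlopen


def add_query_params(url, **params):
    """Return `url` with `params` merged into its query string (hypothetical helper)."""
    parts = urlsplit(url)
    query = parse_qs(parts.query)  # maps each key to a list of values
    query.update({key: [str(value)] for key, value in params.items()})
    return urlunsplit(parts._replace(query=urlencode(query, doseq=True)))


def fetch_text(url, timeout=10):
    """Fetch `url` and decode the body as text (hypothetical helper, no retries)."""
    with urlopen(url, timeout=timeout) as resp:
        # Fall back to UTF-8 when the response does not declare a charset.
        charset = resp.headers.get_content_charset() or "utf-8"
        return resp.read().decode(charset)


print(add_query_params("https://example.org/search?q=urllib", page=2))
# https://example.org/search?q=urllib&page=2
print(urljoin("https://example.org/docs/index.html", "../about.html"))
# https://example.org/about.html
```

None of this needs anything beyond the stdlib, which is exactly what
matters when an external dependency isn't an option.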

In addition, pip relies pretty heavily on urllib (parse and request),
and pip has a bootstrapping issue, so using 3rd party libraries is
non-trivial. Also, of pip's existing vendored dependencies,
webencodings, urllib3, requests, pkg_resources, packaging, html5lib,
distlib and cachecontrol all import urllib. So this would be *hugely*
disruptive to the whole packaging ecosystem (which is under-resourced
at the best of times, so this would put a lot of strain on us).

In any case, why is this being proposed as a simple posting on
python-dev? There's already PEP 594 for removals from the stdlib. If
you have a case for removing urllib, I suggest you get it added to PEP
594, so it can be discussed and agreed properly, along with the other
removals (none of which is remotely as controversial as urllib, so
there's absolutely no doubt in my mind that this would need a PEP
however it was proposed).

Paul

On Sun, 6 Feb 2022 at 14:15, Victor Stinner <vstin...@python.org> wrote:
>
> Hi,
>
> I propose to deprecate the urllib module in Python 3.11. It would emit
> a DeprecationWarning to warn users, so that they consider better
> alternatives like urllib3 or httpx: well-known modules that are better
> maintained, more secure, support HTTP/2 (httpx), etc.
>
> I don't propose to schedule its removal. Let's discuss the removal in
> 1 or 2 years.
>
> --
>
> urllib has many abstractions to support a wide range of protocols with
> "handlers": HTTP, HTTPS, FTP, "local file", proxy, HTTP
> authentication, HTTP Cookie, etc. A simple HTTP request using Basic
> Authentication requires 10-20 lines of code, whereas it should be a
> single line.
>
> Users (me included) don't like the urllib API, which is too complicated
> for common tasks.
>
> --
>
> Unhappy users created multiple better alternatives to the stdlib
> urllib module.
>
> In 2008, the "urllib3" module was created to provide an API designed
> to be as simple as possible for the most common HTTP and HTTPS
> requests. Example:
>
>     req = http.request('GET', 'http://httpbin.org/robots.txt')
>
> In 2011, the "requests" module based on urllib3 was created.
>
> In 2013, the "aiohttp" module based on asyncio was created.
>
> In 2015, the new "httpx" module was created:
>
>     req = httpx.get('https://www.example.org/')
>
> Not only does httpx have a regular "synchronous" API (blocking function
> calls), but it also has an asynchronous API!
>
> Sadly, while HTTP/3 is being developed, it seems like in this list,
> httpx is the only HTTP client library supporting HTTP/2 currently :-(
>
> For HTTP/2, I also found the "httplib2" module.
>
> For HTTP/3, I found the "http3" and "aioquic" modules.
>
> --
>
> Let's come back to urllib:
>
> * Its API is too complicated
> * It supports neither HTTP/2 nor HTTP/3
> * It's barely maintained: there are 121 open issues, including 3
>   security issues!
>
> The 3 open security issues:
>
> * bpo-33661, opened in 2018;
> * bpo-36338, opened in 2019;
> * bpo-45795, opened in 2021.
>
> Usually, it's bad when you refer to an open security issue by its
> creation year :-(
>
> The urllib module has a long history of security vulnerabilities. List
> of *fixed* vulnerabilities:
>
> * 2011 (bpo-11662):
> https://python-security.readthedocs.io/vuln/urllib-redirect.html
> * 2017 (bpo-30119):
> https://python-security.readthedocs.io/vuln/urllib-ftp-stream-injection.html
> * 2017 (bpo-30500):
> https://python-security.readthedocs.io/vuln/urllib-connects-wrong-host.html
> * 2019 (bpo-35907):
> https://python-security.readthedocs.io/vuln/urllib-local-file-scheme.html
> * 2019 (bpo-38826):
> https://python-security.readthedocs.io/vuln/urllib-basic-auth-regex.html
> * 2021 (bpo-42967):
> https://python-security.readthedocs.io/vuln/urllib-query-string-semicolon-separator.html
> * 2021 (bpo-43075):
> https://python-security.readthedocs.io/vuln/urllib-basic-auth-regex2.html
> * 2021 (bpo-44022):
> https://python-security.readthedocs.io/vuln/urllib-100-continue-loop.html
>
> urllib is a package made of 4 parts:
>
> * urllib.request for opening and reading URLs
> * urllib.error containing the exceptions raised by urllib.request
> * urllib.parse for parsing URLs
> * urllib.robotparser for parsing robots.txt files
>
> I propose to deprecate all of them. Maybe the deprecation can be
> different for each sub-module?
>
> Victor
> --
> Night gathers, and now my watch begins. It shall not end until my death.
> _______________________________________________
> Python-Dev mailing list -- python-dev@python.org
> To unsubscribe send an email to python-dev-le...@python.org
> https://mail.python.org/mailman3/lists/python-dev.python.org/
> Message archived at 
> https://mail.python.org/archives/list/python-dev@python.org/message/EZ6O7MOPZ4GA75MKTDO7LAELKXUHK2QS/
> Code of Conduct: http://python.org/psf/codeofconduct/