New submission from Tim Burke <tim.bu...@gmail.com>:

Not sure if this is a documentation or behavior bug, but... the docs for urllib.request.Request.set_proxy (https://docs.python.org/3/library/urllib.request.html#urllib.request.Request.set_proxy) say

> Prepare the request by connecting to a proxy server. *The host and type will
> replace those of the instance*, and the instance’s selector will be the
> original URL given in the constructor.

(Emphasis mine.) In practice, behavior is more nuanced than that:

>>> from urllib.request import Request
>>> req = Request('http://hostame:port/some/path')
>>> req.host, req.type
('hostame:port', 'http')
>>> req.set_proxy('proxy:other-port', 'https')
>>> req.host, req.type  # So far, so good...
('proxy:other-port', 'https')
>>>
>>> req = Request('https://hostame:port/some/path')
>>> req.host, req.type
('hostame:port', 'https')
>>> req.set_proxy('proxy:other-port', 'http')
>>> req.host, req.type  # Type doesn't change!
('proxy:other-port', 'https')

Looking at the source (https://github.com/python/cpython/blob/v3.7.0/Lib/urllib/request.py#L397) it's obvious why https is treated specially.

The behavior is consistent with how things worked on py2...

>>> from urllib2 import Request
>>> req = Request('http://hostame:port/some/path')
>>> req.get_host(), req.get_type()
('hostame:port', 'http')
>>> req.set_proxy('proxy:other-port', 'https')
>>> req.get_host(), req.get_type()
('proxy:other-port', 'https')
>>>
>>> req = Request('https://hostame:port/some/path')
>>> req.get_host(), req.get_type()
('hostame:port', 'https')
>>> req.set_proxy('proxy:other-port', 'http')
>>> req.get_host(), req.get_type()
('proxy:other-port', 'https')

... but only if you're actually inspecting host/type along the way!

>>> from urllib2 import Request
>>> req = Request('https://hostame:port/some/path')
>>> req.set_proxy('proxy:other-port', 'http')
>>> req.get_host(), req.get_type()
('proxy:other-port', 'http')

(FWIW, this came up while porting an application from py2 to py3; there was a unit test expecting that last behavior of proxying a https connection through a http proxy.)
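For anyone who wants to reproduce the py3 behavior outside a REPL, here is a minimal sketch. The helper name proxy_state and the example hosts are made up for illustration. It also peeks at the private _tunnel_host attribute, which looks like the place where the linked source stashes the original host for https requests; that is a CPython implementation detail rather than documented API, so treat it as an assumption that may not hold across versions.

```python
from urllib.request import Request


def proxy_state(url, proxy_host, proxy_type):
    """Build a Request, call set_proxy, and report what changed.

    Hypothetical helper for illustration only. Returns a tuple of
    (host, type, tunnel_host) as seen after set_proxy.
    """
    req = Request(url)
    req.set_proxy(proxy_host, proxy_type)
    # _tunnel_host is a private attribute (assumption: present in CPython);
    # getattr keeps the sketch from failing if it is ever renamed.
    tunnel_host = getattr(req, "_tunnel_host", None)
    return req.host, req.type, tunnel_host


if __name__ == "__main__":
    cases = [
        # http origin, https proxy type: both host and type are replaced
        ("http://example.com:8080/some/path", "proxy.example:3128", "https"),
        # https origin, http proxy type: host is replaced, type is kept
        ("https://example.com:8443/some/path", "proxy.example:3128", "http"),
    ]
    for url, proxy_host, proxy_type in cases:
        print(url, "->", proxy_state(url, proxy_host, proxy_type))
```

If the second case reports a type of 'https' together with a non-empty tunnel host, that matches the report above: for an https request the proxy type argument is not applied, and the original host is kept for tunnelling instead.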
----------
components: Library (Lib)
messages: 325449
nosy: tburke
priority: normal
severity: normal
status: open
title: urllib.request.Request.set_proxy doesn't (necessarily) replace type

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue34698>
_______________________________________