Cameron Simpson <c...@zip.com.au> added the comment:

Well, I've established a few things:
  - I've mischaracterised this issue
  - httplib's _set_tunnel() is really meant to be called from
    urllib2, because using it directly with httplib is totally
    counterintuitive
  - a bare urllib2 setup fails with its own bug

To the first item: _tunnel() feels really fragile with that recursion issue,
though it doesn't recurse when called from urllib2.

For the second, here's my test script using httplib:

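  import httplib
  import os

  # Connect to the proxy (localhost:3128), not the target host.
  H = httplib.HTTPSConnection("localhost", 3128)
  print H
  # The real target goes in via _set_tunnel().
  H._set_tunnel("localhost", 443)
  H.request("GET", "/boguspath")
  # Show this process's open IPv4 sockets.
  os.system("lsof -p %d | grep IPv4" % (os.getpid(),))
  R = H.getresponse()
  print R.status, R.reason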

As you can see, one builds the HTTPSConnection object with the proxy's details
instead of those of the target URL, and then puts the target URL details in with
_set_tunnel(). Am I alone in finding this strange?

For the third, my test code is this:

  U = urllib2.Request('https://localhost/boguspath')
  U.set_proxy('localhost:3128', 'https')
  f = urllib2.urlopen(R)
  print f.read()

which fails like this:

  Traceback (most recent call last):
    File "thttp.py", line 15, in <module>
      f = urllib2.urlopen(R)
    File "/opt/python-2.6.4/lib/python2.6/urllib2.py", line 131, in urlopen
      return _opener.open(url, data, timeout)
    File "/opt/python-2.6.4/lib/python2.6/urllib2.py", line 395, in open
      protocol = req.get_type()
  AttributeError: HTTPResponse instance has no attribute 'get_type'

The line numbers are slightly off because I've got some debugging statements in 
there.

Finally, I flat out do not understand urllib2's set_proxy() method:
  
    def set_proxy(self, host, type):
        if self.type == 'https' and not self._tunnel_host:
            self._tunnel_host = self.host
        else:
            self.type = type
            self.__r_host = self.__original
        self.host = host

When my code calls set_proxy, self.type is None. Now, I had naively expected 
the first branch to be the only branch. Could someone explain what's happening 
here, and what is meant to happen?
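
To make the two branches concrete, here is a minimal stand-in, not urllib2
itself. StubRequest is a hypothetical class that carries only the attributes
set_proxy() touches; the method body is copied from the snippet above. The
preset host value and the optional type argument are assumptions made purely
so that both branches can be exercised side by side.

```python
class StubRequest(object):
    """Hypothetical stand-in for urllib2.Request (illustration only)."""

    def __init__(self, url, type=None):
        self.__original = url
        self.type = type          # None models "URL scheme not parsed yet"
        self.host = "localhost"   # assumption: target host already filled in
        self._tunnel_host = None

    def set_proxy(self, host, type):
        # Body copied from the set_proxy() quoted above.
        if self.type == 'https' and not self._tunnel_host:
            self._tunnel_host = self.host
        else:
            self.type = type
            self.__r_host = self.__original
        self.host = host


# Case 1: type is still None, as observed in my test; the else branch runs.
unparsed = StubRequest('https://localhost/boguspath')
unparsed.set_proxy('localhost:3128', 'https')

# Case 2: type is already 'https'; the first branch runs.
parsed = StubRequest('https://localhost/boguspath', type='https')
parsed.set_proxy('localhost:3128', 'https')

print(unparsed._tunnel_host)
print(parsed._tunnel_host)
```

In this sketch only the second case records a tunnel host. If the real
Request behaves the same way, then whether a tunnel gets set up depends on
whether the type has been filled in before set_proxy() is called.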

I'm thinking that this bug may turn into a doc fix instead of a behaviour fix, 
but I'm finding it surprisingly hard to know how urllib2 is supposed to be used.

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue7776>
_______________________________________