[issue12576] urlib.request fails to open some sites

2011-07-27 Thread Barry A. Warsaw

Barry A. Warsaw ba...@python.org added the comment:

Re-opening, as I think this needs to still be applied to Python 2.7.

--
status: closed -> open

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue12576
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue12576] urlib.request fails to open some sites

2011-07-27 Thread Barry A. Warsaw

Barry A. Warsaw ba...@python.org added the comment:

Never mind.  This changeset got applied to 2.7 (thanks!) but didn't get linked 
in the tracker.


changeset:   71523:b66bbbdc7abd
branch:      2.7
parent:      71518:73ae3729b8fe
user:        Senthil Kumaran sent...@uthcode.com
date:        Wed Jul 27 09:37:17 2011 +0800
summary:     merge from 3.2 - fix urlopen behavior on sites which do not send 
(or obsfuscates) Connection: Close header.

--
status: open -> closed




[issue12576] urlib.request fails to open some sites

2011-07-26 Thread Roundup Robot

Roundup Robot devn...@psf.upfronthosting.co.za added the comment:

New changeset 9eac48fbe21d by Senthil Kumaran in branch '3.2':
Fix closes Issue12576 - fix urlopen behavior on sites which do not send (or 
obsfuscates) Connection: Close header.
http://hg.python.org/cpython/rev/9eac48fbe21d

New changeset a45c8ce67c7d by Senthil Kumaran in branch 'default':
merge from 3.2 - Fix closes Issue12576 - fix urlopen behavior on sites which do 
not send (or obsfuscates) Connection: Close header.
http://hg.python.org/cpython/rev/a45c8ce67c7d

--
nosy: +python-dev
resolution:  -> fixed
stage:  -> committed/rejected
status: open -> closed




[issue12576] urlib.request fails to open some sites

2011-07-26 Thread Roundup Robot

Roundup Robot devn...@psf.upfronthosting.co.za added the comment:

New changeset d58b43fb9208 by Senthil Kumaran in branch '3.2':
Correcting issue 12576 fix, which resulted in buildbot failures.
http://hg.python.org/cpython/rev/d58b43fb9208

New changeset dcfce522723d by Senthil Kumaran in branch 'default':
merge from 3.2 - Correcting issue 12576 fix, which resulted in buildbot 
failures.
http://hg.python.org/cpython/rev/dcfce522723d

--




[issue12576] urlib.request fails to open some sites

2011-07-24 Thread Senthil Kumaran

Senthil Kumaran sent...@uthcode.com added the comment:

I propose the attached patch as a fix for this issue. All it does is move the 
code that gets the HTTP response into the finally block of the HTTP request. It 
closes the socket if getting the response fails for some reason; otherwise 
it proceeds normally.

Please provide your critique, if any; otherwise, I shall go ahead and check 
this in.
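
For readers without the attachment, the intended behaviour (close only when getting the response fails) could be sketched roughly as below. This is not the patch itself: FakeConnection and this simplified do_open() are hypothetical stand-ins for HTTPConnection and AbstractHTTPHandler.do_open(), and OSError stands in for the socket error.

```python
class FakeConnection:
    """Hypothetical stand-in for http.client.HTTPConnection."""

    def __init__(self, fail):
        self.fail = fail
        self.closed = False

    def request(self):
        pass  # pretend the request was sent

    def getresponse(self):
        if self.fail:
            raise OSError("simulated socket error")
        return "response"

    def close(self):
        self.closed = True


def do_open(h):
    """Simplified flow: close the connection only on failure."""
    try:
        h.request()
        r = h.getresponse()
    except OSError:
        h.close()  # release the socket, then let the caller see the error
        raise
    return r       # success: connection left open so the body can be read


good = FakeConnection(fail=False)
result = do_open(good)

bad = FakeConnection(fail=True)
try:
    do_open(bad)
    failed = False
except OSError:
    failed = True
```

In this sketch the successful connection is never closed by do_open(), so the caller can still read from the response.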

--
keywords: +patch
Added file: http://bugs.python.org/file22746/issue12576.patch




[issue12576] urlib.request fails to open some sites

2011-07-23 Thread angus

angus an...@amcinnes.info added the comment:

I'm experiencing a related problem:
---
from urllib.request import urlopen
print(urlopen('https://mtgox.com/').read())
---
prints b'' rather than the page content.

It looks like mtgox.com always sends 'Connection: Keep-Alive'. So some hack 
like recognising 'tion: close' wouldn't fix it.

--
nosy: +angus




[issue12576] urlib.request fails to open some sites

2011-07-23 Thread Senthil Kumaran

Changes by Senthil Kumaran sent...@uthcode.com:


--
assignee:  -> orsenthil




[issue12576] urlib.request fails to open some sites

2011-07-23 Thread Senthil Kumaran

Senthil Kumaran sent...@uthcode.com added the comment:

I am against hacks like "tion: close". In the worst case, we shall revert
the change which caused this regression in the first place.

--




[issue12576] urlib.request fails to open some sites

2011-07-23 Thread Robert Xiao

Robert Xiao nneon...@gmail.com added the comment:

S3 also doesn't send any kind of connection header at all.

x-amz-id-2: WWuo30Fk2inKVcC5dH4GOjvHxnqMa5Q2+AduPm2bMhL1h3GqzOR0EPwUv0biqv2V
x-amz-request-id: 3CCF6B6A000E6446
Date: Sat, 23 Jul 2011 06:42:45 GMT
x-amz-meta-s3fox-filesize: 27692
x-amz-meta-s3fox-modifiedtime: 121329234
Last-Modified: Thu, 12 Jun 2008 17:45:12 GMT
ETag: c4db184c97f1d6b0b6e7ee17a73e785b
Accept-Ranges: bytes
Content-Type: application/pdf
Content-Length: 27692
Server: AmazonS3

--




[issue12576] urlib.request fails to open some sites

2011-07-23 Thread Georg Brandl

Georg Brandl ge...@python.org added the comment:

Recognizing "ction: close" as "Connection: close" is exactly what those servers 
do *not* want you to do.

--
nosy: +georg.brandl




[issue12576] urlib.request fails to open some sites

2011-07-22 Thread Terry J. Reedy

Terry J. Reedy tjre...@udel.edu added the comment:

Could we look for 'tion: Closed' instead of 'Connection: Closed', to accommodate 
servers that garble the response, even if it is a hack?

--
nosy: +terry.reedy




[issue12576] urlib.request fails to open some sites

2011-07-21 Thread STINNER Victor

Changes by STINNER Victor victor.stin...@haypocalc.com:


--
nosy: +ezio.melotti




[issue12576] urlib.request fails to open some sites

2011-07-20 Thread Barry A. Warsaw

Barry A. Warsaw ba...@python.org added the comment:

I think this is also a regression in Python 2.7, as reported here:

https://bugs.launchpad.net/ubuntu/+source/python2.7/+bug/813295

--
nosy: +barry
versions: +Python 2.7




[issue12576] urlib.request fails to open some sites

2011-07-19 Thread Robert Xiao

Robert Xiao nneon...@gmail.com added the comment:

Seconded. #12133 inadvertently closes the response object if the server fails 
to indicate Connection: close. In my case, Amazon S3 (s3.amazonaws.com) 
causes this problem:

(Python 3.2)
>>> conn = urllib.request.urlopen('http://s3.amazonaws.com/SurveyMonkeyFiles/VPAT_SurveyMonkey.pdf')
>>> len(conn.read())
27692

(Python 3.2.1)
>>> conn = urllib.request.urlopen('http://s3.amazonaws.com/SurveyMonkeyFiles/VPAT_SurveyMonkey.pdf')
>>> len(conn.read())
0

The problem is that S3 doesn't send back a "Connection: close" header, so when 
h.close() is called from request.py, the response object is also closed; 
consequently, conn.fp is None and so conn.read() returns an empty bytes object.

This is a clear regression due to the patch in #12133.
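
The empty read can be reproduced offline. The sketch below is only an illustration and not the code path in request.py: FakeSocket is a hypothetical stand-in for a real socket, and the canned bytes mimic a server that sends Content-Length but no Connection header.

```python
import io
import http.client


class FakeSocket:
    """Hypothetical stand-in for a socket; serves canned bytes."""

    def __init__(self, data):
        self._data = data

    def makefile(self, *args, **kwargs):
        return io.BytesIO(self._data)


# Assumed response: HTTP/1.1, fixed length, no Connection header at all.
RAW = b"HTTP/1.1 200 OK\r\nContent-Length: 5\r\n\r\nhello"


def read_body(close_first):
    resp = http.client.HTTPResponse(FakeSocket(RAW))
    resp.begin()          # parse the status line and headers
    if close_first:
        resp.close()      # sets resp.fp to None
    return resp.read()


body_open = read_body(close_first=False)
body_closed = read_body(close_first=True)
print(body_open, body_closed)
```

Once close() has run, read() finds no file object to read from and returns an empty bytes object instead of raising.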

--
nosy: +nneonneo




[issue12576] urlib.request fails to open some sites

2011-07-18 Thread STINNER Victor

STINNER Victor victor.stin...@haypocalc.com added the comment:

h.close() (HTTPConnection.close) in the finally block of 
AbstractHTTPHandler.do_open() indirectly calls r.close() (HTTPResponse.close). 
The problem is that the content of the response cannot be read once its close() 
method has been called.

The changelog of the fix (commit ad6bdfd7dd4b) is: "Issue #12133: 
AbstractHTTPHandler.do_open() of urllib.request closes the HTTP connection if 
its getresponse() method fails with a socket error. Patch written by Ezio 
Melotti."

The HTTP connection is not only closed in case of an error, but it is always 
closed.

It's a bug because we cannot read the content of www.imdb.com, whereas it works 
without the commit. Test script:
---
import urllib.request, gc

print("python.org")
with urllib.request.urlopen("http://www.python.org/") as page:
    content = page.read()
    print("content: %s..." % content[:40])
gc.collect()

print("imdb.com")
with urllib.request.urlopen("http://www.imdb.com/") as page:
    content = page.read()
    print("content: %s..." % content[:40])
gc.collect()

print("exit")
---
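
The "always closed" control flow can also be shown without any network access. The following is a rough sketch under stated assumptions: FakeResponse, FakeConnection and do_open_always_close() are hypothetical names that only mimic the relationship between HTTPConnection, HTTPResponse and the finally block described above.

```python
class FakeResponse:
    """Hypothetical stand-in for HTTPResponse."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


class FakeConnection:
    """Hypothetical stand-in for HTTPConnection with a keep-alive response."""

    def __init__(self):
        self._response = None

    def getresponse(self):
        # will_close is False, so the connection keeps a reference.
        self._response = FakeResponse()
        return self._response

    def close(self):
        # Closing the connection also closes the response it still holds.
        if self._response is not None:
            self._response.close()


def do_open_always_close(h):
    try:
        r = h.getresponse()
    finally:
        h.close()  # runs on success as well as on error
    return r


r = do_open_always_close(FakeConnection())
print(r.closed)
```

The caller gets back a response that is already closed, even though no error occurred.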

--
nosy: +haypo




[issue12576] urlib.request fails to open some sites

2011-07-18 Thread STINNER Victor

STINNER Victor victor.stin...@haypocalc.com added the comment:

ValueError('I/O operation on closed file') error comes from 
HTTPResponse.__enter__() which is implemented in IOBase:

    def __enter__(self):  # That's a forward reference
        self._checkClosed()
        return self
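
The same check can be observed on any IOBase subclass. A minimal sketch, using io.BytesIO as a stand-in for the closed HTTPResponse (an assumption made purely for illustration):

```python
import io

stream = io.BytesIO(b"data")
stream.close()

message = None
try:
    with stream:        # __enter__ checks the closed flag before returning
        pass
except ValueError as exc:
    message = str(exc)

print(message)
```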

--




[issue12576] urlib.request fails to open some sites

2011-07-18 Thread Davide Rizzo

Changes by Davide Rizzo sor...@gmail.com:


--
nosy: +davide.rizzo




[issue12576] urlib.request fails to open some sites

2011-07-18 Thread STINNER Victor

STINNER Victor victor.stin...@haypocalc.com added the comment:

imdb.com and python.org both use HTTP/1.1. The imdb.com server sends a 
"Transfer-encoding: chunked" header whereas python.org doesn't. python.org sends 
a "Connection: close" header, whereas imdb.com doesn't.

The more relevant difference for this issue is the "Connection: close" header: 
HTTPResponse.will_close is True if the "Connection: close" header is present 
(see the _check_close() method) and False otherwise. 
HTTPConnection.getresponse() keeps a reference to the response if will_close is 
False, or calls its close() method otherwise.

The "Cneonction: close" header looks to be a quirk of Netscaler load balancers. 
Sites behind the same load balancer sometimes send "nnCoection" instead.

There are buggy web servers; Python should not raise an "I/O operation on closed 
file" error on such servers.
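
The effect of the header on will_close can be checked offline. This is a small sketch, not a test from the tracker: FakeSocket and the will_close() helper are hypothetical, and the canned responses carry a Content-Length so that the header alone decides the result.

```python
import io
import http.client


class FakeSocket:
    """Hypothetical stand-in for a socket; serves canned bytes."""

    def __init__(self, data):
        self._data = data

    def makefile(self, *args, **kwargs):
        return io.BytesIO(self._data)


def will_close(extra_header):
    """Parse a canned HTTP/1.1 response and report HTTPResponse.will_close."""
    raw = (
        "HTTP/1.1 200 OK\r\n"
        f"{extra_header}\r\n"
        "Content-Length: 2\r\n"
        "\r\n"
        "ok"
    ).encode("ascii")
    resp = http.client.HTTPResponse(FakeSocket(raw))
    resp.begin()
    return resp.will_close


proper = will_close("Connection: close")
garbled = will_close("Cneonction: close")
keep_alive = will_close("Connection: Keep-Alive")
print(proper, garbled, keep_alive)
```

Only the correctly spelled header sets will_close; the garbled header is treated like any unknown header, so the response is handled as keep-alive.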

--




[issue12576] urlib.request fails to open some sites

2011-07-18 Thread Éric Araujo

Changes by Éric Araujo mer...@netwok.org:


--
nosy: +eric.araujo




[issue12576] urlib.request fails to open some sites

2011-07-17 Thread Nadeem Vawda

Changes by Nadeem Vawda nadeem.va...@gmail.com:


--
nosy: +nadeem.vawda




[issue12576] urlib.request fails to open some sites

2011-07-17 Thread Santoso Wijaya

Changes by Santoso Wijaya santoso.wij...@gmail.com:


--
nosy: +santa4nt
versions: +Python 3.3




[issue12576] urlib.request fails to open some sites

2011-07-16 Thread Ugra Dániel

New submission from Ugra Dániel daniel.u...@gmail.com:

Issue #12133 introduced a patch which seems to cause problems.
I'm using Python 3.2.1 on 64-bit Arch Linux (this version already incorporates 
the changes from #12133).

The following code:

with urllib.request.urlopen(url) as page:
    pass

raises a "ValueError: I/O operation on closed file." exception when url is 
"http://www.imdb.com/".
When I removed h.close() (added by the patch) from request.py, everything 
worked as expected.
Other URLs work flawlessly with the patched code ("http://www.google.com/", for 
example).

Maybe it has something to do with differences in HTTP responses or in 
server-side behavior.
For example, IMDb's "Cneonction: close" (not a typo) feature.
But this could be totally unrelated, I am by no means an HTTP expert.

--
components: Library (Lib)
messages: 140512
nosy: daniel.ugra
priority: normal
severity: normal
status: open
title: urlib.request fails to open some sites
type: behavior
versions: Python 3.2




[issue12576] urlib.request fails to open some sites

2011-07-16 Thread Senthil Kumaran

Senthil Kumaran sent...@uthcode.com added the comment:

On Sun, Jul 17, 2011 at 12:07:44AM +, Ugra Dániel wrote:

> For example IMDb's "Cneonction: close" (not a typo) feature.  But

This is a mistake on the server side, and urllib relies on the
"Connection: close" header at some point in the process.

You could try a few other sites and see that the context manager
works.

--
nosy: +orsenthil
