[issue4631] urlopen returns extra, spurious bytes

2009-02-10 Thread Antoine Pitrou

Antoine Pitrou pit...@free.fr added the comment:

I took a look at the patch and it looks ok, apart from the
_checkClosed() hack (but I don't think there's any immediate solution).
It should be noted that HTTPResponse.readline() will be awfully slow
since, as HTTPResponse doesn't have peek(), readline() will call read()
one byte at a time...
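
A small sketch of the slowdown described above, under the assumption that the generic readline() inherited from io.IOBase is the one in use. CountingRaw is a made-up class for illustration, not part of http.client; it only counts how often the stream is asked for data. Wrapping the same kind of stream in io.BufferedReader, which does provide peek(), is shown for comparison.

```python
import io


class CountingRaw(io.RawIOBase):
    """Hypothetical raw stream with no peek(); counts requests for data."""

    def __init__(self, data):
        self._src = io.BytesIO(data)
        self.calls = 0

    def readable(self):
        return True

    def readinto(self, buf):
        self.calls += 1
        return self._src.readinto(buf)


data = b"hello\nworld\n"

# readline() inherited from io.IOBase: no peek(), so it asks for one byte
# at a time until it sees the newline.
unbuffered = CountingRaw(data)
first_line = unbuffered.readline()

# Same data behind a BufferedReader, which reads ahead in larger blocks.
raw = CountingRaw(data)
buffered_line = io.BufferedReader(raw).readline()

print(first_line, "requests for data:", unbuffered.calls)
print(buffered_line, "requests for data:", raw.calls)
```

Both calls return the same line; the difference is only in how many times the underlying stream is hit, which is what makes the peek()-less path slow on a real socket.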

(slow I/O is nothing new in py3k, however :-))

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue4631
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue4631] urlopen returns extra, spurious bytes

2009-02-10 Thread Antoine Pitrou

Antoine Pitrou pit...@free.fr added the comment:

Here is a patch without the _checkClosed() hack. The solution is simply
to remove redundant _checkClosed() calls in IOBase (for example,
readline() doesn't need to do an explicit `closed` check as it calls
read()).

Added file: http://bugs.python.org/file13021/urllib-chunked2.diff




[issue4631] urlopen returns extra, spurious bytes

2009-02-10 Thread Antoine Pitrou

Changes by Antoine Pitrou pit...@free.fr:


--
resolution:  -> accepted
status: open -> pending




[issue4631] urlopen returns extra, spurious bytes

2009-02-10 Thread Antoine Pitrou

Antoine Pitrou pit...@free.fr added the comment:

Committed in r69513, r69514. Thanks everyone!

--
resolution: accepted -> fixed
status: pending -> closed




[issue4631] urlopen returns extra, spurious bytes

2009-02-08 Thread Antoine Pitrou

Antoine Pitrou pit...@free.fr added the comment:

In principle, the test looks good.
If you want to avoid the 'if "%" in value' hack, you can use the
named-parameter form of string formatting:

>>> "localhost:%(port)s" % dict(port=8080)
'localhost:8080'
>>> "localhost" % dict(port=8080)
'localhost'
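
To spell out why the named-parameter form removes the need for a check, here is a short sketch. The render() helper is a hypothetical name used only for this illustration; the point is that a mapping on the right-hand side is accepted even when the template has no placeholder, while a plain positional argument is not.

```python
def render(template, port):
    """Fill in %(port)s when present; templates without it pass through."""
    return template % {"port": port}


print(render("localhost:%(port)s", 8080))
print(render("localhost", 8080))

# The positional form needs the placeholder to be there, hence the hack.
try:
    "localhost" % 8080
except TypeError as exc:
    print("positional form fails:", exc)
```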




[issue4631] urlopen returns extra, spurious bytes

2009-02-08 Thread Daniel Diniz

Daniel Diniz aja...@gmail.com added the comment:

Antoine,
Thanks for reviewing, here's an updated version.

Added file: http://bugs.python.org/file12988/test_urllib_chunked2.diff




[issue4631] urlopen returns extra, spurious bytes

2009-02-08 Thread Daniel Diniz

Changes by Daniel Diniz aja...@gmail.com:


Removed file: http://bugs.python.org/file12975/test_urllib_chunked.diff




[issue4631] urlopen returns extra, spurious bytes

2009-02-08 Thread Antoine Pitrou

Antoine Pitrou pit...@free.fr added the comment:

The test looks good to me.
I can't comment on the bugfix patch, but if it's ok with you, you can go
ahead :)




[issue4631] urlopen returns extra, spurious bytes

2009-02-07 Thread Daniel Diniz

Daniel Diniz aja...@gmail.com added the comment:

Here's a test (in test_urllib2_localnet) that fails before the patch and
passes after, mostly lifted from test_httplib:

    def test_chunked(self):
        expected_response = b"hello world"
        chunked_start = (
            b'a\r\n'
            b'hello worl\r\n'
            b'1\r\n'
            b'd\r\n'
        )
        response = [(200, [("Transfer-Encoding", "chunked")],
                     chunked_start)]
        handler = self.start_server(response)
        data = self.urlopen("http://localhost:%s/" % handler.port)
        self.assertEquals(data, expected_response)

Output:

test test_urllib2_localnet failed -- Traceback (most recent call last):
  File "~/py3k/Lib/test/test_urllib2_localnet.py", line 390, in test_chunked
    self.assertEquals(data, expected_response)
AssertionError: b'a\r\nhello worl\r\n1\r\nd\r\n' != b'hello world'

To allow this test to work, the attached patch also touches
FakeHTTPRequestHandler and TestUrlopen.urlopen.
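
For readers unfamiliar with the framing in the failing assertion, here is a minimal sketch of how a chunked body maps to its payload. decode_chunked() is a hypothetical helper written for this note, not code from the patch or from http.client; it ignores trailers and, like the test data above, tolerates a body that stops without the final zero-length chunk.

```python
def decode_chunked(raw: bytes) -> bytes:
    """Strip chunked transfer framing: <hex size>CRLF<data>CRLF, repeated."""
    body = bytearray()
    pos = 0
    while pos < len(raw):
        eol = raw.index(b"\r\n", pos)
        # Anything after ';' on the size line is a chunk extension.
        size_text = raw[pos:eol].split(b";", 1)[0]
        size = int(size_text.decode("ascii"), 16)
        if size == 0:               # last-chunk marker
            break
        start = eol + 2
        body += raw[start:start + size]
        pos = start + size + 2      # skip the data and its trailing CRLF
    return bytes(body)


print(decode_chunked(b'a\r\nhello worl\r\n1\r\nd\r\n'))
```

The first size line is 'a' (hexadecimal 10), which covers 'hello worl'; the second chunk of size 1 carries the final 'd'.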

Added file: http://bugs.python.org/file12975/test_urllib_chunked.diff




[issue4631] urlopen returns extra, spurious bytes

2009-01-28 Thread Te-jé Rodgers

Changes by Te-jé Rodgers cont...@tejerodgers.com:


--
nosy: +trodgers




[issue4631] urlopen returns extra, spurious bytes

2009-01-28 Thread Jean-Paul Calderone

Changes by Jean-Paul Calderone exar...@divmod.com:


--
nosy:  -exarkun




[issue4631] urlopen returns extra, spurious bytes

2009-01-09 Thread Craig Holmquist

Changes by Craig Holmquist craigh...@gmail.com:


--
nosy: +craigh




[issue4631] urlopen returns extra, spurious bytes

2008-12-21 Thread Martin v. Löwis

Changes by Martin v. Löwis mar...@v.loewis.de:


--
priority: critical -> release blocker




[issue4631] urlopen returns extra, spurious bytes

2008-12-21 Thread Antoine Pitrou

Antoine Pitrou pit...@free.fr added the comment:

The patch should have at least a test so that we don't have a regression
on this one.




[issue4631] urlopen returns extra, spurious bytes

2008-12-15 Thread Jeremy Hylton

Jeremy Hylton jer...@alum.mit.edu added the comment:

I have a patch here that seems to work for the specific url and that
passes all the tests.  Can anyone check whether it works for a larger
set of cases?

I'm a little concerned because I don't understand the new io library in
much detail.  There's an override for _checkClosed() in the HTTPResponse
that seems a little dodgy.  I'll try to get someone to review that
specifically.

Added file: http://bugs.python.org/file12361/urllib-chunked.diff




[issue4631] urlopen returns extra, spurious bytes

2008-12-15 Thread Daniel Diniz

Daniel Diniz aja...@gmail.com added the comment:

I think your patch is good, but there may be another bug around:

I wrote a script to check results of 3.x against 2.x, but many pages
(http://groups.google.com/, http://en.wikipedia.org/) give 403:
Forbidden for 3.x... but work with 2.x!

If you think of this as a bug in 3.x, it could retry the request
identifying as 2.x on 403.

Other than that, your patch gives me identical results to 2.5/2.6 for
128 sites I tested (only a read(100) for each).

Interestingly, my patched version gives a file closer to the buggy
version in size, at 12700 bytes versus 12707. Your version agrees with
2.x and simple maths (128 x 100) in giving a 12799 bytes result. I have
no idea why.

HTH,
Daniel




[issue4631] urlopen returns extra, spurious bytes

2008-12-14 Thread Adeodato Simó

Adeodato Simó d...@net.com.org.es added the comment:

> Does the same thing happen with 2.6?

No, I can't reproduce with 2.6.1.




[issue4631] urlopen returns extra, spurious bytes

2008-12-14 Thread Antoine Pitrou

Changes by Antoine Pitrou pit...@free.fr:


--
priority:  -> critical
type:  -> behavior




[issue4631] urlopen returns extra, spurious bytes

2008-12-14 Thread Resul Cetin

Resul Cetin resul-ce...@gmx.net added the comment:

I have the same problem with this code (replace USERNAME and PASSWORD
with your delicious username and password):

>>> import urllib.request
>>> auth_handler = urllib.request.HTTPBasicAuthHandler()
>>> auth_handler.add_password('del.icio.us API', 'api.del.icio.us',
...                           USERNAME, PASSWORD)
>>> opener = urllib.request.build_opener(auth_handler)
>>> print(str(opener.open('https://api.del.icio.us/v1/posts/all').read(20),
...           "utf-8"))

I don't use a proxy or anything like that. This makes Python 3
completely unusable for me, while Python 2.6 gives me what I want (the
content of that virtual file) without any extra data in front or in the
middle of the content.

--
nosy: +ResulCetin




[issue4631] urlopen returns extra, spurious bytes

2008-12-14 Thread Adeodato Simó

Adeodato Simó d...@net.com.org.es added the comment:

> FWIW, there are trailing spurious bytes too

And in the middle of the document as well. Each time there's a chunk, I
guess?




[issue4631] urlopen returns extra, spurious bytes

2008-12-14 Thread Daniel Diniz

Daniel Diniz aja...@gmail.com added the comment:

Clarifying the diagnosis, the offending spurious bytes are only present
when we use 3.0's GET above.

That's because urllib.request.HTTPHandler asks for a vanilla
http.client.HTTPConnection, which uses HTTP 1.1.

IIUC, either we change the request version back to 1.0 (attached patch)
or correct the way the response is processed (is it at all?).
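
As an illustration of the first option only: chunked transfer coding is an HTTP/1.1 feature, so a request that announces HTTP/1.0 should not get a chunked reply. The sketch below is an assumption about one way to do that, not a description of what urllib_bytes.diff contains. It relies on the private class attributes _http_vsn and _http_vsn_str of http.client.HTTPConnection, which may change between versions; HTTP10Connection, _CaptureSocket and request_line() are hypothetical names, and no network connection is made.

```python
import http.client


class HTTP10Connection(http.client.HTTPConnection):
    # Assumption: these private attributes select the version that
    # putrequest() writes on the request line.
    _http_vsn = 10
    _http_vsn_str = "HTTP/1.0"


class _CaptureSocket:
    """Stand-in socket that records what would have been sent."""

    def __init__(self):
        self.sent = b""

    def sendall(self, data):
        self.sent += data


def request_line(conn_cls):
    """Return the request line conn_cls would send for GET /."""
    conn = conn_cls("localhost")
    conn.sock = _CaptureSocket()    # already "connected": nothing is opened
    conn.putrequest("GET", "/")
    conn.endheaders()
    return conn.sock.sent.split(b"\r\n", 1)[0]


print(request_line(http.client.HTTPConnection))
print(request_line(HTTP10Connection))
```

Downgrading the request sidesteps the symptom but gives up HTTP/1.1 features, which is why the second option (decoding the response properly) is the more complete fix.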

I think HTTPSHandler will also suffer from this, perhaps
[Fancy]URLopener too.

[Antoine: cool, an edit conflict that agrees with what I was about to
post :D]

--
keywords: +patch
Added file: http://bugs.python.org/file12351/urllib_bytes.diff




[issue4631] urlopen returns extra, spurious bytes

2008-12-14 Thread Jeremy Hylton

Changes by Jeremy Hylton jer...@alum.mit.edu:


--
assignee:  -> jhylton




[issue4631] urlopen returns extra, spurious bytes

2008-12-14 Thread Jeremy Hylton

Jeremy Hylton jer...@alum.mit.edu added the comment:

Brief update:  The Python 2.x code works because readline() is provided
by socket._fileobject.  The Python 3.x code fails because it grabs the
HTTPResponse.fp instance variable at the end of
AbstractHTTPHandler.do_open.  That method needs to pass the response to
addinfourl(), but needs to have support for readline / readlines before
it can do that.
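
A rough sketch of the distinction drawn above, not the committed patch: calling readline() on the underlying file object returns the chunk framing, while calling it on an object that decodes the chunks returns body data. DechunkingReader is a hypothetical wrapper written for this illustration; it gets readline() and readlines() for free by deriving from io.RawIOBase and implementing only readinto(). Trailers and error handling are left out.

```python
import io


class DechunkingReader(io.RawIOBase):
    """Hypothetical wrapper that strips chunked framing from `fp`."""

    def __init__(self, fp):
        self._fp = fp
        self._left = 0          # bytes still unread in the current chunk
        self._done = False

    def readable(self):
        return True

    def readinto(self, buf):
        if self._done:
            return 0
        if self._left == 0:
            size_line = self._fp.readline()
            if not size_line:               # stream ended without a last chunk
                self._done = True
                return 0
            size_text = size_line.split(b";", 1)[0].strip()
            self._left = int(size_text.decode("ascii"), 16)
            if self._left == 0:             # last-chunk marker
                self._done = True
                return 0
        data = self._fp.read(min(len(buf), self._left))
        if not data:
            self._done = True
            return 0
        buf[:len(data)] = data
        self._left -= len(data)
        if self._left == 0:
            self._fp.readline()             # drop the CRLF that ends the chunk
        return len(data)


wire = b"9\r\nline one\n\r\n9\r\nline two\n\r\n0\r\n\r\n"

framing = io.BytesIO(wire).readline()                       # what the bug exposed
body_line = DechunkingReader(io.BytesIO(wire)).readline()   # what callers expect

print(framing)
print(body_line)
```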




Re: [issue4631] urlopen returns extra, spurious bytes

2008-12-13 Thread Jeremy Hylton

Does the same thing happen with 2.6?

Jeremy

On Thu, Dec 11, 2008 at 8:53 AM, Jean-Paul Calderone
rep...@bugs.python.org wrote:

> Jean-Paul Calderone exar...@divmod.com added the comment:
>
> The f65 is the chunk length for the first chunk returned when
> requesting that URL.  A proxy could easily hide this by switching to a
> different transfer encoding.
>
> --
> nosy: +exarkun






[issue4631] urlopen returns extra, spurious bytes

2008-12-11 Thread Adeodato Simó

New submission from Adeodato Simó [EMAIL PROTECTED]:

This is very odd, but it was reproduced by people in #python as well. 
Compare, in Python 2.5:

>>> urllib.urlopen('http://bugs.debian.org/cgi-bin/bugreport.cgi?mbox=yes;bug=123456').readline()
'From [EMAIL PROTECTED] Tue Dec 11 11:32:47 2001\n'

To the equivalent in Python 3.0:

>>> urllib.request.urlopen('http://bugs.debian.org/cgi-bin/bugreport.cgi?mbox=yes;bug=123456').readline()
b'f65\r\n'

--
components: Library (Lib)
messages: 77603
nosy: dato
severity: normal
status: open
title: urlopen returns extra, spurious bytes
versions: Python 3.0




[issue4631] urlopen returns extra, spurious bytes

2008-12-11 Thread Amaury Forgeot d'Arc

Amaury Forgeot d'Arc [EMAIL PROTECTED] added the comment:

I can't reproduce the problem:

>>> urllib.request.urlopen('http://bugs.debian.org/cgi-bin/bugreport.cgi?mbox=yes;bug=123456').readline()
b'From [EMAIL PROTECTED] Tue Dec 11 11:32:47 2001\n'

I connect through an HTTP proxy.

--
nosy: +amaury.forgeotdarc
