[issue6791] httplib read status memory usage

2010-12-18 Thread Senthil Kumaran

Senthil Kumaran orsent...@gmail.com added the comment:

This morning I posted a comment on the patch asking why it reads _MAXLENGH + 1 
and then checks whether the length of the header exceeds _MAXLENGH, instead of 
just reading _MAXLENGH (and rejecting if the length matched). (It looks like 
that comment did not go through.)
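
The difference between the two approaches can be seen in a small sketch. It assumes a hypothetical limit named MAX_LINE (standing in for the constant discussed above) and uses io.BytesIO in place of a socket file; none of these helper names come from the patch. Reading MAX_LINE + 1 bytes lets a line of exactly MAX_LINE bytes pass while still detecting a longer one, whereas reading only MAX_LINE bytes and rejecting on an exact match also rejects that boundary-length line.

```python
import io

MAX_LINE = 16  # hypothetical small limit, for illustration only


def read_line_plus_one(fp, limit=MAX_LINE):
    """Read limit + 1 bytes and reject only lines longer than limit."""
    line = fp.readline(limit + 1)
    if len(line) > limit:
        raise ValueError("line too long")
    return line


def read_line_exact(fp, limit=MAX_LINE):
    """Read limit bytes and reject any line that fills the whole limit."""
    line = fp.readline(limit)
    if len(line) == limit:
        raise ValueError("line too long")
    return line


def accepted(reader, data):
    """Return True if reader accepts the first line of data."""
    try:
        reader(io.BytesIO(data))
        return True
    except ValueError:
        return False


boundary = b"x" * (MAX_LINE - 1) + b"\n"  # exactly MAX_LINE bytes, newline included
overlong = b"x" * (MAX_LINE * 4)          # no newline at all
```

Both variants bound memory use; they differ only in how a line of exactly the limit length is treated.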

I think that either way is okay. I am taking the privilege of committing the 
patch. Fixed for py3k in r87373, so it will be available in the next beta.

I shall merge the changes to the other codelines.

--
resolution:  -> fixed

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue6791
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue6791] httplib read status memory usage

2010-12-18 Thread Antoine Pitrou

Antoine Pitrou pit...@free.fr added the comment:

Partially backported in r87382 (3.1) and r87383 (2.7). Not everything could be 
merged in because of HTTP 0.9 support and (in 2.7) a slightly different 
architecture. Thank you.

--
stage: patch review -> committed/rejected
status: open -> closed




[issue6791] httplib read status memory usage

2010-12-17 Thread Antoine Pitrou

Antoine Pitrou pit...@free.fr added the comment:

Now that 0.9 client support has been removed, this can proceed (at least for 
3.2).

--




[issue6791] httplib read status memory usage

2010-12-17 Thread Antoine Pitrou

Antoine Pitrou pit...@free.fr added the comment:

Here is a patch limiting line length everywhere in http.client, + tests (it 
also affects http.server since the header parsing routine is shared).
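
A test for this kind of limit can be sketched without a real server. The example below assumes a Python 3 version whose http.client already contains a line-length limit and exposes http.client.LineTooLong; FakeSocket and status_line_outcome are hypothetical helpers written for this illustration and are not taken from the attached patch.

```python
import http.client
import io


class FakeSocket:
    """Hypothetical stand-in for a socket: serves canned bytes via makefile()."""

    def __init__(self, data):
        self._data = data

    def makefile(self, mode="rb", *args, **kwargs):
        return io.BytesIO(self._data)


def status_line_outcome(data):
    """Parse data as an HTTP response and report what happened."""
    response = http.client.HTTPResponse(FakeSocket(data))
    try:
        response.begin()
    except http.client.LineTooLong:
        return "line too long"
    except http.client.HTTPException as exc:
        return type(exc).__name__
    finally:
        response.close()
    return response.status


# 1 MiB without a newline, the situation described in this issue.
flood = b"\0" * (1024 * 1024)
ok = b"HTTP/1.1 200 OK\r\nContent-Length: 0\r\n\r\n"
```

With the limit in place, the flood is rejected after a bounded read instead of being buffered in full, while a normal response still parses.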

--
stage: needs patch -> patch review
Added file: http://bugs.python.org/file20100/httplinelength.patch




[issue6791] httplib read status memory usage

2010-12-16 Thread Antoine Pitrou

Antoine Pitrou pit...@free.fr added the comment:

Well, removing 0.9 support doesn't make this obsolete, does it?

--
status: pending -> open




[issue6791] httplib read status memory usage

2010-12-16 Thread Senthil Kumaran

Senthil Kumaran orsent...@gmail.com added the comment:

On Thu, Dec 16, 2010 at 01:18:30PM +, Antoine Pitrou wrote:
> Well, removing 0.9 support doesn't make this obsolete, does it?

It does. Doesn't it? Because I saw in your patch that you fall back on
HTTP 1.0 behaviour when the server does not return a status line, in
which case an exception will be raised and this issue won't be
observed.

--




[issue6791] httplib read status memory usage

2010-12-16 Thread Antoine Pitrou

Antoine Pitrou pit...@free.fr added the comment:

> It does. Doesn't it? Because I saw in your patch that you fall back on
> HTTP 1.0 behaviour when the server does not return a status line, in
> which case an exception will be raised and this issue won't be
> observed.

I don't think you understood the issue here. Calling readline() without
a maximum length means the process memory potentially explodes, if the
server sends gigabytes of data without a single \n.
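
The effect of passing a maximum length can be illustrated with a stream that never sends a newline. This is only a sketch: EndlessZeros is a hypothetical stand-in for a misbehaving server, and LIMIT is an assumed value rather than the constant used by any patch in this issue.

```python
import io

LIMIT = 65536  # hypothetical cap; the patch under discussion defines its own


class EndlessZeros(io.RawIOBase):
    """Raw stream that never ends and never contains a newline."""

    def __init__(self):
        super().__init__()
        self.served = 0  # bytes handed out so far

    def readable(self):
        return True

    def readinto(self, buffer):
        n = len(buffer)
        buffer[:n] = b"\0" * n
        self.served += n
        return n


raw = EndlessZeros()
fp = io.BufferedReader(raw, buffer_size=8192)

# A bounded readline() returns after at most LIMIT + 1 bytes.
# Calling fp.readline() with no argument here would never return and would
# keep accumulating data until memory ran out.
line = fp.readline(LIMIT + 1)
too_long = len(line) > LIMIT
```

The caller can then treat too_long as an error, having pulled only slightly more than the limit from the stream.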

--




[issue6791] httplib read status memory usage

2010-12-16 Thread Senthil Kumaran

Senthil Kumaran orsent...@gmail.com added the comment:

On Thu, Dec 16, 2010 at 02:02:10PM +, Antoine Pitrou wrote:
> I don't think you understood the issue here. Calling readline() without
> a maximum length means the process memory potentially explodes, if the
> server sends gigabytes of data without a single \n.

Yeah, I seem to have misunderstood the issue.  Even if the response was
an *invalid* one, if it was huge data without \n, the readline call
would just explode.

- Reading a chunked response does a readline call too.

Both of these need to be addressed by having a limit on reading.

I thought readline() was being called only when parsing headers, which
should almost always have CRLF (or at least LF), and thought valid
responses always start with headers.
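
For the chunked case, the same bounded read can be applied to the chunk-size line. The sketch below is an assumption about how such a check might look, not the code from the patch; read_chunk_size and MAX_LINE are hypothetical names.

```python
import io

MAX_LINE = 65536  # hypothetical cap on any single protocol line


def read_chunk_size(fp, limit=MAX_LINE):
    """Read one chunk-size line with a bounded readline() and return the size."""
    line = fp.readline(limit + 1)
    if len(line) > limit:
        raise ValueError("chunk size line too long")
    # Drop any chunk extension after ';' before parsing the hexadecimal size.
    size_field = line.split(b";", 1)[0].strip()
    return int(size_field, 16)
```

A server that streams data without a newline where the chunk size is expected is then rejected after limit + 1 bytes rather than being read indefinitely.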

--




[issue6791] httplib read status memory usage

2010-12-15 Thread Ross Lagerwall

Ross Lagerwall rosslagerw...@gmail.com added the comment:

Attached is a unit test which tests the issue.
Unfortunately, since it uses the resource module to limit memory to a workable 
size, it will only work on Unix.

The given patch appears to fix the issue well.

I think this should be treated as a security issue (even if a rather odd one), 
since a malicious HTTP server could be set up in place of the normal one and 
crash any Python HTTP clients that connect to it.

Eg:
Run: dd if=/dev/zero bs=10M count=1000 | nc -l 
And then:

import httplib
h = httplib.HTTPConnection('localhost', )
h.connect()
h.request('GET', '/')
r = h.getresponse()

This should cause python to use up all the memory available.

--
nosy: +rosslagerwall
Added file: http://bugs.python.org/file20048/i6791_unittest.patch




[issue6791] httplib read status memory usage

2010-12-15 Thread Ross Lagerwall

Ross Lagerwall rosslagerw...@gmail.com added the comment:

A py3k patch against revision 87228.

--
Added file: http://bugs.python.org/file20049/i6791_py3k.patch




[issue6791] httplib read status memory usage

2010-12-15 Thread Antoine Pitrou

Antoine Pitrou pit...@free.fr added the comment:

First, I don't think the resource module needs to be used here. Second, I don't 
see why getcode() would return 200. If no valid response was received then some 
kind of error should certainly be raised, shouldn't it?

--
nosy: +pitrou




[issue6791] httplib read status memory usage

2010-12-15 Thread Antoine Pitrou

Antoine Pitrou pit...@free.fr added the comment:

By the way, looking at the code, readline() without any parameter is used all 
over http.client, so fixing only this one use case doesn't really make sense.

--
stage: unit test needed -> needs patch




[issue6791] httplib read status memory usage

2010-12-15 Thread Ross Lagerwall

Ross Lagerwall rosslagerw...@gmail.com added the comment:

That's true. Near the bottom of the code, it says:

 # The status-line parsing code calls readline(), which normally
 # get the HTTP status line.  For a 0.9 response, however, this is
 # actually the first line of the body!

Limiting the length of the status line would break 0.9 responses so maybe this 
issue should be closed?

--




[issue6791] httplib read status memory usage

2010-12-15 Thread Antoine Pitrou

Antoine Pitrou pit...@free.fr added the comment:

> That's true. Near the bottom of the code, it says:
> 
>  # The status-line parsing code calls readline(), which normally
>  # get the HTTP status line.  For a 0.9 response, however, this is
>  # actually the first line of the body!
> 
> Limiting the length of the status line would break 0.9 responses so maybe 
> this issue should be closed?

Well, the HTTP 1.0 RFC was filed in 1996 and HTTP 1.1 is the most commonly
used version today. I don't think we need to support 0.9 anymore. I'll open a
separate issue for ripping out 0.9 support, though.

--




[issue6791] httplib read status memory usage

2010-12-15 Thread Senthil Kumaran

Senthil Kumaran orsent...@gmail.com added the comment:

I just read the whole discussion and it seems that the code was in place so 
that the client can tolerate a bad HTTP 0.9 server response.
http://www.w3.org/Protocols/HTTP/OldServers.html

Given that issue10711 talks about removing HTTP/0.9 support (+1 to that), this 
issue will become obsolete.  I too support removing HTTP/0.9. There are hardly 
any advantages in keeping it around.

--
nosy:  -BreamoreBoy
status: open -> pending




[issue6791] httplib read status memory usage

2010-07-20 Thread Mark Lawrence

Mark Lawrence breamore...@yahoo.co.uk added the comment:

Sumar, to move this forward, could you please provide a unit test?

--
nosy: +BreamoreBoy
stage:  -> unit test needed
versions: +Python 2.7, Python 3.1, Python 3.2 -Python 2.4, Python 2.5, Python 
2.6




[issue6791] httplib read status memory usage

2010-07-20 Thread Senthil Kumaran

Changes by Senthil Kumaran orsent...@gmail.com:


--
assignee:  -> orsenthil
nosy: +orsenthil




[issue6791] httplib read status memory usage

2009-09-14 Thread STINNER Victor

Changes by STINNER Victor victor.stin...@haypocalc.com:


--
nosy: +haypo




[issue6791] httplib read status memory usage

2009-08-28 Thread sumar

New submission from sumar m.sucaj...@gmail.com:

While writing some code I discovered a problem in the behaviour of httplib.
When we connect to a host which doesn’t respond with a status line but just
keeps sending data, httplib may consume more and more memory, because when we
execute
h = httplib.HTTPConnection('host')
h.connect()
h.request('GET', '/')
r = h.getresponse()
httplib tries to read one line from the host. If the host doesn’t send a
newline character (‘\n’), httplib reads more and more data. In my tests httplib
could consume all of 4GB of memory and the python process was killed by
oom_killer.
The resolution is to limit the maximum amount of data read when getting the
response. I have performed some tests:
I received 3438293 responses from hosts located in the network. The longest
valid response line is
HTTP/1.1 500 ( The specified Secure Sockets Layer (SSL) port is not
allowed. ISA Server is not configured to allow SSL requests from this
port. Most Web browsers use port 443 for SSL requests.  )\r\n
and it has 197 characters.
In RFC2616 in section 6.1 we have:
“The first line of a Response message is the Status-Line, consisting of
the protocol version followed by a numeric status code and its
associated textual phrase, with each element separated by SP characters.
No CR or LF is allowed except in the final CRLF sequence.
   Status-Line = HTTP-Version SP Status-Code SP Reason-Phrase CRLF
(..)The Reason-Phrase is intended to give a short textual description of
the Status-Code.”
So limiting the maximum status line length to 256 characters is a solution
to this problem. It doesn’t break compatibility with RFC 2616.
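
The proposal can be sketched as a bounded read followed by a split along the Status-Line grammar quoted above. This is an illustration only, not the contents of httplib.patch; read_status_line and MAX_STATUS_LINE are hypothetical names, and io.BytesIO stands in for the socket file.

```python
import io

MAX_STATUS_LINE = 256  # the limit proposed in this report


def read_status_line(fp, limit=MAX_STATUS_LINE):
    """Read and split a Status-Line, refusing lines longer than limit bytes."""
    line = fp.readline(limit + 1)
    if len(line) > limit:
        raise ValueError("status line too long")
    text = line.decode("iso-8859-1").rstrip("\r\n")
    # Status-Line = HTTP-Version SP Status-Code SP Reason-Phrase CRLF
    parts = text.split(" ", 2)
    if len(parts) < 2 or not parts[0].startswith("HTTP/"):
        raise ValueError("malformed status line")
    version, code = parts[0], int(parts[1])
    reason = parts[2] if len(parts) == 3 else ""
    return version, code, reason
```

A host that sends data without any newline then costs at most limit + 1 bytes of memory before the response is rejected.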

My patch was written originally on python2.4, but I’ve tested it on
python2.6:
[...@host python2.6]$ patch --dry-run -i /home/ms/httplib.patch
patching file httplib.py
Hunk #1 succeeded at 209 (offset 54 lines).

--
components: Library (Lib)
files: httplib.patch
keywords: patch
messages: 92027
nosy: m.sucajtys
severity: normal
status: open
title: httplib read status memory usage
type: resource usage
versions: Python 2.4, Python 2.5, Python 2.6
Added file: http://bugs.python.org/file14795/httplib.patch




[issue6791] httplib read status memory usage

2009-08-28 Thread sumar

sumar m.sucaj...@gmail.com added the comment:

I've also checked the patch against the code in the svn tree:
wget http://svn.python.org/projects/python/trunk/Lib/httplib.py
patch -p0 -i httplib.patch --dry-run
patching file httplib.py
Hunk #1 succeeded at 209 (offset 54 lines).
Hunk #2 succeeded at 303 (offset 10 lines).

--
