[issue4448] should socket readline() use default_bufsize instead of _rbufsize?
Kristján Valur Jónsson krist...@ccpgames.com added the comment:

Issue 4879 has been resolved so that HTTPResponse invokes socket.socket.makefile() with default buffering; see r69209. Since the problem stated in this defect has no bearing on 3.0 (there is no special hack for readline() in 3.0), I am closing this again.

-- status: open -> closed
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue4448
___
___
Python-bugs-list mailing list
Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue4448] should socket readline() use default_bufsize instead of _rbufsize?
Kristján Valur Jónsson krist...@ccpgames.com added the comment:

I have looked at this for py3k. The behaviour of HTTPResponse.fp.read() is the same whether fp is buffered or not: a read() will read to EOF for HTTP/1.1, which means blocking indefinitely. So, read() is forbidden for HTTP/1.1.

For fp.read(n), buffered IO won't attempt to read more than is on the stream if n bytes are available (SocketIO.read(N) will return aN and not block), so there is no reason not to use buffering.
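The fp.read(n) claim above can be checked with a small Python 3 sketch. It is only an illustration under stated assumptions: a local socket.socketpair() stands in for the HTTP connection, and the function name read_n_buffered and the payload are hypothetical, not taken from http.client.

```python
import socket


def read_n_buffered(n, payload=b"0123456789"):
    """Send `payload` without closing the writer, then read n bytes
    through a buffered file object (hypothetical helper, n <= len(payload))."""
    writer, reader = socket.socketpair()
    try:
        reader.settimeout(2.0)      # safety net: raise instead of hanging
        writer.sendall(payload)     # writer stays open, so no EOF is sent
        fp = reader.makefile("rb")  # default buffering
        try:
            # Returns as soon as n bytes are available, even though the
            # peer has not closed the connection.
            return fp.read(n)
        finally:
            fp.close()
    finally:
        writer.close()
        reader.close()


if __name__ == "__main__":
    print(read_n_buffered(4))
```

A bare fp.read() in the same setup would wait for EOF, which is the blocking case the comment describes for HTTP/1.1.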
[issue4448] should socket readline() use default_bufsize instead of _rbufsize?
Gregory P. Smith g...@krypto.org added the comment:

Unassigning; I don't have time to look at this one right now.

-- assignee: gregory.p.smith ->
[issue4448] should socket readline() use default_bufsize instead of _rbufsize?
Kristján Valur Jónsson krist...@ccpgames.com added the comment:

Hi, I'm reawakening this because http://bugs.python.org/issue4879 needs to be ported to py3k. In py3k, a socket.fileobject() is still created with bufsize(0), although now the reasoning is different:

    def __init__(self, sock, debuglevel=0, strict=0, method=None):
        # XXX If the response includes a content-length header, we
        # need to make sure that the client doesn't read more than the
        # specified number of bytes.  If it does, it will block until
        # the server times out and closes the connection.  (The only
        # applies to HTTP/1.1 connections.)  Since some clients access
        # self.fp directly rather than calling read(), this is a little
        # tricky.
        self.fp = sock.makefile("rb", 0)

I think that this is just a translation of the old comment, i.e. a warning that some people may choose to call .recv() on the underlying socket. This should be far more difficult now, with the newfangled IO library and all, since sock.makefile() now returns a SocketIO object which inherits from RawIOBase and all that. It's tricky to extract the socket to do .recv() on it. So, I don't think we need to fear buffering for readline() anymore.

Or is the comment about someone doing an HTTPResponse.fp.read() instead of an HTTPResponse.read()? In that case, I don't see the problem. Of course, anyone reading N characters from a socket stream may cause blocking.

My proposal is to remove the comment above and use default buffering for the fileobject. Any thoughts?

-- versions: +Python 3.1
[issue4448] should socket readline() use default_bufsize instead of _rbufsize?
Changes by Gabriel Genellina gagsl-...@yahoo.com.ar:

-- nosy: +gagenellina
[issue4448] should socket readline() use default_bufsize instead of _rbufsize?
Kristján Valur Jónsson [EMAIL PROTECTED] added the comment:

If you look at http://bugs.python.org/issue4336, half of the proposed patch is an attempt to deal with this performance issue. In the patch, we laboriously ensure that bufsize=-1 is passed in for the xmlrpc client.

Seeing your comment, I realize that xmlrpclib.py also uses direct access to h._conn.sock (if present) and uses recv() on that. In fact, that is the only place in the standard library where I can find this pattern. Was that a performance improvement? It is hard to see how bypassing buffered read with a manual recv() can significantly alter performance.

In all the cases in test_xmlrpc.py, h._conn.sock is actually None because h._conn has been closed in HTTPConnection.getresponse(). Therefore, my patch continues to work. However, I will fix that patch to cater to this strange special case.

Please observe, though, that since _fileobject.read() calls are always buffered, in general there is no way to safely mix read() and recv() calls, although the combination of recv() and readline() has been fudged to work. Isn't this just a case of a wart in the standard lib that we ought to remove?

Here is a suggestion:
1) document why readline() observes 0 buffering (to enable it to be used as a readline() utility tool on top of vanilla socket recv())
2) stop doing that in xmlrpclib and use default buffering.

-- nosy: +krisvale
[issue4448] should socket readline() use default_bufsize instead of _rbufsize?
Guido van Rossum [EMAIL PROTECTED] added the comment:

I'm fine with disabling this feature in xmlrpclib.py, and possibly even in httplib.py. I'm *not* fine with fixing this behavior in socket.py -- the unittest coverage is unfortunately small and we have had plenty of trouble in this area in the past. It is there for a reason, even if that reason is hard to fathom and poorly documented.

Fortunately in 3.0 it's gone (or, more likely, replaced with a different set of issues :-).
[issue4448] should socket readline() use default_bufsize instead of _rbufsize?
New submission from Gregory P. Smith [EMAIL PROTECTED]:

From Kristján Valur Jónsson (kristjan at ccpgames.com) on python-dev:
http://mail.python.org/pipermail/python-dev/2008-November/083724.html

I came across this in socket.c:

    # _rbufsize is the suggested recv buffer size.  It is *strictly*
    # obeyed within readline() for recv calls.  If it is larger than
    # default_bufsize it will be used for recv calls within read().

What I worry about is the readline() case. Is there a reason why we want to strictly obey it for that function? Note that in the documentation for _fileobject.read() it says:

    # Use max, disallow tiny reads in a loop as they are very inefficient.

The same argument surely applies for readline().

The reason I am fretting about this is that httplib.py (and therefore xmlrpclib.py) specify bufsize=0 when creating their socket fileobjects, presumably to make sure that write() operations are not buffered but flushed immediately. But this has the side effect of setting _rbufsize to 1, and so readline() calls become very slow.

I suggest that readline() be made to use at least default_bufsize, like read(). Any thoughts?

--
assignee: gregory.p.smith
messages: 76516
nosy: gregory.p.smith
priority: normal
severity: normal
status: open
title: should socket readline() use default_bufsize instead of _rbufsize?
type: performance
versions: Python 2.6
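The cost of one-byte reads in readline() can be illustrated with a rough Python 3 sketch. This is not the Python 2.6 socket._fileobject code discussed in the submission: the CountingRaw class and the count_raw_reads function are hypothetical stand-ins, with an in-memory raw stream playing the role of the socket so that each readinto() call corresponds to one recv().

```python
import io


class CountingRaw(io.RawIOBase):
    """Hypothetical raw stream that counts low-level reads (stand-in for recv)."""

    def __init__(self, data):
        super().__init__()
        self._src = io.BytesIO(data)
        self.calls = 0

    def readable(self):
        return True

    def readinto(self, buf):
        self.calls += 1
        return self._src.readinto(buf)


def count_raw_reads(line, buffered):
    """Return how many raw reads a single readline() needs."""
    raw = CountingRaw(line)
    fp = io.BufferedReader(raw) if buffered else raw
    assert fp.readline() == line
    return raw.calls


if __name__ == "__main__":
    header = b"Content-Length: 42\r\n"
    print("unbuffered:", count_raw_reads(header, buffered=False))
    print("buffered:  ", count_raw_reads(header, buffered=True))
```

In this sketch the unbuffered readline() issues one raw read per byte of the line, while the buffered one fills its buffer in a single raw read, which is the kind of difference the submission is worried about.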
[issue4448] should socket readline() use default_bufsize instead of _rbufsize?
Guido van Rossum [EMAIL PROTECTED] added the comment:

You meant socket.py. This is an extremely subtle area. I would be very wary of changing this -- there is a use case where headers are read from the socket using readline() but the rest of the data is read directly from the socket, and this would break if there was buffered data in the file objects. This is exactly why httplib sets the buffer size to 0.

Fortunately things are completely different in Python 3.0 and I believe the same problem doesn't exist -- in 3.0 it makes more sense to always read from the (binary) buffered file object representing the socket.

-- nosy: +gvanrossum
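The use case described above (headers via readline(), body via a direct recv()) can be reproduced with a small sketch. It is written against Python 3's socket module rather than the Python 2.6 code under discussion, uses a local socket.socketpair() in place of a real HTTP server, and the function name body_via_recv is hypothetical.

```python
import socket


def body_via_recv(buffering):
    """Read headers with readline(), then fetch the body with a raw recv().

    `buffering` is passed to makefile(): 0 for unbuffered, -1 for the default.
    """
    server, client = socket.socketpair()
    try:
        server.sendall(b"Header: x\r\n\r\nBODY")
        server.shutdown(socket.SHUT_WR)  # signal EOF so recv() cannot hang
        fp = client.makefile("rb", buffering)
        try:
            # Consume header lines up to and including the blank line.
            while fp.readline() not in (b"\r\n", b""):
                pass
            # Bypass the file object and read straight from the socket.
            return client.recv(4096)
        finally:
            fp.close()
    finally:
        server.close()
        client.close()


if __name__ == "__main__":
    print("unbuffered:", body_via_recv(0))
    print("buffered:  ", body_via_recv(-1))
```

With buffering 0 the body is still on the socket when recv() runs; with default buffering the file object has already read ahead past the headers, so recv() finds nothing left.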