New submission from shajianrui <shajian...@126.com>:

Windows 10, Python 3.7

I ran into a problem when using the http.server module. I set up a basic server
with HTTPServer and CGIHTTPRequestHandler (no threading or forking) and tried
to POST a large file (>2MB). The server always resets the connection. In some
very rare cases the POST does finish (very slowly), but the CGI script I am
posting to always reports that it received an incomplete file (I call this the
"incomplete file issue").

==========First Try===========

At first I thought (this was actually a misunderstanding, but it led to a
passable workaround) that "self.rfile.read(nbytes)" at LINE 1199 does not
block, so it returns before the POST has finished. So I modified the line as
shown below:

1198        if self.command.lower() == "post" and nbytes > 0:
1199            #data = self.rfile.read(nbytes)     # The original line, commented out.
                # Requires "import socket" at the top of the module.
                bufsize = self.request.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
                databuf = bytearray()
                while len(databuf) < nbytes:
                    # Never ask for more than what is still missing.
                    buf = self.rfile.read(min(bufsize, nbytes - len(databuf)))
                    if not buf:     # The peer closed the connection early.
                        break
                    #print("Get " + str(len(buf)) + " bytes.")
                    databuf += buf
                data = bytes(databuf)       # Now we have the data.

In this modification I just repeatedly read up to 65536 bytes (the default
socket receive buffer size) from rfile until I have nbytes bytes. Now it works
well (the correct file is received), and it is much faster than POSTing with
the original http.server module (in the cases where the "incomplete file
issue" appears).
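The chunked read described above can be sketched outside the server as a small
helper. This is only an illustration under stated assumptions: the name
read_exact and the chunk_size parameter are hypothetical and are not part of
http.server.

```python
import io


def read_exact(stream, nbytes, chunk_size=65536):
    """Read nbytes from a binary stream in chunks, stopping early only at EOF.

    Hypothetical helper for illustration; not part of http.server.
    """
    chunks = []
    remaining = nbytes
    while remaining > 0:
        # A raw (unbuffered) stream may return fewer bytes than requested,
        # so keep asking until everything has arrived.
        chunk = stream.read(min(chunk_size, remaining))
        if not chunk:  # b"" means EOF: the sender stopped early.
            break
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


if __name__ == "__main__":
    payload = b"x" * 200000
    data = read_exact(io.BytesIO(payload), len(payload), chunk_size=4096)
    print(len(data) == len(payload))
```

The caller should still compare the length of the result with nbytes, because
a short result means the upload was cut off.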

==========Second Try==========

However, I now know that blocking is not the problem, because
"self.rfile.read()" should block until the file has been POSTed completely.

I checked the TCP stream with Wireshark and found that in the middle of the
transfer the server's receive window is always 256, so I think the problem is
the variable "rbufsize", which is passed to makefile() when the rfile of the
XXXRequestHandler object is created. At least that explains the low speed, but
I don't know whether it also causes the resets and the incomplete file issue.

I went back to the original version of the http.server module. Then I made a
subclass of socketserver.StreamRequestHandler and overrode its setup() method.
I first copied the code of setup() from StreamRequestHandler and then modified
line 770 (770 is the line number in the socketserver module, but I created the
new subclass in a new file):

770     #self.rfile = self.connection.makefile('rb', self.rbufsize)
        self.rfile = self.connection.makefile('rb', 65536)

Then the POST becomes much faster (faster than with my first modification)!
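Since setup() passes self.rbufsize to makefile(), the same effect can probably
be had by overriding the class attribute instead of copying setup(). The
sketch below is a minimal illustration, not the code from my test server: the
class name BufferedHandler, the toy handle() body and the demo() function are
hypothetical, and a socketpair stands in for a real client connection.

```python
import socket
import socketserver


class BufferedHandler(socketserver.StreamRequestHandler):
    # StreamRequestHandler.setup() calls connection.makefile('rb', self.rbufsize),
    # so this attribute selects the read buffer size of self.rfile.
    rbufsize = 65536

    def handle(self):
        # Toy protocol for the demo only: read exactly five bytes.
        self.received = self.rfile.read(5)


def demo():
    server_side, client_side = socket.socketpair()
    try:
        client_side.sendall(b"hello")
        # Instantiating the handler runs setup(), handle() and finish().
        handler = BufferedHandler(server_side, ("local", 0), None)
        return type(handler.rfile).__name__, handler.received
    finally:
        server_side.close()
        client_side.close()


if __name__ == "__main__":
    print(demo())
```

With a non-zero rbufsize the rfile is a buffered reader, which is why code
that expects a raw socket file object can break afterwards.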

But the server prints an error:

    File "c:\Users\Administrator\Desktop\cgi-server-test\modified_http_server_bad.py", line 1204, in run_cgi    【A copy of the http.server module】
        while select.select([self.rfile._sock], [], [], 0)[0]:          【at line 1204】
    AttributeError: '_io.BufferedReader' object has no attribute '_sock'
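The AttributeError can be reproduced without the server. As far as I can tell,
makefile('rb', 0) returns the raw socket.SocketIO object, which keeps the
socket in a private _sock attribute (a CPython implementation detail), while a
non-zero buffer size wraps it in an io.BufferedReader that has no such
attribute. The function name describe_rfile is hypothetical.

```python
import socket


def describe_rfile(bufsize):
    """Return (type name, has a _sock attribute) for makefile('rb', bufsize)."""
    a, b = socket.socketpair()
    try:
        rfile = a.makefile("rb", bufsize)
        try:
            return type(rfile).__name__, hasattr(rfile, "_sock")
        finally:
            rfile.close()
    finally:
        a.close()
        b.close()


if __name__ == "__main__":
    print(describe_rfile(0))      # unbuffered, as used by the CGI handler
    print(describe_rfile(65536))  # buffered, as in my modified setup()
```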

Because I know this code wants the socket of the current request handler, I
modified the http.server module and changed "self.rfile._sock" to
"self.connection" (I don't know whether this causes other problems; it is just
a workaround).
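For reference, the select() call in the traceback polls the socket with a zero
timeout, presumably to throw away leftover bytes after the request body. A
minimal sketch of that idea, working on the socket object directly, could look
like this. The name drain_pending and the limit parameter are hypothetical.
Note one caveat that is an assumption on my side: with a buffered rfile, bytes
already pulled into the reader's buffer are not visible to select() on the
socket.

```python
import select
import socket


def drain_pending(sock, limit=1 << 20):
    """Discard bytes that are already readable on sock, without blocking.

    Hypothetical helper for illustration; returns the number of bytes dropped.
    """
    discarded = 0
    # A timeout of 0 makes select() a pure poll: it never waits for new data.
    while discarded < limit and select.select([sock], [], [], 0)[0]:
        data = sock.recv(4096)
        if not data:  # The peer closed the connection.
            break
        discarded += len(data)
    return discarded


if __name__ == "__main__":
    a, b = socket.socketpair()
    try:
        b.sendall(b"\r\n")
        print(drain_pending(a))
    finally:
        a.close()
        b.close()
```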

OK, now it works well again. The CGI script gets the correct file (it returns
the correct SHA1 of the uploaded file), and the POST is REALLY MUCH FASTER!

========= Question =========

So here are my questions:
1- What causes the server to reset the connection? It seems to be because the
default buffer size of the rfile is too small.
2- What causes the CGI script to get an incomplete file? I really have no idea.
This problem also seems to disappear if I enlarge the buffer.

Other information:
1- The "incomplete file issue" usually appears on the first POST to the server,
and almost all of the other POST connections are reset.
2- Once the server starts resetting connections, the "incomplete file issue"
no longer shows up (it actually still happens, but Chrome only shows a RESET
page, see 4- below).
3- Once the server starts resetting connections, it takes a long time to
terminate the server with Ctrl+C.
4- When the connection is reset, the response printed by the CGI script is
still received correctly, and it shows that the CGI script received an
incomplete file: the byte count is much lower than the correct number. (I use
Chrome to do the POST, so it just shows a reset message and ignores the real
response.)

Please help.

----------
components: Library (Lib)
messages: 345370
nosy: shajianrui
priority: normal
severity: normal
status: open
title: POST large file to server (using http.server.CGIHTTPRequestHandler), 
always reset by server.
type: behavior
versions: Python 3.7

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue37254>
_______________________________________
_______________________________________________
Python-bugs-list mailing list
