OK, this time I think the file upload problem is solved for good. I've checked-in Alexis's code, with comments. Then I've done a quick rewrite of the multipart/form-data parser found in FieldStorage.__init__ and read_to_boundary so that it uses a regexp for the boundary checks, with the hope that it simplify the code a little bit (and remove thos nasty strip() calls). I've re-ran all tests and everything seems OK. Now, Alexis, if you can run your test with the 60,000+ files on this version of util.py, that would be great :
http://svn.apache.org/viewcvs.cgi/httpd/mod_python/trunk/lib/python/mod_python/util.py?rev=332779&view=markup Ironically, the JIRA server seems non responsive right now, so I can't comment or close MODPYTHON-40... Regards, Nicolas 2005/11/12, Nicolas Lehuen <[EMAIL PROTECTED]>: > Guys, next time please use the JIRA bug tracking application available at : > > http://nagoya.apache.org/jira/browse/MODPYTHON > > Especially this bug report : > > http://issues.apache.org/jira/browse/MODPYTHON-40 > > I'm currently re-reading the whole thread and trying to make sense of > all the test files and patches that were submitted, and it's quite > unpleasant, especially since I'm using Google mail which tries to be > too clever about attachments. > > Using JIRA, we have a permanent tracking of what's happening AND we > can store attachments in a nice way. > > Regards, > Nicolas > > 2005/11/8, Alexis Marrero <[EMAIL PROTECTED]>: > > Inspired by Mike's changes I made some changes the "new" version to > > improve performance while keeping readability: > > > > def read_to_boundary_new(self, req, boundary, file, readBlockSize): > > previous_delimiter = '' > > bound_length = len(boundary) > > while 1: > > line = req.readline(readBlockSize) > > if line[:bound_length] == boundary: > > break > > > > if line[-2:] == '\r\n': > > file.write(previous_delimiter + line[:-2]) > > previous_delimiter = '\r\n' > > > > elif line[-1:] == '\r': > > file.write(previous_delimiter + line[:-1]) > > previous_delimiter = '\r' > > > > elif line[-1:] == '\n': > > if len(line[:-1]) > 0: > > file.write(previous_delimiter + line[:-1]) > > previous_delimiter = '\n' > > > > else: > > previous_delimiter += '\n' > > > > else: > > file.write(previous_delimiter + line) > > previous_delimiter = '' > > > > Results are very comparable to read_to_boundary_mike > > > > /amn > > On Nov 8, 2005, at 9:07 AM, Jim Gallacher wrote: > > > > > Mike Looijmans wrote: > > > > > >> I've attached a modified upload_test_harness.py that includes the > > >> new and current, also the 'org' version (as in 3.1 release) and > > >> the 'mike' version. > > >> > > > > > > Nice changes, Mike. > > > > > > I started to get confused by the names of the various > > > read_to_boundary_* functions, so I've made a slight modification to > > > your code and renamed these functions. > > > > > > Note also that there are some failures for the split_boundary file > > > using offset -2 chunk [CR], so I've change the default offset to > > > include this condition. > > > > > > I've also changed the generate_*file functions to embed an > > > additional '\r' in a short string. Just being paranoid I guess. > > > > > > Diff is attached. > > > > > > If it's useful I can stick the upload_test_harness code in the > > > mod_python svn repository. > > > > > > Jim > > > > > > --- upload_test_harness_mike.py.orig 2005-11-08 > > > 08:02:13.000000000 -0500 > > > +++ upload_test_harness_mike.py 2005-11-08 08:54:15.000000000 -0500 > > > @@ -14,8 +14,8 @@ > > > ## f.write('b'*block_size) > > > ## f.close() > > > > > > -def read_to_boundary_current(self, req, boundary, file, > > > readBlockSize): > > > - ''' currrent version ''' > > > +def read_to_boundary_324(self, req, boundary, file, readBlockSize): > > > + ''' currrent version in 3.2.4b''' > > > # > > > # Although technically possible for the boundary to be split > > > by the read, this will > > > # not happen because the readBlockSize is set quite high - far > > > longer than any boundary line > > > @@ -66,7 +66,7 @@ > > > else: > > > sline = '' > > > > > > -def read_to_boundary_new(self, req, boundary, file, readBlockSize): > > > +def read_to_boundary_alexis(self, req, boundary, file, > > > readBlockSize): > > > ''' Alexis' version > > > read from the request object line by line with a maximum > > > size, > > > until the new line starts with boundary > > > @@ -89,7 +89,7 @@ > > > file.write(previous_delimiter + line) > > > previous_delimiter = '' > > > > > > -def read_to_boundary_org(self, req, boundary, file, readBlockSize): > > > +def read_to_boundary_314(self, req, boundary, file, readBlockSize): > > > delim = "" > > > line = req.readline(readBlockSize) > > > while line and not line.startswith(boundary): > > > @@ -145,6 +145,8 @@ > > > > > > f = open(fname, 'wb') > > > f.write('a'*50) > > > + f.write('\r') > > > + f.write('a'*50) > > > f.write('\r\n') > > > > > > block_size = readBlockSize + offset > > > @@ -163,6 +165,8 @@ > > > """ > > > f = open(fname, 'wb') > > > f.write('a'*50) > > > + f.write('\r') > > > + f.write('a'*50) > > > f.write('\r\n') > > > > > > block_size = readBlockSize + offset > > > @@ -171,7 +175,7 @@ > > > > > > f.close() > > > > > > -read_boundaries = [read_to_boundary_current, read_to_boundary_new, > > > read_to_boundary_org, read_to_boundary_mike] > > > +read_boundaries = [read_to_boundary_324, read_to_boundary_alexis, > > > read_to_boundary_314, read_to_boundary_mike] > > > > > > def main(file_generator, offset, chunk, block_size=1<<16): > > > fname_in = 'testfile.in' > > > @@ -218,6 +222,9 @@ > > > pass > > > > > > if __name__ == '__main__': > > > + # offsets which show problems are in [-2, -1] > > > + first_offset = -2 > > > + last_offset = 0 > > > > > > #test_chunks = ['', '\r', '\n', '\r\n'] > > > > > > @@ -229,7 +236,7 @@ > > > print '='*40 > > > print file_gen_obj.__name__ > > > for chunk in test_chunks: > > > - for i in range(-1, 0): > > > + for i in range(first_offset, last_offset): > > > print '-'*40 > > > print 'test offset', i, 'chunk',[ cname(c) for c > > > in chunk ] > > > print '-'*40 > > > > > > > >