Dear all, I've committed the following change to the GitHub repository
of NaviServer, which significantly improves FORM uploads of large
files. It is now possible to handle file uploads via
multipart/form-data (the usual format) larger than 4GB without crashing.
The support is just for NaviServer; applications (e.g. OpenACS) might
still try to read such large files into Tcl_Objs, leading to crashes
with Tcl 8. With Tcl 9 this should work, although loading huge data into
memory is not good practice and leads to memory bloat.
The code replaces an implementation that has not changed for the last 17
years.
I am planning to backport this change to the 4.99 branch as well.
All the best
-g
Added experimental command "ns_fseekchars" and used it in form.tcl for
parsing multipart/form-data.

As a consequence, the file-based parser of multipart/form-data
(a) is able to read files >4GB,
(b) causes less memory bloat, and
(c) is more than a factor of 10 faster:
    file size         old  ns_fseekchars  factor
       65,517       4,471            151   29.61
      124,523       1,139             94   12.12
   74,006,378     682,375         54,752   12.46
2,104,408,064  18,916,496      1,564,472   12.09
3,992,977,408  35,942,768      3,061,061   11.74
5,368,709,120                  3,817,896
The problem with the old file-based parser was that it searched for
boundary strings using the Tcl "gets" command:
if { [string match $boundary* [string trim [gets $fp]]] } {
...
}
Since "gets" reads a line (i.e., all characters up to the next
newline), Tcl can crash when the next newline is more than 4GB
away. Even with, e.g., 2GB, it will temporarily create a Tcl_Obj with
2GB of content, which is then trimmed etc. This can lead to memory
bloat, since multiple huge Tcl_Objs are kept in memory at the same time.
The new code avoids all this by performing the search for the boundary
in C.
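To illustrate the idea (this is not the actual ns_fseekchars code, just
a minimal sketch): a boundary search in C can read the file in
fixed-size chunks and carry the last few bytes over between reads, so
that memory use stays bounded no matter how far away the next newline
is. The function name find_boundary and the CHUNK_SIZE value are
assumptions made for this example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical chunk size, chosen for this example only. */
#define CHUNK_SIZE 65536

/*
 * Scan the stream from its current position for the byte sequence
 * "needle" (length "needle_len"). Returns the offset of the first
 * match relative to the starting position, or -1 if there is none.
 * At most CHUNK_SIZE bytes are held in memory at any time.
 */
long long
find_boundary(FILE *fp, const char *needle, size_t needle_len)
{
    char      buf[CHUNK_SIZE];
    size_t    have = 0;   /* bytes carried over from the previous chunk */
    long long base = 0;   /* stream offset corresponding to buf[0] */

    assert(needle_len > 0 && needle_len < CHUNK_SIZE);

    for (;;) {
        size_t n     = fread(buf + have, 1, sizeof(buf) - have, fp);
        size_t total = have + n;
        size_t i, keep;

        /* Check every position where a full needle still fits. */
        for (i = 0; i + needle_len <= total; i++) {
            if (buf[i] == needle[0]
                && memcmp(buf + i, needle, needle_len) == 0) {
                return base + (long long)i;
            }
        }
        if (n == 0) {
            return -1;    /* end of file (or read error), no match */
        }

        /*
         * Keep the last needle_len-1 bytes, so that a match spanning
         * two chunks is found on the next round.
         */
        keep = (total < needle_len - 1) ? total : needle_len - 1;
        memmove(buf, buf + total - keep, keep);
        base += (long long)(total - keep);
        have  = keep;
    }
}
```

The offset is counted in a 64-bit integer while reading, so the result
does not depend on the range of "long" as returned by ftell(). A caller
would open the spooled upload in binary mode and pass the boundary
string together with its length.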
TODO: add documentation page