Torsten Förtsch wrote:
On Wednesday, 08 February 2012 10:14:35 André Warnier wrote:
> As far as I know, LimitRequestBody is an absolute POST size limit set once
> and for all in the server config, and valid for all POSTs (and PUTs) after
> server restart.
If you look at the docs you'll find that LimitRequestBody is valid in "server
config, virtual host, directory and .htaccess" contexts. That means you can
modify it on a per-request basis via $r->add_config. So, assuming
authentication takes place in httpd's authentication phase, you can set the
limit per user in a PerlFixupHandler.
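A minimal sketch of what Torsten describes, assuming mod_perl 2; the package name, the user names and the limit figures are all made up for illustration:

```perl
package My::LimitFixup;

use strict;
use warnings;
use Apache2::RequestRec ();
use Apache2::RequestUtil ();
use Apache2::Const -compile => qw(DECLINED);

# Hypothetical per-user limits in bytes; look these up however suits you.
my %limit_for = (
    alice => 10 * 1024 * 1024,   # 10MB
    bob   =>  1 * 1024 * 1024,   # 1MB
);

sub handler {
    my $r = shift;

    # $r->user is set by httpd's authentication phase, which has already
    # run by the time the fixup phase is reached.
    my $user  = $r->user;
    my $limit = defined $user && exists $limit_for{$user}
              ? $limit_for{$user}
              : 512 * 1024;      # default for everyone else

    # Per-request equivalent of "LimitRequestBody $limit" in the config.
    $r->add_config(["LimitRequestBody $limit"]);

    return Apache2::Const::DECLINED;
}

1;
```

It would be wired up with `PerlFixupHandler My::LimitFixup` in the relevant container. Since this is server-side handler code, it only runs inside httpd.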
> And it is calculated on the basis of the real bytes being
> sent by the browser, including the overhead caused by Base64-encoding
> the content of a file sent, for example. (So that if you set the limit to
> 1MB, it will actually kick in as soon as the net unencoded size of the
> file being uploaded exceeds 660KB or so.)
True. But with HTTP/1.1 the client can also choose to send the body deflated.
Thus, the actual file size may also exceed 1MB.
> Then there is $CGI_POST_MAX, which may very well be the same server
> value being manipulated by the CGI module, or it may be a private copy in
> CGI.pm. What is not really clear is whether that value is "thread-safe" in
> all scenarios.
CGI.pm is pure Perl. So, to make $CGI_POST_MAX shared among threads, it would
have to declare the variable as shared. I doubt that any sane developer would
do that.
> In the normal scenario, when retrieving the uploaded file's handle via the
> CGI.pm call param(file_input_name) or upload(file_input_name), what one
> actually gets is a handle to a local temporary file into which
> Apache/CGI.pm has already stored the whole content of the uploaded
> file. By that time, the original file upload from the browser has already
> happened, so doing something at this point would be too late to interrupt
> the browser POST itself (and the bandwidth and time have already been
> spent).
True.
> On the other hand, the CGI.pm documentation seems to say that if one uses
> the "hook" functionality for a file upload, then Apache/CGI.pm does not use
> a temporary file, and one gets a handle directly into the POST body content
> (so to speak) as it is being received by Apache. And thus this could be a
> way to achieve what Mike wants.
Yes and no. It depends upon what exactly you want to limit. On the internet,
data is buffered by routers, firewalls, etc. On your server it is buffered by
the kernel, and httpd adds its own buffering. HTTP is TCP-based, so there may
be retransmits that you won't notice. You can certainly abort the transfer
once the CGI.pm hook has received a certain amount of data, but by then your
server or your organization may already have received the whole body.
So, if you want to limit the disk usage, then yes, you can simply stop writing
when the limit is reached. If you want to limit the amount of data your server
receives, then no.
Best would be to make an educated guess, based on the Content-Length request
header, as to whether the uploaded file will exceed the limit. Most clients
send an "Expect: 100-continue" header and thus give the server a chance to
decline the request *before* the body is sent. If the body is already on the
way, the only thing you can do is close the connection. I don't know if httpd
does that immediately or if it reads and discards the whole body.
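One way to act on that educated guess, sketched for mod_perl 2 (the package name and the 1MB figure are illustrative): a header-parser handler can inspect Content-Length and return 413 before any of the body is read, which for an "Expect: 100-continue" client means the body is never sent at all:

```perl
package My::SizeGate;

use strict;
use warnings;
use Apache2::RequestRec ();
use APR::Table ();
use Apache2::Const -compile => qw(DECLINED HTTP_REQUEST_ENTITY_TOO_LARGE);

my $LIMIT = 1024 * 1024;   # 1MB, illustrative

sub handler {
    my $r = shift;

    # Clients are not required to send Content-Length (e.g. chunked
    # uploads), so treat a missing or malformed header as "can't tell"
    # and let the request through to be caught by other limits.
    my $len = $r->headers_in->get('Content-Length');
    return Apache2::Const::DECLINED
        unless defined $len && $len =~ /^\d+$/;

    # Decline the request before the body is read.
    return Apache2::Const::HTTP_REQUEST_ENTITY_TOO_LARGE
        if $len > $LIMIT;

    return Apache2::Const::DECLINED;
}

1;
```

Wired up with `PerlHeaderParserHandler My::SizeGate`. Note the caveat from the quoted text still applies: the header is only a guess, since a deflated or Base64-encoded body does not match the file's real size.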
> (I suppose that we can assume that even
> though we get a handle into the POST body content, what we are reading is
> the decoded data, right?)
The code below is the relevant piece of CGI.pm. So, yes, the upload hook gets
the data as it is written to the temp file.
while (defined($data = $buffer->read)) {
    if (defined $self->{'.upload_hook'}) {
        $totalbytes += length($data);
        &{$self->{'.upload_hook'}}($filename, $data, $totalbytes,
                                   $self->{'.upload_data'});
    }
    print $filehandle $data if ($self->{'use_tempfile'});
}
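For completeness, a hook along those lines might look as follows; the 1MB limit is illustrative, and per Torsten's caveats above, dying in the hook only stops the spooling on your side, not the bytes already in flight:

```perl
use strict;
use warnings;

my $LIMIT = 1024 * 1024;   # 1MB, illustrative

# The hook receives exactly the arguments shown in the CGI.pm snippet:
# filename, current chunk, running byte total, and the user data passed
# to the constructor.
my $upload_hook = sub {
    my ($filename, $data, $totalbytes, $userdata) = @_;
    die "upload of '$filename' exceeds $LIMIT bytes\n"
        if $totalbytes > $LIMIT;
};

# In a real script the hook is passed to CGI.pm's constructor; the third
# argument keeps the usual temp-file behaviour:
#   use CGI;
#   my $q = CGI->new($upload_hook, undef, 1);
```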
Torsten Förtsch
Many, many thanks Torsten. This is all precious and usable information.