Torsten Förtsch wrote:
On Wednesday, 08 February 2012 10:14:35 André Warnier wrote:
As far as I know, LimitRequestBody is an absolute POST size limit set once
and for all in the server config, and valid for all POSTs (and PUTs) after
server restart.

If you look at the docs you'll find that LimitRequestBody is valid in "server config, virtual host, directory and .htaccess" contexts. That means you can modify it on a per-request basis via $r->add_config. So, assuming authentication takes place in httpd's authentication phase, you can set the limit per user in a PerlFixupHandler.
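For illustration, a minimal sketch of that approach under mod_perl2; the module name, the user name and the byte limits are made up:

  package My::PerUserLimit;

  use strict;
  use warnings;

  use Apache2::RequestRec ();
  use Apache2::RequestUtil ();              # provides $r->add_config
  use Apache2::Const -compile => qw(OK);

  sub handler {
      my $r = shift;

      # By the fixup phase authentication has already run, so $r->user
      # is populated.  Give a hypothetical "bigupload" user 10 MB,
      # everybody else 1 MB.
      my $limit = (defined $r->user && $r->user eq 'bigupload')
                ? 10 * 1024 * 1024
                :      1024 * 1024;

      # Inject the directive for this request only.
      $r->add_config(["LimitRequestBody $limit"]);

      return Apache2::Const::OK;
  }

  1;

installed with something like "PerlFixupHandler My::PerUserLimit" for the locations in question.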

And it is calculated on the basis of the real bytes being
sent by the browser, including the overhead caused by Base64 encoding
the content of an uploaded file, for example. (So that if you set the limit
to 1MB, this will actually kick in as soon as the net unencoded size of the
file being uploaded exceeds 660KB or so.)

True. But with HTTP/1.1 the client can also choose to send the body deflated. Thus, the actual file size may also exceed 1MB.

Then there is the $CGI_POST_MAX, which may very well be the same server
value being manipulated by the CGI module, or it may be a private copy
kept by CGI.pm. What is not really clear is whether that value is
"thread-safe" in all scenarios.

CGI.pm is pure Perl. So, to make $CGI_POST_MAX shared among threads it would have to explicitly declare the variable as shared, and I doubt that any sane developer would do that.
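So a per-interpreter, per-request setting should just work; a minimal sketch, assuming the documented $CGI::POST_MAX spelling and an arbitrary 5 MB figure:

  use CGI ();

  # Each Perl interpreter/thread sees its own copy of the package
  # variable; "local" restores the previous value when the enclosing
  # scope (e.g. the request handler) is left.
  local $CGI::POST_MAX = 5 * 1024 * 1024;   # 5 MB, just an example
  my $q = CGI->new;                         # parsing honours this limit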

In the normal scenario, when retrieving the uploaded file's handle via the
CGI.pm call to param(file_input_name) or upload(file_input_name), what one
actually gets is a handle onto a local temporary file, into which
Apache/CGI.pm has already stored the whole content of the uploaded
file.  By that time, the original file upload from the browser has already
happened, so doing something at this point would be too late to interrupt
the browser POST itself (and the bandwidth and time have already been
spent).

True.

On the other hand, the CGI.pm documentation seems to say that if one uses
the "hook" functionality for a file upload, then Apache/CGI.pm do not use
a temporary file, and one gets a handle directly into the POST body content
(so to speak), as it is being received by Apache.  And thus this could be a
way to achieve what Mike wants.

Yes and no. It depends on what exactly you want to limit. On the internet, data is buffered by routers, firewalls etc. On your server it is buffered by the kernel, and httpd adds its own buffering on top. HTTP is TCP-based, so there may be retransmits that you won't notice. You can certainly abort the transfer once the CGI.pm hook has received a certain amount of data, but that does not mean that your server or your organization has not already received the whole body.

So, if you want to limit the disk usage then yes, you can simply stop writing when the limit is reached. If you want to limit the amount of data your server receives then no.

Best would be if you could make an educated guess, based on the Content-Length request header, as to whether the uploaded file will exceed the limit. Most clients send an "Expect: 100-continue" header and thus give the server a chance to decline the request *before* the body is sent. If the body is already on the way, the only thing you can do is close the connection. I don't know if httpd does that immediately or if it reads and discards the whole body.
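A minimal sketch of such a guess, again assuming mod_perl2; the 1 MB cap and the module name are invented for the example:

  package My::SizeGuard;

  use strict;
  use warnings;

  use Apache2::RequestRec ();
  use APR::Table ();
  use Apache2::Const -compile => qw(OK HTTP_REQUEST_ENTITY_TOO_LARGE);

  use constant MAX_BODY => 1024 * 1024;     # 1 MB, for illustration

  sub handler {
      my $r = shift;

      my $len = $r->headers_in->{'Content-Length'};

      # If the client announced a body larger than the limit, refuse
      # the request before trying to read it.  With "Expect:
      # 100-continue" the client has not even started sending the body
      # at this point.
      return Apache2::Const::HTTP_REQUEST_ENTITY_TOO_LARGE
          if defined $len && $len > MAX_BODY;

      return Apache2::Const::OK;
  }

  1;

Run early (e.g. as a PerlFixupHandler) this sends a 413 before the content handler ever reads the body; whether httpd then drains or discards a body that is already on the way is, as said, up to httpd.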

(I suppose that we can assume that even
though we get a handle into the POST body content, what we are reading is
the decoded data, right?)

The code below is the relevant piece of CGI.pm. So, yes, the upload hook gets the data as it is written to the temp file.

  while (defined($data = $buffer->read)) {
    if (defined $self->{'.upload_hook'}) {
      $totalbytes += length($data);
      &{$self->{'.upload_hook'}}($filename ,$data, $totalbytes,
                                 $self->{'.upload_data'});
    }
    print $filehandle $data if ($self->{'use_tempfile'});
  }
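
For completeness, a sketch of such a hook with a cap on what it accepts; the 1 MB figure and the die()-based abort are my assumptions, not anything CGI.pm prescribes:

  use CGI ();

  my $limit = 1024 * 1024;                  # 1 MB, for illustration

  # Third argument 0: don't spool the upload into a temp file at all.
  my $q = CGI->new(\&upload_hook, undef, 0);

  sub upload_hook {
      my ($filename, $data, $totalbytes, $hook_data) = @_;

      # $totalbytes is the running total received for this upload so far.
      die "upload of $filename exceeds $limit bytes\n"
          if $totalbytes > $limit;

      # otherwise handle $data (write it to disk, stream it on, ...)
  }

As discussed above, aborting inside the hook only limits what you process or write to disk; the bytes already on the wire may still reach your server.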

Torsten Förtsch


Many, many thanks Torsten. This is all precious and usable information.
