On Sat, Aug 07, 2004 at 05:24:19PM -0700, Roy T. Fielding wrote:
> >Since the Apache server can not know if CGI requires C-L, I conclude
> >that CGI scripts are broken if they require C-L and do not return
> >411 Length Required when the CGI/1.1 CONTENT_LENGTH environment
> >variable is not present.  It's too bad that CGI.pm and cgi-lib.pl
> >are both broken in this respect.  Fixing them would be simple and
> >that would take care of the vast majority of legacy apps.

Here's my argument, notwithstanding current practice and the
extremely poor quality of many, many CGI programs out there:

[snip]
> CGI is supposed to be a simple interface for web programming.

You and your cohorts did a great job with it.  It's been quite
extensible and robust (which is not to say it is perfect :)).

[snip]
> CGI was defined in 1993.  HTTP/1.0 in 1993-95.  HTTP/1.1 in 1995-97.

OK, so any robust CGI written after 1995 that is intended to be
gatewayed from an HTTP web server should grok all standard HTTP/1.0
request methods and should know which ones it supports, and which
ones _require_ a message body (e.g. POST, PUT).

And, naturally, the CGI should know it is speaking to an HTTP/1.x 
server on the other end of the gateway because of the CGI/1.1
SERVER_PROTOCOL environment variable.


Now, I always read the CGI/1.1 spec as _requiring_ a CONTENT_LENGTH
if there was any message body.  (<-- Note the qualification).


http://hoohoo.ncsa.uiuc.edu/cgi/env.html

The following environment variables are specific to the request being
fulfilled by the gateway program:

[...]

# CONTENT_TYPE

For queries which have attached information, such as HTTP POST and PUT,
this is the content type of the data.

# CONTENT_LENGTH

The length of the said content as given by the client. 


It is implied that CONTENT_LENGTH should be set to 0 by the HTTP side
of the gateway when a request method that requires a message body has
an empty body.  Therefore, on the other side of the gateway, any robust
little CGI should check for CONTENT_LENGTH, because it wants to be able
to detect premature end of content.  (It doesn't even have to know that
the HTTP server required a message body for that method.)
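
A rough sketch of what I mean by detecting premature end of content,
in Python for brevity.  The name read_body and the choice of
exceptions are made up for illustration; the only thing taken from
CGI/1.1 is the CONTENT_LENGTH variable itself.  In a real script,
environ would be os.environ and stream would be sys.stdin.buffer.

```python
import io


def read_body(environ, stream):
    """Read exactly CONTENT_LENGTH bytes from stream.

    Raises ValueError if CONTENT_LENGTH is missing or not a plain
    decimal number, and EOFError if the stream ends before the
    promised number of bytes has arrived.
    """
    raw = environ.get("CONTENT_LENGTH", "")
    if not (raw.isascii() and raw.isdigit()):
        raise ValueError("CONTENT_LENGTH missing or malformed")

    remaining = int(raw)
    chunks = []
    while remaining > 0:
        chunk = stream.read(remaining)
        if not chunk:
            # EOF before the promised length: premature end of content.
            raise EOFError("body ended %d bytes short" % remaining)
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)
```

Note that CONTENT_LENGTH=0 returns an empty body without ever touching
the stream, which is the empty-body case described above.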

If a script expects content on stdin and does not find CONTENT_LENGTH,
then even very old CGI scripts should abort with 400 Bad Request.
Better, the script should return 411 Length Required if it
requires CONTENT_LENGTH and one is not present.
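
Roughly, and again only as a sketch in Python: check_length and
error_response are hypothetical helper names, and answering 400 for
a malformed (as opposed to missing) CONTENT_LENGTH is my own
assumption.  The "Status:" line is the usual CGI way of asking the
server to send a particular response code.

```python
def check_length(environ, requires_body=True):
    """Return None if the request may proceed, else a status string."""
    if not requires_body:
        return None
    raw = environ.get("CONTENT_LENGTH", "")
    if raw == "":
        # Script needs a length and the gateway supplied none.
        return "411 Length Required"
    if not (raw.isascii() and raw.isdigit()):
        # Assumption: treat a garbage value as a bad request.
        return "400 Bad Request"
    return None


def error_response(status):
    """Build a minimal CGI response carrying the given status."""
    return (
        "Status: " + status + "\r\n"
        "Content-Type: text/plain\r\n"
        "\r\n" + status + "\r\n"
    )
```

A script would call check_length(os.environ) before reading stdin and,
on a non-None result, write error_response(...) to stdout and exit.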

Of course, if the script does not require that content be present,
and CONTENT_LENGTH is not present, scripts that do not support
HTTP/1.1 semantics (i.e. Transfer-Encoding) will assume no content
and will silently ignore the content that was submitted.  However,
since the content is not required, one must assume that no corruption
will occur because of its absence; only some missing information results.

(RFC 2616 says that C-L should be ignored if T-E is present, and so
an HTTP/1.1 server might pass CONTENT_LENGTH=0 to such scripts along
with HTTP_TRANSFER_ENCODING=chunked, and still be proper.  Scripts
that require content should then definitely return 400 Bad Request.)

In any case, CGI scripts are welcome to support HTTP/1.1 and T-E
if they see appropriate SERVER_PROTOCOL and HTTP_TRANSFER_ENCODING
environment variables.
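
As a sketch of that decision (Python, hypothetical function name
body_framing): this only picks how the body is framed and does not
decode chunks.  It assumes the server actually exports
HTTP_TRANSFER_ENCODING to the script, which is server-specific, and
it lets T-E win over CONTENT_LENGTH, per the RFC 2616 rule mentioned
above.

```python
def body_framing(environ):
    """Return "chunked", "length", or "none" for the request body."""
    protocol = environ.get("SERVER_PROTOCOL", "")
    te = environ.get("HTTP_TRANSFER_ENCODING", "").lower()
    codings = [c.strip() for c in te.split(",") if c.strip()]

    # Transfer-Encoding takes precedence; chunked must be applied last.
    if protocol == "HTTP/1.1" and codings and codings[-1] == "chunked":
        return "chunked"
    if environ.get("CONTENT_LENGTH", "") != "":
        return "length"
    return "none"
```

A script that gets "chunked" back and has no chunk decoder should
refuse the request rather than trust CONTENT_LENGTH.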

Cheers,
Glenn
