On Sat, Aug 07, 2004 at 05:24:19PM -0700, Roy T. Fielding wrote:
> >Since the Apache server can not know if CGI requires C-L, I conclude
> >that CGI scripts are broken if they require C-L and do not return
> >411 Length Required when the CGI/1.1 CONTENT_LENGTH environment
> >variable is not present.  It's too bad that CGI.pm and cgi-lib.pl
> >are both broken in this respect.  Fixing them would be simple and
> >that would take care of the vast majority of legacy apps.

Here's my argument, notwithstanding current practice and the extremely
poor quality of many, many CGI programs out there:

[snip]
> CGI is supposed to be a simple interface for web programming.

You and your cohorts did a great job with it.  It's been quite
extensible and robust (which is not to say it is perfect :)).

[snip]
> CGI was defined in 1993.  HTTP/1.0 in 1993-95.  HTTP/1.1 in 1995-97.

OK, so any robust CGI written after 1995 that is intended to be
gatewayed from an HTTP web server should grok all standard HTTP/1.0
request methods and should know which ones it supports, and which ones
_require_ a message body (e.g. POST, PUT).  And, naturally, the CGI
should know it is speaking to an HTTP/1.x server on the other end of
the gateway because of the CGI/1.1 SERVER_PROTOCOL environment variable.

Now, I always read the CGI/1.1 spec as _requiring_ a CONTENT_LENGTH
if there was any message body.  (<-- Note the qualification).

http://hoohoo.ncsa.uiuc.edu/cgi/env.html

  The following environment variables are specific to the request being
  fulfilled by the gateway program:
  [...]
  # CONTENT_TYPE
    For queries which have attached information, such as HTTP POST and
    PUT, this is the content type of the data.
  # CONTENT_LENGTH
    The length of the said content as given by the client.

It is implied that CONTENT_LENGTH should be set to 0 by the HTTP side
of the gateway when a request method that requires a message body has
an empty body.

Therefore, on the other side of the gateway, any robust little CGI
should check for CONTENT_LENGTH, because it wants to be able to detect
premature end of content.  (It doesn't even have to know that the HTTP
server required a message body for that method.)

If a script expects content on stdin and does not find CONTENT_LENGTH,
then even very old CGI scripts should abort with 400 Bad Request.
Better, the script should return 411 Length Required if it requires
CONTENT_LENGTH and one is not present.

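To make the check above concrete, here is a rough sketch in Python.
It is only an illustration: the names read_body and BODY_REQUIRED are
made up for this example and are not taken from CGI.pm, cgi-lib.pl, or
any other library, and treating only POST and PUT as requiring a body
is an assumption based on the examples given above.

```python
import io

# Assumption: only these methods require a message body (the examples
# named in the message above).
BODY_REQUIRED = {"POST", "PUT"}


def read_body(environ, stdin):
    """Return (status, body) for one CGI request.

    environ: mapping of CGI environment variables (e.g. os.environ)
    stdin:   binary file-like object carrying the request body
    """
    method = environ.get("REQUEST_METHOD", "GET").upper()
    raw = environ.get("CONTENT_LENGTH")

    if raw is None or raw == "":
        # No length supplied: refuse if this method needs a body,
        # otherwise assume there is no content at all.
        if method in BODY_REQUIRED:
            return "411 Length Required", b""
        return "200 OK", b""

    try:
        length = int(raw)
    except ValueError:
        return "400 Bad Request", b""
    if length < 0:
        return "400 Bad Request", b""

    body = stdin.read(length)
    if len(body) != length:
        # Premature end of content: fewer bytes arrived than promised.
        return "400 Bad Request", body
    return "200 OK", body


# Demo with an in-memory body standing in for real stdin.
status, body = read_body({"REQUEST_METHOD": "POST"}, io.BytesIO(b"a=1"))
print(status)  # 411 Length Required
```

In a real script one would presumably call
read_body(os.environ, sys.stdin.buffer) and, on anything other than
"200 OK", emit a "Status:" header with that value and exit before
touching the application logic.
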
Of course, if the script does not require that content be present, and
CONTENT_LENGTH is not present, scripts that do not support HTTP/1.1
semantics (i.e. Transfer-Encoding) will assume no content and will
silently ignore the content that was submitted.  However, since the
content is not required, one must assume that no corruption will occur
because of its absence; only some missing information results.

(RFC 2616 says that C-L should be ignored if T-E is present, and so an
HTTP/1.1 server might pass CONTENT_LENGTH=0 to such scripts along with
HTTP_TRANSFER_ENCODING=chunked, and still be proper.  Scripts that
require content should then definitely return 400 Bad Request.)

In any case, CGI scripts are welcome to support HTTP/1.1 and T-E if
they see appropriate SERVER_PROTOCOL and HTTP_TRANSFER_ENCODING
environment variables.

Cheers,
Glenn