As discussed in v2.19.0-rc0~45^2~2 (http-backend: respect
CONTENT_LENGTH as specified by rfc3875, 2018-06-10), HTTP servers such
as IIS do not close a CGI script's standard input at the end of a
request, instead expecting CGI scripts to stop reading after
CONTENT_LENGTH bytes.  That commit taught http-backend to respect this
convention except when CONTENT_LENGTH is unset, in which case it
preserved the previous behavior of reading until EOF.

RFC 3875 (the CGI specification) explains:

   The CONTENT_LENGTH variable contains the size of the message-body
   attached to the request, if any, in decimal number of octets.  If no
   data is attached, then NULL (or unset).

      CONTENT_LENGTH = "" | 1*digit

And:

   This specification does not distinguish between zero-length (NULL)
   values and missing values.

But that specification was written before HTTP/1.1 and chunked
encoding.  With chunked encoding, the length of a request is not known
early and it is useful to start a CGI script to process it anyway, so
Apache and many other servers violate the spec: they leave
CONTENT_LENGTH unset and rely on EOF to indicate the end of request.
This is reproducible using t5510-fetch.sh, which hangs if http-backend
is patched to treat a missing CONTENT_LENGTH as zero.

So we are in a bind: to support HTTP servers that don't produce EOF,
http-backend should treat an unset or empty CONTENT_LENGTH as zero,
and to support chunked encoding, http-backend should treat an unset
CONTENT_LENGTH as "read until EOF".

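Fortunately, there's a way out.  Use the HTTP_TRANSFER_ENCODING
environment variable to distinguish the two cases: when CONTENT_LENGTH
is unset or empty, read until EOF if HTTP_TRANSFER_ENCODING is
"chunked", and treat the request body as empty otherwise.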

Reported-by: Jeff King <p...@peff.net>
Helped-by: Max Kirillov <m...@max630.net>
Signed-off-by: Jonathan Nieder <jrnie...@gmail.com>
---
How about this?

 http-backend.c | 19 +++++++++++++++++--
 1 file changed, 17 insertions(+), 2 deletions(-)

diff --git a/http-backend.c b/http-backend.c
index 458642ef72..7902eeb0b3 100644
--- a/http-backend.c
+++ b/http-backend.c
@@ -350,10 +350,25 @@ static ssize_t read_request_fixed_len(int fd, ssize_t req_len, unsigned char **o
 
 static ssize_t get_content_length(void)
 {
-       ssize_t val = -1;
+       ssize_t val;
        const char *str = getenv("CONTENT_LENGTH");
 
-       if (str && *str && !git_parse_ssize_t(str, &val))
+       if (!str || !*str) {
+               /*
+                * According to RFC 3875, an empty or missing
+                * CONTENT_LENGTH means "no body", but RFC 3875
+                * precedes HTTP/1.1 and chunked encoding. Apache and
+                * its imitators leave CONTENT_LENGTH unset for
+                * chunked requests, for which we should use EOF to
+                * detect the end of the request.
+                */
+               str = getenv("HTTP_TRANSFER_ENCODING");
+               if (str && !strcmp(str, "chunked"))
+                       return -1;
+
+               return 0;
+       }
+       if (!git_parse_ssize_t(str, &val))
                die("failed to parse CONTENT_LENGTH: %s", str);
        return val;
 }
-- 
2.19.0.397.gdd90340f6a
