Edit report at https://bugs.php.net/bug.php?id=44164&edit=1

 ID:                 44164
 Updated by:         cataphr...@php.net
 Reported by:        mplomer at gmx dot de
 Summary:            Handle "Content-Length" HTTP header when
                     zlib.output_compression active
-Status:             Assigned
+Status:             Closed
 Type:               Bug
 Package:            *General Issues
 Operating System:   *
 PHP Version:        5.2.5
 Assigned To:        cataphract
 Block user comment: N
 Private report:     N



Previous Comments:
------------------------------------------------------------------------
[2013-08-01 14:51:09] m...@php.net

https://github.com/php/php-src/pull/400

------------------------------------------------------------------------
[2013-08-01 14:19:34] m...@php.net

Why is this still open, even though the patch is still applied to SAPI.c?

------------------------------------------------------------------------
[2012-02-16 10:00:14] daniel at code-emitter dot com

FYI: This issue is still causing problems.
http://tracker.phpbb.com/browse/PHPBB3-10648

------------------------------------------------------------------------
[2010-12-17 19:18:15] panczel dot levente at groware dot hu

Thanks, you are absolutely right to point out my error: my suggestion would not 
work in situations where a Content-Length header is mandatory or refers to the 
uncompressed body length. The partial response 206, as I understand it, doesn’t 
make Content-Length mandatory. In fact, the last line of your example could be 
omitted and the response would still be valid. But since Content-Length is not 
mandatory in this case either, I think my thesis still works.

I have not found any explicit remarks in the specification on how offsets and 
Content-Encoding should interact. As I see it now, all fields are about the 
document-entity (the one that the script handles and knows well) except for 
the Content-Encoding and Content-Length fields, which are about the 
representation of the message body. So Content-Length always gives the decimal 
number of octets transferred in the message body’s final byte-stream, and 
Content-Encoding has to be reversed before other processing (like matching it 
to the requested range’s size) takes place.
For all response types where Content-Length is mandatory, I agree with you 
that compression should be turned off (possibly after trying to fit the output 
into the initial single buffer that I think is allocated anyway). But we know 
that for response 200 it is not mandatory, and as I see it, not for 206 either. 
So at least these responses could follow my thesis (as could any others that 
currently do not require a Content-Length field).
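
To make the distinction above concrete, here is a minimal sketch that compares 
the two candidate Content-Length values for the same body. The helper name 
body_lengths() is hypothetical and not part of PHP or of the patch; it assumes 
the zlib extension is loaded and uses gzencode() only as a stand-in for 
whatever the output handler actually emits.

```php
<?php
// Hypothetical helper for illustration only (not from PHP or the patch).
// Returns the octet count of a body before and after gzip coding.
function body_lengths(string $body): array
{
    $gzipped = gzencode($body); // assumption: the zlib extension is available
    return [
        'identity' => strlen($body),    // what the script knows in advance
        'gzip'     => strlen($gzipped), // what actually goes over the wire
    ];
}

$lengths = body_lengths(str_repeat("Hello, world!\n", 100));
printf(
    "identity: %d octets, gzip: %d octets\n",
    $lengths['identity'],
    $lengths['gzip']
);
```

The second number is only available after the whole body has been compressed, 
which is why a Content-Length set by the script beforehand cannot describe 
the compressed message body.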

> The problem is that zlib.output_compression is not presented as an output 
> handler that rewrites the response and creates a new entity. It is presented 
> as an inoffensive performance option that compresses the output for better 
> performance.
Yes, it rewrites the response; but no, it does not create a new entity. I think 
that’s just what 
http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.11 means by 
“without losing the identity of its underlying media type”. So to send a 
compressed body, one just has to adjust the Content-Encoding field and take 
care that Content-Length is not invalid. I feel that changing these headers 
isn’t more intrusive than altering body octets, since they do not affect 
other content and headers in the message, except for Transfer-Encoding, which I 
suppose zlib compression adjusts to correctly. I think chunked 
Transfer-Encoding is relevant for two reasons. If chunked output is received 
from the script, it has to be reassembled before compression. And it might be 
used to maintain persistent connections (e.g. one compressed buffer in each 
chunk) where the compressor cannot tell the Content-Length in advance.
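
As a rough sketch of the chunked idea above: each piece of the compressed 
stream is framed with its own size, so no overall Content-Length is needed. 
The helpers chunk_frame() and last_chunk() are hypothetical names made up for 
this illustration, and the 4096-octet piece size is an arbitrary assumption. 
In practice the web server usually applies chunked framing itself, so this 
only shows the wire format from RFC 2616 section 3.6.1, not what PHP does.

```php
<?php
// Hypothetical helpers for illustration; not part of PHP or the patch.

// Frame one non-empty piece of data as an HTTP/1.1 chunk:
// chunk size in hex, CRLF, chunk data, CRLF.
function chunk_frame(string $data): string
{
    return dechex(strlen($data)) . "\r\n" . $data . "\r\n";
}

// The terminating zero-sized chunk followed by an empty trailer.
function last_chunk(): string
{
    return "0\r\n\r\n";
}

// Compress a body, then send it in pieces without knowing the
// total compressed length up front.
$compressed = gzencode(str_repeat("Hello, world!\n", 1000));
$wire = '';
foreach (str_split($compressed, 4096) as $piece) {
    $wire .= chunk_frame($piece); // str_split never yields an empty piece here
}
$wire .= last_chunk();
```

The trade-off is that the receiver learns the total size only when the last 
chunk arrives.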

Please understand that I’m not pushing for any of these features; I just think 
that this topic still has potential to inspire improvements and uncover rare 
bugs.

------------------------------------------------------------------------
[2010-12-17 16:35:30] cataphr...@php.net

> That’s an error. Both scripts set the correct CL (that they know very well),
> just the way the specification says they SHOULD. I don’t agree that it would
> be the responsibility of the script to counteract the setting (zlib output
> compression in this case) of the executing framework (PHP in this case). If
> the scripts had to take care of every such situation, then using header()
> would be completely illegal, because a future output handler might interact
> with the output in such a way that invalidates the headers set. This isn’t a
> portable philosophy since it implicitly requires the script to be aware of
> every aspect of the plugins and settings in PHP.
> In fact it is the zlib output handler that was setting the wrong CL header (by
> not removing the deprecated one). As I see it, the handler is constructing a
> new response entity instead of the one it receives from the script; the
> consistency of this response is entirely the responsibility of the handler.
> As I understand it, this has now been patched so that the handler always
> removes the CL header, and by that it ensures correctness. Note: this is no
> refutation of the correctness of the patched handler.

The problem is that zlib.output_compression is not presented as an output 
handler that rewrites the response and creates a new entity. It is presented as 
an inoffensive performance option that compresses the output for better 
performance. And it does so, generally, without the express assent of the 
programmer. The programmer can always use ob_gzhandler to force compression.

Your thesis is that the output handler should not be deactivated; instead it 
ought to remove the old header and write a new one, whenever possible. This 
looks good. But consider this script:

// $fp, $filesize and InternalError() are defined elsewhere in the script.
if (empty($_SERVER["HTTP_RANGE"])) {
    $offset = 0;
}
else { //violates rfc2616, which demands ignoring the header if invalid
    $offset = 0;
    preg_match("/^bytes=(\d+)-/i", $_SERVER["HTTP_RANGE"], $matches);
    // ctype_digit() stands in for the undefined is_num_int() helper
    if (isset($matches[1]) && ctype_digit($matches[1])) {
        if ($matches[1] < $filesize) {
            $offset = (int) $matches[1];
            if (@fseek($fp, $offset, SEEK_SET) != 0)
                InternalError();
            header("HTTP/1.1 206 Partial Content");
            header("Content-Range: bytes $offset-".($filesize - 1)."/$filesize");
        }
        else { // first byte position at or past the end of the file
            header("HTTP/1.1 416 Requested Range Not Satisfiable");
            die();
        }
    }
}
$conlen = $filesize - $offset;

header("Content-Length: $conlen");

There is no way this script can work correctly under the zlib handler. 206 
responses must have a Content-Length, and the offsets are calculated from the 
uncompressed size, while under zlib they would have to be calculated from the 
compressed size, which is obviously impossible to know without first 
compressing the file.

So actually the only option is to disable the zlib output handler.
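
For a script like the one above, one possible way to do that at runtime is 
sketched below. This is an assumption-laden illustration, not the change made 
in SAPI.c: disable_zlib_output_compression() is a hypothetical helper name, 
and it relies on zlib.output_compression being changeable with ini_set() as 
long as no output has been sent yet.

```php
<?php
// Hypothetical helper for illustration; not part of PHP or the patch.
// Returns true if zlib output compression is off after the call.
function disable_zlib_output_compression(): bool
{
    if (!extension_loaded('zlib')) {
        return true; // no zlib extension, so nothing to disable
    }
    if (headers_sent()) {
        return false; // too late: output has already started
    }
    // ini_set() returns the old value on success and false on failure.
    return ini_set('zlib.output_compression', '0') !== false;
}

// Call this before the first header() or echo; when it returns true, a
// Content-Length based on the uncompressed size matches what is sent.
$compressionOff = disable_zlib_output_compression();
```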

------------------------------------------------------------------------


The remainder of the comments for this report are too long. To view
the rest of the comments, please view the bug report online at

    https://bugs.php.net/bug.php?id=44164

