Hi Apache developers,
the following bug causes mod_proxy_html to add seemingly random
characters (plus HTML code) to HTML pages when the document is smaller
than 4 bytes. (Thomas, Ewald: this is issue #18378 in our Mantis.) The
added output looks like it comes from uninitialized memory; the string
sometimes matches part of what was delivered for a previous request. It
also seems to happen only when the same browser sends multiple HTTP
requests over a keep-alive connection.
The root cause is that charset guessing with xml2enc needs to consume at
least 4 bytes of the document to reach a conclusion. The consumed bytes
are buffered so that they can be prepended to the output again later.
Apparently the code assumes that at least 4 bytes are always available,
which is not true in every case. When fewer bytes are available, the
buffer may still contain bytes left over from the previous request on
the same connection.
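
To illustrate the suspected mechanism, here is a simplified, standalone
model in C. It is not the actual xml2enc code: sniff_ctx, sniff_consume
and both replay functions are hypothetical names invented for this
sketch. It only shows how a buffer reused across requests can hand back
stale bytes when a full 4 bytes are assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SNIFF_LEN 4

/* Hypothetical stand-in for state that survives across keep-alive requests. */
typedef struct {
    char   sniff[SNIFF_LEN]; /* bytes consumed for charset detection */
    size_t valid;            /* how many of them belong to the current document */
} sniff_ctx;

/* Consume up to SNIFF_LEN bytes of the document into the sniff buffer. */
static void sniff_consume(sniff_ctx *ctx, const char *doc, size_t len)
{
    assert(ctx != NULL && doc != NULL);
    ctx->valid = len < SNIFF_LEN ? len : SNIFF_LEN;
    memcpy(ctx->sniff, doc, ctx->valid); /* bytes past 'valid' stay stale */
}

/* Faulty replay: assumes SNIFF_LEN bytes were always consumed. */
static size_t replay_assume_full(const sniff_ctx *ctx, char *out)
{
    memcpy(out, ctx->sniff, SNIFF_LEN);
    return SNIFF_LEN;
}

/* Careful replay: emits only the bytes that came from the current document. */
static size_t replay_valid_only(const sniff_ctx *ctx, char *out)
{
    memcpy(out, ctx->sniff, ctx->valid);
    return ctx->valid;
}
```

If a 6-byte document is sniffed first and a 2-byte document second, the
faulty replay returns the 2 new bytes followed by 2 bytes of the first
document, which would match the symptom described above.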
The attached patch fixes the issue by simply skipping documents smaller
than 4 bytes. The rationale is that HTML rewriting can only do something
useful on an absolute URL (i.e. one that includes a scheme). Since the
scheme "http" alone is already 4 bytes long, a document that short
contains nothing to rewrite.
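
The skip condition can be modelled in isolation as follows. This is a
sketch only: skip_rewriting and MIN_REWRITE_BYTES are hypothetical names,
and the document is represented as an array of chunk sizes instead of an
APR bucket brigade. The point it illustrates is that a short first chunk
alone is not sufficient; the chunk must also be the last one, since a
larger document may simply arrive in small pieces.

```c
#include <assert.h>
#include <stddef.h>

#define MIN_REWRITE_BYTES 4 /* length of the scheme "http" */

/* Hypothetical model: a document arrives as n_chunks chunks of the given
 * sizes. Rewriting is skipped only if the first chunk is shorter than
 * MIN_REWRITE_BYTES and no further chunk follows it. */
static int skip_rewriting(const size_t *chunk_sizes, size_t n_chunks)
{
    assert(chunk_sizes != NULL && n_chunks > 0);
    return chunk_sizes[0] < MIN_REWRITE_BYTES && n_chunks == 1;
}
```

In this model a single 2-byte chunk is skipped, while a 2-byte chunk
followed by more data is still handed to the rewriter.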
The patch is based on httpd trunk, rev. 1579365.
Please let me know whether I should file an issue in Apache's Bugzilla
or whether that is not needed in this case.
Regards,
Micha
Index: modules/filters/mod_proxy_html.c
===
--- modules/filters/mod_proxy_html.c (revision 1579365)
+++ modules/filters/mod_proxy_html.c (working copy)
@@ -885,6 +885,15 @@
         else if (apr_bucket_read(b, &buf, &bytes, APR_BLOCK_READ)
                  == APR_SUCCESS) {
             if (ctxt->parser == NULL) {
+                /* For documents smaller than four bytes, there is no reason to do
+                 * HTML rewriting. The URL schema (i.e. 'http') needs four bytes alone.
+                 * And the HTML parser needs at least four bytes to initialise correctly.
+                 */
+                if ((bytes < 4) && APR_BUCKET_IS_EOS(APR_BUCKET_NEXT(b))) {
+                    ap_remove_output_filter(f) ;
+                    return ap_pass_brigade(f->next, bb) ;
+                }
+
                 const char *cenc;
                 if (!xml2enc_charset ||
                     (xml2enc_charset(f->r, &enc, &cenc) != APR_SUCCESS)) {