PR 41798 and many related ones (eg 39746, 38980 - both of which I've
closed today) show a history of incorrect URL-unescaping in mod_proxy.
For PR41798, the attached patch looks like a fix: it just uses
r->unparsed_uri (escaped) instead of r->uri (unescaped) in
proxy_trans. I'm wondering if using unparsed_uri here risks
breaking something or has security implications we need to
consider, bearing in mind we already unescaped it and thus
verified it is well-formed.
Any thoughts? Whoever wrote the comment about the existing logic
breaking RFC1945 presumably didn't see it as being that simple.
--
Nick Kew
Application Development with Apache - the Apache Modules Book
http://www.apachetutor.org/
Index: modules/proxy/mod_proxy.c
===================================================================
--- modules/proxy/mod_proxy.c (revision 573827)
+++ modules/proxy/mod_proxy.c (working copy)
@@ -523,6 +523,16 @@
const char *real;
ap_regmatch_t regm[AP_MAX_REG_MATCH];
char *found = NULL;
+ const char *local_uri = r->unparsed_uri;
+ /* r->uri has been unescaped at this point.
+ * Using it in a proxy breaks us for backends that expect
+ * escaped sequences - at least slashes. We can fix that
+ * using r->unparsed_uri. But does that itself break something
+ * else, or have security implications?
+ *
+ * We're too early to use per-dir config. Should this be
+ * a per-vhost config option, or universal, or ???
+ */
if (r->proxyreq) {
/* someone has already set up the proxy, it was possibly ourselves
@@ -531,11 +541,6 @@
return OK;
}
- /* XXX: since r->uri has been manipulated already we're not really
- * compliant with RFC1945 at this point. But this probably isn't
- * an issue because this is a hybrid proxy/origin server.
- */
-
for (i = 0; i < conf->aliases->nelts; i++) {
if (dconf->interpolate_env == 1) {
fake = proxy_interpolate(r, ent[i].fake);
@@ -546,11 +551,11 @@
real = ent[i].real;
}
if (ent[i].regex) {
- if (!ap_regexec(ent[i].regex, r->uri, AP_MAX_REG_MATCH, regm, 0)) {
+ if (!ap_regexec(ent[i].regex, local_uri, AP_MAX_REG_MATCH, regm, 0)) {
if ((real[0] == '!') && (real[1] == '\0')) {
return DECLINED;
}
- found = ap_pregsub(r->pool, real, r->uri, AP_MAX_REG_MATCH,
+ found = ap_pregsub(r->pool, real, local_uri, AP_MAX_REG_MATCH,
regm);
/* Note: The strcmp() below catches cases where there
* was no regex substitution. This is so cases like:
@@ -569,13 +574,13 @@
found = apr_pstrcat(r->pool, "proxy:", found, NULL);
}
else {
- found = apr_pstrcat(r->pool, "proxy:", real, r->uri,
+ found = apr_pstrcat(r->pool, "proxy:", real, local_uri,
NULL);
}
}
}
else {
- len = alias_match(r->uri, fake);
+ len = alias_match(local_uri, fake);
if (len != 0) {
if ((real[0] == '!') && (real[1] == '\0')) {
@@ -583,11 +588,12 @@
}
found = apr_pstrcat(r->pool, "proxy:", real,
- r->uri + len, NULL);
+ local_uri + len, NULL);
}
}
if (found) {
+ r->uri = local_uri; /* the backend should get it escaped */
r->filename = found;
r->handler = "proxy-server";
r->proxyreq = PROXYREQ_REVERSE;