> On 25 May 2018, at 17:43, Micha Lenk <mi...@lenk.info> wrote:
>
> 524     s_from = strlen(m->from.c);
> 525     if (!strncasecmp(ctx->buf, m->from.c, s_from)) {
> ...         ... do the string replacement ...
>
>
> ... where ctx->buf is the URL found in the HTML document, and m->from.c is
> the first configured argument of ProxyHTMLURLMap. So, if the latter is a
> prefix of the first, this condition should be true and the string replacement
> should happen. When the expected string replacement doesn't happen, the
> condition is false and the values of the variables are:
>
> ctx->buf = http://internal/!%22%23$/
> m->from.c = http://internal/!"#$/
>
> So, the strings don't match and are not replaced for that reason.

Yep. mod_proxy_html takes what it sees. That's why it relies on another
module (mod_xml2enc) for i18n, which is kind-of what I expected to see from
your subject line!

> Going forward I am not interested in finding a workaround for this, but more
> how to approach a fix (if this is a bug at all).
>
> Is it reasonable to expect mod_proxy_html to rewrite URL encoded URLs as well?

I think it's reasonable to use the escaped form, exactly as it appears in the
HTML, in your ProxyHTMLURLMap. If we have mod_proxy_html unescape characters,
it adds complexity to the code, and (perhaps more to the point) presents a
mirror-image of your problem to anyone with the opposite expectations.

> Let's assume this needs to be fixed. To make the strings match, we could
> either URL escape the value from the Apache directive ProxyHTMLURLMap, or
> temporarily URL-decode the string found in the HTML document just for the
> purpose of the string comparison. What is the right thing to do?

I prefer to leave it to server admins to find the match that works for them.
I don't recollect this particular question ever arising in 15 years, which
kind-of suggests users are not confused by it!

-- 
Nick Kew