> HTML::Entities correctly turns \x8b into &#x8B; while Apache::Util leaves it
> untouched. That character is treated by certain buggy browsers as < and can
> thus be used to fake tags. Note that just because your browser isn't
> vulnerable (ie it doesn't buy the fake h1) doesn't mean that the problem
> isn't there :-) The source makes it explicit.
>
> This is with 1.25 but I don't think it has changed since. The solution is to
> do what HTML::Entities does, which is basically sprintf "&#x%X;", ord($char)
> for control and high bit chars. I'd submit a patch but I'm not too fluent with
> C/XS.
>
I'm probably worse with C than Robin, but here's a patch that seems to fix
the problem (as I understand it, that is).

The solution differs from HTML::Entities in that it always uses the numeric
form (like &#184;) for characters from 127 to 255, whereas HTML::Entities
uses named entities like &cedil;.

Anyway, with the usual caveats of myself not being a C guy, input on a
better way to do this is not only welcomed, but encouraged :)

--Geoff

Index: Util.xs
===================================================================
RCS file: /home/cvspublic/modperl/src/modules/perl/Util.xs,v
retrieving revision 1.9
diff -u -r1.9 Util.xs
--- Util.xs	4 Mar 2000 20:55:47 -0000	1.9
+++ Util.xs	24 Jan 2002 14:31:46 -0000
@@ -36,6 +36,7 @@
 {
     int i, j;
     SV *x;
+    static char highbits[7]; /* "&#NNN;" is 6 chars plus the trailing NUL */
 
     /* first, count the number of extra characters */
     for (i = 0, j = 0; s[i] != '\0'; i++)
@@ -43,7 +44,8 @@
             j += 3;
         else if (s[i] == '&')
             j += 4;
-        else if (s[i] == '"')
+        else if (s[i] == '"' ||
+                 ((unsigned char)s[i] > 126) && (unsigned char)s[i] <= 255)
             j += 5;
 
     if (j == 0)
@@ -67,6 +69,11 @@
             memcpy(&SvPVX(x)[j], "&quot;", 6);
             j += 5;
         }
+        else if ((unsigned char)s[i] > 126 && (unsigned char)s[i] <= 255) {
+            sprintf(highbits, "&#%i;", (unsigned char)s[i]);
+            memcpy(&SvPVX(x)[j], highbits, 6);
+            j += 5;
+        }
         else
             SvPVX(x)[j] = s[i];
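
For anyone who wants to poke at the logic outside of XS, here is a minimal standalone sketch of the same two-pass approach in plain C. It is not the mod_perl code: escape_html_sketch is a hypothetical name, and it uses malloc instead of an SV. It assumes, like the patch, that every byte from 127 to 255 becomes a decimal reference, which is always exactly six characters ("&#127;" through "&#255;"), so each such byte adds five extra bytes to the output.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not part of mod_perl: returns a malloc'd,
 * escaped copy of s (caller frees), or NULL if malloc fails. */
char *escape_html_sketch(const char *s)
{
    size_t i, j, extra = 0;
    char *out;

    /* first pass: count the extra bytes each character expands to */
    for (i = 0; s[i] != '\0'; i++) {
        unsigned char c = (unsigned char)s[i];
        if (c == '<' || c == '>')
            extra += 3;              /* "&lt;" or "&gt;" */
        else if (c == '&')
            extra += 4;              /* "&amp;" */
        else if (c == '"' || c > 126)
            extra += 5;              /* "&quot;" or "&#NNN;" */
    }

    out = malloc(i + extra + 1);
    if (out == NULL)
        return NULL;

    /* second pass: copy, expanding as we go */
    for (i = 0, j = 0; s[i] != '\0'; i++) {
        unsigned char c = (unsigned char)s[i];
        if (c == '<')      { memcpy(out + j, "&lt;", 4);   j += 4; }
        else if (c == '>') { memcpy(out + j, "&gt;", 4);   j += 4; }
        else if (c == '&') { memcpy(out + j, "&amp;", 5);  j += 5; }
        else if (c == '"') { memcpy(out + j, "&quot;", 6); j += 6; }
        else if (c > 126) {
            char ref[7];             /* 6 chars plus the trailing NUL */
            snprintf(ref, sizeof ref, "&#%u;", (unsigned)c);
            memcpy(out + j, ref, 6);
            j += 6;
        }
        else
            out[j++] = (char)c;
    }
    out[j] = '\0';
    return out;
}
```

Called on the byte \x8b from Robin's report, this produces "&#139;", the decimal equivalent of the &#x8B; that HTML::Entities emits.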