Github user zwoop commented on a diff in the pull request:
https://github.com/apache/trafficserver/pull/866#discussion_r74707669
--- Diff: proxy/logging/LogUtils.cc ---
@@ -359,6 +359,23 @@ LogUtils::escapify_url(Arena *arena, char *url, size_t
len_in, int *len_out, cha
while (from < in_url_end) {
unsigned char c = *from;
if (map[c / 8] & (1 << (7 - c % 8))) {
+ /*
+ * If two characters following a '%' don't need to be encoded, then
it must
+ * mean that the three character sequence is already encoded. Just
copy it over.
+ */
+ if ((*from == '%') && ((from + 2) < in_url_end)) {
+ unsigned char c1 = *(from + 1);
+ unsigned char c2 = *(from + 2);
+ bool needsEncoding = ((map[c1 / 8] & (1 << (7 - c1 % 8))) ||
(map[c2 / 8] & (1 << (7 - c2 % 8))));
+ if (!needsEncoding) {
+ out_len -= 2;
+ *to++ = *from;
+ from++;
+ Debug("log-utils", "character already encoded..skipping %c, %c,
%c", *from, *(from + 1), *(from + 2));
--- End diff --
Hmmm, so some questions on this:
1) Why not *to++ = *from++; ?
2) Since we now moved from forward, is the Debug() line still correct?
Seems that it'd be one too much ?
3) I'm not sure I understand this logic, it seems it consumes 2 bytes
(out_len -= 2), but it only writes one (*to++ = *from) ? Shouldn't this consume
/ copy all 3 bytes ? That's sort of what the comments imply, no?
4) It might be nice to explain (comment) what all that bit shifting and
logic is actually doing? Presumably it's checking if c1 or c2 is of a
particular value, but what values are those?
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---