[PATCH v2] urlmatch.c: recompute ptr after append_normalized_escapes
On Sep 12, 2013, at 02:57, Thomas Rast wrote:

The calls to strbuf_add* within append_normalized_escapes() can reallocate the buffer passed to it. Therefore, the seg_start pointer into the string cannot be kept across such calls.

Thanks for finding this.

It went undetected for a while because it does not fail the test: the calls to test-urlmatch-normalization happen inside a $() substitution.

I checked the other call sites to append_normalized_escapes() for the same type of problem, and they seem to be okay.

diff --git a/urlmatch.c b/urlmatch.c
index 1db76c8..59abc80 100644
--- a/urlmatch.c
+++ b/urlmatch.c
@@ -281,7 +281,8 @@ char *url_normalize(const char *url, struct url_info *out_info)
 		url_len--;
 	}
 	for (;;) {
-		const char *seg_start = norm.buf + norm.len;
+		const char *seg_start;
+		size_t prev_len = norm.len;

How about a more descriptive name for what prev_len is? It's actually the segment start offset.

 		const char *next_slash = url + strcspn(url, "/?#");
 		int skip_add_slash = 0;
 		/*
@@ -297,6 +298,7 @@ char *url_normalize(const char *url, struct url_info *out_info)
 			strbuf_release(&norm);
 			return NULL;
 		}
+		seg_start = norm.buf + prev_len;

A comment would be nice here to remind folks who might be tempted to revert this to the previous version why it's being done this way. I'm sure at some point someone will propose a simplification patch otherwise.

Also some nits. The patch description should be imperative mood (cf. Documentation/SubmittingPatches). And instead of mentioning the seg_start pointer in the description (which will be meaningless to just about everyone and it's clear from the diff), mention the bad thing the code was doing in more general terms that will be clear to anyone familiar with a strbuf.

So how about this patch instead...
-- >8 --
From: Thomas Rast tr...@inf.ethz.ch
Subject: urlmatch.c: recompute pointer after append_normalized_escapes

When append_normalized_escapes is called, its internal strbuf_add* calls can cause the strbuf's buf to be reallocated changing the value of the buf pointer. Do not use the strbuf buf pointer from before any append_normalized_escapes calls afterwards. Instead recompute the needed pointer.

Signed-off-by: Thomas Rast tr...@inf.ethz.ch
Signed-off-by: Kyle J. McKay mack...@gmail.com
---
 urlmatch.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/urlmatch.c b/urlmatch.c
index 1db76c89..01c67467 100644
--- a/urlmatch.c
+++ b/urlmatch.c
@@ -281,8 +281,9 @@ char *url_normalize(const char *url, struct url_info *out_info)
 		url_len--;
 	}
 	for (;;) {
-		const char *seg_start = norm.buf + norm.len;
+		const char *seg_start;
+		size_t seg_start_off = norm.len;
 		const char *next_slash = url + strcspn(url, "/?#");
 		int skip_add_slash = 0;
 		/*
 		 * RFC 3689 indicates that any . or .. segments should be
@@ -297,6 +298,8 @@ char *url_normalize(const char *url, struct url_info *out_info)
 			strbuf_release(&norm);
 			return NULL;
 		}
+		/* append_normalized_escapes can cause norm.buf to change */
+		seg_start = norm.buf + seg_start_off;
 		if (!strcmp(seg_start, ".")) {
 			/* ignore a . segment; be careful not to remove initial '/' */
 			if (seg_start == path_start + 1) {
--
1.8.3

--
To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
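For readers less familiar with the pattern, here is a minimal standalone sketch of what the patch does: remember an offset, not a pointer, across a call that may grow the buffer. Note that struct gbuf, gbuf_add() and append_segment() are hypothetical stand-ins made up for this illustration; they are not git's strbuf API and not the code in urlmatch.c.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable buffer for illustration; NOT git's struct strbuf. */
struct gbuf {
	char *buf;
	size_t len;
	size_t alloc;
};

/* Append len bytes and keep the buffer NUL-terminated; may move b->buf. */
static void gbuf_add(struct gbuf *b, const char *data, size_t len)
{
	if (b->len + len + 1 > b->alloc) {
		size_t new_alloc = (b->len + len + 1) * 2;
		char *p = realloc(b->buf, new_alloc);

		if (!p)
			abort();
		b->buf = p;	/* any pointer taken into the old buf is now stale */
		b->alloc = new_alloc;
	}
	memcpy(b->buf + b->len, data, len);
	b->len += len;
	b->buf[b->len] = '\0';
}

/*
 * Append one segment and return a pointer to where it starts.
 * Capturing "b->buf + b->len" before the append would be the bug
 * discussed in this thread; an offset stays valid across realloc.
 */
static const char *append_segment(struct gbuf *b, const char *seg)
{
	size_t seg_start_off = b->len;

	gbuf_add(b, seg, strlen(seg));
	assert(b->buf != NULL);
	return b->buf + seg_start_off;	/* recompute from the current buf */
}
```

A caller would start from a zeroed struct gbuf, call append_segment() once per path segment, and only compare or inspect the returned pointer before the next append, since the next append may move the buffer again.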
Re: [PATCH v2] urlmatch.c: recompute ptr after append_normalized_escapes
Kyle J. McKay mack...@gmail.com writes:

Also some nits. The patch description should be imperative mood (cf. Documentation/SubmittingPatches).

Heh. Serves me right to go away for a while and get SubmittingPatches cited at me on return ;-)

Thanks for the updated patch. I agree with the changes. I particularly like the better variable name.

--
Thomas Rast
trast@{inf,student}.ethz.ch
Re: [PATCH v2] urlmatch.c: recompute ptr after append_normalized_escapes
Thanks, both.
Re: [PATCH v2] urlmatch.c: recompute ptr after append_normalized_escapes
Kyle J. McKay mack...@gmail.com writes:

So how about this patch instead...

-- >8 --
From: Thomas Rast tr...@inf.ethz.ch
Subject: urlmatch.c: recompute pointer after append_normalized_escapes

When append_normalized_escapes is called, its internal strbuf_add* calls can cause the strbuf's buf to be reallocated changing the value of the buf pointer. Do not use the strbuf buf pointer from before any append_normalized_escapes calls afterwards. Instead recompute the needed pointer.

Signed-off-by: Thomas Rast tr...@inf.ethz.ch
Signed-off-by: Kyle J. McKay mack...@gmail.com
---
 urlmatch.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/urlmatch.c b/urlmatch.c
index 1db76c89..01c67467 100644
--- a/urlmatch.c
+++ b/urlmatch.c
@@ -281,8 +281,9 @@ char *url_normalize(const char *url, struct url_info *out_info)
 		url_len--;
 	}
 	for (;;) {
-		const char *seg_start = norm.buf + norm.len;
+		const char *seg_start;
+		size_t seg_start_off = norm.len;
 		const char *next_slash = url + strcspn(url, "/?#");
 		int skip_add_slash = 0;
 		/*
 		 * RFC 3689 indicates that any . or .. segments should be
@@ -297,6 +298,8 @@ char *url_normalize(const char *url, struct url_info *out_info)
 			strbuf_release(&norm);
 			return NULL;
 		}
+		/* append_normalized_escapes can cause norm.buf to change */
+		seg_start = norm.buf + seg_start_off;

The change looks good, but I find that this comment is not placed in the right place. It is good if the reader knows about an old bug to put it here, but if the first thing a reader reads is this updated version, the comment is better placed close to the place where the start_ofs variable captures the original value (i.e. because the next call may relocate the buffer, we cannot grab seg_start upfront; instead we need to record the start_ofs here, and that is what this variable is about).

It is too minor a point for a reroll, so I'll try to tweak it locally. Something like this (but now I think about it, the comment may not even be necessary).

 urlmatch.c | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/urlmatch.c b/urlmatch.c
index 01c6746..d1600e2 100644
--- a/urlmatch.c
+++ b/urlmatch.c
@@ -282,9 +282,17 @@ char *url_normalize(const char *url, struct url_info *out_info)
 	}
 	for (;;) {
 		const char *seg_start;
-		size_t seg_start_off = norm.len;
+		size_t seg_start_off;
 		const char *next_slash = url + strcspn(url, "/?#");
 		int skip_add_slash = 0;
+
+		/*
+		 * record the starting offset; appending escapes may
+		 * relocate the buffer, so we cannot capture seg_start
+		 * upfront and use it later.
+		 */
+		seg_start_off = norm.len;
+
 		/*
 		 * RFC 3689 indicates that any . or .. segments should be
 		 * unescaped before being checked for.
@@ -298,7 +306,7 @@ char *url_normalize(const char *url, struct url_info *out_info)
 			strbuf_release(&norm);
 			return NULL;
 		}
-		/* append_normalized_escapes can cause norm.buf to change */
+
 		seg_start = norm.buf + seg_start_off;
 		if (!strcmp(seg_start, ".")) {
 			/* ignore a . segment; be careful not to remove initial '/' */
Re: [PATCH v2] urlmatch.c: recompute ptr after append_normalized_escapes
On Sep 12, 2013, at 11:30, Junio C Hamano wrote:

+		/* append_normalized_escapes can cause norm.buf to change */
+		seg_start = norm.buf + seg_start_off;

The change looks good, but I find that this comment is not placed in the right place. It is good if the reader knows about an old bug to put it here, but if the first thing a reader reads is this updated version, the comment is better placed close to the place where the start_ofs variable captures the original value (i.e. because the next call may relocate the buffer, we cannot grab seg_start upfront; instead we need to record the start_ofs here, and that is what this variable is about).

It is too minor a point for a reroll, so I'll try to tweak it locally. Something like this (but now I think about it, the comment may not even be necessary).

The longer comment looks good to me. If you think the code will be safe from simplification patches without a comment, that works for me too. I've just seen so many simplification patches go by on the list I'm concerned it will be a target otherwise leading to re-introduction of the problem.

diff --git a/urlmatch.c b/urlmatch.c
index 01c6746..d1600e2 100644
--- a/urlmatch.c
+++ b/urlmatch.c
@@ -282,9 +282,17 @@ char *url_normalize(const char *url, struct url_info *out_info)
 	}
 	for (;;) {
 		const char *seg_start;
-		size_t seg_start_off = norm.len;
+		size_t seg_start_off;
 		const char *next_slash = url + strcspn(url, "/?#");
 		int skip_add_slash = 0;
+
+		/*
+		 * record the starting offset; appending escapes may
+		 * relocate the buffer, so we cannot capture seg_start
+		 * upfront and use it later.
+		 */
+		seg_start_off = norm.len;
+
 		/*
 		 * RFC 3689 indicates that any . or .. segments should be
 		 * unescaped before being checked for.
@@ -298,7 +306,7 @@ char *url_normalize(const char *url, struct url_info *out_info)
 			strbuf_release(&norm);
 			return NULL;
 		}
-		/* append_normalized_escapes can cause norm.buf to change */
+
 		seg_start = norm.buf + seg_start_off;
 		if (!strcmp(seg_start, ".")) {
 			/* ignore a . segment; be careful not to remove initial '/' */
Re: [PATCH v2] urlmatch.c: recompute ptr after append_normalized_escapes
Kyle J. McKay wrote:

The longer comment looks good to me. If you think the code will be safe from simplification patches without a comment, that works for me too.

I think if we can't trust reviewers to catch this kind of thing, we're in trouble (i.e., moving too fast). :)

So FWIW my instinct is to leave the comment out, since I actually find it more readable that way (otherwise I would wonder, "Why am I being told that a strbuf's buffer has a nonconstant address? Do some other strbufs have a constant address or something?")

Thanks,
Jonathan
Re: [PATCH v2] urlmatch.c: recompute ptr after append_normalized_escapes
Jonathan Nieder jrnie...@gmail.com writes:

Kyle J. McKay wrote:

The longer comment looks good to me. If you think the code will be safe from simplification patches without a comment, that works for me too.

I think if we can't trust reviewers to catch this kind of thing, we're in trouble (i.e., moving too fast). :)

So FWIW my instinct is to leave the comment out, since I actually find it more readable that way (otherwise I would wonder, "Why am I being told that a strbuf's buffer has a nonconstant address? Do some other strbufs have a constant address or something?")

Yeah, I was staring at that message and coming to the same conclusion.