Re: [PATCH 33/67] read_branches_file: replace strcpy with xstrdup
On Thu, Sep 17, 2015 at 07:28:56AM -0400, Jeff King wrote: > /* >* The branches file would have URL and optionally >* #branch specified. The "master" (or specified) branch is > - * fetched and stored in the local branch of the same name. > + * fetched and stored in the local branch matching the > + * remote name. >*/ > - frag = strchr(p, '#'); > - if (frag) { > + frag = strchr(buf.buf, '#'); > + if (frag) > *(frag++) = '\0'; > - strbuf_addf(, "refs/heads/%s", frag); > - } else > - strbuf_addstr(, "refs/heads/master"); > + else > + frag = "master"; > + > + add_url_alias(remote, strbuf_detach(, NULL)); > + add_fetch_refspec(remote, xstrfmt("refs/heads/%s:refs/heads/%s", > + frag, remote->name)); There is a little bit of subtlety here. "frag" points into the strbuf. We then detach the strbuf, and assume that "frag" is still valid. This works fine, as we retain the same buffer when detaching. But I wonder if it is violating the strbuf abstraction too much. We can xstrdup() the frag section, but it just adds a little more tedious book-keeping to the function. -Peff -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 33/67] read_branches_file: replace strcpy with xstrdup
On Thu, Sep 17, 2015 at 07:28:56AM -0400, Jeff King wrote: > Here is the patch I ended up with: > > -- >8 -- > Subject: [PATCH] read_branches_file: simplify string handling And here is the matching cleanup for read_remotes_file, which lets us drop the static global "buffer" entirely. -- >8 -- Subject: [PATCH] read_remotes_file: simplify string handling The main motivation for this cleanup is to switch our line-reading to a strbuf, which removes the use of a static-sized buffer (which limited the size of remote URLs). Since we have the strbuf, we can make use of strbuf_rtrim(). While we're here, we can also simplify the parsing of each line. First, we can use skip_prefix() to avoid some magic numbers. But second, we can avoid splitting the parsing and actions for each line into two stages. Right now we figure out which type of line we have, set an int to a magic number, skip any intermediate whitespace, and then act on the resulting value based on the magic number. Instead, let's factor the whitespace skipping into a function. That lets us avoid the magic numbers and keep the actions close to the parsing. Signed-off-by: Jeff King--- You could also take advantage of the fact that all of the actions have a uniform function interface, and write this as: void (*action)(struct remote *, char *); if (skip_prefix(buf.buf, "URL:", )) action = add_url_alias; else ...more parsing... while (isspace(*v)) v++; action(remote, xstrdup(v)); That is more restrictive, but I doubt we would be adding new actions to this deprecated format anytime soon. I'm not sure which people think is more readable. remote.c | 55 ++- 1 file changed, 18 insertions(+), 37 deletions(-) diff --git a/remote.c b/remote.c index 2bef5a4..a62b659 100644 --- a/remote.c +++ b/remote.c @@ -54,9 +54,6 @@ static const char *pushremote_name; static struct rewrites rewrites; static struct rewrites rewrites_push; -#define BUF_SIZE (2048) -static char buffer[BUF_SIZE]; - static int valid_remote(const struct remote *remote) { return (!!remote->url) || (!!remote->foreign_vcs); @@ -243,50 +240,34 @@ static void add_instead_of(struct rewrite *rewrite, const char *instead_of) rewrite->instead_of_nr++; } +static const char *skip_spaces(const char *s) +{ + while (isspace(*s)) + s++; + return s; +} + static void read_remotes_file(struct remote *remote) { + struct strbuf buf = STRBUF_INIT; FILE *f = fopen(git_path("remotes/%s", remote->name), "r"); if (!f) return; remote->origin = REMOTE_REMOTES; - while (fgets(buffer, BUF_SIZE, f)) { - int value_list; - char *s, *p; - - if (starts_with(buffer, "URL:")) { - value_list = 0; - s = buffer + 4; - } else if (starts_with(buffer, "Push:")) { - value_list = 1; - s = buffer + 5; - } else if (starts_with(buffer, "Pull:")) { - value_list = 2; - s = buffer + 5; - } else - continue; - - while (isspace(*s)) - s++; - if (!*s) - continue; + while (strbuf_getline(, f, '\n') != EOF) { + const char *v; - p = s + strlen(s); - while (isspace(p[-1])) - *--p = 0; + strbuf_rtrim(); - switch (value_list) { - case 0: - add_url_alias(remote, xstrdup(s)); - break; - case 1: - add_push_refspec(remote, xstrdup(s)); - break; - case 2: - add_fetch_refspec(remote, xstrdup(s)); - break; - } + if (skip_prefix(buf.buf, "URL:", )) + add_url_alias(remote, xstrdup(skip_spaces(v))); + else if (skip_prefix(buf.buf, "Push:", )) + add_push_refspec(remote, xstrdup(skip_spaces(v))); + else if (skip_prefix(buf.buf, "Pull:", )) + add_fetch_refspec(remote, xstrdup(skip_spaces(v))); } + strbuf_release(); fclose(f); } -- 2.6.0.rc2.408.ga2926b9 -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 33/67] read_branches_file: replace strcpy with xstrdup
On Wed, Sep 16, 2015 at 04:42:26PM -0400, Jeff King wrote: > > We use buffer[BUFSIZ] to read various things in this file, not just > > $GIT_DIR/branches/* files, with fgets(), which may be better done if > > we switched to strbuf_getline(). Then we can also use trim family > > of calls from the strbuf API suite. > > > > Move to strbuf_getline() may be a doubly attractive proposition, > > with a possible change to strbuf_getline() to make it also remove CR > > that immediately precedes LF [*1*], helping DOSsy platforms. > > I'll take a look and see how painful this is. I think that this already works, because we strip whitespace from the right side of the buffer (and isspace('\r') returns true). Likewise, I don't think we need to teach strbuf_getline() about CRs for the same reason (I agree it would be a good thing to do in general, but I stopped short here). Here is the patch I ended up with: -- >8 -- Subject: [PATCH] read_branches_file: simplify string handling This function does a lot of manual string handling, and has some unnecessary limits. This patch cleans up a number of things: 1. Drop the arbitrary 1000-byte limit on the size of the remote name (we do not have such a limit in any of the other remote-reading mechanisms). 2. Replace fgets into a fixed-size buffer with a strbuf, eliminating any limits on the length of the URL. This uses strbuf_read_file for simplicity. Technically this behavior is different than the original, as we will read the whole file content, whereas the original simply ignored subsequent lines. Given that branches files are supposed to be one line, this doesn't matter in practice (and arguably including the extra lines is better, as it will probably cause the URL to be bogus and bring attention to the issue). 3. Replace manual whitespace handling with strbuf_trim (since we now have a strbuf). This also gets rid of a call to strcpy, and the confusing reuse of the "p" pointer for multiple purposes. 4. We currently build up the refspecs over multiple strbuf calls. We do this to handle the fact that the URL "frag" may not be present. But rather than have multiple conditionals, let's just default "frag" to "master". This lets us format the refspecs with a single xstrfmt. It's shorter, and easier to see what the final string looks like. We also update the misleading comment in this area (the local branch is named after the remote name, not after the branch name on the remote side). Signed-off-by: Jeff King--- I guess (2) could be wrong if people are counting on adding arbitrary comment lines to the end of a "branches" file, but AFAIK that has never been a thing. remote.c | 54 +++--- 1 file changed, 19 insertions(+), 35 deletions(-) diff --git a/remote.c b/remote.c index 5ab0f7f..2bef5a4 100644 --- a/remote.c +++ b/remote.c @@ -293,56 +293,40 @@ static void read_remotes_file(struct remote *remote) static void read_branches_file(struct remote *remote) { char *frag; - struct strbuf branch = STRBUF_INIT; - int n = 1000; - FILE *f = fopen(git_path("branches/%.*s", n, remote->name), "r"); - char *s, *p; - int len; + struct strbuf buf = STRBUF_INIT; - if (!f) + if (strbuf_read_file(, git_path("branches/%s", remote->name), 0) < 0) return; - s = fgets(buffer, BUF_SIZE, f); - fclose(f); - if (!s) - return; - while (isspace(*s)) - s++; - if (!*s) + + strbuf_trim(); + if (!buf.len) { + strbuf_release(); return; + } + remote->origin = REMOTE_BRANCHES; - p = s + strlen(s); - while (isspace(p[-1])) - *--p = 0; - len = p - s; - p = xmalloc(len + 1); - strcpy(p, s); /* * The branches file would have URL and optionally * #branch specified. The "master" (or specified) branch is -* fetched and stored in the local branch of the same name. +* fetched and stored in the local branch matching the +* remote name. */ - frag = strchr(p, '#'); - if (frag) { + frag = strchr(buf.buf, '#'); + if (frag) *(frag++) = '\0'; - strbuf_addf(, "refs/heads/%s", frag); - } else - strbuf_addstr(, "refs/heads/master"); + else + frag = "master"; + + add_url_alias(remote, strbuf_detach(, NULL)); + add_fetch_refspec(remote, xstrfmt("refs/heads/%s:refs/heads/%s", + frag, remote->name)); - strbuf_addf(, ":refs/heads/%s", remote->name); - add_url_alias(remote, p); - add_fetch_refspec(remote, strbuf_detach(, NULL)); /* * Cogito compatible push: push current
Re: [PATCH 33/67] read_branches_file: replace strcpy with xstrdup
On Thu, Sep 17, 2015 at 08:38:32AM -0700, Junio C Hamano wrote: > > In some of the cases, as you've seen, I dug further in cleaning things > > up. But in others I did the minimal fix (especially in this case, the > > limitations are only about the deprecated "branches" and "remotes" > > file), mostly to try to keep the scope of work sane. > > That is sensible. As long as the result of conversion is easier to > audit (which is the primary focus of this series), I'd agree that we > should stop there, instead of making further changes. > > The last thing we would want to do is to change the behaviour, > especially to unintentionally start rejecting what we have always > accepted, while doing a "code clean-up". Letting these sleeping > dogs lie is the safest. That various distros lag behind our release > schedule means that we may not hear about regression until a year > after we break it for a feature used by minority of users. Yeah, that was my thinking. Since I _did_ end up doing the cleanup and posted it earlier, please feel free to review and express an opinion on the original versus the cleanup. I'm on the fence. I do think the cleaned-up version is much nicer, but I always worry about the risk of touching little-used code. -Peff -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 33/67] read_branches_file: replace strcpy with xstrdup
Jeff Kingwrites: > On Wed, Sep 16, 2015 at 12:52:26PM -0700, Junio C Hamano wrote: > >> > diff --git a/remote.c b/remote.c >> > index 5ab0f7f..1b69751 100644 >> > --- a/remote.c >> > +++ b/remote.c >> > @@ -297,7 +297,6 @@ static void read_branches_file(struct remote *remote) >> >int n = 1000; >> >FILE *f = fopen(git_path("branches/%.*s", n, remote->name), "r"); >> >char *s, *p; >> > - int len; >> >> Hmm, we would punish those with ridiculously long remote name by >> truncating at n but that is OK. > > Yeah, though that is nothing new. > > In some of the cases, as you've seen, I dug further in cleaning things > up. But in others I did the minimal fix (especially in this case, the > limitations are only about the deprecated "branches" and "remotes" > file), mostly to try to keep the scope of work sane. That is sensible. As long as the result of conversion is easier to audit (which is the primary focus of this series), I'd agree that we should stop there, instead of making further changes. The last thing we would want to do is to change the behaviour, especially to unintentionally start rejecting what we have always accepted, while doing a "code clean-up". Letting these sleeping dogs lie is the safest. That various distros lag behind our release schedule means that we may not hear about regression until a year after we break it for a feature used by minority of users. Thanks. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 33/67] read_branches_file: replace strcpy with xstrdup
Jeff Kingwrites: > Yeah, that was my thinking. Since I _did_ end up doing the cleanup and > posted it earlier, please feel free to review and express an opinion on > the original versus the cleanup. > > I'm on the fence. I do think the cleaned-up version is much nicer, but > I always worry about the risk of touching little-used code. True. With s/strbuf_read_file()/strbuf_getline()/ on the first one I think there is no regression (at least, that is the only deliberate regression I saw). I found all other parts of the cleaned-up version much nicer, too. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 33/67] read_branches_file: replace strcpy with xstrdup
On Wed, Sep 16, 2015 at 12:52:26PM -0700, Junio C Hamano wrote: > > diff --git a/remote.c b/remote.c > > index 5ab0f7f..1b69751 100644 > > --- a/remote.c > > +++ b/remote.c > > @@ -297,7 +297,6 @@ static void read_branches_file(struct remote *remote) > > int n = 1000; > > FILE *f = fopen(git_path("branches/%.*s", n, remote->name), "r"); > > char *s, *p; > > - int len; > > Hmm, we would punish those with ridiculously long remote name by > truncating at n but that is OK. Yeah, though that is nothing new. In some of the cases, as you've seen, I dug further in cleaning things up. But in others I did the minimal fix (especially in this case, the limitations are only about the deprecated "branches" and "remotes" file), mostly to try to keep the scope of work sane. > We use buffer[BUFSIZ] to read various things in this file, not just > $GIT_DIR/branches/* files, with fgets(), which may be better done if > we switched to strbuf_getline(). Then we can also use trim family > of calls from the strbuf API suite. > > Move to strbuf_getline() may be a doubly attractive proposition, > with a possible change to strbuf_getline() to make it also remove CR > that immediately precedes LF [*1*], helping DOSsy platforms. I'll take a look and see how painful this is. -Peff -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 33/67] read_branches_file: replace strcpy with xstrdup
Jeff Kingwrites: > This code is exactly replicating strdup, so let's just use > that. It's shorter, and eliminates some confusion (such as > whether "p - s" is really enough to hold the result; it is, > because we write NULs as we shrink "p"). > > Signed-off-by: Jeff King > --- > remote.c | 5 + > 1 file changed, 1 insertion(+), 4 deletions(-) > > diff --git a/remote.c b/remote.c > index 5ab0f7f..1b69751 100644 > --- a/remote.c > +++ b/remote.c > @@ -297,7 +297,6 @@ static void read_branches_file(struct remote *remote) > int n = 1000; > FILE *f = fopen(git_path("branches/%.*s", n, remote->name), "r"); > char *s, *p; > - int len; Hmm, we would punish those with ridiculously long remote name by truncating at n but that is OK. We use buffer[BUFSIZ] to read various things in this file, not just $GIT_DIR/branches/* files, with fgets(), which may be better done if we switched to strbuf_getline(). Then we can also use trim family of calls from the strbuf API suite. Move to strbuf_getline() may be a doubly attractive proposition, with a possible change to strbuf_getline() to make it also remove CR that immediately precedes LF [*1*], helping DOSsy platforms. [Reference] *1* http://thread.gmane.org/gmane.comp.version-control.msysgit/21773/focus=21780 > > if (!f) > return; > @@ -313,9 +312,7 @@ static void read_branches_file(struct remote *remote) > p = s + strlen(s); > while (isspace(p[-1])) > *--p = 0; > - len = p - s; > - p = xmalloc(len + 1); > - strcpy(p, s); > + p = xstrdup(s); > > /* >* The branches file would have URL and optionally -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 33/67] read_branches_file: replace strcpy with xstrdup
This code is exactly replicating strdup, so let's just use that. It's shorter, and eliminates some confusion (such as whether "p - s" is really enough to hold the result; it is, because we write NULs as we shrink "p"). Signed-off-by: Jeff King--- remote.c | 5 + 1 file changed, 1 insertion(+), 4 deletions(-) diff --git a/remote.c b/remote.c index 5ab0f7f..1b69751 100644 --- a/remote.c +++ b/remote.c @@ -297,7 +297,6 @@ static void read_branches_file(struct remote *remote) int n = 1000; FILE *f = fopen(git_path("branches/%.*s", n, remote->name), "r"); char *s, *p; - int len; if (!f) return; @@ -313,9 +312,7 @@ static void read_branches_file(struct remote *remote) p = s + strlen(s); while (isspace(p[-1])) *--p = 0; - len = p - s; - p = xmalloc(len + 1); - strcpy(p, s); + p = xstrdup(s); /* * The branches file would have URL and optionally -- 2.6.0.rc2.408.ga2926b9 -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html