Re: [PATCH 33/67] read_branches_file: replace strcpy with xstrdup

2015-09-17 Thread Jeff King
On Thu, Sep 17, 2015 at 07:28:56AM -0400, Jeff King wrote:

>   /*
>* The branches file would have URL and optionally
>* #branch specified.  The "master" (or specified) branch is
> -  * fetched and stored in the local branch of the same name.
> +  * fetched and stored in the local branch matching the
> +  * remote name.
>*/
> - frag = strchr(p, '#');
> - if (frag) {
> + frag = strchr(buf.buf, '#');
> + if (frag)
>   *(frag++) = '\0';
> - strbuf_addf(, "refs/heads/%s", frag);
> - } else
> - strbuf_addstr(, "refs/heads/master");
> + else
> + frag = "master";
> +
> + add_url_alias(remote, strbuf_detach(, NULL));
> + add_fetch_refspec(remote, xstrfmt("refs/heads/%s:refs/heads/%s",
> +   frag, remote->name));

There is a little bit of subtlety here. "frag" points into the strbuf.
We then detach the strbuf, and assume that "frag" is still valid. This
works fine, as we retain the same buffer when detaching. But I wonder if
it is violating the strbuf abstraction too much.

We can xstrdup() the frag section, but it just adds a little more
tedious book-keeping to the function.

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 33/67] read_branches_file: replace strcpy with xstrdup

2015-09-17 Thread Jeff King
On Thu, Sep 17, 2015 at 07:28:56AM -0400, Jeff King wrote:

> Here is the patch I ended up with:
> 
> -- >8 --
> Subject: [PATCH] read_branches_file: simplify string handling

And here is the matching cleanup for read_remotes_file, which
lets us drop the static global "buffer" entirely.

-- >8 --
Subject: [PATCH] read_remotes_file: simplify string handling

The main motivation for this cleanup is to switch our
line-reading to a strbuf, which removes the use of a
static-sized buffer (which limited the size of remote URLs).
Since we have the strbuf, we can make use of strbuf_rtrim().

While we're here, we can also simplify the parsing of each
line.  First, we can use skip_prefix() to avoid some magic
numbers.

But second, we can avoid splitting the parsing and actions
for each line into two stages. Right now we figure out which
type of line we have, set an int to a magic number,
skip any intermediate whitespace, and then act on
the resulting value based on the magic number.

Instead, let's factor the whitespace skipping into a
function. That lets us avoid the magic numbers and keep the
actions close to the parsing.

Signed-off-by: Jeff King 
---
You could also take advantage of the fact that all of the actions have a
uniform function interface, and write this as:

  void (*action)(struct remote *, char *);

  if (skip_prefix(buf.buf, "URL:", ))
action = add_url_alias;
  else ...more parsing...

  while (isspace(*v))
v++;

  action(remote, xstrdup(v));

That is more restrictive, but I doubt we would be adding new actions to
this deprecated format anytime soon. I'm not sure which people think is
more readable.

 remote.c | 55 ++-
 1 file changed, 18 insertions(+), 37 deletions(-)

diff --git a/remote.c b/remote.c
index 2bef5a4..a62b659 100644
--- a/remote.c
+++ b/remote.c
@@ -54,9 +54,6 @@ static const char *pushremote_name;
 static struct rewrites rewrites;
 static struct rewrites rewrites_push;
 
-#define BUF_SIZE (2048)
-static char buffer[BUF_SIZE];
-
 static int valid_remote(const struct remote *remote)
 {
return (!!remote->url) || (!!remote->foreign_vcs);
@@ -243,50 +240,34 @@ static void add_instead_of(struct rewrite *rewrite, const 
char *instead_of)
rewrite->instead_of_nr++;
 }
 
+static const char *skip_spaces(const char *s)
+{
+   while (isspace(*s))
+   s++;
+   return s;
+}
+
 static void read_remotes_file(struct remote *remote)
 {
+   struct strbuf buf = STRBUF_INIT;
FILE *f = fopen(git_path("remotes/%s", remote->name), "r");
 
if (!f)
return;
remote->origin = REMOTE_REMOTES;
-   while (fgets(buffer, BUF_SIZE, f)) {
-   int value_list;
-   char *s, *p;
-
-   if (starts_with(buffer, "URL:")) {
-   value_list = 0;
-   s = buffer + 4;
-   } else if (starts_with(buffer, "Push:")) {
-   value_list = 1;
-   s = buffer + 5;
-   } else if (starts_with(buffer, "Pull:")) {
-   value_list = 2;
-   s = buffer + 5;
-   } else
-   continue;
-
-   while (isspace(*s))
-   s++;
-   if (!*s)
-   continue;
+   while (strbuf_getline(, f, '\n') != EOF) {
+   const char *v;
 
-   p = s + strlen(s);
-   while (isspace(p[-1]))
-   *--p = 0;
+   strbuf_rtrim();
 
-   switch (value_list) {
-   case 0:
-   add_url_alias(remote, xstrdup(s));
-   break;
-   case 1:
-   add_push_refspec(remote, xstrdup(s));
-   break;
-   case 2:
-   add_fetch_refspec(remote, xstrdup(s));
-   break;
-   }
+   if (skip_prefix(buf.buf, "URL:", ))
+   add_url_alias(remote, xstrdup(skip_spaces(v)));
+   else if (skip_prefix(buf.buf, "Push:", ))
+   add_push_refspec(remote, xstrdup(skip_spaces(v)));
+   else if (skip_prefix(buf.buf, "Pull:", ))
+   add_fetch_refspec(remote, xstrdup(skip_spaces(v)));
}
+   strbuf_release();
fclose(f);
 }
 
-- 
2.6.0.rc2.408.ga2926b9

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 33/67] read_branches_file: replace strcpy with xstrdup

2015-09-17 Thread Jeff King
On Wed, Sep 16, 2015 at 04:42:26PM -0400, Jeff King wrote:

> > We use buffer[BUFSIZ] to read various things in this file, not just
> > $GIT_DIR/branches/* files, with fgets(), which may be better done if
> > we switched to strbuf_getline().  Then we can also use trim family
> > of calls from the strbuf API suite.
> >
> > Move to strbuf_getline() may be a doubly attractive proposition,
> > with a possible change to strbuf_getline() to make it also remove CR
> > that immediately precedes LF [*1*], helping DOSsy platforms.
> 
> I'll take a look and see how painful this is.

I think that this already works, because we strip whitespace from the
right side of the buffer (and isspace('\r') returns true). Likewise, I
don't think we need to teach strbuf_getline() about CRs for the same
reason (I agree it would be a good thing to do in general, but I stopped
short here).

Here is the patch I ended up with:

-- >8 --
Subject: [PATCH] read_branches_file: simplify string handling

This function does a lot of manual string handling, and has
some unnecessary limits. This patch cleans up a number of
things:

  1. Drop the arbitrary 1000-byte limit on the size of the
 remote name (we do not have such a limit in any of the
 other remote-reading mechanisms).

  2. Replace fgets into a fixed-size buffer with a strbuf,
 eliminating any limits on the length of the URL.

 This uses strbuf_read_file for simplicity. Technically
 this behavior is different than the original, as we
 will read the whole file content, whereas the original
 simply ignored subsequent lines.

 Given that branches files are supposed to be one line,
 this doesn't matter in practice (and arguably including
 the extra lines is better, as it will probably cause
 the URL to be bogus and bring attention to the issue).

  3. Replace manual whitespace handling with strbuf_trim
 (since we now have a strbuf). This also gets rid
 of a call to strcpy, and the confusing reuse of the "p"
 pointer for multiple purposes.

  4. We currently build up the refspecs over multiple strbuf
 calls. We do this to handle the fact that the URL "frag"
 may not be present. But rather than have multiple
 conditionals, let's just default "frag" to "master".
 This lets us format the refspecs with a single xstrfmt.
 It's shorter, and easier to see what the final string
 looks like.

 We also update the misleading comment in this area (the
 local branch is named after the remote name, not after
 the branch name on the remote side).

Signed-off-by: Jeff King 
---
I guess (2) could be wrong if people are counting on adding arbitrary
comment lines to the end of a "branches" file, but AFAIK that has never
been a thing.

 remote.c | 54 +++---
 1 file changed, 19 insertions(+), 35 deletions(-)

diff --git a/remote.c b/remote.c
index 5ab0f7f..2bef5a4 100644
--- a/remote.c
+++ b/remote.c
@@ -293,56 +293,40 @@ static void read_remotes_file(struct remote *remote)
 static void read_branches_file(struct remote *remote)
 {
char *frag;
-   struct strbuf branch = STRBUF_INIT;
-   int n = 1000;
-   FILE *f = fopen(git_path("branches/%.*s", n, remote->name), "r");
-   char *s, *p;
-   int len;
+   struct strbuf buf = STRBUF_INIT;
 
-   if (!f)
+   if (strbuf_read_file(, git_path("branches/%s", remote->name), 0) < 
0)
return;
-   s = fgets(buffer, BUF_SIZE, f);
-   fclose(f);
-   if (!s)
-   return;
-   while (isspace(*s))
-   s++;
-   if (!*s)
+
+   strbuf_trim();
+   if (!buf.len) {
+   strbuf_release();
return;
+   }
+
remote->origin = REMOTE_BRANCHES;
-   p = s + strlen(s);
-   while (isspace(p[-1]))
-   *--p = 0;
-   len = p - s;
-   p = xmalloc(len + 1);
-   strcpy(p, s);
 
/*
 * The branches file would have URL and optionally
 * #branch specified.  The "master" (or specified) branch is
-* fetched and stored in the local branch of the same name.
+* fetched and stored in the local branch matching the
+* remote name.
 */
-   frag = strchr(p, '#');
-   if (frag) {
+   frag = strchr(buf.buf, '#');
+   if (frag)
*(frag++) = '\0';
-   strbuf_addf(, "refs/heads/%s", frag);
-   } else
-   strbuf_addstr(, "refs/heads/master");
+   else
+   frag = "master";
+
+   add_url_alias(remote, strbuf_detach(, NULL));
+   add_fetch_refspec(remote, xstrfmt("refs/heads/%s:refs/heads/%s",
+ frag, remote->name));
 
-   strbuf_addf(, ":refs/heads/%s", remote->name);
-   add_url_alias(remote, p);
-   add_fetch_refspec(remote, strbuf_detach(, NULL));
/*
 * Cogito compatible push: push current 

Re: [PATCH 33/67] read_branches_file: replace strcpy with xstrdup

2015-09-17 Thread Jeff King
On Thu, Sep 17, 2015 at 08:38:32AM -0700, Junio C Hamano wrote:

> > In some of the cases, as you've seen, I dug further in cleaning things
> > up. But in others I did the minimal fix (especially in this case, the
> > limitations are only about the deprecated "branches" and "remotes"
> > file), mostly to try to keep the scope of work sane.
> 
> That is sensible.  As long as the result of conversion is easier to
> audit (which is the primary focus of this series), I'd agree that we
> should stop there, instead of making further changes.
> 
> The last thing we would want to do is to change the behaviour,
> especially to unintentionally start rejecting what we have always
> accepted, while doing a "code clean-up".  Letting these sleeping
> dogs lie is the safest.  That various distros lag behind our release
> schedule means that we may not hear about regression until a year
> after we break it for a feature used by minority of users.

Yeah, that was my thinking. Since I _did_ end up doing the cleanup and
posted it earlier, please feel free to review and express an opinion on
the original versus the cleanup.

I'm on the fence.  I do think the cleaned-up version is much nicer, but
I always worry about the risk of touching little-used code.

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 33/67] read_branches_file: replace strcpy with xstrdup

2015-09-17 Thread Junio C Hamano
Jeff King  writes:

> On Wed, Sep 16, 2015 at 12:52:26PM -0700, Junio C Hamano wrote:
>
>> > diff --git a/remote.c b/remote.c
>> > index 5ab0f7f..1b69751 100644
>> > --- a/remote.c
>> > +++ b/remote.c
>> > @@ -297,7 +297,6 @@ static void read_branches_file(struct remote *remote)
>> >int n = 1000;
>> >FILE *f = fopen(git_path("branches/%.*s", n, remote->name), "r");
>> >char *s, *p;
>> > -  int len;
>> 
>> Hmm, we would punish those with ridiculously long remote name by
>> truncating at n but that is OK.
>
> Yeah, though that is nothing new.
>
> In some of the cases, as you've seen, I dug further in cleaning things
> up. But in others I did the minimal fix (especially in this case, the
> limitations are only about the deprecated "branches" and "remotes"
> file), mostly to try to keep the scope of work sane.

That is sensible.  As long as the result of conversion is easier to
audit (which is the primary focus of this series), I'd agree that we
should stop there, instead of making further changes.

The last thing we would want to do is to change the behaviour,
especially to unintentionally start rejecting what we have always
accepted, while doing a "code clean-up".  Letting these sleeping
dogs lie is the safest.  That various distros lag behind our release
schedule means that we may not hear about regression until a year
after we break it for a feature used by minority of users.

Thanks.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 33/67] read_branches_file: replace strcpy with xstrdup

2015-09-17 Thread Junio C Hamano
Jeff King  writes:

> Yeah, that was my thinking. Since I _did_ end up doing the cleanup and
> posted it earlier, please feel free to review and express an opinion on
> the original versus the cleanup.
>
> I'm on the fence.  I do think the cleaned-up version is much nicer, but
> I always worry about the risk of touching little-used code.

True.

With s/strbuf_read_file()/strbuf_getline()/ on the first one I think
there is no regression (at least, that is the only deliberate
regression I saw).

I found all other parts of the cleaned-up version much nicer, too.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 33/67] read_branches_file: replace strcpy with xstrdup

2015-09-16 Thread Jeff King
On Wed, Sep 16, 2015 at 12:52:26PM -0700, Junio C Hamano wrote:

> > diff --git a/remote.c b/remote.c
> > index 5ab0f7f..1b69751 100644
> > --- a/remote.c
> > +++ b/remote.c
> > @@ -297,7 +297,6 @@ static void read_branches_file(struct remote *remote)
> > int n = 1000;
> > FILE *f = fopen(git_path("branches/%.*s", n, remote->name), "r");
> > char *s, *p;
> > -   int len;
> 
> Hmm, we would punish those with ridiculously long remote name by
> truncating at n but that is OK.

Yeah, though that is nothing new.

In some of the cases, as you've seen, I dug further in cleaning things
up. But in others I did the minimal fix (especially in this case, the
limitations are only about the deprecated "branches" and "remotes"
file), mostly to try to keep the scope of work sane.

> We use buffer[BUFSIZ] to read various things in this file, not just
> $GIT_DIR/branches/* files, with fgets(), which may be better done if
> we switched to strbuf_getline().  Then we can also use trim family
> of calls from the strbuf API suite.
>
> Move to strbuf_getline() may be a doubly attractive proposition,
> with a possible change to strbuf_getline() to make it also remove CR
> that immediately precedes LF [*1*], helping DOSsy platforms.

I'll take a look and see how painful this is.

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 33/67] read_branches_file: replace strcpy with xstrdup

2015-09-16 Thread Junio C Hamano
Jeff King  writes:

> This code is exactly replicating strdup, so let's just use
> that. It's shorter, and eliminates some confusion (such as
> whether "p - s" is really enough to hold the result; it is,
> because we write NULs as we shrink "p").
>
> Signed-off-by: Jeff King 
> ---
>  remote.c | 5 +
>  1 file changed, 1 insertion(+), 4 deletions(-)
>
> diff --git a/remote.c b/remote.c
> index 5ab0f7f..1b69751 100644
> --- a/remote.c
> +++ b/remote.c
> @@ -297,7 +297,6 @@ static void read_branches_file(struct remote *remote)
>   int n = 1000;
>   FILE *f = fopen(git_path("branches/%.*s", n, remote->name), "r");
>   char *s, *p;
> - int len;

Hmm, we would punish those with ridiculously long remote name by
truncating at n but that is OK.

We use buffer[BUFSIZ] to read various things in this file, not just
$GIT_DIR/branches/* files, with fgets(), which may be better done if
we switched to strbuf_getline().  Then we can also use trim family
of calls from the strbuf API suite.

Move to strbuf_getline() may be a doubly attractive proposition,
with a possible change to strbuf_getline() to make it also remove CR
that immediately precedes LF [*1*], helping DOSsy platforms.


[Reference]

*1* http://thread.gmane.org/gmane.comp.version-control.msysgit/21773/focus=21780



>  
>   if (!f)
>   return;
> @@ -313,9 +312,7 @@ static void read_branches_file(struct remote *remote)
>   p = s + strlen(s);
>   while (isspace(p[-1]))
>   *--p = 0;
> - len = p - s;
> - p = xmalloc(len + 1);
> - strcpy(p, s);
> + p = xstrdup(s);
>  
>   /*
>* The branches file would have URL and optionally
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 33/67] read_branches_file: replace strcpy with xstrdup

2015-09-15 Thread Jeff King
This code is exactly replicating strdup, so let's just use
that. It's shorter, and eliminates some confusion (such as
whether "p - s" is really enough to hold the result; it is,
because we write NULs as we shrink "p").

Signed-off-by: Jeff King 
---
 remote.c | 5 +
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/remote.c b/remote.c
index 5ab0f7f..1b69751 100644
--- a/remote.c
+++ b/remote.c
@@ -297,7 +297,6 @@ static void read_branches_file(struct remote *remote)
int n = 1000;
FILE *f = fopen(git_path("branches/%.*s", n, remote->name), "r");
char *s, *p;
-   int len;
 
if (!f)
return;
@@ -313,9 +312,7 @@ static void read_branches_file(struct remote *remote)
p = s + strlen(s);
while (isspace(p[-1]))
*--p = 0;
-   len = p - s;
-   p = xmalloc(len + 1);
-   strcpy(p, s);
+   p = xstrdup(s);
 
/*
 * The branches file would have URL and optionally
-- 
2.6.0.rc2.408.ga2926b9

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html