Re: [PATCH 15/67] convert trivial sprintf / strcpy calls to xsnprintf
On Tue, Sep 15, 2015 at 11:36 AM, Jeff Kingwrote: > We sometimes sprintf into static buffers when we know that > the size of the buffer is large enough to fit the input > (either because it's a constant, or because it's numeric > input that is bounded in size). Likewise with strcpy of > constant strings. > > However, these sites make it hard to audit sprintf and > strcpy calls for buffer overflows, as a reader has to > cross-reference the size of the array with the input. Let's > use xsnprintf instead, which communicates to a reader that > we don't expect this to overflow (and catches the mistake in > case we do). > > Signed-off-by: Jeff King > --- > diff --git a/builtin/merge-index.c b/builtin/merge-index.c > index 1a1eafa..1d66111 100644 > --- a/builtin/merge-index.c > +++ b/builtin/merge-index.c > @@ -23,7 +23,7 @@ static int merge_entry(int pos, const char *path) > break; > found++; > strcpy(hexbuf[stage], sha1_to_hex(ce->sha1)); > - sprintf(ownbuf[stage], "%o", ce->ce_mode); > + xsnprintf(ownbuf[stage], sizeof(ownbuf[stage]), "%o", > ce->ce_mode); Interesting. I wonder if there are any (old/broken) compilers which would barf on this. If we care, perhaps sizeof(ownbuf[0]) instead? > arguments[stage] = hexbuf[stage]; > arguments[stage + 4] = ownbuf[stage]; > } while (++pos < active_nr); -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 26/67] replace trivial malloc + sprintf /strcpy calls to xstrfmt
On Tue, Sep 15, 2015 at 11:45 AM, Jeff Kingwrote: > replace trivial malloc + sprintf /strcpy calls to xstrfmt s/to/with/ Also, do you want either to add a space after '/' or drop the one before it? > It's a common pattern to do: > > foo = xmalloc(strlen(one) + strlen(two) + 1 + 1); > sprintf(foo, "%s %s", one, two); > > (or possibly some variant with strcpy()s or a more > complicated length computation). We can switch these to use > xstrfmt, which is shorter, involves less error-prone manual > computation, and removes many sprintf and strcpy calls which > make it harder to audit the code for real buffer overflows. > > Signed-off-by: Jeff King > --- > --- a/imap-send.c > +++ b/imap-send.c > @@ -889,9 +889,8 @@ static char *cram(const char *challenge_64, const char > *user, const char *pass) > } > > /* response: " " */ > - resp_len = strlen(user) + 1 + strlen(hex) + 1; > - response = xmalloc(resp_len); > - sprintf(response, "%s %s", user, hex); > + response = xstrfmt("%s %s", user, hex); > + resp_len = strlen(response); > > response_64 = xmalloc(ENCODED_SIZE(resp_len) + 1); The original resp_len calculation included the NUL but the revised does not. If I'm reading this correctly, the revised calculation is correct, and the original was over-allocating response_64, right? > encoded_len = EVP_EncodeBlock((unsigned char *)response_64, -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] strtoul_ui: actually report error in case of negative input
On Tue, Sep 15, 2015 at 08:50:03AM +0200, Matthieu Moy wrote: > I think it would be better to just return a long to avoid needless > limitations, but changing the argument to "long" would interfer with > in-flight topics. Not worth the trouble. Sure. > > One potential issue with your patch is that you're forbidding the > interval [2^31, 2^32[ which was previously allowed, both on 32 and 64 > bits. I'm not sure whether we have a use for this in the codebase. As far as I could see it was used only for file modes. Which does not need that big numbers. > This alternative patch is rather ugly to, but I think it is less > limiting and does not have the "large negative wrapped to positive" > issue: > > --- a/git-compat-util.h > +++ b/git-compat-util.h > @@ -814,6 +814,9 @@ static inline int strtoul_ui(char const *s, int base, > unsigned int *result) > char *p; > > errno = 0; > + /* negative values would be accepted by strtoul */ > + if (strchr(s, '-')) > + return -1; > ul = strtoul(s, , base); > if (errno || *p || p == s || (unsigned int) ul != ul) > return -1; > > What do you think? Explicit rejection of '-' is of course useful addition. I still find "(unsigned int) ul != ul" bad. As far as I understand it makes no sense for i386. And even for 64-bit it's too obscure. In form of "(ul & 0xL) == 0" it would be more clear. Or just make explicit comparison with intended limit, like I did. Well, actually I don't have strong preferences as long as "make -C t" does not alarm me with things I did not break. Maybe somebody else will comment more. -- Max -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 29/67] use strip_suffix and xstrfmt to replace suffix
On Tue, Sep 15, 2015 at 11:47 AM, Jeff Kingwrote: > When we want to convert "foo.pack" to "foo.idx", we do it by > duplicating the original string and then munging the bytes > in place. Let's use strip_suffix and xstrfmt instead, which > has several advantages: > > 1. It's more clear what the intent is. > > 2. It does not implicitly rely on the fact that > strlen(".idx") <= strlen(".pack") to avoid an overflow. > > 3. We communicate the assumption that the input file ends > with ".pack" (and get a run-time check that this is so). > > 4. We drop calls to strcpy, which makes auditing the code > base easier. > > Likewise, we can do this to convert ".pack" to ".bitmap", > avoiding some manual memory computation. > > Signed-off-by: Jeff King > --- > diff --git a/http.c b/http.c > index 7b02259..e0ff876 100644 > --- a/http.c > +++ b/http.c > @@ -1511,6 +1511,7 @@ int finish_http_pack_request(struct http_pack_request > *preq) > struct packed_git **lst; > struct packed_git *p = preq->target; > char *tmp_idx; > + size_t len; > struct child_process ip = CHILD_PROCESS_INIT; > const char *ip_argv[8]; > > @@ -1524,9 +1525,9 @@ int finish_http_pack_request(struct http_pack_request > *preq) > lst = &((*lst)->next); > *lst = (*lst)->next; > > - tmp_idx = xstrdup(preq->tmpfile); > - strcpy(tmp_idx + strlen(tmp_idx) - strlen(".pack.temp"), > - ".idx.temp"); > + if (!strip_suffix(preq->tmpfile, ".pack.temp", )) > + die("BUG: pack tmpfile does not end in .pack.temp?"); > + tmp_idx = xstrfmt("%.*s.idx.temp", (int)len, preq->tmpfile); These instances of repeated replacement code may argue in favor of a general purpose replace_suffix() function: char *replace_suffix(const char *s, const char *old, const char *new) { size_t n; if (!strip_suffix(s, old, )) die("BUG: '%s' does not end with '%s', s, old); return xstrfmt("%.*s%s", (int)n, s, new); } or something. > ip_argv[0] = "index-pack"; > ip_argv[1] = "-o"; > diff --git a/pack-bitmap.c b/pack-bitmap.c > index 637770a..7dfcb34 100644 > --- a/pack-bitmap.c > +++ b/pack-bitmap.c > @@ -252,16 +252,11 @@ static int load_bitmap_entries_v1(struct bitmap_index > *index) > > static char *pack_bitmap_filename(struct packed_git *p) > { > - char *idx_name; > - int len; > - > - len = strlen(p->pack_name) - strlen(".pack"); > - idx_name = xmalloc(len + strlen(".bitmap") + 1); > - > - memcpy(idx_name, p->pack_name, len); > - memcpy(idx_name + len, ".bitmap", strlen(".bitmap") + 1); > + size_t len; > > - return idx_name; > + if (!strip_suffix(p->pack_name, ".pack", )) > + die("BUG: pack_name does not end in .pack"); > + return xstrfmt("%.*s.bitmap", (int)len, p->pack_name); > } > > static int open_pack_bitmap_1(struct packed_git *packfile) > diff --git a/sha1_file.c b/sha1_file.c > index 28352a5..88996f0 100644 > --- a/sha1_file.c > +++ b/sha1_file.c > @@ -671,13 +671,15 @@ static int check_packed_git_idx(const char *path, > struct packed_git *p) > int open_pack_index(struct packed_git *p) > { > char *idx_name; > + size_t len; > int ret; > > if (p->index_data) > return 0; > > - idx_name = xstrdup(p->pack_name); > - strcpy(idx_name + strlen(idx_name) - strlen(".pack"), ".idx"); > + if (!strip_suffix(p->pack_name, ".pack", )) > + die("BUG: pack_name does not end in .pack"); > + idx_name = xstrfmt("%.*s.idx", (int)len, p->pack_name); > ret = check_packed_git_idx(idx_name, p); > free(idx_name); > return ret; > -- > 2.6.0.rc2.408.ga2926b9 -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 15/67] convert trivial sprintf / strcpy calls to xsnprintf
Ramsay Joneswrites: > How about using strlcpy() instead? Thus: > > - strcpy(header.name, "pax_global_header"); > + strlcpy(header.name, "pax_global_header", sizeof(header.name)); > > Ditto for other similar (strcpy->xsnprintf) hunks below. Please do not advocate use of strlcpy(), which substitutes overwriting beyond the end of the buffer (which is a bug) with a silent truncation (which is almost always another bug, unless in a very narrow case of producing a non-essential string result where loss of information does not matter). -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 08/67] add reentrant variants of sha1_to_hex and find_unique_abbrev
Jeff Kingwrites: >> Hmm, I haven't read any other patches yet (including those which use these >> new '_to' functions), but I can't help feeling they should be named something >> like 'sha1_to_hex_str()' and 'find_unique_abbrev_str()' instead. i.e. I >> don't get >> the '_to' thing - not that I'm any good at naming things ... > > I meant it as a contrast with their original. sha1_to_hex() formats into > an internal buffer and returns it. But sha1_to_hex_to() formats "to" a > buffer of your choice. I think that naming makes perfect sense. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 07/67] strbuf: make strbuf_complete_line more generic
Eric Sunshinewrites: >> +static inline void strbuf_complete(struct strbuf *sb, char term) >> +{ >> + if (sb->len && sb->buf[sb->len - 1] != term) >> + strbuf_addch(sb, term); >> +} > > Hmm, so this only adds 'term' if not already present *and* if 'sb' is > not empty, which doesn't seem to match the documentation which says > that it "ensures" termination. > > But, is that reasonable behavior? Intuitively, I'd expect 'term' to be > added when 'sb' is empty: > > if (!sb->len || sb->buf[sb->len - 1] != term) > strbuf_addch(sb, term); > > strbuf_complete_line()'s existing behavior of not adding '\n' to an > empty string may have been intentional, but actually smells like a > bug. I would expect two different scenarios for which this function would be useful. One is when dealing with a text file and want to avoid incomplete lines at the end. In this scenario, an empty file with zero lines should be left as-is, instead of getting turned into a file with one empty line. "Leave the empty input as-is" is the behaviour the callers want. The other is when you are given a directory name in the strbuf, you have a name of a file you want to be in that directory, and want to have the full path to the file in the strbuf. In this scenario, what does it mean for the caller to give you an empty "directory name"? I think at least in our codebase, that almost always would mean that "the path is relative to $CWD", i.e. you would want to see the "complete" to leave the input intact and then append the filename there. So to these two plausible and different set of callers that would be helped by this function, the behaviour Peff gives it would match what the callers want better than your version. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/67] war on sprintf, strcpy, etc
Jeff Kingwrites: > Obviously this is not intended for v2.6.0. But all of the spots touched > here are relatively quiet right now, so I wanted to get it out onto the > list. There are a few minor conflicts against "pu", but they're all > just from touching nearby lines. Thanks. Looks like a lot of good work you did ;-) -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] strtoul_ui: actually report error in case of negative input
Matthieu Moywrites: > Not just the return type (which is the error status), but also the type > of the result argument indeed. It's not clear to me whether this is > intentional (09f2825 (git-grep: don't use sscanf, 2007-03-12) introduced > it, the commit message doesn't help). I first read strtoul_ui as > "strtoul with a better UI (user interface)", but maybe the name was > meant to say "a fuction that uses strtoul and returns an ui (unsigned > int)". Just for this part. Yes, ui does not mean user interface but "we are grabbing an unsigned int and as its internal implementation we happen to use strtoul" is where the name comes from. > I went through the thread quickly, my understanding is that there were > more work to do, but no objection to merging. Yes, there were some in-flight topics that interfered with it and the topic quickly went stale without getting rerolled. There wasn't any fundamental issue with the topic itself. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: "Medium" log format: change proposal for author != committer
"Robin H. Johnson"writes: > Specifically, if the author is NOT the same as the committer, then > display both in the header. Otherwise continue to display only the > author. I too found myself wanting to see both of the names sometimes, and the "fuller" format was added explicitly for that purpose. Even though I agree "show only one, and both only when they are different" is a reasonable and possibly useful format, it is out of question to change what "--pretty=medium" does. It has been with us forever and people and their scripts do rely on it. It would be good if we can say $ git log --pretty=robinsformat but with a better name to show such an output. Having said that, I'm moderately negative about adding it as yet another hard-coded format. We simply have too many, and we do not need one more. What we need instead is a flexible framework to let users get what they want. I think what needs to happen is: * Enhance the "--pretty=format:" thing so that the current set of hardcoded --pretty=medium,short,... formats and your modified "medium" can be expressed as a custom format string. * Introduce a configuration mechanism to allow users to define new short-hand, e.g. if you have this in your $HOME/.gitconfig: [pretty "robin"] format = "commit %H%nAuthor: %an <%ae>%n..." and run "git log --pretty=robin", it would behave as if you said "git log --pretty="format:commit %H%nAuthor: %an <%ae>%n...". * (optional) Replace the hardcoded implementations of pretty formats with short-hand names like "medium", "short", etc. with a built-in set of pretty.$name.format using the configuration mechanism. But we need to make sure this does not hurt performance for common cases. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v5] remote: add get-url subcommand
Expanding `insteadOf` is a part of ls-remote --url and there is no way to expand `pushInsteadOf` as well. Add a get-url subcommand to be able to query both as well as a way to get all configured urls. Signed-off-by: Ben Boeckel--- Documentation/git-remote.txt | 10 builtin/remote.c | 59 t/t5505-remote.sh| 39 + 3 files changed, 108 insertions(+) diff --git a/Documentation/git-remote.txt b/Documentation/git-remote.txt index 4c6d6de..3c9bf45 100644 --- a/Documentation/git-remote.txt +++ b/Documentation/git-remote.txt @@ -15,6 +15,7 @@ SYNOPSIS 'git remote remove' 'git remote set-head' (-a | --auto | -d | --delete | ) 'git remote set-branches' [--add] ... +'git remote get-url' [--push] [--all] 'git remote set-url' [--push] [] 'git remote set-url --add' [--push] 'git remote set-url --delete' [--push] @@ -131,6 +132,15 @@ The named branches will be interpreted as if specified with the With `--add`, instead of replacing the list of currently tracked branches, adds to that list. +'get-url':: + +Retrieves the URLs for a remote. Configurations for `insteadOf` and +`pushInsteadOf` are expanded here. By default, only the first URL is listed. ++ +With '--push', push URLs are queried rather than fetch URLs. ++ +With '--all', all URLs for the remote will be listed. + 'set-url':: Changes URLs for the remote. Sets first URL for remote that matches diff --git a/builtin/remote.c b/builtin/remote.c index 181668d..e4c3ea1 100644 --- a/builtin/remote.c +++ b/builtin/remote.c @@ -18,6 +18,7 @@ static const char * const builtin_remote_usage[] = { N_("git remote prune [-n | --dry-run] "), N_("git remote [-v | --verbose] update [-p | --prune] [( | )...]"), N_("git remote set-branches [--add] ..."), + N_("git remote get-url [--push] [--all] "), N_("git remote set-url [--push] []"), N_("git remote set-url --add "), N_("git remote set-url --delete "), @@ -65,6 +66,11 @@ static const char * const builtin_remote_update_usage[] = { NULL }; +static const char * const builtin_remote_geturl_usage[] = { + N_("git remote get-url [--push] [--all] "), + NULL +}; + static const char * const builtin_remote_seturl_usage[] = { N_("git remote set-url [--push] []"), N_("git remote set-url --add "), @@ -1467,6 +1473,57 @@ static int set_branches(int argc, const char **argv) return set_remote_branches(argv[0], argv + 1, add_mode); } +static int get_url(int argc, const char **argv) +{ + int i, push_mode = 0, all_mode = 0; + const char *remotename = NULL; + struct remote *remote; + const char **url; + int url_nr; + struct option options[] = { + OPT_BOOL('\0', "push", _mode, +N_("query push URLs rather than fetch URLs")), + OPT_BOOL('\0', "all", _mode, +N_("return all URLs")), + OPT_END() + }; + argc = parse_options(argc, argv, NULL, options, builtin_remote_geturl_usage, 0); + + if (argc != 1) + usage_with_options(builtin_remote_geturl_usage, options); + + remotename = argv[0]; + + if (!remote_is_configured(remotename)) + die(_("No such remote '%s'"), remotename); + remote = remote_get(remotename); + + url_nr = 0; + if (push_mode) { + url = remote->pushurl; + url_nr = remote->pushurl_nr; + } + /* else fetch mode */ + + /* Use the fetch URL when no push URLs were found or requested. */ + if (!url_nr) { + url = remote->url; + url_nr = remote->url_nr; + } + + if (!url_nr) + die(_("no URLs configured for remote '%s'"), remotename); + + if (all_mode) { + for (i = 0; i < url_nr; i++) + printf_ln("%s", url[i]); + } else { + printf_ln("%s", *url); + } + + return 0; +} + static int set_url(int argc, const char **argv) { int i, push_mode = 0, add_mode = 0, delete_mode = 0; @@ -1576,6 +1633,8 @@ int cmd_remote(int argc, const char **argv, const char *prefix) result = set_head(argc, argv); else if (!strcmp(argv[0], "set-branches")) result = set_branches(argc, argv); + else if (!strcmp(argv[0], "get-url")) + result = get_url(argc, argv); else if (!strcmp(argv[0], "set-url")) result = set_url(argc, argv); else if (!strcmp(argv[0], "show")) diff --git a/t/t5505-remote.sh b/t/t5505-remote.sh index 7a8499c..f03ba4c 100755 --- a/t/t5505-remote.sh +++ b/t/t5505-remote.sh @@ -919,6 +919,21 @@ test_expect_success 'new remote' ' cmp expect actual ' +get_url_test () { + cat >expect && + test_expect_success
Re: [PATCH 07/67] strbuf: make strbuf_complete_line more generic
On Tue, Sep 15, 2015 at 11:25 AM, Jeff Kingwrote: > The strbuf_complete_line function make sure that a buffer s/make/makes/ > ends in a newline. But we may want to do this for any > character (e.g., "/" on the end of a path). Let's factor out > a generic version, and keep strbuf_complete_line as a thin > wrapper. > > Signed-off-by: Jeff King > --- > +/** > + * Ensure that `sb` ends with the character `term`, if it does not > + * already. > + */ > +static inline void strbuf_complete(struct strbuf *sb, char term) > +{ > + if (sb->len && sb->buf[sb->len - 1] != term) > + strbuf_addch(sb, term); > +} Hmm, so this only adds 'term' if not already present *and* if 'sb' is not empty, which doesn't seem to match the documentation which says that it "ensures" termination. But, is that reasonable behavior? Intuitively, I'd expect 'term' to be added when 'sb' is empty: if (!sb->len || sb->buf[sb->len - 1] != term) strbuf_addch(sb, term); strbuf_complete_line()'s existing behavior of not adding '\n' to an empty string may have been intentional, but actually smells like a bug. > + > +/** > + * Ensure that `sb` ends with a newline. > + */ > static inline void strbuf_complete_line(struct strbuf *sb) > { > - if (sb->len && sb->buf[sb->len - 1] != '\n') > - strbuf_addch(sb, '\n'); > + strbuf_complete(sb, '\n'); > } -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 10/67] mailsplit: make PATH_MAX buffers dynamic
On Tue, Sep 15, 2015 at 11:28 AM, Jeff Kingwrote: > There are several static PATH_MAX-sized buffers in > mailsplit, along with some questionable uses of sprintf. > These are not really of security interest, as local > mailsplit pathnames are not typically under control of an > attacker. But it does not hurt to be careful, and as a > bonus we lift some limits for systems with too-small > PATH_MAX varibles. > > Signed-off-by: Jeff King > --- > diff --git a/builtin/mailsplit.c b/builtin/mailsplit.c > index 9de06e3..fb0bc08 100644 > --- a/builtin/mailsplit.c > +++ b/builtin/mailsplit.c > @@ -148,8 +155,7 @@ static int maildir_filename_cmp(const char *a, const char > *b) > static int split_maildir(const char *maildir, const char *dir, > int nr_prec, int skip) > { > - char file[PATH_MAX]; > - char name[PATH_MAX]; > + struct strbuf file = STRBUF_INIT; > FILE *f = NULL; > int ret = -1; > int i; > @@ -161,20 +167,25 @@ static int split_maildir(const char *maildir, const > char *dir, > goto out; > > for (i = 0; i < list.nr; i++) { > - snprintf(file, sizeof(file), "%s/%s", maildir, > list.items[i].string); > - f = fopen(file, "r"); > + char *name; > + > + strbuf_reset(); > + strbuf_addf(, "%s/%s", maildir, list.items[i].string); > + > + f = fopen(file.buf, "r"); > if (!f) { > - error("cannot open mail %s (%s)", file, > strerror(errno)); > + error("cannot open mail %s (%s)", file.buf, > strerror(errno)); > goto out; > } > > if (strbuf_getwholeline(, f, '\n')) { > - error("cannot read mail %s (%s)", file, > strerror(errno)); > + error("cannot read mail %s (%s)", file.buf, > strerror(errno)); > goto out; > } > > - sprintf(name, "%s/%0*d", dir, nr_prec, ++skip); > + name = xstrfmt("%s/%0*d", dir, nr_prec, ++skip); > split_one(f, name, 1); > + free(name); Hmm, why does 'file' become a strbuf which is re-used each time through the loop, but 'name' is treated differently and gets re-allocated upon each iteration? Why doesn't 'name' deserve the same treatment as 'file'? > fclose(f); > f = NULL; > @@ -184,6 +195,7 @@ static int split_maildir(const char *maildir, const char > *dir, > out: > if (f) > fclose(f); > + strbuf_release(); > string_list_clear(, 1); > return ret; > } > @@ -191,7 +203,6 @@ out: > static int split_mbox(const char *file, const char *dir, int allow_bare, > int nr_prec, int skip) > { > - char name[PATH_MAX]; > int ret = -1; > int peek; > > @@ -218,8 +229,9 @@ static int split_mbox(const char *file, const char *dir, > int allow_bare, > } > > while (!file_done) { > - sprintf(name, "%s/%0*d", dir, nr_prec, ++skip); > + char *name = xstrfmt("%s/%0*d", dir, nr_prec, ++skip); > file_done = split_one(f, name, allow_bare); > + free(name); Same question, pretty much: Why not make 'name' a strbuf which is re-used in the loop? (I don't have a strong preference; I'm just trying to understand the apparent inconsistency of treatment.) > } > > if (f != stdin) > -- > 2.6.0.rc2.408.ga2926b9 -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 11/67] trace: use strbuf for quote_crnl output
On Tue, Sep 15, 2015 at 11:28 AM, Jeff Kingwrote: > When we output GIT_TRACE_SETUP paths, we quote any > meta-characters. But our buffer to hold the result is only > PATH_MAX bytes, and we could double the size of the input > path (if every character needs quoted). We could use a s/quoted/to be &/ ...or... s/quoted/quoting/ > 2*PATH_MAX buffer, if we assume the input will never be more > than PATH_MAX. But it's easier still to just switch to a > strbuf and not worry about whether the input can exceed > PATH_MAX or not. > > Signed-off-by: Jeff King > --- > diff --git a/trace.c b/trace.c > index 7393926..0c06d71 100644 > --- a/trace.c > +++ b/trace.c > @@ -277,25 +277,25 @@ void trace_performance_fl(const char *file, int line, > uint64_t nanos, > > static const char *quote_crnl(const char *path) > { > - static char new_path[PATH_MAX]; > + static struct strbuf new_path = STRBUF_INIT; > const char *p2 = path; > - char *p1 = new_path; It's a little sad that this leaves a variable named 'p2' when there is no corresponding 'p1'. Would this deserve a cleanup patch which renames 'p2' to 'p' or do we not care enough? > if (!path) > return NULL; > > + strbuf_reset(_path); > + > while (*p2) { > switch (*p2) { > - case '\\': *p1++ = '\\'; *p1++ = '\\'; break; > - case '\n': *p1++ = '\\'; *p1++ = 'n'; break; > - case '\r': *p1++ = '\\'; *p1++ = 'r'; break; > + case '\\': strbuf_addstr(_path, ""); break; > + case '\n': strbuf_addstr(_path, "\\n"); break; > + case '\r': strbuf_addstr(_path, "\\r"); break; > default: > - *p1++ = *p2; > + strbuf_addch(_path, *p2); > } > p2++; > } > - *p1 = '\0'; > - return new_path; > + return new_path.buf; > } > > /* FIXME: move prefix to startup_info struct and get rid of this arg */ > -- > 2.6.0.rc2.408.ga2926b9 -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [Feature Request] git blame showing only revisions from git rev-list --first-parent
Jeff Kingwrites: > It seems like nobody is actually that interested in what "blame > --first-parent --reverse" does in the first place, though, and there's > no reason for its complexity to hold up vanilla --first-parent. So what > do you think of: I like the part that explicitly disables the combination of the two ;-) -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
"Medium" log format: change proposal for author != committer
Hi, I want to propose a change to the 'medium' log output format, to improve readability. Specifically, if the author is NOT the same as the committer, then display both in the header. Otherwise continue to display only the author. This would aid quick review of changes in git-log & git-show output. -- Robin Hugh Johnson Gentoo Linux: Developer, Infrastructure Lead E-Mail : robb...@gentoo.org GnuPG FP : 11ACBA4F 4778E3F6 E4EDF38E B27B944E 34884E85 -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 2/2] git-p4: handle "Translation of file content failed"
On 15/09/15 16:38, Lars Schneider wrote: On 15 Sep 2015, at 08:43, Luke Diamandwrote: Do we know the mechanism by which we end up in this state? Unfortunately no. I tried hard to reproduce the error with “conventional” methods. As you can see I ended up manipulating the P4 database… However, it looks like this error happens in the wild, too: https://stackoverflow.com/questions/5156909/translation-of-file-content-failed-error-in-perforce https://stackoverflow.com/questions/887006/perforce-translation-of-file-content-failed-error It's described in the Perforce FAQ here: http://answers.perforce.com/articles/KB/3117 i.e. it looks to be caused by mixing old and new P4 clients. Known issue: This works only if git-p4 is executed in verbose mode. In normal mode no exceptions are thrown and git-p4 just exits. Does that mean that the error will only be detected in verbose mode? That doesn't seem right! Correct. I don’t like this either but I also don’t want to make huge changes to git-p4. You can see the root problem here: https://github.com/git/git/blob/97d7ad75b6fe74960d2a12e4a9151a55a5a87d6d/git-p4.py#L110-L114 Any idea how to approach that best? I guess what we have is not ideal but probably good enough. +try: +text = p4_read_pipe(['print', '-q', '-o', '-', '%s@%s' % (file['depotFile'], file['change'])]) +except Exception as e: Would it be better to specify which kind of Exception you are catching? Looks like you could get OSError, ValueError and CalledProcessError; it's the last of these you want (I think). I think it is just a plain exception. See here: https://github.com/git/git/blob/97d7ad75b6fe74960d2a12e4a9151a55a5a87d6d/git-p4.py#L111 OK, you're right (probably less than ideal behaviour from read_pipe() and die() but let's not try to fix that). +if p4_version_string().find('/NT') >= 0: +text = text.replace('\r\n', '\n') +contents = [ text ] The indentation on this bit doesn't look right to me. I believe it is exactly how it was: https://github.com/git/git/blob/97d7ad75b6fe74960d2a12e4a9151a55a5a87d6d/git-p4.py#L2397-L2399 OK. In general, what is the appropriate way to reference code in this email list? Are GitHub links OK? I'm not an expert, but it feels possibly a bit ephemeral - if someone is digging through email archives in a future where that github project has been moved elsewhere, the links will all be dead. Luke -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: Git Deployment using existing multiple environments
Hi, please stop top-posting. It is quite irritating by now. Thank you. On 2015-09-15 08:50, Sukhwinder Singh wrote: > Now lets say we set up a repository at github which has the latest > code (all test code)., Now at each of our own servers we already have > existing code, that is Test, UAT and Live. For example, first he'll > pull code from github to our Test Server and then move branches to UAT > and then to Live. Can it work? Yes, it can work. The major problem here will be that you have only a single, central repository that every developer has write access to, but you expect only a single trusted person to deploy that code via three different stages. That can be tricky, in particular if your trusted person is a newbie. > If it can work then can I please get some some example commands or the > procedure to set it up? Sorry, this feels a bit too much like "could you please do all my work for me?" to me. And if I really provide exact commands here, I may even be liable when it does not work out in the end? I am not going to do that. Instead, I will strongly suggest to learn enough about Git to answer those quite simple questions ("How do I pull?", "What is a pull?", "How do I update a test machine's working directory with the newest branch tip?") yourself. I usually recommend https://try.github.io/ > Time is a bit of problem right now or I would have read book suggested by > Johannes. I have searched on the internet but couldn't find any similar case. If I was in your shoes, I would spend the time *now*, rather than quite possibly spending 10x as much time later when I have to clean up a mess. In my experience, what looks like the cheap route (copying commands without understanding them, really), invariably turns out to be the most expensive route possible. In any case, this was as much useful feedback as I had to give to your questions. Ciao, Johannes -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: What's cooking in git.git (Sep 2015, #03; Mon, 14)
From: "Junio C Hamano"Sent: Monday, September 14, 2015 11:43 PM -- [New Topics] * po/doc-branch-desc (2015-09-14) 1 commit (merged to 'next' on 2015-09-14 at 4934a96) + doc: show usage of branch description The branch descriptions that are set with "git branch --edit-description" option were used in many places but they weren't clearly documented. Will merge to 'master'. Thanks. Shall I just rework/resend the V2 patch-up ($gmane/277829) that also links to 'merge's' usage as a fresh patch (would be tonight UK)? -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 2/2] git-p4: handle "Translation of file content failed"
On 14/09/15 17:55, larsxschnei...@gmail.com wrote: From: Lars SchneiderA P4 repository can get into a state where it contains a file with type UTF-16 that does not contain a valid UTF-16 BOM. If git-p4 Sorry - what's a BOM? I'm assuming it's not a Bill of Materials? Do we know the mechanism by which we end up in this state? attempts to retrieve the file then the process crashes with a "Translation of file content failed" error. Fix this by detecting this error and retrieving the file as binary instead. The result in Git is the same. Known issue: This works only if git-p4 is executed in verbose mode. In normal mode no exceptions are thrown and git-p4 just exits. Does that mean that the error will only be detected in verbose mode? That doesn't seem right! Signed-off-by: Lars Schneider --- git-p4.py | 27 --- 1 file changed, 16 insertions(+), 11 deletions(-) diff --git a/git-p4.py b/git-p4.py index 073f87b..5ae25a6 100755 --- a/git-p4.py +++ b/git-p4.py @@ -134,13 +134,11 @@ def read_pipe(c, ignore_error=False): sys.stderr.write('Reading pipe: %s\n' % str(c)) expand = isinstance(c,basestring) -p = subprocess.Popen(c, stdout=subprocess.PIPE, shell=expand) -pipe = p.stdout -val = pipe.read() -if p.wait() and not ignore_error: -die('Command failed: %s' % str(c)) - -return val +p = subprocess.Popen(c, stdout=subprocess.PIPE, stderr=subprocess.PIPE, shell=expand) +(out, err) = p.communicate() +if p.returncode != 0 and not ignore_error: +die('Command failed: %s\nError: %s' % (str(c), err)) +return out def p4_read_pipe(c, ignore_error=False): real_cmd = p4_build_cmd(c) @@ -2186,10 +2184,17 @@ class P4Sync(Command, P4UserMap): # them back too. This is not needed to the cygwin windows version, # just the native "NT" type. # -text = p4_read_pipe(['print', '-q', '-o', '-', "%s@%s" % (file['depotFile'], file['change']) ]) -if p4_version_string().find("/NT") >= 0: -text = text.replace("\r\n", "\n") -contents = [ text ] +try: +text = p4_read_pipe(['print', '-q', '-o', '-', '%s@%s' % (file['depotFile'], file['change'])]) +except Exception as e: Would it be better to specify which kind of Exception you are catching? Looks like you could get OSError, ValueError and CalledProcessError; it's the last of these you want (I think). +if 'Translation of file content failed' in str(e): +type_base = 'binary' +else: +raise e +else: +if p4_version_string().find('/NT') >= 0: +text = text.replace('\r\n', '\n') +contents = [ text ] The indentation on this bit doesn't look right to me. if type_base == "apple": # Apple filetype files will be streamed as a concatenation of Luke -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v5 1/7] git-p4: add optional type specifier to gitConfig reader
On 14/09/15 14:26, larsxschnei...@gmail.com wrote: From: Lars SchneiderThe functions “gitConfig” and “gitConfigBool” are almost identical. Make “gitConfig” more generic by adding an optional type specifier. Use the type specifier “—bool” with “gitConfig” to implement “gitConfigBool. This prepares the implementation of other type specifiers such as “—int”. Looks good to me, Ack. Signed-off-by: Lars Schneider --- git-p4.py | 12 ++-- 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/git-p4.py b/git-p4.py index 073f87b..c139cab 100755 --- a/git-p4.py +++ b/git-p4.py @@ -604,9 +604,12 @@ def gitBranchExists(branch): _gitConfig = {} -def gitConfig(key): +def gitConfig(key, typeSpecifier=None): if not _gitConfig.has_key(key): -cmd = [ "git", "config", key ] +cmd = [ "git", "config" ] +if typeSpecifier: +cmd += [ typeSpecifier ] +cmd += [ key ] s = read_pipe(cmd, ignore_error=True) _gitConfig[key] = s.strip() return _gitConfig[key] @@ -617,10 +620,7 @@ def gitConfigBool(key): in the config.""" if not _gitConfig.has_key(key): -cmd = [ "git", "config", "--bool", key ] -s = read_pipe(cmd, ignore_error=True) -v = s.strip() -_gitConfig[key] = v == "true" +_gitConfig[key] = gitConfig(key, '--bool') == "true" return _gitConfig[key] def gitConfigList(key): -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v7] git-p4: improve path encoding verbose output
On 15/09/15 08:31, Luke Diamand wrote: On 14/09/15 18:10, larsxschnei...@gmail.com wrote: It would be better to query this once at startup. Otherwise we're potentially forking "git config" twice per file which on a large repo could become significant. Make it an instance variable perhaps? This is of course complete nonsense since gitConfig caches its results! -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v7] git-p4: improve path encoding verbose output
On 14/09/15 18:10, larsxschnei...@gmail.com wrote: From: Lars SchneiderIf a path with non-ASCII characters is detected then print always the s/print always/print/ encoding and the encoded string in verbose mode. Signed-off-by: Lars Schneider --- git-p4.py | 19 +-- 1 file changed, 9 insertions(+), 10 deletions(-) diff --git a/git-p4.py b/git-p4.py index d45cf2b..da25d3f 100755 --- a/git-p4.py +++ b/git-p4.py @@ -2220,16 +2220,15 @@ class P4Sync(Command, P4UserMap): text = regexp.sub(r'$\1$', text) contents = [ text ] -if gitConfig("git-p4.pathEncoding"): -relPath = relPath.decode(gitConfig("git-p4.pathEncoding")).encode('utf8', 'replace') -elif self.verbose: -try: -relPath.decode('ascii') -except: -print ( -"Path with Non-ASCII characters detected and no path encoding defined. " -"Please check the encoding: %s" % relPath -) +try: +relPath.decode('ascii') +except: +encoding = 'utf8' +if gitConfig('git-p4.pathEncoding'): +encoding = gitConfig('git-p4.pathEncoding') It would be better to query this once at startup. Otherwise we're potentially forking "git config" twice per file which on a large repo could become significant. Make it an instance variable perhaps? +relPath = relPath.decode(encoding).encode('utf8', 'replace') +if self.verbose: +print 'Path with non-ASCII characters detected. Used %s to encode: %s ' % (encoding, relPath) self.gitStream.write("M %s inline %s\n" % (git_mode, relPath)) -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: Git Deployment using existing multiple environments
Thank you very much for the detailed reply. There is going to be just one person who'll manage all the moving of the code. On approval he'll move code from one environment to other on our own servers (right now it is being done by manual merging). Development team is not that big right now. They'll be committing all new code to the git repository we have setup up at github using all over test environment code. And the person handling the code at our server will move code from one environment to other by USING Git and not by manual merge. Now lets say we set up a repository at github which has the latest code (all test code)., Now at each of our own servers we already have existing code, that is Test, UAT and Live. For example, first he'll pull code from github to our Test Server and then move branches to UAT and then to Live. Can it work? If it can work then can I please get some some example commands or the procedure to set it up? Time is a bit of problem right now or I would have read book suggested by Johannes. I have searched on the internet but couldn't find any similar case. Regards, Sukhwinder Singh > From: jacob.kel...@gmail.com > Date: Mon, 14 Sep 2015 01:32:39 -0700 > Subject: Re: Git Deployment using existing multiple environments > To: php_programmer_in...@hotmail.com > CC: johannes.schinde...@gmx.de; git@vger.kernel.org > > On Sun, Sep 13, 2015 at 10:55 PM, Sukhwinder Singh >wrote: >> Thank you for the reply. Let's say I do setup three different repositories >> then how can we move work from one repository to the other. For example, >> from Test Environment to UAT. If there are any links that you can provide me >> that I can check, it'll be great. >> >> Regards, >> Sukhwinder Singh >> > > Generally speaking there are two ways of moving work from one > repository to another. The first is the "pull" where you request data > from a remote repository and then merge that data into your own. This > is what you're doing when you perform a clone, a fetch, or a pull. > It's what everyone does all the time when working with a local copy of > a "trusted" remote repository. It can also be done between two > "trusted" remotes, if your workflow is more distributed. (ie: more > than one "official" source). > > The second form of moving work is the "push" where you upload your > work into another repository. This is most commonly used when the > workflow is "centralized". By that I mean there is a single > authoritative repository. Or when you are moving your own work on a > local machine into a remotely accessible machine for others to pull > from. > > As Johannes said above, you really need to determine the work flow and > team style you want before you can really understand the best way to > setup repositories. For example, if you setup using a distributed > chain of command, you can have one person be the "maintainer" of each > given trusted repository. Then, maintainers can pull (or > equivalent-ly, pull-request) between each other. This is generally how > a project would work when using github. One person is the maintainer, > then a developer "forks" the project, makes some changes, then > requests that the maintainer pull these changes. The maintainer has > final say and will perform the final merge in cases of conflict. In > addition, maintainer is the one who says "this is ok to go into this > repository". > > You can also instead opt to use a single centralized repository. Thus, > developers would work on code and get it ready to submit, and then > simply perform a push. If the push requires a merge git will tell the > user to update. There are many tools such as server side hooks in > order to enforce various behaviors. > > This flow generally doesn't use sole maintainers, as each developer > has access to push directly. It may work well for smaller teams or for > dedicated teams who don't change developers often. > > A lot comes down to how your team is structured. Do you have one > person who's job can be to maintain the repository? Do you have > several developers who don't want to be the sole owner? Is your team > willing to function much more distributed? > > In the end, it's generally always a good idea to designate at least > one repository as the "authority" so that everyone knows where to look > for release tags and other such data. > > Myself, I would say that I prefer to use the pull-request model so > that code gets more review, as "push" based models tend not to do > review. (Exception: Gerrit, but this uses "git push" on the command > line to do something very much not like a push) > > Regards, > Jake -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] strtoul_ui: actually report error in case of negative input
[ Cc-ing Michael Haggerty who wrote the numparse module ] Max Kirillovwrites: > On Mon, Sep 14, 2015 at 08:30:54AM +0200, Matthieu Moy wrote: >>> Fix it by changing the last check to trigger earlier, as soon as it >>> becomes bigger than INT_MAX. >> >> What if the value is actually greater than INT_MAX? The function is >> returning an unsigned long (64 bits on 64bits architectures), and your >> version is restricting it to integers smaller than 2^31, right? > > the return type of the function is "int", so this is not > going to work anyway. Not just the return type (which is the error status), but also the type of the result argument indeed. It's not clear to me whether this is intentional (09f2825 (git-grep: don't use sscanf, 2007-03-12) introduced it, the commit message doesn't help). I first read strtoul_ui as "strtoul with a better UI (user interface)", but maybe the name was meant to say "a fuction that uses strtoul and returns an ui (unsigned int)". I think it would be better to just return a long to avoid needless limitations, but changing the argument to "long" would interfer with in-flight topics. Not worth the trouble. One potential issue with your patch is that you're forbidding the interval [2^31, 2^32[ which was previously allowed, both on 32 and 64 bits. I'm not sure whether we have a use for this in the codebase. This alternative patch is rather ugly to, but I think it is less limiting and does not have the "large negative wrapped to positive" issue: --- a/git-compat-util.h +++ b/git-compat-util.h @@ -814,6 +814,9 @@ static inline int strtoul_ui(char const *s, int base, unsigned int *result) char *p; errno = 0; + /* negative values would be accepted by strtoul */ + if (strchr(s, '-')) + return -1; ul = strtoul(s, , base); if (errno || *p || p == s || (unsigned int) ul != ul) return -1; What do you think? > As I mentioned, some negative values are still accepted > as coresponding mod 2**32 positive numbers (-3221225472 as > 1073741824), so there really is room for improvement, but it > cannot be accomplished just by examining strtoul output. On 64 bits architectures, it's not as bad: you need to go really far in the negatives to wrap to positive values. > I saw in the list archives an attempt to abandon the > function in favor of more accurate parser [1], but seems > like it did not make it into the project. > > [1] http://thread.gmane.org/gmane.comp.version-control.git/265635 I went through the thread quickly, my understanding is that there were more work to do, but no objection to merging. Michael, any plan to resurect the topic? -- Matthieu Moy http://www-verimag.imag.fr/~moy/ -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 15/67] convert trivial sprintf / strcpy calls to xsnprintf
On 15/09/15 16:36, Jeff King wrote: > We sometimes sprintf into static buffers when we know that > the size of the buffer is large enough to fit the input > (either because it's a constant, or because it's numeric > input that is bounded in size). Likewise with strcpy of > constant strings. > > However, these sites make it hard to audit sprintf and > strcpy calls for buffer overflows, as a reader has to > cross-reference the size of the array with the input. Let's > use xsnprintf instead, which communicates to a reader that > we don't expect this to overflow (and catches the mistake in > case we do). > > Signed-off-by: Jeff King> --- > These are all pretty trivial; the obvious thing to get wrong is that > "sizeof(buf)" is not the correct length if "buf" is a pointer. I > considered a macro wrapper like: > > #define xsnprintf_array(dst, fmt, ...) \ > xsnprintf(dst, sizeof(dst) + BARF_UNLESS_AN_ARRAY(dst), \ > fmt, __VA_ARGS__) > > but obviously that requires variadic macro support. > > archive-tar.c | 2 +- > builtin/gc.c | 2 +- > builtin/init-db.c | 11 ++- > builtin/ls-tree.c | 9 + > builtin/merge-index.c | 2 +- > builtin/merge-recursive.c | 2 +- > builtin/read-tree.c | 2 +- > builtin/unpack-file.c | 2 +- > compat/mingw.c| 8 +--- > compat/winansi.c | 2 +- > connect.c | 2 +- > convert.c | 3 ++- > daemon.c | 4 ++-- > diff.c| 12 ++-- > http-push.c | 2 +- > http.c| 6 +++--- > ll-merge.c| 12 ++-- > refs.c| 8 > sideband.c| 4 ++-- > strbuf.c | 4 ++-- > 20 files changed, 52 insertions(+), 47 deletions(-) > > diff --git a/archive-tar.c b/archive-tar.c > index b6b30bb..d543f93 100644 > --- a/archive-tar.c > +++ b/archive-tar.c > @@ -301,7 +301,7 @@ static int write_global_extended_header(struct > archiver_args *args) > memset(, 0, sizeof(header)); > *header.typeflag = TYPEFLAG_GLOBAL_HEADER; > mode = 0100666; > - strcpy(header.name, "pax_global_header"); > + xsnprintf(header.name, sizeof(header.name), "pax_global_header"); How about using strlcpy() instead? Thus: - strcpy(header.name, "pax_global_header"); + strlcpy(header.name, "pax_global_header", sizeof(header.name)); Ditto for other similar (strcpy->xsnprintf) hunks below. ATB, Ramsay Jones > prepare_header(args, , mode, ext_header.len); > write_blocked(, sizeof(header)); > write_blocked(ext_header.buf, ext_header.len); > diff --git a/builtin/gc.c b/builtin/gc.c > index 0ad8d30..57584bc 100644 > --- a/builtin/gc.c > +++ b/builtin/gc.c > @@ -194,7 +194,7 @@ static const char *lock_repo_for_gc(int force, pid_t* > ret_pid) > return NULL; > > if (gethostname(my_host, sizeof(my_host))) > - strcpy(my_host, "unknown"); > + xsnprintf(my_host, sizeof(my_host), "unknown"); > > pidfile_path = git_pathdup("gc.pid"); > fd = hold_lock_file_for_update(, pidfile_path, > diff --git a/builtin/init-db.c b/builtin/init-db.c > index 69323e1..e7d0e31 100644 > --- a/builtin/init-db.c > +++ b/builtin/init-db.c > @@ -262,7 +262,8 @@ static int create_default_files(const char *template_path) > } > > /* This forces creation of new config file */ > - sprintf(repo_version_string, "%d", GIT_REPO_VERSION); > + xsnprintf(repo_version_string, sizeof(repo_version_string), > + "%d", GIT_REPO_VERSION); > git_config_set("core.repositoryformatversion", repo_version_string); > > path[len] = 0; > @@ -414,13 +415,13 @@ int init_db(const char *template_dir, unsigned int > flags) >*/ > if (shared_repository < 0) > /* force to the mode value */ > - sprintf(buf, "0%o", -shared_repository); > + xsnprintf(buf, sizeof(buf), "0%o", -shared_repository); > else if (shared_repository == PERM_GROUP) > - sprintf(buf, "%d", OLD_PERM_GROUP); > + xsnprintf(buf, sizeof(buf), "%d", OLD_PERM_GROUP); > else if (shared_repository == PERM_EVERYBODY) > - sprintf(buf, "%d", OLD_PERM_EVERYBODY); > + xsnprintf(buf, sizeof(buf), "%d", OLD_PERM_EVERYBODY); > else > - die("oops"); > + die("BUG: invalid value for shared_repository"); > git_config_set("core.sharedrepository", buf); > git_config_set("receive.denyNonFastforwards", "true"); > } > diff --git a/builtin/ls-tree.c b/builtin/ls-tree.c > index 3b04a0f..0e30d86 100644 > --- a/builtin/ls-tree.c > +++ b/builtin/ls-tree.c > @@ -96,12 +96,13 @@ static int
Re: [PATCH 15/67] convert trivial sprintf / strcpy calls to xsnprintf
On Tue, Sep 15, 2015 at 07:32:29PM +0100, Ramsay Jones wrote: > > diff --git a/archive-tar.c b/archive-tar.c > > index b6b30bb..d543f93 100644 > > --- a/archive-tar.c > > +++ b/archive-tar.c > > @@ -301,7 +301,7 @@ static int write_global_extended_header(struct > > archiver_args *args) > > memset(, 0, sizeof(header)); > > *header.typeflag = TYPEFLAG_GLOBAL_HEADER; > > mode = 0100666; > > - strcpy(header.name, "pax_global_header"); > > + xsnprintf(header.name, sizeof(header.name), "pax_global_header"); > > How about using strlcpy() instead? Thus: > > - strcpy(header.name, "pax_global_header"); > + strlcpy(header.name, "pax_global_header", sizeof(header.name)); > > Ditto for other similar (strcpy->xsnprintf) hunks below. That misses the "assert" behavior of xsnprintf. We are preventing overflow here, but also truncation. What should happen if "pax_global_header" does not fit in header.name? I think complaining loudly and immediately is the most helpful thing, because it is surely a programming error. We could make xstrlcpy(), of course, but I don't see much point when xsnprintf does the same thing (and more). -Peff -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: What's cooking in git.git (Sep 2015, #03; Mon, 14)
Oops.. From: "Philip Oakley"[New Topics] * po/doc-branch-desc (2015-09-14) 1 commit (merged to 'next' on 2015-09-14 at 4934a96) + doc: show usage of branch description The branch descriptions that are set with "git branch --edit-description" option were used in many places but they weren't clearly documented. Will merge to 'master'. Thanks. Shall I just rework/resend the V2 patch-up ($gmane/277829) that also links to 'merge's' usage as a fresh patch (would be tonight UK)? I now see that the full V2 patch is already there at 4934a96. I'd mistakenly just compared your note to the slightly fuller V2 commit message and in the morning rush hadn't had time to check. Sorry for the noise. Philip -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] doc: show usage of branch description
From: "Junio C Hamano""Philip Oakley" writes: It still means that my patch is incomplete in its aim to bring out these possible broader usages. I haven't yet looked at the mail archives to see if there is more around the time of those introductions. I guess this is largely my fault, but I think "git grep" is an easier source of truth to work with than the list archive. It eventually boils down to branch.*.description configuration and all users of that would call read_branch_desc(), so if you check callers of that helper function and see which commit introduced that call for what purpose ("blame" is your friend), you would know how they use the information under what condition. $ git grep -n read_branch_desc -- \*.c branch.c:143:int read_branch_desc(struct strbuf *buf, const char *branch_name) builtin/branch.c:771: read_branch_desc(, branch_name); builtin/fmt-merge-msg.c:211:if (!read_branch_desc(, name)) { builtin/log.c:888: read_branch_desc(, branch_name); $ git blame -L210,212 -s builtin/fmt-merge-msg.c 898eacd8 210) 898eacd8 211) if (!read_branch_desc(, name)) { 898eacd8 212) const char *bp = desc.buf; $ git show -s 898eacd8 commit 898eacd8ada2d012f977948350ed60845e238037 Author: Junio C Hamano [...]> Signed-off-by: Junio C Hamano etc. etc. --- Thanks. That was very useful seeing a few more of the options in combination. That, combined with the updated G4W, is a lot better/faster. I've also searched for: $ git grep -n "\.description" -- \*.sh which only came up with git-request-pull.sh:74: ! git config "branch.$branch_name.description" >/dev/null git-request-pull.sh:156: git config "branch.$branch_name.description" as relevant hits: Sometimes one can be a bit feart to try out some command options.. Philip -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 04/67] fsck: don't fsck alternates for connectivity-only check
Hi Peff, On 2015-09-15 17:24, Jeff King wrote: > Commit 02976bf (fsck: introduce `git fsck --connectivity-only`, > 2015-06-22) recently gave fsck an option to perform only a > subset of the checks, by skipping the fsck_object_dir() > call. However, it does so only for the local object > directory, and we still do expensive checks on any alternate > repos. We should skip them in this case, too. > > Signed-off-by: Jeff KingACK! Sorry for missing this spot. Dscho -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 22/67] entry.c: convert strcpy to xsnprintf
On 15/09/15 16:40, Jeff King wrote: > This particular conversion is non-obvious, because nobody > has passed our function the length of the destination > buffer. However, the interface to checkout_entry specifies > that the buffer must be at least TEMPORARY_FILENAME_LENGTH > bytes long, so we can check that (meaning the existing code > was not buggy, but merely worrisome to somebody reading it). > > Signed-off-by: Jeff King> --- > entry.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/entry.c b/entry.c > index 1eda8e9..582c400 100644 > --- a/entry.c > +++ b/entry.c > @@ -96,8 +96,8 @@ static int open_output_fd(char *path, const struct > cache_entry *ce, int to_tempf > { > int symlink = (ce->ce_mode & S_IFMT) != S_IFREG; > if (to_tempfile) { > - strcpy(path, symlink > -? ".merge_link_XX" : ".merge_file_XX"); > + xsnprintf(path, TEMPORARY_FILENAME_LENGTH, "%s", > + symlink ? ".merge_link_XX" : > ".merge_file_XX"); > return mkstemp(path); > } else { > return create_file(path, !symlink ? ce->ce_mode : 0666); Hmm, I was going to suggest strlcpy() again. However, if you expect an overflow to occur, then xsnprintf() will at least bring it to your attention! Checking for overflow with strlcpy() is not rocket science either, and I guess we could add xstrlcpy() ... :-D dunno. ATB, Ramsay Jones -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] doc: show usage of branch description
From: "Junio C Hamano""Philip Oakley" writes: It still means that my patch is incomplete in its aim to bring out these possible broader usages. I haven't yet looked at the mail archives to see if there is more around the time of those introductions. I guess this is largely my fault, but I think "git grep" is an easier source of truth to work with than the list archive. It eventually boils down to branch.*.description configuration and all users of that would call read_branch_desc(), so if you check callers of that helper function and see which commit introduced that call for what purpose ("blame" is your friend), you would know how they use the information under what condition. $ git grep -n read_branch_desc -- \*.c branch.c:143:int read_branch_desc(struct strbuf *buf, const char *branch_name) builtin/branch.c:771: read_branch_desc(, branch_name); builtin/fmt-merge-msg.c:211:if (!read_branch_desc(, name)) { builtin/log.c:888: read_branch_desc(, branch_name); $ git blame -L210,212 -s builtin/fmt-merge-msg.c 898eacd8 210) 898eacd8 211) if (!read_branch_desc(, name)) { 898eacd8 212) const char *bp = desc.buf; $ git show -s 898eacd8 commit 898eacd8ada2d012f977948350ed60845e238037 Author: Junio C Hamano [...]> Signed-off-by: Junio C Hamano etc. etc. --- Thanks. That was very useful seeing a few more of the options in combination. That, combined with the updated G4W, is a lot better/faster. Sometimes one can be a bit feart to try out some command options.. I've also searched for: $ git grep -n "\.description" -- \*.sh which only came up with git-request-pull.sh:74: ! git config "branch.$branch_name.description" >/dev/null git-request-pull.sh:156: git config "branch.$branch_name.description" as relevant hits. Philip -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 15/67] convert trivial sprintf / strcpy calls to xsnprintf
On 15/09/15 19:42, Jeff King wrote: > On Tue, Sep 15, 2015 at 07:32:29PM +0100, Ramsay Jones wrote: > >>> diff --git a/archive-tar.c b/archive-tar.c >>> index b6b30bb..d543f93 100644 >>> --- a/archive-tar.c >>> +++ b/archive-tar.c >>> @@ -301,7 +301,7 @@ static int write_global_extended_header(struct >>> archiver_args *args) >>> memset(, 0, sizeof(header)); >>> *header.typeflag = TYPEFLAG_GLOBAL_HEADER; >>> mode = 0100666; >>> - strcpy(header.name, "pax_global_header"); >>> + xsnprintf(header.name, sizeof(header.name), "pax_global_header"); >> How about using strlcpy() instead? Thus: >> >> -strcpy(header.name, "pax_global_header"); >> +strlcpy(header.name, "pax_global_header", sizeof(header.name)); >> >> Ditto for other similar (strcpy->xsnprintf) hunks below. > That misses the "assert" behavior of xsnprintf. We are preventing > overflow here, but also truncation. What should happen if > "pax_global_header" does not fit in header.name? I think complaining > loudly and immediately is the most helpful thing, because it is surely a > programming error. > > We could make xstrlcpy(), of course, but I don't see much point when > xsnprintf does the same thing (and more). Heh, I just sent an email about patch 22/67 which says similar things. I don't feel too strongly, either way, but I have a slight preference for the use of [x]strlcpy() in these cases. I have to stop at patch #22 for now. ATB, Ramsay Jones -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: Git Deployment using existing multiple environments
--- > Date: Tue, 15 Sep 2015 09:48:34 +0200 > From: johannes.schinde...@gmx.de > To: php_programmer_in...@hotmail.com > CC: jacob.kel...@gmail.com; git@vger.kernel.org > Subject: RE: Git Deployment using existing multiple environments > > Hi, > > please stop top-posting. It is quite irritating by now. Thank you. Sorry about that. Haven't used mailing lists since almost a decade. Forgot how it should be used. Thank you very much everyone for replying. The advice from everyone is much appreciated. Regards, Sukhwinder Singh -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [bug] .git/hooks/commit-msg.sample test is reversed
Federico Fissorewrites: > Hello everyone > > In file commit-msg.sample, grep test is reversed. It greps > '^Signed-off-by: ' > while it should grep > 'Signed-off-by: ' > > If you run the sample against attached file, it won't complain. Remove > the ^ and it will work as expected I think the test is correct. In grep, ^ can have two meanings: '^foo' means "foo at the start of a line" '[^abc]' means "one character but not a, b or c" The Signed-off-by: trailer is meant to be at the start of a line, hence the ^. -- Matthieu Moy http://www-verimag.imag.fr/~moy/ -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [Feature Request] git blame showing only revisions from git rev-list --first-parent
On Sun, Sep 13, 2015 at 10:19:33PM -0700, Junio C Hamano wrote: > For that matter, I am not sure how "blame A..B" with first-parent & > reverse should behave when A is not an ancestor on the first-parent > chain. Wouldn't we try to find a cut-point on the first-parent chain by > traversing first-parent chain from B and painting them as positive, > while traversing _all_ parents from B and painting them as negative, > until the traversal intersect? And wouldn't we discover at least two > children (one positive and one negative) for the cut point we discover > by that traversal? That cut point would be the (fake) latest state the > blame traversal starts at, and then we try to use the first (fake) parent > that in real life is the first child (which we do not have a good definition > for). And at that point a simple panda brain explodes ;-) > > We might end up doing the right thing even in that case, but I haven't > convinced myself about that (yet). If the change were limited to "blame", > the change may be much less problematic. Good point. It seems like nobody is actually that interested in what "blame --first-parent --reverse" does in the first place, though, and there's no reason for its complexity to hold up vanilla --first-parent. So what do you think of: -- >8 -- Subject: [PATCH] blame: handle --first-parent The revision.c options-parser will parse "--first-parent" for us, but the blame code does not actually respect it, as we simply iterate over the whole list returned by first_scapegoat(). We can fix this by returning a truncated parent list. Note that we could technically also do so by limiting the return value of num_scapegoats(), but that is less robust. We would rely on nobody ever looking at the "next" pointer from the returned list. Combining "--reverse" with "--first-parent" is more complicated, and will probably involve cooperation from revision.c. Since the desired semantics are not even clear, let's punt on this for now, but explicitly disallow it to avoid confusing users (this is not really a regression, since it did something nonsensical before). Signed-off-by: Jeff King--- builtin/blame.c | 11 ++- t/annotate-tests.sh | 4 2 files changed, 14 insertions(+), 1 deletion(-) diff --git a/builtin/blame.c b/builtin/blame.c index 4db01c1..ae4301c 100644 --- a/builtin/blame.c +++ b/builtin/blame.c @@ -1366,8 +1366,15 @@ static void pass_whole_blame(struct scoreboard *sb, */ static struct commit_list *first_scapegoat(struct rev_info *revs, struct commit *commit) { - if (!reverse) + if (!reverse) { + if (revs->first_parent_only && + commit->parents && + commit->parents->next) { + free_commit_list(commit->parents->next); + commit->parents->next = NULL; + } return commit->parents; + } return lookup_decoration(>children, >object); } @@ -2680,6 +2687,8 @@ parse_done: } else if (contents_from) die("--contents and --children do not blend well."); + else if (revs.first_parent_only) + die("combining --first-parent and --reverse is not supported"); else { final_commit_name = prepare_initial(); sb.commits.compare = compare_commits_by_reverse_commit_date; diff --git a/t/annotate-tests.sh b/t/annotate-tests.sh index f5c0175..b1673b3 100644 --- a/t/annotate-tests.sh +++ b/t/annotate-tests.sh @@ -111,6 +111,10 @@ test_expect_success 'blame 2 authors + 2 merged-in authors' ' check_count A 2 B 1 B1 2 B2 1 ' +test_expect_success 'blame --first-parent blames merge for branch1' ' + check_count --first-parent A 2 B 1 "A U Thor" 2 B2 1 +' + test_expect_success 'blame ancestor' ' check_count -h master A 2 B 2 ' -- 2.6.0.rc2.408.ga2926b9 -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Developing- Where to Start
Breanna Devore-McDonaldwrites: > Hello all, Hello, > I'm a third year Computer Science student at the University of Notre > Dame, and for the final project of my Data Structures class, my group > and I have to find a way to contribute our (hopefully) newly-found > data structures and optimization knowledge to a well-known open source > project. We thought Git would be a great project to work on, but we > have no idea where to start. I am a teacher and I offer my students a similar project. I maintain a list of small projects as source of inspiration for students: https://git.wiki.kernel.org/index.php/SmallProjectsIdeas Some advices before you get started: * Start with a very, very small contribution, as soon as possible. The list of microprojects from GSoC is a good place to start, and the easiest projects in the page above can be of interest too. This way, you will: - Understand how submitting contributions work (if you haven't done so, read https://github.com/git/git/blob/master/Documentation/SubmittingPatches ) - Get an idea of how productive you are when working with the Git codebase. * Even for the actual project, don't try something hard. The difficulty here is not to write the first draft, but to get it properly tested, reviewed, to respond to reviewers, and finally get the code accepted into git.git. Unlike traditional school projects, you can't anticipate difficulties (no teacher wrote the correct version for you in advance...), and the quality standard is much higher than what you're probably used to. A rough heuristics: estimate how long you are going to take for your project, multiply by 2 or 3, and you're still getting an underestimation. As a comparison, for Google summer of code, a very good student works full-time for 2 months and writes about 2000 to 3000 lines of code. BTW, how long is your project? How many students in your group? * Interact with the mailing-list as much as you can and as early as you can. What you want to avoid is to write a large amount of code and get it reviewed at the end of your project. That would almost certainly result in requests for changes that you wouldn't have time to apply. OK, that may all be a bit discourraging ;-). But it shouldn't: it's _really_ very interesting to contribute to a project like Git. -- Matthieu Moy http://www-verimag.imag.fr/~moy/ -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[bug] .git/hooks/commit-msg.sample test is reversed
Hello everyone In file commit-msg.sample, grep test is reversed. It greps '^Signed-off-by: ' while it should grep 'Signed-off-by: ' If you run the sample against attached file, it won't complain. Remove the ^ and it will work as expected Regards Federico Commit message Signed-off-by: MeSigned-off-by: Me
[PATCH 19/67] stop_progress_msg: convert sprintf to xsnprintf
The usual arguments for using xsnprintf over sprintf apply, but this case is a little tricky. We print to a static buffer if we have room, and otherwise to an allocated buffer. So there should be no overflow here, but it is still good to communicate our intention, as well as to check our earlier math for how much space the string will need. Signed-off-by: Jeff King--- progress.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/progress.c b/progress.c index a3efcfd..353bd37 100644 --- a/progress.c +++ b/progress.c @@ -254,7 +254,7 @@ void stop_progress_msg(struct progress **p_progress, const char *msg) throughput_string(>display, tp->curr_total, rate); } progress_update = 1; - sprintf(bufp, ", %s.\n", msg); + xsnprintf(bufp, len + 1, ", %s.\n", msg); display(progress, progress->last_value, bufp); if (buf != bufp) free(bufp); -- 2.6.0.rc2.408.ga2926b9 -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 2/2] git-p4: handle "Translation of file content failed"
On 15 Sep 2015, at 08:43, Luke Diamandwrote: > On 14/09/15 17:55, larsxschnei...@gmail.com wrote: >> From: Lars Schneider >> >> A P4 repository can get into a state where it contains a file with >> type UTF-16 that does not contain a valid UTF-16 BOM. If git-p4 > > Sorry - what's a BOM? I'm assuming it's not a Bill of Materials? BOM stands for Byte Order Mark. The UTF-16 BOM is a two byte sequence at the beginning of a UTF-16 file. It is not part of the actual content. It is only used to define the encoding of the remaining file. FEFF stands for UTF-16 big-endian encoding and FFFE for little-endian encoding. More info here: http://www.unicode.org/faq/utf_bom.html#bom1 > Do we know the mechanism by which we end up in this state? Unfortunately no. I tried hard to reproduce the error with “conventional” methods. As you can see I ended up manipulating the P4 database… However, it looks like this error happens in the wild, too: https://stackoverflow.com/questions/5156909/translation-of-file-content-failed-error-in-perforce https://stackoverflow.com/questions/887006/perforce-translation-of-file-content-failed-error >> attempts to retrieve the file then the process crashes with a >> "Translation of file content failed" error. >> >> Fix this by detecting this error and retrieving the file as binary >> instead. The result in Git is the same. >> >> Known issue: This works only if git-p4 is executed in verbose mode. >> In normal mode no exceptions are thrown and git-p4 just exits. > > Does that mean that the error will only be detected in verbose mode? That > doesn't seem right! Correct. I don’t like this either but I also don’t want to make huge changes to git-p4. You can see the root problem here: https://github.com/git/git/blob/97d7ad75b6fe74960d2a12e4a9151a55a5a87d6d/git-p4.py#L110-L114 Any idea how to approach that best? > >> >> Signed-off-by: Lars Schneider >> --- >> git-p4.py | 27 --- >> 1 file changed, 16 insertions(+), 11 deletions(-) >> >> diff --git a/git-p4.py b/git-p4.py >> index 073f87b..5ae25a6 100755 >> --- a/git-p4.py >> +++ b/git-p4.py >> @@ -134,13 +134,11 @@ def read_pipe(c, ignore_error=False): >> sys.stderr.write('Reading pipe: %s\n' % str(c)) >> >> expand = isinstance(c,basestring) >> -p = subprocess.Popen(c, stdout=subprocess.PIPE, shell=expand) >> -pipe = p.stdout >> -val = pipe.read() >> -if p.wait() and not ignore_error: >> -die('Command failed: %s' % str(c)) >> - >> -return val >> +p = subprocess.Popen(c, stdout=subprocess.PIPE, stderr=subprocess.PIPE, >> shell=expand) >> +(out, err) = p.communicate() >> +if p.returncode != 0 and not ignore_error: >> +die('Command failed: %s\nError: %s' % (str(c), err)) >> +return out >> >> def p4_read_pipe(c, ignore_error=False): >> real_cmd = p4_build_cmd(c) >> @@ -2186,10 +2184,17 @@ class P4Sync(Command, P4UserMap): >> # them back too. This is not needed to the cygwin windows >> version, >> # just the native "NT" type. >> # >> -text = p4_read_pipe(['print', '-q', '-o', '-', "%s@%s" % >> (file['depotFile'], file['change']) ]) >> -if p4_version_string().find("/NT") >= 0: >> -text = text.replace("\r\n", "\n") >> -contents = [ text ] >> +try: >> +text = p4_read_pipe(['print', '-q', '-o', '-', '%s@%s' % >> (file['depotFile'], file['change'])]) >> +except Exception as e: > > Would it be better to specify which kind of Exception you are catching? Looks > like you could get OSError, ValueError and CalledProcessError; it's the last > of these you want (I think). I think it is just a plain exception. See here: https://github.com/git/git/blob/97d7ad75b6fe74960d2a12e4a9151a55a5a87d6d/git-p4.py#L111 > >> +if 'Translation of file content failed' in str(e): >> +type_base = 'binary' >> +else: >> +raise e >> +else: >> +if p4_version_string().find('/NT') >= 0: >> +text = text.replace('\r\n', '\n') >> +contents = [ text ] > > The indentation on this bit doesn't look right to me. I believe it is exactly how it was: https://github.com/git/git/blob/97d7ad75b6fe74960d2a12e4a9151a55a5a87d6d/git-p4.py#L2397-L2399 In general, what is the appropriate way to reference code in this email list? Are GitHub links OK? Thanks, Lars -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 18/67] find_short_object_filename: convert sprintf to xsnprintf
We use sprintf() to format some hex data into a buffer. The buffer is clearly long enough, and using snprintf here is not necessary. And in fact, it does not really make anything easier to audit, as the size we feed to snprintf accounts for the magic extra 42 bytes found in each alt->name field of struct alternate_object_database (which is there exactly to do this formatting). Still, it is nice to remove an sprintf call and replace it with an xsnprintf and explanatory comment, which makes it easier to audit the code base for overflows. Signed-off-by: Jeff King--- sha1_name.c | 8 ++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/sha1_name.c b/sha1_name.c index 416e408..ed42f79 100644 --- a/sha1_name.c +++ b/sha1_name.c @@ -96,11 +96,15 @@ static void find_short_object_filename(int len, const char *hex_pfx, struct disa } fakeent->next = alt_odb_list; - sprintf(hex, "%.2s", hex_pfx); + xsnprintf(hex, sizeof(hex), "%.2s", hex_pfx); for (alt = fakeent; alt && !ds->ambiguous; alt = alt->next) { struct dirent *de; DIR *dir; - sprintf(alt->name, "%.2s/", hex_pfx); + /* +* every alt_odb struct has 42 extra bytes after the base +* for exactly this purpose +*/ + xsnprintf(alt->name, 42, "%.2s/", hex_pfx); dir = opendir(alt->base); if (!dir) continue; -- 2.6.0.rc2.408.ga2926b9 -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 20/67] compat/hstrerror: convert sprintf to snprintf
This is a trivially correct use of sprintf, as our error number should not be excessively long. But it's still nice to drop an sprintf call. Note that we cannot use xsnprintf here, because this is compat code which does not load git-compat-util.h. Signed-off-by: Jeff King--- compat/hstrerror.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/compat/hstrerror.c b/compat/hstrerror.c index 069c555..b85a2fa 100644 --- a/compat/hstrerror.c +++ b/compat/hstrerror.c @@ -16,6 +16,6 @@ const char *githstrerror(int err) case TRY_AGAIN: return "Non-authoritative \"host not found\", or SERVERFAIL"; } - sprintf(buffer, "Name resolution error %d", err); + snprintf(buffer, sizeof(buffer), "Name resolution error %d", err); return buffer; } -- 2.6.0.rc2.408.ga2926b9 -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 13/67] test-dump-cache-tree: avoid overflow of cache-tree name
When dumping a cache-tree, we sprintf sub-tree names directly into a fixed-size buffer, which can overflow. We can trivially fix this by converting to xsnprintf to at least notice and die. This probably should handle arbitrary-sized names, but there's not much point. It's used only by the test scripts, so the trivial fix is enough. Signed-off-by: Jeff King--- test-dump-cache-tree.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/test-dump-cache-tree.c b/test-dump-cache-tree.c index 54c0872..bb53c0a 100644 --- a/test-dump-cache-tree.c +++ b/test-dump-cache-tree.c @@ -47,7 +47,7 @@ static int dump_cache_tree(struct cache_tree *it, struct cache_tree_sub *rdwn; rdwn = cache_tree_sub(ref, down->name); - sprintf(path, "%s%.*s/", pfx, down->namelen, down->name); + xsnprintf(path, sizeof(path), "%s%.*s/", pfx, down->namelen, down->name); if (dump_cache_tree(down->cache_tree, rdwn->cache_tree, path)) errs = 1; } -- 2.6.0.rc2.408.ga2926b9 -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 21/67] grep: use xsnprintf to format failure message
This looks at first glance like the sprintf can overflow our buffer, but it's actually fine; the p->origin string is something constant and small, like "command line" or "-e option". Signed-off-by: Jeff King--- grep.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/grep.c b/grep.c index b58c7c6..6c68d5b 100644 --- a/grep.c +++ b/grep.c @@ -306,9 +306,9 @@ static NORETURN void compile_regexp_failed(const struct grep_pat *p, char where[1024]; if (p->no) - sprintf(where, "In '%s' at %d, ", p->origin, p->no); + xsnprintf(where, sizeof(where), "In '%s' at %d, ", p->origin, p->no); else if (p->origin) - sprintf(where, "%s, ", p->origin); + xsnprintf(where, sizeof(where), "%s, ", p->origin); else where[0] = 0; -- 2.6.0.rc2.408.ga2926b9 -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 41/67] apply: convert root string to strbuf
We use manual computation and strcpy to allocate the "root" variable. This would be much simpler using xstrfmt. But since we store the length, too, we can just use a strbuf, which handles that for us. Note that we stop distinguishing between "no root" and "empty root" in some cases, but that's OK; the results are the same (e.g., inserting an empty string is a noop). Signed-off-by: Jeff King--- builtin/apply.c | 26 ++ 1 file changed, 10 insertions(+), 16 deletions(-) diff --git a/builtin/apply.c b/builtin/apply.c index 094a20f..1d4d439 100644 --- a/builtin/apply.c +++ b/builtin/apply.c @@ -77,8 +77,7 @@ static enum ws_ignore { static const char *patch_input_file; -static const char *root; -static int root_len; +struct strbuf root = STRBUF_INIT; static int read_stdin = 1; static int options; @@ -494,8 +493,8 @@ static char *find_name_gnu(const char *line, const char *def, int p_value) } strbuf_remove(, 0, cp - name.buf); - if (root) - strbuf_insert(, 0, root, root_len); + if (root.len) + strbuf_insert(, 0, root.buf, root.len); return squash_slash(strbuf_detach(, NULL)); } @@ -697,8 +696,8 @@ static char *find_name_common(const char *line, const char *def, return squash_slash(xstrdup(def)); } - if (root) { - char *ret = xstrfmt("%s%.*s", root, len, start); + if (root.len) { + char *ret = xstrfmt("%s%.*s", root.buf, len, start); return squash_slash(ret); } @@ -1274,8 +1273,8 @@ static int parse_git_header(const char *line, int len, unsigned int size, struct * the default name from the header. */ patch->def_name = git_header_name(line, len); - if (patch->def_name && root) { - char *s = xstrfmt("%s%s", root, patch->def_name); + if (patch->def_name && root.len) { + char *s = xstrfmt("%s%s", root.buf, patch->def_name); free(patch->def_name); patch->def_name = s; } @@ -4498,14 +4497,9 @@ static int option_parse_whitespace(const struct option *opt, static int option_parse_directory(const struct option *opt, const char *arg, int unset) { - root_len = strlen(arg); - if (root_len && arg[root_len - 1] != '/') { - char *new_root; - root = new_root = xmalloc(root_len + 2); - strcpy(new_root, arg); - strcpy(new_root + root_len++, "/"); - } else - root = arg; + strbuf_reset(); + strbuf_addstr(, arg); + strbuf_complete(, '/'); return 0; } -- 2.6.0.rc2.408.ga2926b9 -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 42/67] transport: use strbufs for status table "quickref" strings
We generate range strings like "1234abcd...5678efab" for use in the the fetch and push status tables. We use fixed-size buffers along with strcat to do so. These aren't buggy, as our manual size computation is correct, but there's nothing checking that this is so. Let's switch them to strbufs instead, which are obviously correct, and make it easier to audit the code base for problematic calls to strcat(). Signed-off-by: Jeff King--- builtin/fetch.c | 22 -- transport.c | 13 +++-- 2 files changed, 19 insertions(+), 16 deletions(-) diff --git a/builtin/fetch.c b/builtin/fetch.c index 4703725..841880e 100644 --- a/builtin/fetch.c +++ b/builtin/fetch.c @@ -528,36 +528,38 @@ static int update_local_ref(struct ref *ref, } if (in_merge_bases(current, updated)) { - char quickref[83]; + struct strbuf quickref = STRBUF_INIT; int r; - strcpy(quickref, find_unique_abbrev(current->object.sha1, DEFAULT_ABBREV)); - strcat(quickref, ".."); - strcat(quickref, find_unique_abbrev(ref->new_sha1, DEFAULT_ABBREV)); + strbuf_add_unique_abbrev(, current->object.sha1, DEFAULT_ABBREV); + strbuf_addstr(, ".."); + strbuf_add_unique_abbrev(, ref->new_sha1, DEFAULT_ABBREV); if ((recurse_submodules != RECURSE_SUBMODULES_OFF) && (recurse_submodules != RECURSE_SUBMODULES_ON)) check_for_new_submodule_commits(ref->new_sha1); r = s_update_ref("fast-forward", ref, 1); strbuf_addf(display, "%c %-*s %-*s -> %s%s", r ? '!' : ' ', - TRANSPORT_SUMMARY_WIDTH, quickref, + TRANSPORT_SUMMARY_WIDTH, quickref.buf, REFCOL_WIDTH, remote, pretty_ref, r ? _(" (unable to update local ref)") : ""); + strbuf_release(); return r; } else if (force || ref->force) { - char quickref[84]; + struct strbuf quickref = STRBUF_INIT; int r; - strcpy(quickref, find_unique_abbrev(current->object.sha1, DEFAULT_ABBREV)); - strcat(quickref, "..."); - strcat(quickref, find_unique_abbrev(ref->new_sha1, DEFAULT_ABBREV)); + strbuf_add_unique_abbrev(, current->object.sha1, DEFAULT_ABBREV); + strbuf_addstr(, "..."); + strbuf_add_unique_abbrev(, ref->new_sha1, DEFAULT_ABBREV); if ((recurse_submodules != RECURSE_SUBMODULES_OFF) && (recurse_submodules != RECURSE_SUBMODULES_ON)) check_for_new_submodule_commits(ref->new_sha1); r = s_update_ref("forced-update", ref, 1); strbuf_addf(display, "%c %-*s %-*s -> %s (%s)", r ? '!' : '+', - TRANSPORT_SUMMARY_WIDTH, quickref, + TRANSPORT_SUMMARY_WIDTH, quickref.buf, REFCOL_WIDTH, remote, pretty_ref, r ? _("unable to update local ref") : _("forced update")); + strbuf_release(); return r; } else { strbuf_addf(display, "! %-*s %-*s -> %s %s", diff --git a/transport.c b/transport.c index 2d51348..3b47d49 100644 --- a/transport.c +++ b/transport.c @@ -654,23 +654,24 @@ static void print_ok_ref_status(struct ref *ref, int porcelain) "[new branch]"), ref, ref->peer_ref, NULL, porcelain); else { - char quickref[84]; + struct strbuf quickref = STRBUF_INIT; char type; const char *msg; - strcpy(quickref, status_abbrev(ref->old_sha1)); + strbuf_addstr(, status_abbrev(ref->old_sha1)); if (ref->forced_update) { - strcat(quickref, "..."); + strbuf_addstr(, "..."); type = '+'; msg = "forced update"; } else { - strcat(quickref, ".."); + strbuf_addstr(, ".."); type = ' '; msg = NULL; } - strcat(quickref, status_abbrev(ref->new_sha1)); + strbuf_addstr(, status_abbrev(ref->new_sha1)); - print_ref_status(type, quickref, ref, ref->peer_ref, msg, porcelain); + print_ref_status(type, quickref.buf, ref, ref->peer_ref, msg, porcelain); + strbuf_release(); } } -- 2.6.0.rc2.408.ga2926b9 -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at
[PATCH 48/67] fetch-pack: use argv_array for index-pack / unpack-objects
This cleans up a magic number that must be kept in sync with the rest of the code (the number of argv slots). It also lets us drop some fixed buffers and an sprintf (since we can now use argv_array_pushf). We do still have to keep one fixed buffer for calling gethostname, but at least now the size computations for it are much simpler. Signed-off-by: Jeff King--- fetch-pack.c | 56 +++- 1 file changed, 27 insertions(+), 29 deletions(-) diff --git a/fetch-pack.c b/fetch-pack.c index 820251a..2dabee9 100644 --- a/fetch-pack.c +++ b/fetch-pack.c @@ -681,11 +681,10 @@ static int get_pack(struct fetch_pack_args *args, int xd[2], char **pack_lockfile) { struct async demux; - const char *argv[22]; - char keep_arg[256]; - char hdr_arg[256]; - const char **av, *cmd_name; int do_keep = args->keep_pack; + const char *cmd_name; + struct pack_header header; + int pass_header = 0; struct child_process cmd = CHILD_PROCESS_INIT; int ret; @@ -705,17 +704,11 @@ static int get_pack(struct fetch_pack_args *args, else demux.out = xd[0]; - cmd.argv = argv; - av = argv; - *hdr_arg = 0; if (!args->keep_pack && unpack_limit) { - struct pack_header header; if (read_pack_header(demux.out, )) die("protocol error: bad pack header"); - snprintf(hdr_arg, sizeof(hdr_arg), -"--pack_header=%"PRIu32",%"PRIu32, -ntohl(header.hdr_version), ntohl(header.hdr_entries)); + pass_header = 1; if (ntohl(header.hdr_entries) < unpack_limit) do_keep = 0; else @@ -723,44 +716,49 @@ static int get_pack(struct fetch_pack_args *args, } if (alternate_shallow_file) { - *av++ = "--shallow-file"; - *av++ = alternate_shallow_file; + argv_array_push(, "--shallow-file"); + argv_array_push(, alternate_shallow_file); } if (do_keep) { if (pack_lockfile) cmd.out = -1; - *av++ = cmd_name = "index-pack"; - *av++ = "--stdin"; + cmd_name = "index-pack"; + argv_array_push(, cmd_name); + argv_array_push(, "--stdin"); if (!args->quiet && !args->no_progress) - *av++ = "-v"; + argv_array_push(, "-v"); if (args->use_thin_pack) - *av++ = "--fix-thin"; + argv_array_push(, "--fix-thin"); if (args->lock_pack || unpack_limit) { - int s = sprintf(keep_arg, - "--keep=fetch-pack %"PRIuMAX " on ", (uintmax_t) getpid()); - if (gethostname(keep_arg + s, sizeof(keep_arg) - s)) - strcpy(keep_arg + s, "localhost"); - *av++ = keep_arg; + char hostname[256]; + if (gethostname(hostname, sizeof(hostname))) + xsnprintf(hostname, sizeof(hostname), "localhost"); + argv_array_pushf(, + "--keep=fetch-pack %"PRIuMAX " on %s", + (uintmax_t)getpid(), hostname); } if (args->check_self_contained_and_connected) - *av++ = "--check-self-contained-and-connected"; + argv_array_push(, "--check-self-contained-and-connected"); } else { - *av++ = cmd_name = "unpack-objects"; + cmd_name = "unpack-objects"; + argv_array_push(, cmd_name); if (args->quiet || args->no_progress) - *av++ = "-q"; + argv_array_push(, "-q"); args->check_self_contained_and_connected = 0; } - if (*hdr_arg) - *av++ = hdr_arg; + + if (pass_header) + argv_array_pushf(, "--pack_header=%"PRIu32",%"PRIu32, +ntohl(header.hdr_version), +ntohl(header.hdr_entries)); if (fetch_fsck_objects >= 0 ? fetch_fsck_objects : transfer_fsck_objects >= 0 ? transfer_fsck_objects : 0) - *av++ = "--strict"; - *av++ = NULL; + argv_array_push(, "--strict"); cmd.in = demux.out; cmd.git_cmd = 1; -- 2.6.0.rc2.408.ga2926b9 -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 50/67] stat_tracking_info: convert to argv_array
In addition to dropping the magic number for the fixed-size argv, we can also drop a fixed-length buffer and some strcpy's into it. Signed-off-by: Jeff King--- remote.c | 26 -- 1 file changed, 12 insertions(+), 14 deletions(-) diff --git a/remote.c b/remote.c index 1b69751..255d39a 100644 --- a/remote.c +++ b/remote.c @@ -8,6 +8,7 @@ #include "tag.h" #include "string-list.h" #include "mergesort.h" +#include "argv-array.h" enum map_direction { FROM_SRC, FROM_DST }; @@ -2027,10 +2028,9 @@ int stat_tracking_info(struct branch *branch, int *num_ours, int *num_theirs, { unsigned char sha1[20]; struct commit *ours, *theirs; - char symmetric[84]; struct rev_info revs; - const char *rev_argv[10], *base; - int rev_argc; + const char *base; + struct argv_array argv = ARGV_ARRAY_INIT; /* Cannot stat unless we are marked to build on top of somebody else. */ base = branch_get_upstream(branch, NULL); @@ -2059,19 +2059,15 @@ int stat_tracking_info(struct branch *branch, int *num_ours, int *num_theirs, } /* Run "rev-list --left-right ours...theirs" internally... */ - rev_argc = 0; - rev_argv[rev_argc++] = NULL; - rev_argv[rev_argc++] = "--left-right"; - rev_argv[rev_argc++] = symmetric; - rev_argv[rev_argc++] = "--"; - rev_argv[rev_argc] = NULL; - - strcpy(symmetric, sha1_to_hex(ours->object.sha1)); - strcpy(symmetric + 40, "..."); - strcpy(symmetric + 43, sha1_to_hex(theirs->object.sha1)); + argv_array_push(, ""); /* ignored */ + argv_array_push(, "--left-right"); + argv_array_pushf(, "%s...%s", +sha1_to_hex(ours->object.sha1), +sha1_to_hex(theirs->object.sha1)); + argv_array_push(, "--"); init_revisions(, NULL); - setup_revisions(rev_argc, rev_argv, , NULL); + setup_revisions(argv.argc, argv.argv, , NULL); if (prepare_revision_walk()) die("revision walk setup failed"); @@ -2091,6 +2087,8 @@ int stat_tracking_info(struct branch *branch, int *num_ours, int *num_theirs, /* clear object flags smudged by the above traversal */ clear_commit_marks(ours, ALL_REV_FLAGS); clear_commit_marks(theirs, ALL_REV_FLAGS); + + argv_array_clear(); return 0; } -- 2.6.0.rc2.408.ga2926b9 -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 44/67] enter_repo: convert fixed-size buffers to strbufs
We use two PATH_MAX-sized buffers to represent the repo path, and must make sure not to overflow them. We do take care to check the lengths, but the logic is rather hard to follow, as we use several magic numbers (e.g., "PATH_MAX - 10"). And in fact you _can_ overflow the buffer if you have a ".git" file with an extremely long path in it. By switching to strbufs, these problems all go away. We do, however, retain the check that the initial input we get is no larger than PATH_MAX. This function is an entry point for untrusted repo names from the network, and it's a good idea to keep a sanity check (both to avoid allocating arbitrary amounts of memory, and also as a layer of defense against any downstream users of the names). Signed-off-by: Jeff King--- path.c | 57 + 1 file changed, 29 insertions(+), 28 deletions(-) diff --git a/path.c b/path.c index 46a4d27..60e0390 100644 --- a/path.c +++ b/path.c @@ -391,8 +391,8 @@ return_null: */ const char *enter_repo(const char *path, int strict) { - static char used_path[PATH_MAX]; - static char validated_path[PATH_MAX]; + static struct strbuf validated_path = STRBUF_INIT; + static struct strbuf used_path = STRBUF_INIT; if (!path) return NULL; @@ -407,46 +407,47 @@ const char *enter_repo(const char *path, int strict) while ((1 < len) && (path[len-1] == '/')) len--; + /* +* We can handle arbitrary-sized buffers, but this remains as a +* sanity check on untrusted input. +*/ if (PATH_MAX <= len) return NULL; - strncpy(used_path, path, len); used_path[len] = 0 ; - strcpy(validated_path, used_path); - if (used_path[0] == '~') { - char *newpath = expand_user_path(used_path); - if (!newpath || (PATH_MAX - 10 < strlen(newpath))) { - free(newpath); + strbuf_reset(_path); + strbuf_reset(_path); + strbuf_add(_path, path, len); + strbuf_add(_path, path, len); + + if (used_path.buf[0] == '~') { + char *newpath = expand_user_path(used_path.buf); + if (!newpath) return NULL; - } - /* -* Copy back into the static buffer. A pity -* since newpath was not bounded, but other -* branches of the if are limited by PATH_MAX -* anyway. -*/ - strcpy(used_path, newpath); free(newpath); + strbuf_attach(_path, newpath, strlen(newpath), + strlen(newpath)); } - else if (PATH_MAX - 10 < len) - return NULL; - len = strlen(used_path); for (i = 0; suffix[i]; i++) { struct stat st; - strcpy(used_path + len, suffix[i]); - if (!stat(used_path, ) && + size_t baselen = used_path.len; + strbuf_addstr(_path, suffix[i]); + if (!stat(used_path.buf, ) && (S_ISREG(st.st_mode) || - (S_ISDIR(st.st_mode) && is_git_directory(used_path { - strcat(validated_path, suffix[i]); + (S_ISDIR(st.st_mode) && is_git_directory(used_path.buf { + strbuf_addstr(_path, suffix[i]); break; } + strbuf_setlen(_path, baselen); } if (!suffix[i]) return NULL; - gitfile = read_gitfile(used_path) ; - if (gitfile) - strcpy(used_path, gitfile); - if (chdir(used_path)) + gitfile = read_gitfile(used_path.buf) ; + if (gitfile) { + strbuf_reset(_path); + strbuf_addstr(_path, gitfile); + } + if (chdir(used_path.buf)) return NULL; - path = validated_path; + path = validated_path.buf; } else if (chdir(path)) return NULL; -- 2.6.0.rc2.408.ga2926b9 -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 46/67] write_loose_object: convert to strbuf
When creating a loose object tempfile, we use a fixed PATH_MAX-sized buffer, and strcpy directly into it. This isn't buggy, because we do a rough check of the size, but there's no verification that our guesstimate of the required space is enough (in fact, it's several bytes too big for the current naming scheme). Let's switch to a strbuf, which makes this much easier to verify. The allocation overhead should be negligible, since we are replacing a static buffer with a static strbuf, and we'll only need to allocate on the first call. While we're here, we can also document a subtle interaction with mkstemp that would be easy to overlook. Signed-off-by: Jeff King--- sha1_file.c | 41 + 1 file changed, 21 insertions(+), 20 deletions(-) diff --git a/sha1_file.c b/sha1_file.c index 9d87b69..374a996 100644 --- a/sha1_file.c +++ b/sha1_file.c @@ -3007,29 +3007,30 @@ static inline int directory_size(const char *filename) * We want to avoid cross-directory filename renames, because those * can have problems on various filesystems (FAT, NFS, Coda). */ -static int create_tmpfile(char *buffer, size_t bufsiz, const char *filename) +static int create_tmpfile(struct strbuf *tmp, const char *filename) { int fd, dirlen = directory_size(filename); - if (dirlen + 20 > bufsiz) { - errno = ENAMETOOLONG; - return -1; - } - memcpy(buffer, filename, dirlen); - strcpy(buffer + dirlen, "tmp_obj_XX"); - fd = git_mkstemp_mode(buffer, 0444); + strbuf_reset(tmp); + strbuf_add(tmp, filename, dirlen); + strbuf_addstr(tmp, "tmp_obj_XX"); + fd = git_mkstemp_mode(tmp->buf, 0444); if (fd < 0 && dirlen && errno == ENOENT) { - /* Make sure the directory exists */ - memcpy(buffer, filename, dirlen); - buffer[dirlen-1] = 0; - if (mkdir(buffer, 0777) && errno != EEXIST) + /* +* Make sure the directory exists; note that mkstemp will have +* put a NUL in our buffer, so we have to rewrite the path, +* rather than just chomping the length. +*/ + strbuf_reset(tmp); + strbuf_add(tmp, filename, dirlen - 1); + if (mkdir(tmp->buf, 0777) && errno != EEXIST) return -1; - if (adjust_shared_perm(buffer)) + if (adjust_shared_perm(tmp->buf)) return -1; /* Try again */ - strcpy(buffer + dirlen - 1, "/tmp_obj_XX"); - fd = git_mkstemp_mode(buffer, 0444); + strbuf_addstr(tmp, "/tmp_obj_XX"); + fd = git_mkstemp_mode(tmp->buf, 0444); } return fd; } @@ -3042,10 +3043,10 @@ static int write_loose_object(const unsigned char *sha1, char *hdr, int hdrlen, git_zstream stream; git_SHA_CTX c; unsigned char parano_sha1[20]; - static char tmp_file[PATH_MAX]; + static struct strbuf tmp_file = STRBUF_INIT; const char *filename = sha1_file_name(sha1); - fd = create_tmpfile(tmp_file, sizeof(tmp_file), filename); + fd = create_tmpfile(_file, filename); if (fd < 0) { if (errno == EACCES) return error("insufficient permission for adding an object to repository database %s", get_object_directory()); @@ -3094,12 +3095,12 @@ static int write_loose_object(const unsigned char *sha1, char *hdr, int hdrlen, struct utimbuf utb; utb.actime = mtime; utb.modtime = mtime; - if (utime(tmp_file, ) < 0) + if (utime(tmp_file.buf, ) < 0) warning("failed utime() on %s: %s", - tmp_file, strerror(errno)); + tmp_file.buf, strerror(errno)); } - return finalize_object_file(tmp_file, filename); + return finalize_object_file(tmp_file.buf, filename); } static int freshen_loose_object(const unsigned char *sha1) -- 2.6.0.rc2.408.ga2926b9 -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 49/67] http-push: use an argv_array for setup_revisions
This drops the magic number for the fixed-size argv arrays, so we do not have to wonder if we are overflowing it. We can also drop some confusing sha1_to_hex memory allocation (which seems to predate the ring of buffers allowing multiple calls), and get rid of an unchecked sprintf call. Signed-off-by: Jeff King--- http-push.c | 32 ++-- 1 file changed, 10 insertions(+), 22 deletions(-) diff --git a/http-push.c b/http-push.c index e501c28..43a9036 100644 --- a/http-push.c +++ b/http-push.c @@ -10,6 +10,7 @@ #include "remote.h" #include "list-objects.h" #include "sigchain.h" +#include "argv-array.h" #ifdef EXPAT_NEEDS_XMLPARSE_H #include @@ -1856,9 +1857,7 @@ int main(int argc, char **argv) new_refs = 0; for (ref = remote_refs; ref; ref = ref->next) { char old_hex[60], *new_hex; - const char *commit_argv[5]; - int commit_argc; - char *new_sha1_hex, *old_sha1_hex; + struct argv_array commit_argv = ARGV_ARRAY_INIT; if (!ref->peer_ref) continue; @@ -1937,27 +1936,15 @@ int main(int argc, char **argv) } /* Set up revision info for this refspec */ - commit_argc = 3; - new_sha1_hex = xstrdup(sha1_to_hex(ref->new_sha1)); - old_sha1_hex = NULL; - commit_argv[1] = "--objects"; - commit_argv[2] = new_sha1_hex; - if (!push_all && !is_null_sha1(ref->old_sha1)) { - old_sha1_hex = xmalloc(42); - sprintf(old_sha1_hex, "^%s", - sha1_to_hex(ref->old_sha1)); - commit_argv[3] = old_sha1_hex; - commit_argc++; - } - commit_argv[commit_argc] = NULL; + argv_array_push(_argv, ""); /* ignored */ + argv_array_push(_argv, "--objects"); + argv_array_push(_argv, sha1_to_hex(ref->new_sha1)); + if (!push_all && !is_null_sha1(ref->old_sha1)) + argv_array_pushf(_argv, "^%s", +sha1_to_hex(ref->old_sha1)); init_revisions(, setup_git_directory()); - setup_revisions(commit_argc, commit_argv, , NULL); + setup_revisions(commit_argv.argc, commit_argv.argv, , NULL); revs.edge_hint = 0; /* just in case */ - free(new_sha1_hex); - if (old_sha1_hex) { - free(old_sha1_hex); - commit_argv[1] = NULL; - } /* Generate a list of objects that need to be pushed */ pushing = 0; @@ -1986,6 +1973,7 @@ int main(int argc, char **argv) printf("%s %s\n", !rc ? "ok" : "error", ref->name); unlock_remote(ref_lock); check_locks(); + argv_array_clear(_argv); } /* Update remote server info if appropriate */ -- 2.6.0.rc2.408.ga2926b9 -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 45/67] remove_leading_path: use a strbuf for internal storage
This function strcpy's directly into a PATH_MAX-sized buffer. There's only one caller, which feeds the git_dir into it, so it's not easy to trigger in practice (even if you fed a large $GIT_DIR through the environment or .git file, it would have to actually exist and be accessible on the filesystem to get to this point). We can fix it by moving to a strbuf. Signed-off-by: Jeff King--- path.c | 10 ++ 1 file changed, 6 insertions(+), 4 deletions(-) diff --git a/path.c b/path.c index 60e0390..c597473 100644 --- a/path.c +++ b/path.c @@ -632,7 +632,7 @@ const char *relative_path(const char *in, const char *prefix, */ const char *remove_leading_path(const char *in, const char *prefix) { - static char buf[PATH_MAX + 1]; + static struct strbuf buf = STRBUF_INIT; int i = 0, j = 0; if (!prefix || !prefix[0]) @@ -661,11 +661,13 @@ const char *remove_leading_path(const char *in, const char *prefix) return in; while (is_dir_sep(in[j])) j++; + + strbuf_reset(); if (!in[j]) - strcpy(buf, "."); + strbuf_addstr(, "."); else - strcpy(buf, in + j); - return buf; + strbuf_addstr(, in + j); + return buf.buf; } /* -- 2.6.0.rc2.408.ga2926b9 -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 43/67] merge-recursive: convert malloc / strcpy to strbuf
This would be a fairly routine use of xstrfmt, except that we need to remember the length of the result to pass to cache_name_pos. So just use a strbuf, which makes this simple. As a bonus, this gets rid of confusing references to "pathlen+1". The "1" is for the trailing slash we added, but that is automatically accounted for in the strbuf's len parameter. Signed-off-by: Jeff King--- merge-recursive.c | 17 - 1 file changed, 8 insertions(+), 9 deletions(-) diff --git a/merge-recursive.c b/merge-recursive.c index 44d85be..a5e74d8 100644 --- a/merge-recursive.c +++ b/merge-recursive.c @@ -630,25 +630,24 @@ static char *unique_path(struct merge_options *o, const char *path, const char * static int dir_in_way(const char *path, int check_working_copy) { - int pos, pathlen = strlen(path); - char *dirpath = xmalloc(pathlen + 2); + int pos; + struct strbuf dirpath = STRBUF_INIT; struct stat st; - strcpy(dirpath, path); - dirpath[pathlen] = '/'; - dirpath[pathlen+1] = '\0'; + strbuf_addstr(, path); + strbuf_addch(, '/'); - pos = cache_name_pos(dirpath, pathlen+1); + pos = cache_name_pos(dirpath.buf, dirpath.len); if (pos < 0) pos = -1 - pos; if (pos < active_nr && - !strncmp(dirpath, active_cache[pos]->name, pathlen+1)) { - free(dirpath); + !strncmp(dirpath.buf, active_cache[pos]->name, dirpath.len)) { + strbuf_release(); return 1; } - free(dirpath); + strbuf_release(); return check_working_copy && !lstat(path, ) && S_ISDIR(st.st_mode); } -- 2.6.0.rc2.408.ga2926b9 -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v7] git-p4: improve path encoding verbose output
On 15 Sep 2015, at 09:31, Luke Diamandwrote: > On 14/09/15 18:10, larsxschnei...@gmail.com wrote: >> From: Lars Schneider >> >> If a path with non-ASCII characters is detected then print always the > > s/print always/print/ I will fix it. > > >> encoding and the encoded string in verbose mode. >> >> Signed-off-by: Lars Schneider >> --- >> git-p4.py | 19 +-- >> 1 file changed, 9 insertions(+), 10 deletions(-) >> >> diff --git a/git-p4.py b/git-p4.py >> index d45cf2b..da25d3f 100755 >> --- a/git-p4.py >> +++ b/git-p4.py >> @@ -2220,16 +2220,15 @@ class P4Sync(Command, P4UserMap): >> text = regexp.sub(r'$\1$', text) >> contents = [ text ] >> >> -if gitConfig("git-p4.pathEncoding"): >> -relPath = >> relPath.decode(gitConfig("git-p4.pathEncoding")).encode('utf8', 'replace') >> -elif self.verbose: >> -try: >> -relPath.decode('ascii') >> -except: >> -print ( >> -"Path with Non-ASCII characters detected and no path >> encoding defined. " >> -"Please check the encoding: %s" % relPath >> -) >> +try: >> +relPath.decode('ascii') >> +except: >> +encoding = 'utf8' >> +if gitConfig('git-p4.pathEncoding'): >> +encoding = gitConfig('git-p4.pathEncoding') > > It would be better to query this once at startup. Otherwise we're potentially > forking "git config" twice per file which on a large repo could become > significant. Make it an instance variable perhaps? solved in other email > >> +relPath = relPath.decode(encoding).encode('utf8', 'replace') >> +if self.verbose: >> +print 'Path with non-ASCII characters detected. Used %s to >> encode: %s ' % (encoding, relPath) >> >> self.gitStream.write("M %s inline %s\n" % (git_mode, relPath)) Thanks!-- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 47/67] diagnose_invalid_index_path: use strbuf to avoid strcpy/strcat
We dynamically allocate a buffer and then strcpy and strcat into it. This isn't buggy, but we'd prefer to avoid these suspicious functions. This would be a good candidate for converstion to xstrfmt, but we need to record the length for dealing with index entries. A strbuf handles that for us. Signed-off-by: Jeff King--- sha1_name.c | 21 + 1 file changed, 9 insertions(+), 12 deletions(-) diff --git a/sha1_name.c b/sha1_name.c index ed42f79..0f14ea2 100644 --- a/sha1_name.c +++ b/sha1_name.c @@ -1293,8 +1293,7 @@ static void diagnose_invalid_index_path(int stage, const struct cache_entry *ce; int pos; unsigned namelen = strlen(filename); - unsigned fullnamelen; - char *fullname; + struct strbuf fullname = STRBUF_INIT; if (!prefix) prefix = ""; @@ -1314,21 +1313,19 @@ static void diagnose_invalid_index_path(int stage, } /* Confusion between relative and absolute filenames? */ - fullnamelen = namelen + strlen(prefix); - fullname = xmalloc(fullnamelen + 1); - strcpy(fullname, prefix); - strcat(fullname, filename); - pos = cache_name_pos(fullname, fullnamelen); + strbuf_addstr(, prefix); + strbuf_addstr(, filename); + pos = cache_name_pos(fullname.buf, fullname.len); if (pos < 0) pos = -pos - 1; if (pos < active_nr) { ce = active_cache[pos]; - if (ce_namelen(ce) == fullnamelen && - !memcmp(ce->name, fullname, fullnamelen)) + if (ce_namelen(ce) == fullname.len && + !memcmp(ce->name, fullname.buf, fullname.len)) die("Path '%s' is in the index, but not '%s'.\n" "Did you mean ':%d:%s' aka ':%d:./%s'?", - fullname, filename, - ce_stage(ce), fullname, + fullname.buf, filename, + ce_stage(ce), fullname.buf, ce_stage(ce), filename); } @@ -1338,7 +1335,7 @@ static void diagnose_invalid_index_path(int stage, die("Path '%s' does not exist (neither on disk nor in the index).", filename); - free(fullname); + strbuf_release(); } -- 2.6.0.rc2.408.ga2926b9 -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 60/67] color: add color_set helper for copying raw colors
To set up default colors, we sometimes strcpy() from the default string literals into our color buffers. This isn't a bug (assuming the destination is COLOR_MAXLEN bytes), but makes it harder to audit the code for problematic strcpy calls. Let's introduce a color_set which copies under the assumption that there are COLOR_MAXLEN bytes in the destination (of course you can call it on a smaller buffer, so this isn't providing a huge amount of safety, but it's more convenient than calling xsnprintf yourself). Signed-off-by: Jeff King--- color.c | 5 + color.h | 7 +++ grep.c | 32 3 files changed, 28 insertions(+), 16 deletions(-) diff --git a/color.c b/color.c index 3a3fa33..42f19be 100644 --- a/color.c +++ b/color.c @@ -145,6 +145,11 @@ int color_parse(const char *value, char *dst) return color_parse_mem(value, strlen(value), dst); } +void color_set(char *dst, const char *color_bytes) +{ + xsnprintf(dst, COLOR_MAXLEN, "%s", color_bytes); +} + /* * Write the ANSI color codes for "c" to "out"; the string should * already have the ANSI escape code in it. "out" should have enough diff --git a/color.h b/color.h index 7fe77fb..e155d13 100644 --- a/color.h +++ b/color.h @@ -75,6 +75,13 @@ extern int color_stdout_is_tty; int git_color_config(const char *var, const char *value, void *cb); int git_color_default_config(const char *var, const char *value, void *cb); +/* + * Set the color buffer (which must be COLOR_MAXLEN bytes) + * to the raw color bytes; this is useful for initializing + * default color variables. + */ +void color_set(char *dst, const char *color_bytes); + int git_config_colorbool(const char *var, const char *value); int want_color(int var); int color_parse(const char *value, char *dst); diff --git a/grep.c b/grep.c index 6c68d5b..7b2b96a 100644 --- a/grep.c +++ b/grep.c @@ -31,14 +31,14 @@ void init_grep_defaults(void) opt->max_depth = -1; opt->pattern_type_option = GREP_PATTERN_TYPE_UNSPECIFIED; opt->extended_regexp_option = 0; - strcpy(opt->color_context, ""); - strcpy(opt->color_filename, ""); - strcpy(opt->color_function, ""); - strcpy(opt->color_lineno, ""); - strcpy(opt->color_match_context, GIT_COLOR_BOLD_RED); - strcpy(opt->color_match_selected, GIT_COLOR_BOLD_RED); - strcpy(opt->color_selected, ""); - strcpy(opt->color_sep, GIT_COLOR_CYAN); + color_set(opt->color_context, ""); + color_set(opt->color_filename, ""); + color_set(opt->color_function, ""); + color_set(opt->color_lineno, ""); + color_set(opt->color_match_context, GIT_COLOR_BOLD_RED); + color_set(opt->color_match_selected, GIT_COLOR_BOLD_RED); + color_set(opt->color_selected, ""); + color_set(opt->color_sep, GIT_COLOR_CYAN); opt->color = -1; } @@ -151,14 +151,14 @@ void grep_init(struct grep_opt *opt, const char *prefix) opt->regflags = def->regflags; opt->relative = def->relative; - strcpy(opt->color_context, def->color_context); - strcpy(opt->color_filename, def->color_filename); - strcpy(opt->color_function, def->color_function); - strcpy(opt->color_lineno, def->color_lineno); - strcpy(opt->color_match_context, def->color_match_context); - strcpy(opt->color_match_selected, def->color_match_selected); - strcpy(opt->color_selected, def->color_selected); - strcpy(opt->color_sep, def->color_sep); + color_set(opt->color_context, def->color_context); + color_set(opt->color_filename, def->color_filename); + color_set(opt->color_function, def->color_function); + color_set(opt->color_lineno, def->color_lineno); + color_set(opt->color_match_context, def->color_match_context); + color_set(opt->color_match_selected, def->color_match_selected); + color_set(opt->color_selected, def->color_selected); + color_set(opt->color_sep, def->color_sep); } void grep_commit_pattern_type(enum grep_pattern_type pattern_type, struct grep_opt *opt) -- 2.6.0.rc2.408.ga2926b9 -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 01/67] show-branch: avoid segfault with --reflog of unborn branch
When no branch is given to the "--reflog" option, we resolve HEAD to get the default branch. However, if HEAD points to an unborn branch, resolve_ref returns NULL, and we later segfault trying to access it. Signed-off-by: Jeff King--- builtin/show-branch.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/builtin/show-branch.c b/builtin/show-branch.c index 408ce70..092b59b 100644 --- a/builtin/show-branch.c +++ b/builtin/show-branch.c @@ -743,6 +743,8 @@ int cmd_show_branch(int ac, const char **av, const char *prefix) fake_av[1] = NULL; av = fake_av; ac = 1; + if (!*av) + die("no branches given, and HEAD is not valid"); } if (ac != 1) die("--reflog option needs one branch name"); -- 2.6.0.rc2.408.ga2926b9 -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 10/67] mailsplit: make PATH_MAX buffers dynamic
There are several static PATH_MAX-sized buffers in mailsplit, along with some questionable uses of sprintf. These are not really of security interest, as local mailsplit pathnames are not typically under control of an attacker. But it does not hurt to be careful, and as a bonus we lift some limits for systems with too-small PATH_MAX varibles. Signed-off-by: Jeff King--- builtin/mailsplit.c | 46 +- 1 file changed, 29 insertions(+), 17 deletions(-) diff --git a/builtin/mailsplit.c b/builtin/mailsplit.c index 9de06e3..fb0bc08 100644 --- a/builtin/mailsplit.c +++ b/builtin/mailsplit.c @@ -98,30 +98,37 @@ static int populate_maildir_list(struct string_list *list, const char *path) { DIR *dir; struct dirent *dent; - char name[PATH_MAX]; + struct strbuf name = STRBUF_INIT; char *subs[] = { "cur", "new", NULL }; char **sub; + int ret = -1; for (sub = subs; *sub; ++sub) { - snprintf(name, sizeof(name), "%s/%s", path, *sub); - if ((dir = opendir(name)) == NULL) { + strbuf_reset(); + strbuf_addf(, "%s/%s", path, *sub); + if ((dir = opendir(name.buf)) == NULL) { if (errno == ENOENT) continue; - error("cannot opendir %s (%s)", name, strerror(errno)); - return -1; + error("cannot opendir %s (%s)", name.buf, strerror(errno)); + goto out; } while ((dent = readdir(dir)) != NULL) { if (dent->d_name[0] == '.') continue; - snprintf(name, sizeof(name), "%s/%s", *sub, dent->d_name); - string_list_insert(list, name); + strbuf_reset(); + strbuf_addf(, "%s/%s", *sub, dent->d_name); + string_list_insert(list, name.buf); } closedir(dir); } - return 0; + ret = 0; + +out: + strbuf_release(); + return ret; } static int maildir_filename_cmp(const char *a, const char *b) @@ -148,8 +155,7 @@ static int maildir_filename_cmp(const char *a, const char *b) static int split_maildir(const char *maildir, const char *dir, int nr_prec, int skip) { - char file[PATH_MAX]; - char name[PATH_MAX]; + struct strbuf file = STRBUF_INIT; FILE *f = NULL; int ret = -1; int i; @@ -161,20 +167,25 @@ static int split_maildir(const char *maildir, const char *dir, goto out; for (i = 0; i < list.nr; i++) { - snprintf(file, sizeof(file), "%s/%s", maildir, list.items[i].string); - f = fopen(file, "r"); + char *name; + + strbuf_reset(); + strbuf_addf(, "%s/%s", maildir, list.items[i].string); + + f = fopen(file.buf, "r"); if (!f) { - error("cannot open mail %s (%s)", file, strerror(errno)); + error("cannot open mail %s (%s)", file.buf, strerror(errno)); goto out; } if (strbuf_getwholeline(, f, '\n')) { - error("cannot read mail %s (%s)", file, strerror(errno)); + error("cannot read mail %s (%s)", file.buf, strerror(errno)); goto out; } - sprintf(name, "%s/%0*d", dir, nr_prec, ++skip); + name = xstrfmt("%s/%0*d", dir, nr_prec, ++skip); split_one(f, name, 1); + free(name); fclose(f); f = NULL; @@ -184,6 +195,7 @@ static int split_maildir(const char *maildir, const char *dir, out: if (f) fclose(f); + strbuf_release(); string_list_clear(, 1); return ret; } @@ -191,7 +203,6 @@ out: static int split_mbox(const char *file, const char *dir, int allow_bare, int nr_prec, int skip) { - char name[PATH_MAX]; int ret = -1; int peek; @@ -218,8 +229,9 @@ static int split_mbox(const char *file, const char *dir, int allow_bare, } while (!file_done) { - sprintf(name, "%s/%0*d", dir, nr_prec, ++skip); + char *name = xstrfmt("%s/%0*d", dir, nr_prec, ++skip); file_done = split_one(f, name, allow_bare); + free(name); } if (f != stdin) -- 2.6.0.rc2.408.ga2926b9 -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 09/67] fsck: use strbuf to generate alternate directories
When fsck-ing alternates, we make a copy of the alternate directory in a fixed PATH_MAX buffer. We memcpy directly, without any check whether we are overflowing the buffer. This is OK if PATH_MAX is a true representation of the maximum path on the system, because any path here will have already been vetted by the alternates subsystem. But that is not true on every system, so we should be more careful. Signed-off-by: Jeff King--- builtin/fsck.c | 11 ++- 1 file changed, 6 insertions(+), 5 deletions(-) diff --git a/builtin/fsck.c b/builtin/fsck.c index 46c7235..a019f4a 100644 --- a/builtin/fsck.c +++ b/builtin/fsck.c @@ -683,11 +683,12 @@ int cmd_fsck(int argc, const char **argv, const char *prefix) prepare_alt_odb(); for (alt = alt_odb_list; alt; alt = alt->next) { - char namebuf[PATH_MAX]; - int namelen = alt->name - alt->base; - memcpy(namebuf, alt->base, namelen); - namebuf[namelen - 1] = 0; - fsck_object_dir(namebuf); + /* directory name, minus trailing slash */ + size_t namelen = alt->name - alt->base - 1; + struct strbuf name = STRBUF_INIT; + strbuf_add(, alt->base, namelen); + fsck_object_dir(name.buf); + strbuf_release(); } } -- 2.6.0.rc2.408.ga2926b9 -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 11/67] trace: use strbuf for quote_crnl output
When we output GIT_TRACE_SETUP paths, we quote any meta-characters. But our buffer to hold the result is only PATH_MAX bytes, and we could double the size of the input path (if every character needs quoted). We could use a 2*PATH_MAX buffer, if we assume the input will never be more than PATH_MAX. But it's easier still to just switch to a strbuf and not worry about whether the input can exceed PATH_MAX or not. Signed-off-by: Jeff King--- trace.c | 16 1 file changed, 8 insertions(+), 8 deletions(-) diff --git a/trace.c b/trace.c index 7393926..0c06d71 100644 --- a/trace.c +++ b/trace.c @@ -277,25 +277,25 @@ void trace_performance_fl(const char *file, int line, uint64_t nanos, static const char *quote_crnl(const char *path) { - static char new_path[PATH_MAX]; + static struct strbuf new_path = STRBUF_INIT; const char *p2 = path; - char *p1 = new_path; if (!path) return NULL; + strbuf_reset(_path); + while (*p2) { switch (*p2) { - case '\\': *p1++ = '\\'; *p1++ = '\\'; break; - case '\n': *p1++ = '\\'; *p1++ = 'n'; break; - case '\r': *p1++ = '\\'; *p1++ = 'r'; break; + case '\\': strbuf_addstr(_path, ""); break; + case '\n': strbuf_addstr(_path, "\\n"); break; + case '\r': strbuf_addstr(_path, "\\r"); break; default: - *p1++ = *p2; + strbuf_addch(_path, *p2); } p2++; } - *p1 = '\0'; - return new_path; + return new_path.buf; } /* FIXME: move prefix to startup_info struct and get rid of this arg */ -- 2.6.0.rc2.408.ga2926b9 -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 64/67] Makefile: drop D_INO_IN_DIRENT build knob
Now that fsck has dropped its inode-sorting, there are no longer any users of this knob, and it can go away. Signed-off-by: Jeff King--- Makefile | 5 - config.mak.uname | 3 --- configure.ac | 7 --- 3 files changed, 15 deletions(-) diff --git a/Makefile b/Makefile index 8d5df7e..2f350ca 100644 --- a/Makefile +++ b/Makefile @@ -74,8 +74,6 @@ all:: # Define HAVE_PATHS_H if you have paths.h and want to use the default PATH # it specifies. # -# Define NO_D_INO_IN_DIRENT if you don't have d_ino in your struct dirent. -# # Define NO_D_TYPE_IN_DIRENT if your platform defines DT_UNKNOWN but lacks # d_type in struct dirent (Cygwin 1.5, fixed in Cygwin 1.7). # @@ -1160,9 +1158,6 @@ endif ifdef NO_D_TYPE_IN_DIRENT BASIC_CFLAGS += -DNO_D_TYPE_IN_DIRENT endif -ifdef NO_D_INO_IN_DIRENT - BASIC_CFLAGS += -DNO_D_INO_IN_DIRENT -endif ifdef NO_GECOS_IN_PWENT BASIC_CFLAGS += -DNO_GECOS_IN_PWENT endif diff --git a/config.mak.uname b/config.mak.uname index 943c439..f34dcaa 100644 --- a/config.mak.uname +++ b/config.mak.uname @@ -166,7 +166,6 @@ endif ifeq ($(uname_O),Cygwin) ifeq ($(shell expr "$(uname_R)" : '1\.[1-6]\.'),4) NO_D_TYPE_IN_DIRENT = YesPlease - NO_D_INO_IN_DIRENT = YesPlease NO_STRCASESTR = YesPlease NO_MEMMEM = YesPlease NO_MKSTEMPS = YesPlease @@ -370,7 +369,6 @@ ifeq ($(uname_S),Windows) NO_POSIX_GOODIES = UnfortunatelyYes NATIVE_CRLF = YesPlease DEFAULT_HELP_FORMAT = html - NO_D_INO_IN_DIRENT = YesPlease CC = compat/vcbuild/scripts/clink.pl AR = compat/vcbuild/scripts/lib.pl @@ -520,7 +518,6 @@ ifneq (,$(findstring MINGW,$(uname_S))) NO_INET_NTOP = YesPlease NO_POSIX_GOODIES = UnfortunatelyYes DEFAULT_HELP_FORMAT = html - NO_D_INO_IN_DIRENT = YesPlease COMPAT_CFLAGS += -D__USE_MINGW_ACCESS -D_USE_32BIT_TIME_T -DNOGDI -Icompat -Icompat/win32 COMPAT_CFLAGS += -DSTRIP_EXTENSION=\".exe\" COMPAT_OBJS += compat/mingw.o compat/winansi.o \ diff --git a/configure.ac b/configure.ac index 14012fa..3fcca61 100644 --- a/configure.ac +++ b/configure.ac @@ -767,13 +767,6 @@ elif test x$ac_cv_member_struct_stat_st_mtim_tv_nsec != xyes; then GIT_CONF_SUBST([NO_NSEC]) fi # -# Define NO_D_INO_IN_DIRENT if you don't have d_ino in your struct dirent. -AC_CHECK_MEMBER(struct dirent.d_ino, -[NO_D_INO_IN_DIRENT=], -[NO_D_INO_IN_DIRENT=YesPlease], -[#include ]) -GIT_CONF_SUBST([NO_D_INO_IN_DIRENT]) -# # Define NO_D_TYPE_IN_DIRENT if your platform defines DT_UNKNOWN but lacks # d_type in struct dirent (latest Cygwin -- will be fixed soonish). AC_CHECK_MEMBER(struct dirent.d_type, -- 2.6.0.rc2.408.ga2926b9 -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 65/67] fsck: use for_each_loose_file_in_objdir
Since 27e1e22 (prune: factor out loose-object directory traversal, 2014-10-15), we now have a generic callback system for iterating over the loose object directories. This is used by prune, count-objects, etc. We did not convert git-fsck at the time because it implemented an inode-sorting scheme that was not part of the generic code. Now that the inode-sorting code is gone, we can reuse the generic code. The result is shorter, hopefully more readable, and drops some unchecked sprintf calls. Signed-off-by: Jeff King--- builtin/fsck.c | 69 -- 1 file changed, 23 insertions(+), 46 deletions(-) diff --git a/builtin/fsck.c b/builtin/fsck.c index 73c3596..2fe6a31 100644 --- a/builtin/fsck.c +++ b/builtin/fsck.c @@ -365,45 +365,6 @@ static int fsck_obj_buffer(const unsigned char *sha1, enum object_type type, return fsck_obj(obj); } -static inline int is_loose_object_file(struct dirent *de, - char *name, unsigned char *sha1) -{ - if (strlen(de->d_name) != 38) - return 0; - memcpy(name + 2, de->d_name, 39); - return !get_sha1_hex(name, sha1); -} - -static void fsck_dir(int i, char *path) -{ - DIR *dir = opendir(path); - struct dirent *de; - char name[100]; - - if (!dir) - return; - - if (verbose) - fprintf(stderr, "Checking directory %s\n", path); - - sprintf(name, "%02x", i); - while ((de = readdir(dir)) != NULL) { - unsigned char sha1[20]; - - if (is_dot_or_dotdot(de->d_name)) - continue; - if (is_loose_object_file(de, name, sha1)) { - if (fsck_sha1(sha1)) - errors_found |= ERROR_OBJECT; - continue; - } - if (starts_with(de->d_name, "tmp_obj_")) - continue; - fprintf(stderr, "bad sha1 file: %s/%s\n", path, de->d_name); - } - closedir(dir); -} - static int default_refs; static void fsck_handle_reflog_sha1(const char *refname, unsigned char *sha1) @@ -491,9 +452,28 @@ static void get_default_heads(void) } } +static int fsck_loose(const unsigned char *sha1, const char *path, void *data) +{ + if (fsck_sha1(sha1)) + errors_found |= ERROR_OBJECT; + return 0; +} + +static int fsck_cruft(const char *basename, const char *path, void *data) +{ + if (!starts_with(basename, "tmp_obj_")) + fprintf(stderr, "bad sha1 file: %s\n", path); + return 0; +} + +static int fsck_subdir(int nr, const char *path, void *progress) +{ + display_progress(progress, nr + 1); + return 0; +} + static void fsck_object_dir(const char *path) { - int i; struct progress *progress = NULL; if (verbose) @@ -501,12 +481,9 @@ static void fsck_object_dir(const char *path) if (show_progress) progress = start_progress(_("Checking object directories"), 256); - for (i = 0; i < 256; i++) { - static char dir[4096]; - sprintf(dir, "%s/%02x", path, i); - fsck_dir(i, dir); - display_progress(progress, i+1); - } + + for_each_loose_file_in_objdir(path, fsck_loose, fsck_cruft, fsck_subdir, + progress); stop_progress(); } -- 2.6.0.rc2.408.ga2926b9 -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 63/67] fsck: drop inode-sorting code
Fsck tries to access loose objects in order of inode number, with the hope that this would make cold cache access faster on a spinning disk. This dates back to 7e8c174 (fsck-cache: sort entries by inode number, 2005-05-02), which predates the invention of packfiles. These days, there's not much point in trying to optimize cold cache for a large number of loose objects. You are much better off to simply pack the objects, which will reduce the disk footprint _and_ provide better locality of data access. So while you can certainly construct pathological cases where this code might help, it is not worth the trouble anymore. Signed-off-by: Jeff King--- builtin/fsck.c | 70 ++ 1 file changed, 2 insertions(+), 68 deletions(-) diff --git a/builtin/fsck.c b/builtin/fsck.c index a019f4a..73c3596 100644 --- a/builtin/fsck.c +++ b/builtin/fsck.c @@ -39,14 +39,6 @@ static int show_dangling = 1; #define ERROR_REACHABLE 02 #define ERROR_PACK 04 -#ifdef NO_D_INO_IN_DIRENT -#define SORT_DIRENT 0 -#define DIRENT_SORT_HINT(de) 0 -#else -#define SORT_DIRENT 1 -#define DIRENT_SORT_HINT(de) ((de)->d_ino) -#endif - static int fsck_config(const char *var, const char *value, void *cb) { if (strcmp(var, "fsck.skiplist") == 0) { @@ -373,64 +365,6 @@ static int fsck_obj_buffer(const unsigned char *sha1, enum object_type type, return fsck_obj(obj); } -/* - * This is the sorting chunk size: make it reasonably - * big so that we can sort well.. - */ -#define MAX_SHA1_ENTRIES (1024) - -struct sha1_entry { - unsigned long ino; - unsigned char sha1[20]; -}; - -static struct { - unsigned long nr; - struct sha1_entry *entry[MAX_SHA1_ENTRIES]; -} sha1_list; - -static int ino_compare(const void *_a, const void *_b) -{ - const struct sha1_entry *a = _a, *b = _b; - unsigned long ino1 = a->ino, ino2 = b->ino; - return ino1 < ino2 ? -1 : ino1 > ino2 ? 1 : 0; -} - -static void fsck_sha1_list(void) -{ - int i, nr = sha1_list.nr; - - if (SORT_DIRENT) - qsort(sha1_list.entry, nr, - sizeof(struct sha1_entry *), ino_compare); - for (i = 0; i < nr; i++) { - struct sha1_entry *entry = sha1_list.entry[i]; - unsigned char *sha1 = entry->sha1; - - sha1_list.entry[i] = NULL; - if (fsck_sha1(sha1)) - errors_found |= ERROR_OBJECT; - free(entry); - } - sha1_list.nr = 0; -} - -static void add_sha1_list(unsigned char *sha1, unsigned long ino) -{ - struct sha1_entry *entry = xmalloc(sizeof(*entry)); - int nr; - - entry->ino = ino; - hashcpy(entry->sha1, sha1); - nr = sha1_list.nr; - if (nr == MAX_SHA1_ENTRIES) { - fsck_sha1_list(); - nr = 0; - } - sha1_list.entry[nr] = entry; - sha1_list.nr = ++nr; -} - static inline int is_loose_object_file(struct dirent *de, char *name, unsigned char *sha1) { @@ -459,7 +393,8 @@ static void fsck_dir(int i, char *path) if (is_dot_or_dotdot(de->d_name)) continue; if (is_loose_object_file(de, name, sha1)) { - add_sha1_list(sha1, DIRENT_SORT_HINT(de)); + if (fsck_sha1(sha1)) + errors_found |= ERROR_OBJECT; continue; } if (starts_with(de->d_name, "tmp_obj_")) @@ -573,7 +508,6 @@ static void fsck_object_dir(const char *path) display_progress(progress, i+1); } stop_progress(); - fsck_sha1_list(); } static int fsck_head_link(void) -- 2.6.0.rc2.408.ga2926b9 -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 07/67] strbuf: make strbuf_complete_line more generic
The strbuf_complete_line function make sure that a buffer ends in a newline. But we may want to do this for any character (e.g., "/" on the end of a path). Let's factor out a generic version, and keep strbuf_complete_line as a thin wrapper. Signed-off-by: Jeff King--- strbuf.h | 16 ++-- 1 file changed, 14 insertions(+), 2 deletions(-) diff --git a/strbuf.h b/strbuf.h index aef2794..ba099cd 100644 --- a/strbuf.h +++ b/strbuf.h @@ -491,10 +491,22 @@ extern void strbuf_add_lines(struct strbuf *sb, const char *prefix, const char * */ extern void strbuf_addstr_xml_quoted(struct strbuf *sb, const char *s); +/** + * Ensure that `sb` ends with the character `term`, if it does not + * already. + */ +static inline void strbuf_complete(struct strbuf *sb, char term) +{ + if (sb->len && sb->buf[sb->len - 1] != term) + strbuf_addch(sb, term); +} + +/** + * Ensure that `sb` ends with a newline. + */ static inline void strbuf_complete_line(struct strbuf *sb) { - if (sb->len && sb->buf[sb->len - 1] != '\n') - strbuf_addch(sb, '\n'); + strbuf_complete(sb, '\n'); } extern int strbuf_branchname(struct strbuf *sb, const char *name); -- 2.6.0.rc2.408.ga2926b9 -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 04/67] fsck: don't fsck alternates for connectivity-only check
Commit 02976bf (fsck: introduce `git fsck --connectivity-only`, 2015-06-22) recently gave fsck an option to perform only a subset of the checks, by skipping the fsck_object_dir() call. However, it does so only for the local object directory, and we still do expensive checks on any alternate repos. We should skip them in this case, too. Signed-off-by: Jeff King--- builtin/fsck.c | 17 + 1 file changed, 9 insertions(+), 8 deletions(-) diff --git a/builtin/fsck.c b/builtin/fsck.c index 0794703..46c7235 100644 --- a/builtin/fsck.c +++ b/builtin/fsck.c @@ -678,16 +678,17 @@ int cmd_fsck(int argc, const char **argv, const char *prefix) git_config(fsck_config, NULL); fsck_head_link(); - if (!connectivity_only) + if (!connectivity_only) { fsck_object_dir(get_object_directory()); - prepare_alt_odb(); - for (alt = alt_odb_list; alt; alt = alt->next) { - char namebuf[PATH_MAX]; - int namelen = alt->name - alt->base; - memcpy(namebuf, alt->base, namelen); - namebuf[namelen - 1] = 0; - fsck_object_dir(namebuf); + prepare_alt_odb(); + for (alt = alt_odb_list; alt; alt = alt->next) { + char namebuf[PATH_MAX]; + int namelen = alt->name - alt->base; + memcpy(namebuf, alt->base, namelen); + namebuf[namelen - 1] = 0; + fsck_object_dir(namebuf); + } } if (check_full) { -- 2.6.0.rc2.408.ga2926b9 -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 06/67] add git_path_buf helper function
If you have a function that uses git_path a lot, but would prefer to avoid the static buffers, it's useful to keep a single scratch buffer locally and reuse it for each call. You used to be able to do this with git_snpath: char buf[PATH_MAX]; foo(git_snpath(buf, sizeof(buf), "foo")); bar(git_snpath(buf, sizeof(buf), "bar")); but since 1a83c24, git_snpath has been replaced with strbuf_git_path. This is good, because it removes the arbitrary PATH_MAX limit. But using strbuf_git_path is more awkward for two reasons: 1. It adds to the buffer, rather than replacing it. This is consistent with other strbuf functions, but makes reuse of a single buffer more tedious. 2. It doesn't return the buffer, so you can't format as part of a function's arguments. The new git_path_buf solves both of these, so you can use it like: struct strbuf buf = STRBUF_INIT; foo(git_path_buf(, "foo")); bar(git_path_buf(, "bar")); strbuf_release(); Signed-off-by: Jeff King--- cache.h | 2 ++ path.c | 10 ++ 2 files changed, 12 insertions(+) diff --git a/cache.h b/cache.h index 79066e5..e231e47 100644 --- a/cache.h +++ b/cache.h @@ -723,6 +723,8 @@ extern char *mksnpath(char *buf, size_t n, const char *fmt, ...) __attribute__((format (printf, 3, 4))); extern void strbuf_git_path(struct strbuf *sb, const char *fmt, ...) __attribute__((format (printf, 2, 3))); +extern char *git_path_buf(struct strbuf *buf, const char *fmt, ...) + __attribute__((format (printf, 2, 3))); extern void strbuf_git_path_submodule(struct strbuf *sb, const char *path, const char *fmt, ...) __attribute__((format (printf, 3, 4))); diff --git a/path.c b/path.c index 95acbaf..46a4d27 100644 --- a/path.c +++ b/path.c @@ -175,6 +175,16 @@ static void do_git_path(struct strbuf *buf, const char *fmt, va_list args) strbuf_cleanup_path(buf); } +char *git_path_buf(struct strbuf *buf, const char *fmt, ...) +{ + va_list args; + strbuf_reset(buf); + va_start(args, fmt); + do_git_path(buf, fmt, args); + va_end(args); + return buf->buf; +} + void strbuf_git_path(struct strbuf *sb, const char *fmt, ...) { va_list args; -- 2.6.0.rc2.408.ga2926b9 -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 05/67] add xsnprintf helper function
There are a number of places in the code where we call sprintf(), with the assumption that the output will fit into the buffer. In many cases this is true (e.g., formatting a number into a large buffer), but it is hard to tell immediately from looking at the code. It would be nice if we had some run-time check to make sure that our assumption is correct (and to communicate to readers of the code that we are not blindly calling sprintf, but have actually thought about this case). This patch introduces xsnprintf, which behaves just like snprintf, except that it dies whenever the output is truncated. This acts as a sort of assert() for these cases, which can help find places where the assumption is violated (as opposed to truncating and proceeding, which may just silently give a wrong answer). Signed-off-by: Jeff King--- git-compat-util.h | 3 +++ wrapper.c | 16 2 files changed, 19 insertions(+) diff --git a/git-compat-util.h b/git-compat-util.h index f649e81..348b9dc 100644 --- a/git-compat-util.h +++ b/git-compat-util.h @@ -744,6 +744,9 @@ static inline size_t xsize_t(off_t len) return (size_t)len; } +__attribute__((format (printf, 3, 4))) +extern int xsnprintf(char *dst, size_t max, const char *fmt, ...); + /* in ctype.c, for kwset users */ extern const unsigned char tolower_trans_tbl[256]; diff --git a/wrapper.c b/wrapper.c index 0e22d43..6fcaa4d 100644 --- a/wrapper.c +++ b/wrapper.c @@ -621,6 +621,22 @@ char *xgetcwd(void) return strbuf_detach(, NULL); } +int xsnprintf(char *dst, size_t max, const char *fmt, ...) +{ + va_list ap; + int len; + + va_start(ap, fmt); + len = vsnprintf(dst, max, fmt, ap); + va_end(ap); + + if (len < 0) + die("BUG: your snprintf is broken"); + if (len >= max) + die("BUG: attempt to snprintf into too-small buffer"); + return len; +} + static int write_file_v(const char *path, int fatal, const char *fmt, va_list params) { -- 2.6.0.rc2.408.ga2926b9 -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 59/67] prefer memcpy to strcpy
When we already know the length of a string (e.g., because we just malloc'd to fit it), it's nicer to use memcpy than strcpy, as it makes it more obvious that we are not going to overflow the buffer (because the size we pass matches the size in the allocation). This also eliminates calls to strcpy, which make auditing the code base harder. Signed-off-by: Jeff King--- compat/nedmalloc/nedmalloc.c | 5 +++-- fast-import.c| 5 +++-- revision.c | 2 +- 3 files changed, 7 insertions(+), 5 deletions(-) diff --git a/compat/nedmalloc/nedmalloc.c b/compat/nedmalloc/nedmalloc.c index 609ebba..a0a16eb 100644 --- a/compat/nedmalloc/nedmalloc.c +++ b/compat/nedmalloc/nedmalloc.c @@ -957,8 +957,9 @@ char *strdup(const char *s1) { char *s2 = 0; if (s1) { - s2 = malloc(strlen(s1) + 1); - strcpy(s2, s1); + size_t len = strlen(s1) + 1; + s2 = malloc(len); + memcpy(s2, s1, len); } return s2; } diff --git a/fast-import.c b/fast-import.c index 895c6b4..cf6d8bc 100644 --- a/fast-import.c +++ b/fast-import.c @@ -644,8 +644,9 @@ static void *pool_calloc(size_t count, size_t size) static char *pool_strdup(const char *s) { - char *r = pool_alloc(strlen(s) + 1); - strcpy(r, s); + size_t len = strlen(s) + 1; + char *r = pool_alloc(len); + memcpy(r, s, len); return r; } diff --git a/revision.c b/revision.c index af2a18e..2236463 100644 --- a/revision.c +++ b/revision.c @@ -38,7 +38,7 @@ char *path_name(const struct name_path *path, const char *name) } n = xmalloc(len); m = n + len - (nlen + 1); - strcpy(m, name); + memcpy(m, name, nlen + 1); for (p = path; p; p = p->up) { if (p->elem_len) { m -= p->elem_len + 1; -- 2.6.0.rc2.408.ga2926b9 -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 67/67] name-rev: use strip_suffix to avoid magic numbers
The manual size computations here are correct, but using strip_suffix makes that obvious, and hopefully communicates the intent of the code more clearly. Signed-off-by: Jeff King--- builtin/name-rev.c | 9 - 1 file changed, 4 insertions(+), 5 deletions(-) diff --git a/builtin/name-rev.c b/builtin/name-rev.c index 8a3a0cd..0377fc1 100644 --- a/builtin/name-rev.c +++ b/builtin/name-rev.c @@ -55,16 +55,15 @@ copy_data: parents; parents = parents->next, parent_number++) { if (parent_number > 1) { - int len = strlen(tip_name); + size_t len; char *new_name; - if (len > 2 && !strcmp(tip_name + len - 2, "^0")) - len -= 2; + strip_suffix(tip_name, "^0", ); if (generation > 0) - new_name = xstrfmt("%.*s~%d^%d", len, tip_name, + new_name = xstrfmt("%.*s~%d^%d", (int)len, tip_name, generation, parent_number); else - new_name = xstrfmt("%.*s^%d", len, tip_name, + new_name = xstrfmt("%.*s^%d", (int)len, tip_name, parent_number); name_rev(parents->item, new_name, 0, -- 2.6.0.rc2.408.ga2926b9 -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 03/67] archive-tar: fix minor indentation violation
This looks like a simple omission from 8539070 (archive-tar: unindent write_tar_entry by one level, 2012-05-03). Signed-off-by: Jeff King--- archive-tar.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/archive-tar.c b/archive-tar.c index 0d1e6bd..b6b30bb 100644 --- a/archive-tar.c +++ b/archive-tar.c @@ -233,7 +233,7 @@ static int write_tar_entry(struct archiver_args *args, size_t rest = pathlen - plen - 1; if (plen > 0 && rest <= sizeof(header.name)) { memcpy(header.prefix, path, plen); - memcpy(header.name, path + plen + 1, rest); + memcpy(header.name, path + plen + 1, rest); } else { sprintf(header.name, "%s.data", sha1_to_hex(sha1)); -- 2.6.0.rc2.408.ga2926b9 -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 02/67] mailsplit: fix FILE* leak in split_maildir
If we encounter an error while splitting a maildir, we exit the function early, leaking the open filehandle. This isn't a big deal, since we exit the program soon after, but it's easy enough to be careful. Signed-off-by: Jeff King--- builtin/mailsplit.c | 5 - 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/builtin/mailsplit.c b/builtin/mailsplit.c index 8e02ea1..9de06e3 100644 --- a/builtin/mailsplit.c +++ b/builtin/mailsplit.c @@ -150,6 +150,7 @@ static int split_maildir(const char *maildir, const char *dir, { char file[PATH_MAX]; char name[PATH_MAX]; + FILE *f = NULL; int ret = -1; int i; struct string_list list = STRING_LIST_INIT_DUP; @@ -160,7 +161,6 @@ static int split_maildir(const char *maildir, const char *dir, goto out; for (i = 0; i < list.nr; i++) { - FILE *f; snprintf(file, sizeof(file), "%s/%s", maildir, list.items[i].string); f = fopen(file, "r"); if (!f) { @@ -177,10 +177,13 @@ static int split_maildir(const char *maildir, const char *dir, split_one(f, name, 1); fclose(f); + f = NULL; } ret = skip; out: + if (f) + fclose(f); string_list_clear(, 1); return ret; } -- 2.6.0.rc2.408.ga2926b9 -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 22/67] entry.c: convert strcpy to xsnprintf
This particular conversion is non-obvious, because nobody has passed our function the length of the destination buffer. However, the interface to checkout_entry specifies that the buffer must be at least TEMPORARY_FILENAME_LENGTH bytes long, so we can check that (meaning the existing code was not buggy, but merely worrisome to somebody reading it). Signed-off-by: Jeff King--- entry.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/entry.c b/entry.c index 1eda8e9..582c400 100644 --- a/entry.c +++ b/entry.c @@ -96,8 +96,8 @@ static int open_output_fd(char *path, const struct cache_entry *ce, int to_tempf { int symlink = (ce->ce_mode & S_IFMT) != S_IFREG; if (to_tempfile) { - strcpy(path, symlink - ? ".merge_link_XX" : ".merge_file_XX"); + xsnprintf(path, TEMPORARY_FILENAME_LENGTH, "%s", + symlink ? ".merge_link_XX" : ".merge_file_XX"); return mkstemp(path); } else { return create_file(path, !symlink ? ce->ce_mode : 0666); -- 2.6.0.rc2.408.ga2926b9 -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 54/67] color: add overflow checks for parsing colors
Our color parsing is designed to never exceed COLOR_MAXLEN bytes. But the relationship between that hand-computed number and the parsing code is not at all obvious, and we merely hope that it has been computed correctly for all cases. Let's mark the expected "end" pointer for the destination buffer and make sure that we do not exceed it. Signed-off-by: Jeff King--- color.c | 39 --- 1 file changed, 24 insertions(+), 15 deletions(-) diff --git a/color.c b/color.c index 9027352..3a3fa33 100644 --- a/color.c +++ b/color.c @@ -150,22 +150,24 @@ int color_parse(const char *value, char *dst) * already have the ANSI escape code in it. "out" should have enough * space in it to fit any color. */ -static char *color_output(char *out, const struct color *c, char type) +static char *color_output(char *out, int len, const struct color *c, char type) { switch (c->type) { case COLOR_UNSPECIFIED: case COLOR_NORMAL: break; case COLOR_ANSI: + if (len < 2) + die("BUG: color parsing ran out of space"); *out++ = type; *out++ = '0' + c->value; break; case COLOR_256: - out += sprintf(out, "%c8;5;%d", type, c->value); + out += xsnprintf(out, len, "%c8;5;%d", type, c->value); break; case COLOR_RGB: - out += sprintf(out, "%c8;2;%d;%d;%d", type, - c->red, c->green, c->blue); + out += xsnprintf(out, len, "%c8;2;%d;%d;%d", type, +c->red, c->green, c->blue); break; } return out; @@ -180,12 +182,13 @@ int color_parse_mem(const char *value, int value_len, char *dst) { const char *ptr = value; int len = value_len; + char *end = dst + COLOR_MAXLEN; unsigned int attr = 0; struct color fg = { COLOR_UNSPECIFIED }; struct color bg = { COLOR_UNSPECIFIED }; if (!strncasecmp(value, "reset", len)) { - strcpy(dst, GIT_COLOR_RESET); + xsnprintf(dst, end - dst, GIT_COLOR_RESET); return 0; } @@ -224,12 +227,18 @@ int color_parse_mem(const char *value, int value_len, char *dst) goto bad; } +#define OUT(x) do { \ + if (dst == end) \ + die("BUG: color parsing ran out of space"); \ + *dst++ = (x); \ +} while(0) + if (attr || !color_empty() || !color_empty()) { int sep = 0; int i; - *dst++ = '\033'; - *dst++ = '['; + OUT('\033'); + OUT('['); for (i = 0; attr; i++) { unsigned bit = (1 << i); @@ -237,24 +246,24 @@ int color_parse_mem(const char *value, int value_len, char *dst) continue; attr &= ~bit; if (sep++) - *dst++ = ';'; - dst += sprintf(dst, "%d", i); + OUT(';'); + dst += xsnprintf(dst, end - dst, "%d", i); } if (!color_empty()) { if (sep++) - *dst++ = ';'; + OUT(';'); /* foreground colors are all in the 3x range */ - dst = color_output(dst, , '3'); + dst = color_output(dst, end - dst, , '3'); } if (!color_empty()) { if (sep++) - *dst++ = ';'; + OUT(';'); /* background colors are all in the 4x range */ - dst = color_output(dst, , '4'); + dst = color_output(dst, end - dst, , '4'); } - *dst++ = 'm'; + OUT('m'); } - *dst = 0; + OUT(0); return 0; bad: return error(_("invalid color value: %.*s"), value_len, value); -- 2.6.0.rc2.408.ga2926b9 -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 55/67] use alloc_ref rather than hand-allocating "struct ref"
This saves us some manual computation, and eliminates a call to strcpy. Signed-off-by: Jeff King--- builtin/fetch.c | 3 +-- remote-curl.c | 5 + 2 files changed, 2 insertions(+), 6 deletions(-) diff --git a/builtin/fetch.c b/builtin/fetch.c index 841880e..ed84963 100644 --- a/builtin/fetch.c +++ b/builtin/fetch.c @@ -639,8 +639,7 @@ static int store_updated_refs(const char *raw_url, const char *remote_name, continue; if (rm->peer_ref) { - ref = xcalloc(1, sizeof(*ref) + strlen(rm->peer_ref->name) + 1); - strcpy(ref->name, rm->peer_ref->name); + ref = alloc_ref(rm->peer_ref->name); hashcpy(ref->old_sha1, rm->peer_ref->old_sha1); hashcpy(ref->new_sha1, rm->old_sha1); ref->force = rm->peer_ref->force; diff --git a/remote-curl.c b/remote-curl.c index 71fbbb6..cc7a8a6 100644 --- a/remote-curl.c +++ b/remote-curl.c @@ -168,10 +168,7 @@ static struct ref *parse_info_refs(struct discovery *heads) url.buf); data[i] = 0; ref_name = mid + 1; - ref = xmalloc(sizeof(struct ref) + - strlen(ref_name) + 1); - memset(ref, 0, sizeof(struct ref)); - strcpy(ref->name, ref_name); + ref = alloc_ref(ref_name); get_sha1_hex(start, ref->old_sha1); if (!refs) refs = ref; -- 2.6.0.rc2.408.ga2926b9 -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 57/67] receive-pack: simplify keep_arg computation
To generate "--keep=receive-pack $pid on $host", we write progressively into a single buffer, which requires keeping track of how much we've written so far. But since the result is destined to go into our argv array, we can simply use argv_array_pushf. Unfortunately we still have to have a static buffer for the gethostname() call, but at least it now doesn't involve any extra size computation. And as a bonus, we drop an sprintf and a strcpy call. Signed-off-by: Jeff King--- builtin/receive-pack.c | 17 ++--- 1 file changed, 10 insertions(+), 7 deletions(-) diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c index 8b50e48..2c82274 100644 --- a/builtin/receive-pack.c +++ b/builtin/receive-pack.c @@ -1524,15 +1524,18 @@ static const char *unpack(int err_fd, struct shallow_info *si) if (status) return "unpack-objects abnormal exit"; } else { - int s; - char keep_arg[256]; - - s = sprintf(keep_arg, "--keep=receive-pack %"PRIuMAX" on ", (uintmax_t) getpid()); - if (gethostname(keep_arg + s, sizeof(keep_arg) - s)) - strcpy(keep_arg + s, "localhost"); + char hostname[256]; argv_array_pushl(, "index-pack", -"--stdin", hdr_arg, keep_arg, NULL); +"--stdin", hdr_arg, NULL); + + if (gethostname(hostname, sizeof(hostname))) + xsnprintf(hostname, sizeof(hostname), "localhost"); + argv_array_pushf(, +"--keep=receive-pack %"PRIuMAX" on %s", +(uintmax_t)getpid(), +hostname); + if (fsck_objects) argv_array_pushf(, "--strict%s", fsck_msg_types.buf); -- 2.6.0.rc2.408.ga2926b9 -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 58/67] help: clean up kfmclient munging
When we are going to launch "/path/to/konqueror", we instead rewrite this into "/path/to/kfmclient" by duplicating the original string and writing over the ending bits. This can be done more obviously with strip_suffix and xstrfmt. Note that we also fix a subtle bug with the "filename" parameter, which is passed as argv[0] to the child. If the user has configured a program name with no directory component, we always pass the string "kfmclient", even if your program is called something else. But if you give a full path, we give the basename of that path. But more bizarrely, if we rewrite "konqueror" to "kfmclient", we still pass "konqueror". The history of this function doesn't reveal anything interesting, so it looks like just an oversight from combining the suffix-munging with the basename-finding. Let's just call basename on the munged path, which produces consistent results (if you gave a program, whether a full path or not, we pass its basename). Probably this doesn't matter at all in practice, but it makes the code slightly less confusing to read. Signed-off-by: Jeff King--- builtin/help.c | 15 --- 1 file changed, 4 insertions(+), 11 deletions(-) diff --git a/builtin/help.c b/builtin/help.c index fba8c01..e1650ab 100644 --- a/builtin/help.c +++ b/builtin/help.c @@ -140,17 +140,10 @@ static void exec_man_konqueror(const char *path, const char *page) /* It's simpler to launch konqueror using kfmclient. */ if (path) { - const char *file = strrchr(path, '/'); - if (file && !strcmp(file + 1, "konqueror")) { - char *new = xstrdup(path); - char *dest = strrchr(new, '/'); - - /* strlen("konqueror") == strlen("kfmclient") */ - strcpy(dest + 1, "kfmclient"); - path = new; - } - if (file) - filename = file; + size_t len; + if (strip_suffix(path, "/konqueror", )) + path = xstrfmt("%.*s/kfmclient", (int)len, path); + filename = basename((char *)path); } else path = "kfmclient"; strbuf_addf(_page, "man:%s(1)", page); -- 2.6.0.rc2.408.ga2926b9 -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 0/67] war on sprintf, strcpy, etc
The git code contains a lot of calls to sprintf, strcpy, and other unchecked string functions. In many cases, these aren't actually overflows, because some earlier part of the code implies that the copied content is smaller than the destination buffer. But it's often hard to tell, because the code enforcing that assumption is far away, or there's a complicated expression to create a buffer. This makes it difficult to audit git for buffer overflows, because you can spend a lot of time chasing down false positives. My goal with this series was to not only audit each of these sites for overflows, but to convert them to more modern constructs (e.g., strbufs), so that it's easier to do more audits going forward. There are quite a large number of changes. I've tried to group similar changes together to make reviewing easier. Here's a rough breakdown: [01/67]: show-branch: avoid segfault with --reflog of unborn branch [02/67]: mailsplit: fix FILE* leak in split_maildir [03/67]: archive-tar: fix minor indentation violation [04/67]: fsck: don't fsck alternates for connectivity-only check These are minor bugfixes that I found while digging into various call-sites. There's no semantic dependency, but some of the later patches depend on these textually. [05/67]: add xsnprintf helper function [06/67]: add git_path_buf helper function [07/67]: strbuf: make strbuf_complete_line more generic [08/67]: add reentrant variants of sha1_to_hex and find_unique_abbrev These four patches introduce infrastructure that will help later cleanups (alongside existing tools like strbuf, xstrfmt, etc). [09/67]: fsck: use strbuf to generate alternate directories [10/67]: mailsplit: make PATH_MAX buffers dynamic [11/67]: trace: use strbuf for quote_crnl output [12/67]: progress: store throughput display in a strbuf [13/67]: test-dump-cache-tree: avoid overflow of cache-tree name [14/67]: compat/inet_ntop: fix off-by-one in inet_ntop4 These cases are all things that _can_ overflow, given the right input. But none of them is interesting security-wise because, because their input is not typically attacker-controlled. [15/67]: convert trivial sprintf / strcpy calls to xsnprintf [16/67]: archive-tar: use xsnprintf for trivial formatting [17/67]: use xsnprintf for generating git object headers [18/67]: find_short_object_filename: convert sprintf to xsnprintf [19/67]: stop_progress_msg: convert sprintf to xsnprintf [20/67]: compat/hstrerror: convert sprintf to snprintf [21/67]: grep: use xsnprintf to format failure message [22/67]: entry.c: convert strcpy to xsnprintf [23/67]: add_packed_git: convert strcpy into xsnprintf [24/67]: http-push: replace strcat with xsnprintf [25/67]: receive-pack: convert strncpy to xsnprintf These cases can all be fixed by using the newly-added xsnprintf. The trivial conversions are in patch 15, and the rest are cases that needed a little more cleanup or explanation. [26/67]: replace trivial malloc + sprintf /strcpy calls to xstrfmt [27/67]: config: use xstrfmt in normalize_value [28/67]: fetch: replace static buffer with xstrfmt [29/67]: use strip_suffix and xstrfmt to replace suffix [30/67]: ref-filter: drop sprintf and strcpy calls [31/67]: help: drop prepend function in favor of xstrfmt [32/67]: mailmap: replace strcpy with xstrdup [33/67]: read_branches_file: replace strcpy with xstrdup Ditto, but for xstrfmt/xstrdup. [34/67]: resolve_ref: use strbufs for internal buffers [35/67]: upload-archive: convert sprintf to strbuf [36/67]: remote-ext: simplify git pkt-line generation [37/67]: http-push: use strbuf instead of fwrite_buffer [38/67]: http-walker: store url in a strbuf [39/67]: sha1_get_pack_name: use a strbuf [40/67]: init: use strbufs to store paths [41/67]: apply: convert root string to strbuf [42/67]: transport: use strbufs for status table "quickref" strings [43/67]: merge-recursive: convert malloc / strcpy to strbuf [44/67]: enter_repo: convert fixed-size buffers to strbufs [45/67]: remove_leading_path: use a strbuf for internal storage [46/67]: write_loose_object: convert to strbuf [47/67]: diagnose_invalid_index_path: use strbuf to avoid strcpy/strcat Ditto, but for strbufs. I generally used xstrfmt over a strbuf where it was feasible, since the former is shorter. These cases typically did something a little more complicated than xstrfmt could handle. [48/67]: fetch-pack: use argv_array for index-pack / unpack-objects [49/67]: http-push: use an argv_array for setup_revisions [50/67]: stat_tracking_info: convert to argv_array [51/67]: daemon: use cld->env_array when re-spawning Ditto, but for argv_array. This helps regular overflows, because argv_array_pushf uses a strbuf internally. But it also prevents overflowing the array-of-pointers. [52/67]: use sha1_to_hex_to() instead of strcpy [53/67]: drop strcpy in
[PATCH 12/67] progress: store throughput display in a strbuf
Coverity noticed that we strncpy() into a fixed-size buffer without making sure that it actually ended up NUL-terminated. This is unlikely to be a bug in practice, since throughput strings rarely hit 32 characters, but it would be nice to clean it up. The most obvious way to do so is to add a NUL-terminator. But instead, this patch switches the fixed-size buffer out for a strbuf. At first glance this seems much less efficient, until we realize that filling in the fixed-size buffer is done by writing into a strbuf and copying the result! By writing straight to the buffer, we actually end up more efficient: 1. We avoid an extra copy of the bytes. 2. Rather than malloc/free each time progress is shown, we can strbuf_reset and use the same buffer each time. Signed-off-by: Jeff King--- I actually sent this one to the list in June: http://thread.gmane.org/gmane.comp.version-control.git/271880 but it got overlooked. Good luck overlooking this 67-patch monstrosity. :) progress.c | 18 -- 1 file changed, 8 insertions(+), 10 deletions(-) diff --git a/progress.c b/progress.c index 2e31bec..a3efcfd 100644 --- a/progress.c +++ b/progress.c @@ -25,7 +25,7 @@ struct throughput { unsigned int last_bytes[TP_IDX_MAX]; unsigned int last_misecs[TP_IDX_MAX]; unsigned int idx; - char display[32]; + struct strbuf display; }; struct progress { @@ -98,7 +98,7 @@ static int display(struct progress *progress, unsigned n, const char *done) } progress->last_value = n; - tp = (progress->throughput) ? progress->throughput->display : ""; + tp = (progress->throughput) ? progress->throughput->display.buf : ""; eol = done ? done : " \r"; if (progress->total) { unsigned percent = n * 100 / progress->total; @@ -129,6 +129,7 @@ static int display(struct progress *progress, unsigned n, const char *done) static void throughput_string(struct strbuf *buf, off_t total, unsigned int rate) { + strbuf_reset(buf); strbuf_addstr(buf, ", "); strbuf_humanise_bytes(buf, total); strbuf_addstr(buf, " | "); @@ -141,7 +142,6 @@ void display_throughput(struct progress *progress, off_t total) struct throughput *tp; uint64_t now_ns; unsigned int misecs, count, rate; - struct strbuf buf = STRBUF_INIT; if (!progress) return; @@ -154,6 +154,7 @@ void display_throughput(struct progress *progress, off_t total) if (tp) { tp->prev_total = tp->curr_total = total; tp->prev_ns = now_ns; + strbuf_init(>display, 0); } return; } @@ -193,9 +194,7 @@ void display_throughput(struct progress *progress, off_t total) tp->last_misecs[tp->idx] = misecs; tp->idx = (tp->idx + 1) % TP_IDX_MAX; - throughput_string(, total, rate); - strncpy(tp->display, buf.buf, sizeof(tp->display)); - strbuf_release(); + throughput_string(>display, total, rate); if (progress->last_value != -1 && progress_update) display(progress, progress->last_value, NULL); } @@ -250,12 +249,9 @@ void stop_progress_msg(struct progress **p_progress, const char *msg) bufp = (len < sizeof(buf)) ? buf : xmalloc(len + 1); if (tp) { - struct strbuf strbuf = STRBUF_INIT; unsigned int rate = !tp->avg_misecs ? 0 : tp->avg_bytes / tp->avg_misecs; - throughput_string(, tp->curr_total, rate); - strncpy(tp->display, strbuf.buf, sizeof(tp->display)); - strbuf_release(); + throughput_string(>display, tp->curr_total, rate); } progress_update = 1; sprintf(bufp, ", %s.\n", msg); @@ -264,6 +260,8 @@ void stop_progress_msg(struct progress **p_progress, const char *msg) free(bufp); } clear_progress_signal(); + if (progress->throughput) + strbuf_release(>throughput->display); free(progress->throughput); free(progress); } -- 2.6.0.rc2.408.ga2926b9 -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 24/67] http-push: replace strcat with xsnprintf
We account for these strcats in our initial allocation, but the code is confusing to follow and verify. Let's remember our original allocation length, and then xsnprintf can verify that we don't exceed it. Note that we can't just use xstrfmt here (which would be even cleaner) because the code tries to grow the buffer only when necessary. Signed-off-by: Jeff King--- It would probably be a good match for a strbuf, but it's hard to over-emphasize how little interest I have in refactoring the http-push webdav code. http-push.c | 8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/http-push.c b/http-push.c index 1f3788f..37baff8 100644 --- a/http-push.c +++ b/http-push.c @@ -786,21 +786,21 @@ xml_start_tag(void *userData, const char *name, const char **atts) { struct xml_ctx *ctx = (struct xml_ctx *)userData; const char *c = strchr(name, ':'); - int new_len; + int old_namelen, new_len; if (c == NULL) c = name; else c++; - new_len = strlen(ctx->name) + strlen(c) + 2; + old_namelen = strlen(ctx->name); + new_len = old_namelen + strlen(c) + 2; if (new_len > ctx->len) { ctx->name = xrealloc(ctx->name, new_len); ctx->len = new_len; } - strcat(ctx->name, "."); - strcat(ctx->name, c); + xsnprintf(ctx->name + old_namelen, ctx->len - old_namelen, ".%s", c); free(ctx->cdata); ctx->cdata = NULL; -- 2.6.0.rc2.408.ga2926b9 -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 25/67] receive-pack: convert strncpy to xsnprintf
This strncpy is pointless; we pass the strlen() of the src string, meaning that it works just like a memcpy. Worse, though, is that the size has no relation to the destination buffer, meaning it is a potential overflow. In practice, it's not. We pass only short constant strings like "warning: " and "error: ", which are much smaller than the destination buffer. We can make this much simpler by just using xsnprintf, which will check for overflow and return the size for our next vsnprintf, without us having to run a separate strlen(). Signed-off-by: Jeff King--- builtin/receive-pack.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c index e6b93d0..04d2bdf 100644 --- a/builtin/receive-pack.c +++ b/builtin/receive-pack.c @@ -280,10 +280,10 @@ static void rp_warning(const char *err, ...) __attribute__((format (printf, 1, 2 static void report_message(const char *prefix, const char *err, va_list params) { - int sz = strlen(prefix); + int sz; char msg[4096]; - strncpy(msg, prefix, sz); + sz = xsnprintf(msg, sizeof(msg), "%s", prefix); sz += vsnprintf(msg + sz, sizeof(msg) - sz, err, params); if (sz > (sizeof(msg) - 1)) sz = sizeof(msg) - 1; -- 2.6.0.rc2.408.ga2926b9 -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 23/67] add_packed_git: convert strcpy into xsnprintf
We have the path "foo.idx", and we create a buffer big enough to hold "foo.pack" and "foo.keep", and then strcpy straight into it. This isn't a bug (we have enough space), but it's very hard to tell from the strcpy that this is so. Let's instead use strip_suffix to take off the ".idx", record the size of our allocation, and use xsnprintf to make sure we don't violate our assumptions. Signed-off-by: Jeff King--- cache.h | 2 +- sha1_file.c | 19 ++- 2 files changed, 11 insertions(+), 10 deletions(-) diff --git a/cache.h b/cache.h index cc59aba..11372ef 100644 --- a/cache.h +++ b/cache.h @@ -1305,7 +1305,7 @@ extern void close_pack_windows(struct packed_git *); extern void unuse_pack(struct pack_window **); extern void free_pack_by_name(const char *); extern void clear_delta_base_cache(void); -extern struct packed_git *add_packed_git(const char *, int, int); +extern struct packed_git *add_packed_git(const char *path, size_t path_len, int local); /* * Return the SHA-1 of the nth object within the specified packfile. diff --git a/sha1_file.c b/sha1_file.c index f106091..28352a5 100644 --- a/sha1_file.c +++ b/sha1_file.c @@ -1146,11 +1146,12 @@ static void try_to_free_pack_memory(size_t size) release_pack_memory(size); } -struct packed_git *add_packed_git(const char *path, int path_len, int local) +struct packed_git *add_packed_git(const char *path, size_t path_len, int local) { static int have_set_try_to_free_routine; struct stat st; - struct packed_git *p = alloc_packed_git(path_len + 2); + size_t alloc; + struct packed_git *p; if (!have_set_try_to_free_routine) { have_set_try_to_free_routine = 1; @@ -1161,18 +1162,18 @@ struct packed_git *add_packed_git(const char *path, int path_len, int local) * Make sure a corresponding .pack file exists and that * the index looks sane. */ - path_len -= strlen(".idx"); - if (path_len < 1) { - free(p); + if (!strip_suffix_mem(path, _len, ".idx")) return NULL; - } - memcpy(p->pack_name, path, path_len); - strcpy(p->pack_name + path_len, ".keep"); + alloc = path_len + strlen(".pack") + 1; + p = alloc_packed_git(alloc); + memcpy(p->pack_name, path, path_len); /* NUL from zero-ed struct */ + + xsnprintf(p->pack_name + path_len, alloc - path_len, ".keep"); if (!access(p->pack_name, F_OK)) p->pack_keep = 1; - strcpy(p->pack_name + path_len, ".pack"); + xsnprintf(p->pack_name + path_len, alloc - path_len, ".pack"); if (stat(p->pack_name, ) || !S_ISREG(st.st_mode)) { free(p); return NULL; -- 2.6.0.rc2.408.ga2926b9 -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 52/67] use sha1_to_hex_to() instead of strcpy
Before sha1_to_hex_to() existed, a simple way to get a hex sha1 into a buffer was with: strcpy(buf, sha1_to_hex(sha1)); This isn't wrong (assuming the buf is 41 characters), but it makes auditing the code base for bad strcpy() calls harder, as these become false positives. Let's convert them to sha1_to_hex_to(), and likewise for some calls to find_unique_abbrev(). While we're here, we'll double-check that all of the buffers are correctly sized, and use the more obvious GIT_SHA1_HEXSZ constant. Signed-off-by: Jeff King--- builtin/blame.c| 8 builtin/merge-index.c | 4 ++-- builtin/merge.c| 20 ++-- builtin/receive-pack.c | 15 +-- builtin/rev-list.c | 4 ++-- diff.c | 9 - 6 files changed, 31 insertions(+), 29 deletions(-) diff --git a/builtin/blame.c b/builtin/blame.c index 4db01c1..f8ec7ff 100644 --- a/builtin/blame.c +++ b/builtin/blame.c @@ -1867,9 +1867,9 @@ static void emit_porcelain(struct scoreboard *sb, struct blame_entry *ent, int cnt; const char *cp; struct origin *suspect = ent->suspect; - char hex[41]; + char hex[GIT_SHA1_HEXSZ + 1]; - strcpy(hex, sha1_to_hex(suspect->commit->object.sha1)); + sha1_to_hex_to(hex, suspect->commit->object.sha1); printf("%s %d %d %d\n", hex, ent->s_lno + 1, @@ -1905,11 +1905,11 @@ static void emit_other(struct scoreboard *sb, struct blame_entry *ent, int opt) const char *cp; struct origin *suspect = ent->suspect; struct commit_info ci; - char hex[41]; + char hex[GIT_SHA1_HEXSZ + 1]; int show_raw_time = !!(opt & OUTPUT_RAW_TIMESTAMP); get_commit_info(suspect->commit, , 1); - strcpy(hex, sha1_to_hex(suspect->commit->object.sha1)); + sha1_to_hex_to(hex, suspect->commit->object.sha1); cp = nth_line(sb, ent->lno); for (cnt = 0; cnt < ent->num_lines; cnt++) { diff --git a/builtin/merge-index.c b/builtin/merge-index.c index 1d66111..4ed0a83 100644 --- a/builtin/merge-index.c +++ b/builtin/merge-index.c @@ -9,7 +9,7 @@ static int merge_entry(int pos, const char *path) { int found; const char *arguments[] = { pgm, "", "", "", path, "", "", "", NULL }; - char hexbuf[4][60]; + char hexbuf[4][GIT_SHA1_HEXSZ + 1]; char ownbuf[4][60]; if (pos >= active_nr) @@ -22,7 +22,7 @@ static int merge_entry(int pos, const char *path) if (strcmp(ce->name, path)) break; found++; - strcpy(hexbuf[stage], sha1_to_hex(ce->sha1)); + sha1_to_hex_to(hexbuf[stage], ce->sha1); xsnprintf(ownbuf[stage], sizeof(ownbuf[stage]), "%o", ce->ce_mode); arguments[stage] = hexbuf[stage]; arguments[stage + 4] = ownbuf[stage]; diff --git a/builtin/merge.c b/builtin/merge.c index a0edaca..af2a2ae 100644 --- a/builtin/merge.c +++ b/builtin/merge.c @@ -1319,13 +1319,13 @@ int cmd_merge(int argc, const char **argv, const char *prefix) if (verify_signatures) { for (p = remoteheads; p; p = p->next) { struct commit *commit = p->item; - char hex[41]; + char hex[GIT_SHA1_HEXSZ + 1]; struct signature_check signature_check; memset(_check, 0, sizeof(signature_check)); check_commit_signature(commit, _check); - strcpy(hex, find_unique_abbrev(commit->object.sha1, DEFAULT_ABBREV)); + find_unique_abbrev_to(hex, commit->object.sha1, DEFAULT_ABBREV); switch (signature_check.result) { case 'G': break; @@ -1415,15 +1415,15 @@ int cmd_merge(int argc, const char **argv, const char *prefix) /* Again the most common case of merging one remote. */ struct strbuf msg = STRBUF_INIT; struct commit *commit; - char hex[41]; - strcpy(hex, find_unique_abbrev(head_commit->object.sha1, DEFAULT_ABBREV)); - - if (verbosity >= 0) - printf(_("Updating %s..%s\n"), - hex, - find_unique_abbrev(remoteheads->item->object.sha1, - DEFAULT_ABBREV)); + if (verbosity >= 0) { + char from[GIT_SHA1_HEXSZ + 1], to[GIT_SHA1_HEXSZ + 1]; + find_unique_abbrev_to(from, head_commit->object.sha1, + DEFAULT_ABBREV); + find_unique_abbrev_to(to, remoteheads->item->object.sha1, + DEFAULT_ABBREV); + printf(_("Updating %s..%s\n"), from, to);
[PATCH 51/67] daemon: use cld->env_array when re-spawning
This avoids an ugly strcat into a fixed-size buffer. It's not wrong (the buffer is plenty large enough for an IPv6 address plus some minor formatting), but it takes some effort to verify that. Unfortunately we are still stuck with some fixed-size buffers to hold the output of inet_ntop. But at least we now pass very easy-to-verify parameters, rather than doing a manual computation to account for other data in the buffer. As a side effect, this also fixes the case where we might pass an uninitialized portbuf buffer through the environment. This probably couldn't happen in practice, as it would mean that addr->sa_family was neither AF_INET nor AF_INET6 (and that is all we are listening on). Signed-off-by: Jeff King--- daemon.c | 26 ++ 1 file changed, 10 insertions(+), 16 deletions(-) diff --git a/daemon.c b/daemon.c index 5218a3f..56679a1 100644 --- a/daemon.c +++ b/daemon.c @@ -811,8 +811,6 @@ static char **cld_argv; static void handle(int incoming, struct sockaddr *addr, socklen_t addrlen) { struct child_process cld = CHILD_PROCESS_INIT; - char addrbuf[300] = "REMOTE_ADDR=", portbuf[300]; - char *env[] = { addrbuf, portbuf, NULL }; if (max_connections && live_children >= max_connections) { kill_some_child(); @@ -826,27 +824,23 @@ static void handle(int incoming, struct sockaddr *addr, socklen_t addrlen) } if (addr->sa_family == AF_INET) { + char buf[128] = ""; struct sockaddr_in *sin_addr = (void *) addr; - inet_ntop(addr->sa_family, _addr->sin_addr, addrbuf + 12, - sizeof(addrbuf) - 12); - snprintf(portbuf, sizeof(portbuf), "REMOTE_PORT=%d", - ntohs(sin_addr->sin_port)); + inet_ntop(addr->sa_family, _addr->sin_addr, buf, sizeof(buf)); + argv_array_pushf(_array, "REMOTE_ADDR=%s", buf); + argv_array_pushf(_array, "REMOTE_PORT=%d", +ntohs(sin_addr->sin_port)); #ifndef NO_IPV6 } else if (addr->sa_family == AF_INET6) { + char buf[128] = ""; struct sockaddr_in6 *sin6_addr = (void *) addr; - - char *buf = addrbuf + 12; - *buf++ = '['; *buf = '\0'; /* stpcpy() is cool */ - inet_ntop(AF_INET6, _addr->sin6_addr, buf, - sizeof(addrbuf) - 13); - strcat(buf, "]"); - - snprintf(portbuf, sizeof(portbuf), "REMOTE_PORT=%d", - ntohs(sin6_addr->sin6_port)); + inet_ntop(AF_INET6, _addr->sin6_addr, buf, sizeof(buf)); + argv_array_pushf(_array, "REMOTE_ADDR=[%s]", buf); + argv_array_pushf(_array, "REMOTE_PORT=%d", +ntohs(sin6_addr->sin6_port)); #endif } - cld.env = (const char **)env; cld.argv = (const char **)cld_argv; cld.in = incoming; cld.out = dup(incoming); -- 2.6.0.rc2.408.ga2926b9 -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 61/67] notes: document length of fanout path with a constant
We know that a fanned-out sha1 in a notes tree cannot be more than "aa/bb/cc/...", and we have an assert() to confirm that. But let's factor out that length into a constant so we can be sure it is used consistently. And even though we assert() earlier, let's replace a strcpy with xsnprintf, so it is clear to a reader that all cases are covered. Signed-off-by: Jeff King--- notes.c | 9 ++--- 1 file changed, 6 insertions(+), 3 deletions(-) diff --git a/notes.c b/notes.c index eacd2a6..db77922 100644 --- a/notes.c +++ b/notes.c @@ -539,6 +539,9 @@ static unsigned char determine_fanout(struct int_node *tree, unsigned char n, return fanout + 1; } +/* hex SHA1 + 19 * '/' + NUL */ +#define FANOUT_PATH_MAX 40 + 19 + 1 + static void construct_path_with_fanout(const unsigned char *sha1, unsigned char fanout, char *path) { @@ -551,7 +554,7 @@ static void construct_path_with_fanout(const unsigned char *sha1, path[i++] = '/'; fanout--; } - strcpy(path + i, hex_sha1 + j); + xsnprintf(path + i, FANOUT_PATH_MAX - i, "%s", hex_sha1 + j); } static int for_each_note_helper(struct notes_tree *t, struct int_node *tree, @@ -562,7 +565,7 @@ static int for_each_note_helper(struct notes_tree *t, struct int_node *tree, void *p; int ret = 0; struct leaf_node *l; - static char path[40 + 19 + 1]; /* hex SHA1 + 19 * '/' + NUL */ + static char path[FANOUT_PATH_MAX]; fanout = determine_fanout(tree, n, fanout); for (i = 0; i < 16; i++) { @@ -595,7 +598,7 @@ redo: /* invoke callback with subtree */ unsigned int path_len = l->key_sha1[19] * 2 + fanout; - assert(path_len < 40 + 19); + assert(path_len < FANOUT_PATH_MAX - 1); construct_path_with_fanout(l->key_sha1, fanout, path); /* Create trailing slash, if needed */ -- 2.6.0.rc2.408.ga2926b9 -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 08/67] add reentrant variants of sha1_to_hex and find_unique_abbrev
The sha1_to_hex and find_unique_abbrev functions always write into reusable static buffers. There are a few problems with this: - future calls overwrite our result. This is especially annoying with find_unique_abbrev, which does not have a ring of buffers, so you cannot even printf() a result that has two abbreviated sha1s. - if you want to put the result into another buffer, we often strcpy, which looks suspicious when auditing for overflows. This patch introduces sha1_to_hex_to and find_unique_abbrev_to, which write into a user-provided buffer. Of course this is just punting on the overflow-auditing, as the buffer obviously needs to be GIT_SHA1_HEXSZ + 1 bytes. But it is much easier to audit, since that is a well-known size. We retain the non-reentrant forms, which just become thin wrappers around the reentrant ones. This patch also adds a strbuf variant of find_unique_abbrev, which will be handy in later patches. Signed-off-by: Jeff King--- If we wanted to be really meticulous, these functions could take a size for the output buffer, and complain if it is not GIT_SHA1_HEXSZ+1 bytes. But that would bloat every call like: sha1_to_hex_to(buf, sizeof(buf), sha1); cache.h | 27 ++- hex.c | 13 + sha1_name.c | 16 +++- strbuf.c| 9 + strbuf.h| 8 5 files changed, 63 insertions(+), 10 deletions(-) diff --git a/cache.h b/cache.h index e231e47..cc59aba 100644 --- a/cache.h +++ b/cache.h @@ -785,7 +785,21 @@ extern char *sha1_pack_name(const unsigned char *sha1); */ extern char *sha1_pack_index_name(const unsigned char *sha1); -extern const char *find_unique_abbrev(const unsigned char *sha1, int); +/* + * Return an abbreviated sha1 unique within this repository's object database. + * The result will be at least `len` characters long, and will be NUL + * terminated. + * + * The non-`_to` version returns a static buffer which will be overwritten by + * subsequent calls. + * + * The `_to` variant writes to a buffer supplied by the caller, which must be + * at least `GIT_SHA1_HEXSZ + 1` bytes. The return value is the number of bytes + * written (excluding the NUL terminator). + */ +extern const char *find_unique_abbrev(const unsigned char *sha1, int len); +extern int find_unique_abbrev_to(char *hex, const unsigned char *sha1, int len); + extern const unsigned char null_sha1[GIT_SHA1_RAWSZ]; static inline int hashcmp(const unsigned char *sha1, const unsigned char *sha2) @@ -1067,6 +1081,17 @@ extern int for_each_abbrev(const char *prefix, each_abbrev_fn, void *); extern int get_sha1_hex(const char *hex, unsigned char *sha1); extern int get_oid_hex(const char *hex, struct object_id *sha1); +/* + * Convert a binary sha1 to its hex equivalent. The `_to` variant writes + * the NUL-terminated output to the buffer `out`, which must be at least + * `GIT_SHA1_HEXSZ + 1` bytes, and returns a pointer to out for convenience. + * + * The non-`_to` variant returns a static buffer, but uses a ring of 4 + * buffers, making it safe to make multiple calls for a single statement, like: + * + * printf("%s -> %s", sha1_to_hex(one), sha1_to_hex(two)); + */ +extern char *sha1_to_hex_to(char *out, const unsigned char *sha1); extern char *sha1_to_hex(const unsigned char *sha1); /* static buffer result! */ extern char *oid_to_hex(const struct object_id *oid); /* same static buffer as sha1_to_hex */ diff --git a/hex.c b/hex.c index 899b74a..004fdea 100644 --- a/hex.c +++ b/hex.c @@ -61,12 +61,10 @@ int get_oid_hex(const char *hex, struct object_id *oid) return get_sha1_hex(hex, oid->hash); } -char *sha1_to_hex(const unsigned char *sha1) +char *sha1_to_hex_to(char *buffer, const unsigned char *sha1) { - static int bufno; - static char hexbuffer[4][GIT_SHA1_HEXSZ + 1]; static const char hex[] = "0123456789abcdef"; - char *buffer = hexbuffer[3 & ++bufno], *buf = buffer; + char *buf = buffer; int i; for (i = 0; i < GIT_SHA1_RAWSZ; i++) { @@ -79,6 +77,13 @@ char *sha1_to_hex(const unsigned char *sha1) return buffer; } +char *sha1_to_hex(const unsigned char *sha1) +{ + static int bufno; + static char hexbuffer[4][GIT_SHA1_HEXSZ + 1]; + return sha1_to_hex_to(hexbuffer[3 & ++bufno], sha1); +} + char *oid_to_hex(const struct object_id *oid) { return sha1_to_hex(oid->hash); diff --git a/sha1_name.c b/sha1_name.c index da6874c..416e408 100644 --- a/sha1_name.c +++ b/sha1_name.c @@ -368,14 +368,13 @@ int for_each_abbrev(const char *prefix, each_abbrev_fn fn, void *cb_data) return ds.ambiguous; } -const char *find_unique_abbrev(const unsigned char *sha1, int len) +int find_unique_abbrev_to(char *hex, const unsigned char *sha1, int len) { int status, exists; - static char hex[41]; - memcpy(hex, sha1_to_hex(sha1), 40); + sha1_to_hex_to(hex, sha1); if
[PATCH 14/67] compat/inet_ntop: fix off-by-one in inet_ntop4
Our compat inet_ntop4 function writes to a temporary buffer with snprintf, and then uses strcpy to put the result into the final "dst" buffer. We check the return value of snprintf against the size of "dst", but fail to account for the NUL terminator. As a result, we may overflow "dst" with a single NUL. In practice, this doesn't happen because the output of inet_ntop is limited, and we provide buffers that are way oversized. We can fix the off-by-one check easily, but while we are here let's also use strlcpy for increased safety, just in case there are other bugs lurking. As a side note, this compat code seems to be BSD-derived. Searching for "vixie inet_ntop" turns up NetBSD's latest version of the same code, which has an identical fix (and switches to strlcpy, too!). Signed-off-by: Jeff King--- compat/inet_ntop.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/compat/inet_ntop.c b/compat/inet_ntop.c index 90b7cc4..6830726 100644 --- a/compat/inet_ntop.c +++ b/compat/inet_ntop.c @@ -53,11 +53,11 @@ inet_ntop4(const u_char *src, char *dst, size_t size) nprinted = snprintf(tmp, sizeof(tmp), fmt, src[0], src[1], src[2], src[3]); if (nprinted < 0) return (NULL); /* we assume "errno" was set by "snprintf()" */ - if ((size_t)nprinted > size) { + if ((size_t)nprinted >= size) { errno = ENOSPC; return (NULL); } - strcpy(dst, tmp); + strlcpy(dst, tmp, size); return (dst); } @@ -154,7 +154,7 @@ inet_ntop6(const u_char *src, char *dst, size_t size) errno = ENOSPC; return (NULL); } - strcpy(dst, tmp); + strlcpy(dst, tmp, size); return (dst); } #endif -- 2.6.0.rc2.408.ga2926b9 -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 15/67] convert trivial sprintf / strcpy calls to xsnprintf
We sometimes sprintf into static buffers when we know that the size of the buffer is large enough to fit the input (either because it's a constant, or because it's numeric input that is bounded in size). Likewise with strcpy of constant strings. However, these sites make it hard to audit sprintf and strcpy calls for buffer overflows, as a reader has to cross-reference the size of the array with the input. Let's use xsnprintf instead, which communicates to a reader that we don't expect this to overflow (and catches the mistake in case we do). Signed-off-by: Jeff King--- These are all pretty trivial; the obvious thing to get wrong is that "sizeof(buf)" is not the correct length if "buf" is a pointer. I considered a macro wrapper like: #define xsnprintf_array(dst, fmt, ...) \ xsnprintf(dst, sizeof(dst) + BARF_UNLESS_AN_ARRAY(dst), \ fmt, __VA_ARGS__) but obviously that requires variadic macro support. archive-tar.c | 2 +- builtin/gc.c | 2 +- builtin/init-db.c | 11 ++- builtin/ls-tree.c | 9 + builtin/merge-index.c | 2 +- builtin/merge-recursive.c | 2 +- builtin/read-tree.c | 2 +- builtin/unpack-file.c | 2 +- compat/mingw.c| 8 +--- compat/winansi.c | 2 +- connect.c | 2 +- convert.c | 3 ++- daemon.c | 4 ++-- diff.c| 12 ++-- http-push.c | 2 +- http.c| 6 +++--- ll-merge.c| 12 ++-- refs.c| 8 sideband.c| 4 ++-- strbuf.c | 4 ++-- 20 files changed, 52 insertions(+), 47 deletions(-) diff --git a/archive-tar.c b/archive-tar.c index b6b30bb..d543f93 100644 --- a/archive-tar.c +++ b/archive-tar.c @@ -301,7 +301,7 @@ static int write_global_extended_header(struct archiver_args *args) memset(, 0, sizeof(header)); *header.typeflag = TYPEFLAG_GLOBAL_HEADER; mode = 0100666; - strcpy(header.name, "pax_global_header"); + xsnprintf(header.name, sizeof(header.name), "pax_global_header"); prepare_header(args, , mode, ext_header.len); write_blocked(, sizeof(header)); write_blocked(ext_header.buf, ext_header.len); diff --git a/builtin/gc.c b/builtin/gc.c index 0ad8d30..57584bc 100644 --- a/builtin/gc.c +++ b/builtin/gc.c @@ -194,7 +194,7 @@ static const char *lock_repo_for_gc(int force, pid_t* ret_pid) return NULL; if (gethostname(my_host, sizeof(my_host))) - strcpy(my_host, "unknown"); + xsnprintf(my_host, sizeof(my_host), "unknown"); pidfile_path = git_pathdup("gc.pid"); fd = hold_lock_file_for_update(, pidfile_path, diff --git a/builtin/init-db.c b/builtin/init-db.c index 69323e1..e7d0e31 100644 --- a/builtin/init-db.c +++ b/builtin/init-db.c @@ -262,7 +262,8 @@ static int create_default_files(const char *template_path) } /* This forces creation of new config file */ - sprintf(repo_version_string, "%d", GIT_REPO_VERSION); + xsnprintf(repo_version_string, sizeof(repo_version_string), + "%d", GIT_REPO_VERSION); git_config_set("core.repositoryformatversion", repo_version_string); path[len] = 0; @@ -414,13 +415,13 @@ int init_db(const char *template_dir, unsigned int flags) */ if (shared_repository < 0) /* force to the mode value */ - sprintf(buf, "0%o", -shared_repository); + xsnprintf(buf, sizeof(buf), "0%o", -shared_repository); else if (shared_repository == PERM_GROUP) - sprintf(buf, "%d", OLD_PERM_GROUP); + xsnprintf(buf, sizeof(buf), "%d", OLD_PERM_GROUP); else if (shared_repository == PERM_EVERYBODY) - sprintf(buf, "%d", OLD_PERM_EVERYBODY); + xsnprintf(buf, sizeof(buf), "%d", OLD_PERM_EVERYBODY); else - die("oops"); + die("BUG: invalid value for shared_repository"); git_config_set("core.sharedrepository", buf); git_config_set("receive.denyNonFastforwards", "true"); } diff --git a/builtin/ls-tree.c b/builtin/ls-tree.c index 3b04a0f..0e30d86 100644 --- a/builtin/ls-tree.c +++ b/builtin/ls-tree.c @@ -96,12 +96,13 @@ static int show_tree(const unsigned char *sha1, struct strbuf *base, if (!strcmp(type, blob_type)) { unsigned long size; if (sha1_object_info(sha1, ) == OBJ_BAD) - strcpy(size_text, "BAD"); + xsnprintf(size_text, sizeof(size_text), +
[PATCH 26/67] replace trivial malloc + sprintf /strcpy calls to xstrfmt
It's a common pattern to do: foo = xmalloc(strlen(one) + strlen(two) + 1 + 1); sprintf(foo, "%s %s", one, two); (or possibly some variant with strcpy()s or a more complicated length computation). We can switch these to use xstrfmt, which is shorter, involves less error-prone manual computation, and removes many sprintf and strcpy calls which make it harder to audit the code for real buffer overflows. Signed-off-by: Jeff King--- builtin/apply.c | 5 + builtin/ls-remote.c | 8 ++-- builtin/name-rev.c | 13 + environment.c | 7 ++- imap-send.c | 5 ++--- reflog-walk.c | 7 +++ remote.c| 7 +-- setup.c | 12 +++- unpack-trees.c | 4 +--- 9 files changed, 20 insertions(+), 48 deletions(-) diff --git a/builtin/apply.c b/builtin/apply.c index 4aa53f7..094a20f 100644 --- a/builtin/apply.c +++ b/builtin/apply.c @@ -698,10 +698,7 @@ static char *find_name_common(const char *line, const char *def, } if (root) { - char *ret = xmalloc(root_len + len + 1); - strcpy(ret, root); - memcpy(ret + root_len, start, len); - ret[root_len + len] = '\0'; + char *ret = xstrfmt("%s%.*s", root, len, start); return squash_slash(ret); } diff --git a/builtin/ls-remote.c b/builtin/ls-remote.c index 4554dbc..5b6d679 100644 --- a/builtin/ls-remote.c +++ b/builtin/ls-remote.c @@ -93,12 +93,8 @@ int cmd_ls_remote(int argc, const char **argv, const char *prefix) if (argv[i]) { int j; pattern = xcalloc(argc - i + 1, sizeof(const char *)); - for (j = i; j < argc; j++) { - int len = strlen(argv[j]); - char *p = xmalloc(len + 3); - sprintf(p, "*/%s", argv[j]); - pattern[j - i] = p; - } + for (j = i; j < argc; j++) + pattern[j - i] = xstrfmt("*/%s", argv[j]); } remote = remote_get(dest); if (!remote) { diff --git a/builtin/name-rev.c b/builtin/name-rev.c index 248a3eb..8a3a0cd 100644 --- a/builtin/name-rev.c +++ b/builtin/name-rev.c @@ -56,19 +56,16 @@ copy_data: parents = parents->next, parent_number++) { if (parent_number > 1) { int len = strlen(tip_name); - char *new_name = xmalloc(len + - 1 + decimal_length(generation) + /* ~ */ - 1 + 2 + /* ^NN */ - 1); + char *new_name; if (len > 2 && !strcmp(tip_name + len - 2, "^0")) len -= 2; if (generation > 0) - sprintf(new_name, "%.*s~%d^%d", len, tip_name, - generation, parent_number); + new_name = xstrfmt("%.*s~%d^%d", len, tip_name, + generation, parent_number); else - sprintf(new_name, "%.*s^%d", len, tip_name, - parent_number); + new_name = xstrfmt("%.*s^%d", len, tip_name, + parent_number); name_rev(parents->item, new_name, 0, distance + MERGE_TRAVERSAL_WEIGHT, 0); diff --git a/environment.c b/environment.c index a533aed..c5b65f5 100644 --- a/environment.c +++ b/environment.c @@ -143,11 +143,8 @@ static char *git_path_from_env(const char *envvar, const char *git_dir, const char *path, int *fromenv) { const char *value = getenv(envvar); - if (!value) { - char *buf = xmalloc(strlen(git_dir) + strlen(path) + 2); - sprintf(buf, "%s/%s", git_dir, path); - return buf; - } + if (!value) + return xstrfmt("%s/%s", git_dir, path); if (fromenv) *fromenv = 1; return xstrdup(value); diff --git a/imap-send.c b/imap-send.c index 37ac4aa..01aa227 100644 --- a/imap-send.c +++ b/imap-send.c @@ -889,9 +889,8 @@ static char *cram(const char *challenge_64, const char *user, const char *pass) } /* response: " " */ - resp_len = strlen(user) + 1 + strlen(hex) + 1; - response = xmalloc(resp_len); - sprintf(response, "%s %s", user, hex); + response = xstrfmt("%s %s", user, hex); + resp_len = strlen(response); response_64 = xmalloc(ENCODED_SIZE(resp_len) + 1); encoded_len = EVP_EncodeBlock((unsigned char *)response_64, diff --git a/reflog-walk.c b/reflog-walk.c index
[PATCH 30/67] ref-filter: drop sprintf and strcpy calls
The ref-filter code comes from for-each-ref, and inherited a number of raw sprintf and strcpy calls. These are generally all safe, as we custom-size the buffers, or are formatting numbers into sufficiently large buffers. But we can make the resulting code even simpler and more obviously correct by using some of our helper functions. Signed-off-by: Jeff King--- ref-filter.c | 70 +++- 1 file changed, 22 insertions(+), 48 deletions(-) diff --git a/ref-filter.c b/ref-filter.c index f38dee4..1f71870 100644 --- a/ref-filter.c +++ b/ref-filter.c @@ -192,9 +192,7 @@ static int grab_objectname(const char *name, const unsigned char *sha1, struct atom_value *v) { if (!strcmp(name, "objectname")) { - char *s = xmalloc(41); - strcpy(s, sha1_to_hex(sha1)); - v->s = s; + v->s = xstrdup(sha1_to_hex(sha1)); return 1; } if (!strcmp(name, "objectname:short")) { @@ -219,10 +217,8 @@ static void grab_common_values(struct atom_value *val, int deref, struct object if (!strcmp(name, "objecttype")) v->s = typename(obj->type); else if (!strcmp(name, "objectsize")) { - char *s = xmalloc(40); - sprintf(s, "%lu", sz); v->ul = sz; - v->s = s; + v->s = xstrfmt("%lu", sz); } else if (deref) grab_objectname(name, obj->sha1, v); @@ -246,11 +242,8 @@ static void grab_tag_values(struct atom_value *val, int deref, struct object *ob v->s = tag->tag; else if (!strcmp(name, "type") && tag->tagged) v->s = typename(tag->tagged->type); - else if (!strcmp(name, "object") && tag->tagged) { - char *s = xmalloc(41); - strcpy(s, sha1_to_hex(tag->tagged->sha1)); - v->s = s; - } + else if (!strcmp(name, "object") && tag->tagged) + v->s = xstrdup(sha1_to_hex(tag->tagged->sha1)); } } @@ -268,32 +261,22 @@ static void grab_commit_values(struct atom_value *val, int deref, struct object if (deref) name++; if (!strcmp(name, "tree")) { - char *s = xmalloc(41); - strcpy(s, sha1_to_hex(commit->tree->object.sha1)); - v->s = s; + v->s = xstrdup(sha1_to_hex(commit->tree->object.sha1)); } - if (!strcmp(name, "numparent")) { - char *s = xmalloc(40); + else if (!strcmp(name, "numparent")) { v->ul = commit_list_count(commit->parents); - sprintf(s, "%lu", v->ul); - v->s = s; + v->s = xstrfmt("%lu", v->ul); } else if (!strcmp(name, "parent")) { - int num = commit_list_count(commit->parents); - int i; struct commit_list *parents; - char *s = xmalloc(41 * num + 1); - v->s = s; - for (i = 0, parents = commit->parents; -parents; -parents = parents->next, i = i + 41) { + struct strbuf s = STRBUF_INIT; + for (parents = commit->parents; parents; parents = parents->next) { struct commit *parent = parents->item; - strcpy(s+i, sha1_to_hex(parent->object.sha1)); - if (parents->next) - s[i+40] = ' '; + if (parents != commit->parents) + strbuf_addch(, ' '); + strbuf_addstr(, sha1_to_hex(parent->object.sha1)); } - if (!i) - *s = '\0'; + v->s = strbuf_detach(, NULL); } } } @@ -700,7 +683,6 @@ static void populate_value(struct ref_array_item *ref) else if (!strcmp(formatp, "track") && (starts_with(name, "upstream") || starts_with(name, "push"))) { - char buf[40]; if (stat_tracking_info(branch, _ours, _theirs, NULL)) @@ -708,17 +690,13 @@ static void populate_value(struct ref_array_item *ref) if (!num_ours && !num_theirs)
[PATCH 28/67] fetch: replace static buffer with xstrfmt
We parse the INFINITE_DEPTH constant into a static, fixed-size buffer using sprintf. This buffer is sufficiently large for the current constant, but it's a suspicious pattern, as the constant is defined far away, and it's not immediately obvious that 12 bytes are large enough to hold it. We can just use xstrfmt here, which gets rid of any question of the buffer size. It also removes any concerns with object lifetime, which means we do not have to wonder why this buffer deep within a conditional is marked "static" (we never free our newly allocated result, of course, but that's OK; it's global that lasts the lifetime of the whole program anyway). Signed-off-by: Jeff King--- builtin/fetch.c | 7 ++- 1 file changed, 2 insertions(+), 5 deletions(-) diff --git a/builtin/fetch.c b/builtin/fetch.c index 9a3869f..4703725 100644 --- a/builtin/fetch.c +++ b/builtin/fetch.c @@ -1156,11 +1156,8 @@ int cmd_fetch(int argc, const char **argv, const char *prefix) die(_("--depth and --unshallow cannot be used together")); else if (!is_repository_shallow()) die(_("--unshallow on a complete repository does not make sense")); - else { - static char inf_depth[12]; - sprintf(inf_depth, "%d", INFINITE_DEPTH); - depth = inf_depth; - } + else + depth = xstrfmt("%d", INFINITE_DEPTH); } /* no need to be strict, transport_set_option() will validate it again */ -- 2.6.0.rc2.408.ga2926b9 -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 31/67] help: drop prepend function in favor of xstrfmt
This function predates xstrfmt, and its functionality is a subset. Let's just use xstrfmt. Signed-off-by: Jeff King--- builtin/help.c | 14 ++ 1 file changed, 2 insertions(+), 12 deletions(-) diff --git a/builtin/help.c b/builtin/help.c index 3422e73..fba8c01 100644 --- a/builtin/help.c +++ b/builtin/help.c @@ -295,16 +295,6 @@ static int is_git_command(const char *s) is_in_cmdlist(_cmds, s); } -static const char *prepend(const char *prefix, const char *cmd) -{ - size_t pre_len = strlen(prefix); - size_t cmd_len = strlen(cmd); - char *p = xmalloc(pre_len + cmd_len + 1); - memcpy(p, prefix, pre_len); - strcpy(p + pre_len, cmd); - return p; -} - static const char *cmd_to_page(const char *git_cmd) { if (!git_cmd) @@ -312,9 +302,9 @@ static const char *cmd_to_page(const char *git_cmd) else if (starts_with(git_cmd, "git")) return git_cmd; else if (is_git_command(git_cmd)) - return prepend("git-", git_cmd); + return xstrfmt("git-%s", git_cmd); else - return prepend("git", git_cmd); + return xstrfmt("git%s", git_cmd); } static void setup_man_path(void) -- 2.6.0.rc2.408.ga2926b9 -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 29/67] use strip_suffix and xstrfmt to replace suffix
When we want to convert "foo.pack" to "foo.idx", we do it by duplicating the original string and then munging the bytes in place. Let's use strip_suffix and xstrfmt instead, which has several advantages: 1. It's more clear what the intent is. 2. It does not implicitly rely on the fact that strlen(".idx") <= strlen(".pack") to avoid an overflow. 3. We communicate the assumption that the input file ends with ".pack" (and get a run-time check that this is so). 4. We drop calls to strcpy, which makes auditing the code base easier. Likewise, we can do this to convert ".pack" to ".bitmap", avoiding some manual memory computation. Signed-off-by: Jeff King--- http.c| 7 --- pack-bitmap.c | 13 - sha1_file.c | 6 -- 3 files changed, 12 insertions(+), 14 deletions(-) diff --git a/http.c b/http.c index 7b02259..e0ff876 100644 --- a/http.c +++ b/http.c @@ -1511,6 +1511,7 @@ int finish_http_pack_request(struct http_pack_request *preq) struct packed_git **lst; struct packed_git *p = preq->target; char *tmp_idx; + size_t len; struct child_process ip = CHILD_PROCESS_INIT; const char *ip_argv[8]; @@ -1524,9 +1525,9 @@ int finish_http_pack_request(struct http_pack_request *preq) lst = &((*lst)->next); *lst = (*lst)->next; - tmp_idx = xstrdup(preq->tmpfile); - strcpy(tmp_idx + strlen(tmp_idx) - strlen(".pack.temp"), - ".idx.temp"); + if (!strip_suffix(preq->tmpfile, ".pack.temp", )) + die("BUG: pack tmpfile does not end in .pack.temp?"); + tmp_idx = xstrfmt("%.*s.idx.temp", (int)len, preq->tmpfile); ip_argv[0] = "index-pack"; ip_argv[1] = "-o"; diff --git a/pack-bitmap.c b/pack-bitmap.c index 637770a..7dfcb34 100644 --- a/pack-bitmap.c +++ b/pack-bitmap.c @@ -252,16 +252,11 @@ static int load_bitmap_entries_v1(struct bitmap_index *index) static char *pack_bitmap_filename(struct packed_git *p) { - char *idx_name; - int len; - - len = strlen(p->pack_name) - strlen(".pack"); - idx_name = xmalloc(len + strlen(".bitmap") + 1); - - memcpy(idx_name, p->pack_name, len); - memcpy(idx_name + len, ".bitmap", strlen(".bitmap") + 1); + size_t len; - return idx_name; + if (!strip_suffix(p->pack_name, ".pack", )) + die("BUG: pack_name does not end in .pack"); + return xstrfmt("%.*s.bitmap", (int)len, p->pack_name); } static int open_pack_bitmap_1(struct packed_git *packfile) diff --git a/sha1_file.c b/sha1_file.c index 28352a5..88996f0 100644 --- a/sha1_file.c +++ b/sha1_file.c @@ -671,13 +671,15 @@ static int check_packed_git_idx(const char *path, struct packed_git *p) int open_pack_index(struct packed_git *p) { char *idx_name; + size_t len; int ret; if (p->index_data) return 0; - idx_name = xstrdup(p->pack_name); - strcpy(idx_name + strlen(idx_name) - strlen(".pack"), ".idx"); + if (!strip_suffix(p->pack_name, ".pack", )) + die("BUG: pack_name does not end in .pack"); + idx_name = xstrfmt("%.*s.idx", (int)len, p->pack_name); ret = check_packed_git_idx(idx_name, p); free(idx_name); return ret; -- 2.6.0.rc2.408.ga2926b9 -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 27/67] config: use xstrfmt in normalize_value
We xmalloc a fixed-size buffer and sprintf into it; this is OK because the size of our formatting types is finite, but that's not immediately clear to a reader auditing sprintf calls. Let's switch to xstrfmt, which is shorter and obviously correct. Note that just dropping the common xmalloc here causes gcc to complain with -Wmaybe-uninitialized. That's because if "types" does not match any of our known types, we never write anything into the "normalized" pointer. With the current code, gcc doesn't notice because we always return a valid pointer (just one which might point to uninitialized data, but the compiler doesn't know that). In other words, the current code is potentially buggy if new types are added without updating this spot. So let's take this opportunity to clean up the function a bit more. We can drop the "normalized" pointer entirely, and just return directly from each code path. And then add an assertion at the end in case we haven't covered any cases. Signed-off-by: Jeff King--- builtin/config.c | 34 +- 1 file changed, 13 insertions(+), 21 deletions(-) diff --git a/builtin/config.c b/builtin/config.c index 71acc44..adc7727 100644 --- a/builtin/config.c +++ b/builtin/config.c @@ -246,8 +246,6 @@ free_strings: static char *normalize_value(const char *key, const char *value) { - char *normalized; - if (!value) return NULL; @@ -258,27 +256,21 @@ static char *normalize_value(const char *key, const char *value) * "~/foobar/" in the config file, and to expand the ~ * when retrieving the value. */ - normalized = xstrdup(value); - else { - normalized = xmalloc(64); - if (types == TYPE_INT) { - int64_t v = git_config_int64(key, value); - sprintf(normalized, "%"PRId64, v); - } - else if (types == TYPE_BOOL) - sprintf(normalized, "%s", - git_config_bool(key, value) ? "true" : "false"); - else if (types == TYPE_BOOL_OR_INT) { - int is_bool, v; - v = git_config_bool_or_int(key, value, _bool); - if (!is_bool) - sprintf(normalized, "%d", v); - else - sprintf(normalized, "%s", v ? "true" : "false"); - } + return xstrdup(value); + if (types == TYPE_INT) + return xstrfmt("%"PRId64, git_config_int64(key, value)); + if (types == TYPE_BOOL) + return xstrdup(git_config_bool(key, value) ? "true" : "false"); + if (types == TYPE_BOOL_OR_INT) { + int is_bool, v; + v = git_config_bool_or_int(key, value, _bool); + if (!is_bool) + return xstrfmt("%d", v); + else + return xstrdup(v ? "true" : "false"); } - return normalized; + die("BUG: cannot normalize type %d", types); } static int get_color_found; -- 2.6.0.rc2.408.ga2926b9 -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 33/67] read_branches_file: replace strcpy with xstrdup
This code is exactly replicating strdup, so let's just use that. It's shorter, and eliminates some confusion (such as whether "p - s" is really enough to hold the result; it is, because we write NULs as we shrink "p"). Signed-off-by: Jeff King--- remote.c | 5 + 1 file changed, 1 insertion(+), 4 deletions(-) diff --git a/remote.c b/remote.c index 5ab0f7f..1b69751 100644 --- a/remote.c +++ b/remote.c @@ -297,7 +297,6 @@ static void read_branches_file(struct remote *remote) int n = 1000; FILE *f = fopen(git_path("branches/%.*s", n, remote->name), "r"); char *s, *p; - int len; if (!f) return; @@ -313,9 +312,7 @@ static void read_branches_file(struct remote *remote) p = s + strlen(s); while (isspace(p[-1])) *--p = 0; - len = p - s; - p = xmalloc(len + 1); - strcpy(p, s); + p = xstrdup(s); /* * The branches file would have URL and optionally -- 2.6.0.rc2.408.ga2926b9 -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 53/67] drop strcpy in favor of raw sha1_to_hex
In some cases where we strcpy() the result of sha1_to_hex(), there's no need; the result goes directly into a printf statement, and we can simply pass the return value from sha1_to_hex() directly. Signed-off-by: Jeff King--- http-push.c | 6 ++ walker.c| 5 ++--- 2 files changed, 4 insertions(+), 7 deletions(-) diff --git a/http-push.c b/http-push.c index 43a9036..48f39b7 100644 --- a/http-push.c +++ b/http-push.c @@ -1856,7 +1856,6 @@ int main(int argc, char **argv) new_refs = 0; for (ref = remote_refs; ref; ref = ref->next) { - char old_hex[60], *new_hex; struct argv_array commit_argv = ARGV_ARRAY_INIT; if (!ref->peer_ref) @@ -1911,13 +1910,12 @@ int main(int argc, char **argv) } hashcpy(ref->new_sha1, ref->peer_ref->new_sha1); new_refs++; - strcpy(old_hex, sha1_to_hex(ref->old_sha1)); - new_hex = sha1_to_hex(ref->new_sha1); fprintf(stderr, "updating '%s'", ref->name); if (strcmp(ref->name, ref->peer_ref->name)) fprintf(stderr, " using '%s'", ref->peer_ref->name); - fprintf(stderr, "\n from %s\n to %s\n", old_hex, new_hex); + fprintf(stderr, "\n from %s\n to %s\n", + sha1_to_hex(ref->old_sha1), sha1_to_hex(ref->new_sha1)); if (dry_run) { if (helper_status) printf("ok %s\n", ref->name); diff --git a/walker.c b/walker.c index 44a936c..cdeb63f 100644 --- a/walker.c +++ b/walker.c @@ -17,10 +17,9 @@ void walker_say(struct walker *walker, const char *fmt, const char *hex) static void report_missing(const struct object *obj) { - char missing_hex[41]; - strcpy(missing_hex, sha1_to_hex(obj->sha1)); fprintf(stderr, "Cannot obtain needed %s %s\n", - obj->type ? typename(obj->type): "object", missing_hex); + obj->type ? typename(obj->type): "object", + sha1_to_hex(obj->sha1)); if (!is_null_sha1(current_commit_sha1)) fprintf(stderr, "while processing commit %s.\n", sha1_to_hex(current_commit_sha1)); -- 2.6.0.rc2.408.ga2926b9 -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 56/67] avoid sprintf and strcpy with flex arrays
When we are allocating a struct with a FLEX_ARRAY member, we generally compute the size of the array and then sprintf or strcpy into it. Normally we could improve a dynamic allocation like this by using xstrfmt, but it doesn't work here; we have to account for the size of the rest of the struct. But we can improve things a bit by storing the length that we use for the allocation, and then feeding it to xsnprintf or memcpy, which makes it more obvious that we are not writing more than the allocated number of bytes. It would be nice if we had some kind of helper for allocating generic flex arrays, but it doesn't work that well: - the call signature is a little bit unwieldy: d = flex_struct(sizeof(*d), offsetof(d, path), fmt, ...); You need offsetof here instead of just writing to the end of the base size, because we don't know how the struct is packed (partially this is because FLEX_ARRAY might not be zero, though we can account for that; but the size of the struct may actually be rounded up for alignment, and we can't know that). - some sites do clever things, like over-allocating because they know they will write larger things into the buffer later (e.g., struct packed_git here). So we're better off to just write out each allocation (or add type-specific helpers, though many of these are one-off allocations anyway). Signed-off-by: Jeff King--- Actually, I have a malloc-hardening series (which I'll post after this), in which I _do_ break down and add a FLEX_ALLOC() macro. But we still cannot use it for these cases anyway, because we don't assume all platforms support variadic macros. archive.c | 5 +++-- builtin/blame.c | 5 +++-- fast-import.c | 6 -- refs.c | 8 sha1_file.c | 5 +++-- submodule.c | 6 -- 6 files changed, 21 insertions(+), 14 deletions(-) diff --git a/archive.c b/archive.c index 01b0899..4ac86c8 100644 --- a/archive.c +++ b/archive.c @@ -171,13 +171,14 @@ static void queue_directory(const unsigned char *sha1, unsigned mode, int stage, struct archiver_context *c) { struct directory *d; - d = xmallocz(sizeof(*d) + base->len + 1 + strlen(filename)); + size_t len = base->len + 1 + strlen(filename) + 1; + d = xmalloc(sizeof(*d) + len); d->up = c->bottom; d->baselen = base->len; d->mode= mode; d->stage = stage; c->bottom = d; - d->len = sprintf(d->path, "%.*s%s/", (int)base->len, base->buf, filename); + d->len = xsnprintf(d->path, len, "%.*s%s/", (int)base->len, base->buf, filename); hashcpy(d->oid.hash, sha1); } diff --git a/builtin/blame.c b/builtin/blame.c index f8ec7ff..dbec516 100644 --- a/builtin/blame.c +++ b/builtin/blame.c @@ -459,12 +459,13 @@ static void queue_blames(struct scoreboard *sb, struct origin *porigin, static struct origin *make_origin(struct commit *commit, const char *path) { struct origin *o; - o = xcalloc(1, sizeof(*o) + strlen(path) + 1); + size_t pathlen = strlen(path) + 1; + o = xcalloc(1, sizeof(*o) + pathlen); o->commit = commit; o->refcnt = 1; o->next = commit->util; commit->util = o; - strcpy(o->path, path); + memcpy(o->path, path, pathlen); /* includes NUL */ return o; } diff --git a/fast-import.c b/fast-import.c index d0c2502..895c6b4 100644 --- a/fast-import.c +++ b/fast-import.c @@ -863,13 +863,15 @@ static void start_packfile(void) { static char tmp_file[PATH_MAX]; struct packed_git *p; + int namelen; struct pack_header hdr; int pack_fd; pack_fd = odb_mkstemp(tmp_file, sizeof(tmp_file), "pack/tmp_pack_XX"); - p = xcalloc(1, sizeof(*p) + strlen(tmp_file) + 2); - strcpy(p->pack_name, tmp_file); + namelen = strlen(tmp_file) + 2; + p = xcalloc(1, sizeof(*p) + namelen); + xsnprintf(p->pack_name, namelen, "%s", tmp_file); p->pack_fd = pack_fd; p->do_not_close = 1; pack_file = sha1fd(pack_fd, p->pack_name); diff --git a/refs.c b/refs.c index c2709de..df6c41a 100644 --- a/refs.c +++ b/refs.c @@ -2695,7 +2695,7 @@ static int pack_if_possible_fn(struct ref_entry *entry, void *cb_data) int namelen = strlen(entry->name) + 1; struct ref_to_prune *n = xcalloc(1, sizeof(*n) + namelen); hashcpy(n->sha1, entry->u.value.oid.hash); - strcpy(n->name, entry->name); + memcpy(n->name, entry->name, namelen); /* includes NUL */ n->next = cb->ref_to_prune; cb->ref_to_prune = n; } @@ -3984,10 +3984,10 @@ void ref_transaction_free(struct ref_transaction *transaction) static struct ref_update *add_update(struct ref_transaction *transaction, const char *refname) { - size_t len =
[PATCH 62/67] convert strncpy to memcpy
strncpy is known to be a confusing function because of its termination semantics. These calls are all correct, but it takes some examination to see why. In particular, every one of them expects to copy up to the length limit, and then makes some arrangement for terminating the result. We can just use memcpy, along with noting explicitly how the result is terminated (if it is not already obvious). That should make it more clear to a reader that we are doing the right thing. Signed-off-by: Jeff King--- builtin/help.c | 4 ++-- fast-import.c | 2 +- tag.c | 2 +- 3 files changed, 4 insertions(+), 4 deletions(-) diff --git a/builtin/help.c b/builtin/help.c index e1650ab..1cd0c1e 100644 --- a/builtin/help.c +++ b/builtin/help.c @@ -176,7 +176,7 @@ static void add_man_viewer(const char *name) while (*p) p = &((*p)->next); *p = xcalloc(1, (sizeof(**p) + len + 1)); - strncpy((*p)->name, name, len); + memcpy((*p)->name, name, len); /* NUL-terminated by xcalloc */ } static int supported_man_viewer(const char *name, size_t len) @@ -192,7 +192,7 @@ static void do_add_man_viewer_info(const char *name, { struct man_viewer_info_list *new = xcalloc(1, sizeof(*new) + len + 1); - strncpy(new->name, name, len); + memcpy(new->name, name, len); /* NUL-terminated by xcalloc */ new->info = xstrdup(value); new->next = man_viewer_info_list; man_viewer_info_list = new; diff --git a/fast-import.c b/fast-import.c index cf6d8bc..4d01efc 100644 --- a/fast-import.c +++ b/fast-import.c @@ -703,7 +703,7 @@ static struct atom_str *to_atom(const char *s, unsigned short len) c = pool_alloc(sizeof(struct atom_str) + len + 1); c->str_len = len; - strncpy(c->str_dat, s, len); + memcpy(c->str_dat, s, len); c->str_dat[len] = 0; c->next_atom = atom_table[hc]; atom_table[hc] = c; diff --git a/tag.c b/tag.c index 5b0ac62..5b2a06d 100644 --- a/tag.c +++ b/tag.c @@ -82,7 +82,7 @@ int parse_tag_buffer(struct tag *item, const void *data, unsigned long size) nl = memchr(bufptr, '\n', tail - bufptr); if (!nl || sizeof(type) <= (nl - bufptr)) return -1; - strncpy(type, bufptr, nl - bufptr); + memcpy(type, bufptr, nl - bufptr); type[nl - bufptr] = '\0'; bufptr = nl + 1; -- 2.6.0.rc2.408.ga2926b9 -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 08/67] add reentrant variants of sha1_to_hex and find_unique_abbrev
On 15/09/15 16:26, Jeff King wrote: > The sha1_to_hex and find_unique_abbrev functions always > write into reusable static buffers. There are a few problems > with this: > > - future calls overwrite our result. This is especially > annoying with find_unique_abbrev, which does not have a > ring of buffers, so you cannot even printf() a result > that has two abbreviated sha1s. > > - if you want to put the result into another buffer, we > often strcpy, which looks suspicious when auditing for > overflows. > > This patch introduces sha1_to_hex_to and find_unique_abbrev_to, > which write into a user-provided buffer. Of course this is > just punting on the overflow-auditing, as the buffer > obviously needs to be GIT_SHA1_HEXSZ + 1 bytes. But it is > much easier to audit, since that is a well-known size. Hmm, I haven't read any other patches yet (including those which use these new '_to' functions), but I can't help feeling they should be named something like 'sha1_to_hex_str()' and 'find_unique_abbrev_str()' instead. i.e. I don't get the '_to' thing - not that I'm any good at naming things ... ATB, Ramsay Jones > > We retain the non-reentrant forms, which just become thin > wrappers around the reentrant ones. This patch also adds a > strbuf variant of find_unique_abbrev, which will be handy in > later patches. > > Signed-off-by: Jeff King> --- > If we wanted to be really meticulous, these functions could > take a size for the output buffer, and complain if it is not > GIT_SHA1_HEXSZ+1 bytes. But that would bloat every call > like: > > sha1_to_hex_to(buf, sizeof(buf), sha1); > > cache.h | 27 ++- > hex.c | 13 + > sha1_name.c | 16 +++- > strbuf.c| 9 + > strbuf.h| 8 > 5 files changed, 63 insertions(+), 10 deletions(-) > > diff --git a/cache.h b/cache.h > index e231e47..cc59aba 100644 > --- a/cache.h > +++ b/cache.h > @@ -785,7 +785,21 @@ extern char *sha1_pack_name(const unsigned char *sha1); > */ > extern char *sha1_pack_index_name(const unsigned char *sha1); > > -extern const char *find_unique_abbrev(const unsigned char *sha1, int); > +/* > + * Return an abbreviated sha1 unique within this repository's object > database. > + * The result will be at least `len` characters long, and will be NUL > + * terminated. > + * > + * The non-`_to` version returns a static buffer which will be overwritten by > + * subsequent calls. > + * > + * The `_to` variant writes to a buffer supplied by the caller, which must be > + * at least `GIT_SHA1_HEXSZ + 1` bytes. The return value is the number of > bytes > + * written (excluding the NUL terminator). > + */ > +extern const char *find_unique_abbrev(const unsigned char *sha1, int len); > +extern int find_unique_abbrev_to(char *hex, const unsigned char *sha1, int > len); > + > extern const unsigned char null_sha1[GIT_SHA1_RAWSZ]; > > static inline int hashcmp(const unsigned char *sha1, const unsigned char > *sha2) > @@ -1067,6 +1081,17 @@ extern int for_each_abbrev(const char *prefix, > each_abbrev_fn, void *); > extern int get_sha1_hex(const char *hex, unsigned char *sha1); > extern int get_oid_hex(const char *hex, struct object_id *sha1); > > +/* > + * Convert a binary sha1 to its hex equivalent. The `_to` variant writes > + * the NUL-terminated output to the buffer `out`, which must be at least > + * `GIT_SHA1_HEXSZ + 1` bytes, and returns a pointer to out for convenience. > + * > + * The non-`_to` variant returns a static buffer, but uses a ring of 4 > + * buffers, making it safe to make multiple calls for a single statement, > like: > + * > + * printf("%s -> %s", sha1_to_hex(one), sha1_to_hex(two)); > + */ > +extern char *sha1_to_hex_to(char *out, const unsigned char *sha1); > extern char *sha1_to_hex(const unsigned char *sha1); /* static buffer > result! */ > extern char *oid_to_hex(const struct object_id *oid);/* same static > buffer as sha1_to_hex */ > > diff --git a/hex.c b/hex.c > index 899b74a..004fdea 100644 > --- a/hex.c > +++ b/hex.c > @@ -61,12 +61,10 @@ int get_oid_hex(const char *hex, struct object_id *oid) > return get_sha1_hex(hex, oid->hash); > } > > -char *sha1_to_hex(const unsigned char *sha1) > +char *sha1_to_hex_to(char *buffer, const unsigned char *sha1) > { > - static int bufno; > - static char hexbuffer[4][GIT_SHA1_HEXSZ + 1]; > static const char hex[] = "0123456789abcdef"; > - char *buffer = hexbuffer[3 & ++bufno], *buf = buffer; > + char *buf = buffer; > int i; > > for (i = 0; i < GIT_SHA1_RAWSZ; i++) { > @@ -79,6 +77,13 @@ char *sha1_to_hex(const unsigned char *sha1) > return buffer; > } > > +char *sha1_to_hex(const unsigned char *sha1) > +{ > + static int bufno; > + static char hexbuffer[4][GIT_SHA1_HEXSZ + 1]; > + return sha1_to_hex_to(hexbuffer[3 & ++bufno], sha1); > +} > + > char *oid_to_hex(const struct
Re: [PATCH 08/67] add reentrant variants of sha1_to_hex and find_unique_abbrev
On Tue, Sep 15, 2015 at 05:55:55PM +0100, Ramsay Jones wrote: > On 15/09/15 16:26, Jeff King wrote: > > The sha1_to_hex and find_unique_abbrev functions always > > write into reusable static buffers. There are a few problems > > with this: > > > > - future calls overwrite our result. This is especially > > annoying with find_unique_abbrev, which does not have a > > ring of buffers, so you cannot even printf() a result > > that has two abbreviated sha1s. > > > > - if you want to put the result into another buffer, we > > often strcpy, which looks suspicious when auditing for > > overflows. > > > > This patch introduces sha1_to_hex_to and find_unique_abbrev_to, > > which write into a user-provided buffer. Of course this is > > just punting on the overflow-auditing, as the buffer > > obviously needs to be GIT_SHA1_HEXSZ + 1 bytes. But it is > > much easier to audit, since that is a well-known size. > > Hmm, I haven't read any other patches yet (including those which use these > new '_to' functions), but I can't help feeling they should be named something > like 'sha1_to_hex_str()' and 'find_unique_abbrev_str()' instead. i.e. I > don't get > the '_to' thing - not that I'm any good at naming things ... I meant it as a contrast with their original. sha1_to_hex() formats into an internal buffer and returns it. But sha1_to_hex_to() formats "to" a buffer of your choice. I'm happy to switch the names to something else, but I don't think _str() conveys the difference. If I were starting from scratch, I would probably have just called my variant sha1_to_hex(), and called the original sha1_to_hex_unsafe(). :) -Peff -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 17/67] use xsnprintf for generating git object headers
We generally use 32-byte buffers to format git's "type size" header fields. These should not generally overflow unless you can produce some truly gigantic objects (and our types come from our internal array of constant strings). But it is a good idea to use xsnprintf to make sure this is the case. Note that we slightly modify the interface to write_sha1_file_prepare, which nows uses "hdrlen" as an "in" parameter as well as an "out" (on the way in it stores the allocated size of the header, and on the way out it returns the ultimate size of the header). Signed-off-by: Jeff King--- builtin/index-pack.c | 2 +- bulk-checkin.c | 4 ++-- fast-import.c| 4 ++-- http-push.c | 2 +- sha1_file.c | 13 +++-- 5 files changed, 13 insertions(+), 12 deletions(-) diff --git a/builtin/index-pack.c b/builtin/index-pack.c index 3431de2..1ad1bde 100644 --- a/builtin/index-pack.c +++ b/builtin/index-pack.c @@ -441,7 +441,7 @@ static void *unpack_entry_data(unsigned long offset, unsigned long size, int hdrlen; if (!is_delta_type(type)) { - hdrlen = sprintf(hdr, "%s %lu", typename(type), size) + 1; + hdrlen = xsnprintf(hdr, sizeof(hdr), "%s %lu", typename(type), size) + 1; git_SHA1_Init(); git_SHA1_Update(, hdr, hdrlen); } else diff --git a/bulk-checkin.c b/bulk-checkin.c index 7cffc3a..4347f5c 100644 --- a/bulk-checkin.c +++ b/bulk-checkin.c @@ -200,8 +200,8 @@ static int deflate_to_pack(struct bulk_checkin_state *state, if (seekback == (off_t) -1) return error("cannot find the current offset"); - header_len = sprintf((char *)obuf, "%s %" PRIuMAX, -typename(type), (uintmax_t)size) + 1; + header_len = xsnprintf((char *)obuf, sizeof(obuf), "%s %" PRIuMAX, + typename(type), (uintmax_t)size) + 1; git_SHA1_Init(); git_SHA1_Update(, obuf, header_len); diff --git a/fast-import.c b/fast-import.c index 6c7c3c9..d0c2502 100644 --- a/fast-import.c +++ b/fast-import.c @@ -1035,8 +1035,8 @@ static int store_object( git_SHA_CTX c; git_zstream s; - hdrlen = sprintf((char *)hdr,"%s %lu", typename(type), - (unsigned long)dat->len) + 1; + hdrlen = xsnprintf((char *)hdr, sizeof(hdr), "%s %lu", + typename(type), (unsigned long)dat->len) + 1; git_SHA1_Init(); git_SHA1_Update(, hdr, hdrlen); git_SHA1_Update(, dat->buf, dat->len); diff --git a/http-push.c b/http-push.c index 154e67b..1f3788f 100644 --- a/http-push.c +++ b/http-push.c @@ -361,7 +361,7 @@ static void start_put(struct transfer_request *request) git_zstream stream; unpacked = read_sha1_file(request->obj->sha1, , ); - hdrlen = sprintf(hdr, "%s %lu", typename(type), len) + 1; + hdrlen = xsnprintf(hdr, sizeof(hdr), "%s %lu", typename(type), len) + 1; /* Set it up */ git_deflate_init(, zlib_compression_level); diff --git a/sha1_file.c b/sha1_file.c index d295a32..f106091 100644 --- a/sha1_file.c +++ b/sha1_file.c @@ -1464,7 +1464,7 @@ int check_sha1_signature(const unsigned char *sha1, void *map, return -1; /* Generate the header */ - hdrlen = sprintf(hdr, "%s %lu", typename(obj_type), size) + 1; + hdrlen = xsnprintf(hdr, sizeof(hdr), "%s %lu", typename(obj_type), size) + 1; /* Sha1.. */ git_SHA1_Init(); @@ -2930,7 +2930,7 @@ static void write_sha1_file_prepare(const void *buf, unsigned long len, git_SHA_CTX c; /* Generate the header */ - *hdrlen = sprintf(hdr, "%s %lu", type, len)+1; + *hdrlen = xsnprintf(hdr, *hdrlen, "%s %lu", type, len)+1; /* Sha1.. */ git_SHA1_Init(); @@ -2993,7 +2993,7 @@ int hash_sha1_file(const void *buf, unsigned long len, const char *type, unsigned char *sha1) { char hdr[32]; - int hdrlen; + int hdrlen = sizeof(hdr); write_sha1_file_prepare(buf, len, type, sha1, hdr, ); return 0; } @@ -3139,7 +3139,7 @@ static int freshen_packed_object(const unsigned char *sha1) int write_sha1_file(const void *buf, unsigned long len, const char *type, unsigned char *sha1) { char hdr[32]; - int hdrlen; + int hdrlen = sizeof(hdr); /* Normally if we have it in the pack then we do not bother writing * it out into .git/objects/??/?{38} file. @@ -3157,7 +3157,8 @@ int hash_sha1_file_literally(const void *buf, unsigned long len, const char *typ int hdrlen, status = 0; /* type string, SP, %lu of the length plus NUL must fit this */ - header = xmalloc(strlen(type) + 32); + hdrlen = strlen(type) + 32; + header = xmalloc(hdrlen); write_sha1_file_prepare(buf, len, type, sha1, header, ); if (!(flags & HASH_WRITE_OBJECT)) @@ -3185,7