Re: [PATCH] t7800 readlink not found
Torsten Bögershausenwrites: >> diff --git a/t/t7800-difftool.sh b/t/t7800-difftool.sh >> index 7ce4cd7..905035c 100755 >> --- a/t/t7800-difftool.sh >> +++ b/t/t7800-difftool.sh >> @@ -446,7 +446,7 @@ write_script .git/CHECK_SYMLINKS <<\EOF >> for f in file file2 sub/sub >> do >> echo "$f" >> -readlink "$2/$f" >> +ls -ld "$2/$f" | sed -e 's/.* -> //' >> done >actual >> EOF >> > I don't know how portable #ls -ld" really is. The parts with mode bits, nlinks, uid, gid, size, and date part do have some variations. For example, we have been burned on ACL enabled systems having some funny suffix after the usual mode bits stuff. However, as far as this test is concerned, I do not think "how portable is the output from ls -ld" is an especially relevant question. None of the things we expect early in the output (the fields I enumerated in the previous paragraph) would contain " -> ". And we know that we do not use a filename that has " -> " (or "->") as a substring in our tests. We don't have to use readlink, even on platforms where we do have readlink. Building the conditional to be checked at runtime and providing a shell function read_link that uses "ls -ld | sed" or "readlink" depending on the runtime check is wasteful. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 09/13] refs: introduce an iterator interface
On Mon, May 30, 2016 at 3:55 AM, Michael Haggertywrote: > [...] > This commit introduces a new iteration primitive for references: a > ref_iterator. A ref_iterator is a polymorphic object that a reference > storage backend can be asked to instantiate. There are three functions > that can be applied to a ref_iterator: > > * ref_iterator_advance(): move to the next reference in the iteration > * ref_iterator_abort(): end the iteration before it is exhausted > * ref_iterator_peel(): peel the reference currently being looked at > [...] > Signed-off-by: Michael Haggerty > --- > diff --git a/refs/refs-internal.h b/refs/refs-internal.h > @@ -249,6 +249,199 @@ int rename_ref_available(const char *oldname, const > char *newname); > +/* > + * Advance the iterator to the first or next item and return ITER_OK. > + * If the iteration is exhausted, free the resources associated with > + * the ref_iterator and return ITER_DONE. On errors, free the iterator > + * resources and return ITER_ERROR. It is a bug to use ref_iterator or > + * call this function again after it has returned false. > + */ Either: s/false/something other than ITER_OK/ or: s/false/ITER_DONE or ITER_ERROR/ > +int ref_iterator_advance(struct ref_iterator *ref_iterator); > + > +/* > + * An iterator over nothing (its first ref_iterator_advance() call > + * returns 0). > + */ s/0/ITER_DONE/ > +struct ref_iterator *empty_ref_iterator_begin(void); > + > +/* > + * Return true iff ref_iterator is an empty_ref_iterator. > + */ > +int is_empty_ref_iterator(struct ref_iterator *ref_iterator); I can see that you used this function as an optimization or convenience in overlay_ref_iterator_begin(), but do you expect it to be generally useful otherwise? Is it worth publishing? Do you have other use-cases in mind? Also, can you explain why the merge iterator doesn't also perform the optimization/convenience of checking if one iterator is an empty iterator? > +/* > + * Iterate over the intries from iter0 and iter1, with the values s/intries/entries/ > + * interleaved as directed by the select function. The iterator takes > + * ownership of iter0 and iter1 and frees them when the iteration is > + * over. > + */ > +struct ref_iterator *merge_ref_iterator_begin( > + struct ref_iterator *iter0, struct ref_iterator *iter1, > + ref_iterator_select_fn *select, void *cb_data); > + > +/* > + * An iterator consisting of the union of the entries from iter0 and > + * iter1. If there are entries common to the two sub-iterators, use > + * the one from iter1. Each iterator must iterate over its entries in > + * strcmp() order by refname for this to work. > + * > + * The new iterator takes ownership of its arguments and frees them > + * when the iteration is over. As a convenience to callers, if iter0 > + * or iter1 is_empty_ref_iterator(), then abort that one immediately > + * and return the other iterator directly, without wrapping it. > + */ > +struct ref_iterator *overlay_ref_iterator_begin(struct ref_iterator *iter0, > + struct ref_iterator *iter1); When reading about the overlay iterator (both code and documentation), my expectation was that iter0 would shadow iter1, not the other way around as implemented here. Of course, that's entirely subjective, but the generic names don't provide any useful clues as to which shadows which. Perhaps giving them more meaningful names would help. > +/* > + * Wrap iter0, only letting through the references whose names start > + * with prefix. If trim is set, set iter->refname to the name of the > + * reference with that many characters trimmed off the front; > + * otherwise set it to the full refname. The new iterator takes over > + * ownership of iter0 and frees it when iteration is over. It makes > + * its own copy of prefix. > + * > + * As an convenience to callers, if prefix is the empty string and > + * trim is zero, this function returns iter0 directly, without > + * wrapping it. > + */ > +struct ref_iterator *prefix_ref_iterator_begin(struct ref_iterator *iter0, > + const char *prefix, > + int trim); Minor: Similarly, when reading the code and documentation, I wondered why this was named 'iter0' when no 'iter1' was in sight. Perhaps name it simply 'iter'. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] t7800 readlink not found
On 05/31/2016 02:26 AM, Armin Kunaschik wrote: On 05/27/2016 06:19 AM, David Aguilar wrote: On Wed, May 25, 2016 at 11:33:33AM +0200, Armin Kunaschik wrote: Would you mind submitting a patch so that we can support these tests when running on AIX/HP-UX? I don't feel comfortable to submit patches for tests I can't verify. I don't have valgrind and python/p4 here. Looking to the code I'd say, patching the p4 tests with "ls -ld | sed" looks quite save. But I'm not sure about the test-lib.sh. When you are really super paranoid, as written in the comment, you should probably use perl like perl -e 'print readlink $ARGV[0]' $name as a replacement. So, as suggested by Junio, here the readlink workaround for t7800 only. (hopefully whitespace clean this time) --- 8< --- 8< --- From: Armin KunaschikSubject: t7800: readlink is not portable The readlink(1) command is not available on all platforms (notably not on AIX and HP-UX) and can be replaced in this test with the "workaround" ls -ld | sed -e 's/.* -> //' This is no universal readlink replacement but works in the controlled test environment good enough. Signed-off-by: Armin Kunaschik --- diff --git a/t/t7800-difftool.sh b/t/t7800-difftool.sh index 7ce4cd7..905035c 100755 --- a/t/t7800-difftool.sh +++ b/t/t7800-difftool.sh @@ -446,7 +446,7 @@ write_script .git/CHECK_SYMLINKS <<\EOF for f in file file2 sub/sub do echo "$f" - readlink "$2/$f" + ls -ld "$2/$f" | sed -e 's/.* -> //' done >actual EOF I don't know how portable #ls -ld" really is. If there is one platform, that doesn't support readlink, would it make sense to implement readlink() in test-lib.sh, similar to what we have for MINGW, e.g. sort() or find() ? And keep t7800 as it is ? -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC/PATCH] bisect--helper: `bisect_clean_state` shell function in C
On 05/30/2016 08:21 PM, Pranit Bauva wrote: > Reimplement `bisect_clean_state` shell function in C and add a > `bisect-clean-state` subcommand to `git bisect--helper` to call it from > git-bisect.sh . > > Using `bisect_clean_state` subcommand is a measure to port shell > function to C so as to use the existing test suite. As more functions > are ported, this subcommand will be retired and will be called by > bisect_reset() and bisect_start(). > > Mentored-by: Lars Schneider> Mentored-by: Christian Couder > Signed-off-by: Pranit Bauva > --- > This patch contains a bug. I have tried to identify the bug and I suppose it > exists in do_for_each_entry_in_dir(). I have reproduced the debugging session > at this link[1]. I have seen that some patches in mailing list regarding > iterating over refs. Will those affect this? Or is this bug fixed in those > patches? The problem is that it is not legal to modify references while iterating over them. See [1]. Your remove_bisect_ref() callback function deletes references, which modifies the reference cache that is being iterated over. Instead I suggest that your remove_bisect_ref() add the references to a string_list, then call delete_refs() *after* the iteration is over. Alternatively, you can change remove_bisect_ref() to call ref_transaction_delete() to add reference deletions to a ref_transaction, then call ref_transaction_commit() after the iteration is over. See the rm() function in builtin/remote.c [2] for an example. [1] https://github.com/git/git/blob/f3913c2d03abc660140678a9e14dac399f847647/refs.h#L176-L184 [2] https://github.com/git/git/blob/f3913c2d03abc660140678a9e14dac399f847647/builtin/remote.c#L738 > [...] Michael -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/2] strbuf: allow to use preallocated memory
On 05/30/2016 02:52 PM, Matthieu Moy wrote: > [...] I feel bad bikeshedding about names, especially since you took some of the original names from my RFC. But names are very important, so I think it's worth considering whether the current names could be improved upon. When reading this patch series, I found I had trouble remembering whether "preallocated" meant "preallocated and movable" or "preallocated and immovable". So maybe we should brainstorm alternatives to "preallocated" and "fixed". For example, * "growable"/"fixed"? Seems OK, though all strbufs are growable at least to the size of their initial allocation, so maybe "growable" is misleading. * "movable"/"fixed"? This maybe better captures the essence of the distinction. I'll use those names below for concreteness, without claiming that they are the best. > * strbuf_attach() calls strbuf_release(), which allows reusing an > existing strbuf. strbuf_wrap_preallocated() calls strbuf_init which > would override silently any previous content. I think strbuf_attach() > does the right thing here. Hmmm I think the best way to answer these questions is to think about use cases and make them as easy/consistent as possible. I expect that a very common use of strbuf_wrap_fixed() will be to wrap a stack-allocated string, like char pathbuf[PATH_MAX]; struct strbuf path; strbuf_wrap_fixed(, pathbuf, 0, sizeof(pathbuf)); In this use case, it would be a shame if `path` had to be initialized to STRBUF_INIT just because `strbuf_wrap_fixed()` calls `strbuf_release()` internally. But maybe we could make this use case easier still. If there were a macro #define STRBUF_FIXED_WRAPPER(sb, buf, len) { STRBUF_FIXED_MEMORY, sizeof(buf), (len), (buf) } then we could write char pathbuf[PATH_MAX]; struct strbuf path = STRBUF_FIXED_WRAPPER(pathbuf, 0); I think that would be pretty usable. One would have to be careful only to wrap arrays and not `char *` pointers, because `sizeof` wouldn't work on the latter. The BARF_UNLESS_AN_ARRAY macro could be used here to add a little safety. (It would be even nicer if we could write struct strbuf path = STRBUF_FIXED(PATH_MAX); and it would initialize both path and a pathbuf variable for it to wrap, but I don't think there is a way to implement such a macro. So I think the only way to squeeze this onto one line would be to make it look like DEFINE_FIXED_STRBUF(path, PATH_MAX); But that looks awful, so I think the two-line version above is preferable.) Similarly, there could be a macro #define STRBUF_MOVABLE_WRAPPER(sb, buf, len) { 0, sizeof(buf), (len), (buf) } If you provide macro forms like these for initializing strbufs, then I agree with Matthieu that the analogous functional forms should probably call strbuf_release() before wrapping the array. The functions might be named more like `strbuf_attach()` to emphasize their similarity to that existing function. Maybe strbuf_attach_fixed(struct strbuf *sb, void *s, size_t len, size_t alloc); strbuf_attach_movable(struct strbuf *sb, void *s, size_t len, size_t alloc); Michael -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/3] pretty: support "mboxrd" output format
On Mon, May 30, 2016 at 7:21 PM, Eric Wongwrote: > This output format prevents format-patch output from breaking > readers if somebody copy+pasted an mbox into a commit message. > > Unlike the traditional "mboxo" format, "mboxrd" is designed to > be fully-reversible. "mboxrd" also gracefully degrades to > showing extra ">" in existing "mboxo" readers. > [...] > Signed-off-by: Eric Wong > --- > diff --git a/pretty.c b/pretty.c > @@ -1697,12 +1699,34 @@ static void pp_handle_indent(struct > pretty_print_context *pp, > +static regex_t *mboxrd_prepare(void) > +{ > + static regex_t preg; > + const char re[] = "^>*From "; > + int err = regcomp(, re, REG_NOSUB | REG_EXTENDED); > +[...] > + return > +} > + > void pp_remainder(struct pretty_print_context *pp, > const char **msg_p, > struct strbuf *sb, > int indent) > { > + static regex_t *mboxrd_from; > + > + if (pp->fmt == CMIT_FMT_MBOXRD && !mboxrd_from) > + mboxrd_from = mboxrd_prepare(); > + > @@ -1725,8 +1749,13 @@ void pp_remainder(struct pretty_print_context *pp, > else if (pp->expand_tabs_in_log) > strbuf_add_tabexpand(sb, pp->expand_tabs_in_log, > line, linelen); > - else > + else { > + if (pp->fmt == CMIT_FMT_MBOXRD && > + !regexec(mboxrd_from, line, 0, 0, 0)) > + strbuf_addch(sb, '>'); At first glance, this seems dangerous since it's handing 'line' to regexec() without paying attention to 'linelen'. For an arbitrary regex, this could result in erroneous matches on subsequent "lines", however, since this expression is anchored with '^', it's not a problem. But, it is a bit subtle. I wonder if hand-coding, rather than using a regex, could be an improvement: static int is_mboxrd_from(const char *s, size_t n) { size_t f = strlen("From "); const char *t = s + n; while (s < t && *s == '>') s++; return t - s >= f && !memcmp(s, "From ", f); } ... if (is_mboxrd_from(line, linelen) strbuf_addch(sb, '>'); or something. > + > strbuf_add(sb, line, linelen); > + } > strbuf_addch(sb, '\n'); > } > } > diff --git a/t/t4014-format-patch.sh b/t/t4014-format-patch.sh > @@ -1565,4 +1565,31 @@ test_expect_success 'format-patch --base overrides > format.useAutoBase' ' > +test_expect_success 'format-patch --pretty=mboxrd' ' > + cat >msg <<-INPUT_END && Maybe use <<-\INPUT_END to indicate that no variable interpolation is expected. Ditto below. > + mboxrd should escape the body > + > + From could trip up a loose mbox parser > + >From extra escape for reversibility > + >>From extra escape for reversibility 2 > + from lower case not escaped > + Fromm bad speling not escaped > +From with leading space not escaped > + INPUT_END > + > + cat >expect <<-INPUT_END && > + >From could trip up a loose mbox parser > + >>From extra escape for reversibility > + >>>From extra escape for reversibility 2 > + from lower case not escaped > + Fromm bad speling not escaped > +From with leading space not escaped > + INPUT_END > + > + C=$(git commit-tree HEAD^^{tree} -p HEAD + git format-patch --pretty=mboxrd --stdout -1 $C~1..$C >patch && > + grep -A5 "^>From could trip up a loose mbox parser" patch >actual && Hmm, -A is not POSIX and is otherwise not used in Git tests. Perhaps you could use 'git grep --no-index -A' instead or something? > + test_cmp expect actual > +' > + > test_done -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/2] strbuf: allow to use preallocated memory
On 05/30/2016 02:13 PM, Johannes Schindelin wrote: > [...] >> @@ -38,7 +67,11 @@ char *strbuf_detach(struct strbuf *sb, size_t *sz) >> { >> char *res; >> strbuf_grow(sb, 0); >> -res = sb->buf; >> +if (sb->flags & STRBUF_OWNS_MEMORY) >> +res = sb->buf; >> +else >> +res = xmemdupz(sb->buf, sb->alloc - 1); > > This looks like a usage to be avoided: if we plan to detach the buffer, > anyway, there is no good reason to allocate it on the heap first. I would > at least issue a warning here. I think this last case should be changed to res = xmemdupz(sb->buf, sb->len); Johannes, if this change is made then I think that there is a reasonable use case for calling `strbuf_detach()` on a strbuf that wraps a stack-allocated string, so I don't think that a warning is needed. I think this change makes sense. After all, once a caller detaches a string, it is probably not planning on growing/shrinking it anymore, so any more space than that would probably be wasted. In fact, since the caller has no way to ask how much extra space the detached string has allocated, it is almost guaranteed that the space would be wasted. Actually, that is not 100% certain. Theoretically, a caller might read `sb->alloc` *before* calling `strbuf_detach()`, and assume that the detached string has that allocated size. Or the caller might call `strbuf_grow()` then assume that the detached string has at least that much free space. I sure hope that no callers actually do that. The docstring for `strbuf_detach()` doesn't promise that it will work, and there is a pretty stern warning [1] that should hopefully have dissuaded developers from such a usage. But how could it be checked for sure? * Audit the callers of strbuf_detach(). But given that there are nearly 200 callers, that would be a huge amount of work. * On a test branch, change the existing implementation of strbuf_detach() to return newly-allocated memory of size `sb->len + 1` and free `sb->buf`, then run the test suite under valgrind. This would flush out examples of this antipattern in the test suite. It might seem like we don't have to worry about this, because existing callers only deal with strbufs that wrap heap-allocated strings. But such a caller might get a strbuf passed to it from a caller, and that caller might someday be modified to use stack-allocated strings. So I think that at least the valgrind test suggested above would be prudent. Michael [1] https://github.com/git/git/blob/f3913c2d03abc660140678a9e14dac399f847647/strbuf.h#L20-L23 -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/2] strbuf: add tests
Hi, Cool that you are working on this! See my comments below. On 05/30/2016 12:36 PM, William Duclot wrote: > Test the strbuf API. Being used throughout all Git the API could be > considered tested, but adding specific tests makes it easier to improve > and extend the API. > --- > Makefile | 1 + > t/helper/test-strbuf.c | 69 > ++ > t/t0082-strbuf.sh | 19 ++ > 3 files changed, 89 insertions(+) > create mode 100644 t/helper/test-strbuf.c > create mode 100755 t/t0082-strbuf.sh > > diff --git a/Makefile b/Makefile > index 3f03366..dc84f43 100644 > --- a/Makefile > +++ b/Makefile > @@ -613,6 +613,7 @@ TEST_PROGRAMS_NEED_X += test-scrap-cache-tree > TEST_PROGRAMS_NEED_X += test-sha1 > TEST_PROGRAMS_NEED_X += test-sha1-array > TEST_PROGRAMS_NEED_X += test-sigchain > +TEST_PROGRAMS_NEED_X += test-strbuf > TEST_PROGRAMS_NEED_X += test-string-list > TEST_PROGRAMS_NEED_X += test-submodule-config > TEST_PROGRAMS_NEED_X += test-subprocess > diff --git a/t/helper/test-strbuf.c b/t/helper/test-strbuf.c > new file mode 100644 > index 000..622f627 > --- /dev/null > +++ b/t/helper/test-strbuf.c > @@ -0,0 +1,69 @@ > +#include "git-compat-util.h" > +#include "strbuf.h" > + > +/* > + * Check behavior on usual use cases > + */ > +int test_usual(struct strbuf *sb) Is there a particular reason that you pass `sb` into the function? Why not use a local variable? It wouldn't hurt to declare this function static, because it is only used within this one compilation unit. On the other hand, in this particular case I don't think it matters much one way or the other. > +{ > + size_t size, old_alloc; > + char *res, *old_buf, *str_test = malloc(5*sizeof(char)); There is no reason to multiply by `sizeof(char)` here, and we don't do it in our code. (According to the C standard, `sizeof(char)` is always 1.) > + strbuf_grow(sb, 1); > + strcpy(str_test, "test"); > + old_alloc = sb->alloc; > + strbuf_grow(sb, 1000); > + if (old_alloc == sb->alloc) > + die("strbuf_grow does not realloc the buffer as expected"); It's not ideal that your test depends on the details of the allocation policy of strbuf. If somebody were to decide that it makes sense for the initial allocation of a strbuf to be 1024 characters, this test would break. I know it's implausible, but it's better to remove this coupling. You could do it by ensuring that you request more space than is already allocated: strbuf_grow(sb, sb.alloc - sb.len + 1000); On the other hand, if you *want* to test the details of strbuf's allocation policy here, you would do it explicitly rather than as the side effect of this test. For example, before calling `strbuf_grow(sb, 1000)`, you could insert: if (sb.alloc > SOME_VALUE) die("strbuf_grow(sb, 1) allocated too much space"); Though my opinion is that if you want to write tests of strbuf's allocation policies, it should be in an entirely separate test. > + old_buf = sb->buf; > + res = strbuf_detach(sb, ); Since you don't use `size`, you can pass `NULL` as the second argument of `strbuf_detach()`. The same below. Alternatively, maybe you want to add a test that `strbuf_detach()` sets `size` to the expected value. > + if (res != old_buf) > + die("strbuf_detach does not return the expected buffer"); > + free(res); > + > + strcpy(str_test, "test"); Near the beginning of the function you have this exact line, but here you do it again even though str_test wasn't touched between the two lines. One of them can be deleted... ...but in fact it would be easier to initialize str_test using our xstrdup() function: char *str_test = xstrdup("test"); > + strbuf_attach(sb, (void *)str_test, strlen(str_test), sizeof(str_test)); Since str_test is a `char *`, `sizeof(str_test)` returns the size of the pointer itself (e.g., 4 or 8 bytes, depending on your computer's architecture). What you want here is the size of the memory that it points at, which in this case is `strlen(str_test) + 1`. (You may be confusing this pattern with code that looks like this: char str_test[5]; In this case, `sizeof(str_test)` is indeed 5, because `str_test` is an array of characters rather than a pointer to a character. It's confusing.) Also, you don't need to cast `str_test` to `void *`. Any pointer can be converted to `void *` implicitly. > + res = strbuf_detach(sb, ); > + if (res != str_test) > + die("strbuf_detach does not return the expected buffer"); > + free(res); > + strbuf_release(sb); > + > + return 0; > +} > + > +int main(int argc, char *argv[]) > +{ > + size_t size = 1; > + struct strbuf sb; > + char str_test[5] = "test"; > + char str_foo[7] = "foo"; `size`, `str_test`, and `str_foo` are all unused. You should compile using `DEVELOPER=1` to enable a bunch of compiler warnings that Git developers tend to
Re: [PATCH 09/13] refs: introduce an iterator interface
On 05/30/2016 06:57 PM, Ramsay Jones wrote: > > > On 30/05/16 16:22, Ramsay Jones wrote: >> >> >> On 30/05/16 08:55, Michael Haggerty wrote: >> [snip] >> >>> /* Reference is a symbolic reference. */ >>> diff --git a/refs/files-backend.c b/refs/files-backend.c >>> index 8ab4d5f..dbf1587 100644 >>> --- a/refs/files-backend.c >>> +++ b/refs/files-backend.c >>> [...] >>> + sort_ref_dir(dir); >> >> do you need to sort here ... >>> [...] >>> + sort_ref_dir(level->dir); >> >> ... given that you sort here? > > I had intended to say 'or vice versa' here. When I wrote this, I had not > finished reading this patch (let alone the series). Now, I suspect that > you can simply drop this 'sort_ref_dir()' call site. Unless I've misread > the code, of course! ;-) Yes, you are right. Thanks for catching this! I'll fix it in v2. Michael -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] t7800 readlink not found
On 05/27/2016 06:19 AM, David Aguilar wrote: > On Wed, May 25, 2016 at 11:33:33AM +0200, Armin Kunaschik wrote: > > Would you mind submitting a patch so that we can support these > tests when running on AIX/HP-UX? I don't feel comfortable to submit patches for tests I can't verify. I don't have valgrind and python/p4 here. Looking to the code I'd say, patching the p4 tests with "ls -ld | sed" looks quite save. But I'm not sure about the test-lib.sh. When you are really super paranoid, as written in the comment, you should probably use perl like perl -e 'print readlink $ARGV[0]' $name as a replacement. So, as suggested by Junio, here the readlink workaround for t7800 only. (hopefully whitespace clean this time) --- 8< --- 8< --- From: Armin KunaschikSubject: t7800: readlink is not portable The readlink(1) command is not available on all platforms (notably not on AIX and HP-UX) and can be replaced in this test with the "workaround" ls -ld | sed -e 's/.* -> //' This is no universal readlink replacement but works in the controlled test environment good enough. Signed-off-by: Armin Kunaschik --- diff --git a/t/t7800-difftool.sh b/t/t7800-difftool.sh index 7ce4cd7..905035c 100755 --- a/t/t7800-difftool.sh +++ b/t/t7800-difftool.sh @@ -446,7 +446,7 @@ write_script .git/CHECK_SYMLINKS <<\EOF for f in file file2 sub/sub do echo "$f" - readlink "$2/$f" + ls -ld "$2/$f" | sed -e 's/.* -> //' done >actual EOF -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 3/3] am: support --patch-format=mboxrd
Combined with "git format-patch --pretty=mboxrd", this should allow us to round-trip commit messages with embedded mbox "From " lines without corruption. Signed-off-by: Eric Wong--- Documentation/git-am.txt | 3 ++- builtin/am.c | 14 +++--- t/t4150-am.sh| 20 3 files changed, 33 insertions(+), 4 deletions(-) diff --git a/Documentation/git-am.txt b/Documentation/git-am.txt index 13cdd7f..6348c29 100644 --- a/Documentation/git-am.txt +++ b/Documentation/git-am.txt @@ -116,7 +116,8 @@ default. You can use `--no-utf8` to override this. By default the command will try to detect the patch format automatically. This option allows the user to bypass the automatic detection and specify the patch format that the patch(es) should be - interpreted as. Valid formats are mbox, stgit, stgit-series and hg. + interpreted as. Valid formats are mbox, mboxrd, + stgit, stgit-series and hg. -i:: --interactive:: diff --git a/builtin/am.c b/builtin/am.c index 3dfe70b..d5da5fe 100644 --- a/builtin/am.c +++ b/builtin/am.c @@ -70,7 +70,8 @@ enum patch_format { PATCH_FORMAT_MBOX, PATCH_FORMAT_STGIT, PATCH_FORMAT_STGIT_SERIES, - PATCH_FORMAT_HG + PATCH_FORMAT_HG, + PATCH_FORMAT_MBOXRD }; enum keep_type { @@ -712,7 +713,8 @@ done: * Splits out individual email patches from `paths`, where each path is either * a mbox file or a Maildir. Returns 0 on success, -1 on failure. */ -static int split_mail_mbox(struct am_state *state, const char **paths, int keep_cr) +static int split_mail_mbox(struct am_state *state, const char **paths, + int keep_cr, int mboxrd) { struct child_process cp = CHILD_PROCESS_INIT; struct strbuf last = STRBUF_INIT; @@ -724,6 +726,8 @@ static int split_mail_mbox(struct am_state *state, const char **paths, int keep_ argv_array_push(, "-b"); if (keep_cr) argv_array_push(, "--keep-cr"); + if (mboxrd) + argv_array_push(, "--mboxrd"); argv_array_push(, "--"); argv_array_pushv(, paths); @@ -965,13 +969,15 @@ static int split_mail(struct am_state *state, enum patch_format patch_format, switch (patch_format) { case PATCH_FORMAT_MBOX: - return split_mail_mbox(state, paths, keep_cr); + return split_mail_mbox(state, paths, keep_cr, 0); case PATCH_FORMAT_STGIT: return split_mail_conv(stgit_patch_to_mail, state, paths, keep_cr); case PATCH_FORMAT_STGIT_SERIES: return split_mail_stgit_series(state, paths, keep_cr); case PATCH_FORMAT_HG: return split_mail_conv(hg_patch_to_mail, state, paths, keep_cr); + case PATCH_FORMAT_MBOXRD: + return split_mail_mbox(state, paths, keep_cr, 1); default: die("BUG: invalid patch_format"); } @@ -2201,6 +2207,8 @@ static int parse_opt_patchformat(const struct option *opt, const char *arg, int *opt_value = PATCH_FORMAT_STGIT_SERIES; else if (!strcmp(arg, "hg")) *opt_value = PATCH_FORMAT_HG; + else if (!strcmp(arg, "mboxrd")) + *opt_value = PATCH_FORMAT_MBOXRD; else return error(_("Invalid value for --patch-format: %s"), arg); return 0; diff --git a/t/t4150-am.sh b/t/t4150-am.sh index b41bd17..74e093d 100755 --- a/t/t4150-am.sh +++ b/t/t4150-am.sh @@ -957,4 +957,24 @@ test_expect_success 'am -s unexpected trailer block' ' test_cmp expect actual ' +test_expect_success 'am --patch-format=mboxrd handles mboxrd' ' + rm -fr .git/rebase-apply && + git checkout -f first && + echo mboxrd >>file && + git add file && + cat >msg <<-INPUT_END && + mboxrd should escape the body + + From could trip up a loose mbox parser + >From extra escape for reversibility + INPUT_END + git commit -F msg && + git format-patch --pretty=mboxrd --stdout -1 >mboxrd1 && + grep "^>From could trip up a loose mbox parser" mboxrd1 && + git checkout -f first && + git am --patch-format=mboxrd mboxrd1 && + git cat-file commit HEAD | tail -n4 >out && + test_cmp msg out +' + test_done -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 2/3] mailsplit: support unescaping mboxrd messages
This will allow us to parse the output of --pretty=mboxrd and the output of other mboxrd generators. Signed-off-by: Eric Wong--- Documentation/git-mailsplit.txt | 7 ++- builtin/mailsplit.c | 23 +++ t/t5100-mailinfo.sh | 13 + t/t5100/0001mboxrd | 4 t/t5100/0002mboxrd | 3 +++ t/t5100/sample.mboxrd | 17 + 6 files changed, 66 insertions(+), 1 deletion(-) create mode 100644 t/t5100/0001mboxrd create mode 100644 t/t5100/0002mboxrd create mode 100644 t/t5100/sample.mboxrd diff --git a/Documentation/git-mailsplit.txt b/Documentation/git-mailsplit.txt index 4d1b871..e3b2a88 100644 --- a/Documentation/git-mailsplit.txt +++ b/Documentation/git-mailsplit.txt @@ -8,7 +8,8 @@ git-mailsplit - Simple UNIX mbox splitter program SYNOPSIS [verse] -'git mailsplit' [-b] [-f] [-d] [--keep-cr] -o [--] [(|)...] +'git mailsplit' [-b] [-f] [-d] [--keep-cr] [--mboxrd] + -o [--] [(|)...] DESCRIPTION --- @@ -47,6 +48,10 @@ OPTIONS --keep-cr:: Do not remove `\r` from lines ending with `\r\n`. +--mboxrd:: + Input is of the "mboxrd" format and "^>+From " line escaping is + reversed. + GIT --- Part of the linkgit:git[1] suite diff --git a/builtin/mailsplit.c b/builtin/mailsplit.c index 4859ede..fad871d 100644 --- a/builtin/mailsplit.c +++ b/builtin/mailsplit.c @@ -45,6 +45,7 @@ static int is_from_line(const char *line, int len) static struct strbuf buf = STRBUF_INIT; static int keep_cr; +static regex_t *gtfrom; /* Called with the first line (potentially partial) * already in buf[] -- normally that should begin with @@ -77,6 +78,10 @@ static int split_one(FILE *mbox, const char *name, int allow_bare) strbuf_addch(, '\n'); } + if (gtfrom && buf.len > (sizeof(">From ") - 1) && + !regexec(gtfrom, buf.buf, 0, 0, 0)) + strbuf_remove(, 0, 1); + if (fwrite(buf.buf, 1, buf.len, output) != buf.len) die_errno("cannot write output"); @@ -242,6 +247,22 @@ out: return ret; } +static regex_t *gtfrom_prepare(void) +{ + static regex_t preg; + const char re[] = "^>+From "; + int err = regcomp(, re, REG_NOSUB | REG_EXTENDED); + + if (err) { + char errbuf[1024]; + regerror(err, , errbuf, sizeof(errbuf)); + regfree(); + die("Cannot prepare regexp `%s': %s", re, errbuf); + } + + return +} + int cmd_mailsplit(int argc, const char **argv, const char *prefix) { int nr = 0, nr_prec = 4, num = 0; @@ -271,6 +292,8 @@ int cmd_mailsplit(int argc, const char **argv, const char *prefix) keep_cr = 1; } else if ( arg[1] == 'o' && arg[2] ) { dir = arg+2; + } else if (!strcmp(arg, "--mboxrd")) { + gtfrom = gtfrom_prepare(); } else if ( arg[1] == '-' && !arg[2] ) { argp++; /* -- marks end of options */ break; diff --git a/t/t5100-mailinfo.sh b/t/t5100-mailinfo.sh index 85b3df5..62b442c 100755 --- a/t/t5100-mailinfo.sh +++ b/t/t5100-mailinfo.sh @@ -111,4 +111,17 @@ test_expect_success 'mailinfo on message with quoted >From' ' test_cmp "$TEST_DIRECTORY"/t5100/quoted-from.expect quoted-from/msg ' +test_expect_success 'mailinfo unescapes with --mboxrd' ' + mkdir mboxrd && + git mailsplit -omboxrd --mboxrd \ + "$TEST_DIRECTORY"/t5100/sample.mboxrd >last && + test x"$(cat last)" = x2 && + for i in 0001 0002 + do + git mailinfo mboxrd/msg mboxrd/patch \ + mboxrd/out && + test_cmp "$TEST_DIRECTORY"/t5100/${i}mboxrd mboxrd/msg + done +' + test_done diff --git a/t/t5100/0001mboxrd b/t/t5100/0001mboxrd new file mode 100644 index 000..494ec55 --- /dev/null +++ b/t/t5100/0001mboxrd @@ -0,0 +1,4 @@ +From the beginning, mbox should have been mboxrd +>From escaped +From not mangled but this line should have been escaped + diff --git a/t/t5100/0002mboxrd b/t/t5100/0002mboxrd new file mode 100644 index 000..3c30b0b --- /dev/null +++ b/t/t5100/0002mboxrd @@ -0,0 +1,3 @@ + >From unchanged + From also unchanged + diff --git a/t/t5100/sample.mboxrd b/t/t5100/sample.mboxrd new file mode 100644 index 000..75de6dd --- /dev/null +++ b/t/t5100/sample.mboxrd @@ -0,0 +1,17 @@ +From mboxrd Mon Sep 17 00:00:00 2001 +From: mboxrd writer +Date: Fri, 9 Jun 2006 00:44:16 -0700 +Subject: [PATCH] a commit with escaped From lines + +>From the beginning, mbox should have been mboxrd +>>From escaped +From not mangled but this line should have been escaped + +From mboxrd Mon Sep 17 00:00:00 2001 +From: mboxrd writer
[PATCH 1/3] pretty: support "mboxrd" output format
This output format prevents format-patch output from breaking readers if somebody copy+pasted an mbox into a commit message. Unlike the traditional "mboxo" format, "mboxrd" is designed to be fully-reversible. "mboxrd" also gracefully degrades to showing extra ">" in existing "mboxo" readers. This degradation is preferable to breaking message splitting completely, a problem I've seen in "mboxcl" due to having multiple, non-existent, or inaccurate Content-Length headers. "mboxcl2" is a non-starter since it's inherits the problems of "mboxcl" while being completely incompatible with existing tooling based around mailsplit. ref: http://homepage.ntlworld.com/jonathan.deboynepollard/FGA/mail-mbox-formats.html Signed-off-by: Eric Wong--- builtin/log.c | 2 +- commit.h| 6 ++ log-tree.c | 4 ++-- pretty.c| 45 + t/t4014-format-patch.sh | 27 +++ 5 files changed, 73 insertions(+), 11 deletions(-) diff --git a/builtin/log.c b/builtin/log.c index 099f4f7..6d6f368 100644 --- a/builtin/log.c +++ b/builtin/log.c @@ -953,7 +953,7 @@ static void make_cover_letter(struct rev_info *rev, int use_stdout, struct pretty_print_context pp = {0}; struct commit *head = list[0]; - if (rev->commit_format != CMIT_FMT_EMAIL) + if (!cmit_fmt_is_mail(rev->commit_format)) die(_("Cover letter needs email format")); committer = git_committer_info(0); diff --git a/commit.h b/commit.h index b06db4d..1e04d3a 100644 --- a/commit.h +++ b/commit.h @@ -131,11 +131,17 @@ enum cmit_fmt { CMIT_FMT_FULLER, CMIT_FMT_ONELINE, CMIT_FMT_EMAIL, + CMIT_FMT_MBOXRD, CMIT_FMT_USERFORMAT, CMIT_FMT_UNSPECIFIED }; +static inline int cmit_fmt_is_mail(enum cmit_fmt fmt) +{ + return (fmt == CMIT_FMT_EMAIL || fmt == CMIT_FMT_MBOXRD); +} + struct pretty_print_context { /* * Callers should tweak these to change the behavior of pp_* functions. diff --git a/log-tree.c b/log-tree.c index 78a5381..48daf84 100644 --- a/log-tree.c +++ b/log-tree.c @@ -603,7 +603,7 @@ void show_log(struct rev_info *opt) * Print header line of header.. */ - if (opt->commit_format == CMIT_FMT_EMAIL) { + if (cmit_fmt_is_mail(opt->commit_format)) { log_write_email_headers(opt, commit, , _headers, _8bit_cte); } else if (opt->commit_format != CMIT_FMT_USERFORMAT) { @@ -694,7 +694,7 @@ void show_log(struct rev_info *opt) if ((ctx.fmt != CMIT_FMT_USERFORMAT) && ctx.notes_message && *ctx.notes_message) { - if (ctx.fmt == CMIT_FMT_EMAIL) { + if (cmit_fmt_is_mail(ctx.fmt)) { strbuf_addstr(, "---\n"); opt->shown_dashes = 1; } diff --git a/pretty.c b/pretty.c index 87c4497..a33e604 100644 --- a/pretty.c +++ b/pretty.c @@ -92,6 +92,7 @@ static void setup_commit_formats(void) { "medium", CMIT_FMT_MEDIUM,0, 8 }, { "short", CMIT_FMT_SHORT, 0, 0 }, { "email", CMIT_FMT_EMAIL, 0, 0 }, + { "mboxrd", CMIT_FMT_MBOXRD,0, 0 }, { "fuller", CMIT_FMT_FULLER,0, 8 }, { "full", CMIT_FMT_FULL, 0, 8 }, { "oneline",CMIT_FMT_ONELINE, 1, 0 } @@ -444,7 +445,7 @@ void pp_user_info(struct pretty_print_context *pp, if (pp->mailmap) map_user(pp->mailmap, , , , ); - if (pp->fmt == CMIT_FMT_EMAIL) { + if (cmit_fmt_is_mail(pp->fmt)) { if (pp->from_ident && ident_cmp(pp->from_ident, )) { struct strbuf buf = STRBUF_INIT; @@ -494,6 +495,7 @@ void pp_user_info(struct pretty_print_context *pp, show_ident_date(, >date_mode)); break; case CMIT_FMT_EMAIL: + case CMIT_FMT_MBOXRD: strbuf_addf(sb, "Date: %s\n", show_ident_date(, DATE_MODE(RFC2822))); break; @@ -535,7 +537,7 @@ static void add_merge_info(const struct pretty_print_context *pp, { struct commit_list *parent = commit->parents; - if ((pp->fmt == CMIT_FMT_ONELINE) || (pp->fmt == CMIT_FMT_EMAIL) || + if ((pp->fmt == CMIT_FMT_ONELINE) || (cmit_fmt_is_mail(pp->fmt)) || !parent || !parent->next) return; @@ -1614,7 +1616,7 @@ void pp_title_line(struct pretty_print_context *pp, if (pp->after_subject) { strbuf_addstr(sb, pp->after_subject); } - if (pp->fmt == CMIT_FMT_EMAIL) { + if (cmit_fmt_is_mail(pp->fmt)) { strbuf_addch(sb, '\n'); } @@ -1697,12 +1699,34
[RFC/PATCH 0/3] support mboxrd format
Sometimes users will copy+paste an entire mbox into a commit message, leading to bad splits when a patch is output as an email. Unlike other mbox-family formats, mboxrd allows reversible round-tripping while avoiding bad splits for old "mboxo" readers. I'm also considering altering the current "From ${COMMIT} Mon Sep 17 00:00:00 2001" line to something else so mailsplit (or "am") can autodetect. Maybe: From ${COMMIT}@mboxrd Mon Sep 17 00:00:00 2001 ? We may also want to default to single escaping "From " in commit messages for --pretty=email to avoid corruption when somebody copy+pastes an mbox into the commit message. This is a technically incompatible change, but I think it's preferable to breaking splitting complete. In other words, --pretty=email changes to output "mboxo" for now. Long term (possibly git 3.0?), maybe mboxrd can become the default mail format. IMHO, it should've been since 2005. ref: http://homepage.ntlworld.com/jonathan.deboynepollard/FGA/mail-mbox-formats.html Eric Wong (3): pretty: support "mboxrd" output format mailsplit: support unescaping mboxrd messages am: support --patch-format=mboxrd Documentation/git-am.txt| 3 ++- Documentation/git-mailsplit.txt | 7 ++- builtin/am.c| 14 ++--- builtin/log.c | 2 +- builtin/mailsplit.c | 23 + commit.h| 6 ++ log-tree.c | 4 ++-- pretty.c| 45 + t/t4014-format-patch.sh | 27 + t/t4150-am.sh | 20 ++ t/t5100-mailinfo.sh | 13 t/t5100/0001mboxrd | 4 t/t5100/0002mboxrd | 3 +++ t/t5100/sample.mboxrd | 17 14 files changed, 172 insertions(+), 16 deletions(-) -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/2] strbuf: allow to use preallocated memory
On Tue, May 31, 2016 at 12:46:23AM +0200, William Duclot wrote: > Mike Hommeywrites: > >> struct strbuf { > >> + unsigned int flags; > >>size_t alloc; > >>size_t len; > >>char *buf; > >> }; > > > > Depending whether the size of strbuf matters, it /might/ be worth > > considering some packing here. malloc() usually returns buffers that can > > contain more data than what is requested. Which means allocation sizes > > could be rounded and that wouldn't change the amount of allocated > > memory. On glibc malloc_usable_size(malloc(1)) apparently returns 24. > > On jemalloc, it's 4 or 8. It's in the same ballbark with many > > allocators. > > > > So, it would be possible to round alloc such that it's always a multiple > > of, say, 4, and stick flags in the low, unused bits. > > If I'm not mistaken, the memory allocated is not necessarily linear with > the size asked, depending on the algorithm used by the allocator and/or > the kernel. The system for exemple use powers of two, if the user asks > for exactly 2^x bytes, adding the space for the flags would lead to an > allocation of 2^(x+1) bytes. No, it would not. If you requested 129 bytes, you'd request 136 instead, which the allocator would round to the same power of two. If you requested 128, you'd still request 128. It's not about adding space in the allocated buffer for the flags, it's about needing less bits in `alloc` because those bits are effectively useless because of how allocators work. > Way worse than storing an unsigned. > If the allocator use a fibonnaci system, we can't even rely on multiples > of 4 (or 2). > I'm not sure the fibonnaci system is actually used by any allocator, but > my point is that I'm not sure it is a good thing to rely on such > low-level implementations. Allocators have constraints related to word sizes and alignment, so they are pretty much guaranteed to align things to powers of two. Mike -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/2] strbuf: allow to use preallocated memory
Mike Hommeywrites: >> struct strbuf { >> +unsigned int flags; >> size_t alloc; >> size_t len; >> char *buf; >> }; > > Depending whether the size of strbuf matters, it /might/ be worth > considering some packing here. malloc() usually returns buffers that can > contain more data than what is requested. Which means allocation sizes > could be rounded and that wouldn't change the amount of allocated > memory. On glibc malloc_usable_size(malloc(1)) apparently returns 24. > On jemalloc, it's 4 or 8. It's in the same ballbark with many > allocators. > > So, it would be possible to round alloc such that it's always a multiple > of, say, 4, and stick flags in the low, unused bits. If I'm not mistaken, the memory allocated is not necessarily linear with the size asked, depending on the algorithm used by the allocator and/or the kernel. The system for exemple use powers of two, if the user asks for exactly 2^x bytes, adding the space for the flags would lead to an allocation of 2^(x+1) bytes. Way worse than storing an unsigned. If the allocator use a fibonnaci system, we can't even rely on multiples of 4 (or 2). I'm not sure the fibonnaci system is actually used by any allocator, but my point is that I'm not sure it is a good thing to rely on such low-level implementations. > Whether it's worth doing is another question. I'd say it is not, we generally lack time more than space, storing an unsigned seems very reasonable compared to operations involved in using malloc() leftovers :) Not even talking about growing buffers, so realloc(), so a whole new set of problems... -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] Documentation: GPG capitalization
David Nicolsonwrites: >>> message part may contain a signature that Git itself doesn't >>> -care about, but that can be verified with gpg. >>> +care about, but that can be verified with GPG. >> >> Isn't this a name of the program, though? Other two hunks in your >> patch clearly refer to the concept and not to a particular program, >> and they are good changes, I think. > > This one was not as clear as the other hunks. Git is referred to as > "Git" in the preceding line, which in itself could be referring to the > concept or the particular program I guess? I think the "Git" you see above is clearly the "stupid content tracker Git", not a particular implementation. It is more about the data model of Git than our particular implementation of it. jgit, a Java implementation of Git, does not care about the gibberish in the "--BEGIN PGP..." block the same way we don't care. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: git add without whitespace
Robert Daileywrites: > I like your solution better than mine because it utilizes the rules > defined in .gitattributes. A difference that may be more important is that I do not do generation of a patch or application of it without ignoring whitespaces with things like -w and --ignore-whitespace. That way, if my edit is a correction of existing whitespace breakage (e.g. I noticed a line that is indented by 8 spaces, and I corrected it by replacing them with one tab), that is shown as a change by "diff" and kept in the result. I suspect that your "diff -w | apply --ignore" will ignore that manual fix? > What does the checkout at the end do? That part confuses me (granted > I'm not well-versed with bash script). I correct whitespace-broken updates the user (i.e. I) made in her working tree file by adding a corrected version to the index, and then I checkout the result out of the index to the working tree. That corrects the breakage in both the index and the working tree, so that my further edit to the file will start from a ws-corrected version. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/2] strbuf: allow to use preallocated memory
> struct strbuf { > + unsigned int flags; > size_t alloc; > size_t len; > char *buf; > }; Depending whether the size of strbuf matters, it /might/ be worth considering some packing here. malloc() usually returns buffers that can contain more data than what is requested. Which means allocation sizes could be rounded and that wouldn't change the amount of allocated memory. On glibc malloc_usable_size(malloc(1)) apparently returns 24. On jemalloc, it's 4 or 8. It's in the same ballbark with many allocators. So, it would be possible to round alloc such that it's always a multiple of, say, 4, and stick flags in the low, unused bits. Whether it's worth doing is another question. Mike -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC/PATCH] bisect--helper: `bisect_clean_state` shell function in C
On Mon, May 30, 2016 at 8:21 PM, Pranit Bauvawrote: > --- > This patch contains a bug. I have tried to identify the bug and I suppose it > exists in do_for_each_entry_in_dir(). I have reproduced the debugging session > at this link[1]. I have seen that some patches in mailing list regarding > iterating over refs. Will those affect this? Or is this bug fixed in those > patches? > > [1]: http://paste.ubuntu.com/16830752/ The debug session seems to use code source from a previous version of this patch. Also it is not cear in which context you run git under gdb. What have you done before? And we don't see a crash. Could you show the crash and run the "bt" command in gdb to get a backtrace? > @@ -79,11 +90,42 @@ int write_terms(const char *bad, const char *good) > strbuf_release(); > return (res < 0) ? -1 : 0; > } > + > +int remove_bisect_ref(const char *refname, const struct object_id *oid, > + int flag, void *cb_data) > +{ > + char *ref; > + ref = xstrfmt("refs/bisect/%s", refname); You could save one line by concatenating the 2 above lines. > + if (delete_ref(ref, oid->hash, flag)) > + return error(_("couldn't delete the ref %s\n"), ref); > + return 0; You need to free "ref". > +} Thanks, Christian. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] Documentation: GPG capitalization
On Mon, May 30, 2016 at 8:02 PM, Junio C Hamanowrote: > Dave Nicolson writes: > >> When "GPG" is used in a sentence it is now consistently capitalized. When >> referring to the binary it is left as "gpg". >> >> Signed-off-by: David Nicolson >> --- >> Documentation/git-mktag.txt | 2 +- >> Documentation/git-tag.txt | 2 +- >> Documentation/git-verify-commit.txt | 4 ++-- >> 3 files changed, 4 insertions(+), 4 deletions(-) >> >> diff --git a/Documentation/git-mktag.txt b/Documentation/git-mktag.txt >> index fa6a756..724408d 100644 >> --- a/Documentation/git-mktag.txt >> +++ b/Documentation/git-mktag.txt >> @@ -32,7 +32,7 @@ followed by some 'optional' free-form message (some tags >> created >> by older Git may not have `tagger` line). The message, when >> exists, is separated by a blank line from the header. The >> message part may contain a signature that Git itself doesn't >> -care about, but that can be verified with gpg. >> +care about, but that can be verified with GPG. > > Isn't this a name of the program, though? Other two hunks in your > patch clearly refer to the concept and not to a particular program, > and they are good changes, I think. This one was not as clear as the other hunks. Git is referred to as "Git" in the preceding line, which in itself could be referring to the concept or the particular program I guess? -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 1/8] t4051: rewrite, add more tests
Am 30.05.2016 um 01:55 schrieb Junio C Hamano: René Scharfewrites: +commit_and_tag () { + message=$1 && + shift && + git add $@ && Lack of dq around $@ makes me wonder if there is something funny going on (looking at the callers, there isn't, so we'd better quote it to avoid wasting time, I think). OK. + test_tick && + git commit -m $message && + git tag $message } The use of $message as the sole argument to "git tag" makes the readers guess that it must be a single token without any funny character, so the readers would probably do not waste too much time wondreing if the lack of dq around $message in the last two is problematic. Well, let's call it $tag; $message is a bit misleading here. The saved letters can be invested in quotes. ;) +last_context_line () { + sed -n '$ p' } I have a vague recollection that some implementations of sed are unhappy to see that space between the address and the operation; I'd feel safer without it. Indeed most sed calls in t/ have no space there (found counter-examples only in annotate-tests.sh, t4201-shortlog.sh, t9824-git-p4-git-lfs.sh). +check_diff () { + name=$1 + desc=$2 + options="-W $3" + + test_expect_success "$desc" ' + git diff $options "$name^" "$name" >"$name.diff" + ' + + test_expect_success ' diff applies' ' + test_when_finished "git reset --hard" && + git checkout --detach "$name^" && With the presence of ^ there, --detach is unnecessary; it would not hurt, though. Right. It's just there to make that intent clear. + git apply "$name.diff" && + git diff --exit-code "$name" Even though we may know that $name.diff" will never have a creation of new paths, I'd feel safer if "apply" is run with "--index". Makes sense; the less we assume about the diff to be checked the better. Thanks a lot! René -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: git add without whitespace
On Mon, May 30, 2016 at 2:06 PM, Junio C Hamanowrote: > I have had this in my ~/.gitconfig for a long time. > > [alias] > wsadd = "!sh -c 'git diff -- \"$@\" | git apply --cached > --whitespace=fix;\ > git co -- ${1-.} \"$@\"' -" > > That is, "take what's different from the _index_ and the working > tree, apply that difference while correcting whitespace errors to > the index, and check the result out to the working tree". This > would _not_ touch existing whitespace-damaged lines that you are not > touching, and honours the customized definition of what is > considered whitespace breakage for each paths (which you set up with > the attributes system). > I like your solution better than mine because it utilizes the rules defined in .gitattributes. I think that's a really good idea. But other than that, yours is functionally the same as what I'm doing, right? I just want to make sure I understand: What ends up in the index/staging area is the code MINUS the trailing whitespace (e.g. whitespace errors)? What does the checkout at the end do? That part confuses me (granted I'm not well-versed with bash script). Thanks for the feedback. Looks like this is niche enough that an alias/script is probably the best solution. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [WIP-PATCH 0/2] send-email: refactor the email parser loop
Samuel GROOTwrites: > Can we consider this feature obsolete and remove it? We're usually quite conservative with backward compatibility. If we remove the feature, we may want to announce it in the next feature release and actually remove it in the one after (unless we get valid objection in the meantime). I'm all for dropping a feature that no one uses if it turns out to be the case. -- Matthieu Moy http://www-verimag.imag.fr/~moy/ -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v5] upload-pack.c: use parse-options API
Antoine Queruwrites: > From: Antoine Queru Don't you want to be known to the project as the email that matches your Signed-off-by: line? > Use the parse-options API rather than a hand-rolled option parser. > > Description for --stateless-rpc and --advertise-refs come from > 42526b4 (Add stateless RPC options to upload-pack, > receive-pack, 2009-10-30). > > Signed-off-by: Antoine Queru > Signed-off-by: Matthieu Moy > --- > ---strict:: > +--[no-]strict:: > Do not try /.git/ if is no Git directory. Not a new problem, but "is no Git ..." may technically be correct, but it would be easier to read if phrased "is not a Git ..." or something like that. I am NOT asking _you_ to change/fix that in this patch. It is just a note for "a low hanging fruit" for people to pick up with a separate patch. > +--stateless-rpc:: > + Perform only a single read-write cycle with stdin and stdout. > + This fits with the HTTP POST request processing model where > + a program may read the request, write a response, and must exit. > + > +--advertise-refs:: > + Only the initial ref advertisement is output, and the program exits > + immediately. This fits with the HTTP GET request model, where > + no request content is received but a response must be produced. > + Good. > diff --git a/upload-pack.c b/upload-pack.c > index dc802a0..083d068 100644 > --- a/upload-pack.c > +++ b/upload-pack.c > @@ -14,8 +14,12 @@ > #include "sigchain.h" > #include "version.h" > #include "string-list.h" > +#include "parse-options.h" > > -static const char upload_pack_usage[] = "git upload-pack [--strict] > [--timeout=] "; > +static const char * const upload_pack_usage[] = { > + N_("git upload-pack [options] "), > + NULL > +}; Output from "git grep -e '\[option' -e '\[
Re: git add without whitespace
Robert Daileywrites: > $ git diff -U0 -w --no-color | git apply --cached --ignore-whitespace > --unidiff-zero > > This command explicitly leaves out context because it can sometimes > cause the patch to fail to apply, I think due to whitespace being in > it, but I'm not completely sure myself. I have had this in my ~/.gitconfig for a long time. [alias] wsadd = "!sh -c 'git diff -- \"$@\" | git apply --cached --whitespace=fix;\ git co -- ${1-.} \"$@\"' -" That is, "take what's different from the _index_ and the working tree, apply that difference while correcting whitespace errors to the index, and check the result out to the working tree". This would _not_ touch existing whitespace-damaged lines that you are not touching, and honours the customized definition of what is considered whitespace breakage for each paths (which you set up with the attributes system). -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Fatal bug on revert with --author
On Mon, May 30, 2016 at 04:16:50PM +0200, Andreas Lutro wrote: > So I learned today that --author is not a supported argument to git > revert. It took me a long time to realize this, though, because the > error I got was very cryptic: > > fatal: BUG: expected exactly one commit from walk > > Here's a very simple reproducible case: > https://gist.github.com/anlutro/67e0cec1a6a419e6d44131b0bc1deff6 > > I was recommended to send this report here by #git on irc.freenode.net. Hmm. It _is_ a supported command-line argument, as you can pass arbitrary revision options to revert. So: git revert --author peff HEAD should revert all of my commits (whether that is _useful_, I am not sure, but it is a consequence of the fact that revert simply passes its arguments to the regular revision machinery). But in your example, you pass the author "test", which matches nothing. And so we end up with nothing to revert. But rather than saying so, we fall into a backwards-compatibility code path: /* * If we were called as "git cherry-pick ", just * cherry-pick/revert it, set CHERRY_PICK_HEAD / * REVERT_HEAD, and don't touch the sequencer state. * This means it is possible to cherry-pick in the middle * of a cherry-pick sequence. */ if (opts->revs->cmdline.nr == 1 && opts->revs->cmdline.rev->whence == REV_CMD_REV && opts->revs->no_walk && !opts->revs->cmdline.rev->flags) { struct commit *cmit; if (prepare_revision_walk(opts->revs)) die(_("revision walk setup failed")); cmit = get_revision(opts->revs); if (!cmit || get_revision(opts->revs)) die("BUG: expected exactly one commit from walk"); return single_pick(cmit, opts); } I think this conditional should not be triggering, because even though we did receive a single argument, we _also_ got options which imply that the user expected us to walk and find other commits. -Peff -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: How to cut extra liftime of expired loose objects
Jeff Kingwrites: > I think we should consider dropping the default time to something more > like 1 day. The 2-week period predates reflogs, I believe (OTOH, it does > save objects which you might have used with "git add" but never actually > committed). Your belief is shared by me. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: How to add custom metadata to Git commit object
Jeff Kingwrites: > On Mon, May 30, 2016 at 08:58:08PM +0300, Kirill Likhodedov wrote: > ... >> There are git-notes, which could be used for the purpose, but they are >> visible to the user via standard Git command, and could be used by the >> user for other purposes, so they are not very suitable for the task. > > Notes would work for this, too, but their main advantage is that they > can be created _after_ a commit has already been made (whereas anything > in the commit object itself will influence its sha1 id). I would have said the same but with s/but/and/. If the "rename hint" or whatever other "custom metadata" Kirill gives to a commit is found to be wrong, it can be corrected later. And "the user can use notes for other purposes" is not a good reason to reject them. The whole point of allowing custom notes ref is so that Kirill is not restricted to use the usual notes/commits ref to store this custom notes in its dedicated notes/kirills-metadata ref. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v1 0/1] t6038-merge-text-auto.sh
Junio C Hamanowrites: > Imagine that you started from a history where somebody recorded a > text file with CRLF in the blob, unconverted. Later the project > decided to express their text with LF to support cross-platform > development better, and sets up the Auto-CRLF. Your user is working > near the tip of that history after the eol correction happened. Now > she gets a pull-request of a branch that forked from an old point in > the history, before the eol correction and full of CRLF. Yes, to > integrate the change being proposed, she needs to look at "theirs"; > that's the whole point of a "merge". Why should she revert the eol > correction her history has by getting fooled by the fact that the > update was based on a part of the history before the eol correction? Thinking aloud along this line a bit further, if you really cared (and I don't feel very strongly, as I think "safe crlf" is an ugly workaround that lets users keep their misconfiguration by penalizing their day-to-day operation), you may want to think about doing a 3-way merge of "eol preference" beween all stages. That is, if the common ancestor in stage #1 and the current version in stage #2 both have its text in LF, and the data being merged in stage #3 is in CRLF, you sayr "do not convert; the change being brought in wants to have CRLF endings, while our history did not care". Similarly, if the common ancestor in stage #1 and the data being merged in stage #3 both have CRLF, and your version in stage #2 has LF, you say "We wanted to fix eol since the side branch forked, and the side branch predates that fix, so we keep the eol fix we did since we diverged", i.e. "Do convert". For doing this, you may want to refactor the codepath that decides "Auto-CRLF usually wants to turn CRLF in text to LF, but should we disable that logic now, because the user already has CRLF in the current one?" into a function that takes a single parameter, `path`, and returns "Yes, do convert CRLF to LF / No, do not convert" boolean. Having a low level function that says "What's the blob at this path in the index" and have the caller run that logic feels unwieldy if we want to go that route. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: How to add custom metadata to Git commit object
On Mon, May 30, 2016 at 08:58:08PM +0300, Kirill Likhodedov wrote: > Is it possible to add custom metadata to Git commit object? Such > metadata should be ignored by Git commands, but could be used by a > 3-party tool which knows the format and knows where to look. Yes. The recommended place to stick this is in a "trailer" at the bottom of the commit message, like: Rename-detection-hint: foo.java -> bar.java or whatever scheme is useful to your tool. > I assume that this should be possible, given that Git objects are > actually patches, and patches can contain additional details. But can > this be done with the help of Git commands? Git objects aren't actually patches. Try: git cat-file commit HEAD to see what an actual commit object looks like. You will see that there are user-invisible headers before the commit message, too, but we do not usually recommend people to add new headers here, as their semantics would be unclear to git. For example, when rebasing such a commit, should the header be preserved or not? It depends on its meaning. Whereas commit messages are always preserved. > There are git-notes, which could be used for the purpose, but they are > visible to the user via standard Git command, and could be used by the > user for other purposes, so they are not very suitable for the task. Notes would work for this, too, but their main advantage is that they can be created _after_ a commit has already been made (whereas anything in the commit object itself will influence its sha1 id). -Peff -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [WIP-PATCH 0/2] send-email: refactor the email parser loop
On 05/30/2016 04:20 PM, Matthieu Moy wrote: Is the "lots of email" format still used? AFAICT, it was initially supported for backward compatibility, and then no one removed it, but I wouldn't be surprised if no one actually used it. I vaguely remember a message from Ryan Anderson being surprised to see the old format still supported, but I can't find it in the archives. In any case: - git log --grep 'lots of email' => shows only 83b2443 - git log -S'lots of email' => likewise - git grep 'lots of email' => just one answer in a comment I'm not sure the feature is even tested. `grep "non-mbox" t/t9001-send-email.sh` didn't return anything, apparently it's not tested. Can we consider this feature obsolete and remove it? -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v3 2/3] perf: make the tests work in worktrees
Am 30.05.2016 um 20:03 schrieb Junio C Hamano: > Johannes Schindelinwrites: > >>> This breaks perf for the non-worktree case: >> >> Oh drats! >> >>> lsr@debian:~/src/git/t/perf$ make >>> rm -rf test-results >>> ./run >>> === Running 12 tests in this tree === >>> cp: cannot stat '.git/objects': No such file or directory >>> error: failed to copy repository '/home/lsr/src/git/t/..' to '/tmp/trash >>> directory.p-perf-lib-sanity' >>> cp: cannot stat '.git/objects': No such file or directory >>> error: failed to copy repository '/home/lsr/src/git/t/..' to '/tmp/trash >>> directory.p0001-rev-list' >>> ... >>> >>> Here's a fix: >>> >>> -- >8 -- >>> Subject: perf: make the tests work without a worktree >>> >>> In regular repositories $source_git and $objects_dir contain relative >>> paths based on $source. Go there to allow cp to resolve them. >>> >>> Signed-off-by: Rene Scharfe >>> --- >>> t/perf/perf-lib.sh | 1 + >>> 1 file changed, 1 insertion(+) >>> >>> diff --git a/t/perf/perf-lib.sh b/t/perf/perf-lib.sh >>> index 5ef1744..1888790 100644 >>> --- a/t/perf/perf-lib.sh >>> +++ b/t/perf/perf-lib.sh >>> @@ -84,6 +84,7 @@ test_perf_create_repo_from () { >>> objects_dir="$(git -C "$source" rev-parse --git-path objects)" >>> mkdir -p "$repo/.git" >>> ( >>> + cd "$source" && >> >> I fear that interacts badly with the `cd "$repo"` I introduced later >> (replacing a `cd ..`)... Oh, right, it does if $repo is a relative path. > What do you want to do then? For now before -rc1 we can revert the > whole thing so that we can get a tested thing that works in both > layouts. Put it in its own subshell, e.g. like this? --- t/perf/perf-lib.sh | 5 - 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/t/perf/perf-lib.sh b/t/perf/perf-lib.sh index 5ef1744..18c363e 100644 --- a/t/perf/perf-lib.sh +++ b/t/perf/perf-lib.sh @@ -84,6 +84,7 @@ test_perf_create_repo_from () { objects_dir="$(git -C "$source" rev-parse --git-path objects)" mkdir -p "$repo/.git" ( + cd "$source" && { cp -Rl "$objects_dir" "$repo/.git/" 2>/dev/null || cp -R "$objects_dir" "$repo/.git/"; } && for stuff in "$source_git"/*; do @@ -94,7 +95,9 @@ test_perf_create_repo_from () { cp -R "$stuff" "$repo/.git/" || exit 1 ;; esac - done && + done + ) && + ( cd "$repo" && git init -q && { test_have_prereq SYMLINKS || -- 2.8.3 -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: How to add custom metadata to Git commit object
On Mon, 30 May 2016 20:58:08 +0300 Kirill Likhodedovwrote: > Is it possible to add custom metadata to Git commit object? > Such metadata should be ignored by Git commands, but could be used by > a 3-party tool which knows the format and knows where to look. > > I assume that this should be possible, given that Git objects are > actually patches, and patches can contain additional details. But can > this be done with the help of Git commands? [...] > There are git-notes, which could be used for the purpose, but they > are visible to the user via standard Git command, and could be used > by the user for other purposes, so they are not very suitable for the > task. AFAIK, within your restrictions, it's not possible because there are only two ways to add meta information for a Git commit: * Store it externally and somehow correlate it with the commit. This is what git-notes does. * Encode it directly into a commit object. Since you can't use your own headers in commit objects, you have to encode this information into the commit message in some form parsable by a machine. This is what, say, git-svn does to make it possible to correlate the commits it creates with their source Subversion revisions. In both cases the information can be viewed by the user. What I can't really understand is what is so bad about the user being able to peer at that data. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: How to cut extra liftime of expired loose objects
On Mon, May 30, 2016 at 06:43:07PM +0200, Johannes Sixt wrote: > I think there is logic somewhere in git gc or its minions that gives expired > objects that have been packed a two weeks extra life time as loose objects. > > That is, reachable loose objects are moved to a pack, but those objects that > were packed, but become unreachable due to expired reflogs, are evicted from > the pack and live for another two weeks or so as loose objects. Sort of. The loose objects should get the mtime of the pack, which is likely newer than 2 weeks. So it's not a 2-week extension, but it may take up to 2 weeks for them to go away (and if the pack is already more than 2 weeks old, the objects are dropped without even loosening). > As a consequence of this, I always have a few thousand loose objects in my > busy repositories, no matter how often I collect garbage. Is there a knob > that removes the extra lease of life of objects without reducing the usual > expiration times of reflogs etc? gc.pruneExpire should do this. I have run into this, too, and it interacts annoyingly with the background-gc (which complains "you have too many objects; run git-prune", leaves that in a .lock file, and then every subsequent auto-gc spews it at me). I think we should consider dropping the default time to something more like 1 day. The 2-week period predates reflogs, I believe (OTOH, it does save objects which you might have used with "git add" but never actually committed). -Peff -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[RFC/PATCH] bisect--helper: `bisect_clean_state` shell function in C
Reimplement `bisect_clean_state` shell function in C and add a `bisect-clean-state` subcommand to `git bisect--helper` to call it from git-bisect.sh . Using `bisect_clean_state` subcommand is a measure to port shell function to C so as to use the existing test suite. As more functions are ported, this subcommand will be retired and will be called by bisect_reset() and bisect_start(). Mentored-by: Lars SchneiderMentored-by: Christian Couder Signed-off-by: Pranit Bauva --- This patch contains a bug. I have tried to identify the bug and I suppose it exists in do_for_each_entry_in_dir(). I have reproduced the debugging session at this link[1]. I have seen that some patches in mailing list regarding iterating over refs. Will those affect this? Or is this bug fixed in those patches? [1]: http://paste.ubuntu.com/16830752/ builtin/bisect--helper.c | 50 +++- git-bisect.sh| 26 +++-- 2 files changed, 52 insertions(+), 24 deletions(-) diff --git a/builtin/bisect--helper.c b/builtin/bisect--helper.c index 2b21c02..a20c9a5 100644 --- a/builtin/bisect--helper.c +++ b/builtin/bisect--helper.c @@ -3,10 +3,21 @@ #include "parse-options.h" #include "bisect.h" #include "refs.h" +#include "dir.h" + +static GIT_PATH_FUNC(git_path_bisect_expected_rev, "BISECT_EXPECTED_REV") +static GIT_PATH_FUNC(git_path_bisect_ancestors_ok, "BISECT_ANCESTORS_OK") +static GIT_PATH_FUNC(git_path_bisect_log, "BISECT_LOG") +static GIT_PATH_FUNC(git_path_bisect_names, "BISECT_NAMES") +static GIT_PATH_FUNC(git_path_bisect_run, "BISECT_RUN") +static GIT_PATH_FUNC(git_path_bisect_terms, "BISECT_TERMS") +static GIT_PATH_FUNC(git_path_head_name, "head-name") +static GIT_PATH_FUNC(git_path_bisect_start, "BISECT_START") static const char * const git_bisect_helper_usage[] = { N_("git bisect--helper --next-all [--no-checkout]"), N_("git bisect--helper --write-terms "), + N_("git bisect--helper --bisect-clean-state"), NULL }; @@ -79,11 +90,42 @@ int write_terms(const char *bad, const char *good) strbuf_release(); return (res < 0) ? -1 : 0; } + +int remove_bisect_ref(const char *refname, const struct object_id *oid, + int flag, void *cb_data) +{ + char *ref; + ref = xstrfmt("refs/bisect/%s", refname); + if (delete_ref(ref, oid->hash, flag)) + return error(_("couldn't delete the ref %s\n"), ref); + return 0; +} + +int bisect_clean_state(void) +{ + for_each_ref_in("refs/bisect/", remove_bisect_ref, NULL); + + remove_path(git_path_bisect_expected_rev()); + remove_path(git_path_bisect_ancestors_ok()); + remove_path(git_path_bisect_log()); + remove_path(git_path_bisect_names()); + remove_path(git_path_bisect_run()); + remove_path(git_path_bisect_terms()); + /* Cleanup head-name if it got left by an old version of git-bisect */ + remove_path(git_path_head_name()); + delete_ref("BISECT_HEAD", NULL, 0); + /* Cleanup BISECT_START last */ + remove_path(git_path_bisect_start()); + + return 0; +} + int cmd_bisect__helper(int argc, const char **argv, const char *prefix) { enum { NEXT_ALL = 1, - WRITE_TERMS + WRITE_TERMS, + BISECT_CLEAN_STATE } cmdmode = 0; int no_checkout = 0; struct option options[] = { @@ -91,6 +133,8 @@ int cmd_bisect__helper(int argc, const char **argv, const char *prefix) N_("perform 'git bisect next'"), NEXT_ALL), OPT_CMDMODE(0, "write-terms", , N_("write the terms to .git/BISECT_TERMS"), WRITE_TERMS), + OPT_CMDMODE(0, "bisect-clean-state", , +N_("cleanup the bisection state"), BISECT_CLEAN_STATE), OPT_BOOL(0, "no-checkout", _checkout, N_("update BISECT_HEAD instead of checking out the current commit")), OPT_END() @@ -109,6 +153,10 @@ int cmd_bisect__helper(int argc, const char **argv, const char *prefix) if (argc != 2) die(_("--write-terms requires two arguments")); return write_terms(argv[0], argv[1]); + case BISECT_CLEAN_STATE: + if (argc != 0) + die(_("--bisect-clean-state requires no arguments")); + return bisect_clean_state(); default: die("BUG: unknown subcommand '%d'", cmdmode); } diff --git a/git-bisect.sh b/git-bisect.sh index 2dd7ec5..a937d1c 100755 --- a/git-bisect.sh +++ b/git-bisect.sh @@ -187,7 +187,7 @@ bisect_start() { # # Get rid of any old bisect state. # - bisect_clean_state || exit + git bisect--helper --bisect-clean-state || exit #
Re: [PATCH v3 2/3] perf: make the tests work in worktrees
Johannes Schindelinwrites: >> This breaks perf for the non-worktree case: > > Oh drats! > >> lsr@debian:~/src/git/t/perf$ make >> rm -rf test-results >> ./run >> === Running 12 tests in this tree === >> cp: cannot stat '.git/objects': No such file or directory >> error: failed to copy repository '/home/lsr/src/git/t/..' to '/tmp/trash >> directory.p-perf-lib-sanity' >> cp: cannot stat '.git/objects': No such file or directory >> error: failed to copy repository '/home/lsr/src/git/t/..' to '/tmp/trash >> directory.p0001-rev-list' >> ... >> >> Here's a fix: >> >> -- >8 -- >> Subject: perf: make the tests work without a worktree >> >> In regular repositories $source_git and $objects_dir contain relative >> paths based on $source. Go there to allow cp to resolve them. >> >> Signed-off-by: Rene Scharfe >> --- >> t/perf/perf-lib.sh | 1 + >> 1 file changed, 1 insertion(+) >> >> diff --git a/t/perf/perf-lib.sh b/t/perf/perf-lib.sh >> index 5ef1744..1888790 100644 >> --- a/t/perf/perf-lib.sh >> +++ b/t/perf/perf-lib.sh >> @@ -84,6 +84,7 @@ test_perf_create_repo_from () { >> objects_dir="$(git -C "$source" rev-parse --git-path objects)" >> mkdir -p "$repo/.git" >> ( >> +cd "$source" && > > I fear that interacts badly with the `cd "$repo"` I introduced later > (replacing a `cd ..`)... What do you want to do then? For now before -rc1 we can revert the whole thing so that we can get a tested thing that works in both layouts. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] Documentation: GPG capitalization
Dave Nicolsonwrites: > When "GPG" is used in a sentence it is now consistently capitalized. When > referring to the binary it is left as "gpg". > > Signed-off-by: David Nicolson > --- > Documentation/git-mktag.txt | 2 +- > Documentation/git-tag.txt | 2 +- > Documentation/git-verify-commit.txt | 4 ++-- > 3 files changed, 4 insertions(+), 4 deletions(-) > > diff --git a/Documentation/git-mktag.txt b/Documentation/git-mktag.txt > index fa6a756..724408d 100644 > --- a/Documentation/git-mktag.txt > +++ b/Documentation/git-mktag.txt > @@ -32,7 +32,7 @@ followed by some 'optional' free-form message (some tags > created > by older Git may not have `tagger` line). The message, when > exists, is separated by a blank line from the header. The > message part may contain a signature that Git itself doesn't > -care about, but that can be verified with gpg. > +care about, but that can be verified with GPG. Isn't this a name of the program, though? Other two hunks in your patch clearly refer to the concept and not to a particular program, and they are good changes, I think. > > GIT > --- > diff --git a/Documentation/git-tag.txt b/Documentation/git-tag.txt > index abab481..32bc4aa 100644 > --- a/Documentation/git-tag.txt > +++ b/Documentation/git-tag.txt > @@ -78,7 +78,7 @@ OPTIONS > > -v:: > --verify:: > - Verify the gpg signature of the given tag names. > + Verify the GPG signature of the given tag names. > > -n:: >specifies how many lines from the annotation, if any, > diff --git a/Documentation/git-verify-commit.txt > b/Documentation/git-verify-commit.txt > index ecf4da1..0101f0f 100644 > --- a/Documentation/git-verify-commit.txt > +++ b/Documentation/git-verify-commit.txt > @@ -12,12 +12,12 @@ SYNOPSIS > > DESCRIPTION > --- > -Validates the gpg signature created by 'git commit -S'. > +Validates the GPG signature created by 'git commit -S'. > > OPTIONS > --- > --raw:: > - Print the raw gpg status output to standard error instead of the normal > + Print the raw GPG status output to standard error instead of the normal > human-readable output. > > -v:: > > -- > https://github.com/git/git/pull/246 -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v1 0/1] t6038-merge-text-auto.sh
tbo...@web.de writes: > This is a little bit of a hen-and-egg problem: > The problem comes up after the "unified auto handling". > In theory, it should have been since before: > get_sha1_from_index() says: > > * We might be in the middle of a merge, in which > * case we would read stage #2 (ours). > > This seams wrong, as in the merge we sometimes need to > look at "theirs". The two comment you quoted is absolutely the right thing to do. "In a merge, we sometimes need to look at 'theirs'" is like saying "When we are dong 'git add', we need to look at what is in the working tree". It is total red herring. Step back and think why we even look at what in the index in the first place; it is to decide if we want or do not want to disable the automatic CRLF -> LF conversion. And think again the reason why do we look at the index. There may be a line with CRLF line endings in the new contents, whether it came from a merge, cherry-pick, patch application, or plain-simple "git add" from the working tree. Auto-CRLF usually says "We want CRLF turned into LF". But the user misconfigured and for this path the user might not want the conversion take place, in which case you would disable the conversion. Where do you take that hint "the user might have misconfigured?" from? By looking at what the user _started_ her update from. If the state before this "we need to replace the blob in the index with a new contents, so we need to hash the new contents to come up with the updated blob" started contains CRLF already, that may be a hint--if we apply the CRLF->LF conversion on the original, even if the "new contents" were identical to what she already had, we would end up changing the blob with her current configuration. Hence we disable. Isn't that the reasoning behind that "safe auto-crlf" thing? The new contents getting integrated into her current state may have CRLF, and if a merge or a cherry-pick leaves conflicts, they may be stored in stage #3. But paying attention to that to decide if we want to disable Auto-CRLF conversion is simply wrong; you should look at the CRLF in stage #3 as purely something that might need to be converted, not something that affects the decision if it needs to be converted, just like you view CRLF in a working tree file when you do "git add".. Imagine that you started from a history where somebody recorded a text file with CRLF in the blob, unconverted. Later the project decided to express their text with LF to support cross-platform development better, and sets up the Auto-CRLF. Your user is working near the tip of that history after the eol correction happened. Now she gets a pull-request of a branch that forked from an old point in the history, before the eol correction and full of CRLF. Yes, to integrate the change being proposed, she needs to look at "theirs"; that's the whole point of a "merge". Why should she revert the eol correction her history has by getting fooled by the fact that the update was based on a part of the history before the eol correction? -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
How to add custom metadata to Git commit object
Is it possible to add custom metadata to Git commit object? Such metadata should be ignored by Git commands, but could be used by a 3-party tool which knows the format and knows where to look. I assume that this should be possible, given that Git objects are actually patches, and patches can contain additional details. But can this be done with the help of Git commands? The reason why I am asking this - is to create a tool which could overcome false rename detection. As all of you know, if I make significant changes to the code together with rename, Git won’t detect this rename and will treat this change as added + deleted. And sometimes there are false rename detections as well. It would be useful to record the fact of rename and use it afterwards. If a user is developing with our IDE (IntelliJ IDEA), we could remember the fact that he renamed a file, then write this information to the commit object, and when the commit information is requested (e.g. from the git log graphical view), the IDE could read the additional information of the commit and display the file as renamed, not as added + deleted. The IDE could also use this information to follow rename through the file history. As a real example, in our project we are converting a lot of files from Java to Kotlin, and such conversion always looses history unless the developer remembers to prepare a separate rename-commit first, which is tedious. There are git-notes, which could be used for the purpose, but they are visible to the user via standard Git command, and could be used by the user for other purposes, so they are not very suitable for the task. Thanks a lot! -- Kirill-- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v1 0/1] t6038-merge-text-auto.sh
From: Torsten BögershausenSplit of the old 10/10 series. This is the update of t6038, which is needed to motivate the patch `convert: ce_compare_data() checks for a sha1 of a path` on top of `convert: unify the "auto" handling of CRLF` When files with different eols are merged with merge.renormalize = true, it is important to look at the right blob to determine if the crlf came from the blob or are a result of a coversion. This is a little bit of a hen-and-egg problem: The problem comes up after the "unified auto handling". In theory, it should have been since before: get_sha1_from_index() says: * We might be in the middle of a merge, in which * case we would read stage #2 (ours). This seams wrong, as in the merge we sometimes need to look at "theirs". (But I haven't managed to construct a TC) t6038-merge-text-auto.sh Torsten Bögershausen (1): t6038: different eol for "Merge addition of text=auto" t/t6038-merge-text-auto.sh | 28 +++- 1 file changed, 23 insertions(+), 5 deletions(-) -- 2.0.0.rc1.6318.g0c2c796 -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 09/13] refs: introduce an iterator interface
On 30/05/16 16:22, Ramsay Jones wrote: > > > On 30/05/16 08:55, Michael Haggerty wrote: > [snip] > >> /* Reference is a symbolic reference. */ >> diff --git a/refs/files-backend.c b/refs/files-backend.c >> index 8ab4d5f..dbf1587 100644 >> --- a/refs/files-backend.c >> +++ b/refs/files-backend.c >> @@ -1,6 +1,7 @@ >> #include "../cache.h" >> #include "../refs.h" >> #include "refs-internal.h" >> +#include "../iterator.h" >> #include "../lockfile.h" >> #include "../object.h" >> #include "../dir.h" >> @@ -704,6 +705,154 @@ static void prime_ref_dir(struct ref_dir *dir) >> } >> } >> >> +/* >> + * A level in the reference hierarchy that is currently being iterated >> + * through. >> + */ >> +struct cache_ref_iterator_level { >> +/* >> + * The ref_dir being iterated over at this level. The ref_dir >> + * is sorted before being stored here. >> + */ >> +struct ref_dir *dir; >> + >> +/* >> + * The index of the current entry within dir (which might >> + * itself be a directory). If index == -1, then the iteration >> + * hasn't yet begun. If index == dir->nr, then the iteration >> + * through this level is over. >> + */ >> +int index; >> +}; >> + >> +/* >> + * Represent an iteration through a ref_dir in the memory cache. The >> + * iteration recurses through subdirectories. >> + */ >> +struct cache_ref_iterator { >> +struct ref_iterator base; >> + >> +/* >> + * The number of levels currently on the stack. This is always >> + * at least 1, because when it becomes zero the iteration is >> + * ended and this struct is freed. >> + */ >> +size_t levels_nr; >> + >> +/* The number of levels that have been allocated on the stack */ >> +size_t levels_alloc; >> + >> +/* >> + * A stack of levels. levels[0] is the uppermost level that is >> + * being iterated over in this iteration. (This is not >> + * necessary the top level in the references hierarchy. If we >> + * are iterating through a subtree, then levels[0] will hold >> + * the ref_dir for that subtree, and subsequent levels will go >> + * on from there.) >> + */ >> +struct cache_ref_iterator_level *levels; >> +}; >> + >> +static int cache_ref_iterator_advance(struct ref_iterator *ref_iterator) >> +{ >> +struct cache_ref_iterator *iter = >> +(struct cache_ref_iterator *)ref_iterator; >> + >> +while (1) { >> +struct cache_ref_iterator_level *level = >> +>levels[iter->levels_nr - 1]; >> +struct ref_dir *dir = level->dir; >> +struct ref_entry *entry; >> + >> +if (level->index == -1) >> +sort_ref_dir(dir); > > do you need to sort here ... >> + >> +if (++level->index == level->dir->nr) { >> +/* This level is exhausted; pop up a level */ >> +if (--iter->levels_nr == 0) >> +return ref_iterator_abort(ref_iterator); >> + >> +continue; >> +} >> + >> +entry = dir->entries[level->index]; >> + >> +if (entry->flag & REF_DIR) { >> +/* push down a level */ >> +ALLOC_GROW(iter->levels, iter->levels_nr + 1, >> + iter->levels_alloc); >> + >> +level = >levels[iter->levels_nr++]; >> +level->dir = get_ref_dir(entry); >> +sort_ref_dir(level->dir); > > ... given that you sort here? I had intended to say 'or vice versa' here. When I wrote this, I had not finished reading this patch (let alone the series). Now, I suspect that you can simply drop this 'sort_ref_dir()' call site. Unless I've misread the code, of course! ;-) ATB, Ramsay Jones -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v1 1/1] t6038: different eol for "Merge addition of text=auto"
From: Torsten Bögershausent6038 uses different code, depending if NATIVE_CRLF is set ot not. Enhance the test coverage for cross-platform testing and run "Merge addition of text=auto" with both lf and crlf as core.eol. It is important to be run this test with crlf under Linux, the day when the "unified auto handling will be introduced. Signed-off-by: Torsten Bögershausen --- t/t6038-merge-text-auto.sh | 28 +++- 1 file changed, 23 insertions(+), 5 deletions(-) diff --git a/t/t6038-merge-text-auto.sh b/t/t6038-merge-text-auto.sh index 33b77ee..8846f5d 100755 --- a/t/t6038-merge-text-auto.sh +++ b/t/t6038-merge-text-auto.sh @@ -91,16 +91,13 @@ test_expect_success 'Merge after setting text=auto' ' compare_files expected file ' -test_expect_success 'Merge addition of text=auto' ' +test_expect_success 'Merge addition of text=auto eol=LF' ' + git config core.eol lf && cat <<-\EOF >expected && first line same line EOF - if test_have_prereq NATIVE_CRLF; then - append_cr expected.temp && - mv expected.temp expected - fi && git config merge.renormalize true && git rm -fr . && rm -f .gitattributes && @@ -109,7 +106,28 @@ test_expect_success 'Merge addition of text=auto' ' compare_files expected file ' +test_expect_success 'Merge addition of text=auto eol=CRLF' ' + git config core.eol crlf && + cat <<-\EOF >expected && + first line + same line + EOF + + append_cr expected.temp && + mv expected.temp expected && + git config merge.renormalize true && + git rm -fr . && + rm -f .gitattributes && + git reset --hard b && + echo >&2 "After git reset --hard b" && + git ls-files -s --eol >&2 && + git merge a && + compare_files expected file +' + + test_expect_success 'Detect CRLF/LF conflict after setting text=auto' ' + git config core.eol native && echo "<<<" >expected && if test_have_prereq NATIVE_CRLF; then echo first line | append_cr >>expected && -- 2.0.0.rc1.6318.g0c2c796 -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
How to cut extra liftime of expired loose objects
I think there is logic somewhere in git gc or its minions that gives expired objects that have been packed a two weeks extra life time as loose objects. That is, reachable loose objects are moved to a pack, but those objects that were packed, but become unreachable due to expired reflogs, are evicted from the pack and live for another two weeks or so as loose objects. As a consequence of this, I always have a few thousand loose objects in my busy repositories, no matter how often I collect garbage. Is there a knob that removes the extra lease of life of objects without reducing the usual expiration times of reflogs etc? -- Hannes -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
git add without whitespace
I think it would be useful to have a '-w' option for 'git add' that completely ignores whitespace changes, the same way that 'git diff -w' does. Real life scenario: Sometimes developers will use tooling that does not properly strip trailing whitespace in source files. Next time I edit those files for a simple 1-line code change, my tooling will strip whitespace from the whole file. I *do* want these changes, however I want 2 commits: 1 commit with the bugfix, and a supplementary commit with just the whitespace changes. At the moment, there is no way for me to conveniently add the source file to the index without whitespace. The only way to accomplish this today that I'm aware of is via this command: $ git diff -U0 -w --no-color | git apply --cached --ignore-whitespace --unidiff-zero This command explicitly leaves out context because it can sometimes cause the patch to fail to apply, I think due to whitespace being in it, but I'm not completely sure myself. It would be useful to be able to do this instead: $ git add -w This would effectively function the same as my workaround command shown earlier. It should also be valid to use -w with -i and -p. In the -p case, it just won't show hunks containing whitespace changes. For -i, it would assume '-w' as part of any command run during the interactive session. Does this idea sound good? I have some free time on my hands so I wouldn't mind implementing this. Maybe there isn't a huge audience for this kind of thing, or maybe I'm just going about this the wrong way. Thoughts would be much appreciated. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 09/13] refs: introduce an iterator interface
On 30/05/16 08:55, Michael Haggerty wrote: [snip] > /* Reference is a symbolic reference. */ > diff --git a/refs/files-backend.c b/refs/files-backend.c > index 8ab4d5f..dbf1587 100644 > --- a/refs/files-backend.c > +++ b/refs/files-backend.c > @@ -1,6 +1,7 @@ > #include "../cache.h" > #include "../refs.h" > #include "refs-internal.h" > +#include "../iterator.h" > #include "../lockfile.h" > #include "../object.h" > #include "../dir.h" > @@ -704,6 +705,154 @@ static void prime_ref_dir(struct ref_dir *dir) > } > } > > +/* > + * A level in the reference hierarchy that is currently being iterated > + * through. > + */ > +struct cache_ref_iterator_level { > + /* > + * The ref_dir being iterated over at this level. The ref_dir > + * is sorted before being stored here. > + */ > + struct ref_dir *dir; > + > + /* > + * The index of the current entry within dir (which might > + * itself be a directory). If index == -1, then the iteration > + * hasn't yet begun. If index == dir->nr, then the iteration > + * through this level is over. > + */ > + int index; > +}; > + > +/* > + * Represent an iteration through a ref_dir in the memory cache. The > + * iteration recurses through subdirectories. > + */ > +struct cache_ref_iterator { > + struct ref_iterator base; > + > + /* > + * The number of levels currently on the stack. This is always > + * at least 1, because when it becomes zero the iteration is > + * ended and this struct is freed. > + */ > + size_t levels_nr; > + > + /* The number of levels that have been allocated on the stack */ > + size_t levels_alloc; > + > + /* > + * A stack of levels. levels[0] is the uppermost level that is > + * being iterated over in this iteration. (This is not > + * necessary the top level in the references hierarchy. If we > + * are iterating through a subtree, then levels[0] will hold > + * the ref_dir for that subtree, and subsequent levels will go > + * on from there.) > + */ > + struct cache_ref_iterator_level *levels; > +}; > + > +static int cache_ref_iterator_advance(struct ref_iterator *ref_iterator) > +{ > + struct cache_ref_iterator *iter = > + (struct cache_ref_iterator *)ref_iterator; > + > + while (1) { > + struct cache_ref_iterator_level *level = > + >levels[iter->levels_nr - 1]; > + struct ref_dir *dir = level->dir; > + struct ref_entry *entry; > + > + if (level->index == -1) > + sort_ref_dir(dir); do you need to sort here ... > + > + if (++level->index == level->dir->nr) { > + /* This level is exhausted; pop up a level */ > + if (--iter->levels_nr == 0) > + return ref_iterator_abort(ref_iterator); > + > + continue; > + } > + > + entry = dir->entries[level->index]; > + > + if (entry->flag & REF_DIR) { > + /* push down a level */ > + ALLOC_GROW(iter->levels, iter->levels_nr + 1, > +iter->levels_alloc); > + > + level = >levels[iter->levels_nr++]; > + level->dir = get_ref_dir(entry); > + sort_ref_dir(level->dir); ... given that you sort here? > + level->index = -1; > + } else { > + iter->base.refname = entry->name; > + iter->base.oid = >u.value.oid; > + iter->base.flags = entry->flag; > + return ITER_OK; > + } > + } > +} > + ATB, Ramsay Jones -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] push: deny policy to prevent pushes to unwanted remotes.
- Mail original - > Antoine Queruwrites: > > > Currently, a user wanting to prevent accidental pushes to the wrong remote > > has to create a pre-push hook. > > The feature offers a configuration to allow users to prevent accidental > > pushes > > to the wrong remote. The user may define a list of whitelisted remotes, a > > list > > of blacklisted remotes and a default policy ("allow" or "deny"). A push > > is denied if the remote is explicitely blacklisted or if it isn't > > whitelisted and the default policy is "deny". > > Not really serious, but the text above is weirdly wrapped, probably by > hand. I'm sure your text editor can do better and quicker ;-). Yes indeed it was hand wrapped, it will be changed in the next version, thanks for the hint! > > diff --git a/Documentation/config.txt b/Documentation/config.txt > > index 53f00db..1478ce3 100644 > > --- a/Documentation/config.txt > > +++ b/Documentation/config.txt > > @@ -2517,6 +2517,23 @@ remote.pushDefault:: > > `branch..remote` for all branches, and is overridden by > > `branch..pushRemote` for specific branches. > > > > +remote.pushBlacklisted:: > > + The list of remotes the user is forbidden to push to. > > + See linkgit:git-push[1] > > I'd have spelled that "pushBlacklist" (no 'ed'). I think variable names > usually do not use verbs. But I'm fine with your version too. Ok it makes sense, we will change for "pushBlacklist". > > +For example, if we set up the configuration variables like this: > > + git config --add remote.pushBlacklisted repository.com > > + git config --add remote.pushWhitelisted repository.com/Special_Folder > > +Any push of this form will be accepted: > > + git push repository.com/Special_Folder/foo > > +While those ones for example will be denied: > > + git push repository.com/Normal_Folder/bar > > Please, look at the rendered output of your documentation, and notice > it's broken. We'd typically use the asciidoc syntax for inline code here > (between -). Indeed the rendered doc is broken, it will be fixed in the next version. > > +An error will be raised if the url is blacklisted and whitelisted at the > > same. > > "at the same time"? > > But as-is, this sentence conflicts with the previous "the more the url > in the config prefixes the asked url the more priority it has." > statement. I understand the confusion, it's not very clear. What we wanted to say was that: pushBlacklist = example.com/ pushWhitelist = example.com/ would raise an error, because the "priority rule" can't be applied. We will change the sentence. > The documentation doesn't talk about the URL-normalization the code is > doing. I think a reasonable behavior would be: > > pushBlacklisted = example.com/ => deny all accesses to example.com > pushBlacklisted = http://example.com/ => deny HTTP accesses to > example.com > > The second is a valid use-case IMHO, some people may want to forbid some > protocols. Actually, one may even want to whilelist only one protocol > and write stg like this to force HTTPS on host example.com: > > pushBlacklisted = example.com > pushWhitelisted = https://example.com As of right now the protocol is not handled, we just ignore it. We already thought about handling it the way you described but we found out some problems: pushBlacklist = example.com pushWhitelist = https://example.com -> push example.com forbidden EXCEPT with https pushBlacklist = example.com/example2 pushWhitelist = https://example.com -> push https://example.com/example2 -> Allow or deny? To sum up the problem, what should be the priority between a protocol-matching rule and a sub-folder-matching rule? Perhaps adding a configuration variable? pushPriority = "protocol"/"sub-folder" (needs renaming) > BTW, these use-cases could motivate some per-blacklist deny message > like > > [remote "example.com"] > pushDenyMessage = "Please use HTTPS when you push to example.com" > > I don't think it has to be implemented now though (better have users get > used to the basic feature and see if more is needed later). Interesting idea, perhaps a simplified version could be: pushDenyProtocolMessage = "This protocol is not allowed for this remote" pushDenyRemoteMessage = "This remote is not authorized" (as above v ery roughnaming) > > +static const char *string_url_normalize(const char *repo_name) > > +{ > > + if (starts_with(repo_name, "file://")) > > + return repo_name + 7; > > There are many instances of this in Git's codebase, but we now try to > avoid magic numbers like this, and would use strlen("file://") instead. > Actually, we even have skip_prefix() precisely for this use-case. Ok we will change it and use the skip_prefix() function. > > + if (is_url(repo_name)) { > > + struct url_info url_infos; > > +
Re: [PATCH 2/2] strbuf: allow to use preallocated memory
Matthieu Moywrites: > William Duclot writes: > >> Matthieu Moy writes: +/** + * Allow the caller to give a pre-allocated piece of memory for the strbuf + * to use and indicate that the strbuf must use exclusively this buffer, + * never realloc() it or allocate a new one. It means that the string can + * be manipulated but cannot overflow the pre-allocated buffer. The + * pre-allocated buffer will never be freed. + */ >>> >>> Perhaps say explicitly that although the allocated buffer has a fixed >>> size, the string itself can grow as long as it does not overflow the >>> buffer? >> >> That's what I meant by "the string can be manipulated but cannot overflow >> the pre-allocated buffer". I'll try to reformulate > > Maybe "the string can grow, but cannot overflow"? Seems OK @@ -91,6 +116,8 @@ extern void strbuf_release(struct strbuf *); * Detach the string from the strbuf and returns it; you now own the * storage the string occupies and it is your responsibility from then on * to release it with `free(3)` when you are done with it. + * Must allocate a copy of the buffer in case of a preallocated/fixed buffer. + * Performance-critical operations have to be aware of this. >>> >>> Better than just warn about performance, you can give the alternative. >> >> I'm not sure what you mean, I don't think there really is an alternative >> for >> detaching a string? > > So, is the comment above saying "You're doomed, there's no way you can > get good performance anyway"? > > The alternative is just that you don't have to call strbuf_release since > the caller can access the .buf field and is already the one responsible > for freeing it when needed, and it's safe to just call strbuf_init() if > one needs to re-initialize the stbuf structure. Actually, if the user _needs_ to detach(), then yes he's doomed. Or more realistically, he have to refactor so he'll don't need detach() (by being sure the strbuf is pre-allocated by himself for example). The strbuf_release() function is not the problem here, it does nothing problematic for performances. The problem is strbuf_wrap_preallocated(), but you're right I can add a comment so the user know when he don't have to call it. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v5] upload-pack.c: use parse-options API
Antoine Queruwrites: > From: Antoine Queru [ Insert here the sentence I've been repeating a lot lately about this useless From ;-) ] > Documentation/git-upload-pack.txt | 16 +-- > upload-pack.c | 59 > +-- > 2 files changed, 38 insertions(+), 37 deletions(-) The patch contains a few whitespace errors: Documentation/git-upload-pack.txt:41: space before tab in indent. + immediately. This fits with the HTTP GET request model, where Documentation/git-upload-pack.txt:42: space before tab in indent. + no request content is received but a response must be produced. upload-pack.c:846: trailing whitespace. + You should notice them immediately if you use "git add -p" (big red warning in the patch hunk), and you can see all of them with "git diff --check" or "git show --check". Not sure if it deserves a reroll. Junio? Other than that, the patch is now Reviewed-by: Matthieu Moy -- Matthieu Moy http://www-verimag.imag.fr/~moy/ -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v5] upload-pack.c: use parse-options API
From: Antoine QueruUse the parse-options API rather than a hand-rolled option parser. Description for --stateless-rpc and --advertise-refs come from 42526b4 (Add stateless RPC options to upload-pack, receive-pack, 2009-10-30). Signed-off-by: Antoine Queru Signed-off-by: Matthieu Moy --- Change since v4: Imperative mood for the commit message. "Code is now more compact." removed. Documentation for option is now clearer. Help message for "--advertise-refs" rewritten. "no" for "strict" option added to the doc. Documentation/git-upload-pack.txt | 16 +-- upload-pack.c | 59 +-- 2 files changed, 38 insertions(+), 37 deletions(-) diff --git a/Documentation/git-upload-pack.txt b/Documentation/git-upload-pack.txt index 0abc806..8a03bed 100644 --- a/Documentation/git-upload-pack.txt +++ b/Documentation/git-upload-pack.txt @@ -9,8 +9,8 @@ git-upload-pack - Send objects packed back to git-fetch-pack SYNOPSIS [verse] -'git-upload-pack' [--strict] [--timeout=] - +'git-upload-pack' [--[no-]strict] [--timeout=] [--stateless-rpc] + [--advertise-refs] DESCRIPTION --- Invoked by 'git fetch-pack', learns what @@ -25,12 +25,22 @@ repository. For push operations, see 'git send-pack'. OPTIONS --- ---strict:: +--[no-]strict:: Do not try /.git/ if is no Git directory. --timeout=:: Interrupt transfer after seconds of inactivity. +--stateless-rpc:: + Perform only a single read-write cycle with stdin and stdout. + This fits with the HTTP POST request processing model where + a program may read the request, write a response, and must exit. + +--advertise-refs:: + Only the initial ref advertisement is output, and the program exits + immediately. This fits with the HTTP GET request model, where + no request content is received but a response must be produced. + :: The repository to sync from. diff --git a/upload-pack.c b/upload-pack.c index dc802a0..083d068 100644 --- a/upload-pack.c +++ b/upload-pack.c @@ -14,8 +14,12 @@ #include "sigchain.h" #include "version.h" #include "string-list.h" +#include "parse-options.h" -static const char upload_pack_usage[] = "git upload-pack [--strict] [--timeout=] "; +static const char * const upload_pack_usage[] = { + N_("git upload-pack [options] "), + NULL +}; /* Remember to update object flag allocation in object.h */ #define THEY_HAVE (1u << 11) @@ -817,11 +821,21 @@ static int upload_pack_config(const char *var, const char *value, void *unused) return parse_hide_refs_config(var, value, "uploadpack"); } -int main(int argc, char **argv) +int main(int argc, const char **argv) { - char *dir; - int i; + const char *dir; int strict = 0; + struct option options[] = { + OPT_BOOL(0, "stateless-rpc", _rpc, +N_("quit after a single request/response exchange")), + OPT_BOOL(0, "advertise-refs", _refs, +N_("exit immediately after intial ref advertisement")), + OPT_BOOL(0, "strict", , +N_("do not try /.git/ if is no Git directory")), + OPT_INTEGER(0, "timeout", , + N_("interrupt transfer after seconds of inactivity")), + OPT_END() + }; git_setup_gettext(); @@ -829,40 +843,17 @@ int main(int argc, char **argv) git_extract_argv0_path(argv[0]); check_replace_refs = 0; - for (i = 1; i < argc; i++) { - char *arg = argv[i]; - - if (arg[0] != '-') - break; - if (!strcmp(arg, "--advertise-refs")) { - advertise_refs = 1; - continue; - } - if (!strcmp(arg, "--stateless-rpc")) { - stateless_rpc = 1; - continue; - } - if (!strcmp(arg, "--strict")) { - strict = 1; - continue; - } - if (starts_with(arg, "--timeout=")) { - timeout = atoi(arg+10); - daemon_mode = 1; - continue; - } - if (!strcmp(arg, "--")) { - i++; - break; - } - } + argc = parse_options(argc, argv, NULL, options, upload_pack_usage, 0); + + if (argc != 1) + usage_with_options(upload_pack_usage, options); - if (i != argc-1) - usage(upload_pack_usage); + if (timeout) + daemon_mode = 1; setup_path(); - dir = argv[i]; + dir = argv[0]; if (!enter_repo(dir,
Re: [PATCH 2/2] strbuf: allow to use preallocated memory
William Duclotwrites: > Matthieu Moy writes: > >> void strbuf_grow(struct strbuf *sb, size_t extra) >> { >> int new_buf = !sb->alloc; >> ... >> if (sb->flags & STRBUF_OWNS_MEMORY) { >> if (new_buf) // < (1) >> sb->buf = NULL; >> ALLOC_GROW(sb->buf, sb->len + extra + 1, sb->alloc); >> } else { >> /* >> * The strbuf doesn't own the buffer: to avoid to realloc it, >> * the strbuf needs to use a new buffer without freeing the old >> */ >> if (sb->len + extra + 1 > sb->alloc) { >> size_t new_alloc = MAX(sb->len + extra + 1, >> alloc_nr(sb->alloc)); >> char *buf = xmalloc(new_alloc); >> memcpy(buf, sb->buf, sb->alloc); >> sb->buf = buf; >> sb->alloc = new_alloc; >> sb->flags |= STRBUF_OWNS_MEMORY; >> } >> } >> >> if (new_buf) // < (2) >> sb->buf[0] = '\0'; >> } >> >> I think (1) is now dead code, since sb->alloc == 0 implies that >> STRBUF_OWNS_MEMORY is set. (2) seems redundant since you've just >> memcpy-ed the existing '\0' into the buffer. > > You're right for (1), I hadn't noticed that. > For (2), we'll still have to set sb->buf[new_alloc-1]='\0' after the memcpy, > if we > have sb->alloc==0 then the memcpy won't copy it. That sounds like one more reason to memcpy len + 1 bytes, and you'll get the '\0' copied. >> After your patch, there are differences between >> strbuf_wrap_preallocated() which I think are inconsistencies: >> >> * strbuf_attach() does not check for NULL buffer, but it doesn't accept >> them either if I read correctly. It would make sense to add the check >> to strbuf_attach(), but it's weird to have the performance-critical >> oriented function do the expensive stuff that the >> non-performance-critical one doesn't. > > I agree that strbuf_attach should do the check (it seems strange that it > doesn't already do it, as the "buffer never NULL" invariant is not new). > I don't understand your "but" part, what "expensive stuff" are you talking > about? "expensive stuff" was an exageration for "== NULL" test. It's not that expensive, but costs a tiny bit of CPU time. > xmemdupz can only allocate the same size it will copy. Indeed, so forget about it. >>> +/** >>> + * Allow the caller to give a pre-allocated piece of memory for the strbuf >>> + * to use and indicate that the strbuf must use exclusively this buffer, >>> + * never realloc() it or allocate a new one. It means that the string can >>> + * be manipulated but cannot overflow the pre-allocated buffer. The >>> + * pre-allocated buffer will never be freed. >>> + */ >> >> Perhaps say explicitly that although the allocated buffer has a fixed >> size, the string itself can grow as long as it does not overflow the >> buffer? > > That's what I meant by "the string can be manipulated but cannot overflow > the pre-allocated buffer". I'll try to reformulate Maybe "the string can grow, but cannot overflow"? >>> @@ -91,6 +116,8 @@ extern void strbuf_release(struct strbuf *); >>> * Detach the string from the strbuf and returns it; you now own the >>> * storage the string occupies and it is your responsibility from then on >>> * to release it with `free(3)` when you are done with it. >>> + * Must allocate a copy of the buffer in case of a preallocated/fixed >>> buffer. >>> + * Performance-critical operations have to be aware of this. >> >> Better than just warn about performance, you can give the alternative. > > I'm not sure what you mean, I don't think there really is an alternative for > detaching a string? So, is the comment above saying "You're doomed, there's no way you can get good performance anyway"? The alternative is just that you don't have to call strbuf_release since the caller can access the .buf field and is already the one responsible for freeing it when needed, and it's safe to just call strbuf_init() if one needs to re-initialize the stbuf structure. -- Matthieu Moy http://www-verimag.imag.fr/~moy/ -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Gitweb: Persistant download URLs for snapshots?
Hi, I hope this is the right place to ask this, but I wanted to know whether it is possible to have a persistant URL to obtain a snapshot of the latest master of a repository through Gitweb. I set up a gitweb instance and it works nicely. I can click on the snapshot link to get a tgz archive of a specific commit hash to quickly receive or distribute files of a repository exposed by Gitweb. However, I have two problems with these links: 1) The link seems to depend on a commit hash. I haven't found a shorter or persitant link that would e.g. always give me the latest master snapshot. 2) Another issue which is actually more problematic: The links only seem to work interactively in my desktop browser. If I right click the link "snapshot", copy the URL and then try to download that link from another (headless) machine using wget, I end up getting a html file instead of a tgz archive. Is it possible to generate such links that conssitantly work for the latest commit of a repository and that work non-interactively from a command line? Am I doing something wrong (well aside fromt he possibility that I'm trying to use gitweb for something which it might not have been designed for...)? Thank you! Kind regards, Timo -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [WIP-PATCH 0/2] send-email: refactor the email parser loop
Samuel GROOTwrites: > (mbox) prefix was introduced by Ryan Anderson in 2005 (can't find the > exact commit though), in opposition with the (non-mbox) format ("lots > of email") that was used before. That is actually from the original commit introducing send-email: 83b2443 ([PATCH] Add git-send-email-script - tool to send emails from git-format-patch-script, 2005-07-31), i.e. ~3 month after Git was born. At that time, user-friendlyness was not really a priority ;-). > Is the "lots of email" format still used? AFAICT, it was initially supported for backward compatibility, and then no one removed it, but I wouldn't be surprised if no one actually used it. I vaguely remember a message from Ryan Anderson being surprised to see the old format still supported, but I can't find it in the archives. In any case: - git log --grep 'lots of email' => shows only 83b2443 - git log -S'lots of email' => likewise - git grep 'lots of email' => just one answer in a comment I'm not sure the feature is even tested. -- Matthieu Moy http://www-verimag.imag.fr/~moy/ -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v4] upload-pack.c: use of parse-options API
Hello Eric, Thank you for answer. Your remarks have been added in the next version. > > + OPT_BOOL(0, "strict", , > > +N_("do not try /.git/ if is > > no Git directory")), > > Use of OPT_BOOL introduces a --no-strict option which didn't exist > before. Does this need to be documented? (Genuine question.) > You're right. I've added it in the documentation. Antoine -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Fatal bug on revert with --author
So I learned today that --author is not a supported argument to git revert. It took me a long time to realize this, though, because the error I got was very cryptic: fatal: BUG: expected exactly one commit from walk Here's a very simple reproducible case: https://gist.github.com/anlutro/67e0cec1a6a419e6d44131b0bc1deff6 I was recommended to send this report here by #git on irc.freenode.net. Thanks, Andreas -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/2] strbuf: allow to use preallocated memory
Matthieu Moywrites: > William Duclot writes: > >> @@ -20,16 +28,37 @@ char strbuf_slopbuf[1]; >> >> void strbuf_init(struct strbuf *sb, size_t hint) >> { >> +sb->flags = 0; >> sb->alloc = sb->len = 0; >> sb->buf = strbuf_slopbuf; >> if (hint) >> strbuf_grow(sb, hint); >> } > > If you set flags = 0 here, existing callers will have all flags off, > including OWNS_MEMORY. > > I *think* this is OK, as sb->buf is currently pointing to > strbuf_slopbuf, which the the strbuf doesn't own. But that is too subtle > to go without an explanatory comment IMHO. Right > > Also, doesn't this make the "new_buf" case useless in strbuf_grow? > > With your patch, the code looks like: > > void strbuf_grow(struct strbuf *sb, size_t extra) > { > int new_buf = !sb->alloc; > ... > if (sb->flags & STRBUF_OWNS_MEMORY) { > if (new_buf) // < (1) > sb->buf = NULL; > ALLOC_GROW(sb->buf, sb->len + extra + 1, sb->alloc); > } else { > /* >* The strbuf doesn't own the buffer: to avoid to realloc it, >* the strbuf needs to use a new buffer without freeing the old >*/ > if (sb->len + extra + 1 > sb->alloc) { > size_t new_alloc = MAX(sb->len + extra + 1, > alloc_nr(sb->alloc)); > char *buf = xmalloc(new_alloc); > memcpy(buf, sb->buf, sb->alloc); > sb->buf = buf; > sb->alloc = new_alloc; > sb->flags |= STRBUF_OWNS_MEMORY; > } > } > > if (new_buf) // < (2) > sb->buf[0] = '\0'; > } > > I think (1) is now dead code, since sb->alloc == 0 implies that > STRBUF_OWNS_MEMORY is set. (2) seems redundant since you've just > memcpy-ed the existing '\0' into the buffer. You're right for (1), I hadn't noticed that. For (2), we'll still have to set sb->buf[new_alloc-1]='\0' after the memcpy, if we have sb->alloc==0 then the memcpy won't copy it. >> +void strbuf_wrap_preallocated(struct strbuf *sb, char *path_buf, >> + size_t path_buf_len, size_t alloc_len) >> +{ >> +if (!path_buf) >> +die("you try to use a NULL buffer to initialize a strbuf"); >> + >> +strbuf_init(sb, 0); >> +strbuf_attach(sb, path_buf, path_buf_len, alloc_len); >> +sb->flags &= ~STRBUF_OWNS_MEMORY; >> +sb->flags &= ~STRBUF_FIXED_MEMORY; >> +} > > strbuf_wrap_preallocated seem very close to strbuf_attach. I'd rather > see a symmetric code sharing like > > void strbuf_attach_internal(struct strbuf *sb, ..., unsigned int flags) > > and then strbuf_attach() and strbuf_wrap_preallocated() become > straightforward wrappers around it. > > This would avoid setting and then unsetting STRBUF_OWNS_MEMORY (the > performance impact is probably small, but it looks weird to set the flag > and then unset it right away). We'll refactor the code with Johannes' remarks and yours in mind > After your patch, there are differences between > strbuf_wrap_preallocated() which I think are inconsistencies: > > * strbuf_attach() does not check for NULL buffer, but it doesn't accept > them either if I read correctly. It would make sense to add the check > to strbuf_attach(), but it's weird to have the performance-critical > oriented function do the expensive stuff that the > non-performance-critical one doesn't. I agree that strbuf_attach should do the check (it seems strange that it doesn't already do it, as the "buffer never NULL" invariant is not new). I don't understand your "but" part, what "expensive stuff" are you talking about? > * strbuf_attach() calls strbuf_release(), which allows reusing an > existing strbuf. strbuf_wrap_preallocated() calls strbuf_init which > would override silently any previous content. I think strbuf_attach() > does the right thing here. > > In any case, you probably want to include calls to strbuf_attach() and > strbuf_wrap_*() functions on existing non-empty strbufs. > >> +void strbuf_wrap_fixed(struct strbuf *sb, char *path_buf, >> + size_t path_buf_len, size_t alloc_len) >> +{ >> +strbuf_wrap_preallocated(sb, path_buf, path_buf_len, alloc_len); >> +sb->flags |= STRBUF_FIXED_MEMORY; >> +} > > And this could become a 3rd caller of strbuf_attach_internal(). True. We'll take care of that when refactoring >> @@ -61,9 +96,32 @@ void strbuf_grow(struct strbuf *sb, size_t extra) >> if (unsigned_add_overflows(extra, 1) || >> unsigned_add_overflows(sb->len, extra + 1)) >> die("you want to use way too much memory"); >> -if (new_buf) >> -sb->buf = NULL; >> -ALLOC_GROW(sb->buf, sb->len + extra + 1, sb->alloc); >> +if
Re: [WIP-PATCH 0/2] send-email: refactor the email parser loop
On 05/29/2016 08:05 PM, Matthieu Moy wrote: Samuel GROOTwrites: Should we take what Eric suggested (see below) as standard output? Since the headers are already shown after those lines, it's redundant to have the entire line. And we could add trailers after the headers (with a blank line to delimit): # existing header output: From: F Cc: A , One Subject: foo # new trailer output Signed-off-by: SoB Acked-by: ack I don't think adding the trailers in the output is needed. If the message says Adding foo@example from Signed-off-by trailer I can guess that it's from "Signed-off-by: foo@example" without having it explicitly. Perhaps others think differently, but for me, the shortest output would be the better (if only to solve the "I never see these lines, they scrolled out of my terminal" issue). I agree, the shorter the better. Ideally, I'd even like to shorten the current output a bit more (the X-Mailer: header is useless IMHO, and the Date: and Message-id: headers are probably not useful enough to be shown by default). Agreed. (Just thinking aloud, obviously none of this should be a prerequisite to accept your refactoring patch) And keep "(mbox) Adding ..." lines as error output, or maybe displayable by a new option `--verbose`? I think the "Adding ..." lines make sense by default at least for beginners (just a few days ago, I received a bunch of test emails by your team follow by a "Oops, I just noticed you got Cc-ed in my tests" message ;-), that would probably have been worse without the message). There could be an advice.* option to deactivate them, though. An advice.* option seems a good solution to me. The (mbox) prefix doesn't seem useful to me OTOH, I think it's a leftover debugging message. (mbox) prefix was introduced by Ryan Anderson in 2005 (can't find the exact commit though), in opposition with the (non-mbox) format ("lots of email") that was used before. Is the "lots of email" format still used? When adding Cc: from a Signed-off-by: field, (body) prefix is used. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/2] strbuf: add tests
Hi Johannes, I'm William's teammate on this feature. Johannes Schindelinwrites: > Hi William, > > On Mon, 30 May 2016, William Duclot wrote: > > > Test the strbuf API. Being used throughout all Git the API could be > > considered tested, but adding specific tests makes it easier to improve > > and extend the API. > > --- > > The commit message makes sense. Please add your sign-off. > We forgot to add the sign-off, we will fix that in the V2. > > Makefile | 1 + > > t/helper/test-strbuf.c | 69 > > ++ > > t/t0082-strbuf.sh | 19 ++ > > 3 files changed, 89 insertions(+) > > create mode 100644 t/helper/test-strbuf.c > > create mode 100755 t/t0082-strbuf.sh > > > > diff --git a/Makefile b/Makefile > > index 3f03366..dc84f43 100644 > > --- a/Makefile > > +++ b/Makefile > > @@ -613,6 +613,7 @@ TEST_PROGRAMS_NEED_X += test-scrap-cache-tree > > TEST_PROGRAMS_NEED_X += test-sha1 > > TEST_PROGRAMS_NEED_X += test-sha1-array > > TEST_PROGRAMS_NEED_X += test-sigchain > > +TEST_PROGRAMS_NEED_X += test-strbuf > > TEST_PROGRAMS_NEED_X += test-string-list > > TEST_PROGRAMS_NEED_X += test-submodule-config > > TEST_PROGRAMS_NEED_X += test-subprocess > > diff --git a/t/helper/test-strbuf.c b/t/helper/test-strbuf.c > > new file mode 100644 > > index 000..622f627 > > --- /dev/null > > +++ b/t/helper/test-strbuf.c > > @@ -0,0 +1,69 @@ > > +#include "git-compat-util.h" > > +#include "strbuf.h" > > + > > +/* > > + * Check behavior on usual use cases > > + */ > > +int test_usual(struct strbuf *sb) > > I have to admit that I would prefer a more concrete name. And since your > other tests are more fine-grained, maybe this one could be split into > multiple separate ones, too? > We will rename this function. We thought that one complete function would be convenient to test the usual API's behaviour. We are not sure how that change would be useful? > > +{ > > + size_t size, old_alloc; > > + char *res, *old_buf, *str_test = malloc(5*sizeof(char)); > > Our convention is to list the initialized variables first, the > uninitialized ones after that, and for readability an empty line is > recommended after the variable declaration block. OK, seems more readable. > > + strbuf_grow(sb, 1); > > + strcpy(str_test, "test"); > > + old_alloc = sb->alloc; > > + strbuf_grow(sb, 1000); > > + if (old_alloc == sb->alloc) > > + die("strbuf_grow does not realloc the buffer as expected"); > > + old_buf = sb->buf; > > + res = strbuf_detach(sb, ); > > + if (res != old_buf) > > + die("strbuf_detach does not return the expected buffer"); > > + free(res); > > + > > + strcpy(str_test, "test"); > > + strbuf_attach(sb, (void *)str_test, strlen(str_test), sizeof(str_test)); > > + res = strbuf_detach(sb, ); > > + if (res != str_test) > > + die("strbuf_detach does not return the expected buffer"); > > + free(res); > > + strbuf_release(sb); > > + > > + return 0; > > +} > > + > > +int main(int argc, char *argv[]) > > +{ > > + size_t size = 1; > > + struct strbuf sb; > > The common theme in our source code seems to initialize using > STRBUF_INIT... Let's use that paradigm here, too? We will add a test to check that initializing with srtbuf_init(...) is the same as initializing with STRBUF_INIT. > > > + char str_test[5] = "test"; > > + char str_foo[7] = "foo"; > > + > > + if (argc != 2) > > + usage("test-strbuf mode"); > > A nice and convenient way to do command-line parsing is to use the > parse-options API, in this case with OPT_CMDMODE. This would also give us > a chance to document the command modes in a nice and succinct way: as help > strings. > True, we're going to make that change. > > + > > + if (!strcmp(argv[1], "basic_grow")) { > > + /* > > +* Check if strbuf_grow(0) allocate a new NUL-terminated buffer > > s/allocate// > > > +*/ > > + strbuf_init(, 0); > > + strbuf_grow(, 0); > > + if (sb.buf == strbuf_slopbuf) > > + die("strbuf_grow failed to alloc memory"); > > + strbuf_release(); > > + if (sb.buf != strbuf_slopbuf) > > + die("strbuf_release does not reinitialize the strbuf"); > > + } else if (!strcmp(argv[1], "strbuf_check_behavior")) { > > + strbuf_init(, 0); > > + return test_usual(); > > + } else if (!strcmp(argv[1], "grow_overflow")) { > > + /* > > +* size_t overflow: should die() > > +*/ > > + strbuf_init(, 1000); > > + strbuf_grow(, maximum_unsigned_value_of_type((size_t)1)); > > A comment "If this does not die(), fall through to returning success, to > indicate an error" might be nice here. Agreed. > > + } else { > > + usage("test-strbuf mode"); > > + } > > + > > + return 0; > > +} > > diff --git
Re: [WIP-PATCH 1/2] send-email: create email parser subroutine
On 05/29/2016 07:53 PM, Matthieu Moy wrote: Samuel GROOTwrites: So should we merge parse_email and parse_header in one unique subroutine? At least on the user (i.e. caller of the API) side, one function is probably enough. I will do that, as soon as we decide what's best between including Email::Simple library or making our own API. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] push: deny policy to prevent pushes to unwanted remotes.
Antoine Queruwrites: > Currently, a user wanting to prevent accidental pushes to the wrong remote > has to create a pre-push hook. > The feature offers a configuration to allow users to prevent accidental pushes > to the wrong remote. The user may define a list of whitelisted remotes, a list > of blacklisted remotes and a default policy ("allow" or "deny"). A push > is denied if the remote is explicitely blacklisted or if it isn't > whitelisted and the default policy is "deny". Not really serious, but the text above is weirdly wrapped, probably by hand. I'm sure your text editor can do better and quicker ;-). > diff --git a/Documentation/config.txt b/Documentation/config.txt > index 53f00db..1478ce3 100644 > --- a/Documentation/config.txt > +++ b/Documentation/config.txt > @@ -2517,6 +2517,23 @@ remote.pushDefault:: > `branch..remote` for all branches, and is overridden by > `branch..pushRemote` for specific branches. > > +remote.pushBlacklisted:: > + The list of remotes the user is forbidden to push to. > + See linkgit:git-push[1] I'd have spelled that "pushBlacklist" (no 'ed'). I think variable names usually do not use verbs. But I'm fine with your version too. > +For example, if we set up the configuration variables like this: > + git config --add remote.pushBlacklisted repository.com > + git config --add remote.pushWhitelisted repository.com/Special_Folder > +Any push of this form will be accepted: > + git push repository.com/Special_Folder/foo > +While those ones for example will be denied: > + git push repository.com/Normal_Folder/bar Please, look at the rendered output of your documentation, and notice it's broken. We'd typically use the asciidoc syntax for inline code here (between -). > +An error will be raised if the url is blacklisted and whitelisted at the > same. "at the same time"? But as-is, this sentence conflicts with the previous "the more the url in the config prefixes the asked url the more priority it has." statement. The documentation doesn't talk about the URL-normalization the code is doing. I think a reasonable behavior would be: pushBlacklisted = example.com/ => deny all accesses to example.com pushBlacklisted = http://example.com/ => deny HTTP accesses to example.com The second is a valid use-case IMHO, some people may want to forbid some protocols. Actually, one may even want to whilelist only one protocol and write stg like this to force HTTPS on host example.com: pushBlacklisted = example.com pushWhitelisted = https://example.com BTW, these use-cases could motivate some per-blacklist deny message like [remote "example.com"] pushDenyMessage = "Please use HTTPS when you push to example.com" I don't think it has to be implemented now though (better have users get used to the basic feature and see if more is needed later). > +static const char *string_url_normalize(const char *repo_name) > +{ > + if (starts_with(repo_name, "file://")) > + return repo_name + 7; There are many instances of this in Git's codebase, but we now try to avoid magic numbers like this, and would use strlen("file://") instead. Actually, we even have skip_prefix() precisely for this use-case. > + if (is_url(repo_name)) { > + struct url_info url_infos; > + url_normalize(repo_name, _infos); > + return url_infos.url + url_infos.host_off; I think this would deserve a comment to explain why and what this is doing, like /* turn ... into ... to ... */. > +test_expect_success 'unsetup' ' "cleanup" ? (I just did a very quick look at the code, I think we need an agreement on the details of specification before a more detailed review) -- Matthieu Moy http://www-verimag.imag.fr/~moy/ -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/2] strbuf: allow to use preallocated memory
Johannes Schindelinwrites: > On Mon, 30 May 2016, William Duclot wrote: > >> It is unfortunate that it is currently impossible to use a strbuf >> without doing a memory allocation. So code like >> >> void f() >> { >> char path[PATH_MAX]; >> ... >> } >> >> typically gets turned into either >> >> void f() >> { >> struct strbuf path; >> strbuf_add(, ...); <-- does a malloc >> ... >> strbuf_release(); <-- does a free >> } >> >> which costs extra memory allocations, or >> >> void f() >> { >> static struct strbuf path; >> strbuf_add(, ...); >> ... >> strbuf_setlen(, 0); >> } >> >> which, by using a static variable, avoids most of the malloc/free >> overhead, but makes the function unsafe to use recursively or from >> multiple threads. Those limitations prevent strbuf to be used in >> performance-critical operations. > > This description is nice and verbose, but maybe something like this would > introduce the subject in a quicker manner? > > When working e.g. with file paths or with dates, strbuf's > malloc()/free() dance of strbufs can be easily avoided: as > a sensible initial buffer size is already known, it can be > allocated on the heap. strbuf already allow to indicate a sensible initial buffer size thanks to strbuf_init() second parameter. The main perk of pre-allocation is to use stack-allocated memory, and not heap-allocated :) Unless I misunderstood your message? >> diff --git a/strbuf.c b/strbuf.c >> index 1ba600b..527b986 100644 >> --- a/strbuf.c >> +++ b/strbuf.c >> @@ -1,6 +1,14 @@ >> #include "cache.h" >> #include "refs.h" >> #include "utf8.h" >> +#include > > Why? For the MAX macro. It may be a teeny tiny overkill >> +/** >> + * Flags >> + * -- >> + */ >> +#define STRBUF_OWNS_MEMORY 1 >> +#define STRBUF_FIXED_MEMORY (1 << 1) > > From reading the commit message, I expected STRBUF_OWNS_MEMORY. > STRBUF_FIXED_MEMORY still needs to be explained. Yes, that seems right >> @@ -20,16 +28,37 @@ char strbuf_slopbuf[1]; >> >> void strbuf_init(struct strbuf *sb, size_t hint) >> { >> +sb->flags = 0; >> sb->alloc = sb->len = 0; >> sb->buf = strbuf_slopbuf; >> if (hint) >> strbuf_grow(sb, hint); >> } >> >> +void strbuf_wrap_preallocated(struct strbuf *sb, char *path_buf, >> + size_t path_buf_len, size_t alloc_len) >> +{ >> +if (!path_buf) >> +die("you try to use a NULL buffer to initialize a strbuf"); >> + >> +strbuf_init(sb, 0); >> +strbuf_attach(sb, path_buf, path_buf_len, alloc_len); >> +sb->flags &= ~STRBUF_OWNS_MEMORY; >> +sb->flags &= ~STRBUF_FIXED_MEMORY; > > Shorter: sb->flags &= ~(STRBUF_OWNS_MEMORY | STRBUF_FIXED_MEMORY); Okay with me >> +} >> + >> +void strbuf_wrap_fixed(struct strbuf *sb, char *path_buf, >> + size_t path_buf_len, size_t alloc_len) >> +{ >> +strbuf_wrap_preallocated(sb, path_buf, path_buf_len, alloc_len); >> +sb->flags |= STRBUF_FIXED_MEMORY; >> +} > > Rather than letting strbuf_wrap_preallocated() set sb->flags &= > ~FIXED_MEMORY only to revert that decision right away, a static function > could be called by both strbuf_wrap_preallocated() and > strbuf_wrap_fixed(). Makes sense >> void strbuf_release(struct strbuf *sb) >> { >> if (sb->alloc) { >> -free(sb->buf); >> +if (sb->flags & STRBUF_OWNS_MEMORY) >> +free(sb->buf); >> strbuf_init(sb, 0); >> } > > Should we not reset the flags here, too? Well, strbuf_init() reset the flags. The only way to have !sb->alloc is that strbuf has been initialized and never used (even alloc_grow(0) set sb->alloc=1), so sb==STRBUF_INIT, so the flags don't have to be reset >> @@ -38,7 +67,11 @@ char *strbuf_detach(struct strbuf *sb, size_t *sz) >> { >> char *res; >> strbuf_grow(sb, 0); >> -res = sb->buf; >> +if (sb->flags & STRBUF_OWNS_MEMORY) >> +res = sb->buf; >> +else >> +res = xmemdupz(sb->buf, sb->alloc - 1); > > This looks like a usage to be avoided: if we plan to detach the buffer, > anyway, there is no good reason to allocate it on the heap first. I would > at least issue a warning here. strbuf_detach() guarantees to return heap-allocated memory, that the caller can use however he want and that he'll have to free. If the strbuf doesn't own the memory, it cannot return the buf attribute directly because: - The memory belong to someone else (so the caller can't use it however he want) - The caller can't have the responsibility to free (because the memory belong to someone else) - The memory may not even be heap-allocated >> @@ -51,6 +84,8 @@ void strbuf_attach(struct strbuf *sb, void *buf, size_t >> len, size_t alloc) >> sb->buf = buf; >> sb->len = len; >> sb->alloc = alloc; >> +sb->flags |= STRBUF_OWNS_MEMORY; >> +sb->flags &= ~STRBUF_FIXED_MEMORY;
Re: [PATCH 2/2] strbuf: allow to use preallocated memory
William Duclotwrites: > multiple threads. Those limitations prevent strbuf to be used in prevent strbuf from being used ... > API ENHANCEMENT > --- > > All functions of the API can still be reliably called without > knowledge of the initialization (normal/preallocated/fixed) with the > exception that strbuf_grow() may die() if the string try to overflow a s/try/tries/ > @@ -20,16 +28,37 @@ char strbuf_slopbuf[1]; > > void strbuf_init(struct strbuf *sb, size_t hint) > { > + sb->flags = 0; > sb->alloc = sb->len = 0; > sb->buf = strbuf_slopbuf; > if (hint) > strbuf_grow(sb, hint); > } If you set flags = 0 here, existing callers will have all flags off, including OWNS_MEMORY. I *think* this is OK, as sb->buf is currently pointing to strbuf_slopbuf, which the the strbuf doesn't own. But that is too subtle to go without an explanatory comment IMHO. Also, doesn't this make the "new_buf" case useless in strbuf_grow? With your patch, the code looks like: void strbuf_grow(struct strbuf *sb, size_t extra) { int new_buf = !sb->alloc; ... if (sb->flags & STRBUF_OWNS_MEMORY) { if (new_buf) // < (1) sb->buf = NULL; ALLOC_GROW(sb->buf, sb->len + extra + 1, sb->alloc); } else { /* * The strbuf doesn't own the buffer: to avoid to realloc it, * the strbuf needs to use a new buffer without freeing the old */ if (sb->len + extra + 1 > sb->alloc) { size_t new_alloc = MAX(sb->len + extra + 1, alloc_nr(sb->alloc)); char *buf = xmalloc(new_alloc); memcpy(buf, sb->buf, sb->alloc); sb->buf = buf; sb->alloc = new_alloc; sb->flags |= STRBUF_OWNS_MEMORY; } } if (new_buf) // < (2) sb->buf[0] = '\0'; } I think (1) is now dead code, since sb->alloc == 0 implies that STRBUF_OWNS_MEMORY is set. (2) seems redundant since you've just memcpy-ed the existing '\0' into the buffer. > +void strbuf_wrap_preallocated(struct strbuf *sb, char *path_buf, > + size_t path_buf_len, size_t alloc_len) > +{ > + if (!path_buf) > + die("you try to use a NULL buffer to initialize a strbuf"); > + > + strbuf_init(sb, 0); > + strbuf_attach(sb, path_buf, path_buf_len, alloc_len); > + sb->flags &= ~STRBUF_OWNS_MEMORY; > + sb->flags &= ~STRBUF_FIXED_MEMORY; > +} strbuf_wrap_preallocated seem very close to strbuf_attach. I'd rather see a symmetric code sharing like void strbuf_attach_internal(struct strbuf *sb, ..., unsigned int flags) and then strbuf_attach() and strbuf_wrap_preallocated() become straightforward wrappers around it. This would avoid setting and then unsetting STRBUF_OWNS_MEMORY (the performance impact is probably small, but it looks weird to set the flag and then unset it right away). After your patch, there are differences between strbuf_wrap_preallocated() which I think are inconsistencies: * strbuf_attach() does not check for NULL buffer, but it doesn't accept them either if I read correctly. It would make sense to add the check to strbuf_attach(), but it's weird to have the performance-critical oriented function do the expensive stuff that the non-performance-critical one doesn't. * strbuf_attach() calls strbuf_release(), which allows reusing an existing strbuf. strbuf_wrap_preallocated() calls strbuf_init which would override silently any previous content. I think strbuf_attach() does the right thing here. (And I'm probably the one who misguided you to do this) In any case, you probably want to include calls to strbuf_attach() and strbuf_wrap_*() functions on existing non-empty strbufs. > +void strbuf_wrap_fixed(struct strbuf *sb, char *path_buf, > +size_t path_buf_len, size_t alloc_len) > +{ > + strbuf_wrap_preallocated(sb, path_buf, path_buf_len, alloc_len); > + sb->flags |= STRBUF_FIXED_MEMORY; > +} And this could become a 3rd caller of strbuf_attach_internal(). > @@ -61,9 +96,32 @@ void strbuf_grow(struct strbuf *sb, size_t extra) > if (unsigned_add_overflows(extra, 1) || > unsigned_add_overflows(sb->len, extra + 1)) > die("you want to use way too much memory"); > - if (new_buf) > - sb->buf = NULL; > - ALLOC_GROW(sb->buf, sb->len + extra + 1, sb->alloc); > + if ((sb->flags & STRBUF_FIXED_MEMORY) && sb->len + extra + 1 > > sb->alloc) > + die("you try to make a string overflow the buffer of a fixed > strbuf"); > + > + /* > + * ALLOC_GROW may do a realloc() if needed, so we must not use it on > + * a buffer the
Re: [PATCH 2/2] strbuf: allow to use preallocated memory
Hi William, On Mon, 30 May 2016, William Duclot wrote: > It is unfortunate that it is currently impossible to use a strbuf > without doing a memory allocation. So code like > > void f() > { > char path[PATH_MAX]; > ... > } > > typically gets turned into either > > void f() > { > struct strbuf path; > strbuf_add(, ...); <-- does a malloc > ... > strbuf_release(); <-- does a free > } > > which costs extra memory allocations, or > > void f() > { > static struct strbuf path; > strbuf_add(, ...); > ... > strbuf_setlen(, 0); > } > > which, by using a static variable, avoids most of the malloc/free > overhead, but makes the function unsafe to use recursively or from > multiple threads. Those limitations prevent strbuf to be used in > performance-critical operations. This description is nice and verbose, but maybe something like this would introduce the subject in a quicker manner? When working e.g. with file paths or with dates, strbuf's malloc()/free() dance of strbufs can be easily avoided: as a sensible initial buffer size is already known, it can be allocated on the heap. The rest of the commit message flows nicely. > diff --git a/strbuf.c b/strbuf.c > index 1ba600b..527b986 100644 > --- a/strbuf.c > +++ b/strbuf.c > @@ -1,6 +1,14 @@ > #include "cache.h" > #include "refs.h" > #include "utf8.h" > +#include Why? > +/** > + * Flags > + * -- > + */ > +#define STRBUF_OWNS_MEMORY 1 > +#define STRBUF_FIXED_MEMORY (1 << 1) >From reading the commit message, I expected STRBUF_OWNS_MEMORY. STRBUF_FIXED_MEMORY still needs to be explained. > @@ -20,16 +28,37 @@ char strbuf_slopbuf[1]; > > void strbuf_init(struct strbuf *sb, size_t hint) > { > + sb->flags = 0; > sb->alloc = sb->len = 0; > sb->buf = strbuf_slopbuf; > if (hint) > strbuf_grow(sb, hint); > } > > +void strbuf_wrap_preallocated(struct strbuf *sb, char *path_buf, > + size_t path_buf_len, size_t alloc_len) > +{ > + if (!path_buf) > + die("you try to use a NULL buffer to initialize a strbuf"); > + > + strbuf_init(sb, 0); > + strbuf_attach(sb, path_buf, path_buf_len, alloc_len); > + sb->flags &= ~STRBUF_OWNS_MEMORY; > + sb->flags &= ~STRBUF_FIXED_MEMORY; Shorter: sb->flags &= ~(STRBUF_OWNS_MEMORY | STRBUF_FIXED_MEMORY); > +} > + > +void strbuf_wrap_fixed(struct strbuf *sb, char *path_buf, > +size_t path_buf_len, size_t alloc_len) > +{ > + strbuf_wrap_preallocated(sb, path_buf, path_buf_len, alloc_len); > + sb->flags |= STRBUF_FIXED_MEMORY; > +} Rather than letting strbuf_wrap_preallocated() set sb->flags &= ~FIXED_MEMORY only to revert that decision right away, a static function could be called by both strbuf_wrap_preallocated() and strbuf_wrap_fixed(). > void strbuf_release(struct strbuf *sb) > { > if (sb->alloc) { > - free(sb->buf); > + if (sb->flags & STRBUF_OWNS_MEMORY) > + free(sb->buf); > strbuf_init(sb, 0); > } Should we not reset the flags here, too? > @@ -38,7 +67,11 @@ char *strbuf_detach(struct strbuf *sb, size_t *sz) > { > char *res; > strbuf_grow(sb, 0); > - res = sb->buf; > + if (sb->flags & STRBUF_OWNS_MEMORY) > + res = sb->buf; > + else > + res = xmemdupz(sb->buf, sb->alloc - 1); This looks like a usage to be avoided: if we plan to detach the buffer, anyway, there is no good reason to allocate it on the heap first. I would at least issue a warning here. > @@ -51,6 +84,8 @@ void strbuf_attach(struct strbuf *sb, void *buf, size_t > len, size_t alloc) > sb->buf = buf; > sb->len = len; > sb->alloc = alloc; > + sb->flags |= STRBUF_OWNS_MEMORY; > + sb->flags &= ~STRBUF_FIXED_MEMORY; > strbuf_grow(sb, 0); > sb->buf[sb->len] = '\0'; > } > @@ -61,9 +96,32 @@ void strbuf_grow(struct strbuf *sb, size_t extra) > if (unsigned_add_overflows(extra, 1) || > unsigned_add_overflows(sb->len, extra + 1)) > die("you want to use way too much memory"); > - if (new_buf) > - sb->buf = NULL; > - ALLOC_GROW(sb->buf, sb->len + extra + 1, sb->alloc); > + if ((sb->flags & STRBUF_FIXED_MEMORY) && sb->len + extra + 1 > > sb->alloc) > + die("you try to make a string overflow the buffer of a fixed > strbuf"); We try to avoid running over 80 columns/row. This message could be more to the point: cannot grow fixed string > + /* > + * ALLOC_GROW may do a realloc() if needed, so we must not use it on > + * a buffer the strbuf doesn't own > + */ > + if (sb->flags & STRBUF_OWNS_MEMORY) { > + if (new_buf) > + sb->buf = NULL; > + ALLOC_GROW(sb->buf, sb->len + extra + 1, sb->alloc); > + } else { > + /* > + * The strbuf doesn't own
Re: [PATCH 1/2] strbuf: add tests
I agree with Johannes' remarks. I'd add two style nits: William Duclotwrites: > --- /dev/null > +++ b/t/helper/test-strbuf.c > @@ -0,0 +1,69 @@ > +#include "git-compat-util.h" > +#include "strbuf.h" > + > +/* > + * Check behavior on usual use cases > + */ > +int test_usual(struct strbuf *sb) > +{ > + size_t size, old_alloc; > + char *res, *old_buf, *str_test = malloc(5*sizeof(char)); Spaces around binary operator. > +int main(int argc, char *argv[]) > +{ > + size_t size = 1; > + struct strbuf sb; > + char str_test[5] = "test"; > + char str_foo[7] = "foo"; > + > + if (argc != 2) > + usage("test-strbuf mode"); > + > + if (!strcmp(argv[1], "basic_grow")) { > + /* > + * Check if strbuf_grow(0) allocate a new NUL-terminated buffer > + */ > + strbuf_init(, 0); > + strbuf_grow(, 0); > + if (sb.buf == strbuf_slopbuf) > + die("strbuf_grow failed to alloc memory"); > + strbuf_release(); > + if (sb.buf != strbuf_slopbuf) > + die("strbuf_release does not reinitialize the strbuf"); > + } else if (!strcmp(argv[1], "strbuf_check_behavior")) { > + strbuf_init(, 0); > + return test_usual(); I think the command ("strbuf_check_behavior") should have the same name as the function (test_usual). This avoids one indirection for the reader going from t*.sh file to the actual test code. -- Matthieu Moy http://www-verimag.imag.fr/~moy/ -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/2] strbuf: add tests
Hi William, On Mon, 30 May 2016, William Duclot wrote: > Test the strbuf API. Being used throughout all Git the API could be > considered tested, but adding specific tests makes it easier to improve > and extend the API. > --- The commit message makes sense. Please add your sign-off. > Makefile | 1 + > t/helper/test-strbuf.c | 69 > ++ > t/t0082-strbuf.sh | 19 ++ > 3 files changed, 89 insertions(+) > create mode 100644 t/helper/test-strbuf.c > create mode 100755 t/t0082-strbuf.sh > > diff --git a/Makefile b/Makefile > index 3f03366..dc84f43 100644 > --- a/Makefile > +++ b/Makefile > @@ -613,6 +613,7 @@ TEST_PROGRAMS_NEED_X += test-scrap-cache-tree > TEST_PROGRAMS_NEED_X += test-sha1 > TEST_PROGRAMS_NEED_X += test-sha1-array > TEST_PROGRAMS_NEED_X += test-sigchain > +TEST_PROGRAMS_NEED_X += test-strbuf > TEST_PROGRAMS_NEED_X += test-string-list > TEST_PROGRAMS_NEED_X += test-submodule-config > TEST_PROGRAMS_NEED_X += test-subprocess > diff --git a/t/helper/test-strbuf.c b/t/helper/test-strbuf.c > new file mode 100644 > index 000..622f627 > --- /dev/null > +++ b/t/helper/test-strbuf.c > @@ -0,0 +1,69 @@ > +#include "git-compat-util.h" > +#include "strbuf.h" > + > +/* > + * Check behavior on usual use cases > + */ > +int test_usual(struct strbuf *sb) I have to admit that I would prefer a more concrete name. And since your other tests are more fine-grained, maybe this one could be split into multiple separate ones, too? > +{ > + size_t size, old_alloc; > + char *res, *old_buf, *str_test = malloc(5*sizeof(char)); Our convention is to list the initialized variables first, the uninitialized ones after that, and for readability an empty line is recommended after the variable declaration block. > + strbuf_grow(sb, 1); > + strcpy(str_test, "test"); > + old_alloc = sb->alloc; > + strbuf_grow(sb, 1000); > + if (old_alloc == sb->alloc) > + die("strbuf_grow does not realloc the buffer as expected"); > + old_buf = sb->buf; > + res = strbuf_detach(sb, ); > + if (res != old_buf) > + die("strbuf_detach does not return the expected buffer"); > + free(res); > + > + strcpy(str_test, "test"); > + strbuf_attach(sb, (void *)str_test, strlen(str_test), sizeof(str_test)); > + res = strbuf_detach(sb, ); > + if (res != str_test) > + die("strbuf_detach does not return the expected buffer"); > + free(res); > + strbuf_release(sb); > + > + return 0; > +} > + > +int main(int argc, char *argv[]) > +{ > + size_t size = 1; > + struct strbuf sb; The common theme in our source code seems to initialize using STRBUF_INIT... Let's use that paradigm here, too? > + char str_test[5] = "test"; > + char str_foo[7] = "foo"; > + > + if (argc != 2) > + usage("test-strbuf mode"); A nice and convenient way to do command-line parsing is to use the parse-options API, in this case with OPT_CMDMODE. This would also give us a chance to document the command modes in a nice and succinct way: as help strings. > + > + if (!strcmp(argv[1], "basic_grow")) { > + /* > + * Check if strbuf_grow(0) allocate a new NUL-terminated buffer s/allocate// > + */ > + strbuf_init(, 0); > + strbuf_grow(, 0); > + if (sb.buf == strbuf_slopbuf) > + die("strbuf_grow failed to alloc memory"); > + strbuf_release(); > + if (sb.buf != strbuf_slopbuf) > + die("strbuf_release does not reinitialize the strbuf"); > + } else if (!strcmp(argv[1], "strbuf_check_behavior")) { > + strbuf_init(, 0); > + return test_usual(); > + } else if (!strcmp(argv[1], "grow_overflow")) { > + /* > + * size_t overflow: should die() > + */ > + strbuf_init(, 1000); > + strbuf_grow(, maximum_unsigned_value_of_type((size_t)1)); A comment "If this does not die(), fall through to returning success, to indicate an error" might be nice here. > + } else { > + usage("test-strbuf mode"); > + } > + > + return 0; > +} > diff --git a/t/t0082-strbuf.sh b/t/t0082-strbuf.sh > new file mode 100755 > index 000..0800d26 > --- /dev/null > +++ b/t/t0082-strbuf.sh > @@ -0,0 +1,19 @@ > +#!/bin/sh > + > +test_description="Test the strbuf API. > +" This description does not need a new-line, and existing one-liner test descriptions seem not to be terminated by a period. The rest of this patch looks good. Ciao, Johannes -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/2] strbuf: improve API
William Duclotwrites: > This patch series implements an improvment of the strbuf API, allowing > strbuf to use preallocated memory. This makes strbuf fit to be used > in performance-critical operations. > > The first patch is simply a preparatory work, adding tests for > existing strbuf implementation. > Most of the work is made in the second patch: handle pre-allocated > memory, extend the API, document it and test it. Seems interesting, however do you have any test/example that would show the difference of performance between using these optimizations and not using them? Such values would make a nice addition to help convince people that your series is interesting to have and use. Thanks, Rémi -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 02/22] i18n: advice: mark string about detached head for translation
Às 19:27 de 23-05-2016, Vasco Almeida escreveu: > Mark string with advice seen by the user when in detached head. > > Signed-off-by: Vasco Almeida> --- > advice.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/advice.c b/advice.c > index 4dc5cf1..703a847 100644 > --- a/advice.c > +++ b/advice.c > @@ -107,13 +107,13 @@ void NORETURN die_conclude_merge(void) > void detach_advice(const char *new_name) > { > const char fmt[] = > - "Note: checking out '%s'.\n\n" > + N_("Note: checking out '%s'.\n\n" > "You are in 'detached HEAD' state. You can look around, make > experimental\n" > "changes and commit them, and you can discard any commits you make in > this\n" > "state without impacting any branches by performing another > checkout.\n\n" > "If you want to create a new branch to retain commits you create, you > may\n" > "do so (now or later) by using -b with the checkout command again. > Example:\n\n" > - " git checkout -b \n\n"; > + " git checkout -b \n\n"); > > fprintf(stderr, fmt, new_name); > } > I just realized this does nothing but letting xgettext extract the string. Does not allow interpolating the actual translation. Hence, the translator would translate this text but the translation would never be used. There are 2 solutions: 1) use const char *fmt = _("...") instead 2) still use N_() like this patch but do fprintf(stderr, _(fmt), new_name); to trigger retrieving the translation when printing the message. Option 1) it most common on git source and 2) is used primarily during command line option parse (see usage_with_options_internal() in parse-options.c). I prefer the solution 1) and I'll choose it if there are no objections. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v3 0/6] worktree lock/unlock
v3 changes almost all patches so I didn't bother with interdiff [1/6] worktree.c: add find_worktree() This replaces the old find_wortree_by_path(). It now takes both prefix and command line argument so it can do some intelligent things with them if needed. [2/6] worktree.c: find_worktree() learns to identify worktrees by basename This is from Eric's suggestion. Here we only care about the "argument" argument and ignore the "prefix" argument from 1/6 [3/6] worktree.c: add is_main_worktree() This is the only unchanged patch since v2! [4/6] worktree.c: retrieve lock status (and optionally reason) in get_worktrees() This used to be is_worktree_locked() [5/6] worktree: add "lock" command Besides Eric's comments in v2, the syntax line "worktree lock " is changed to "worktree lock ", where could be either a path, or a base name, or something more in future. I'm aware that we don't like showing "worktrees" in user documents, favoring working trees. But I think is ok in this case. [6/6] worktree: add "unlock" command Pretty much addressing comments from v2 and some more changes because of the new 1/6 and 4/6. Documentation/git-worktree.txt | 33 + builtin/worktree.c | 66 ++ contrib/completion/git-completion.bash | 5 ++- t/t2028-worktree-move.sh (new +x) | 62 worktree.c | 53 +++ worktree.h | 14 6 files changed, 226 insertions(+), 7 deletions(-) create mode 100755 t/t2028-worktree-move.sh -- 2.8.2.524.g6ff3d78 -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v3 5/6] worktree: add "lock" command
Helped-by: Eric SunshineSigned-off-by: Nguyễn Thái Ngọc Duy --- Documentation/git-worktree.txt | 28 +++- builtin/worktree.c | 38 +++ contrib/completion/git-completion.bash | 5 +++- t/t2028-worktree-move.sh (new +x) | 48 ++ 4 files changed, 112 insertions(+), 7 deletions(-) create mode 100755 t/t2028-worktree-move.sh diff --git a/Documentation/git-worktree.txt b/Documentation/git-worktree.txt index 27feff6..6fca6cf 100644 --- a/Documentation/git-worktree.txt +++ b/Documentation/git-worktree.txt @@ -11,6 +11,7 @@ SYNOPSIS [verse] 'git worktree add' [-f] [--detach] [--checkout] [-b ] [] 'git worktree list' [--porcelain] +'git worktree lock' [--reason ] 'git worktree prune' [-n] [-v] [--expire ] DESCRIPTION @@ -38,9 +39,8 @@ section "DETAILS" for more information. If a linked working tree is stored on a portable device or network share which is not always mounted, you can prevent its administrative files from -being pruned by creating a file named 'locked' alongside the other -administrative files, optionally containing a plain text reason that -pruning should be suppressed. See section "DETAILS" for more information. +being pruned by issuing the `git worktree lock` command, optionally +specifying `--reason` to explain why the working tree is locked. COMMANDS @@ -61,6 +61,14 @@ each of the linked worktrees. The output details include if the worktree is bare, the revision currently checked out, and the branch currently checked out (or 'detached HEAD' if none). +lock:: + +If a working tree is on a portable device or network share which +is not always mounted, lock it to prevent its administrative +files from being pruned automatically. This also prevents it from +being moved or deleted. Optionally, specify a reason for the lock +with `--reason`. + prune:: Prune working tree information in $GIT_DIR/worktrees. @@ -110,6 +118,15 @@ OPTIONS --expire :: With `prune`, only expire unused working trees older than . +--reason : + With `lock`, an explanation why the worktree is locked. + +:: + Working trees can be identified by path, either relative or + absolute. Alternatively, if the last component in the + worktree's path is unique among working trees, it can be used + as well. + DETAILS --- Each linked working tree has a private sub-directory in the repository's @@ -150,7 +167,8 @@ instead. To prevent a $GIT_DIR/worktrees entry from being pruned (which can be useful in some situations, such as when the -entry's working tree is stored on a portable device), add a file named +entry's working tree is stored on a portable device), use the +`git worktree lock` command, which adds a file named 'locked' to the entry's directory. The file contains the reason in plain text. For example, if a linked working tree's `.git` file points to `/path/main/.git/worktrees/test-next` then a file named @@ -226,8 +244,6 @@ performed manually, such as: - `remove` to remove a linked working tree and its administrative files (and warn if the working tree is dirty) - `mv` to move or rename a working tree and update its administrative files -- `lock` to prevent automatic pruning of administrative files (for instance, - for a working tree on a portable device) GIT --- diff --git a/builtin/worktree.c b/builtin/worktree.c index f9dac37..84fe63b 100644 --- a/builtin/worktree.c +++ b/builtin/worktree.c @@ -14,6 +14,7 @@ static const char * const worktree_usage[] = { N_("git worktree add [] []"), N_("git worktree list []"), + N_("git worktree lock [] "), N_("git worktree prune []"), NULL }; @@ -459,6 +460,41 @@ static int list(int ac, const char **av, const char *prefix) return 0; } +static int lock_worktree(int ac, const char **av, const char *prefix) +{ + const char *reason = "", *old_reason; + struct option options[] = { + OPT_STRING(0, "reason", , N_("string"), + N_("reason for locking")), + OPT_END() + }; + struct worktree **worktrees, *wt; + + ac = parse_options(ac, av, prefix, options, worktree_usage, 0); + if (ac != 1) + usage_with_options(worktree_usage, options); + + worktrees = get_worktrees(); + wt = find_worktree(worktrees, prefix, av[0]); + if (!wt) + die(_("'%s' is not a working directory"), av[0]); + if (is_main_worktree(wt)) + die(_("'%s' is a main working directory"), av[0]); + + old_reason = wt->lock_reason; + if (old_reason) { + if (*old_reason) + die(_("'%s' is already locked, reason: %s"), + av[0], old_reason); + die(_("'%s' is already locked"), av[0]); + } + +
[PATCH v3 4/6] worktree.c: retrieve lock status (and optionally reason) in get_worktrees()
We need this later to avoid double locking a worktree, or unlocking one when it's not even locked. Signed-off-by: Nguyễn Thái Ngọc Duy--- worktree.c | 12 worktree.h | 1 + 2 files changed, 13 insertions(+) diff --git a/worktree.c b/worktree.c index 6432eec..c6a64b1 100644 --- a/worktree.c +++ b/worktree.c @@ -98,6 +98,7 @@ static struct worktree *get_main_worktree(void) worktree->is_detached = is_detached; worktree->is_current = 0; add_head_info(_ref, worktree); + worktree->lock_reason = NULL; done: strbuf_release(); @@ -144,6 +145,17 @@ static struct worktree *get_linked_worktree(const char *id) worktree->is_current = 0; add_head_info(_ref, worktree); + strbuf_reset(); + strbuf_addf(, "%s/worktrees/%s/locked", get_git_common_dir(), id); + if (file_exists(path.buf)) { + struct strbuf lock_reason = STRBUF_INIT; + if (strbuf_read_file(_reason, path.buf, 0) < 0) + die_errno(_("failed to read '%s'"), path.buf); + strbuf_trim(_reason); + worktree->lock_reason = strbuf_detach(_reason, NULL); + } else + worktree->lock_reason = NULL; + done: strbuf_release(); strbuf_release(_path); diff --git a/worktree.h b/worktree.h index e1c4715..9932710 100644 --- a/worktree.h +++ b/worktree.h @@ -5,6 +5,7 @@ struct worktree { char *path; char *id; char *head_ref; + char *lock_reason; /* NULL means not locked */ unsigned char head_sha1[20]; int is_detached; int is_bare; -- 2.8.2.524.g6ff3d78 -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v3 2/6] worktree.c: find_worktree() learns to identify worktrees by basename
This allows the user to do something like "worktree lock foo" instead of "worktree lock ". With completion support it could be quite convenient. While this base name search can be done in the same worktree iteration loop, the code is split into a separate function for clarity. Suggested-by: Eric SunshineSigned-off-by: Nguyễn Thái Ngọc Duy --- worktree.c | 21 + 1 file changed, 21 insertions(+) diff --git a/worktree.c b/worktree.c index 0782e00..4dd7b77 100644 --- a/worktree.c +++ b/worktree.c @@ -214,12 +214,33 @@ const char *get_worktree_git_dir(const struct worktree *wt) return git_common_path("worktrees/%s", wt->id); } +static struct worktree *find_worktree_by_basename(struct worktree **list, + const char *base_name) +{ + struct worktree *found = NULL; + int nr_found = 0; + + for (; *list && nr_found < 2; list++) { + char *path = xstrdup((*list)->path); + if (!fspathcmp(base_name, basename(path))) { + found = *list; + nr_found++; + } + free(path); + } + return nr_found == 1 ? found : NULL; +} + struct worktree *find_worktree(struct worktree **list, const char *prefix, const char *arg) { + struct worktree *wt; char *path; + if ((wt = find_worktree_by_basename(list, arg))) + return wt; + arg = prefix_filename(prefix, strlen(prefix), arg); path = xstrdup(real_path(arg)); for (; *list; list++) -- 2.8.2.524.g6ff3d78 -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v3 1/6] worktree.c: add find_worktree()
So far we haven't needed to identify an existing worktree from command line. Future commands such as lock or move will need it. The current implementation identifies worktrees by path (*). In future, the function could learn to identify by $(basename $path) or tags... (*) We could probably go cheaper with comparing inode number (and probably more reliable than paths when unicode enters the game). But not all systems have good inode that so let's stick to something simple for now. Helped-by: Eric SunshineSigned-off-by: Nguyễn Thái Ngọc Duy --- worktree.c | 15 +++ worktree.h | 8 2 files changed, 23 insertions(+) diff --git a/worktree.c b/worktree.c index f4a4f38..0782e00 100644 --- a/worktree.c +++ b/worktree.c @@ -214,6 +214,21 @@ const char *get_worktree_git_dir(const struct worktree *wt) return git_common_path("worktrees/%s", wt->id); } +struct worktree *find_worktree(struct worktree **list, + const char *prefix, + const char *arg) +{ + char *path; + + arg = prefix_filename(prefix, strlen(prefix), arg); + path = xstrdup(real_path(arg)); + for (; *list; list++) + if (!fspathcmp(path, real_path((*list)->path))) + break; + free(path); + return *list; +} + int is_worktree_being_rebased(const struct worktree *wt, const char *target) { diff --git a/worktree.h b/worktree.h index 1394909..7ad15da 100644 --- a/worktree.h +++ b/worktree.h @@ -29,6 +29,14 @@ extern struct worktree **get_worktrees(void); */ extern const char *get_worktree_git_dir(const struct worktree *wt); +/* + * Search a worktree that can be unambiguously identified by + * "arg". "prefix" must not be NULL. + */ +extern struct worktree *find_worktree(struct worktree **list, + const char *prefix, + const char *arg); + /* * Free up the memory for worktree(s) */ -- 2.8.2.524.g6ff3d78 -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v3 3/6] worktree.c: add is_main_worktree()
Main worktree _is_ different. You can lock a linked worktree but not the main one, for example. Provide an API for checking that. Signed-off-by: Nguyễn Thái Ngọc Duy--- worktree.c | 5 + worktree.h | 5 + 2 files changed, 10 insertions(+) diff --git a/worktree.c b/worktree.c index 4dd7b77..6432eec 100644 --- a/worktree.c +++ b/worktree.c @@ -250,6 +250,11 @@ struct worktree *find_worktree(struct worktree **list, return *list; } +int is_main_worktree(const struct worktree *wt) +{ + return !wt->id; +} + int is_worktree_being_rebased(const struct worktree *wt, const char *target) { diff --git a/worktree.h b/worktree.h index 7ad15da..e1c4715 100644 --- a/worktree.h +++ b/worktree.h @@ -37,6 +37,11 @@ extern struct worktree *find_worktree(struct worktree **list, const char *prefix, const char *arg); +/* + * Return true if the given worktree is the main one. + */ +extern int is_main_worktree(const struct worktree *wt); + /* * Free up the memory for worktree(s) */ -- 2.8.2.524.g6ff3d78 -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v3 6/6] worktree: add "unlock" command
Signed-off-by: Nguyễn Thái Ngọc Duy--- Documentation/git-worktree.txt | 5 + builtin/worktree.c | 28 contrib/completion/git-completion.bash | 2 +- t/t2028-worktree-move.sh | 14 ++ 4 files changed, 48 insertions(+), 1 deletion(-) diff --git a/Documentation/git-worktree.txt b/Documentation/git-worktree.txt index 6fca6cf..6a5ff47 100644 --- a/Documentation/git-worktree.txt +++ b/Documentation/git-worktree.txt @@ -13,6 +13,7 @@ SYNOPSIS 'git worktree list' [--porcelain] 'git worktree lock' [--reason ] 'git worktree prune' [-n] [-v] [--expire ] +'git worktree unlock' DESCRIPTION --- @@ -73,6 +74,10 @@ prune:: Prune working tree information in $GIT_DIR/worktrees. +unlock:: + +Unlock a worktree, allowing it to be pruned, moved or deleted. + OPTIONS --- diff --git a/builtin/worktree.c b/builtin/worktree.c index 84fe63b..5fe9323 100644 --- a/builtin/worktree.c +++ b/builtin/worktree.c @@ -16,6 +16,7 @@ static const char * const worktree_usage[] = { N_("git worktree list []"), N_("git worktree lock [] "), N_("git worktree prune []"), + N_("git worktree unlock "), NULL }; @@ -495,6 +496,31 @@ static int lock_worktree(int ac, const char **av, const char *prefix) return 0; } +static int unlock_worktree(int ac, const char **av, const char *prefix) +{ + struct option options[] = { + OPT_END() + }; + struct worktree **worktrees, *wt; + int ret; + + ac = parse_options(ac, av, prefix, options, worktree_usage, 0); + if (ac != 1) + usage_with_options(worktree_usage, options); + + worktrees = get_worktrees(); + wt = find_worktree(worktrees, prefix, av[0]); + if (!wt) + die(_("'%s' is not a working directory"), av[0]); + if (is_main_worktree(wt)) + die(_("'%s' is a main working directory"), av[0]); + if (!wt->lock_reason) + die(_("'%s' is not locked"), av[0]); + ret = unlink_or_warn(git_common_path("worktrees/%s/locked", wt->id)); + free_worktrees(worktrees); + return ret; +} + int cmd_worktree(int ac, const char **av, const char *prefix) { struct option options[] = { @@ -513,5 +539,7 @@ int cmd_worktree(int ac, const char **av, const char *prefix) return list(ac - 1, av + 1, prefix); if (!strcmp(av[1], "lock")) return lock_worktree(ac - 1, av + 1, prefix); + if (!strcmp(av[1], "unlock")) + return unlock_worktree(ac - 1, av + 1, prefix); usage_with_options(worktree_usage, options); } diff --git a/contrib/completion/git-completion.bash b/contrib/completion/git-completion.bash index f88727d..0e3841d 100644 --- a/contrib/completion/git-completion.bash +++ b/contrib/completion/git-completion.bash @@ -2597,7 +2597,7 @@ _git_whatchanged () _git_worktree () { - local subcommands="add list lock prune" + local subcommands="add list lock prune unlock" local subcommand="$(__git_find_on_cmdline "$subcommands")" if [ -z "$subcommand" ]; then __gitcomp "$subcommands" diff --git a/t/t2028-worktree-move.sh b/t/t2028-worktree-move.sh index 87afc2e..68d3fe8 100755 --- a/t/t2028-worktree-move.sh +++ b/t/t2028-worktree-move.sh @@ -45,4 +45,18 @@ test_expect_success 'lock worktree twice (from the locked worktree)' ' test_cmp expected .git/worktrees/source/locked ' +test_expect_success 'unlock main worktree' ' + test_must_fail git worktree unlock . +' + +test_expect_success 'unlock linked worktree' ' + git worktree unlock source && + test_path_is_missing .git/worktrees/source/locked +' + +test_expect_success 'unlock worktree twice' ' + test_must_fail git worktree unlock source && + test_path_is_missing .git/worktrees/source/locked +' + test_done -- 2.8.2.524.g6ff3d78 -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] push: deny policy to prevent pushes to unwanted remotes.
Currently, a user wanting to prevent accidental pushes to the wrong remote has to create a pre-push hook. The feature offers a configuration to allow users to prevent accidental pushes to the wrong remote. The user may define a list of whitelisted remotes, a list of blacklisted remotes and a default policy ("allow" or "deny"). A push is denied if the remote is explicitely blacklisted or if it isn't whitelisted and the default policy is "deny". This feature is intended as a safety net more than a real security, the user will always be able to modify the config if he wants to. It is here for him to consciously restrict his push possibilities. For example, it may be useful for an unexperimented user fearing to push to the wrong remote, or for companies wanting to avoid unintentionnal leaking of private code on public repositories. By default the whitelist/blacklist feature is disabled since the default policy is "allow". Signed-off-by: Antoine QueruSigned-off-by: Francois Beutin Signed-off-by: Matthieu Moy --- This is the first implementation of the feature proposed in SoCG 2016. The conversation about it can be found here: http://thread.gmane.org/gmane.comp.version-control.git/295166 and http://thread.gmane.org/gmane.comp.version-control.git/289749. We have included in cc all the participants in the previous conversation. Documentation/config.txt| 17 +++ Documentation/git-push.txt | 32 + builtin/push.c | 75 --- t/t5544-push-whitelist-blacklist.sh | 90 + 4 files changed, 209 insertions(+), 5 deletions(-) create mode 100755 t/t5544-push-whitelist-blacklist.sh diff --git a/Documentation/config.txt b/Documentation/config.txt index 53f00db..1478ce3 100644 --- a/Documentation/config.txt +++ b/Documentation/config.txt @@ -2517,6 +2517,23 @@ remote.pushDefault:: `branch..remote` for all branches, and is overridden by `branch..pushRemote` for specific branches. +remote.pushBlacklisted:: + The list of remotes the user is forbidden to push to. + See linkgit:git-push[1] + +remote.pushWhitelisted:: + The list of remotes the user is allowed to push to. + See linkgit:git-push[1] + +remote.pushDefaultPolicy:: + Can be set to either 'allow' or 'deny', defines the policy to adopt + when the user tries to push to a remote not in the whitelist or + the blacklist. Defaults to 'allow'. + +remote.pushDenyMessage:: + The message printed when pushing to a forbidden remote. + Defaults to "Pushing to this remote has been forbidden". + remote..url:: The URL of a remote repository. See linkgit:git-fetch[1] or linkgit:git-push[1]. diff --git a/Documentation/git-push.txt b/Documentation/git-push.txt index cf6ee4a..644bfde 100644 --- a/Documentation/git-push.txt +++ b/Documentation/git-push.txt @@ -368,6 +368,38 @@ reason:: refs, no explanation is needed. For a failed ref, the reason for failure is described. + +Note about denying pushes to the wrong remotes +-- + +If you fear accidental pushes to the wrong remotes, you can use the +blacklist/whitelist feature. The goal is to catch pushes to unwanted +remotes before they happen. + +The simplest way to forbid pushes to a remote is to add its url in the +config file under the 'remote.pushBlacklisted' variable. +A more restrictive way is to change 'remote.pushDefaultPolicy' to 'deny', +thus denying pushes to every remote not whitelisted. You can then add +the right remotes under 'remote.pushWhitelisted'. + +You can also configure more advanced acceptation/denial behavior following +this rule: the more the url in the config prefixes the asked url the more +priority it has. + +For example, if we set up the configuration variables like this: + git config --add remote.pushBlacklisted repository.com + git config --add remote.pushWhitelisted repository.com/Special_Folder +Any push of this form will be accepted: + git push repository.com/Special_Folder/foo +While those ones for example will be denied: + git push repository.com/Normal_Folder/bar + +Additionally, you can configure the message printed when a push is denied +with the 'remote.pushDenyMessage' configuration variable. + +An error will be raised if the url is blacklisted and whitelisted at the same. + + Note about fast-forwards diff --git a/builtin/push.c b/builtin/push.c index 4e9e4db..0aa1f7d 100644 --- a/builtin/push.c +++ b/builtin/push.c @@ -11,6 +11,8 @@ #include "submodule.h" #include "submodule-config.h" #include "send-pack.h" +#include "urlmatch.h" +#include "url.h" static const char * const push_usage[] = { N_("git push [] [ [...]]"), @@ -353,6 +355,68 @@ static int
[PATCH 1/2] strbuf: add tests
Test the strbuf API. Being used throughout all Git the API could be considered tested, but adding specific tests makes it easier to improve and extend the API. --- Makefile | 1 + t/helper/test-strbuf.c | 69 ++ t/t0082-strbuf.sh | 19 ++ 3 files changed, 89 insertions(+) create mode 100644 t/helper/test-strbuf.c create mode 100755 t/t0082-strbuf.sh diff --git a/Makefile b/Makefile index 3f03366..dc84f43 100644 --- a/Makefile +++ b/Makefile @@ -613,6 +613,7 @@ TEST_PROGRAMS_NEED_X += test-scrap-cache-tree TEST_PROGRAMS_NEED_X += test-sha1 TEST_PROGRAMS_NEED_X += test-sha1-array TEST_PROGRAMS_NEED_X += test-sigchain +TEST_PROGRAMS_NEED_X += test-strbuf TEST_PROGRAMS_NEED_X += test-string-list TEST_PROGRAMS_NEED_X += test-submodule-config TEST_PROGRAMS_NEED_X += test-subprocess diff --git a/t/helper/test-strbuf.c b/t/helper/test-strbuf.c new file mode 100644 index 000..622f627 --- /dev/null +++ b/t/helper/test-strbuf.c @@ -0,0 +1,69 @@ +#include "git-compat-util.h" +#include "strbuf.h" + +/* + * Check behavior on usual use cases + */ +int test_usual(struct strbuf *sb) +{ + size_t size, old_alloc; + char *res, *old_buf, *str_test = malloc(5*sizeof(char)); + strbuf_grow(sb, 1); + strcpy(str_test, "test"); + old_alloc = sb->alloc; + strbuf_grow(sb, 1000); + if (old_alloc == sb->alloc) + die("strbuf_grow does not realloc the buffer as expected"); + old_buf = sb->buf; + res = strbuf_detach(sb, ); + if (res != old_buf) + die("strbuf_detach does not return the expected buffer"); + free(res); + + strcpy(str_test, "test"); + strbuf_attach(sb, (void *)str_test, strlen(str_test), sizeof(str_test)); + res = strbuf_detach(sb, ); + if (res != str_test) + die("strbuf_detach does not return the expected buffer"); + free(res); + strbuf_release(sb); + + return 0; +} + +int main(int argc, char *argv[]) +{ + size_t size = 1; + struct strbuf sb; + char str_test[5] = "test"; + char str_foo[7] = "foo"; + + if (argc != 2) + usage("test-strbuf mode"); + + if (!strcmp(argv[1], "basic_grow")) { + /* +* Check if strbuf_grow(0) allocate a new NUL-terminated buffer +*/ + strbuf_init(, 0); + strbuf_grow(, 0); + if (sb.buf == strbuf_slopbuf) + die("strbuf_grow failed to alloc memory"); + strbuf_release(); + if (sb.buf != strbuf_slopbuf) + die("strbuf_release does not reinitialize the strbuf"); + } else if (!strcmp(argv[1], "strbuf_check_behavior")) { + strbuf_init(, 0); + return test_usual(); + } else if (!strcmp(argv[1], "grow_overflow")) { + /* +* size_t overflow: should die() +*/ + strbuf_init(, 1000); + strbuf_grow(, maximum_unsigned_value_of_type((size_t)1)); + } else { + usage("test-strbuf mode"); + } + + return 0; +} diff --git a/t/t0082-strbuf.sh b/t/t0082-strbuf.sh new file mode 100755 index 000..0800d26 --- /dev/null +++ b/t/t0082-strbuf.sh @@ -0,0 +1,19 @@ +#!/bin/sh + +test_description="Test the strbuf API. +" +. ./test-lib.sh + +test_expect_success 'basic test on strbuf_grow()' ' + test-strbuf basic_grow +' + +test_expect_success 'check strbuf behavior in usual use cases ' ' + test-strbuf strbuf_check_behavior +' + +test_expect_success 'overflow while calling strbuf_grow' ' + test_must_fail test-strbuf grow_overflow +' + +test_done -- 2.8.2.403.ge2646ba.dirty -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 2/2] strbuf: allow to use preallocated memory
It is unfortunate that it is currently impossible to use a strbuf without doing a memory allocation. So code like void f() { char path[PATH_MAX]; ... } typically gets turned into either void f() { struct strbuf path; strbuf_add(, ...); <-- does a malloc ... strbuf_release(); <-- does a free } which costs extra memory allocations, or void f() { static struct strbuf path; strbuf_add(, ...); ... strbuf_setlen(, 0); } which, by using a static variable, avoids most of the malloc/free overhead, but makes the function unsafe to use recursively or from multiple threads. Those limitations prevent strbuf to be used in performance-critical operations. THE IDEA The idea here is to enhance strbuf to allow it to use memory that it doesn't own (for example, stack-allocated memory), while (optionally) allowing it to switch over to using allocated memory if the string grows past the size of the pre-allocated buffer. API ENHANCEMENT --- All functions of the API can still be reliably called without knowledge of the initialization (normal/preallocated/fixed) with the exception that strbuf_grow() may die() if the string try to overflow a fixed buffer. The API contract is still respected: - The API users may peek strbuf.buf in-place until they perform an operation that makes it longer (at which point the .buf pointer may point at a new piece of memory). - The API users may strbuf_detach() to obtain a piece of memory that belongs to them (at which point the strbuf becomes empty), hence needs to be freed by the callers. Full credit to Michael Haggerty for the idea and most of the wording of this commit (http://mid.gmane.org/53512db6.1070...@alum.mit.edu). The implementation and bugs are all mine. Signed-off by: William DuclotSigned-off by: Simon Rabourg Signed-off by: Matthieu Moy --- strbuf.c | 68 ++ strbuf.h | 31 +-- t/helper/test-strbuf.c | 42 +++ t/t0082-strbuf.sh | 28 + 4 files changed, 162 insertions(+), 7 deletions(-) diff --git a/strbuf.c b/strbuf.c index 1ba600b..527b986 100644 --- a/strbuf.c +++ b/strbuf.c @@ -1,6 +1,14 @@ #include "cache.h" #include "refs.h" #include "utf8.h" +#include + +/** + * Flags + * -- + */ +#define STRBUF_OWNS_MEMORY 1 +#define STRBUF_FIXED_MEMORY (1 << 1) int starts_with(const char *str, const char *prefix) { @@ -20,16 +28,37 @@ char strbuf_slopbuf[1]; void strbuf_init(struct strbuf *sb, size_t hint) { + sb->flags = 0; sb->alloc = sb->len = 0; sb->buf = strbuf_slopbuf; if (hint) strbuf_grow(sb, hint); } +void strbuf_wrap_preallocated(struct strbuf *sb, char *path_buf, + size_t path_buf_len, size_t alloc_len) +{ + if (!path_buf) + die("you try to use a NULL buffer to initialize a strbuf"); + + strbuf_init(sb, 0); + strbuf_attach(sb, path_buf, path_buf_len, alloc_len); + sb->flags &= ~STRBUF_OWNS_MEMORY; + sb->flags &= ~STRBUF_FIXED_MEMORY; +} + +void strbuf_wrap_fixed(struct strbuf *sb, char *path_buf, + size_t path_buf_len, size_t alloc_len) +{ + strbuf_wrap_preallocated(sb, path_buf, path_buf_len, alloc_len); + sb->flags |= STRBUF_FIXED_MEMORY; +} + void strbuf_release(struct strbuf *sb) { if (sb->alloc) { - free(sb->buf); + if (sb->flags & STRBUF_OWNS_MEMORY) + free(sb->buf); strbuf_init(sb, 0); } } @@ -38,7 +67,11 @@ char *strbuf_detach(struct strbuf *sb, size_t *sz) { char *res; strbuf_grow(sb, 0); - res = sb->buf; + if (sb->flags & STRBUF_OWNS_MEMORY) + res = sb->buf; + else + res = xmemdupz(sb->buf, sb->alloc - 1); + if (sz) *sz = sb->len; strbuf_init(sb, 0); @@ -51,6 +84,8 @@ void strbuf_attach(struct strbuf *sb, void *buf, size_t len, size_t alloc) sb->buf = buf; sb->len = len; sb->alloc = alloc; + sb->flags |= STRBUF_OWNS_MEMORY; + sb->flags &= ~STRBUF_FIXED_MEMORY; strbuf_grow(sb, 0); sb->buf[sb->len] = '\0'; } @@ -61,9 +96,32 @@ void strbuf_grow(struct strbuf *sb, size_t extra) if (unsigned_add_overflows(extra, 1) || unsigned_add_overflows(sb->len, extra + 1)) die("you want to use way too much memory"); - if (new_buf) - sb->buf = NULL; - ALLOC_GROW(sb->buf, sb->len + extra + 1, sb->alloc); + if ((sb->flags & STRBUF_FIXED_MEMORY) && sb->len + extra + 1 > sb->alloc) + die("you try to make a string overflow the buffer of a fixed strbuf"); + +
[PATCH 0/2] strbuf: improve API
This patch series implements an improvment of the strbuf API, allowing strbuf to use preallocated memory. This makes strbuf fit to be used in performance-critical operations. The first patch is simply a preparatory work, adding tests for existing strbuf implementation. Most of the work is made in the second patch: handle pre-allocated memory, extend the API, document it and test it. As said in the second commit, the idea comes from Michael Haggerty. William Duclot (2): strbuf: add tests strbuf: allow to use preallocated memory Makefile | 1 + strbuf.c | 68 +++- strbuf.h | 31 +-- t/helper/test-strbuf.c | 111 +++ t/t0082-strbuf.sh | 47 +++ 5 files changed, 251 insertions(+), 7 deletions(-) -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 1/5] worktree.c: add find_worktree_by_path()
On Mon, May 23, 2016 at 11:11 AM, Eric Sunshinewrote: > On Sun, May 22, 2016 at 6:43 AM, Nguyễn Thái Ngọc Duy > wrote: >> So far we haven't needed to identify an existing worktree from command >> line. Future commands such as lock or move will need it. There are of >> course other options for identifying a worktree, for example by branch >> or even by internal id. They may be added later if proved useful. > > Beyond the above methods for specifying a worktree, [1] adds > $(basename $path) as a possibility if not ambiguous. Excerpt: > > For git-worktree commands such as "lock", "mv", "remove", it > likely will be nice to allow people to specify the linked > worktree not only by path, but also by tag, and possibly even by > $(basename $path) if not ambiguous. Thanks. This changes things. Turns out my "find_by_path" API cannot be easily extended to support identifying worktrees some other way. I will need to pass the command line argument as-is, so that this "find_worktree" (by any means) has enough info to decide (especially about ambiguation). -- Duy -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[RFC] Triangular Workflow UI improvments
> Le 27 mai 2016 à 09:32, Philip Oakleya écrit : > For me, the first step would be to actually document a (the?) Triangular > Workflow in the documentation, so we are all taking about the same broad > method. > > At the moment there is a choice (assuming a ithub like service) of either > clone and then fork, or fork and clone the fork, which leave the user with > different fixups of their config's being required, so describing the easier > one would help folk. > > Likewise there are missing terms such as for the third place (the personal > fork) that is neither the upstream, nor the local repo. Making sure the > terminology is crisp and clean will greatly ease any implementation issues. > And then there are the possible workflows... Yes, you’re right. Writing a complete tutorial is a good idea. I start this job and the content will depend on the result of this discussion. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[RFC] Triangular Workflow UI improvement
>> We are working on full implementation of triangular workflow feature. >> For now, the main options available are: >> - branch..pushRemote >> - remote.pushDefault >> And only setable by hands. > > And once it is set, you do not have to worry about it. I am not > sure per-branch thing is all that useful, unless you are always > working on a single branch like 'master', but the latter would be > just set once and forget about it. We got the example of our fork of git(e.g. GitHub). We want : - some branches that only fetch from git/git and push to our/git - the others branches fetch from and push to our/git. For now, we have to change branch..remote and branch..pushRemote(or remote.pushDefault). With branch..fetchRemote, we only have to set this one. fetchRemote is more explicit than remote for that case. Another point is that many commands can erase the old value of branch..remote. >> Context: >> - One main remote repository, e.g. git/git. >> - A remote fork (e.g. a GitHub fork) of git/git, e.g. me/git. >> - A local clone of me/git on the machine >> Purposes: >> - the local branch master has to fetch to git/git by default >> - the local branch master has to push to me/git by default > > Wouldn't remote.pushDefault be the single thing you need to set just > once and forget about it? Why would your users even want to do > these things … remote.pushDefault overrides branch..remote for all branches. The goal is to give an easily understandable config for complex configuration. Having all the configuration in the same part (e.g [branch « master »]) is easier to understand and edit. > >> c. add `git fetch --set-default` in order to set remote.fetchDefault >> d. add `git fetch --set-remote` in order to set >> branch..fetchRemote >> e. add `git pull --set-default` in order to set remote.fetchDefault >> f. add `git pull --set-remote` in order to set branch..fetchRemote >> a. add `git push --set-default` in order to set remote.pushDefault >> b. add `git push --set-remote` in order to set branch..pushRemote > > ... just to configure many variables every time they work on a new > branch? branch..pushRemote and all these options are optional. Options in commands may not be added. It just gives user friendly interaction with these options. To conclude: This feature is more about configuration clarity than possibility itself. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v3 2/3] perf: make the tests work in worktrees
Hi René, On Sun, 29 May 2016, René Scharfe wrote: > Am 13.05.2016 um 15:25 schrieb Johannes Schindelin: > > This patch makes perf-lib.sh more robust so that it can run correctly > > even inside a worktree. For example, it assumed that $GIT_DIR/objects is > > the objects directory (which is not the case for worktrees) and it used > > the commondir file verbatim, even if it contained a relative path. > > > > Furthermore, the setup code expected `git rev-parse --git-dir` to spit > > out a relative path, which is also not true for worktrees. Let's just > > change the code to accept both relative and absolute paths, by avoiding > > the `cd` into the copied working directory. > > > > Signed-off-by: Johannes Schindelin> > --- > > t/perf/perf-lib.sh | 14 +++--- > > 1 file changed, 7 insertions(+), 7 deletions(-) > > > > diff --git a/t/perf/perf-lib.sh b/t/perf/perf-lib.sh > > index 9fa0706..5ef1744 100644 > > --- a/t/perf/perf-lib.sh > > +++ b/t/perf/perf-lib.sh > > @@ -80,22 +80,22 @@ test_perf_create_repo_from () { > > error "bug in the test script: not 2 parameters to test-create-repo" > > repo="$1" > > source="$2" > > - source_git=$source/$(cd "$source" && git rev-parse --git-dir) > > + source_git="$(git -C "$source" rev-parse --git-dir)" > > + objects_dir="$(git -C "$source" rev-parse --git-path objects)" > > mkdir -p "$repo/.git" > > ( > > - cd "$repo/.git" && > > - { cp -Rl "$source_git/objects" . 2>/dev/null || > > - cp -R "$source_git/objects" .; } && > > + { cp -Rl "$objects_dir" "$repo/.git/" 2>/dev/null || > > + cp -R "$objects_dir" "$repo/.git/"; } && > > for stuff in "$source_git"/*; do > > case "$stuff" in > > - */objects|*/hooks|*/config) > > + */objects|*/hooks|*/config|*/commondir) > > ;; > > *) > > - cp -R "$stuff" . || exit 1 > > + cp -R "$stuff" "$repo/.git/" || exit 1 > > ;; > > esac > > done && > > - cd .. && > > + cd "$repo" && > > git init -q && { > > test_have_prereq SYMLINKS || > > git config core.symlinks false > > > > This breaks perf for the non-worktree case: Oh drats! > lsr@debian:~/src/git/t/perf$ make > rm -rf test-results > ./run > === Running 12 tests in this tree === > cp: cannot stat '.git/objects': No such file or directory > error: failed to copy repository '/home/lsr/src/git/t/..' to '/tmp/trash > directory.p-perf-lib-sanity' > cp: cannot stat '.git/objects': No such file or directory > error: failed to copy repository '/home/lsr/src/git/t/..' to '/tmp/trash > directory.p0001-rev-list' > ... > > Here's a fix: > > -- >8 -- > Subject: perf: make the tests work without a worktree > > In regular repositories $source_git and $objects_dir contain relative > paths based on $source. Go there to allow cp to resolve them. > > Signed-off-by: Rene Scharfe > --- > t/perf/perf-lib.sh | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/t/perf/perf-lib.sh b/t/perf/perf-lib.sh > index 5ef1744..1888790 100644 > --- a/t/perf/perf-lib.sh > +++ b/t/perf/perf-lib.sh > @@ -84,6 +84,7 @@ test_perf_create_repo_from () { > objects_dir="$(git -C "$source" rev-parse --git-path objects)" > mkdir -p "$repo/.git" > ( > + cd "$source" && I fear that interacts badly with the `cd "$repo"` I introduced later (replacing a `cd ..`)... Ciao, Dscho
[RFC] Triangular Workflow UI improvement
In my last message, I forgot to add some important contributors linked to this feature. Some of you already replied to me. I will answer shortly. Sorry for the noise. > > We are working on full implementation of triangular workflow feature. > For now, the main options available are: >- branch..pushRemote >- remote.pushDefault > And only setable by hands. > > As it can be difficult to understand, here is what we want to do. > > > Context: > - One main remote repository, e.g. git/git. > - A remote fork (e.g. a GitHub fork) of git/git, e.g. me/git. > - A local clone of me/git on the machine > > Purposes: > - the local branch master has to fetch to git/git by default > - the local branch master has to push to me/git by default > > Configuration wanted: > - Add a remote to git/git e.g. `git remote add ...` > - Set the fetch remote for branch as default. > > For now, we can do that by setting: > - branch..remote to git/git > - branch..pushRemote to me/git > But many options set `branch..remote`, a suitable solution is to > implement an option for the fetch for example. > > > Here is what we want to implement: > > 1. > a. add the config var: remote.fetchDefault > b. add the config var: branch..fetchRemote > c. add `git fetch --set-default` in order to set remote.fetchDefault > d. add `git fetch --set-remote` in order to set > branch..fetchRemote > e. add `git pull --set-default` in order to set remote.fetchDefault > f. add `git pull --set-remote` in order to set branch..fetchRemote > > 2. > a. add `git push --set-default` in order to set remote.pushDefault > b. add `git push --set-remote` in order to set branch..pushRemote > > > What's your opinion about this feature ? -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 05/13] remote rm: handle symbolic refs correctly
In the modern world of reference backends, it is not OK to delete a symref by unlink()ing the file directly. This must be done via the refs API. We do so by adding the symref to the list of references to delete along with the non-symbolic references, then calling delete_refs() with the new flags option set to REF_NODEREF. Signed-off-by: Michael Haggerty--- builtin/remote.c | 6 +- 1 file changed, 1 insertion(+), 5 deletions(-) diff --git a/builtin/remote.c b/builtin/remote.c index 1bbf9b4..c4b4d67 100644 --- a/builtin/remote.c +++ b/builtin/remote.c @@ -539,10 +539,6 @@ static int add_branch_for_removal(const char *refname, return 0; } - /* make sure that symrefs are deleted */ - if (flags & REF_ISSYMREF) - return unlink(git_path("%s", refname)); - string_list_append(branches->branches, refname); return 0; @@ -788,7 +784,7 @@ static int rm(int argc, const char **argv) strbuf_release(); if (!result) - result = delete_refs(, 0); + result = delete_refs(, REF_NODEREF); string_list_clear(, 0); if (skipped.nr) { -- 2.8.1 -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 02/13] do_for_each_ref(): move docstring to the header file
Signed-off-by: Michael Haggerty--- refs/files-backend.c | 9 - refs/refs-internal.h | 10 +- 2 files changed, 9 insertions(+), 10 deletions(-) diff --git a/refs/files-backend.c b/refs/files-backend.c index 1230dfb..68db3e8 100644 --- a/refs/files-backend.c +++ b/refs/files-backend.c @@ -1878,15 +1878,6 @@ static int do_for_each_entry(struct ref_cache *refs, const char *base, return retval; } -/* - * Call fn for each reference in the specified ref_cache for which the - * refname begins with base. If trim is non-zero, then trim that many - * characters off the beginning of each refname before passing the - * refname to fn. flags can be DO_FOR_EACH_INCLUDE_BROKEN to include - * broken references in the iteration. If fn ever returns a non-zero - * value, stop the iteration and return that value; otherwise, return - * 0. - */ int do_for_each_ref(const char *submodule, const char *base, each_ref_fn fn, int trim, int flags, void *cb_data) { diff --git a/refs/refs-internal.h b/refs/refs-internal.h index 1bb3d87..b4dd545 100644 --- a/refs/refs-internal.h +++ b/refs/refs-internal.h @@ -249,7 +249,15 @@ int rename_ref_available(const char *oldname, const char *newname); #define DO_FOR_EACH_INCLUDE_BROKEN 0x01 /* - * The common backend for the for_each_*ref* functions + * Call fn for each reference in the specified submodule for which the + * refname begins with base. If trim is non-zero, then trim that many + * characters off the beginning of each refname before passing the + * refname to fn. flags can be DO_FOR_EACH_INCLUDE_BROKEN to include + * broken references in the iteration. If fn ever returns a non-zero + * value, stop the iteration and return that value; otherwise, return + * 0. + * + * This is the common backend for the for_each_*ref* functions. */ int do_for_each_ref(const char *submodule, const char *base, each_ref_fn fn, int trim, int flags, void *cb_data); -- 2.8.1 -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 09/13] refs: introduce an iterator interface
Currently, the API for iterating over references is via a family of for_each_ref()-type functions that invoke a callback function for each selected reference. All of these eventually call do_for_each_ref(), which knows how to do one thing: iterate in parallel through two ref_caches, one for loose and one for packed refs, giving loose references precedence over packed refs. This is rather complicated code, and is quite specialized to the files backend. It also requires callers to encapsulate their work into a callback function, which often means that they have to define and use a "cb_data" struct to manage their context. The current design is already bursting at the seams, and will become even more awkward in the upcoming world of multiple reference storage backends: * Per-worktree vs. shared references are currently handled via a kludge in git_path() rather than iterating over each part of the reference namespace separately and merging the results. This kludge will cease to work when we have multiple reference storage backends. * The current scheme is inflexible. What if we sometimes want to bypass the ref_cache, or use it only for packed or only for loose refs? What if we want to store symbolic refs in one type of storage backend and non-symbolic ones in another? In the future, each reference backend will need to define its own way of iterating over references. The crux of the problem with the current design is that it is impossible to compose for_each_ref()-style iterations, because the flow of control is owned by the for_each_ref() function. There is nothing that a caller can do but iterate through all references in a single burst, so there is no way for it to interleave references from multiple backends and present the result to the rest of the world as a single compound backend. This commit introduces a new iteration primitive for references: a ref_iterator. A ref_iterator is a polymorphic object that a reference storage backend can be asked to instantiate. There are three functions that can be applied to a ref_iterator: * ref_iterator_advance(): move to the next reference in the iteration * ref_iterator_abort(): end the iteration before it is exhausted * ref_iterator_peel(): peel the reference currently being looked at Iterating using a ref_iterator leaves the flow of control in the hands of the caller, which means that ref_iterators from multiple sources (e.g., loose and packed refs) can be composed and presented to the world as a single compound ref_iterator. It also means that the backend code for implementing reference iteration will sometimes be more complicated. For example, the cache_ref_iterator (which iterates over a ref_cache) can't use the C stack to recurse; instead, it must manage its own stack internally as explicit data structures. There is also a lot of boilerplate connected with object-oriented programming in C. Eventually, end-user callers will be able to be written in a more natural way—managing their own flow of control rather than having to work via callbacks. Since there will only be a few reference backends but there are many consumers of this API, this is a good tradeoff. More importantly, we gain composability, and especially the possibility of writing interchangeable parts that can work with any ref_iterator. For example, merge_ref_iterator implements a generic way of merging the contents of any two ref_iterators. It is used to merge loose + packed refs as part of the implementation of the files_ref_iterator. But it will also be possible to use it to merge other pairs of reference sources (e.g., per-worktree vs. shared refs). Another example is prefix_ref_iterator, which can be used to trim a prefix off the front of reference names before presenting them to the caller (e.g., "refs/heads/master" -> "master"). In this patch, we introduce the iterator abstraction and many utilities, and implement a reference iterator for the files ref storage backend. (I've written several other obvious utilities, for example a generic way to filter references being iterated over. These will probably be useful in the future. But they are not needed for this patch series, so I am not including them at this time.) In a moment we will rewrite do_for_each_ref() to work via reference iterators (allowing some special-purpose code to be discarded), and do something similar for reflogs. In future patch series, we will expose the ref_iterator abstraction in the public refs API so that callers can use it directly. Implementation note: I tried abstracting this a layer further to allow generic iterators (over arbitrary types of objects) and generic utilities like a generic merge_iterator. But the implementation in C was very cumbersome, involving (in my opinion) too much boilerplate and too much unsafe casting, some of which would have had to be done on the caller side. However, I did put a few iterator-related constants in a top-level header file, iterator.h, as they will be
[PATCH 04/13] delete_refs(): add a flags argument
This will be useful for passing REF_NODEREF through. Signed-off-by: Michael Haggerty--- builtin/fetch.c | 2 +- builtin/remote.c | 4 ++-- refs.h | 5 +++-- refs/files-backend.c | 4 ++-- 4 files changed, 8 insertions(+), 7 deletions(-) diff --git a/builtin/fetch.c b/builtin/fetch.c index f8455bd..b55c83c 100644 --- a/builtin/fetch.c +++ b/builtin/fetch.c @@ -806,7 +806,7 @@ static int prune_refs(struct refspec *refs, int ref_count, struct ref *ref_map, for (ref = stale_refs; ref; ref = ref->next) string_list_append(, ref->name); - result = delete_refs(); + result = delete_refs(, 0); string_list_clear(, 0); } diff --git a/builtin/remote.c b/builtin/remote.c index fda5c2e..1bbf9b4 100644 --- a/builtin/remote.c +++ b/builtin/remote.c @@ -788,7 +788,7 @@ static int rm(int argc, const char **argv) strbuf_release(); if (!result) - result = delete_refs(); + result = delete_refs(, 0); string_list_clear(, 0); if (skipped.nr) { @@ -1303,7 +1303,7 @@ static int prune_remote(const char *remote, int dry_run) string_list_sort(_to_prune); if (!dry_run) - result |= delete_refs(_to_prune); + result |= delete_refs(_to_prune, 0); for_each_string_list_item(item, ) { const char *refname = item->util; diff --git a/refs.h b/refs.h index 21874f0..6d515a4 100644 --- a/refs.h +++ b/refs.h @@ -274,9 +274,10 @@ int delete_ref(const char *refname, const unsigned char *old_sha1, /* * Delete the specified references. If there are any problems, emit * errors but attempt to keep going (i.e., the deletes are not done in - * an all-or-nothing transaction). + * an all-or-nothing transaction). flags is passed through to + * ref_transaction_delete(). */ -int delete_refs(struct string_list *refnames); +int delete_refs(struct string_list *refnames, unsigned int flags); /** Delete a reflog */ int delete_reflog(const char *refname); diff --git a/refs/files-backend.c b/refs/files-backend.c index 5af7441..a0d09f4 100644 --- a/refs/files-backend.c +++ b/refs/files-backend.c @@ -2403,7 +2403,7 @@ static int delete_ref_loose(struct ref_lock *lock, int flag, struct strbuf *err) return 0; } -int delete_refs(struct string_list *refnames) +int delete_refs(struct string_list *refnames, unsigned int flags) { struct strbuf err = STRBUF_INIT; int i, result = 0; @@ -2432,7 +2432,7 @@ int delete_refs(struct string_list *refnames) for (i = 0; i < refnames->nr; i++) { const char *refname = refnames->items[i].string; - if (delete_ref(refname, NULL, 0)) + if (delete_ref(refname, NULL, flags)) result |= error(_("could not remove reference %s"), refname); } -- 2.8.1 -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 03/13] refs: use name "prefix" consistently
In the context of the for_each_ref() functions, call the prefix that references must start with "prefix". (In some places it was called "base".) This is clearer, and also prevents confusion with another planned use of the word "base". Signed-off-by: Michael Haggerty--- refs/files-backend.c | 24 refs/refs-internal.h | 14 +++--- 2 files changed, 19 insertions(+), 19 deletions(-) diff --git a/refs/files-backend.c b/refs/files-backend.c index 68db3e8..5af7441 100644 --- a/refs/files-backend.c +++ b/refs/files-backend.c @@ -542,7 +542,7 @@ static struct ref_entry *current_ref; typedef int each_ref_entry_fn(struct ref_entry *entry, void *cb_data); struct ref_entry_cb { - const char *base; + const char *prefix; int trim; int flags; each_ref_fn *fn; @@ -559,7 +559,7 @@ static int do_one_ref(struct ref_entry *entry, void *cb_data) struct ref_entry *old_current_ref; int retval; - if (!starts_with(entry->name, data->base)) + if (!starts_with(entry->name, data->prefix)) return 0; if (!(data->flags & DO_FOR_EACH_INCLUDE_BROKEN) && @@ -1824,12 +1824,12 @@ int peel_ref(const char *refname, unsigned char *sha1) /* * Call fn for each reference in the specified ref_cache, omitting - * references not in the containing_dir of base. fn is called for all - * references, including broken ones. If fn ever returns a non-zero + * references not in the containing_dir of prefix. Call fn for all + * references, including broken ones. If fn ever returns a non-zero * value, stop the iteration and return that value; otherwise, return * 0. */ -static int do_for_each_entry(struct ref_cache *refs, const char *base, +static int do_for_each_entry(struct ref_cache *refs, const char *prefix, each_ref_entry_fn fn, void *cb_data) { struct packed_ref_cache *packed_ref_cache; @@ -1846,8 +1846,8 @@ static int do_for_each_entry(struct ref_cache *refs, const char *base, * disk. */ loose_dir = get_loose_refs(refs); - if (base && *base) { - loose_dir = find_containing_dir(loose_dir, base, 0); + if (prefix && *prefix) { + loose_dir = find_containing_dir(loose_dir, prefix, 0); } if (loose_dir) prime_ref_dir(loose_dir); @@ -1855,8 +1855,8 @@ static int do_for_each_entry(struct ref_cache *refs, const char *base, packed_ref_cache = get_packed_ref_cache(refs); acquire_packed_ref_cache(packed_ref_cache); packed_dir = get_packed_ref_dir(packed_ref_cache); - if (base && *base) { - packed_dir = find_containing_dir(packed_dir, base, 0); + if (prefix && *prefix) { + packed_dir = find_containing_dir(packed_dir, prefix, 0); } if (packed_dir && loose_dir) { @@ -1878,14 +1878,14 @@ static int do_for_each_entry(struct ref_cache *refs, const char *base, return retval; } -int do_for_each_ref(const char *submodule, const char *base, +int do_for_each_ref(const char *submodule, const char *prefix, each_ref_fn fn, int trim, int flags, void *cb_data) { struct ref_entry_cb data; struct ref_cache *refs; refs = get_ref_cache(submodule); - data.base = base; + data.prefix = prefix; data.trim = trim; data.flags = flags; data.fn = fn; @@ -1896,7 +1896,7 @@ int do_for_each_ref(const char *submodule, const char *base, if (ref_paranoia) data.flags |= DO_FOR_EACH_INCLUDE_BROKEN; - return do_for_each_entry(refs, base, do_one_ref, ); + return do_for_each_entry(refs, prefix, do_one_ref, ); } /* diff --git a/refs/refs-internal.h b/refs/refs-internal.h index b4dd545..8ad02d8 100644 --- a/refs/refs-internal.h +++ b/refs/refs-internal.h @@ -250,16 +250,16 @@ int rename_ref_available(const char *oldname, const char *newname); /* * Call fn for each reference in the specified submodule for which the - * refname begins with base. If trim is non-zero, then trim that many - * characters off the beginning of each refname before passing the - * refname to fn. flags can be DO_FOR_EACH_INCLUDE_BROKEN to include - * broken references in the iteration. If fn ever returns a non-zero - * value, stop the iteration and return that value; otherwise, return - * 0. + * refname begins with prefix. If trim is non-zero, then trim that + * many characters off the beginning of each refname before passing + * the refname to fn. flags can be DO_FOR_EACH_INCLUDE_BROKEN to + * include broken references in the iteration. If fn ever returns a + * non-zero value, stop the iteration and return that value; + * otherwise, return 0. * * This is the common backend for the for_each_*ref* functions. */ -int do_for_each_ref(const char *submodule, const char *base, +int do_for_each_ref(const char
[PATCH 01/13] refs: remove unnecessary "extern" keywords
There's continuing work in this area, so clean up unneeded "extern" keywords rather than schlepping them around. Also split up some overlong lines and add parameter names in a couple of places. Signed-off-by: Michael Haggerty--- refs.h | 132 +++-- 1 file changed, 72 insertions(+), 60 deletions(-) diff --git a/refs.h b/refs.h index 9230d47..21874f0 100644 --- a/refs.h +++ b/refs.h @@ -52,19 +52,19 @@ #define RESOLVE_REF_NO_RECURSE 0x02 #define RESOLVE_REF_ALLOW_BAD_NAME 0x04 -extern const char *resolve_ref_unsafe(const char *refname, int resolve_flags, - unsigned char *sha1, int *flags); +const char *resolve_ref_unsafe(const char *refname, int resolve_flags, + unsigned char *sha1, int *flags); -extern char *resolve_refdup(const char *refname, int resolve_flags, - unsigned char *sha1, int *flags); +char *resolve_refdup(const char *refname, int resolve_flags, +unsigned char *sha1, int *flags); -extern int read_ref_full(const char *refname, int resolve_flags, -unsigned char *sha1, int *flags); -extern int read_ref(const char *refname, unsigned char *sha1); +int read_ref_full(const char *refname, int resolve_flags, + unsigned char *sha1, int *flags); +int read_ref(const char *refname, unsigned char *sha1); -extern int ref_exists(const char *refname); +int ref_exists(const char *refname); -extern int is_branch(const char *refname); +int is_branch(const char *refname); /* * If refname is a non-symbolic reference that refers to a tag object, @@ -74,24 +74,25 @@ extern int is_branch(const char *refname); * Symbolic references are considered unpeelable, even if they * ultimately resolve to a peelable tag. */ -extern int peel_ref(const char *refname, unsigned char *sha1); +int peel_ref(const char *refname, unsigned char *sha1); /** * Resolve refname in the nested "gitlink" repository that is located * at path. If the resolution is successful, return 0 and set sha1 to * the name of the object; otherwise, return a non-zero value. */ -extern int resolve_gitlink_ref(const char *path, const char *refname, unsigned char *sha1); +int resolve_gitlink_ref(const char *path, const char *refname, + unsigned char *sha1); /* * Return true iff abbrev_name is a possible abbreviation for * full_name according to the rules defined by ref_rev_parse_rules in * refs.c. */ -extern int refname_match(const char *abbrev_name, const char *full_name); +int refname_match(const char *abbrev_name, const char *full_name); -extern int dwim_ref(const char *str, int len, unsigned char *sha1, char **ref); -extern int dwim_log(const char *str, int len, unsigned char *sha1, char **ref); +int dwim_ref(const char *str, int len, unsigned char *sha1, char **ref); +int dwim_log(const char *str, int len, unsigned char *sha1, char **ref); /* * A ref_transaction represents a collection of ref updates @@ -182,38 +183,45 @@ typedef int each_ref_fn(const char *refname, * modifies the reference also returns a nonzero value to immediately * stop the iteration. */ -extern int head_ref(each_ref_fn fn, void *cb_data); -extern int for_each_ref(each_ref_fn fn, void *cb_data); -extern int for_each_ref_in(const char *prefix, each_ref_fn fn, void *cb_data); -extern int for_each_fullref_in(const char *prefix, each_ref_fn fn, void *cb_data, unsigned int broken); -extern int for_each_tag_ref(each_ref_fn fn, void *cb_data); -extern int for_each_branch_ref(each_ref_fn fn, void *cb_data); -extern int for_each_remote_ref(each_ref_fn fn, void *cb_data); -extern int for_each_replace_ref(each_ref_fn fn, void *cb_data); -extern int for_each_glob_ref(each_ref_fn fn, const char *pattern, void *cb_data); -extern int for_each_glob_ref_in(each_ref_fn fn, const char *pattern, const char *prefix, void *cb_data); +int head_ref(each_ref_fn fn, void *cb_data); +int for_each_ref(each_ref_fn fn, void *cb_data); +int for_each_ref_in(const char *prefix, each_ref_fn fn, void *cb_data); +int for_each_fullref_in(const char *prefix, each_ref_fn fn, void *cb_data, + unsigned int broken); +int for_each_tag_ref(each_ref_fn fn, void *cb_data); +int for_each_branch_ref(each_ref_fn fn, void *cb_data); +int for_each_remote_ref(each_ref_fn fn, void *cb_data); +int for_each_replace_ref(each_ref_fn fn, void *cb_data); +int for_each_glob_ref(each_ref_fn fn, const char *pattern, void *cb_data); +int for_each_glob_ref_in(each_ref_fn fn, const char *pattern, +const char *prefix, void *cb_data); -extern int head_ref_submodule(const char *submodule, each_ref_fn fn, void *cb_data); -extern int for_each_ref_submodule(const char *submodule, each_ref_fn fn, void *cb_data); -extern int for_each_ref_in_submodule(const char *submodule, const char *prefix,
[PATCH 10/13] do_for_each_ref(): reimplement using reference iteration
Use the reference iterator interface to implement do_for_each_ref(). Delete a bunch of code supporting the old for_each_ref() implementation. And now that do_for_each_ref() is generic code (it is no longer tied to the files backend), move it to refs.c. The implementation is via a new function, do_for_each_ref_iterator(), which takes a reference iterator as argument and calls a callback function for each of the references in the iterator. This change requires the current_ref performance hack for peel_ref() to be implemented via ref_iterator_peel() rather than peel_entry() because we don't have a ref_entry handy (it is hidden under three layers: file_ref_iterator, merge_ref_iterator, and cache_ref_iterator). So: * do_for_each_ref_iterator() records the active iterator in current_ref_iter while it is running. * peel_ref() checks whether current_ref_iter is pointing at the requested reference. If so, it asks the iterator to peel the reference (which it can do efficiently via its "peel" virtual function). For extra safety, we do the optimization only if the refname *addresses* are the same, not only if the refname *strings* are the same, to forestall possible mixups between refnames that come from different ref_iterators. Please note that this optimization of peel_ref() is only available when iterating via do_for_each_ref_iterator() (including all of the for_each_ref() functions, which call it indirectly). It would be complicated to implement a similar optimization when iterating directly using a reference iterator, because multiple reference iterators can be in use at the same time, with interleaved calls to ref_iterator_advance(). (In fact we do exactly that in merge_ref_iterator.) But that is not necessary. peel_ref() is only called while iterating over references. Callers who iterate using the for_each_ref() functions benefit from the optimization described above. Callers who iterate using reference iterators directly have access to the ref_iterator, so they can call ref_iterator_peel() themselves to get an analogous optimization in a more straightforward manner. If we rewrite all callers to use the reference iteration API, then we can remove the current_ref_iter hack permanently. Signed-off-by: Michael Haggerty--- refs.c | 20 + refs/files-backend.c | 206 ++- refs/iterator.c | 29 refs/refs-internal.h | 33 ++--- 4 files changed, 76 insertions(+), 212 deletions(-) diff --git a/refs.c b/refs.c index 842c5c7..814cad3 100644 --- a/refs.c +++ b/refs.c @@ -1120,6 +1120,26 @@ int head_ref(each_ref_fn fn, void *cb_data) return head_ref_submodule(NULL, fn, cb_data); } +/* + * Call fn for each reference in the specified submodule for which the + * refname begins with prefix. If trim is non-zero, then trim that + * many characters off the beginning of each refname before passing + * the refname to fn. flags can be DO_FOR_EACH_INCLUDE_BROKEN to + * include broken references in the iteration. If fn ever returns a + * non-zero value, stop the iteration and return that value; + * otherwise, return 0. + */ +static int do_for_each_ref(const char *submodule, const char *prefix, + each_ref_fn fn, int trim, int flags, void *cb_data) +{ + struct ref_iterator *iter; + + iter = files_ref_iterator_begin(submodule, prefix, flags); + iter = prefix_ref_iterator_begin(iter, prefix, trim); + + return do_for_each_ref_iterator(iter, fn, cb_data); +} + int for_each_ref(each_ref_fn fn, void *cb_data) { return do_for_each_ref(NULL, "", fn, 0, 0, cb_data); diff --git a/refs/files-backend.c b/refs/files-backend.c index dbf1587..c48a006 100644 --- a/refs/files-backend.c +++ b/refs/files-backend.c @@ -542,53 +542,8 @@ static int entry_resolves_to_object(struct ref_entry *entry) >u.value.oid, entry->flag); } -/* - * current_ref is a performance hack: when iterating over references - * using the for_each_ref*() functions, current_ref is set to the - * current reference's entry before calling the callback function. If - * the callback function calls peel_ref(), then peel_ref() first - * checks whether the reference to be peeled is the current reference - * (it usually is) and if so, returns that reference's peeled version - * if it is available. This avoids a refname lookup in a common case. - */ -static struct ref_entry *current_ref; - typedef int each_ref_entry_fn(struct ref_entry *entry, void *cb_data); -struct ref_entry_cb { - const char *prefix; - int trim; - int flags; - each_ref_fn *fn; - void *cb_data; -}; - -/* - * Handle one reference in a do_for_each_ref*()-style iteration, - * calling an each_ref_fn for each entry. - */ -static int do_one_ref(struct ref_entry *entry, void *cb_data) -{ - struct ref_entry_cb *data = cb_data; - struct ref_entry
[PATCH 06/13] get_ref_cache(): only create an instance if there is a submodule
If there is not a nonbare repository where a submodule is supposedly located, then don't instantiate a ref_cache for it. The analogous check can be removed from resolve_gitlink_ref(). Signed-off-by: Michael Haggerty--- This doesn't actually reduce the number of ref_cache instances generated by out test suite, but it is a more logical place for the check that was added in a2d5156c resolve_gitlink_ref: ignore non-repository paths refs/files-backend.c | 33 ++--- 1 file changed, 22 insertions(+), 11 deletions(-) diff --git a/refs/files-backend.c b/refs/files-backend.c index a0d09f4..142c977 100644 --- a/refs/files-backend.c +++ b/refs/files-backend.c @@ -954,15 +954,26 @@ static struct ref_cache *lookup_ref_cache(const char *submodule) /* * Return a pointer to a ref_cache for the specified submodule. For - * the main repository, use submodule==NULL. The returned structure - * will be allocated and initialized but not necessarily populated; it - * should not be freed. + * the main repository, use submodule==NULL; such a call cannot fail. + * For a submodule, the submodule must exist and be a nonbare + * repository, otherwise return NULL. + * + * The returned structure will be allocated and initialized but not + * necessarily populated; it should not be freed. */ static struct ref_cache *get_ref_cache(const char *submodule) { struct ref_cache *refs = lookup_ref_cache(submodule); - if (!refs) - refs = create_ref_cache(submodule); + + if (!refs) { + struct strbuf submodule_sb = STRBUF_INIT; + + strbuf_addstr(_sb, submodule); + if (is_nonbare_repository_dir(_sb)) + refs = create_ref_cache(submodule); + strbuf_release(_sb); + } + return refs; } @@ -1341,13 +1352,10 @@ int resolve_gitlink_ref(const char *path, const char *refname, unsigned char *sh return -1; strbuf_add(, path, len); - refs = lookup_ref_cache(submodule.buf); + refs = get_ref_cache(submodule.buf); if (!refs) { - if (!is_nonbare_repository_dir()) { - strbuf_release(); - return -1; - } - refs = create_ref_cache(submodule.buf); + strbuf_release(); + return -1; } strbuf_release(); @@ -1885,6 +1893,9 @@ int do_for_each_ref(const char *submodule, const char *prefix, struct ref_cache *refs; refs = get_ref_cache(submodule); + if (!refs) + return 0; + data.prefix = prefix; data.trim = trim; data.flags = flags; -- 2.8.1 -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 08/13] ref_resolves_to_object(): new function
Extract new function ref_resolves_to_object() from entry_resolves_to_object(). It can be used even if there is no ref_entry at hand. Signed-off-by: Michael Haggerty--- refs/files-backend.c | 33 +++-- 1 file changed, 23 insertions(+), 10 deletions(-) diff --git a/refs/files-backend.c b/refs/files-backend.c index 1a46f32..8ab4d5f 100644 --- a/refs/files-backend.c +++ b/refs/files-backend.c @@ -513,19 +513,32 @@ static void sort_ref_dir(struct ref_dir *dir) } /* - * Return true iff the reference described by entry can be resolved to - * an object in the database. Emit a warning if the referred-to - * object does not exist. + * Return true if refname, which has the specified oid and flags, can + * be resolved to an object in the database. If the referred-to object + * does not exist, emit a warning and return false. + */ +static int ref_resolves_to_object(const char *refname, + const struct object_id *oid, + unsigned int flags) +{ + if (flags & REF_ISBROKEN) + return 0; + if (!has_sha1_file(oid->hash)) { + error("%s does not point to a valid object!", refname); + return 0; + } + return 1; +} + +/* + * Return true if the reference described by entry can be resolved to + * an object in the database; otherwise, emit a warning and return + * false. */ static int entry_resolves_to_object(struct ref_entry *entry) { - if (entry->flag & REF_ISBROKEN) - return 0; - if (!has_sha1_file(entry->u.value.oid.hash)) { - error("%s does not point to a valid object!", entry->name); - return 0; - } - return 1; + return ref_resolves_to_object(entry->name, + >u.value.oid, entry->flag); } /* -- 2.8.1 -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 11/13] for_each_reflog(): don't abort for bad references
If there is a file under "$GIT_DIR/logs" with no corresponding reference, the old code was emitting an error message, aborting the reflog iteration, and returning -1. But * None of the callers was checking the exit value * The callers all want to find all legitimate reflogs (sometimes for the purpose of determining object reachability!) and wouldn't benefit from a truncated iteration anyway. So instead, emit an error message and skip the "broken" reflog, but continue with the iteration. Signed-off-by: Michael Haggerty--- refs/files-backend.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/refs/files-backend.c b/refs/files-backend.c index c48a006..a7cc0e2 100644 --- a/refs/files-backend.c +++ b/refs/files-backend.c @@ -3325,7 +3325,7 @@ static int do_for_each_reflog(struct strbuf *name, each_ref_fn fn, void *cb_data struct object_id oid; if (read_ref_full(name->buf, 0, oid.hash, NULL)) - retval = error("bad ref for %s", name->buf); + error("bad ref for %s", name->buf); else retval = fn(name->buf, , 0, cb_data); } -- 2.8.1 -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 00/13] Reference iterators
This patch series implements a new iteration paradigm for iterating over references using iterators. This approach was proposed earlier as an RFC [1]. The justification for this change is laid out in the RFC [1] and in the commit message for patch 09/13 [2]. Please refer to those, as I won't repeat them here. There are several obvious followup steps that are not included in this patch series (because they don't help with the initial transition to pluggable reference backends). I've written prototypes of several of these: * The iterator interface could be made public and callers could start using it directly. * A filter_ref_iterator could let references be filtered based on a callback function (this would be useful, for example, for for_each_glob_ref()). * A single_ref_iterator could "iterate" over a single reference (e.g., HEAD). * A series_ref_iterator could iterate over multiple iterators, for callers that want to operate on, say, HEAD plus all branches. * The ref_cache could be fed from an iterator, to better decouple caching from reading packed and loose references, and to make it easy to use caching with other reference storage backends. * Per-worktree refs could be overlaid on top of shared references using merge_ref_iterator rather than mixing them up in the same ref_cache. * The dir_iterator could be used in more places; for example, when reading loose references from disk. Table of contents of changes: * The first eight patches are cleanups. * Patch 05/13 fixes a code path that unlinks symrefs directly instead of using the refs API. * Patch 09/13 is the most important part of this series. It introduces not only reference iterators, but also (1) a pattern that other iterator interfaces can follow (along with some useful constants in iterator.h), and (2) a pattern for building OO code with inheritance. * Patch 10/13 actually uses the new reference iteration mechanism. * Patch 11/13 avoids aborting reflog iterations if an unexpected file is found under `$GIT_DIR/logs`. This fixes a bug that could cause objects needed by reflogs to be pruned, breaking the reflogs. * Patch 12/13 adds an iterator interface for iterating over directories (using the same model as patch 09/13). * Patch 13/13 implements for_each_reflog() on top of an iterator interface (essentially the analogue of patch 10/13, but for reflogs). Note that it is not necessary to rebuild for_each_reflog_ent() on top of iterators at this time, because that function deals with only a single reference at a time. Therefore, composability is not important here (for example, it won't have to deal with multiple refs backends at the same time). This patch series applies on top of mh/split-under-lock. It is also available from my GitHub repository [3] as branch "ref-iterators". I haven't yet completely rebased the ref-store changes (virtualization of the refs API) on top of all of these changes, but I will work on that next. Michael [1] http://thread.gmane.org/gmane.comp.version-control.git/290409 [2] http://mid.gmane.org/89634d216544d1102dafd5d18247bff2581d48a8.1464537050.git.mhag...@alum.mit.edu [3] https://github.com/mhagger/git Michael Haggerty (13): refs: remove unnecessary "extern" keywords do_for_each_ref(): move docstring to the header file refs: use name "prefix" consistently delete_refs(): add a flags argument remote rm: handle symbolic refs correctly get_ref_cache(): only create an instance if there is a submodule entry_resolves_to_object(): rename function from ref_resolves_to_object() ref_resolves_to_object(): new function refs: introduce an iterator interface do_for_each_ref(): reimplement using reference iteration for_each_reflog(): don't abort for bad references dir_iterator: new API for iterating over a directory tree for_each_reflog(): reimplement using iterators Makefile | 2 + builtin/fetch.c | 2 +- builtin/remote.c | 8 +- dir-iterator.c | 180 +++ dir-iterator.h | 86 +++ iterator.h | 81 +++ refs.c | 20 ++ refs.h | 139 +++- refs/files-backend.c | 630 +++ refs/iterator.c | 376 ++ refs/refs-internal.h | 225 +- 11 files changed, 1427 insertions(+), 322 deletions(-) create mode 100644 dir-iterator.c create mode 100644 dir-iterator.h create mode 100644 iterator.h create mode 100644 refs/iterator.c -- 2.8.1 -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 12/13] dir_iterator: new API for iterating over a directory tree
The iterator interface is modeled on that for references, though no vtable is necessary because there is (so far?) only one type of dir_iterator. There are obviously a lot of features that could easily be added to this class: * Skip/include directory paths in the iteration * Shallow/deep iteration * Letting the caller decide which subdirectories to recurse into (e.g., via a dir_iterator_advance_into() function) * Option to iterate in sorted order * Option to iterate over directory paths before vs. after their contents But these are not needed for the current patch series, so I refrain. Signed-off-by: Michael Haggerty--- Makefile | 1 + dir-iterator.c | 180 + dir-iterator.h | 86 +++ 3 files changed, 267 insertions(+) create mode 100644 dir-iterator.c create mode 100644 dir-iterator.h diff --git a/Makefile b/Makefile index ac8f365..b4ffc11 100644 --- a/Makefile +++ b/Makefile @@ -722,6 +722,7 @@ LIB_OBJS += diff-lib.o LIB_OBJS += diff-no-index.o LIB_OBJS += diff.o LIB_OBJS += dir.o +LIB_OBJS += dir-iterator.o LIB_OBJS += editor.o LIB_OBJS += entry.o LIB_OBJS += environment.o diff --git a/dir-iterator.c b/dir-iterator.c new file mode 100644 index 000..e1d60f0 --- /dev/null +++ b/dir-iterator.c @@ -0,0 +1,180 @@ +#include "cache.h" +#include "dir.h" +#include "iterator.h" +#include "dir-iterator.h" + +struct dir_iterator_level { + int initialized; + + DIR *dir; + + /* +* The length of the directory part of refname at this level +* (including the trailing '/'): +*/ + size_t prefix_len; + + /* +* The last action that has been taken with the current entry +* (needed for directories, which have to be included in the +* iteration and also iterated into): +*/ + enum { + DIR_STATE_ITER, + DIR_STATE_RECURSE + } dir_state; +}; + +/* + * The full data structure used to manage the internal directory + * iteration state. It includes members that are not part of the + * public interface. + */ +struct dir_iterator_int { + struct dir_iterator base; + + /* +* The number of levels currently on the stack. This is always +* at least 1, because when it becomes zero the iteration is +* ended and this struct is freed. +*/ + size_t levels_nr; + + /* The number of levels that have been allocated on the stack */ + size_t levels_alloc; + + /* +* A stack of levels. levels[0] is the uppermost directory +* that will be included in this iteration. +*/ + struct dir_iterator_level *levels; +}; + +int dir_iterator_advance(struct dir_iterator *dir_iterator) +{ + struct dir_iterator_int *iter = + (struct dir_iterator_int *)dir_iterator; + + while (1) { + struct dir_iterator_level *level = + >levels[iter->levels_nr - 1]; + struct dirent *de; + + if (!level->initialized) { + if (!is_dir_sep(iter->base.path.buf[iter->base.path.len - 1])) + strbuf_addch(>base.path, '/'); + level->prefix_len = iter->base.path.len; + + /* opendir() errors are handled below */ + level->dir = opendir(iter->base.path.buf); + + level->initialized = 1; + } else if (S_ISDIR(iter->base.st.st_mode)) { + if (level->dir_state == DIR_STATE_ITER) { + /* +* The directory was just iterated +* over; now prepare to iterate into +* it. +*/ + level->dir_state = DIR_STATE_RECURSE; + ALLOC_GROW(iter->levels, iter->levels_nr + 1, + iter->levels_alloc); + level = >levels[iter->levels_nr++]; + level->initialized = 0; + continue; + } else { + /* +* The directory has already been +* iterated over and iterated into; +* we're done with it. +*/ + } + } + + if (!level->dir) { + /* +* This level is exhausted (or wasn't opened +* successfully); pop up a level. +*/ + if (--iter->levels_nr == 0) { + return dir_iterator_abort(dir_iterator); + } +
[PATCH 13/13] for_each_reflog(): reimplement using iterators
Allow references with reflogs to be iterated over using a ref_iterator. The latter is implemented as a files_reflog_iterator, which in turn uses dir_iterator to read the "logs" directory. Note that reflog iteration doesn't correctly handle per-worktree reflogs (either before or after this patch). Signed-off-by: Michael Haggerty--- refs/files-backend.c | 113 --- refs/refs-internal.h | 7 2 files changed, 78 insertions(+), 42 deletions(-) diff --git a/refs/files-backend.c b/refs/files-backend.c index a7cc0e2..2e0d006 100644 --- a/refs/files-backend.c +++ b/refs/files-backend.c @@ -2,6 +2,7 @@ #include "../refs.h" #include "refs-internal.h" #include "../iterator.h" +#include "../dir-iterator.h" #include "../lockfile.h" #include "../object.h" #include "../dir.h" @@ -3292,60 +3293,88 @@ int for_each_reflog_ent(const char *refname, each_reflog_ent_fn fn, void *cb_dat strbuf_release(); return ret; } -/* - * Call fn for each reflog in the namespace indicated by name. name - * must be empty or end with '/'. Name will be used as a scratch - * space, but its contents will be restored before return. - */ -static int do_for_each_reflog(struct strbuf *name, each_ref_fn fn, void *cb_data) + +struct files_reflog_iterator { + struct ref_iterator base; + + struct dir_iterator *dir_iterator; + struct object_id oid; +}; + +static int files_reflog_iterator_advance(struct ref_iterator *ref_iterator) { - DIR *d = opendir(git_path("logs/%s", name->buf)); - int retval = 0; - struct dirent *de; - int oldlen = name->len; + struct files_reflog_iterator *iter = + (struct files_reflog_iterator *)ref_iterator; + struct dir_iterator *diter = iter->dir_iterator; + int ok; - if (!d) - return name->len ? errno : 0; + while ((ok = dir_iterator_advance(diter)) == ITER_OK) { + int flags; - while ((de = readdir(d)) != NULL) { - struct stat st; - - if (de->d_name[0] == '.') + if (!S_ISREG(diter->st.st_mode)) + continue; + if (diter->basename[0] == '.') continue; - if (ends_with(de->d_name, ".lock")) + if (ends_with(diter->basename, ".lock")) continue; - strbuf_addstr(name, de->d_name); - if (stat(git_path("logs/%s", name->buf), ) < 0) { - ; /* silently ignore */ - } else { - if (S_ISDIR(st.st_mode)) { - strbuf_addch(name, '/'); - retval = do_for_each_reflog(name, fn, cb_data); - } else { - struct object_id oid; - if (read_ref_full(name->buf, 0, oid.hash, NULL)) - error("bad ref for %s", name->buf); - else - retval = fn(name->buf, , 0, cb_data); - } - if (retval) - break; + if (read_ref_full(diter->relative_path, 0, + iter->oid.hash, )) { + error("bad ref for %s", diter->path.buf); + continue; } - strbuf_setlen(name, oldlen); + + iter->base.refname = diter->relative_path; + iter->base.oid = >oid; + iter->base.flags = flags; + return ITER_OK; } - closedir(d); - return retval; + + iter->dir_iterator = NULL; + if (ref_iterator_abort(ref_iterator) == ITER_ERROR) + ok = ITER_ERROR; + return ok; +} + +static int files_reflog_iterator_peel(struct ref_iterator *ref_iterator, + struct object_id *peeled) +{ + die("BUG: ref_iterator_peel() called for reflog_iterator"); +} + +static int files_reflog_iterator_abort(struct ref_iterator *ref_iterator) +{ + struct files_reflog_iterator *iter = + (struct files_reflog_iterator *)ref_iterator; + int ok = ITER_DONE; + + if (iter->dir_iterator) + ok = dir_iterator_abort(iter->dir_iterator); + + base_ref_iterator_free(ref_iterator); + return ok; +} + +struct ref_iterator_vtable files_reflog_iterator_vtable = { + files_reflog_iterator_advance, + files_reflog_iterator_peel, + files_reflog_iterator_abort +}; + +struct ref_iterator *files_reflog_iterator_begin(void) +{ + struct files_reflog_iterator *iter = xcalloc(1, sizeof(*iter)); + struct ref_iterator *ref_iterator = >base; + + base_ref_iterator_init(ref_iterator, _reflog_iterator_vtable); + iter->dir_iterator =
[PATCH 07/13] entry_resolves_to_object(): rename function from ref_resolves_to_object()
Free up the old name for a more general purpose. Signed-off-by: Michael Haggerty--- refs/files-backend.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/refs/files-backend.c b/refs/files-backend.c index 142c977..1a46f32 100644 --- a/refs/files-backend.c +++ b/refs/files-backend.c @@ -517,7 +517,7 @@ static void sort_ref_dir(struct ref_dir *dir) * an object in the database. Emit a warning if the referred-to * object does not exist. */ -static int ref_resolves_to_object(struct ref_entry *entry) +static int entry_resolves_to_object(struct ref_entry *entry) { if (entry->flag & REF_ISBROKEN) return 0; @@ -563,7 +563,7 @@ static int do_one_ref(struct ref_entry *entry, void *cb_data) return 0; if (!(data->flags & DO_FOR_EACH_INCLUDE_BROKEN) && - !ref_resolves_to_object(entry)) + !entry_resolves_to_object(entry)) return 0; /* Store the old value, in case this is a recursive call: */ @@ -2228,7 +2228,7 @@ static int pack_if_possible_fn(struct ref_entry *entry, void *cb_data) return 0; /* Do not pack symbolic or broken refs: */ - if ((entry->flag & REF_ISSYMREF) || !ref_resolves_to_object(entry)) + if ((entry->flag & REF_ISSYMREF) || !entry_resolves_to_object(entry)) return 0; /* Add a packed ref cache entry equivalent to the loose entry. */ -- 2.8.1 -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] README.md: format CLI commands with code syntax
Benjamin Dopplingerwrites: > CLI commands which are mentioned in the readme are now formatted with > the Markdown code syntax to make the documentation more readable. > > Signed-off-by: Benjamin Dopplinger > --- > README.md | 10 +- > 1 file changed, 5 insertions(+), 5 deletions(-) Looks good to me, thanks. -- Matthieu Moy http://www-verimag.imag.fr/~moy/ -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html