Re: [PATCH] t7800 readlink not found

2016-05-30 Thread Junio C Hamano
Torsten Bögershausen  writes:

>> diff --git a/t/t7800-difftool.sh b/t/t7800-difftool.sh
>> index 7ce4cd7..905035c 100755
>> --- a/t/t7800-difftool.sh
>> +++ b/t/t7800-difftool.sh
>> @@ -446,7 +446,7 @@ write_script .git/CHECK_SYMLINKS <<\EOF
>>   for f in file file2 sub/sub
>>   do
>>  echo "$f"
>> -readlink "$2/$f"
>> +ls -ld "$2/$f" | sed -e 's/.* -> //'
>>   done >actual
>>   EOF
>>
> I don't know how portable #ls -ld" really is.

The parts with mode bits, nlinks, uid, gid, size, and date part do
have some variations.  For example, we have been burned on ACL
enabled systems having some funny suffix after the usual mode bits
stuff.

However, as far as this test is concerned, I do not think "how
portable is the output from ls -ld" is an especially relevant
question.  None of the things we expect early in the output (the
fields I enumerated in the previous paragraph) would contain " -> ".
And we know that we do not use a filename that has " -> " (or "->")
as a substring in our tests.

We don't have to use readlink, even on platforms where we do have
readlink.  Building the conditional to be checked at runtime and
providing a shell function read_link that uses "ls -ld | sed" or
"readlink" depending on the runtime check is wasteful.

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 09/13] refs: introduce an iterator interface

2016-05-30 Thread Eric Sunshine
On Mon, May 30, 2016 at 3:55 AM, Michael Haggerty  wrote:
> [...]
> This commit introduces a new iteration primitive for references: a
> ref_iterator. A ref_iterator is a polymorphic object that a reference
> storage backend can be asked to instantiate. There are three functions
> that can be applied to a ref_iterator:
>
> * ref_iterator_advance(): move to the next reference in the iteration
> * ref_iterator_abort(): end the iteration before it is exhausted
> * ref_iterator_peel(): peel the reference currently being looked at
> [...]
> Signed-off-by: Michael Haggerty 
> ---
> diff --git a/refs/refs-internal.h b/refs/refs-internal.h
> @@ -249,6 +249,199 @@ int rename_ref_available(const char *oldname, const 
> char *newname);
> +/*
> + * Advance the iterator to the first or next item and return ITER_OK.
> + * If the iteration is exhausted, free the resources associated with
> + * the ref_iterator and return ITER_DONE. On errors, free the iterator
> + * resources and return ITER_ERROR. It is a bug to use ref_iterator or
> + * call this function again after it has returned false.
> + */

Either:

s/false/something other than ITER_OK/

or:

s/false/ITER_DONE or ITER_ERROR/

> +int ref_iterator_advance(struct ref_iterator *ref_iterator);
> +
> +/*
> + * An iterator over nothing (its first ref_iterator_advance() call
> + * returns 0).
> + */

s/0/ITER_DONE/

> +struct ref_iterator *empty_ref_iterator_begin(void);
> +
> +/*
> + * Return true iff ref_iterator is an empty_ref_iterator.
> + */
> +int is_empty_ref_iterator(struct ref_iterator *ref_iterator);

I can see that you used this function as an optimization or
convenience in overlay_ref_iterator_begin(), but do you expect it to
be generally useful otherwise? Is it worth publishing? Do you have
other use-cases in mind?

Also, can you explain why the merge iterator doesn't also perform the
optimization/convenience of checking if one iterator is an empty
iterator?

> +/*
> + * Iterate over the intries from iter0 and iter1, with the values

s/intries/entries/

> + * interleaved as directed by the select function. The iterator takes
> + * ownership of iter0 and iter1 and frees them when the iteration is
> + * over.
> + */
> +struct ref_iterator *merge_ref_iterator_begin(
> +   struct ref_iterator *iter0, struct ref_iterator *iter1,
> +   ref_iterator_select_fn *select, void *cb_data);
> +
> +/*
> + * An iterator consisting of the union of the entries from iter0 and
> + * iter1. If there are entries common to the two sub-iterators, use
> + * the one from iter1. Each iterator must iterate over its entries in
> + * strcmp() order by refname for this to work.
> + *
> + * The new iterator takes ownership of its arguments and frees them
> + * when the iteration is over. As a convenience to callers, if iter0
> + * or iter1 is_empty_ref_iterator(), then abort that one immediately
> + * and return the other iterator directly, without wrapping it.
> + */
> +struct ref_iterator *overlay_ref_iterator_begin(struct ref_iterator *iter0,
> +   struct ref_iterator *iter1);

When reading about the overlay iterator (both code and documentation),
my expectation was that iter0 would shadow iter1, not the other way
around as implemented here. Of course, that's entirely subjective, but
the generic names don't provide any useful clues as to which shadows
which. Perhaps giving them more meaningful names would help.

> +/*
> + * Wrap iter0, only letting through the references whose names start
> + * with prefix. If trim is set, set iter->refname to the name of the
> + * reference with that many characters trimmed off the front;
> + * otherwise set it to the full refname. The new iterator takes over
> + * ownership of iter0 and frees it when iteration is over. It makes
> + * its own copy of prefix.
> + *
> + * As an convenience to callers, if prefix is the empty string and
> + * trim is zero, this function returns iter0 directly, without
> + * wrapping it.
> + */
> +struct ref_iterator *prefix_ref_iterator_begin(struct ref_iterator *iter0,
> +  const char *prefix,
> +  int trim);

Minor: Similarly, when reading the code and documentation, I wondered
why this was named 'iter0' when no 'iter1' was in sight. Perhaps name
it simply 'iter'.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] t7800 readlink not found

2016-05-30 Thread Torsten Bögershausen

On 05/31/2016 02:26 AM, Armin Kunaschik wrote:

On 05/27/2016 06:19 AM, David Aguilar wrote:

On Wed, May 25, 2016 at 11:33:33AM +0200, Armin Kunaschik wrote:

Would you mind submitting a patch so that we can support these
tests when running on AIX/HP-UX?

I don't feel comfortable to submit patches for tests I can't verify. I
don't have valgrind and python/p4 here. Looking to the code I'd say,
patching the p4 tests with "ls -ld | sed" looks quite save.
But I'm not sure about the test-lib.sh. When you are really super
paranoid, as written in the comment, you should probably use perl like

perl -e 'print readlink $ARGV[0]' $name

as a replacement.

So, as suggested by Junio, here the readlink workaround for t7800 only.
(hopefully whitespace clean this time)

--- 8< --- 8< ---
From: Armin Kunaschik 
Subject: t7800: readlink is not portable

The readlink(1) command is not available on all platforms (notably not
on AIX and HP-UX) and can be replaced in this test with the "workaround"

ls -ld  | sed -e 's/.* -> //'

This is no universal readlink replacement but works in the controlled
test environment good enough.

Signed-off-by: Armin Kunaschik 
---

diff --git a/t/t7800-difftool.sh b/t/t7800-difftool.sh
index 7ce4cd7..905035c 100755
--- a/t/t7800-difftool.sh
+++ b/t/t7800-difftool.sh
@@ -446,7 +446,7 @@ write_script .git/CHECK_SYMLINKS <<\EOF
  for f in file file2 sub/sub
  do
echo "$f"
-   readlink "$2/$f"
+   ls -ld "$2/$f" | sed -e 's/.* -> //'
  done >actual
  EOF


I don't know how portable #ls -ld" really is.
If there is one platform, that doesn't support readlink, would it
make sense to implement readlink() in test-lib.sh,
similar to what we have for MINGW, e.g. sort() or find() ?
And keep t7800 as it is ?

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC/PATCH] bisect--helper: `bisect_clean_state` shell function in C

2016-05-30 Thread Michael Haggerty
On 05/30/2016 08:21 PM, Pranit Bauva wrote:
> Reimplement `bisect_clean_state` shell function in C and add a
> `bisect-clean-state` subcommand to `git bisect--helper` to call it from
> git-bisect.sh .
> 
> Using `bisect_clean_state` subcommand is a measure to port shell
> function to C so as to use the existing test suite. As more functions
> are ported, this subcommand will be retired and will be called by
> bisect_reset() and bisect_start().
> 
> Mentored-by: Lars Schneider 
> Mentored-by: Christian Couder 
> Signed-off-by: Pranit Bauva 
> ---
> This patch contains a bug. I have tried to identify the bug and I suppose it
> exists in do_for_each_entry_in_dir(). I have reproduced the debugging session
> at this link[1]. I have seen that some patches in mailing list regarding
> iterating over refs. Will those affect this? Or is this bug fixed in those
> patches?

The problem is that it is not legal to modify references while iterating
over them. See [1]. Your remove_bisect_ref() callback function deletes
references, which modifies the reference cache that is being iterated over.

Instead I suggest that your remove_bisect_ref() add the references to a
string_list, then call delete_refs() *after* the iteration is over.
Alternatively, you can change remove_bisect_ref() to call
ref_transaction_delete() to add reference deletions to a
ref_transaction, then call ref_transaction_commit() after the iteration
is over. See the rm() function in builtin/remote.c [2] for an example.

[1]
https://github.com/git/git/blob/f3913c2d03abc660140678a9e14dac399f847647/refs.h#L176-L184
[2]
https://github.com/git/git/blob/f3913c2d03abc660140678a9e14dac399f847647/builtin/remote.c#L738

> [...]

Michael

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/2] strbuf: allow to use preallocated memory

2016-05-30 Thread Michael Haggerty
On 05/30/2016 02:52 PM, Matthieu Moy wrote:
> [...]

I feel bad bikeshedding about names, especially since you took some of
the original names from my RFC. But names are very important, so I think
it's worth considering whether the current names could be improved upon.

When reading this patch series, I found I had trouble remembering
whether "preallocated" meant "preallocated and movable" or "preallocated
and immovable". So maybe we should brainstorm alternatives to
"preallocated" and "fixed". For example,

* "growable"/"fixed"? Seems OK, though all strbufs are growable at least
to the size of their initial allocation, so maybe "growable" is misleading.

* "movable"/"fixed"? This maybe better captures the essence of the
distinction. I'll use those names below for concreteness, without
claiming that they are the best.

> * strbuf_attach() calls strbuf_release(), which allows reusing an
>   existing strbuf. strbuf_wrap_preallocated() calls strbuf_init which
>   would override silently any previous content. I think strbuf_attach()
>   does the right thing here.

Hmmm

I think the best way to answer these questions is to think about use
cases and make them as easy/consistent as possible.

I expect that a very common use of strbuf_wrap_fixed() will be to wrap a
stack-allocated string, like

char pathbuf[PATH_MAX];
struct strbuf path;

strbuf_wrap_fixed(, pathbuf, 0, sizeof(pathbuf));

In this use case, it would be a shame if `path` had to be initialized to
STRBUF_INIT just because `strbuf_wrap_fixed()` calls `strbuf_release()`
internally.

But maybe we could make this use case easier still. If there were a macro

#define STRBUF_FIXED_WRAPPER(sb, buf, len) { STRBUF_FIXED_MEMORY,
sizeof(buf), (len), (buf) }

then we could write

char pathbuf[PATH_MAX];
struct strbuf path = STRBUF_FIXED_WRAPPER(pathbuf, 0);

I think that would be pretty usable. One would have to be careful only
to wrap arrays and not `char *` pointers, because `sizeof` wouldn't work
on the latter. The BARF_UNLESS_AN_ARRAY macro could be used here to add
a little safety.

(It would be even nicer if we could write

struct strbuf path = STRBUF_FIXED(PATH_MAX);

and it would initialize both path and a pathbuf variable for it to wrap,
but I don't think there is a way to implement such a macro. So I think
the only way to squeeze this onto one line would be to make it look like

DEFINE_FIXED_STRBUF(path, PATH_MAX);

But that looks awful, so I think the two-line version above is preferable.)

Similarly, there could be a macro

#define STRBUF_MOVABLE_WRAPPER(sb, buf, len) { 0, sizeof(buf),
(len), (buf) }

If you provide macro forms like these for initializing strbufs, then I
agree with Matthieu that the analogous functional forms should probably
call strbuf_release() before wrapping the array. The functions might be
named more like `strbuf_attach()` to emphasize their similarity to that
existing function. Maybe

strbuf_attach_fixed(struct strbuf *sb, void *s, size_t len, size_t
alloc);
strbuf_attach_movable(struct strbuf *sb, void *s, size_t len, size_t
alloc);

Michael

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/3] pretty: support "mboxrd" output format

2016-05-30 Thread Eric Sunshine
On Mon, May 30, 2016 at 7:21 PM, Eric Wong  wrote:
> This output format prevents format-patch output from breaking
> readers if somebody copy+pasted an mbox into a commit message.
>
> Unlike the traditional "mboxo" format, "mboxrd" is designed to
> be fully-reversible.  "mboxrd" also gracefully degrades to
> showing extra ">" in existing "mboxo" readers.
> [...]
> Signed-off-by: Eric Wong 
> ---
> diff --git a/pretty.c b/pretty.c
> @@ -1697,12 +1699,34 @@ static void pp_handle_indent(struct 
> pretty_print_context *pp,
> +static regex_t *mboxrd_prepare(void)
> +{
> +   static regex_t preg;
> +   const char re[] = "^>*From ";
> +   int err = regcomp(, re, REG_NOSUB | REG_EXTENDED);
> +[...]
> +   return 
> +}
> +
>  void pp_remainder(struct pretty_print_context *pp,
>   const char **msg_p,
>   struct strbuf *sb,
>   int indent)
>  {
> +   static regex_t *mboxrd_from;
> +
> +   if (pp->fmt == CMIT_FMT_MBOXRD && !mboxrd_from)
> +   mboxrd_from = mboxrd_prepare();
> +
> @@ -1725,8 +1749,13 @@ void pp_remainder(struct pretty_print_context *pp,
> else if (pp->expand_tabs_in_log)
> strbuf_add_tabexpand(sb, pp->expand_tabs_in_log,
>  line, linelen);
> -   else
> +   else {
> +   if (pp->fmt == CMIT_FMT_MBOXRD &&
> +   !regexec(mboxrd_from, line, 0, 0, 0))
> +   strbuf_addch(sb, '>');

At first glance, this seems dangerous since it's handing 'line' to
regexec() without paying attention to 'linelen'. For an arbitrary
regex, this could result in erroneous matches on subsequent "lines",
however, since this expression is anchored with '^', it's not a
problem. But, it is a bit subtle.

I wonder if hand-coding, rather than using a regex, could be an improvement:

static int is_mboxrd_from(const char *s, size_t n)
{
size_t f = strlen("From ");
const char *t = s + n;

while (s < t && *s == '>')
s++;
return t - s >= f && !memcmp(s, "From ", f);
}

...
if (is_mboxrd_from(line, linelen)
strbuf_addch(sb, '>');

or something.

> +
> strbuf_add(sb, line, linelen);
> +   }
> strbuf_addch(sb, '\n');
> }
>  }
> diff --git a/t/t4014-format-patch.sh b/t/t4014-format-patch.sh
> @@ -1565,4 +1565,31 @@ test_expect_success 'format-patch --base overrides 
> format.useAutoBase' '
> +test_expect_success 'format-patch --pretty=mboxrd' '
> +   cat >msg <<-INPUT_END &&

Maybe use <<-\INPUT_END to indicate that no variable interpolation is
expected. Ditto below.

> +   mboxrd should escape the body
> +
> +   From could trip up a loose mbox parser
> +   >From extra escape for reversibility
> +   >>From extra escape for reversibility 2
> +   from lower case not escaped
> +   Fromm bad speling not escaped
> +From with leading space not escaped
> +   INPUT_END
> +
> +   cat >expect <<-INPUT_END &&
> +   >From could trip up a loose mbox parser
> +   >>From extra escape for reversibility
> +   >>>From extra escape for reversibility 2
> +   from lower case not escaped
> +   Fromm bad speling not escaped
> +From with leading space not escaped
> +   INPUT_END
> +
> +   C=$(git commit-tree HEAD^^{tree} -p HEAD  +   git format-patch --pretty=mboxrd --stdout -1 $C~1..$C >patch &&
> +   grep -A5 "^>From could trip up a loose mbox parser" patch >actual &&

Hmm, -A is not POSIX and is otherwise not used in Git tests. Perhaps
you could use 'git grep --no-index -A' instead or something?

> +   test_cmp expect actual
> +'
> +
>  test_done
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/2] strbuf: allow to use preallocated memory

2016-05-30 Thread Michael Haggerty
On 05/30/2016 02:13 PM, Johannes Schindelin wrote:
> [...]
>> @@ -38,7 +67,11 @@ char *strbuf_detach(struct strbuf *sb, size_t *sz)
>>  {
>>  char *res;
>>  strbuf_grow(sb, 0);
>> -res = sb->buf;
>> +if (sb->flags & STRBUF_OWNS_MEMORY)
>> +res = sb->buf;
>> +else
>> +res = xmemdupz(sb->buf, sb->alloc - 1);
> 
> This looks like a usage to be avoided: if we plan to detach the buffer,
> anyway, there is no good reason to allocate it on the heap first. I would
> at least issue a warning here.

I think this last case should be changed to

res = xmemdupz(sb->buf, sb->len);

Johannes, if this change is made then I think that there is a reasonable
use case for calling `strbuf_detach()` on a strbuf that wraps a
stack-allocated string, so I don't think that a warning is needed.

I think this change makes sense. After all, once a caller detaches a
string, it is probably not planning on growing/shrinking it anymore, so
any more space than that would probably be wasted. In fact, since the
caller has no way to ask how much extra space the detached string has
allocated, it is almost guaranteed that the space would be wasted.

Actually, that is not 100% certain. Theoretically, a caller might read
`sb->alloc` *before* calling `strbuf_detach()`, and assume that the
detached string has that allocated size. Or the caller might call
`strbuf_grow()` then assume that the detached string has at least that
much free space.

I sure hope that no callers actually do that. The docstring for
`strbuf_detach()` doesn't promise that it will work, and there is a
pretty stern warning [1] that should hopefully have dissuaded developers
from such a usage. But how could it be checked for sure?

* Audit the callers of strbuf_detach(). But given that there are nearly
200 callers, that would be a huge amount of work.

* On a test branch, change the existing implementation of
strbuf_detach() to return newly-allocated memory of size `sb->len + 1`
and free `sb->buf`, then run the test suite under valgrind. This would
flush out examples of this antipattern in the test suite.

It might seem like we don't have to worry about this, because existing
callers only deal with strbufs that wrap heap-allocated strings. But
such a caller might get a strbuf passed to it from a caller, and that
caller might someday be modified to use stack-allocated strings. So I
think that at least the valgrind test suggested above would be prudent.

Michael

[1]
https://github.com/git/git/blob/f3913c2d03abc660140678a9e14dac399f847647/strbuf.h#L20-L23

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/2] strbuf: add tests

2016-05-30 Thread Michael Haggerty
Hi,

Cool that you are working on this! See my comments below.

On 05/30/2016 12:36 PM, William Duclot wrote:
> Test the strbuf API. Being used throughout all Git the API could be
> considered tested, but adding specific tests makes it easier to improve
> and extend the API.
> ---
>  Makefile   |  1 +
>  t/helper/test-strbuf.c | 69 
> ++
>  t/t0082-strbuf.sh  | 19 ++
>  3 files changed, 89 insertions(+)
>  create mode 100644 t/helper/test-strbuf.c
>  create mode 100755 t/t0082-strbuf.sh
> 
> diff --git a/Makefile b/Makefile
> index 3f03366..dc84f43 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -613,6 +613,7 @@ TEST_PROGRAMS_NEED_X += test-scrap-cache-tree
>  TEST_PROGRAMS_NEED_X += test-sha1
>  TEST_PROGRAMS_NEED_X += test-sha1-array
>  TEST_PROGRAMS_NEED_X += test-sigchain
> +TEST_PROGRAMS_NEED_X += test-strbuf
>  TEST_PROGRAMS_NEED_X += test-string-list
>  TEST_PROGRAMS_NEED_X += test-submodule-config
>  TEST_PROGRAMS_NEED_X += test-subprocess
> diff --git a/t/helper/test-strbuf.c b/t/helper/test-strbuf.c
> new file mode 100644
> index 000..622f627
> --- /dev/null
> +++ b/t/helper/test-strbuf.c
> @@ -0,0 +1,69 @@
> +#include "git-compat-util.h"
> +#include "strbuf.h"
> +
> +/*
> + * Check behavior on usual use cases
> + */
> +int test_usual(struct strbuf *sb)

Is there a particular reason that you pass `sb` into the function? Why
not use a local variable?

It wouldn't hurt to declare this function static, because it is only
used within this one compilation unit. On the other hand, in this
particular case I don't think it matters much one way or the other.

> +{
> + size_t size, old_alloc;
> + char *res, *old_buf, *str_test = malloc(5*sizeof(char));

There is no reason to multiply by `sizeof(char)` here, and we don't do
it in our code. (According to the C standard, `sizeof(char)` is always 1.)

> + strbuf_grow(sb, 1);
> + strcpy(str_test, "test");
> + old_alloc = sb->alloc;
> + strbuf_grow(sb, 1000);
> + if (old_alloc == sb->alloc)
> + die("strbuf_grow does not realloc the buffer as expected");

It's not ideal that your test depends on the details of the allocation
policy of strbuf. If somebody were to decide that it makes sense for the
initial allocation of a strbuf to be 1024 characters, this test would
break. I know it's implausible, but it's better to remove this coupling.

You could do it by ensuring that you request more space than is already
allocated:

strbuf_grow(sb, sb.alloc - sb.len + 1000);

On the other hand, if you *want* to test the details of strbuf's
allocation policy here, you would do it explicitly rather than as the
side effect of this test. For example, before calling `strbuf_grow(sb,
1000)`, you could insert:

if (sb.alloc > SOME_VALUE)
die("strbuf_grow(sb, 1) allocated too much space");

Though my opinion is that if you want to write tests of strbuf's
allocation policies, it should be in an entirely separate test.

> + old_buf = sb->buf;
> + res = strbuf_detach(sb, );

Since you don't use `size`, you can pass `NULL` as the second argument
of `strbuf_detach()`. The same below.

Alternatively, maybe you want to add a test that `strbuf_detach()` sets
`size` to the expected value.

> + if (res != old_buf)
> + die("strbuf_detach does not return the expected buffer");
> + free(res);
> +
> + strcpy(str_test, "test");

Near the beginning of the function you have this exact line, but here
you do it again even though str_test wasn't touched between the two
lines. One of them can be deleted...

...but in fact it would be easier to initialize str_test using our
xstrdup() function:

char *str_test = xstrdup("test");

> + strbuf_attach(sb, (void *)str_test, strlen(str_test), sizeof(str_test));

Since str_test is a `char *`, `sizeof(str_test)` returns the size of the
pointer itself (e.g., 4 or 8 bytes, depending on your computer's
architecture). What you want here is the size of the memory that it
points at, which in this case is `strlen(str_test) + 1`.

(You may be confusing this pattern with code that looks like this:

char str_test[5];

In this case, `sizeof(str_test)` is indeed 5, because `str_test` is an
array of characters rather than a pointer to a character. It's confusing.)

Also, you don't need to cast `str_test` to `void *`. Any pointer can be
converted to `void *` implicitly.

> + res = strbuf_detach(sb, );
> + if (res != str_test)
> + die("strbuf_detach does not return the expected buffer");
> + free(res);
> + strbuf_release(sb);
> +
> + return 0;
> +}
> +
> +int main(int argc, char *argv[])
> +{
> + size_t size = 1;
> + struct strbuf sb;
> + char str_test[5] = "test";
> + char str_foo[7] = "foo";

`size`, `str_test`, and `str_foo` are all unused.

You should compile using `DEVELOPER=1` to enable a bunch of compiler
warnings that Git developers tend to 

Re: [PATCH 09/13] refs: introduce an iterator interface

2016-05-30 Thread Michael Haggerty
On 05/30/2016 06:57 PM, Ramsay Jones wrote:
> 
> 
> On 30/05/16 16:22, Ramsay Jones wrote:
>>
>>
>> On 30/05/16 08:55, Michael Haggerty wrote:
>> [snip]
>>
>>>  /* Reference is a symbolic reference. */
>>> diff --git a/refs/files-backend.c b/refs/files-backend.c
>>> index 8ab4d5f..dbf1587 100644
>>> --- a/refs/files-backend.c
>>> +++ b/refs/files-backend.c
>>> [...]
>>> +   sort_ref_dir(dir);
>>
>> do you need to sort here ...
>>> [...]
>>> +   sort_ref_dir(level->dir);
>>
>> ... given that you sort here?
> 
> I had intended to say 'or vice versa' here. When I wrote this, I had not
> finished reading this patch (let alone the series). Now, I suspect that
> you can simply drop this 'sort_ref_dir()' call site. Unless I've misread
> the code, of course! ;-)

Yes, you are right. Thanks for catching this! I'll fix it in v2.

Michael

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] t7800 readlink not found

2016-05-30 Thread Armin Kunaschik
On 05/27/2016 06:19 AM, David Aguilar wrote:
> On Wed, May 25, 2016 at 11:33:33AM +0200, Armin Kunaschik wrote:
> 
> Would you mind submitting a patch so that we can support these
> tests when running on AIX/HP-UX?

I don't feel comfortable to submit patches for tests I can't verify. I
don't have valgrind and python/p4 here. Looking to the code I'd say,
patching the p4 tests with "ls -ld | sed" looks quite save.
But I'm not sure about the test-lib.sh. When you are really super
paranoid, as written in the comment, you should probably use perl like

perl -e 'print readlink $ARGV[0]' $name

as a replacement.

So, as suggested by Junio, here the readlink workaround for t7800 only.
(hopefully whitespace clean this time)

--- 8< --- 8< ---
From: Armin Kunaschik 
Subject: t7800: readlink is not portable

The readlink(1) command is not available on all platforms (notably not
on AIX and HP-UX) and can be replaced in this test with the "workaround"

ls -ld  | sed -e 's/.* -> //'

This is no universal readlink replacement but works in the controlled
test environment good enough.

Signed-off-by: Armin Kunaschik 
---

diff --git a/t/t7800-difftool.sh b/t/t7800-difftool.sh
index 7ce4cd7..905035c 100755
--- a/t/t7800-difftool.sh
+++ b/t/t7800-difftool.sh
@@ -446,7 +446,7 @@ write_script .git/CHECK_SYMLINKS <<\EOF
 for f in file file2 sub/sub
 do
echo "$f"
-   readlink "$2/$f"
+   ls -ld "$2/$f" | sed -e 's/.* -> //'
 done >actual
 EOF

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 3/3] am: support --patch-format=mboxrd

2016-05-30 Thread Eric Wong
Combined with "git format-patch --pretty=mboxrd", this should
allow us to round-trip commit messages with embedded mbox
"From " lines without corruption.

Signed-off-by: Eric Wong 
---
 Documentation/git-am.txt |  3 ++-
 builtin/am.c | 14 +++---
 t/t4150-am.sh| 20 
 3 files changed, 33 insertions(+), 4 deletions(-)

diff --git a/Documentation/git-am.txt b/Documentation/git-am.txt
index 13cdd7f..6348c29 100644
--- a/Documentation/git-am.txt
+++ b/Documentation/git-am.txt
@@ -116,7 +116,8 @@ default.   You can use `--no-utf8` to override this.
By default the command will try to detect the patch format
automatically. This option allows the user to bypass the automatic
detection and specify the patch format that the patch(es) should be
-   interpreted as. Valid formats are mbox, stgit, stgit-series and hg.
+   interpreted as. Valid formats are mbox, mboxrd,
+   stgit, stgit-series and hg.
 
 -i::
 --interactive::
diff --git a/builtin/am.c b/builtin/am.c
index 3dfe70b..d5da5fe 100644
--- a/builtin/am.c
+++ b/builtin/am.c
@@ -70,7 +70,8 @@ enum patch_format {
PATCH_FORMAT_MBOX,
PATCH_FORMAT_STGIT,
PATCH_FORMAT_STGIT_SERIES,
-   PATCH_FORMAT_HG
+   PATCH_FORMAT_HG,
+   PATCH_FORMAT_MBOXRD
 };
 
 enum keep_type {
@@ -712,7 +713,8 @@ done:
  * Splits out individual email patches from `paths`, where each path is either
  * a mbox file or a Maildir. Returns 0 on success, -1 on failure.
  */
-static int split_mail_mbox(struct am_state *state, const char **paths, int 
keep_cr)
+static int split_mail_mbox(struct am_state *state, const char **paths,
+   int keep_cr, int mboxrd)
 {
struct child_process cp = CHILD_PROCESS_INIT;
struct strbuf last = STRBUF_INIT;
@@ -724,6 +726,8 @@ static int split_mail_mbox(struct am_state *state, const 
char **paths, int keep_
argv_array_push(, "-b");
if (keep_cr)
argv_array_push(, "--keep-cr");
+   if (mboxrd)
+   argv_array_push(, "--mboxrd");
argv_array_push(, "--");
argv_array_pushv(, paths);
 
@@ -965,13 +969,15 @@ static int split_mail(struct am_state *state, enum 
patch_format patch_format,
 
switch (patch_format) {
case PATCH_FORMAT_MBOX:
-   return split_mail_mbox(state, paths, keep_cr);
+   return split_mail_mbox(state, paths, keep_cr, 0);
case PATCH_FORMAT_STGIT:
return split_mail_conv(stgit_patch_to_mail, state, paths, 
keep_cr);
case PATCH_FORMAT_STGIT_SERIES:
return split_mail_stgit_series(state, paths, keep_cr);
case PATCH_FORMAT_HG:
return split_mail_conv(hg_patch_to_mail, state, paths, keep_cr);
+   case PATCH_FORMAT_MBOXRD:
+   return split_mail_mbox(state, paths, keep_cr, 1);
default:
die("BUG: invalid patch_format");
}
@@ -2201,6 +2207,8 @@ static int parse_opt_patchformat(const struct option 
*opt, const char *arg, int
*opt_value = PATCH_FORMAT_STGIT_SERIES;
else if (!strcmp(arg, "hg"))
*opt_value = PATCH_FORMAT_HG;
+   else if (!strcmp(arg, "mboxrd"))
+   *opt_value = PATCH_FORMAT_MBOXRD;
else
return error(_("Invalid value for --patch-format: %s"), arg);
return 0;
diff --git a/t/t4150-am.sh b/t/t4150-am.sh
index b41bd17..74e093d 100755
--- a/t/t4150-am.sh
+++ b/t/t4150-am.sh
@@ -957,4 +957,24 @@ test_expect_success 'am -s unexpected trailer block' '
test_cmp expect actual
 '
 
+test_expect_success 'am --patch-format=mboxrd handles mboxrd' '
+   rm -fr .git/rebase-apply &&
+   git checkout -f first &&
+   echo mboxrd >>file &&
+   git add file &&
+   cat >msg <<-INPUT_END &&
+   mboxrd should escape the body
+
+   From could trip up a loose mbox parser
+   >From extra escape for reversibility
+   INPUT_END
+   git commit -F msg &&
+   git format-patch --pretty=mboxrd --stdout -1 >mboxrd1 &&
+   grep "^>From could trip up a loose mbox parser" mboxrd1 &&
+   git checkout -f first &&
+   git am --patch-format=mboxrd mboxrd1 &&
+   git cat-file commit HEAD | tail -n4 >out &&
+   test_cmp msg out
+'
+
 test_done
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 2/3] mailsplit: support unescaping mboxrd messages

2016-05-30 Thread Eric Wong
This will allow us to parse the output of --pretty=mboxrd
and the output of other mboxrd generators.

Signed-off-by: Eric Wong 
---
 Documentation/git-mailsplit.txt |  7 ++-
 builtin/mailsplit.c | 23 +++
 t/t5100-mailinfo.sh | 13 +
 t/t5100/0001mboxrd  |  4 
 t/t5100/0002mboxrd  |  3 +++
 t/t5100/sample.mboxrd   | 17 +
 6 files changed, 66 insertions(+), 1 deletion(-)
 create mode 100644 t/t5100/0001mboxrd
 create mode 100644 t/t5100/0002mboxrd
 create mode 100644 t/t5100/sample.mboxrd

diff --git a/Documentation/git-mailsplit.txt b/Documentation/git-mailsplit.txt
index 4d1b871..e3b2a88 100644
--- a/Documentation/git-mailsplit.txt
+++ b/Documentation/git-mailsplit.txt
@@ -8,7 +8,8 @@ git-mailsplit - Simple UNIX mbox splitter program
 SYNOPSIS
 
 [verse]
-'git mailsplit' [-b] [-f] [-d] [--keep-cr] -o [--] 
[(|)...]
+'git mailsplit' [-b] [-f] [-d] [--keep-cr] [--mboxrd]
+   -o [--] [(|)...]
 
 DESCRIPTION
 ---
@@ -47,6 +48,10 @@ OPTIONS
 --keep-cr::
Do not remove `\r` from lines ending with `\r\n`.
 
+--mboxrd::
+   Input is of the "mboxrd" format and "^>+From " line escaping is
+   reversed.
+
 GIT
 ---
 Part of the linkgit:git[1] suite
diff --git a/builtin/mailsplit.c b/builtin/mailsplit.c
index 4859ede..fad871d 100644
--- a/builtin/mailsplit.c
+++ b/builtin/mailsplit.c
@@ -45,6 +45,7 @@ static int is_from_line(const char *line, int len)
 
 static struct strbuf buf = STRBUF_INIT;
 static int keep_cr;
+static regex_t *gtfrom;
 
 /* Called with the first line (potentially partial)
  * already in buf[] -- normally that should begin with
@@ -77,6 +78,10 @@ static int split_one(FILE *mbox, const char *name, int 
allow_bare)
strbuf_addch(, '\n');
}
 
+   if (gtfrom && buf.len > (sizeof(">From ") - 1) &&
+   !regexec(gtfrom, buf.buf, 0, 0, 0))
+   strbuf_remove(, 0, 1);
+
if (fwrite(buf.buf, 1, buf.len, output) != buf.len)
die_errno("cannot write output");
 
@@ -242,6 +247,22 @@ out:
return ret;
 }
 
+static regex_t *gtfrom_prepare(void)
+{
+   static regex_t preg;
+   const char re[] = "^>+From ";
+   int err = regcomp(, re, REG_NOSUB | REG_EXTENDED);
+
+   if (err) {
+   char errbuf[1024];
+   regerror(err, , errbuf, sizeof(errbuf));
+   regfree();
+   die("Cannot prepare regexp `%s': %s", re, errbuf);
+   }
+
+   return 
+}
+
 int cmd_mailsplit(int argc, const char **argv, const char *prefix)
 {
int nr = 0, nr_prec = 4, num = 0;
@@ -271,6 +292,8 @@ int cmd_mailsplit(int argc, const char **argv, const char 
*prefix)
keep_cr = 1;
} else if ( arg[1] == 'o' && arg[2] ) {
dir = arg+2;
+   } else if (!strcmp(arg, "--mboxrd")) {
+   gtfrom = gtfrom_prepare();
} else if ( arg[1] == '-' && !arg[2] ) {
argp++; /* -- marks end of options */
break;
diff --git a/t/t5100-mailinfo.sh b/t/t5100-mailinfo.sh
index 85b3df5..62b442c 100755
--- a/t/t5100-mailinfo.sh
+++ b/t/t5100-mailinfo.sh
@@ -111,4 +111,17 @@ test_expect_success 'mailinfo on message with quoted 
>From' '
test_cmp "$TEST_DIRECTORY"/t5100/quoted-from.expect quoted-from/msg
 '
 
+test_expect_success 'mailinfo unescapes with --mboxrd' '
+   mkdir mboxrd &&
+   git mailsplit -omboxrd --mboxrd \
+   "$TEST_DIRECTORY"/t5100/sample.mboxrd >last &&
+   test x"$(cat last)" = x2 &&
+   for i in 0001 0002
+   do
+   git mailinfo mboxrd/msg mboxrd/patch \
+ mboxrd/out &&
+   test_cmp "$TEST_DIRECTORY"/t5100/${i}mboxrd mboxrd/msg
+   done
+'
+
 test_done
diff --git a/t/t5100/0001mboxrd b/t/t5100/0001mboxrd
new file mode 100644
index 000..494ec55
--- /dev/null
+++ b/t/t5100/0001mboxrd
@@ -0,0 +1,4 @@
+From the beginning, mbox should have been mboxrd
+>From escaped
+From not mangled but this line should have been escaped
+
diff --git a/t/t5100/0002mboxrd b/t/t5100/0002mboxrd
new file mode 100644
index 000..3c30b0b
--- /dev/null
+++ b/t/t5100/0002mboxrd
@@ -0,0 +1,3 @@
+ >From unchanged
+ From also unchanged
+
diff --git a/t/t5100/sample.mboxrd b/t/t5100/sample.mboxrd
new file mode 100644
index 000..75de6dd
--- /dev/null
+++ b/t/t5100/sample.mboxrd
@@ -0,0 +1,17 @@
+From mboxrd Mon Sep 17 00:00:00 2001
+From: mboxrd writer 
+Date: Fri, 9 Jun 2006 00:44:16 -0700
+Subject: [PATCH] a commit with escaped From lines
+
+>From the beginning, mbox should have been mboxrd
+>>From escaped
+From not mangled but this line should have been escaped
+
+From mboxrd Mon Sep 17 00:00:00 2001
+From: mboxrd writer 

[PATCH 1/3] pretty: support "mboxrd" output format

2016-05-30 Thread Eric Wong
This output format prevents format-patch output from breaking
readers if somebody copy+pasted an mbox into a commit message.

Unlike the traditional "mboxo" format, "mboxrd" is designed to
be fully-reversible.  "mboxrd" also gracefully degrades to
showing extra ">" in existing "mboxo" readers.

This degradation is preferable to breaking message splitting
completely, a problem I've seen in "mboxcl" due to having
multiple, non-existent, or inaccurate Content-Length headers.

"mboxcl2" is a non-starter since it's inherits the problems
of "mboxcl" while being completely incompatible with existing
tooling based around mailsplit.

ref: 
http://homepage.ntlworld.com/jonathan.deboynepollard/FGA/mail-mbox-formats.html

Signed-off-by: Eric Wong 
---
 builtin/log.c   |  2 +-
 commit.h|  6 ++
 log-tree.c  |  4 ++--
 pretty.c| 45 +
 t/t4014-format-patch.sh | 27 +++
 5 files changed, 73 insertions(+), 11 deletions(-)

diff --git a/builtin/log.c b/builtin/log.c
index 099f4f7..6d6f368 100644
--- a/builtin/log.c
+++ b/builtin/log.c
@@ -953,7 +953,7 @@ static void make_cover_letter(struct rev_info *rev, int 
use_stdout,
struct pretty_print_context pp = {0};
struct commit *head = list[0];
 
-   if (rev->commit_format != CMIT_FMT_EMAIL)
+   if (!cmit_fmt_is_mail(rev->commit_format))
die(_("Cover letter needs email format"));
 
committer = git_committer_info(0);
diff --git a/commit.h b/commit.h
index b06db4d..1e04d3a 100644
--- a/commit.h
+++ b/commit.h
@@ -131,11 +131,17 @@ enum cmit_fmt {
CMIT_FMT_FULLER,
CMIT_FMT_ONELINE,
CMIT_FMT_EMAIL,
+   CMIT_FMT_MBOXRD,
CMIT_FMT_USERFORMAT,
 
CMIT_FMT_UNSPECIFIED
 };
 
+static inline int cmit_fmt_is_mail(enum cmit_fmt fmt)
+{
+   return (fmt == CMIT_FMT_EMAIL || fmt == CMIT_FMT_MBOXRD);
+}
+
 struct pretty_print_context {
/*
 * Callers should tweak these to change the behavior of pp_* functions.
diff --git a/log-tree.c b/log-tree.c
index 78a5381..48daf84 100644
--- a/log-tree.c
+++ b/log-tree.c
@@ -603,7 +603,7 @@ void show_log(struct rev_info *opt)
 * Print header line of header..
 */
 
-   if (opt->commit_format == CMIT_FMT_EMAIL) {
+   if (cmit_fmt_is_mail(opt->commit_format)) {
log_write_email_headers(opt, commit, , 
_headers,
_8bit_cte);
} else if (opt->commit_format != CMIT_FMT_USERFORMAT) {
@@ -694,7 +694,7 @@ void show_log(struct rev_info *opt)
 
if ((ctx.fmt != CMIT_FMT_USERFORMAT) &&
ctx.notes_message && *ctx.notes_message) {
-   if (ctx.fmt == CMIT_FMT_EMAIL) {
+   if (cmit_fmt_is_mail(ctx.fmt)) {
strbuf_addstr(, "---\n");
opt->shown_dashes = 1;
}
diff --git a/pretty.c b/pretty.c
index 87c4497..a33e604 100644
--- a/pretty.c
+++ b/pretty.c
@@ -92,6 +92,7 @@ static void setup_commit_formats(void)
{ "medium", CMIT_FMT_MEDIUM,0,  8 },
{ "short",  CMIT_FMT_SHORT, 0,  0 },
{ "email",  CMIT_FMT_EMAIL, 0,  0 },
+   { "mboxrd", CMIT_FMT_MBOXRD,0,  0 },
{ "fuller", CMIT_FMT_FULLER,0,  8 },
{ "full",   CMIT_FMT_FULL,  0,  8 },
{ "oneline",CMIT_FMT_ONELINE,   1,  0 }
@@ -444,7 +445,7 @@ void pp_user_info(struct pretty_print_context *pp,
if (pp->mailmap)
map_user(pp->mailmap, , , , );
 
-   if (pp->fmt == CMIT_FMT_EMAIL) {
+   if (cmit_fmt_is_mail(pp->fmt)) {
if (pp->from_ident && ident_cmp(pp->from_ident, )) {
struct strbuf buf = STRBUF_INIT;
 
@@ -494,6 +495,7 @@ void pp_user_info(struct pretty_print_context *pp,
show_ident_date(, >date_mode));
break;
case CMIT_FMT_EMAIL:
+   case CMIT_FMT_MBOXRD:
strbuf_addf(sb, "Date: %s\n",
show_ident_date(, DATE_MODE(RFC2822)));
break;
@@ -535,7 +537,7 @@ static void add_merge_info(const struct 
pretty_print_context *pp,
 {
struct commit_list *parent = commit->parents;
 
-   if ((pp->fmt == CMIT_FMT_ONELINE) || (pp->fmt == CMIT_FMT_EMAIL) ||
+   if ((pp->fmt == CMIT_FMT_ONELINE) || (cmit_fmt_is_mail(pp->fmt)) ||
!parent || !parent->next)
return;
 
@@ -1614,7 +1616,7 @@ void pp_title_line(struct pretty_print_context *pp,
if (pp->after_subject) {
strbuf_addstr(sb, pp->after_subject);
}
-   if (pp->fmt == CMIT_FMT_EMAIL) {
+   if (cmit_fmt_is_mail(pp->fmt)) {
strbuf_addch(sb, '\n');
}
 
@@ -1697,12 +1699,34 

[RFC/PATCH 0/3] support mboxrd format

2016-05-30 Thread Eric Wong
Sometimes users will copy+paste an entire mbox into a
commit message, leading to bad splits when a patch is
output as an email.

Unlike other mbox-family formats, mboxrd allows reversible
round-tripping while avoiding bad splits for old "mboxo"
readers.

I'm also considering altering the current
"From ${COMMIT} Mon Sep 17 00:00:00 2001" line to something
else so mailsplit (or "am") can autodetect.

Maybe:
From ${COMMIT}@mboxrd Mon Sep 17 00:00:00 2001
?


We may also want to default to single escaping "From " in
commit messages for --pretty=email to avoid corruption when
somebody copy+pastes an mbox into the commit message.
This is a technically incompatible change, but I think it's
preferable to breaking splitting complete.

In other words, --pretty=email changes to output "mboxo"
for now.

Long term (possibly git 3.0?), maybe mboxrd can become the
default mail format.  IMHO, it should've been since 2005.

ref: 
http://homepage.ntlworld.com/jonathan.deboynepollard/FGA/mail-mbox-formats.html

Eric Wong (3):
  pretty: support "mboxrd" output format
  mailsplit: support unescaping mboxrd messages
  am: support --patch-format=mboxrd

 Documentation/git-am.txt|  3 ++-
 Documentation/git-mailsplit.txt |  7 ++-
 builtin/am.c| 14 ++---
 builtin/log.c   |  2 +-
 builtin/mailsplit.c | 23 +
 commit.h|  6 ++
 log-tree.c  |  4 ++--
 pretty.c| 45 +
 t/t4014-format-patch.sh | 27 +
 t/t4150-am.sh   | 20 ++
 t/t5100-mailinfo.sh | 13 
 t/t5100/0001mboxrd  |  4 
 t/t5100/0002mboxrd  |  3 +++
 t/t5100/sample.mboxrd   | 17 
 14 files changed, 172 insertions(+), 16 deletions(-)
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/2] strbuf: allow to use preallocated memory

2016-05-30 Thread Mike Hommey
On Tue, May 31, 2016 at 12:46:23AM +0200, William Duclot wrote:
> Mike Hommey  writes:
> >>  struct strbuf {
> >> +  unsigned int flags;
> >>size_t alloc;
> >>size_t len;
> >>char *buf;
> >>  };
> > 
> > Depending whether the size of strbuf matters, it /might/ be worth
> > considering some packing here. malloc() usually returns buffers that can
> > contain more data than what is requested. Which means allocation sizes
> > could be rounded and that wouldn't change the amount of allocated
> > memory. On glibc malloc_usable_size(malloc(1)) apparently returns 24.
> > On jemalloc, it's 4 or 8. It's in the same ballbark with many
> > allocators.
> > 
> > So, it would be possible to round alloc such that it's always a multiple
> > of, say, 4, and stick flags in the low, unused bits.
> 
> If I'm not mistaken, the memory allocated is not necessarily linear with
> the size asked, depending on the algorithm used by the allocator and/or
> the kernel. The system for exemple use powers of two, if the user asks
> for exactly 2^x bytes, adding the space for the flags would lead to an
> allocation of 2^(x+1) bytes.

No, it would not. If you requested 129 bytes, you'd request 136 instead,
which the allocator would round to the same power of two. If you
requested 128, you'd still request 128. It's not about adding space in
the allocated buffer for the flags, it's about needing less bits in
`alloc` because those bits are effectively useless because of how
allocators work.

> Way worse than storing an unsigned.
> If the allocator use a fibonnaci system, we can't even rely on multiples
> of 4 (or 2).
> I'm not sure the fibonnaci system is actually used by any allocator, but
> my point is that I'm not sure it is a good thing to rely on such 
> low-level implementations.

Allocators have constraints related to word sizes and alignment, so they
are pretty much guaranteed to align things to powers of two.

Mike
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/2] strbuf: allow to use preallocated memory

2016-05-30 Thread William Duclot
Mike Hommey  writes:
>>  struct strbuf {
>> +unsigned int flags;
>>  size_t alloc;
>>  size_t len;
>>  char *buf;
>>  };
> 
> Depending whether the size of strbuf matters, it /might/ be worth
> considering some packing here. malloc() usually returns buffers that can
> contain more data than what is requested. Which means allocation sizes
> could be rounded and that wouldn't change the amount of allocated
> memory. On glibc malloc_usable_size(malloc(1)) apparently returns 24.
> On jemalloc, it's 4 or 8. It's in the same ballbark with many
> allocators.
> 
> So, it would be possible to round alloc such that it's always a multiple
> of, say, 4, and stick flags in the low, unused bits.

If I'm not mistaken, the memory allocated is not necessarily linear with
the size asked, depending on the algorithm used by the allocator and/or
the kernel. The system for exemple use powers of two, if the user asks
for exactly 2^x bytes, adding the space for the flags would lead to an
allocation of 2^(x+1) bytes. Way worse than storing an unsigned.
If the allocator use a fibonnaci system, we can't even rely on multiples
of 4 (or 2).
I'm not sure the fibonnaci system is actually used by any allocator, but
my point is that I'm not sure it is a good thing to rely on such 
low-level implementations.

> Whether it's worth doing is another question.

I'd say it is not, we generally lack time more than space, storing an
unsigned seems very reasonable compared to operations involved in using
malloc() leftovers :)
Not even talking about growing buffers, so realloc(), so a whole new set
of problems...
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] Documentation: GPG capitalization

2016-05-30 Thread Junio C Hamano
David Nicolson  writes:

>>>  message part may contain a signature that Git itself doesn't
>>> -care about, but that can be verified with gpg.
>>> +care about, but that can be verified with GPG.
>>
>> Isn't this a name of the program, though?  Other two hunks in your
>> patch clearly refer to the concept and not to a particular program,
>> and they are good changes, I think.
>
> This one was not as clear as the other hunks. Git is referred to as
> "Git" in the preceding line, which in itself could be referring to the
> concept or the particular program I guess?

I think the "Git" you see above is clearly the "stupid content
tracker Git", not a particular implementation.  It is more about the
data model of Git than our particular implementation of it.  jgit, a
Java implementation of Git, does not care about the gibberish
in the "--BEGIN PGP..." block the same way we don't care.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: git add without whitespace

2016-05-30 Thread Junio C Hamano
Robert Dailey  writes:

> I like your solution better than mine because it utilizes the rules
> defined in .gitattributes.

A difference that may be more important is that I do not do
generation of a patch or application of it without ignoring
whitespaces with things like -w and --ignore-whitespace.  That way,
if my edit is a correction of existing whitespace breakage (e.g. I
noticed a line that is indented by 8 spaces, and I corrected it by
replacing them with one tab), that is shown as a change by "diff"
and kept in the result.  I suspect that your "diff -w | apply --ignore"
will ignore that manual fix?

> What does the checkout at the end do? That part confuses me (granted
> I'm not well-versed with bash script).

I correct whitespace-broken updates the user (i.e. I) made in her
working tree file by adding a corrected version to the index, and
then I checkout the result out of the index to the working tree.

That corrects the breakage in both the index and the working tree,
so that my further edit to the file will start from a ws-corrected
version.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/2] strbuf: allow to use preallocated memory

2016-05-30 Thread Mike Hommey
>  struct strbuf {
> + unsigned int flags;
>   size_t alloc;
>   size_t len;
>   char *buf;
>  };

Depending whether the size of strbuf matters, it /might/ be worth
considering some packing here. malloc() usually returns buffers that can
contain more data than what is requested. Which means allocation sizes
could be rounded and that wouldn't change the amount of allocated
memory. On glibc malloc_usable_size(malloc(1)) apparently returns 24.
On jemalloc, it's 4 or 8. It's in the same ballbark with many
allocators.

So, it would be possible to round alloc such that it's always a multiple
of, say, 4, and stick flags in the low, unused bits.

Whether it's worth doing is another question.

Mike
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC/PATCH] bisect--helper: `bisect_clean_state` shell function in C

2016-05-30 Thread Christian Couder
On Mon, May 30, 2016 at 8:21 PM, Pranit Bauva  wrote:
> ---
> This patch contains a bug. I have tried to identify the bug and I suppose it
> exists in do_for_each_entry_in_dir(). I have reproduced the debugging session
> at this link[1]. I have seen that some patches in mailing list regarding
> iterating over refs. Will those affect this? Or is this bug fixed in those
> patches?
>
> [1]: http://paste.ubuntu.com/16830752/

The debug session seems to use code source from a previous version of
this patch.
Also it is not cear in which context you run git under gdb. What have
you done before?
And we don't see a crash. Could you show the crash and run the "bt"
command in gdb to get a backtrace?

> @@ -79,11 +90,42 @@ int write_terms(const char *bad, const char *good)
> strbuf_release();
> return (res < 0) ? -1 : 0;
>  }
> +
> +int remove_bisect_ref(const char *refname, const struct object_id *oid,
> +  int flag, void *cb_data)
> +{
> +   char *ref;
> +   ref = xstrfmt("refs/bisect/%s", refname);

You could save one line by concatenating the 2 above lines.

> +   if (delete_ref(ref, oid->hash, flag))
> +   return error(_("couldn't delete the ref %s\n"), ref);
> +   return 0;

You need to free "ref".

> +}

Thanks,
Christian.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] Documentation: GPG capitalization

2016-05-30 Thread David Nicolson
On Mon, May 30, 2016 at 8:02 PM, Junio C Hamano  wrote:
> Dave Nicolson  writes:
>
>> When "GPG" is used in a sentence it is now consistently capitalized. When 
>> referring to the binary it is left as "gpg".
>>
>> Signed-off-by: David Nicolson 
>> ---
>>  Documentation/git-mktag.txt | 2 +-
>>  Documentation/git-tag.txt   | 2 +-
>>  Documentation/git-verify-commit.txt | 4 ++--
>>  3 files changed, 4 insertions(+), 4 deletions(-)
>>
>> diff --git a/Documentation/git-mktag.txt b/Documentation/git-mktag.txt
>> index fa6a756..724408d 100644
>> --- a/Documentation/git-mktag.txt
>> +++ b/Documentation/git-mktag.txt
>> @@ -32,7 +32,7 @@ followed by some 'optional' free-form message (some tags 
>> created
>>  by older Git may not have `tagger` line).  The message, when
>>  exists, is separated by a blank line from the header.  The
>>  message part may contain a signature that Git itself doesn't
>> -care about, but that can be verified with gpg.
>> +care about, but that can be verified with GPG.
>
> Isn't this a name of the program, though?  Other two hunks in your
> patch clearly refer to the concept and not to a particular program,
> and they are good changes, I think.

This one was not as clear as the other hunks. Git is referred to as
"Git" in the preceding line, which in itself could be referring to the
concept or the particular program I guess?
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 1/8] t4051: rewrite, add more tests

2016-05-30 Thread René Scharfe

Am 30.05.2016 um 01:55 schrieb Junio C Hamano:

René Scharfe  writes:


+commit_and_tag () {
+   message=$1 &&
+   shift &&
+   git add $@ &&


Lack of dq around $@ makes me wonder if there is something funny
going on (looking at the callers, there isn't, so we'd better quote
it to avoid wasting time, I think).


OK.


+   test_tick &&
+   git commit -m $message &&
+   git tag $message
  }


The use of $message as the sole argument to "git tag" makes the
readers guess that it must be a single token without any funny
character, so the readers would probably do not waste too much time
wondreing if the lack of dq around $message in the last two is
problematic.


Well, let's call it $tag; $message is a bit misleading here.  The saved 
letters can be invested in quotes. ;)



+last_context_line () {
+   sed -n '$ p'
  }


I have a vague recollection that some implementations of sed are
unhappy to see that space between the address and the operation; I'd
feel safer without it.


Indeed most sed calls in t/ have no space there (found counter-examples 
only in annotate-tests.sh, t4201-shortlog.sh, t9824-git-p4-git-lfs.sh).



+check_diff () {
+   name=$1
+   desc=$2
+   options="-W $3"
+
+   test_expect_success "$desc" '
+   git diff $options "$name^" "$name" >"$name.diff"
+   '
+
+   test_expect_success ' diff applies' '
+   test_when_finished "git reset --hard" &&
+   git checkout --detach "$name^" &&


With the presence of ^ there, --detach is unnecessary; it would not
hurt, though.


Right.  It's just there to make that intent clear.


+   git apply "$name.diff" &&
+   git diff --exit-code "$name"


Even though we may know that $name.diff" will never have a creation
of new paths, I'd feel safer if "apply" is run with "--index".


Makes sense; the less we assume about the diff to be checked the better.

Thanks a lot!

René
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: git add without whitespace

2016-05-30 Thread Robert Dailey
On Mon, May 30, 2016 at 2:06 PM, Junio C Hamano  wrote:
> I have had this in my ~/.gitconfig for a long time.
>
> [alias]
> wsadd = "!sh -c 'git diff -- \"$@\" | git apply --cached 
> --whitespace=fix;\
> git co -- ${1-.} \"$@\"' -"
>
> That is, "take what's different from the _index_ and the working
> tree, apply that difference while correcting whitespace errors to
> the index, and check the result out to the working tree".  This
> would _not_ touch existing whitespace-damaged lines that you are not
> touching, and honours the customized definition of what is
> considered whitespace breakage for each paths (which you set up with
> the attributes system).
>

I like your solution better than mine because it utilizes the rules
defined in .gitattributes. I think that's a really good idea. But
other than that, yours is functionally the same as what I'm doing,
right? I just want to make sure I understand: What ends up in the
index/staging area is the code MINUS the trailing whitespace (e.g.
whitespace errors)?

What does the checkout at the end do? That part confuses me (granted
I'm not well-versed with bash script).

Thanks for the feedback. Looks like this is niche enough that an
alias/script is probably the best solution.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [WIP-PATCH 0/2] send-email: refactor the email parser loop

2016-05-30 Thread Matthieu Moy
Samuel GROOT  writes:

> Can we consider this feature obsolete and remove it?

We're usually quite conservative with backward compatibility. If we
remove the feature, we may want to announce it in the next feature
release and actually remove it in the one after (unless we get valid
objection in the meantime).

I'm all for dropping a feature that no one uses if it turns out to be
the case.

-- 
Matthieu Moy
http://www-verimag.imag.fr/~moy/
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v5] upload-pack.c: use parse-options API

2016-05-30 Thread Junio C Hamano
Antoine Queru  writes:

> From: Antoine Queru 

Don't you want to be known to the project as the email that matches
your Signed-off-by: line?

> Use the parse-options API rather than a hand-rolled option parser.
>
> Description for --stateless-rpc and --advertise-refs come from
> 42526b4 (Add stateless RPC options to upload-pack,
> receive-pack, 2009-10-30).
>
> Signed-off-by: Antoine Queru 
> Signed-off-by: Matthieu Moy 
> ---
> ---strict::
> +--[no-]strict::
>   Do not try /.git/ if  is no Git directory.

Not a new problem, but "is no Git ..." may technically be correct,
but it would be easier to read if phrased "is not a Git ..." or
something like that.

I am NOT asking _you_ to change/fix that in this patch.  It
is just a note for "a low hanging fruit" for people to pick
up with a separate patch.

> +--stateless-rpc::
> + Perform only a single read-write cycle with stdin and stdout.
> + This fits with the HTTP POST request processing model where
> + a program may read the request, write a response, and must exit.
> +
> +--advertise-refs::
> + Only the initial ref advertisement is output, and the program exits
> + immediately. This fits with the HTTP GET request model, where
> + no request content is received but a response must be produced.
> +

Good.

> diff --git a/upload-pack.c b/upload-pack.c
> index dc802a0..083d068 100644
> --- a/upload-pack.c
> +++ b/upload-pack.c
> @@ -14,8 +14,12 @@
>  #include "sigchain.h"
>  #include "version.h"
>  #include "string-list.h"
> +#include "parse-options.h"
>  
> -static const char upload_pack_usage[] = "git upload-pack [--strict] 
> [--timeout=] ";
> +static const char * const upload_pack_usage[] = {
> + N_("git upload-pack [options] "),
> + NULL
> +};

Output from "git grep -e '\[option' -e '\[ to highlight it as a
placeholder.  An argument that "it is done for uniformity
with descriptions for other non option arguments, i.e. not
to special case 'options'" would not hold water, e.g. in

builtin/merge.c:N_("git merge [] [...]"),

 is still special-cased in that it implies multiple
things, unlike "..." notation that has to explicitly
say that can have multiple.  [...] would have been
justifiable with the "make it uniform with non-option args",
though.

BUT this patch is not about "make usage string better"
patch, so I do NOT want you to switch upload-pack's usage
string to use the "options is special anyway, so let's not
waste two display columns with enclosing <>" convention.

So, in conclusion, "git upload-pack [] ".

The remainder of the patch looked OK to me.

Thanks.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: git add without whitespace

2016-05-30 Thread Junio C Hamano
Robert Dailey  writes:

> $ git diff -U0 -w --no-color | git apply --cached --ignore-whitespace
> --unidiff-zero
>
> This command explicitly leaves out context because it can sometimes
> cause the patch to fail to apply, I think due to whitespace being in
> it, but I'm not completely sure myself.

I have had this in my ~/.gitconfig for a long time.

[alias]
wsadd = "!sh -c 'git diff -- \"$@\" | git apply --cached 
--whitespace=fix;\
git co -- ${1-.} \"$@\"' -"

That is, "take what's different from the _index_ and the working
tree, apply that difference while correcting whitespace errors to
the index, and check the result out to the working tree".  This
would _not_ touch existing whitespace-damaged lines that you are not
touching, and honours the customized definition of what is
considered whitespace breakage for each paths (which you set up with
the attributes system).

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Fatal bug on revert with --author

2016-05-30 Thread Jeff King
On Mon, May 30, 2016 at 04:16:50PM +0200, Andreas Lutro wrote:

> So I learned today that --author is not a supported argument to git
> revert. It took me a long time to realize this, though, because the
> error I got was very cryptic:
> 
>   fatal: BUG: expected exactly one commit from walk
> 
> Here's a very simple reproducible case:
> https://gist.github.com/anlutro/67e0cec1a6a419e6d44131b0bc1deff6
> 
> I was recommended to send this report here by #git on irc.freenode.net.

Hmm. It _is_ a supported command-line argument, as you can pass
arbitrary revision options to revert. So:

  git revert --author peff HEAD

should revert all of my commits (whether that is _useful_, I am not
sure, but it is a consequence of the fact that revert simply passes its
arguments to the regular revision machinery).

But in your example, you pass the author "test", which matches nothing.
And so we end up with nothing to revert. But rather than saying so, we
fall into a backwards-compatibility code path:

/*
 * If we were called as "git cherry-pick ", just
 * cherry-pick/revert it, set CHERRY_PICK_HEAD /
 * REVERT_HEAD, and don't touch the sequencer state.
 * This means it is possible to cherry-pick in the middle
 * of a cherry-pick sequence.
 */
if (opts->revs->cmdline.nr == 1 &&
opts->revs->cmdline.rev->whence == REV_CMD_REV &&
opts->revs->no_walk &&
!opts->revs->cmdline.rev->flags) {
struct commit *cmit;
if (prepare_revision_walk(opts->revs))
die(_("revision walk setup failed"));
cmit = get_revision(opts->revs);
if (!cmit || get_revision(opts->revs))
die("BUG: expected exactly one commit from walk");
return single_pick(cmit, opts);
}

I think this conditional should not be triggering, because even though
we did receive a single argument, we _also_ got options which imply that
the user expected us to walk and find other commits.

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: How to cut extra liftime of expired loose objects

2016-05-30 Thread Junio C Hamano
Jeff King  writes:

> I think we should consider dropping the default time to something more
> like 1 day. The 2-week period predates reflogs, I believe (OTOH, it does
> save objects which you might have used with "git add" but never actually
> committed).

Your belief is shared by me.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: How to add custom metadata to Git commit object

2016-05-30 Thread Junio C Hamano
Jeff King  writes:

> On Mon, May 30, 2016 at 08:58:08PM +0300, Kirill Likhodedov wrote:
> ...
>> There are git-notes, which could be used for the purpose, but they are
>> visible to the user via standard Git command, and could be used by the
>> user for other purposes, so they are not very suitable for the task.
>
> Notes would work for this, too, but their main advantage is that they
> can be created _after_ a commit has already been made (whereas anything
> in the commit object itself will influence its sha1 id).

I would have said the same but with s/but/and/.  If the "rename
hint" or whatever other "custom metadata" Kirill gives to a commit
is found to be wrong, it can be corrected later.

And "the user can use notes for other purposes" is not a good reason
to reject them.  The whole point of allowing custom notes ref is so
that Kirill is not restricted to use the usual notes/commits ref to
store this custom notes in its dedicated notes/kirills-metadata ref.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v1 0/1] t6038-merge-text-auto.sh

2016-05-30 Thread Junio C Hamano
Junio C Hamano  writes:

> Imagine that you started from a history where somebody recorded a
> text file with CRLF in the blob, unconverted.  Later the project
> decided to express their text with LF to support cross-platform
> development better, and sets up the Auto-CRLF.  Your user is working
> near the tip of that history after the eol correction happened.  Now
> she gets a pull-request of a branch that forked from an old point in
> the history, before the eol correction and full of CRLF.  Yes, to
> integrate the change being proposed, she needs to look at "theirs";
> that's the whole point of a "merge".  Why should she revert the eol
> correction her history has by getting fooled by the fact that the
> update was based on a part of the history before the eol correction?

Thinking aloud along this line a bit further, if you really cared
(and I don't feel very strongly, as I think "safe crlf" is an ugly
workaround that lets users keep their misconfiguration by penalizing
their day-to-day operation), you may want to think about doing a
3-way merge of "eol preference" beween all stages.

That is, if the common ancestor in stage #1 and the current version
in stage #2 both have its text in LF, and the data being merged in
stage #3 is in CRLF, you sayr "do not convert; the change being
brought in wants to have CRLF endings, while our history did not
care".  Similarly, if the common ancestor in stage #1 and the data
being merged in stage #3 both have CRLF, and your version in stage
#2 has LF, you say "We wanted to fix eol since the side branch
forked, and the side branch predates that fix, so we keep the eol
fix we did since we diverged", i.e. "Do convert".

For doing this, you may want to refactor the codepath that decides
"Auto-CRLF usually wants to turn CRLF in text to LF, but should we
disable that logic now, because the user already has CRLF in the
current one?" into a function that takes a single parameter, `path`,
and returns "Yes, do convert CRLF to LF / No, do not convert"
boolean.  Having a low level function that says "What's the blob at
this path in the index" and have the caller run that logic feels
unwieldy if we want to go that route.

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: How to add custom metadata to Git commit object

2016-05-30 Thread Jeff King
On Mon, May 30, 2016 at 08:58:08PM +0300, Kirill Likhodedov wrote:

> Is it possible to add custom metadata to Git commit object?  Such
> metadata should be ignored by Git commands, but could be used by a
> 3-party tool which knows the format and knows where to look.

Yes. The recommended place to stick this is in a "trailer" at the bottom
of the commit message, like:

  Rename-detection-hint: foo.java -> bar.java

or whatever scheme is useful to your tool.

> I assume that this should be possible, given that Git objects are
> actually patches, and patches can contain additional details. But can
> this be done with the help of Git commands?

Git objects aren't actually patches. Try:

  git cat-file commit HEAD

to see what an actual commit object looks like. You will see that there
are user-invisible headers before the commit message, too, but we do not
usually recommend people to add new headers here, as their semantics
would be unclear to git. For example, when rebasing such a commit,
should the header be preserved or not? It depends on its meaning.
Whereas commit messages are always preserved.

> There are git-notes, which could be used for the purpose, but they are
> visible to the user via standard Git command, and could be used by the
> user for other purposes, so they are not very suitable for the task.

Notes would work for this, too, but their main advantage is that they
can be created _after_ a commit has already been made (whereas anything
in the commit object itself will influence its sha1 id).

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [WIP-PATCH 0/2] send-email: refactor the email parser loop

2016-05-30 Thread Samuel GROOT

On 05/30/2016 04:20 PM, Matthieu Moy wrote:

Is the "lots of email" format still used?


AFAICT, it was initially supported for backward compatibility, and then
no one removed it, but I wouldn't be surprised if no one actually used
it.

I vaguely remember a message from Ryan Anderson being surprised to see
the old format still supported, but I can't find it in the archives.

In any case:

- git log --grep 'lots of email' => shows only 83b2443
- git log -S'lots of email' => likewise
- git grep 'lots of email' => just one answer in a comment

I'm not sure the feature is even tested.


`grep "non-mbox" t/t9001-send-email.sh` didn't return anything, 
apparently it's not tested.


Can we consider this feature obsolete and remove it?
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 2/3] perf: make the tests work in worktrees

2016-05-30 Thread René Scharfe
Am 30.05.2016 um 20:03 schrieb Junio C Hamano:
> Johannes Schindelin  writes:
> 
>>> This breaks perf for the non-worktree case:
>>
>> Oh drats!
>>
>>> lsr@debian:~/src/git/t/perf$ make
>>> rm -rf test-results
>>> ./run
>>> === Running 12 tests in this tree ===
>>> cp: cannot stat '.git/objects': No such file or directory
>>> error: failed to copy repository '/home/lsr/src/git/t/..' to '/tmp/trash 
>>> directory.p-perf-lib-sanity'
>>> cp: cannot stat '.git/objects': No such file or directory
>>> error: failed to copy repository '/home/lsr/src/git/t/..' to '/tmp/trash 
>>> directory.p0001-rev-list'
>>> ...
>>>
>>> Here's a fix:
>>>
>>> -- >8 --
>>> Subject: perf: make the tests work without a worktree
>>>
>>> In regular repositories $source_git and $objects_dir contain relative
>>> paths based on $source.  Go there to allow cp to resolve them.
>>>
>>> Signed-off-by: Rene Scharfe 
>>> ---
>>>   t/perf/perf-lib.sh | 1 +
>>>   1 file changed, 1 insertion(+)
>>>
>>> diff --git a/t/perf/perf-lib.sh b/t/perf/perf-lib.sh
>>> index 5ef1744..1888790 100644
>>> --- a/t/perf/perf-lib.sh
>>> +++ b/t/perf/perf-lib.sh
>>> @@ -84,6 +84,7 @@ test_perf_create_repo_from () {
>>> objects_dir="$(git -C "$source" rev-parse --git-path objects)"
>>> mkdir -p "$repo/.git"
>>> (
>>> +   cd "$source" &&
>>
>> I fear that interacts badly with the `cd "$repo"` I introduced later
>> (replacing a `cd ..`)...

Oh, right, it does if $repo is a relative path.

> What do you want to do then?  For now before -rc1 we can revert the
> whole thing so that we can get a tested thing that works in both
> layouts.

Put it in its own subshell, e.g. like this?

---
 t/perf/perf-lib.sh | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/t/perf/perf-lib.sh b/t/perf/perf-lib.sh
index 5ef1744..18c363e 100644
--- a/t/perf/perf-lib.sh
+++ b/t/perf/perf-lib.sh
@@ -84,6 +84,7 @@ test_perf_create_repo_from () {
objects_dir="$(git -C "$source" rev-parse --git-path objects)"
mkdir -p "$repo/.git"
(
+   cd "$source" &&
{ cp -Rl "$objects_dir" "$repo/.git/" 2>/dev/null ||
cp -R "$objects_dir" "$repo/.git/"; } &&
for stuff in "$source_git"/*; do
@@ -94,7 +95,9 @@ test_perf_create_repo_from () {
cp -R "$stuff" "$repo/.git/" || exit 1
;;
esac
-   done &&
+   done
+   ) &&
+   (
cd "$repo" &&
git init -q && {
test_have_prereq SYMLINKS ||
-- 
2.8.3



--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: How to add custom metadata to Git commit object

2016-05-30 Thread Konstantin Khomoutov
On Mon, 30 May 2016 20:58:08 +0300
Kirill Likhodedov  wrote:

> Is it possible to add custom metadata to Git commit object? 
> Such metadata should be ignored by Git commands, but could be used by
> a 3-party tool which knows the format and knows where to look. 
> 
> I assume that this should be possible, given that Git objects are
> actually patches, and patches can contain additional details. But can
> this be done with the help of Git commands? 
[...]
> There are git-notes, which could be used for the purpose, but they
> are visible to the user via standard Git command, and could be used
> by the user for other purposes, so they are not very suitable for the
> task.

AFAIK, within your restrictions, it's not possible because there are
only two ways to add meta information for a Git commit:

* Store it externally and somehow correlate it with the commit.

  This is what git-notes does.

* Encode it directly into a commit object.

  Since you can't use your own headers in commit objects,
  you have to encode this information into the commit message in some
  form parsable by a machine.   This is what, say, git-svn does to
  make it possible to correlate the commits it creates with their source
  Subversion revisions.

In both cases the information can be viewed by the user.

What I can't really understand is what is so bad about the user being
able to peer at that data.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: How to cut extra liftime of expired loose objects

2016-05-30 Thread Jeff King
On Mon, May 30, 2016 at 06:43:07PM +0200, Johannes Sixt wrote:

> I think there is logic somewhere in git gc or its minions that gives expired
> objects that have been packed a two weeks extra life time as loose objects.
> 
> That is, reachable loose objects are moved to a pack, but those objects that
> were packed, but become unreachable due to expired reflogs, are evicted from
> the pack and live for another two weeks or so as loose objects.

Sort of. The loose objects should get the mtime of the pack, which is
likely newer than 2 weeks. So it's not a 2-week extension, but it may
take up to 2 weeks for them to go away (and if the pack is already more
than 2 weeks old, the objects are dropped without even loosening).

> As a consequence of this, I always have a few thousand loose objects in my
> busy repositories, no matter how often I collect garbage. Is there a knob
> that removes the extra lease of life of objects without reducing the usual
> expiration times of reflogs etc?

gc.pruneExpire should do this.

I have run into this, too, and it interacts annoyingly with the
background-gc (which complains "you have too many objects; run
git-prune", leaves that in a .lock file, and then every subsequent
auto-gc spews it at me).

I think we should consider dropping the default time to something more
like 1 day. The 2-week period predates reflogs, I believe (OTOH, it does
save objects which you might have used with "git add" but never actually
committed).

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[RFC/PATCH] bisect--helper: `bisect_clean_state` shell function in C

2016-05-30 Thread Pranit Bauva
Reimplement `bisect_clean_state` shell function in C and add a
`bisect-clean-state` subcommand to `git bisect--helper` to call it from
git-bisect.sh .

Using `bisect_clean_state` subcommand is a measure to port shell
function to C so as to use the existing test suite. As more functions
are ported, this subcommand will be retired and will be called by
bisect_reset() and bisect_start().

Mentored-by: Lars Schneider 
Mentored-by: Christian Couder 
Signed-off-by: Pranit Bauva 
---
This patch contains a bug. I have tried to identify the bug and I suppose it
exists in do_for_each_entry_in_dir(). I have reproduced the debugging session
at this link[1]. I have seen that some patches in mailing list regarding
iterating over refs. Will those affect this? Or is this bug fixed in those
patches?

[1]: http://paste.ubuntu.com/16830752/

 builtin/bisect--helper.c | 50 +++-
 git-bisect.sh| 26 +++--
 2 files changed, 52 insertions(+), 24 deletions(-)

diff --git a/builtin/bisect--helper.c b/builtin/bisect--helper.c
index 2b21c02..a20c9a5 100644
--- a/builtin/bisect--helper.c
+++ b/builtin/bisect--helper.c
@@ -3,10 +3,21 @@
 #include "parse-options.h"
 #include "bisect.h"
 #include "refs.h"
+#include "dir.h"
+
+static GIT_PATH_FUNC(git_path_bisect_expected_rev, "BISECT_EXPECTED_REV")
+static GIT_PATH_FUNC(git_path_bisect_ancestors_ok, "BISECT_ANCESTORS_OK")
+static GIT_PATH_FUNC(git_path_bisect_log, "BISECT_LOG")
+static GIT_PATH_FUNC(git_path_bisect_names, "BISECT_NAMES")
+static GIT_PATH_FUNC(git_path_bisect_run, "BISECT_RUN")
+static GIT_PATH_FUNC(git_path_bisect_terms, "BISECT_TERMS")
+static GIT_PATH_FUNC(git_path_head_name, "head-name")
+static GIT_PATH_FUNC(git_path_bisect_start, "BISECT_START")
 
 static const char * const git_bisect_helper_usage[] = {
N_("git bisect--helper --next-all [--no-checkout]"),
N_("git bisect--helper --write-terms  "),
+   N_("git bisect--helper --bisect-clean-state"),
NULL
 };
 
@@ -79,11 +90,42 @@ int write_terms(const char *bad, const char *good)
strbuf_release();
return (res < 0) ? -1 : 0;
 }
+
+int remove_bisect_ref(const char *refname, const struct object_id *oid,
+  int flag, void *cb_data)
+{
+   char *ref;
+   ref = xstrfmt("refs/bisect/%s", refname);
+   if (delete_ref(ref, oid->hash, flag))
+   return error(_("couldn't delete the ref %s\n"), ref);
+   return 0;
+}
+
+int bisect_clean_state(void)
+{
+   for_each_ref_in("refs/bisect/", remove_bisect_ref, NULL);
+
+   remove_path(git_path_bisect_expected_rev());
+   remove_path(git_path_bisect_ancestors_ok());
+   remove_path(git_path_bisect_log());
+   remove_path(git_path_bisect_names());
+   remove_path(git_path_bisect_run());
+   remove_path(git_path_bisect_terms());
+   /* Cleanup head-name if it got left by an old version of git-bisect */
+   remove_path(git_path_head_name());
+   delete_ref("BISECT_HEAD", NULL, 0);
+   /* Cleanup BISECT_START last */
+   remove_path(git_path_bisect_start());
+
+   return 0;
+}
+
 int cmd_bisect__helper(int argc, const char **argv, const char *prefix)
 {
enum {
NEXT_ALL = 1,
-   WRITE_TERMS
+   WRITE_TERMS,
+   BISECT_CLEAN_STATE
} cmdmode = 0;
int no_checkout = 0;
struct option options[] = {
@@ -91,6 +133,8 @@ int cmd_bisect__helper(int argc, const char **argv, const 
char *prefix)
 N_("perform 'git bisect next'"), NEXT_ALL),
OPT_CMDMODE(0, "write-terms", ,
 N_("write the terms to .git/BISECT_TERMS"), 
WRITE_TERMS),
+   OPT_CMDMODE(0, "bisect-clean-state", ,
+N_("cleanup the bisection state"), BISECT_CLEAN_STATE),
OPT_BOOL(0, "no-checkout", _checkout,
 N_("update BISECT_HEAD instead of checking out the 
current commit")),
OPT_END()
@@ -109,6 +153,10 @@ int cmd_bisect__helper(int argc, const char **argv, const 
char *prefix)
if (argc != 2)
die(_("--write-terms requires two arguments"));
return write_terms(argv[0], argv[1]);
+   case BISECT_CLEAN_STATE:
+   if (argc != 0)
+   die(_("--bisect-clean-state requires no arguments"));
+   return bisect_clean_state();
default:
die("BUG: unknown subcommand '%d'", cmdmode);
}
diff --git a/git-bisect.sh b/git-bisect.sh
index 2dd7ec5..a937d1c 100755
--- a/git-bisect.sh
+++ b/git-bisect.sh
@@ -187,7 +187,7 @@ bisect_start() {
#
# Get rid of any old bisect state.
#
-   bisect_clean_state || exit
+   git bisect--helper --bisect-clean-state || exit
 
#

Re: [PATCH v3 2/3] perf: make the tests work in worktrees

2016-05-30 Thread Junio C Hamano
Johannes Schindelin  writes:

>> This breaks perf for the non-worktree case:
>
> Oh drats!
>
>> lsr@debian:~/src/git/t/perf$ make
>> rm -rf test-results
>> ./run
>> === Running 12 tests in this tree ===
>> cp: cannot stat '.git/objects': No such file or directory
>> error: failed to copy repository '/home/lsr/src/git/t/..' to '/tmp/trash 
>> directory.p-perf-lib-sanity'
>> cp: cannot stat '.git/objects': No such file or directory
>> error: failed to copy repository '/home/lsr/src/git/t/..' to '/tmp/trash 
>> directory.p0001-rev-list'
>> ...
>> 
>> Here's a fix:
>> 
>> -- >8 --
>> Subject: perf: make the tests work without a worktree
>> 
>> In regular repositories $source_git and $objects_dir contain relative
>> paths based on $source.  Go there to allow cp to resolve them.
>> 
>> Signed-off-by: Rene Scharfe 
>> ---
>>  t/perf/perf-lib.sh | 1 +
>>  1 file changed, 1 insertion(+)
>> 
>> diff --git a/t/perf/perf-lib.sh b/t/perf/perf-lib.sh
>> index 5ef1744..1888790 100644
>> --- a/t/perf/perf-lib.sh
>> +++ b/t/perf/perf-lib.sh
>> @@ -84,6 +84,7 @@ test_perf_create_repo_from () {
>>  objects_dir="$(git -C "$source" rev-parse --git-path objects)"
>>  mkdir -p "$repo/.git"
>>  (
>> +cd "$source" &&
>
> I fear that interacts badly with the `cd "$repo"` I introduced later
> (replacing a `cd ..`)...

What do you want to do then?  For now before -rc1 we can revert the
whole thing so that we can get a tested thing that works in both
layouts.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] Documentation: GPG capitalization

2016-05-30 Thread Junio C Hamano
Dave Nicolson  writes:

> When "GPG" is used in a sentence it is now consistently capitalized. When 
> referring to the binary it is left as "gpg".
>
> Signed-off-by: David Nicolson 
> ---
>  Documentation/git-mktag.txt | 2 +-
>  Documentation/git-tag.txt   | 2 +-
>  Documentation/git-verify-commit.txt | 4 ++--
>  3 files changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/Documentation/git-mktag.txt b/Documentation/git-mktag.txt
> index fa6a756..724408d 100644
> --- a/Documentation/git-mktag.txt
> +++ b/Documentation/git-mktag.txt
> @@ -32,7 +32,7 @@ followed by some 'optional' free-form message (some tags 
> created
>  by older Git may not have `tagger` line).  The message, when
>  exists, is separated by a blank line from the header.  The
>  message part may contain a signature that Git itself doesn't
> -care about, but that can be verified with gpg.
> +care about, but that can be verified with GPG.

Isn't this a name of the program, though?  Other two hunks in your
patch clearly refer to the concept and not to a particular program,
and they are good changes, I think.

>  
>  GIT
>  ---
> diff --git a/Documentation/git-tag.txt b/Documentation/git-tag.txt
> index abab481..32bc4aa 100644
> --- a/Documentation/git-tag.txt
> +++ b/Documentation/git-tag.txt
> @@ -78,7 +78,7 @@ OPTIONS
>  
>  -v::
>  --verify::
> - Verify the gpg signature of the given tag names.
> + Verify the GPG signature of the given tag names.
>  
>  -n::
>specifies how many lines from the annotation, if any,
> diff --git a/Documentation/git-verify-commit.txt 
> b/Documentation/git-verify-commit.txt
> index ecf4da1..0101f0f 100644
> --- a/Documentation/git-verify-commit.txt
> +++ b/Documentation/git-verify-commit.txt
> @@ -12,12 +12,12 @@ SYNOPSIS
>  
>  DESCRIPTION
>  ---
> -Validates the gpg signature created by 'git commit -S'.
> +Validates the GPG signature created by 'git commit -S'.
>  
>  OPTIONS
>  ---
>  --raw::
> - Print the raw gpg status output to standard error instead of the normal
> + Print the raw GPG status output to standard error instead of the normal
>   human-readable output.
>  
>  -v::
>
> --
> https://github.com/git/git/pull/246
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v1 0/1] t6038-merge-text-auto.sh

2016-05-30 Thread Junio C Hamano
tbo...@web.de writes:

> This is a little bit of a hen-and-egg problem:
> The problem comes up after the "unified auto handling".
> In theory, it should have been since before:
> get_sha1_from_index() says:
>
>  * We might be in the middle of a merge, in which
>  * case we would read stage #2 (ours).
>
> This seams wrong, as in the merge we sometimes need to
> look at "theirs".

The two comment you quoted is absolutely the right thing to do.
"In a merge, we sometimes need to look at 'theirs'" is like saying
"When we are dong 'git add', we need to look at what is in the
working tree".  It is total red herring.

Step back and think why we even look at what in the index in the
first place; it is to decide if we want or do not want to disable
the automatic CRLF -> LF conversion.  And think again the reason why
do we look at the index.

There may be a line with CRLF line endings in the new contents,
whether it came from a merge, cherry-pick, patch application, or
plain-simple "git add" from the working tree.  Auto-CRLF usually
says "We want CRLF turned into LF".  But the user misconfigured and
for this path the user might not want the conversion take place, in
which case you would disable the conversion.  Where do you take that
hint "the user might have misconfigured?" from?  By looking at what
the user _started_ her update from.  If the state before this "we
need to replace the blob in the index with a new contents, so we
need to hash the new contents to come up with the updated blob"
started contains CRLF already, that may be a hint--if we apply the
CRLF->LF conversion on the original, even if the "new contents" were
identical to what she already had, we would end up changing the blob
with her current configuration.  Hence we disable.

Isn't that the reasoning behind that "safe auto-crlf" thing?

The new contents getting integrated into her current state may have
CRLF, and if a merge or a cherry-pick leaves conflicts, they may be
stored in stage #3.  But paying attention to that to decide if we
want to disable Auto-CRLF conversion is simply wrong; you should
look at the CRLF in stage #3 as purely something that might need to
be converted, not something that affects the decision if it needs to
be converted, just like you view CRLF in a working tree file when
you do "git add"..

Imagine that you started from a history where somebody recorded a
text file with CRLF in the blob, unconverted.  Later the project
decided to express their text with LF to support cross-platform
development better, and sets up the Auto-CRLF.  Your user is working
near the tip of that history after the eol correction happened.  Now
she gets a pull-request of a branch that forked from an old point in
the history, before the eol correction and full of CRLF.  Yes, to
integrate the change being proposed, she needs to look at "theirs";
that's the whole point of a "merge".  Why should she revert the eol
correction her history has by getting fooled by the fact that the
update was based on a part of the history before the eol correction?
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


How to add custom metadata to Git commit object

2016-05-30 Thread Kirill Likhodedov
Is it possible to add custom metadata to Git commit object? 
Such metadata should be ignored by Git commands, but could be used by a 3-party 
tool which knows the format and knows where to look. 

I assume that this should be possible, given that Git objects are actually 
patches, and patches can contain additional details. But can this be done with 
the help of Git commands? 



The reason why I am asking this - is to create a tool which could overcome 
false rename detection. 

As all of you know, if I make significant changes to the code together with 
rename, Git won’t detect this rename and will treat this change as added + 
deleted. And sometimes there are false rename detections as well. It would be 
useful to record the fact of rename and use it afterwards.

If a user is developing with our IDE (IntelliJ IDEA), we could remember the 
fact that he renamed a file, then write this information to the commit object, 
and when the commit information is requested (e.g. from the git log graphical 
view), the IDE could read the additional information of the commit and display 
the file as renamed, not as added + deleted. The IDE could also use this 
information to follow rename through the file history.

As a real example, in our project we are converting a lot of files from Java to 
Kotlin, and such conversion always looses history unless the developer 
remembers to prepare a separate rename-commit first, which is tedious. 

There are git-notes, which could be used for the purpose, but they are visible 
to the user via standard Git command, and could be used by the user for other 
purposes, so they are not very suitable for the task.

Thanks a lot!
-- Kirill--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v1 0/1] t6038-merge-text-auto.sh

2016-05-30 Thread tboegi
From: Torsten Bögershausen 

Split of the old 10/10 series.
This is the update of t6038, which is needed to
motivate the patch
   `convert: ce_compare_data() checks for a sha1 of a path`
on top of
   `convert: unify the "auto" handling of CRLF`

When files with different eols are merged with
merge.renormalize = true,
it is important to look at the right blob to determine if
the crlf came from the blob or are a result of a coversion.
This is a little bit of a hen-and-egg problem:
The problem comes up after the "unified auto handling".
In theory, it should have been since before:
get_sha1_from_index() says:

 * We might be in the middle of a merge, in which
 * case we would read stage #2 (ours).

This seams wrong, as in the merge we sometimes need to
look at "theirs".
(But I haven't managed to construct a TC)

t6038-merge-text-auto.sh

Torsten Bögershausen (1):
  t6038: different eol for "Merge addition of text=auto"

 t/t6038-merge-text-auto.sh | 28 +++-
 1 file changed, 23 insertions(+), 5 deletions(-)

-- 
2.0.0.rc1.6318.g0c2c796

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 09/13] refs: introduce an iterator interface

2016-05-30 Thread Ramsay Jones


On 30/05/16 16:22, Ramsay Jones wrote:
> 
> 
> On 30/05/16 08:55, Michael Haggerty wrote:
> [snip]
> 
>>  /* Reference is a symbolic reference. */
>> diff --git a/refs/files-backend.c b/refs/files-backend.c
>> index 8ab4d5f..dbf1587 100644
>> --- a/refs/files-backend.c
>> +++ b/refs/files-backend.c
>> @@ -1,6 +1,7 @@
>>  #include "../cache.h"
>>  #include "../refs.h"
>>  #include "refs-internal.h"
>> +#include "../iterator.h"
>>  #include "../lockfile.h"
>>  #include "../object.h"
>>  #include "../dir.h"
>> @@ -704,6 +705,154 @@ static void prime_ref_dir(struct ref_dir *dir)
>>  }
>>  }
>>  
>> +/*
>> + * A level in the reference hierarchy that is currently being iterated
>> + * through.
>> + */
>> +struct cache_ref_iterator_level {
>> +/*
>> + * The ref_dir being iterated over at this level. The ref_dir
>> + * is sorted before being stored here.
>> + */
>> +struct ref_dir *dir;
>> +
>> +/*
>> + * The index of the current entry within dir (which might
>> + * itself be a directory). If index == -1, then the iteration
>> + * hasn't yet begun. If index == dir->nr, then the iteration
>> + * through this level is over.
>> + */
>> +int index;
>> +};
>> +
>> +/*
>> + * Represent an iteration through a ref_dir in the memory cache. The
>> + * iteration recurses through subdirectories.
>> + */
>> +struct cache_ref_iterator {
>> +struct ref_iterator base;
>> +
>> +/*
>> + * The number of levels currently on the stack. This is always
>> + * at least 1, because when it becomes zero the iteration is
>> + * ended and this struct is freed.
>> + */
>> +size_t levels_nr;
>> +
>> +/* The number of levels that have been allocated on the stack */
>> +size_t levels_alloc;
>> +
>> +/*
>> + * A stack of levels. levels[0] is the uppermost level that is
>> + * being iterated over in this iteration. (This is not
>> + * necessary the top level in the references hierarchy. If we
>> + * are iterating through a subtree, then levels[0] will hold
>> + * the ref_dir for that subtree, and subsequent levels will go
>> + * on from there.)
>> + */
>> +struct cache_ref_iterator_level *levels;
>> +};
>> +
>> +static int cache_ref_iterator_advance(struct ref_iterator *ref_iterator)
>> +{
>> +struct cache_ref_iterator *iter =
>> +(struct cache_ref_iterator *)ref_iterator;
>> +
>> +while (1) {
>> +struct cache_ref_iterator_level *level =
>> +>levels[iter->levels_nr - 1];
>> +struct ref_dir *dir = level->dir;
>> +struct ref_entry *entry;
>> +
>> +if (level->index == -1)
>> +sort_ref_dir(dir);
> 
> do you need to sort here ...
>> +
>> +if (++level->index == level->dir->nr) {
>> +/* This level is exhausted; pop up a level */
>> +if (--iter->levels_nr == 0)
>> +return ref_iterator_abort(ref_iterator);
>> +
>> +continue;
>> +}
>> +
>> +entry = dir->entries[level->index];
>> +
>> +if (entry->flag & REF_DIR) {
>> +/* push down a level */
>> +ALLOC_GROW(iter->levels, iter->levels_nr + 1,
>> +   iter->levels_alloc);
>> +
>> +level = >levels[iter->levels_nr++];
>> +level->dir = get_ref_dir(entry);
>> +sort_ref_dir(level->dir);
> 
> ... given that you sort here?

I had intended to say 'or vice versa' here. When I wrote this, I had not
finished reading this patch (let alone the series). Now, I suspect that
you can simply drop this 'sort_ref_dir()' call site. Unless I've misread
the code, of course! ;-)

ATB,
Ramsay Jones


--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v1 1/1] t6038: different eol for "Merge addition of text=auto"

2016-05-30 Thread tboegi
From: Torsten Bögershausen 

t6038 uses different code, depending if NATIVE_CRLF is set ot not.

Enhance the test coverage for cross-platform testing and run
"Merge addition of text=auto" with both lf and crlf as core.eol.

It is important to be run this test with crlf under Linux,
the day when the "unified auto handling will be introduced.

Signed-off-by: Torsten Bögershausen 
---
 t/t6038-merge-text-auto.sh | 28 +++-
 1 file changed, 23 insertions(+), 5 deletions(-)

diff --git a/t/t6038-merge-text-auto.sh b/t/t6038-merge-text-auto.sh
index 33b77ee..8846f5d 100755
--- a/t/t6038-merge-text-auto.sh
+++ b/t/t6038-merge-text-auto.sh
@@ -91,16 +91,13 @@ test_expect_success 'Merge after setting text=auto' '
compare_files expected file
 '
 
-test_expect_success 'Merge addition of text=auto' '
+test_expect_success 'Merge addition of text=auto eol=LF' '
+   git config core.eol lf &&
cat <<-\EOF >expected &&
first line
same line
EOF
 
-   if test_have_prereq NATIVE_CRLF; then
-   append_cr expected.temp &&
-   mv expected.temp expected
-   fi &&
git config merge.renormalize true &&
git rm -fr . &&
rm -f .gitattributes &&
@@ -109,7 +106,28 @@ test_expect_success 'Merge addition of text=auto' '
compare_files  expected file
 '
 
+test_expect_success 'Merge addition of text=auto eol=CRLF' '
+   git config core.eol crlf &&
+   cat <<-\EOF >expected &&
+   first line
+   same line
+   EOF
+
+   append_cr expected.temp &&
+   mv expected.temp expected &&
+   git config merge.renormalize true &&
+   git rm -fr . &&
+   rm -f .gitattributes &&
+   git reset --hard b &&
+   echo >&2 "After git reset --hard b" &&
+   git ls-files -s --eol >&2 &&
+   git merge a &&
+   compare_files  expected file
+'
+
+
 test_expect_success 'Detect CRLF/LF conflict after setting text=auto' '
+   git config core.eol native &&
echo "<<<" >expected &&
if test_have_prereq NATIVE_CRLF; then
echo first line | append_cr >>expected &&
-- 
2.0.0.rc1.6318.g0c2c796

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


How to cut extra liftime of expired loose objects

2016-05-30 Thread Johannes Sixt
I think there is logic somewhere in git gc or its minions that gives 
expired objects that have been packed a two weeks extra life time as 
loose objects.


That is, reachable loose objects are moved to a pack, but those objects 
that were packed, but become unreachable due to expired reflogs, are 
evicted from the pack and live for another two weeks or so as loose objects.


As a consequence of this, I always have a few thousand loose objects in 
my busy repositories, no matter how often I collect garbage. Is there a 
knob that removes the extra lease of life of objects without reducing 
the usual expiration times of reflogs etc?


-- Hannes
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


git add without whitespace

2016-05-30 Thread Robert Dailey
I think it would be useful to have a '-w' option for 'git add' that
completely ignores whitespace changes, the same way that 'git diff -w'
does.

Real life scenario:

Sometimes developers will use tooling that does not properly strip
trailing whitespace in source files. Next time I edit those files for
a simple 1-line code change, my tooling will strip whitespace from the
whole file. I *do* want these changes, however I want 2 commits: 1
commit with the bugfix, and a supplementary commit with just the
whitespace changes.

At the moment, there is no way for me to conveniently add the source
file to the index without whitespace. The only way to accomplish this
today that I'm aware of is via this command:

$ git diff -U0 -w --no-color | git apply --cached --ignore-whitespace
--unidiff-zero

This command explicitly leaves out context because it can sometimes
cause the patch to fail to apply, I think due to whitespace being in
it, but I'm not completely sure myself.

It would be useful to be able to do this instead:

$ git add -w

This would effectively function the same as my workaround command
shown earlier. It should also be valid to use -w with -i and -p. In
the -p case, it just won't show hunks containing whitespace changes.
For -i, it would assume '-w' as part of any command run during the
interactive session.

Does this idea sound good? I have some free time on my hands so I
wouldn't mind implementing this. Maybe there isn't a huge audience for
this kind of thing, or maybe I'm just going about this the wrong way.
Thoughts would be much appreciated.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 09/13] refs: introduce an iterator interface

2016-05-30 Thread Ramsay Jones


On 30/05/16 08:55, Michael Haggerty wrote:
[snip]

>  /* Reference is a symbolic reference. */
> diff --git a/refs/files-backend.c b/refs/files-backend.c
> index 8ab4d5f..dbf1587 100644
> --- a/refs/files-backend.c
> +++ b/refs/files-backend.c
> @@ -1,6 +1,7 @@
>  #include "../cache.h"
>  #include "../refs.h"
>  #include "refs-internal.h"
> +#include "../iterator.h"
>  #include "../lockfile.h"
>  #include "../object.h"
>  #include "../dir.h"
> @@ -704,6 +705,154 @@ static void prime_ref_dir(struct ref_dir *dir)
>   }
>  }
>  
> +/*
> + * A level in the reference hierarchy that is currently being iterated
> + * through.
> + */
> +struct cache_ref_iterator_level {
> + /*
> +  * The ref_dir being iterated over at this level. The ref_dir
> + * is sorted before being stored here.
> +  */
> + struct ref_dir *dir;
> +
> + /*
> +  * The index of the current entry within dir (which might
> +  * itself be a directory). If index == -1, then the iteration
> +  * hasn't yet begun. If index == dir->nr, then the iteration
> +  * through this level is over.
> +  */
> + int index;
> +};
> +
> +/*
> + * Represent an iteration through a ref_dir in the memory cache. The
> + * iteration recurses through subdirectories.
> + */
> +struct cache_ref_iterator {
> + struct ref_iterator base;
> +
> + /*
> +  * The number of levels currently on the stack. This is always
> +  * at least 1, because when it becomes zero the iteration is
> +  * ended and this struct is freed.
> +  */
> + size_t levels_nr;
> +
> + /* The number of levels that have been allocated on the stack */
> + size_t levels_alloc;
> +
> + /*
> +  * A stack of levels. levels[0] is the uppermost level that is
> +  * being iterated over in this iteration. (This is not
> +  * necessary the top level in the references hierarchy. If we
> +  * are iterating through a subtree, then levels[0] will hold
> +  * the ref_dir for that subtree, and subsequent levels will go
> +  * on from there.)
> +  */
> + struct cache_ref_iterator_level *levels;
> +};
> +
> +static int cache_ref_iterator_advance(struct ref_iterator *ref_iterator)
> +{
> + struct cache_ref_iterator *iter =
> + (struct cache_ref_iterator *)ref_iterator;
> +
> + while (1) {
> + struct cache_ref_iterator_level *level =
> + >levels[iter->levels_nr - 1];
> + struct ref_dir *dir = level->dir;
> + struct ref_entry *entry;
> +
> + if (level->index == -1)
> + sort_ref_dir(dir);

do you need to sort here ...
> +
> + if (++level->index == level->dir->nr) {
> + /* This level is exhausted; pop up a level */
> + if (--iter->levels_nr == 0)
> + return ref_iterator_abort(ref_iterator);
> +
> + continue;
> + }
> +
> + entry = dir->entries[level->index];
> +
> + if (entry->flag & REF_DIR) {
> + /* push down a level */
> + ALLOC_GROW(iter->levels, iter->levels_nr + 1,
> +iter->levels_alloc);
> +
> + level = >levels[iter->levels_nr++];
> + level->dir = get_ref_dir(entry);
> + sort_ref_dir(level->dir);

... given that you sort here?

> + level->index = -1;
> + } else {
> + iter->base.refname = entry->name;
> + iter->base.oid = >u.value.oid;
> + iter->base.flags = entry->flag;
> + return ITER_OK;
> + }
> + }
> +}
> +

ATB,
Ramsay Jones


--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] push: deny policy to prevent pushes to unwanted remotes.

2016-05-30 Thread Francois Beutin


- Mail original -
> Antoine Queru  writes:
> 
> > Currently, a user wanting to prevent accidental pushes to the wrong remote
> > has to create a pre-push hook.
> > The feature offers a configuration to allow users to prevent accidental
> > pushes
> > to the wrong remote. The user may define a list of whitelisted remotes, a
> > list
> > of blacklisted remotes and a default policy ("allow" or "deny"). A push
> > is denied if the remote is explicitely blacklisted or if it isn't
> > whitelisted and the default policy is "deny".
> 
> Not really serious, but the text above is weirdly wrapped, probably by
> hand. I'm sure your text editor can do better and quicker ;-).

Yes indeed it was hand wrapped, it will be changed in the next version,
thanks for the hint!

> > diff --git a/Documentation/config.txt b/Documentation/config.txt
> > index 53f00db..1478ce3 100644
> > --- a/Documentation/config.txt
> > +++ b/Documentation/config.txt
> > @@ -2517,6 +2517,23 @@ remote.pushDefault::
> > `branch..remote` for all branches, and is overridden by
> > `branch..pushRemote` for specific branches.
> >  
> > +remote.pushBlacklisted::
> > +   The list of remotes the user is forbidden to push to.
> > +   See linkgit:git-push[1]
> 
> I'd have spelled that "pushBlacklist" (no 'ed'). I think variable names
> usually do not use verbs. But I'm fine with your version too.

Ok it makes sense, we will change for "pushBlacklist".

> > +For example, if we set up the configuration variables like this:
> > +   git config --add remote.pushBlacklisted repository.com
> > +   git config --add remote.pushWhitelisted repository.com/Special_Folder
> > +Any push of this form will be accepted:
> > +   git push repository.com/Special_Folder/foo
> > +While those ones for example will be denied:
> > +   git push repository.com/Normal_Folder/bar
> 
> Please, look at the rendered output of your documentation, and notice
> it's broken. We'd typically use the asciidoc syntax for inline code here
> (between -).

Indeed the rendered doc is broken, it will be fixed in the next version.

> > +An error will be raised if the url is blacklisted and whitelisted at the
> > same.
> 
> "at the same time"?
> 
> But as-is, this sentence conflicts with the previous "the more the url
> in the config prefixes the asked url the more priority it has."
> statement.

I understand the confusion, it's not very clear. What we wanted to say
was that:
pushBlacklist = example.com/
pushWhitelist = example.com/
would raise an error, because the "priority rule" can't be applied.
We will change the sentence.

> The documentation doesn't talk about the URL-normalization the code is
> doing. I think a reasonable behavior would be:
> 
> pushBlacklisted = example.com/ => deny all accesses to example.com
> pushBlacklisted = http://example.com/ => deny HTTP accesses to
>  example.com
> 
> The second is a valid use-case IMHO, some people may want to forbid some
> protocols. Actually, one may even want to whilelist only one protocol
> and write stg like this to force HTTPS on host example.com:
> 
>   pushBlacklisted = example.com
>   pushWhitelisted = https://example.com

As of right now the protocol is not handled, we just ignore it.
We already thought about handling it the way you described but
we found out some problems:
pushBlacklist = example.com
pushWhitelist = https://example.com
-> push example.com forbidden EXCEPT with https

pushBlacklist = example.com/example2
pushWhitelist = https://example.com
-> push https://example.com/example2
-> Allow or deny?

To sum up the problem, what should be the priority between a
protocol-matching rule and a sub-folder-matching rule?
Perhaps adding a configuration variable?
pushPriority = "protocol"/"sub-folder" (needs renaming)

> BTW, these use-cases could motivate some per-blacklist deny message
> like
> 
> [remote "example.com"]
>   pushDenyMessage = "Please use HTTPS when you push to example.com"
> 
> I don't think it has to be implemented now though (better have users get
> used to the basic feature and see if more is needed later).

Interesting idea, perhaps a simplified version could be:
pushDenyProtocolMessage = "This protocol is not allowed for this remote"
pushDenyRemoteMessage = "This remote is not authorized"
(as above v ery roughnaming)

> > +static const char *string_url_normalize(const char *repo_name)
> > +{
> > +   if (starts_with(repo_name, "file://"))
> > +   return repo_name + 7;
> 
> There are many instances of this in Git's codebase, but we now try to
> avoid magic numbers like this, and would use strlen("file://") instead.
> Actually, we even have skip_prefix() precisely for this use-case.

Ok we will change it and use the skip_prefix() function.

> > +   if (is_url(repo_name)) {
> > +   struct url_info url_infos;
> > +   

Re: [PATCH 2/2] strbuf: allow to use preallocated memory

2016-05-30 Thread William Duclot
Matthieu Moy  writes:
> William Duclot  writes:
> 
>> Matthieu Moy  writes:
 +/**
 + * Allow the caller to give a pre-allocated piece of memory for the
 strbuf
 + * to use and indicate that the strbuf must use exclusively this buffer,
 + * never realloc() it or allocate a new one. It means that the string
 can
 + * be manipulated but cannot overflow the pre-allocated buffer. The
 + * pre-allocated buffer will never be freed.
 + */
>>> 
>>> Perhaps say explicitly that although the allocated buffer has a fixed
>>> size, the string itself can grow as long as it does not overflow the
>>> buffer?
>>
>> That's what I meant by "the string can be manipulated but cannot overflow
>> the pre-allocated buffer". I'll try to reformulate
> 
> Maybe "the string can grow, but cannot overflow"?

Seems OK

 @@ -91,6 +116,8 @@ extern void strbuf_release(struct strbuf *);
   * Detach the string from the strbuf and returns it; you now own the
   * storage the string occupies and it is your responsibility from then
   on
   * to release it with `free(3)` when you are done with it.
 + * Must allocate a copy of the buffer in case of a preallocated/fixed
 buffer.
 + * Performance-critical operations have to be aware of this.
>>> 
>>> Better than just warn about performance, you can give the alternative.
>>
>> I'm not sure what you mean, I don't think there really is an alternative
>> for
>> detaching a string?
> 
> So, is the comment above saying "You're doomed, there's no way you can
> get good performance anyway"?
> 
> The alternative is just that you don't have to call strbuf_release since
> the caller can access the .buf field and is already the one responsible
> for freeing it when needed, and it's safe to just call strbuf_init() if
> one needs to re-initialize the stbuf structure.

Actually, if the user _needs_ to detach(), then yes he's doomed. Or more
realistically, he have to refactor so he'll don't need detach() (by being
sure the strbuf is pre-allocated by himself for example).
The strbuf_release() function is not the problem here, it does nothing
problematic for performances. The problem is strbuf_wrap_preallocated(),
but you're right I can add a comment so the user know when he don't have
to call it.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v5] upload-pack.c: use parse-options API

2016-05-30 Thread Matthieu Moy
Antoine Queru  writes:

> From: Antoine Queru 

[ Insert here the sentence I've been repeating a lot lately about this
  useless From ;-) ]

> Documentation/git-upload-pack.txt | 16 +--
>  upload-pack.c | 59 
> +--
>  2 files changed, 38 insertions(+), 37 deletions(-)

The patch contains a few whitespace errors:

Documentation/git-upload-pack.txt:41: space before tab in indent.
+   immediately. This fits with the HTTP GET request model, where
Documentation/git-upload-pack.txt:42: space before tab in indent.
+   no request content is received but a response must be produced.
upload-pack.c:846: trailing whitespace.
+   

You should notice them immediately if you use "git add -p" (big red
warning in the patch hunk), and you can see all of them with "git diff
--check" or "git show --check".

Not sure if it deserves a reroll. Junio?

Other than that, the patch is now

Reviewed-by: Matthieu Moy 

-- 
Matthieu Moy
http://www-verimag.imag.fr/~moy/
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v5] upload-pack.c: use parse-options API

2016-05-30 Thread Antoine Queru
From: Antoine Queru 

Use the parse-options API rather than a hand-rolled option parser.

Description for --stateless-rpc and --advertise-refs come from
42526b4 (Add stateless RPC options to upload-pack,
receive-pack, 2009-10-30).

Signed-off-by: Antoine Queru 
Signed-off-by: Matthieu Moy 
---
Change since v4:
Imperative mood for the commit message.
"Code is now more compact." removed.
Documentation for option is now clearer.
Help message for "--advertise-refs" rewritten.
"no" for "strict" option added to the doc.

Documentation/git-upload-pack.txt | 16 +--
 upload-pack.c | 59 +--
 2 files changed, 38 insertions(+), 37 deletions(-)

diff --git a/Documentation/git-upload-pack.txt 
b/Documentation/git-upload-pack.txt
index 0abc806..8a03bed 100644
--- a/Documentation/git-upload-pack.txt
+++ b/Documentation/git-upload-pack.txt
@@ -9,8 +9,8 @@ git-upload-pack - Send objects packed back to git-fetch-pack
 SYNOPSIS
 
 [verse]
-'git-upload-pack' [--strict] [--timeout=] 
-
+'git-upload-pack' [--[no-]strict] [--timeout=] [--stateless-rpc]
+ [--advertise-refs] 
 DESCRIPTION
 ---
 Invoked by 'git fetch-pack', learns what
@@ -25,12 +25,22 @@ repository.  For push operations, see 'git send-pack'.
 OPTIONS
 ---
 
---strict::
+--[no-]strict::
Do not try /.git/ if  is no Git directory.
 
 --timeout=::
Interrupt transfer after  seconds of inactivity.
 
+--stateless-rpc::
+   Perform only a single read-write cycle with stdin and stdout.
+   This fits with the HTTP POST request processing model where
+   a program may read the request, write a response, and must exit.
+
+--advertise-refs::
+   Only the initial ref advertisement is output, and the program exits
+   immediately. This fits with the HTTP GET request model, where
+   no request content is received but a response must be produced.
+
 ::
The repository to sync from.
 
diff --git a/upload-pack.c b/upload-pack.c
index dc802a0..083d068 100644
--- a/upload-pack.c
+++ b/upload-pack.c
@@ -14,8 +14,12 @@
 #include "sigchain.h"
 #include "version.h"
 #include "string-list.h"
+#include "parse-options.h"
 
-static const char upload_pack_usage[] = "git upload-pack [--strict] 
[--timeout=] ";
+static const char * const upload_pack_usage[] = {
+   N_("git upload-pack [options] "),
+   NULL
+};
 
 /* Remember to update object flag allocation in object.h */
 #define THEY_HAVE  (1u << 11)
@@ -817,11 +821,21 @@ static int upload_pack_config(const char *var, const char 
*value, void *unused)
return parse_hide_refs_config(var, value, "uploadpack");
 }
 
-int main(int argc, char **argv)
+int main(int argc, const char **argv)
 {
-   char *dir;
-   int i;
+   const char *dir;
int strict = 0;
+   struct option options[] = {
+   OPT_BOOL(0, "stateless-rpc", _rpc,
+N_("quit after a single request/response exchange")),
+   OPT_BOOL(0, "advertise-refs", _refs,
+N_("exit immediately after intial ref advertisement")),
+   OPT_BOOL(0, "strict", ,
+N_("do not try /.git/ if  is no 
Git directory")),
+   OPT_INTEGER(0, "timeout", ,
+   N_("interrupt transfer after  seconds of 
inactivity")),
+   OPT_END()
+   };
 
git_setup_gettext();
 
@@ -829,40 +843,17 @@ int main(int argc, char **argv)
git_extract_argv0_path(argv[0]);
check_replace_refs = 0;
 
-   for (i = 1; i < argc; i++) {
-   char *arg = argv[i];
-
-   if (arg[0] != '-')
-   break;
-   if (!strcmp(arg, "--advertise-refs")) {
-   advertise_refs = 1;
-   continue;
-   }
-   if (!strcmp(arg, "--stateless-rpc")) {
-   stateless_rpc = 1;
-   continue;
-   }
-   if (!strcmp(arg, "--strict")) {
-   strict = 1;
-   continue;
-   }
-   if (starts_with(arg, "--timeout=")) {
-   timeout = atoi(arg+10);
-   daemon_mode = 1;
-   continue;
-   }
-   if (!strcmp(arg, "--")) {
-   i++;
-   break;
-   }
-   }
+   argc = parse_options(argc, argv, NULL, options, upload_pack_usage, 0);
+   
+   if (argc != 1)
+   usage_with_options(upload_pack_usage, options);
 
-   if (i != argc-1)
-   usage(upload_pack_usage);
+   if (timeout)
+   daemon_mode = 1;
 
setup_path();
 
-   dir = argv[i];
+   dir = argv[0];
 
if (!enter_repo(dir, 

Re: [PATCH 2/2] strbuf: allow to use preallocated memory

2016-05-30 Thread Matthieu Moy
William Duclot  writes:

> Matthieu Moy  writes:
>
>> void strbuf_grow(struct strbuf *sb, size_t extra)
>> {
>>  int new_buf = !sb->alloc;
>> ...
>>  if (sb->flags & STRBUF_OWNS_MEMORY) {
>>  if (new_buf) // < (1)
>>  sb->buf = NULL;
>>  ALLOC_GROW(sb->buf, sb->len + extra + 1, sb->alloc);
>>  } else {
>>  /*
>>   * The strbuf doesn't own the buffer: to avoid to realloc it,
>>   * the strbuf needs to use a new buffer without freeing the old
>>   */
>>  if (sb->len + extra + 1 > sb->alloc) {
>>  size_t new_alloc = MAX(sb->len + extra + 1, 
>> alloc_nr(sb->alloc));
>>  char *buf = xmalloc(new_alloc);
>>  memcpy(buf, sb->buf, sb->alloc);
>>  sb->buf = buf;
>>  sb->alloc = new_alloc;
>>  sb->flags |= STRBUF_OWNS_MEMORY;
>>  }
>>  }
>> 
>>  if (new_buf) // < (2)
>>  sb->buf[0] = '\0';
>> }
>> 
>> I think (1) is now dead code, since sb->alloc == 0 implies that
>> STRBUF_OWNS_MEMORY is set. (2) seems redundant since you've just
>> memcpy-ed the existing '\0' into the buffer.
>
> You're right for (1), I hadn't noticed that.
> For (2), we'll still have to set sb->buf[new_alloc-1]='\0' after the memcpy, 
> if we
> have sb->alloc==0 then the memcpy won't copy it.

That sounds like one more reason to memcpy len + 1 bytes, and you'll get
the '\0' copied.

>> After your patch, there are differences between
>> strbuf_wrap_preallocated() which I think are inconsistencies:
>> 
>> * strbuf_attach() does not check for NULL buffer, but it doesn't accept
>>   them either if I read correctly. It would make sense to add the check
>>   to strbuf_attach(), but it's weird to have the performance-critical
>>   oriented function do the expensive stuff that the
>>   non-performance-critical one doesn't.
>
> I agree that strbuf_attach should do the check (it seems strange that it
> doesn't already do it, as the "buffer never NULL" invariant is not new).
> I don't understand your "but" part, what "expensive stuff" are you talking
> about?

"expensive stuff" was an exageration for "== NULL" test. It's not that
expensive, but costs a tiny bit of CPU time.

> xmemdupz can only allocate the same size it will copy.

Indeed, so forget about it.

>>> +/**
>>> + * Allow the caller to give a pre-allocated piece of memory for the strbuf
>>> + * to use and indicate that the strbuf must use exclusively this buffer,
>>> + * never realloc() it or allocate a new one. It means that the string can
>>> + * be manipulated but cannot overflow the pre-allocated buffer. The
>>> + * pre-allocated buffer will never be freed.
>>> + */
>> 
>> Perhaps say explicitly that although the allocated buffer has a fixed
>> size, the string itself can grow as long as it does not overflow the
>> buffer?
>
> That's what I meant by "the string can be manipulated but cannot overflow
> the pre-allocated buffer". I'll try to reformulate

Maybe "the string can grow, but cannot overflow"?

>>> @@ -91,6 +116,8 @@ extern void strbuf_release(struct strbuf *);
>>>   * Detach the string from the strbuf and returns it; you now own the
>>>   * storage the string occupies and it is your responsibility from then on
>>>   * to release it with `free(3)` when you are done with it.
>>> + * Must allocate a copy of the buffer in case of a preallocated/fixed
>>> buffer.
>>> + * Performance-critical operations have to be aware of this.
>> 
>> Better than just warn about performance, you can give the alternative.
>
> I'm not sure what you mean, I don't think there really is an alternative for
> detaching a string?

So, is the comment above saying "You're doomed, there's no way you can
get good performance anyway"?

The alternative is just that you don't have to call strbuf_release since
the caller can access the .buf field and is already the one responsible
for freeing it when needed, and it's safe to just call strbuf_init() if
one needs to re-initialize the stbuf structure.

-- 
Matthieu Moy
http://www-verimag.imag.fr/~moy/
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Gitweb: Persistant download URLs for snapshots?

2016-05-30 Thread Timo Sigurdsson
Hi,

I hope this is the right place to ask this, but I wanted to know whether it is 
possible to have a persistant URL to obtain a snapshot of the latest master of 
a repository through Gitweb.

I set up a gitweb instance and it works nicely. I can click on the snapshot 
link to get a tgz archive of a specific commit hash to quickly receive or 
distribute files of a repository exposed by Gitweb. However, I have two 
problems with these links:

1) The link seems to depend on a commit hash. I haven't found a shorter or 
persitant link that would e.g. always give me the latest master snapshot.

2) Another issue which is actually more problematic: The links only seem to 
work interactively in my desktop browser. If I right click the link "snapshot", 
copy the URL and then try to download that link from another (headless) machine 
using wget, I end up getting a html file instead of a tgz archive.

Is it possible to generate such links that conssitantly work for the latest 
commit of a repository and that work non-interactively from a command line? Am 
I doing something wrong (well aside fromt he possibility that I'm trying to use 
gitweb for something which it might not have been designed for...)?

Thank you!

Kind regards,

Timo


--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [WIP-PATCH 0/2] send-email: refactor the email parser loop

2016-05-30 Thread Matthieu Moy
Samuel GROOT  writes:

> (mbox) prefix was introduced by Ryan Anderson in 2005 (can't find the
> exact commit though), in opposition with the (non-mbox) format ("lots
> of email") that was used before.

That is actually from the original commit introducing send-email:
83b2443 ([PATCH] Add git-send-email-script - tool to send emails from
git-format-patch-script, 2005-07-31), i.e. ~3 month after Git was born.

At that time, user-friendlyness was not really a priority ;-).

> Is the "lots of email" format still used?

AFAICT, it was initially supported for backward compatibility, and then
no one removed it, but I wouldn't be surprised if no one actually used
it.

I vaguely remember a message from Ryan Anderson being surprised to see
the old format still supported, but I can't find it in the archives.

In any case:

- git log --grep 'lots of email' => shows only 83b2443
- git log -S'lots of email' => likewise
- git grep 'lots of email' => just one answer in a comment

I'm not sure the feature is even tested.

-- 
Matthieu Moy
http://www-verimag.imag.fr/~moy/
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v4] upload-pack.c: use of parse-options API

2016-05-30 Thread Antoine Queru
Hello Eric,
Thank you for answer. Your remarks have been added in the next version.
> > +   OPT_BOOL(0, "strict", ,
> > +N_("do not try /.git/ if  is
> > no Git directory")),
> 
> Use of OPT_BOOL introduces a --no-strict option which didn't exist
> before. Does this need to be documented? (Genuine question.)
> 
You're right. I've added it in the documentation.

Antoine
   
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Fatal bug on revert with --author

2016-05-30 Thread Andreas Lutro
So I learned today that --author is not a supported argument to git
revert. It took me a long time to realize this, though, because the
error I got was very cryptic:

  fatal: BUG: expected exactly one commit from walk

Here's a very simple reproducible case:
https://gist.github.com/anlutro/67e0cec1a6a419e6d44131b0bc1deff6

I was recommended to send this report here by #git on irc.freenode.net.

Thanks,
Andreas
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/2] strbuf: allow to use preallocated memory

2016-05-30 Thread William Duclot
Matthieu Moy  writes:
> William Duclot  writes:
> 
>> @@ -20,16 +28,37 @@ char strbuf_slopbuf[1];
>>  
>>  void strbuf_init(struct strbuf *sb, size_t hint)
>>  {
>> +sb->flags = 0;
>>  sb->alloc = sb->len = 0;
>>  sb->buf = strbuf_slopbuf;
>>  if (hint)
>>  strbuf_grow(sb, hint);
>>  }
> 
> If you set flags = 0 here, existing callers will have all flags off,
> including OWNS_MEMORY.
> 
> I *think* this is OK, as sb->buf is currently pointing to
> strbuf_slopbuf, which the the strbuf doesn't own. But that is too subtle
> to go without an explanatory comment IMHO.

Right

> 
> Also, doesn't this make the "new_buf" case useless in strbuf_grow?
> 
> With your patch, the code looks like:
> 
> void strbuf_grow(struct strbuf *sb, size_t extra)
> {
>   int new_buf = !sb->alloc;
> ...
>   if (sb->flags & STRBUF_OWNS_MEMORY) {
>   if (new_buf) // < (1)
>   sb->buf = NULL;
>   ALLOC_GROW(sb->buf, sb->len + extra + 1, sb->alloc);
>   } else {
>   /*
>* The strbuf doesn't own the buffer: to avoid to realloc it,
>* the strbuf needs to use a new buffer without freeing the old
>*/
>   if (sb->len + extra + 1 > sb->alloc) {
>   size_t new_alloc = MAX(sb->len + extra + 1, 
> alloc_nr(sb->alloc));
>   char *buf = xmalloc(new_alloc);
>   memcpy(buf, sb->buf, sb->alloc);
>   sb->buf = buf;
>   sb->alloc = new_alloc;
>   sb->flags |= STRBUF_OWNS_MEMORY;
>   }
>   }
> 
>   if (new_buf) // < (2)
>   sb->buf[0] = '\0';
> }
> 
> I think (1) is now dead code, since sb->alloc == 0 implies that
> STRBUF_OWNS_MEMORY is set. (2) seems redundant since you've just
> memcpy-ed the existing '\0' into the buffer.

You're right for (1), I hadn't noticed that.
For (2), we'll still have to set sb->buf[new_alloc-1]='\0' after the memcpy, if 
we
have sb->alloc==0 then the memcpy won't copy it.

>> +void strbuf_wrap_preallocated(struct strbuf *sb, char *path_buf,
>> +  size_t path_buf_len, size_t alloc_len)
>> +{
>> +if (!path_buf)
>> +die("you try to use a NULL buffer to initialize a strbuf");
>> +
>> +strbuf_init(sb, 0);
>> +strbuf_attach(sb, path_buf, path_buf_len, alloc_len);
>> +sb->flags &= ~STRBUF_OWNS_MEMORY;
>> +sb->flags &= ~STRBUF_FIXED_MEMORY;
>> +}
> 
> strbuf_wrap_preallocated seem very close to strbuf_attach. I'd rather
> see a symmetric code sharing like
> 
> void strbuf_attach_internal(struct strbuf *sb, ..., unsigned int flags)
> 
> and then strbuf_attach() and strbuf_wrap_preallocated() become
> straightforward wrappers around it.
> 
> This would avoid setting and then unsetting STRBUF_OWNS_MEMORY (the
> performance impact is probably small, but it looks weird to set the flag
> and then unset it right away).

We'll refactor the code with Johannes' remarks and yours in mind

> After your patch, there are differences between
> strbuf_wrap_preallocated() which I think are inconsistencies:
> 
> * strbuf_attach() does not check for NULL buffer, but it doesn't accept
>   them either if I read correctly. It would make sense to add the check
>   to strbuf_attach(), but it's weird to have the performance-critical
>   oriented function do the expensive stuff that the
>   non-performance-critical one doesn't.

I agree that strbuf_attach should do the check (it seems strange that it
doesn't already do it, as the "buffer never NULL" invariant is not new).
I don't understand your "but" part, what "expensive stuff" are you talking
about?

> * strbuf_attach() calls strbuf_release(), which allows reusing an
>   existing strbuf. strbuf_wrap_preallocated() calls strbuf_init which
>   would override silently any previous content. I think strbuf_attach()
>   does the right thing here.
>
>   In any case, you probably want to include calls to strbuf_attach() and
>   strbuf_wrap_*() functions on existing non-empty strbufs.
>
>> +void strbuf_wrap_fixed(struct strbuf *sb, char *path_buf,
>> +   size_t path_buf_len, size_t alloc_len)
>> +{
>> +strbuf_wrap_preallocated(sb, path_buf, path_buf_len, alloc_len);
>> +sb->flags |= STRBUF_FIXED_MEMORY;
>> +}
> 
> And this could become a 3rd caller of strbuf_attach_internal().

True. We'll take care of that when refactoring

>> @@ -61,9 +96,32 @@ void strbuf_grow(struct strbuf *sb, size_t extra)
>>  if (unsigned_add_overflows(extra, 1) ||
>>  unsigned_add_overflows(sb->len, extra + 1))
>>  die("you want to use way too much memory");
>> -if (new_buf)
>> -sb->buf = NULL;
>> -ALLOC_GROW(sb->buf, sb->len + extra + 1, sb->alloc);
>> +if 

Re: [WIP-PATCH 0/2] send-email: refactor the email parser loop

2016-05-30 Thread Samuel GROOT

On 05/29/2016 08:05 PM, Matthieu Moy wrote:

Samuel GROOT  writes:


Should we take what Eric suggested (see below) as standard output?


Since the headers are already shown after those lines, it's
redundant to have the entire line.  And we could add
trailers after the headers (with a blank line to delimit):

# existing header output:
From: F 
Cc: A , One 
Subject: foo

# new trailer output
Signed-off-by: SoB 
Acked-by: ack 


I don't think adding the trailers in the output is needed. If the
message says

  Adding foo@example from Signed-off-by trailer

I can guess that it's from "Signed-off-by: foo@example" without having
it explicitly.

Perhaps others think differently, but for me, the shortest output would
be the better (if only to solve the "I never see these lines, they
scrolled out of my terminal" issue).


I agree, the shorter the better.


Ideally, I'd even like to shorten the current output a bit more (the
X-Mailer: header is useless IMHO, and the Date: and Message-id: headers
are probably not useful enough to be shown by default).


Agreed.


(Just thinking aloud, obviously none of this should be a prerequisite to
accept your refactoring patch)


And keep "(mbox) Adding ..." lines as error output, or maybe
displayable by a new option `--verbose`?


I think the "Adding ..." lines make sense by default at least for
beginners (just a few days ago, I received a bunch of test emails by
your team follow by a "Oops, I just noticed you got Cc-ed in my tests"
message ;-), that would probably have been worse without the message).
There could be an advice.* option to deactivate them, though.


An advice.* option seems a good solution to me.


The (mbox) prefix doesn't seem useful to me OTOH, I think it's a
leftover debugging message.


(mbox) prefix was introduced by Ryan Anderson in 2005 (can't find the 
exact commit though), in opposition with the (non-mbox) format ("lots of 
email") that was used before.


Is the "lots of email" format still used?

When adding Cc: from a Signed-off-by: field, (body) prefix is used.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/2] strbuf: add tests

2016-05-30 Thread Simon Rabourg
Hi Johannes, 

I'm William's teammate on this feature. 

Johannes Schindelin  writes:
> Hi William,
> 
> On Mon, 30 May 2016, William Duclot wrote:
> 
> > Test the strbuf API. Being used throughout all Git the API could be
> > considered tested, but adding specific tests makes it easier to improve
> > and extend the API.
> > ---
> 
> The commit message makes sense. Please add your sign-off.
> 

We forgot to add the sign-off, we will fix that in the V2.

> >  Makefile   |  1 +
> >  t/helper/test-strbuf.c | 69
> >  ++
> >  t/t0082-strbuf.sh  | 19 ++
> >  3 files changed, 89 insertions(+)
> >  create mode 100644 t/helper/test-strbuf.c
> >  create mode 100755 t/t0082-strbuf.sh
> > 
> > diff --git a/Makefile b/Makefile
> > index 3f03366..dc84f43 100644
> > --- a/Makefile
> > +++ b/Makefile
> > @@ -613,6 +613,7 @@ TEST_PROGRAMS_NEED_X += test-scrap-cache-tree
> >  TEST_PROGRAMS_NEED_X += test-sha1
> >  TEST_PROGRAMS_NEED_X += test-sha1-array
> >  TEST_PROGRAMS_NEED_X += test-sigchain
> > +TEST_PROGRAMS_NEED_X += test-strbuf
> >  TEST_PROGRAMS_NEED_X += test-string-list
> >  TEST_PROGRAMS_NEED_X += test-submodule-config
> >  TEST_PROGRAMS_NEED_X += test-subprocess
> > diff --git a/t/helper/test-strbuf.c b/t/helper/test-strbuf.c
> > new file mode 100644
> > index 000..622f627
> > --- /dev/null
> > +++ b/t/helper/test-strbuf.c
> > @@ -0,0 +1,69 @@
> > +#include "git-compat-util.h"
> > +#include "strbuf.h"
> > +
> > +/*
> > + * Check behavior on usual use cases
> > + */
> > +int test_usual(struct strbuf *sb)
> 
> I have to admit that I would prefer a more concrete name. And since your
> other tests are more fine-grained, maybe this one could be split into
> multiple separate ones, too?
> 

We will rename this function.
We thought that one complete function would be convenient to test 
the usual API's behaviour. We are not sure how that change would be useful?

> > +{
> > +   size_t size, old_alloc;
> > +   char *res, *old_buf, *str_test = malloc(5*sizeof(char));
> 
> Our convention is to list the initialized variables first, the
> uninitialized ones after that, and for readability an empty line is
> recommended after the variable declaration block.

OK, seems more readable.

> > +   strbuf_grow(sb, 1);
> > +   strcpy(str_test, "test");
> > +   old_alloc = sb->alloc;
> > +   strbuf_grow(sb, 1000);
> > +   if (old_alloc == sb->alloc)
> > +   die("strbuf_grow does not realloc the buffer as expected");
> > +   old_buf = sb->buf;
> > +   res = strbuf_detach(sb, );
> > +   if (res != old_buf)
> > +   die("strbuf_detach does not return the expected buffer");
> > +   free(res);
> > +
> > +   strcpy(str_test, "test");
> > +   strbuf_attach(sb, (void *)str_test, strlen(str_test), sizeof(str_test));
> > +   res = strbuf_detach(sb, );
> > +   if (res != str_test)
> > +   die("strbuf_detach does not return the expected buffer");
> > +   free(res);
> > +   strbuf_release(sb);
> > +
> > +   return 0;
> > +}
> > +
> > +int main(int argc, char *argv[])
> > +{
> > +   size_t size = 1;
> > +   struct strbuf sb;
> 
> The common theme in our source code seems to initialize using
> STRBUF_INIT... Let's use that paradigm here, too?

We will add a test to check that initializing with srtbuf_init(...) is the
same as initializing with STRBUF_INIT. 

> 
> > +   char str_test[5] = "test";
> > +   char str_foo[7] = "foo";
> > +
> > +   if (argc != 2)
> > +   usage("test-strbuf mode");
> 
> A nice and convenient way to do command-line parsing is to use the
> parse-options API, in this case with OPT_CMDMODE. This would also give us
> a chance to document the command modes in a nice and succinct way: as help
> strings.
> 

True, we're going to make that change.


> > +
> > +   if (!strcmp(argv[1], "basic_grow")) {
> > +   /*
> > +* Check if strbuf_grow(0) allocate a new NUL-terminated buffer
> 
> s/allocate//
> 
> > +*/
> > +   strbuf_init(, 0);
> > +   strbuf_grow(, 0);
> > +   if (sb.buf == strbuf_slopbuf)
> > +   die("strbuf_grow failed to alloc memory");
> > +   strbuf_release();
> > +   if (sb.buf != strbuf_slopbuf)
> > +   die("strbuf_release does not reinitialize the strbuf");
> > +   } else if (!strcmp(argv[1], "strbuf_check_behavior")) {
> > +   strbuf_init(, 0);
> > +   return test_usual();
> > +   } else if (!strcmp(argv[1], "grow_overflow")) {
> > +   /*
> > +* size_t overflow: should die()
> > +*/
> > +   strbuf_init(, 1000);
> > +   strbuf_grow(, maximum_unsigned_value_of_type((size_t)1));
> 
> A comment "If this does not die(), fall through to returning success, to
> indicate an error" might be nice here.

Agreed.

> > +   } else {
> > +   usage("test-strbuf mode");
> > +   }
> > +
> > +   return 0;
> > +}
> > diff --git 

Re: [WIP-PATCH 1/2] send-email: create email parser subroutine

2016-05-30 Thread Samuel GROOT

On 05/29/2016 07:53 PM, Matthieu Moy wrote:

Samuel GROOT  writes:


So should we merge parse_email and parse_header in one unique
subroutine?


At least on the user (i.e. caller of the API) side, one function is
probably enough.



I will do that, as soon as we decide what's best between including 
Email::Simple library or making our own API.

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] push: deny policy to prevent pushes to unwanted remotes.

2016-05-30 Thread Matthieu Moy
Antoine Queru  writes:

> Currently, a user wanting to prevent accidental pushes to the wrong remote
> has to create a pre-push hook.
> The feature offers a configuration to allow users to prevent accidental pushes
> to the wrong remote. The user may define a list of whitelisted remotes, a list
> of blacklisted remotes and a default policy ("allow" or "deny"). A push
> is denied if the remote is explicitely blacklisted or if it isn't
> whitelisted and the default policy is "deny".

Not really serious, but the text above is weirdly wrapped, probably by
hand. I'm sure your text editor can do better and quicker ;-).

> diff --git a/Documentation/config.txt b/Documentation/config.txt
> index 53f00db..1478ce3 100644
> --- a/Documentation/config.txt
> +++ b/Documentation/config.txt
> @@ -2517,6 +2517,23 @@ remote.pushDefault::
>   `branch..remote` for all branches, and is overridden by
>   `branch..pushRemote` for specific branches.
>  
> +remote.pushBlacklisted::
> + The list of remotes the user is forbidden to push to.
> + See linkgit:git-push[1]

I'd have spelled that "pushBlacklist" (no 'ed'). I think variable names
usually do not use verbs. But I'm fine with your version too.

> +For example, if we set up the configuration variables like this:
> + git config --add remote.pushBlacklisted repository.com
> + git config --add remote.pushWhitelisted repository.com/Special_Folder
> +Any push of this form will be accepted:
> + git push repository.com/Special_Folder/foo
> +While those ones for example will be denied:
> + git push repository.com/Normal_Folder/bar

Please, look at the rendered output of your documentation, and notice
it's broken. We'd typically use the asciidoc syntax for inline code here
(between -).

> +An error will be raised if the url is blacklisted and whitelisted at the 
> same.

"at the same time"?

But as-is, this sentence conflicts with the previous "the more the url
in the config prefixes the asked url the more priority it has."
statement.

The documentation doesn't talk about the URL-normalization the code is
doing. I think a reasonable behavior would be:

pushBlacklisted = example.com/ => deny all accesses to example.com
pushBlacklisted = http://example.com/ => deny HTTP accesses to
 example.com

The second is a valid use-case IMHO, some people may want to forbid some
protocols. Actually, one may even want to whilelist only one protocol
and write stg like this to force HTTPS on host example.com:

  pushBlacklisted = example.com
  pushWhitelisted = https://example.com

BTW, these use-cases could motivate some per-blacklist deny message
like

[remote "example.com"]
pushDenyMessage = "Please use HTTPS when you push to example.com"

I don't think it has to be implemented now though (better have users get
used to the basic feature and see if more is needed later).

> +static const char *string_url_normalize(const char *repo_name)
> +{
> + if (starts_with(repo_name, "file://"))
> + return repo_name + 7;

There are many instances of this in Git's codebase, but we now try to
avoid magic numbers like this, and would use strlen("file://") instead.
Actually, we even have skip_prefix() precisely for this use-case.

> + if (is_url(repo_name)) {
> + struct url_info url_infos;
> + url_normalize(repo_name, _infos);
> + return url_infos.url + url_infos.host_off;

I think this would deserve a comment to explain why and what this is
doing, like /* turn ... into ... to ... */.

> +test_expect_success 'unsetup' '

"cleanup" ?

(I just did a very quick look at the code, I think we need an agreement
on the details of specification before a more detailed review)

-- 
Matthieu Moy
http://www-verimag.imag.fr/~moy/
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/2] strbuf: allow to use preallocated memory

2016-05-30 Thread William Duclot
Johannes Schindelin  writes:
> On Mon, 30 May 2016, William Duclot wrote:
> 
>> It is unfortunate that it is currently impossible to use a strbuf
>> without doing a memory allocation. So code like
>> 
>> void f()
>> {
>> char path[PATH_MAX];
>> ...
>> }
>> 
>> typically gets turned into either
>> 
>> void f()
>> {
>> struct strbuf path;
>> strbuf_add(, ...); <-- does a malloc
>> ...
>> strbuf_release();  <-- does a free
>> }
>> 
>> which costs extra memory allocations, or
>> 
>> void f()
>> {
>> static struct strbuf path;
>> strbuf_add(, ...);
>> ...
>> strbuf_setlen(, 0);
>> }
>> 
>> which, by using a static variable, avoids most of the malloc/free
>> overhead, but makes the function unsafe to use recursively or from
>> multiple threads. Those limitations prevent strbuf to be used in
>> performance-critical operations.
> 
> This description is nice and verbose, but maybe something like this would
> introduce the subject in a quicker manner?
> 
>   When working e.g. with file paths or with dates, strbuf's
>   malloc()/free() dance of strbufs can be easily avoided: as
>   a sensible initial buffer size is already known, it can be
>   allocated on the heap.

strbuf already allow to indicate a sensible initial buffer size thanks
to strbuf_init() second parameter. The main perk of pre-allocation
is to use stack-allocated memory, and not heap-allocated :)
Unless I misunderstood your message?

>> diff --git a/strbuf.c b/strbuf.c
>> index 1ba600b..527b986 100644
>> --- a/strbuf.c
>> +++ b/strbuf.c
>> @@ -1,6 +1,14 @@
>>  #include "cache.h"
>>  #include "refs.h"
>>  #include "utf8.h"
>> +#include 
> 
> Why?

For the MAX macro. It may be a teeny tiny overkill

>> +/**
>> + * Flags
>> + * --
>> + */
>> +#define STRBUF_OWNS_MEMORY 1
>> +#define STRBUF_FIXED_MEMORY (1 << 1)
> 
> From reading the commit message, I expected STRBUF_OWNS_MEMORY.
> STRBUF_FIXED_MEMORY still needs to be explained.

Yes, that seems right

>> @@ -20,16 +28,37 @@ char strbuf_slopbuf[1];
>>  
>>  void strbuf_init(struct strbuf *sb, size_t hint)
>>  {
>> +sb->flags = 0;
>>  sb->alloc = sb->len = 0;
>>  sb->buf = strbuf_slopbuf;
>>  if (hint)
>>  strbuf_grow(sb, hint);
>>  }
>>  
>> +void strbuf_wrap_preallocated(struct strbuf *sb, char *path_buf,
>> +  size_t path_buf_len, size_t alloc_len)
>> +{
>> +if (!path_buf)
>> +die("you try to use a NULL buffer to initialize a strbuf");
>> +
>> +strbuf_init(sb, 0);
>> +strbuf_attach(sb, path_buf, path_buf_len, alloc_len);
>> +sb->flags &= ~STRBUF_OWNS_MEMORY;
>> +sb->flags &= ~STRBUF_FIXED_MEMORY;
> 
> Shorter: sb->flags &= ~(STRBUF_OWNS_MEMORY | STRBUF_FIXED_MEMORY);

Okay with me

>> +}
>> +
>> +void strbuf_wrap_fixed(struct strbuf *sb, char *path_buf,
>> +   size_t path_buf_len, size_t alloc_len)
>> +{
>> +strbuf_wrap_preallocated(sb, path_buf, path_buf_len, alloc_len);
>> +sb->flags |= STRBUF_FIXED_MEMORY;
>> +}
> 
> Rather than letting strbuf_wrap_preallocated() set sb->flags &=
> ~FIXED_MEMORY only to revert that decision right away, a static function
> could be called by both strbuf_wrap_preallocated() and
> strbuf_wrap_fixed().

Makes sense

>>  void strbuf_release(struct strbuf *sb)
>>  {
>>  if (sb->alloc) {
>> -free(sb->buf);
>> +if (sb->flags & STRBUF_OWNS_MEMORY)
>> +free(sb->buf);
>>  strbuf_init(sb, 0);
>>  }
> 
> Should we not reset the flags here, too?

Well, strbuf_init() reset the flags. The only way to have !sb->alloc
is that strbuf has been initialized and never used (even alloc_grow(0)
set sb->alloc=1), so sb==STRBUF_INIT, so the flags don't have to be reset 

>> @@ -38,7 +67,11 @@ char *strbuf_detach(struct strbuf *sb, size_t *sz)
>>  {
>>  char *res;
>>  strbuf_grow(sb, 0);
>> -res = sb->buf;
>> +if (sb->flags & STRBUF_OWNS_MEMORY)
>> +res = sb->buf;
>> +else
>> +res = xmemdupz(sb->buf, sb->alloc - 1);
> 
> This looks like a usage to be avoided: if we plan to detach the buffer,
> anyway, there is no good reason to allocate it on the heap first. I would
> at least issue a warning here.

strbuf_detach() guarantees to return heap-allocated memory, that the caller
can use however he want and that he'll have to free. If the strbuf doesn't
own the memory, it cannot return the buf attribute directly because:
- The memory belong to someone else (so the caller can't use it however
he want)
- The caller can't have the responsibility to free (because the memory
belong to someone else)
- The memory may not even be heap-allocated

>> @@ -51,6 +84,8 @@ void strbuf_attach(struct strbuf *sb, void *buf, size_t
>> len, size_t alloc)
>>  sb->buf   = buf;
>>  sb->len   = len;
>>  sb->alloc = alloc;
>> +sb->flags |= STRBUF_OWNS_MEMORY;
>> +sb->flags &= ~STRBUF_FIXED_MEMORY;

Re: [PATCH 2/2] strbuf: allow to use preallocated memory

2016-05-30 Thread Matthieu Moy
William Duclot  writes:

> multiple threads. Those limitations prevent strbuf to be used in

prevent strbuf from being used ...

> API ENHANCEMENT
> ---
>
> All functions of the API can still be reliably called without
> knowledge of the initialization (normal/preallocated/fixed) with the
> exception that strbuf_grow() may die() if the string try to overflow a

s/try/tries/

> @@ -20,16 +28,37 @@ char strbuf_slopbuf[1];
>  
>  void strbuf_init(struct strbuf *sb, size_t hint)
>  {
> + sb->flags = 0;
>   sb->alloc = sb->len = 0;
>   sb->buf = strbuf_slopbuf;
>   if (hint)
>   strbuf_grow(sb, hint);
>  }

If you set flags = 0 here, existing callers will have all flags off,
including OWNS_MEMORY.

I *think* this is OK, as sb->buf is currently pointing to
strbuf_slopbuf, which the the strbuf doesn't own. But that is too subtle
to go without an explanatory comment IMHO.

Also, doesn't this make the "new_buf" case useless in strbuf_grow?

With your patch, the code looks like:

void strbuf_grow(struct strbuf *sb, size_t extra)
{
int new_buf = !sb->alloc;
...
if (sb->flags & STRBUF_OWNS_MEMORY) {
if (new_buf) // < (1)
sb->buf = NULL;
ALLOC_GROW(sb->buf, sb->len + extra + 1, sb->alloc);
} else {
/*
 * The strbuf doesn't own the buffer: to avoid to realloc it,
 * the strbuf needs to use a new buffer without freeing the old
 */
if (sb->len + extra + 1 > sb->alloc) {
size_t new_alloc = MAX(sb->len + extra + 1, 
alloc_nr(sb->alloc));
char *buf = xmalloc(new_alloc);
memcpy(buf, sb->buf, sb->alloc);
sb->buf = buf;
sb->alloc = new_alloc;
sb->flags |= STRBUF_OWNS_MEMORY;
}
}

if (new_buf) // < (2)
sb->buf[0] = '\0';
}

I think (1) is now dead code, since sb->alloc == 0 implies that
STRBUF_OWNS_MEMORY is set. (2) seems redundant since you've just
memcpy-ed the existing '\0' into the buffer.

> +void strbuf_wrap_preallocated(struct strbuf *sb, char *path_buf,
> +   size_t path_buf_len, size_t alloc_len)
> +{
> + if (!path_buf)
> + die("you try to use a NULL buffer to initialize a strbuf");
> +
> + strbuf_init(sb, 0);
> + strbuf_attach(sb, path_buf, path_buf_len, alloc_len);
> + sb->flags &= ~STRBUF_OWNS_MEMORY;
> + sb->flags &= ~STRBUF_FIXED_MEMORY;
> +}

strbuf_wrap_preallocated seem very close to strbuf_attach. I'd rather
see a symmetric code sharing like

void strbuf_attach_internal(struct strbuf *sb, ..., unsigned int flags)

and then strbuf_attach() and strbuf_wrap_preallocated() become
straightforward wrappers around it.

This would avoid setting and then unsetting STRBUF_OWNS_MEMORY (the
performance impact is probably small, but it looks weird to set the flag
and then unset it right away).

After your patch, there are differences between
strbuf_wrap_preallocated() which I think are inconsistencies:

* strbuf_attach() does not check for NULL buffer, but it doesn't accept
  them either if I read correctly. It would make sense to add the check
  to strbuf_attach(), but it's weird to have the performance-critical
  oriented function do the expensive stuff that the
  non-performance-critical one doesn't.

* strbuf_attach() calls strbuf_release(), which allows reusing an
  existing strbuf. strbuf_wrap_preallocated() calls strbuf_init which
  would override silently any previous content. I think strbuf_attach()
  does the right thing here.

  (And I'm probably the one who misguided you to do this)

  In any case, you probably want to include calls to strbuf_attach() and
  strbuf_wrap_*() functions on existing non-empty strbufs.

> +void strbuf_wrap_fixed(struct strbuf *sb, char *path_buf,
> +size_t path_buf_len, size_t alloc_len)
> +{
> + strbuf_wrap_preallocated(sb, path_buf, path_buf_len, alloc_len);
> + sb->flags |= STRBUF_FIXED_MEMORY;
> +}

And this could become a 3rd caller of strbuf_attach_internal().

> @@ -61,9 +96,32 @@ void strbuf_grow(struct strbuf *sb, size_t extra)
>   if (unsigned_add_overflows(extra, 1) ||
>   unsigned_add_overflows(sb->len, extra + 1))
>   die("you want to use way too much memory");
> - if (new_buf)
> - sb->buf = NULL;
> - ALLOC_GROW(sb->buf, sb->len + extra + 1, sb->alloc);
> + if ((sb->flags & STRBUF_FIXED_MEMORY) && sb->len + extra + 1 > 
> sb->alloc)
> + die("you try to make a string overflow the buffer of a fixed 
> strbuf");
> +
> + /*
> +  * ALLOC_GROW may do a realloc() if needed, so we must not use it on
> +  * a buffer the 

Re: [PATCH 2/2] strbuf: allow to use preallocated memory

2016-05-30 Thread Johannes Schindelin
Hi William,

On Mon, 30 May 2016, William Duclot wrote:

> It is unfortunate that it is currently impossible to use a strbuf
> without doing a memory allocation. So code like
> 
> void f()
> {
> char path[PATH_MAX];
> ...
> }
> 
> typically gets turned into either
> 
> void f()
> {
> struct strbuf path;
> strbuf_add(, ...); <-- does a malloc
> ...
> strbuf_release();  <-- does a free
> }
> 
> which costs extra memory allocations, or
> 
> void f()
> {
> static struct strbuf path;
> strbuf_add(, ...);
> ...
> strbuf_setlen(, 0);
> }
> 
> which, by using a static variable, avoids most of the malloc/free
> overhead, but makes the function unsafe to use recursively or from
> multiple threads. Those limitations prevent strbuf to be used in
> performance-critical operations.

This description is nice and verbose, but maybe something like this would
introduce the subject in a quicker manner?

When working e.g. with file paths or with dates, strbuf's
malloc()/free() dance of strbufs can be easily avoided: as
a sensible initial buffer size is already known, it can be
allocated on the heap.

The rest of the commit message flows nicely.

> diff --git a/strbuf.c b/strbuf.c
> index 1ba600b..527b986 100644
> --- a/strbuf.c
> +++ b/strbuf.c
> @@ -1,6 +1,14 @@
>  #include "cache.h"
>  #include "refs.h"
>  #include "utf8.h"
> +#include 

Why?

> +/**
> + * Flags
> + * --
> + */
> +#define STRBUF_OWNS_MEMORY 1
> +#define STRBUF_FIXED_MEMORY (1 << 1)

>From reading the commit message, I expected STRBUF_OWNS_MEMORY.
STRBUF_FIXED_MEMORY still needs to be explained.

> @@ -20,16 +28,37 @@ char strbuf_slopbuf[1];
>  
>  void strbuf_init(struct strbuf *sb, size_t hint)
>  {
> + sb->flags = 0;
>   sb->alloc = sb->len = 0;
>   sb->buf = strbuf_slopbuf;
>   if (hint)
>   strbuf_grow(sb, hint);
>  }
>  
> +void strbuf_wrap_preallocated(struct strbuf *sb, char *path_buf,
> +   size_t path_buf_len, size_t alloc_len)
> +{
> + if (!path_buf)
> + die("you try to use a NULL buffer to initialize a strbuf");
> +
> + strbuf_init(sb, 0);
> + strbuf_attach(sb, path_buf, path_buf_len, alloc_len);
> + sb->flags &= ~STRBUF_OWNS_MEMORY;
> + sb->flags &= ~STRBUF_FIXED_MEMORY;

Shorter: sb->flags &= ~(STRBUF_OWNS_MEMORY | STRBUF_FIXED_MEMORY);

> +}
> +
> +void strbuf_wrap_fixed(struct strbuf *sb, char *path_buf,
> +size_t path_buf_len, size_t alloc_len)
> +{
> + strbuf_wrap_preallocated(sb, path_buf, path_buf_len, alloc_len);
> + sb->flags |= STRBUF_FIXED_MEMORY;
> +}

Rather than letting strbuf_wrap_preallocated() set sb->flags &=
~FIXED_MEMORY only to revert that decision right away, a static function
could be called by both strbuf_wrap_preallocated() and
strbuf_wrap_fixed().

>  void strbuf_release(struct strbuf *sb)
>  {
>   if (sb->alloc) {
> - free(sb->buf);
> + if (sb->flags & STRBUF_OWNS_MEMORY)
> + free(sb->buf);
>   strbuf_init(sb, 0);
>   }

Should we not reset the flags here, too?

> @@ -38,7 +67,11 @@ char *strbuf_detach(struct strbuf *sb, size_t *sz)
>  {
>   char *res;
>   strbuf_grow(sb, 0);
> - res = sb->buf;
> + if (sb->flags & STRBUF_OWNS_MEMORY)
> + res = sb->buf;
> + else
> + res = xmemdupz(sb->buf, sb->alloc - 1);

This looks like a usage to be avoided: if we plan to detach the buffer,
anyway, there is no good reason to allocate it on the heap first. I would
at least issue a warning here.

> @@ -51,6 +84,8 @@ void strbuf_attach(struct strbuf *sb, void *buf, size_t 
> len, size_t alloc)
>   sb->buf   = buf;
>   sb->len   = len;
>   sb->alloc = alloc;
> + sb->flags |= STRBUF_OWNS_MEMORY;
> + sb->flags &= ~STRBUF_FIXED_MEMORY;
>   strbuf_grow(sb, 0);
>   sb->buf[sb->len] = '\0';
>  }
> @@ -61,9 +96,32 @@ void strbuf_grow(struct strbuf *sb, size_t extra)
>   if (unsigned_add_overflows(extra, 1) ||
>   unsigned_add_overflows(sb->len, extra + 1))
>   die("you want to use way too much memory");
> - if (new_buf)
> - sb->buf = NULL;
> - ALLOC_GROW(sb->buf, sb->len + extra + 1, sb->alloc);
> + if ((sb->flags & STRBUF_FIXED_MEMORY) && sb->len + extra + 1 > 
> sb->alloc)
> + die("you try to make a string overflow the buffer of a fixed 
> strbuf");

We try to avoid running over 80 columns/row. This message could be
more to the point: cannot grow fixed string

> + /*
> +  * ALLOC_GROW may do a realloc() if needed, so we must not use it on
> +  * a buffer the strbuf doesn't own
> +  */
> + if (sb->flags & STRBUF_OWNS_MEMORY) {
> + if (new_buf)
> + sb->buf = NULL;
> + ALLOC_GROW(sb->buf, sb->len + extra + 1, sb->alloc);
> + } else {
> + /*
> +  * The strbuf doesn't own 

Re: [PATCH 1/2] strbuf: add tests

2016-05-30 Thread Matthieu Moy
I agree with Johannes' remarks. I'd add two style nits:

William Duclot  writes:

> --- /dev/null
> +++ b/t/helper/test-strbuf.c
> @@ -0,0 +1,69 @@
> +#include "git-compat-util.h"
> +#include "strbuf.h"
> +
> +/*
> + * Check behavior on usual use cases
> + */
> +int test_usual(struct strbuf *sb)
> +{
> + size_t size, old_alloc;
> + char *res, *old_buf, *str_test = malloc(5*sizeof(char));

Spaces around binary operator.

> +int main(int argc, char *argv[])
> +{
> + size_t size = 1;
> + struct strbuf sb;
> + char str_test[5] = "test";
> + char str_foo[7] = "foo";
> +
> + if (argc != 2)
> + usage("test-strbuf mode");
> +
> + if (!strcmp(argv[1], "basic_grow")) {
> + /*
> +  * Check if strbuf_grow(0) allocate a new NUL-terminated buffer
> +  */
> + strbuf_init(, 0);
> + strbuf_grow(, 0);
> + if (sb.buf == strbuf_slopbuf)
> + die("strbuf_grow failed to alloc memory");
> + strbuf_release();
> + if (sb.buf != strbuf_slopbuf)
> + die("strbuf_release does not reinitialize the strbuf");
> + } else if (!strcmp(argv[1], "strbuf_check_behavior")) {
> + strbuf_init(, 0);
> + return test_usual();

I think the command ("strbuf_check_behavior") should have the same name
as the function (test_usual). This avoids one indirection for the
reader going from t*.sh file to the actual test code.

-- 
Matthieu Moy
http://www-verimag.imag.fr/~moy/
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/2] strbuf: add tests

2016-05-30 Thread Johannes Schindelin
Hi William,

On Mon, 30 May 2016, William Duclot wrote:

> Test the strbuf API. Being used throughout all Git the API could be
> considered tested, but adding specific tests makes it easier to improve
> and extend the API.
> ---

The commit message makes sense. Please add your sign-off.

>  Makefile   |  1 +
>  t/helper/test-strbuf.c | 69 
> ++
>  t/t0082-strbuf.sh  | 19 ++
>  3 files changed, 89 insertions(+)
>  create mode 100644 t/helper/test-strbuf.c
>  create mode 100755 t/t0082-strbuf.sh
> 
> diff --git a/Makefile b/Makefile
> index 3f03366..dc84f43 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -613,6 +613,7 @@ TEST_PROGRAMS_NEED_X += test-scrap-cache-tree
>  TEST_PROGRAMS_NEED_X += test-sha1
>  TEST_PROGRAMS_NEED_X += test-sha1-array
>  TEST_PROGRAMS_NEED_X += test-sigchain
> +TEST_PROGRAMS_NEED_X += test-strbuf
>  TEST_PROGRAMS_NEED_X += test-string-list
>  TEST_PROGRAMS_NEED_X += test-submodule-config
>  TEST_PROGRAMS_NEED_X += test-subprocess
> diff --git a/t/helper/test-strbuf.c b/t/helper/test-strbuf.c
> new file mode 100644
> index 000..622f627
> --- /dev/null
> +++ b/t/helper/test-strbuf.c
> @@ -0,0 +1,69 @@
> +#include "git-compat-util.h"
> +#include "strbuf.h"
> +
> +/*
> + * Check behavior on usual use cases
> + */
> +int test_usual(struct strbuf *sb)

I have to admit that I would prefer a more concrete name. And since your
other tests are more fine-grained, maybe this one could be split into
multiple separate ones, too?

> +{
> + size_t size, old_alloc;
> + char *res, *old_buf, *str_test = malloc(5*sizeof(char));

Our convention is to list the initialized variables first, the
uninitialized ones after that, and for readability an empty line is
recommended after the variable declaration block.

> + strbuf_grow(sb, 1);
> + strcpy(str_test, "test");
> + old_alloc = sb->alloc;
> + strbuf_grow(sb, 1000);
> + if (old_alloc == sb->alloc)
> + die("strbuf_grow does not realloc the buffer as expected");
> + old_buf = sb->buf;
> + res = strbuf_detach(sb, );
> + if (res != old_buf)
> + die("strbuf_detach does not return the expected buffer");
> + free(res);
> +
> + strcpy(str_test, "test");
> + strbuf_attach(sb, (void *)str_test, strlen(str_test), sizeof(str_test));
> + res = strbuf_detach(sb, );
> + if (res != str_test)
> + die("strbuf_detach does not return the expected buffer");
> + free(res);
> + strbuf_release(sb);
> +
> + return 0;
> +}
> +
> +int main(int argc, char *argv[])
> +{
> + size_t size = 1;
> + struct strbuf sb;

The common theme in our source code seems to initialize using
STRBUF_INIT... Let's use that paradigm here, too?

> + char str_test[5] = "test";
> + char str_foo[7] = "foo";
> +
> + if (argc != 2)
> + usage("test-strbuf mode");

A nice and convenient way to do command-line parsing is to use the
parse-options API, in this case with OPT_CMDMODE. This would also give us
a chance to document the command modes in a nice and succinct way: as help
strings.

> +
> + if (!strcmp(argv[1], "basic_grow")) {
> + /*
> +  * Check if strbuf_grow(0) allocate a new NUL-terminated buffer

s/allocate//

> +  */
> + strbuf_init(, 0);
> + strbuf_grow(, 0);
> + if (sb.buf == strbuf_slopbuf)
> + die("strbuf_grow failed to alloc memory");
> + strbuf_release();
> + if (sb.buf != strbuf_slopbuf)
> + die("strbuf_release does not reinitialize the strbuf");
> + } else if (!strcmp(argv[1], "strbuf_check_behavior")) {
> + strbuf_init(, 0);
> + return test_usual();
> + } else if (!strcmp(argv[1], "grow_overflow")) {
> + /*
> +  * size_t overflow: should die()
> +  */
> + strbuf_init(, 1000);
> + strbuf_grow(, maximum_unsigned_value_of_type((size_t)1));

A comment "If this does not die(), fall through to returning success, to
indicate an error" might be nice here.

> + } else {
> + usage("test-strbuf mode");
> + }
> +
> + return 0;
> +}
> diff --git a/t/t0082-strbuf.sh b/t/t0082-strbuf.sh
> new file mode 100755
> index 000..0800d26
> --- /dev/null
> +++ b/t/t0082-strbuf.sh
> @@ -0,0 +1,19 @@
> +#!/bin/sh
> +
> +test_description="Test the strbuf API.
> +"

This description does not need a new-line, and existing one-liner test
descriptions seem not to be terminated by a period.

The rest of this patch looks good.

Ciao,
Johannes
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/2] strbuf: improve API

2016-05-30 Thread Remi Galan Alfonso
William Duclot  writes:
> This patch series implements an improvment of the strbuf API, allowing
> strbuf to use preallocated memory. This makes strbuf fit to be used
> in performance-critical operations.
> 
> The first patch is simply a preparatory work, adding tests for
> existing strbuf implementation.
> Most of the work is made in the second patch: handle pre-allocated
> memory, extend the API, document it and test it.

Seems interesting, however do you have any test/example that would
show the difference of performance between using these optimizations
and not using them?

Such values would make a nice addition to help convince people that
your series is interesting to have and use.

Thanks,
Rémi
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 02/22] i18n: advice: mark string about detached head for translation

2016-05-30 Thread Vasco Almeida
Às 19:27 de 23-05-2016, Vasco Almeida escreveu:
> Mark string with advice seen by the user when in detached head.
> 
> Signed-off-by: Vasco Almeida 
> ---
>  advice.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/advice.c b/advice.c
> index 4dc5cf1..703a847 100644
> --- a/advice.c
> +++ b/advice.c
> @@ -107,13 +107,13 @@ void NORETURN die_conclude_merge(void)
>  void detach_advice(const char *new_name)
>  {
>   const char fmt[] =
> - "Note: checking out '%s'.\n\n"
> + N_("Note: checking out '%s'.\n\n"
>   "You are in 'detached HEAD' state. You can look around, make 
> experimental\n"
>   "changes and commit them, and you can discard any commits you make in 
> this\n"
>   "state without impacting any branches by performing another 
> checkout.\n\n"
>   "If you want to create a new branch to retain commits you create, you 
> may\n"
>   "do so (now or later) by using -b with the checkout command again. 
> Example:\n\n"
> - "  git checkout -b \n\n";
> + "  git checkout -b \n\n");
>  
>   fprintf(stderr, fmt, new_name);
>  }
> 
I just realized this does nothing but letting xgettext extract the
string. Does not allow interpolating the actual translation. Hence, the
translator would translate this text but the translation would never be
used.

There are 2 solutions: 1) use const char *fmt = _("...") instead
2) still use N_() like this patch but do fprintf(stderr, _(fmt),
new_name); to trigger retrieving the translation when printing the message.

Option 1) it most common on git source and 2) is used primarily during
command line option parse (see usage_with_options_internal() in
parse-options.c).

I prefer the solution 1) and I'll choose it if there are no objections.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v3 0/6] worktree lock/unlock

2016-05-30 Thread Nguyễn Thái Ngọc Duy
v3 changes almost all patches so I didn't bother with interdiff

  [1/6] worktree.c: add find_worktree()

This replaces the old find_wortree_by_path(). It now takes both prefix
and command line argument so it can do some intelligent things with
them if needed.

  [2/6] worktree.c: find_worktree() learns to identify worktrees by basename

This is from Eric's suggestion. Here we only care about the "argument"
argument and ignore the "prefix" argument from 1/6

  [3/6] worktree.c: add is_main_worktree()
 
This is the only unchanged patch since v2!

  [4/6] worktree.c: retrieve lock status (and optionally reason) in 
get_worktrees()

This used to be is_worktree_locked()

  [5/6] worktree: add "lock" command

Besides Eric's comments in v2, the syntax line "worktree lock "
is changed to "worktree lock ", where  could be
either a path, or a base name, or something more in future. I'm aware
that we don't like showing "worktrees" in user documents, favoring
working trees. But I think  is ok in this case.

  [6/6] worktree: add "unlock" command

Pretty much addressing comments from v2 and some more changes because
of the new 1/6 and 4/6.

 Documentation/git-worktree.txt | 33 +
 builtin/worktree.c | 66 ++
 contrib/completion/git-completion.bash |  5 ++-
 t/t2028-worktree-move.sh (new +x)  | 62 
 worktree.c | 53 +++
 worktree.h | 14 
 6 files changed, 226 insertions(+), 7 deletions(-)
 create mode 100755 t/t2028-worktree-move.sh

-- 
2.8.2.524.g6ff3d78

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v3 5/6] worktree: add "lock" command

2016-05-30 Thread Nguyễn Thái Ngọc Duy
Helped-by: Eric Sunshine 
Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 Documentation/git-worktree.txt | 28 +++-
 builtin/worktree.c | 38 +++
 contrib/completion/git-completion.bash |  5 +++-
 t/t2028-worktree-move.sh (new +x)  | 48 ++
 4 files changed, 112 insertions(+), 7 deletions(-)
 create mode 100755 t/t2028-worktree-move.sh

diff --git a/Documentation/git-worktree.txt b/Documentation/git-worktree.txt
index 27feff6..6fca6cf 100644
--- a/Documentation/git-worktree.txt
+++ b/Documentation/git-worktree.txt
@@ -11,6 +11,7 @@ SYNOPSIS
 [verse]
 'git worktree add' [-f] [--detach] [--checkout] [-b ]  
[]
 'git worktree list' [--porcelain]
+'git worktree lock' [--reason ] 
 'git worktree prune' [-n] [-v] [--expire ]
 
 DESCRIPTION
@@ -38,9 +39,8 @@ section "DETAILS" for more information.
 
 If a linked working tree is stored on a portable device or network share
 which is not always mounted, you can prevent its administrative files from
-being pruned by creating a file named 'locked' alongside the other
-administrative files, optionally containing a plain text reason that
-pruning should be suppressed. See section "DETAILS" for more information.
+being pruned by issuing the `git worktree lock` command, optionally
+specifying `--reason` to explain why the working tree is locked.
 
 COMMANDS
 
@@ -61,6 +61,14 @@ each of the linked worktrees.  The output details include if 
the worktree is
 bare, the revision currently checked out, and the branch currently checked out
 (or 'detached HEAD' if none).
 
+lock::
+
+If a working tree is on a portable device or network share which
+is not always mounted, lock it to prevent its administrative
+files from being pruned automatically. This also prevents it from
+being moved or deleted. Optionally, specify a reason for the lock
+with `--reason`.
+
 prune::
 
 Prune working tree information in $GIT_DIR/worktrees.
@@ -110,6 +118,15 @@ OPTIONS
 --expire ::
With `prune`, only expire unused working trees older than .
 
+--reason :
+   With `lock`, an explanation why the worktree is locked.
+
+::
+   Working trees can be identified by path, either relative or
+   absolute. Alternatively, if the last component in the
+   worktree's path is unique among working trees, it can be used
+   as well.
+
 DETAILS
 ---
 Each linked working tree has a private sub-directory in the repository's
@@ -150,7 +167,8 @@ instead.
 
 To prevent a $GIT_DIR/worktrees entry from being pruned (which
 can be useful in some situations, such as when the
-entry's working tree is stored on a portable device), add a file named
+entry's working tree is stored on a portable device), use the
+`git worktree lock` command, which adds a file named
 'locked' to the entry's directory. The file contains the reason in
 plain text. For example, if a linked working tree's `.git` file points
 to `/path/main/.git/worktrees/test-next` then a file named
@@ -226,8 +244,6 @@ performed manually, such as:
 - `remove` to remove a linked working tree and its administrative files (and
   warn if the working tree is dirty)
 - `mv` to move or rename a working tree and update its administrative files
-- `lock` to prevent automatic pruning of administrative files (for instance,
-  for a working tree on a portable device)
 
 GIT
 ---
diff --git a/builtin/worktree.c b/builtin/worktree.c
index f9dac37..84fe63b 100644
--- a/builtin/worktree.c
+++ b/builtin/worktree.c
@@ -14,6 +14,7 @@
 static const char * const worktree_usage[] = {
N_("git worktree add []  []"),
N_("git worktree list []"),
+   N_("git worktree lock [] "),
N_("git worktree prune []"),
NULL
 };
@@ -459,6 +460,41 @@ static int list(int ac, const char **av, const char 
*prefix)
return 0;
 }
 
+static int lock_worktree(int ac, const char **av, const char *prefix)
+{
+   const char *reason = "", *old_reason;
+   struct option options[] = {
+   OPT_STRING(0, "reason", , N_("string"),
+  N_("reason for locking")),
+   OPT_END()
+   };
+   struct worktree **worktrees, *wt;
+
+   ac = parse_options(ac, av, prefix, options, worktree_usage, 0);
+   if (ac != 1)
+   usage_with_options(worktree_usage, options);
+
+   worktrees = get_worktrees();
+   wt = find_worktree(worktrees, prefix, av[0]);
+   if (!wt)
+   die(_("'%s' is not a working directory"), av[0]);
+   if (is_main_worktree(wt))
+   die(_("'%s' is a main working directory"), av[0]);
+
+   old_reason = wt->lock_reason;
+   if (old_reason) {
+   if (*old_reason)
+   die(_("'%s' is already locked, reason: %s"),
+   av[0], old_reason);
+   die(_("'%s' is already locked"), av[0]);
+   }
+
+   

[PATCH v3 4/6] worktree.c: retrieve lock status (and optionally reason) in get_worktrees()

2016-05-30 Thread Nguyễn Thái Ngọc Duy
We need this later to avoid double locking a worktree, or unlocking one
when it's not even locked.

Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 worktree.c | 12 
 worktree.h |  1 +
 2 files changed, 13 insertions(+)

diff --git a/worktree.c b/worktree.c
index 6432eec..c6a64b1 100644
--- a/worktree.c
+++ b/worktree.c
@@ -98,6 +98,7 @@ static struct worktree *get_main_worktree(void)
worktree->is_detached = is_detached;
worktree->is_current = 0;
add_head_info(_ref, worktree);
+   worktree->lock_reason = NULL;
 
 done:
strbuf_release();
@@ -144,6 +145,17 @@ static struct worktree *get_linked_worktree(const char *id)
worktree->is_current = 0;
add_head_info(_ref, worktree);
 
+   strbuf_reset();
+   strbuf_addf(, "%s/worktrees/%s/locked", get_git_common_dir(), id);
+   if (file_exists(path.buf)) {
+   struct strbuf lock_reason = STRBUF_INIT;
+   if (strbuf_read_file(_reason, path.buf, 0) < 0)
+   die_errno(_("failed to read '%s'"), path.buf);
+   strbuf_trim(_reason);
+   worktree->lock_reason = strbuf_detach(_reason, NULL);
+   } else
+   worktree->lock_reason = NULL;
+
 done:
strbuf_release();
strbuf_release(_path);
diff --git a/worktree.h b/worktree.h
index e1c4715..9932710 100644
--- a/worktree.h
+++ b/worktree.h
@@ -5,6 +5,7 @@ struct worktree {
char *path;
char *id;
char *head_ref;
+   char *lock_reason;  /* NULL means not locked */
unsigned char head_sha1[20];
int is_detached;
int is_bare;
-- 
2.8.2.524.g6ff3d78

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v3 2/6] worktree.c: find_worktree() learns to identify worktrees by basename

2016-05-30 Thread Nguyễn Thái Ngọc Duy
This allows the user to do something like "worktree lock foo" instead of
"worktree lock ". With completion support it could be quite
convenient. While this base name search can be done in the same worktree
iteration loop, the code is split into a separate function for clarity.

Suggested-by: Eric Sunshine 
Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 worktree.c | 21 +
 1 file changed, 21 insertions(+)

diff --git a/worktree.c b/worktree.c
index 0782e00..4dd7b77 100644
--- a/worktree.c
+++ b/worktree.c
@@ -214,12 +214,33 @@ const char *get_worktree_git_dir(const struct worktree 
*wt)
return git_common_path("worktrees/%s", wt->id);
 }
 
+static struct worktree *find_worktree_by_basename(struct worktree **list,
+ const char *base_name)
+{
+   struct worktree *found = NULL;
+   int nr_found = 0;
+
+   for (; *list && nr_found < 2; list++) {
+   char *path = xstrdup((*list)->path);
+   if (!fspathcmp(base_name, basename(path))) {
+   found = *list;
+   nr_found++;
+   }
+   free(path);
+   }
+   return nr_found == 1 ? found : NULL;
+}
+
 struct worktree *find_worktree(struct worktree **list,
   const char *prefix,
   const char *arg)
 {
+   struct worktree *wt;
char *path;
 
+   if ((wt = find_worktree_by_basename(list, arg)))
+   return wt;
+
arg = prefix_filename(prefix, strlen(prefix), arg);
path = xstrdup(real_path(arg));
for (; *list; list++)
-- 
2.8.2.524.g6ff3d78

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v3 1/6] worktree.c: add find_worktree()

2016-05-30 Thread Nguyễn Thái Ngọc Duy
So far we haven't needed to identify an existing worktree from command
line. Future commands such as lock or move will need it. The current
implementation identifies worktrees by path (*). In future, the function
could learn to identify by $(basename $path) or tags...

(*) We could probably go cheaper with comparing inode number (and
probably more reliable than paths when unicode enters the game). But not
all systems have good inode that so let's stick to something simple for
now.

Helped-by: Eric Sunshine 
Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 worktree.c | 15 +++
 worktree.h |  8 
 2 files changed, 23 insertions(+)

diff --git a/worktree.c b/worktree.c
index f4a4f38..0782e00 100644
--- a/worktree.c
+++ b/worktree.c
@@ -214,6 +214,21 @@ const char *get_worktree_git_dir(const struct worktree *wt)
return git_common_path("worktrees/%s", wt->id);
 }
 
+struct worktree *find_worktree(struct worktree **list,
+  const char *prefix,
+  const char *arg)
+{
+   char *path;
+
+   arg = prefix_filename(prefix, strlen(prefix), arg);
+   path = xstrdup(real_path(arg));
+   for (; *list; list++)
+   if (!fspathcmp(path, real_path((*list)->path)))
+   break;
+   free(path);
+   return *list;
+}
+
 int is_worktree_being_rebased(const struct worktree *wt,
  const char *target)
 {
diff --git a/worktree.h b/worktree.h
index 1394909..7ad15da 100644
--- a/worktree.h
+++ b/worktree.h
@@ -29,6 +29,14 @@ extern struct worktree **get_worktrees(void);
  */
 extern const char *get_worktree_git_dir(const struct worktree *wt);
 
+/*
+ * Search a worktree that can be unambiguously identified by
+ * "arg". "prefix" must not be NULL.
+ */
+extern struct worktree *find_worktree(struct worktree **list,
+ const char *prefix,
+ const char *arg);
+
 /*
  * Free up the memory for worktree(s)
  */
-- 
2.8.2.524.g6ff3d78

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v3 3/6] worktree.c: add is_main_worktree()

2016-05-30 Thread Nguyễn Thái Ngọc Duy
Main worktree _is_ different. You can lock a linked worktree but not the
main one, for example. Provide an API for checking that.

Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 worktree.c | 5 +
 worktree.h | 5 +
 2 files changed, 10 insertions(+)

diff --git a/worktree.c b/worktree.c
index 4dd7b77..6432eec 100644
--- a/worktree.c
+++ b/worktree.c
@@ -250,6 +250,11 @@ struct worktree *find_worktree(struct worktree **list,
return *list;
 }
 
+int is_main_worktree(const struct worktree *wt)
+{
+   return !wt->id;
+}
+
 int is_worktree_being_rebased(const struct worktree *wt,
  const char *target)
 {
diff --git a/worktree.h b/worktree.h
index 7ad15da..e1c4715 100644
--- a/worktree.h
+++ b/worktree.h
@@ -37,6 +37,11 @@ extern struct worktree *find_worktree(struct worktree **list,
  const char *prefix,
  const char *arg);
 
+/*
+ * Return true if the given worktree is the main one.
+ */
+extern int is_main_worktree(const struct worktree *wt);
+
 /*
  * Free up the memory for worktree(s)
  */
-- 
2.8.2.524.g6ff3d78

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v3 6/6] worktree: add "unlock" command

2016-05-30 Thread Nguyễn Thái Ngọc Duy
Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 Documentation/git-worktree.txt |  5 +
 builtin/worktree.c | 28 
 contrib/completion/git-completion.bash |  2 +-
 t/t2028-worktree-move.sh   | 14 ++
 4 files changed, 48 insertions(+), 1 deletion(-)

diff --git a/Documentation/git-worktree.txt b/Documentation/git-worktree.txt
index 6fca6cf..6a5ff47 100644
--- a/Documentation/git-worktree.txt
+++ b/Documentation/git-worktree.txt
@@ -13,6 +13,7 @@ SYNOPSIS
 'git worktree list' [--porcelain]
 'git worktree lock' [--reason ] 
 'git worktree prune' [-n] [-v] [--expire ]
+'git worktree unlock' 
 
 DESCRIPTION
 ---
@@ -73,6 +74,10 @@ prune::
 
 Prune working tree information in $GIT_DIR/worktrees.
 
+unlock::
+
+Unlock a worktree, allowing it to be pruned, moved or deleted.
+
 OPTIONS
 ---
 
diff --git a/builtin/worktree.c b/builtin/worktree.c
index 84fe63b..5fe9323 100644
--- a/builtin/worktree.c
+++ b/builtin/worktree.c
@@ -16,6 +16,7 @@ static const char * const worktree_usage[] = {
N_("git worktree list []"),
N_("git worktree lock [] "),
N_("git worktree prune []"),
+   N_("git worktree unlock "),
NULL
 };
 
@@ -495,6 +496,31 @@ static int lock_worktree(int ac, const char **av, const 
char *prefix)
return 0;
 }
 
+static int unlock_worktree(int ac, const char **av, const char *prefix)
+{
+   struct option options[] = {
+   OPT_END()
+   };
+   struct worktree **worktrees, *wt;
+   int ret;
+
+   ac = parse_options(ac, av, prefix, options, worktree_usage, 0);
+   if (ac != 1)
+   usage_with_options(worktree_usage, options);
+
+   worktrees = get_worktrees();
+   wt = find_worktree(worktrees, prefix, av[0]);
+   if (!wt)
+   die(_("'%s' is not a working directory"), av[0]);
+   if (is_main_worktree(wt))
+   die(_("'%s' is a main working directory"), av[0]);
+   if (!wt->lock_reason)
+   die(_("'%s' is not locked"), av[0]);
+   ret = unlink_or_warn(git_common_path("worktrees/%s/locked", wt->id));
+   free_worktrees(worktrees);
+   return ret;
+}
+
 int cmd_worktree(int ac, const char **av, const char *prefix)
 {
struct option options[] = {
@@ -513,5 +539,7 @@ int cmd_worktree(int ac, const char **av, const char 
*prefix)
return list(ac - 1, av + 1, prefix);
if (!strcmp(av[1], "lock"))
return lock_worktree(ac - 1, av + 1, prefix);
+   if (!strcmp(av[1], "unlock"))
+   return unlock_worktree(ac - 1, av + 1, prefix);
usage_with_options(worktree_usage, options);
 }
diff --git a/contrib/completion/git-completion.bash 
b/contrib/completion/git-completion.bash
index f88727d..0e3841d 100644
--- a/contrib/completion/git-completion.bash
+++ b/contrib/completion/git-completion.bash
@@ -2597,7 +2597,7 @@ _git_whatchanged ()
 
 _git_worktree ()
 {
-   local subcommands="add list lock prune"
+   local subcommands="add list lock prune unlock"
local subcommand="$(__git_find_on_cmdline "$subcommands")"
if [ -z "$subcommand" ]; then
__gitcomp "$subcommands"
diff --git a/t/t2028-worktree-move.sh b/t/t2028-worktree-move.sh
index 87afc2e..68d3fe8 100755
--- a/t/t2028-worktree-move.sh
+++ b/t/t2028-worktree-move.sh
@@ -45,4 +45,18 @@ test_expect_success 'lock worktree twice (from the locked 
worktree)' '
test_cmp expected .git/worktrees/source/locked
 '
 
+test_expect_success 'unlock main worktree' '
+   test_must_fail git worktree unlock .
+'
+
+test_expect_success 'unlock linked worktree' '
+   git worktree unlock source &&
+   test_path_is_missing .git/worktrees/source/locked
+'
+
+test_expect_success 'unlock worktree twice' '
+   test_must_fail git worktree unlock source &&
+   test_path_is_missing .git/worktrees/source/locked
+'
+
 test_done
-- 
2.8.2.524.g6ff3d78

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] push: deny policy to prevent pushes to unwanted remotes.

2016-05-30 Thread Antoine Queru
Currently, a user wanting to prevent accidental pushes to the wrong remote
has to create a pre-push hook.
The feature offers a configuration to allow users to prevent accidental pushes
to the wrong remote. The user may define a list of whitelisted remotes, a list
of blacklisted remotes and a default policy ("allow" or "deny"). A push
is denied if the remote is explicitely blacklisted or if it isn't
whitelisted and the default policy is "deny".

This feature is intended as a safety net more than a real security, the user
will always be able to modify the config if he wants to. It is here for him to
consciously restrict his push possibilities. For example, it may be useful
for an unexperimented user fearing to push to the wrong remote, or for
companies wanting to avoid unintentionnal leaking of private code on public
repositories.

By default the whitelist/blacklist feature is disabled since the default policy
is "allow".

Signed-off-by: Antoine Queru 
Signed-off-by: Francois Beutin 
Signed-off-by: Matthieu Moy 
---
This is the first implementation of the feature proposed in SoCG 2016.
The conversation about it can be found here:
http://thread.gmane.org/gmane.comp.version-control.git/295166
and http://thread.gmane.org/gmane.comp.version-control.git/289749.
We have included in cc all the participants in the previous conversation.
 Documentation/config.txt| 17 +++
 Documentation/git-push.txt  | 32 +
 builtin/push.c  | 75 ---
 t/t5544-push-whitelist-blacklist.sh | 90 +
 4 files changed, 209 insertions(+), 5 deletions(-)
 create mode 100755 t/t5544-push-whitelist-blacklist.sh

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 53f00db..1478ce3 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -2517,6 +2517,23 @@ remote.pushDefault::
`branch..remote` for all branches, and is overridden by
`branch..pushRemote` for specific branches.
 
+remote.pushBlacklisted::
+   The list of remotes the user is forbidden to push to.
+   See linkgit:git-push[1]
+
+remote.pushWhitelisted::
+   The list of remotes the user is allowed to push to.
+   See linkgit:git-push[1]
+
+remote.pushDefaultPolicy::
+   Can be set to either 'allow' or 'deny', defines the policy to adopt
+   when the user tries to push to a remote not in the whitelist or 
+   the blacklist. Defaults to 'allow'.
+
+remote.pushDenyMessage::
+   The message printed when pushing to a forbidden remote.
+   Defaults to "Pushing to this remote has been forbidden".
+
 remote..url::
The URL of a remote repository.  See linkgit:git-fetch[1] or
linkgit:git-push[1].
diff --git a/Documentation/git-push.txt b/Documentation/git-push.txt
index cf6ee4a..644bfde 100644
--- a/Documentation/git-push.txt
+++ b/Documentation/git-push.txt
@@ -368,6 +368,38 @@ reason::
refs, no explanation is needed. For a failed ref, the reason for
failure is described.
 
+
+Note about denying pushes to the wrong remotes
+--
+
+If you fear accidental pushes to the wrong remotes, you can use the
+blacklist/whitelist feature. The goal is to catch pushes to unwanted
+remotes before they happen.
+
+The simplest way to forbid pushes to a remote is to add its url in the
+config file under the 'remote.pushBlacklisted' variable.
+A more restrictive way is to change 'remote.pushDefaultPolicy' to 'deny',
+thus denying pushes to every remote not whitelisted. You can then add
+the right remotes under 'remote.pushWhitelisted'.
+
+You can also configure more advanced acceptation/denial behavior following
+this rule: the more the url in the config prefixes the asked url the more
+priority it has.
+
+For example, if we set up the configuration variables like this:
+   git config --add remote.pushBlacklisted repository.com
+   git config --add remote.pushWhitelisted repository.com/Special_Folder
+Any push of this form will be accepted:
+   git push repository.com/Special_Folder/foo
+While those ones for example will be denied:
+   git push repository.com/Normal_Folder/bar
+
+Additionally, you can configure the message printed when a push is denied
+with the 'remote.pushDenyMessage' configuration variable.
+
+An error will be raised if the url is blacklisted and whitelisted at the same.
+
+
 Note about fast-forwards
 
 
diff --git a/builtin/push.c b/builtin/push.c
index 4e9e4db..0aa1f7d 100644
--- a/builtin/push.c
+++ b/builtin/push.c
@@ -11,6 +11,8 @@
 #include "submodule.h"
 #include "submodule-config.h"
 #include "send-pack.h"
+#include "urlmatch.h"
+#include "url.h"
 
 static const char * const push_usage[] = {
N_("git push [] [ [...]]"),
@@ -353,6 +355,68 @@ static int 

[PATCH 1/2] strbuf: add tests

2016-05-30 Thread William Duclot
Test the strbuf API. Being used throughout all Git the API could be
considered tested, but adding specific tests makes it easier to improve
and extend the API.
---
 Makefile   |  1 +
 t/helper/test-strbuf.c | 69 ++
 t/t0082-strbuf.sh  | 19 ++
 3 files changed, 89 insertions(+)
 create mode 100644 t/helper/test-strbuf.c
 create mode 100755 t/t0082-strbuf.sh

diff --git a/Makefile b/Makefile
index 3f03366..dc84f43 100644
--- a/Makefile
+++ b/Makefile
@@ -613,6 +613,7 @@ TEST_PROGRAMS_NEED_X += test-scrap-cache-tree
 TEST_PROGRAMS_NEED_X += test-sha1
 TEST_PROGRAMS_NEED_X += test-sha1-array
 TEST_PROGRAMS_NEED_X += test-sigchain
+TEST_PROGRAMS_NEED_X += test-strbuf
 TEST_PROGRAMS_NEED_X += test-string-list
 TEST_PROGRAMS_NEED_X += test-submodule-config
 TEST_PROGRAMS_NEED_X += test-subprocess
diff --git a/t/helper/test-strbuf.c b/t/helper/test-strbuf.c
new file mode 100644
index 000..622f627
--- /dev/null
+++ b/t/helper/test-strbuf.c
@@ -0,0 +1,69 @@
+#include "git-compat-util.h"
+#include "strbuf.h"
+
+/*
+ * Check behavior on usual use cases
+ */
+int test_usual(struct strbuf *sb)
+{
+   size_t size, old_alloc;
+   char *res, *old_buf, *str_test = malloc(5*sizeof(char));
+   strbuf_grow(sb, 1);
+   strcpy(str_test, "test");
+   old_alloc = sb->alloc;
+   strbuf_grow(sb, 1000);
+   if (old_alloc == sb->alloc)
+   die("strbuf_grow does not realloc the buffer as expected");
+   old_buf = sb->buf;
+   res = strbuf_detach(sb, );
+   if (res != old_buf)
+   die("strbuf_detach does not return the expected buffer");
+   free(res);
+
+   strcpy(str_test, "test");
+   strbuf_attach(sb, (void *)str_test, strlen(str_test), sizeof(str_test));
+   res = strbuf_detach(sb, );
+   if (res != str_test)
+   die("strbuf_detach does not return the expected buffer");
+   free(res);
+   strbuf_release(sb);
+
+   return 0;
+}
+
+int main(int argc, char *argv[])
+{
+   size_t size = 1;
+   struct strbuf sb;
+   char str_test[5] = "test";
+   char str_foo[7] = "foo";
+
+   if (argc != 2)
+   usage("test-strbuf mode");
+
+   if (!strcmp(argv[1], "basic_grow")) {
+   /*
+* Check if strbuf_grow(0) allocate a new NUL-terminated buffer
+*/
+   strbuf_init(, 0);
+   strbuf_grow(, 0);
+   if (sb.buf == strbuf_slopbuf)
+   die("strbuf_grow failed to alloc memory");
+   strbuf_release();
+   if (sb.buf != strbuf_slopbuf)
+   die("strbuf_release does not reinitialize the strbuf");
+   } else if (!strcmp(argv[1], "strbuf_check_behavior")) {
+   strbuf_init(, 0);
+   return test_usual();
+   } else if (!strcmp(argv[1], "grow_overflow")) {
+   /*
+* size_t overflow: should die()
+*/
+   strbuf_init(, 1000);
+   strbuf_grow(, maximum_unsigned_value_of_type((size_t)1));
+   } else {
+   usage("test-strbuf mode");
+   }
+
+   return 0;
+}
diff --git a/t/t0082-strbuf.sh b/t/t0082-strbuf.sh
new file mode 100755
index 000..0800d26
--- /dev/null
+++ b/t/t0082-strbuf.sh
@@ -0,0 +1,19 @@
+#!/bin/sh
+
+test_description="Test the strbuf API.
+"
+. ./test-lib.sh
+
+test_expect_success 'basic test on strbuf_grow()' '
+   test-strbuf basic_grow
+'
+
+test_expect_success 'check strbuf behavior in usual use cases ' '
+   test-strbuf strbuf_check_behavior
+'
+
+test_expect_success 'overflow while calling strbuf_grow' '
+   test_must_fail test-strbuf grow_overflow
+'
+
+test_done
-- 
2.8.2.403.ge2646ba.dirty

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 2/2] strbuf: allow to use preallocated memory

2016-05-30 Thread William Duclot
It is unfortunate that it is currently impossible to use a strbuf
without doing a memory allocation. So code like

void f()
{
char path[PATH_MAX];
...
}

typically gets turned into either

void f()
{
struct strbuf path;
strbuf_add(, ...); <-- does a malloc
...
strbuf_release();  <-- does a free
}

which costs extra memory allocations, or

void f()
{
static struct strbuf path;
strbuf_add(, ...);
...
strbuf_setlen(, 0);
}

which, by using a static variable, avoids most of the malloc/free
overhead, but makes the function unsafe to use recursively or from
multiple threads. Those limitations prevent strbuf to be used in
performance-critical operations.

THE IDEA


The idea here is to enhance strbuf to allow it to use memory that it
doesn't own (for example, stack-allocated memory), while (optionally)
allowing it to switch over to using allocated memory if the string grows
past the size of the pre-allocated buffer.

API ENHANCEMENT
---

All functions of the API can still be reliably called without
knowledge of the initialization (normal/preallocated/fixed) with the
exception that strbuf_grow() may die() if the string try to overflow a
fixed buffer.

The API contract is still respected:

- The API users may peek strbuf.buf in-place until they perform an
  operation that makes it longer (at which point the .buf pointer
  may point at a new piece of memory).

- The API users may strbuf_detach() to obtain a piece of memory that
  belongs to them (at which point the strbuf becomes empty), hence
  needs to be freed by the callers.

Full credit to Michael Haggerty for the idea and most of the wording of
this commit (http://mid.gmane.org/53512db6.1070...@alum.mit.edu). The
implementation and bugs are all mine.

Signed-off by: William Duclot 
Signed-off by: Simon Rabourg 
Signed-off by: Matthieu Moy 
---
 strbuf.c   | 68 ++
 strbuf.h   | 31 +--
 t/helper/test-strbuf.c | 42 +++
 t/t0082-strbuf.sh  | 28 +
 4 files changed, 162 insertions(+), 7 deletions(-)

diff --git a/strbuf.c b/strbuf.c
index 1ba600b..527b986 100644
--- a/strbuf.c
+++ b/strbuf.c
@@ -1,6 +1,14 @@
 #include "cache.h"
 #include "refs.h"
 #include "utf8.h"
+#include 
+
+/**
+ * Flags
+ * --
+ */
+#define STRBUF_OWNS_MEMORY 1
+#define STRBUF_FIXED_MEMORY (1 << 1)
 
 int starts_with(const char *str, const char *prefix)
 {
@@ -20,16 +28,37 @@ char strbuf_slopbuf[1];
 
 void strbuf_init(struct strbuf *sb, size_t hint)
 {
+   sb->flags = 0;
sb->alloc = sb->len = 0;
sb->buf = strbuf_slopbuf;
if (hint)
strbuf_grow(sb, hint);
 }
 
+void strbuf_wrap_preallocated(struct strbuf *sb, char *path_buf,
+ size_t path_buf_len, size_t alloc_len)
+{
+   if (!path_buf)
+   die("you try to use a NULL buffer to initialize a strbuf");
+
+   strbuf_init(sb, 0);
+   strbuf_attach(sb, path_buf, path_buf_len, alloc_len);
+   sb->flags &= ~STRBUF_OWNS_MEMORY;
+   sb->flags &= ~STRBUF_FIXED_MEMORY;
+}
+
+void strbuf_wrap_fixed(struct strbuf *sb, char *path_buf,
+  size_t path_buf_len, size_t alloc_len)
+{
+   strbuf_wrap_preallocated(sb, path_buf, path_buf_len, alloc_len);
+   sb->flags |= STRBUF_FIXED_MEMORY;
+}
+
 void strbuf_release(struct strbuf *sb)
 {
if (sb->alloc) {
-   free(sb->buf);
+   if (sb->flags & STRBUF_OWNS_MEMORY)
+   free(sb->buf);
strbuf_init(sb, 0);
}
 }
@@ -38,7 +67,11 @@ char *strbuf_detach(struct strbuf *sb, size_t *sz)
 {
char *res;
strbuf_grow(sb, 0);
-   res = sb->buf;
+   if (sb->flags & STRBUF_OWNS_MEMORY)
+   res = sb->buf;
+   else
+   res = xmemdupz(sb->buf, sb->alloc - 1);
+
if (sz)
*sz = sb->len;
strbuf_init(sb, 0);
@@ -51,6 +84,8 @@ void strbuf_attach(struct strbuf *sb, void *buf, size_t len, 
size_t alloc)
sb->buf   = buf;
sb->len   = len;
sb->alloc = alloc;
+   sb->flags |= STRBUF_OWNS_MEMORY;
+   sb->flags &= ~STRBUF_FIXED_MEMORY;
strbuf_grow(sb, 0);
sb->buf[sb->len] = '\0';
 }
@@ -61,9 +96,32 @@ void strbuf_grow(struct strbuf *sb, size_t extra)
if (unsigned_add_overflows(extra, 1) ||
unsigned_add_overflows(sb->len, extra + 1))
die("you want to use way too much memory");
-   if (new_buf)
-   sb->buf = NULL;
-   ALLOC_GROW(sb->buf, sb->len + extra + 1, sb->alloc);
+   if ((sb->flags & STRBUF_FIXED_MEMORY) && sb->len + extra + 1 > 
sb->alloc)
+   die("you try to make a string overflow the buffer of a fixed 
strbuf");
+
+   

[PATCH 0/2] strbuf: improve API

2016-05-30 Thread William Duclot
This patch series implements an improvment of the strbuf API, allowing
strbuf to use preallocated memory. This makes strbuf fit to be used
in performance-critical operations.

The first patch is simply a preparatory work, adding tests for
existing strbuf implementation.
Most of the work is made in the second patch: handle pre-allocated
memory, extend the API, document it and test it.

As said in the second commit, the idea comes from Michael Haggerty.

William Duclot (2):
strbuf: add tests
strbuf: allow to use preallocated memory

Makefile   |   1 +
strbuf.c   |  68 
+++-
strbuf.h   |  31 +--
t/helper/test-strbuf.c | 111 
+++
t/t0082-strbuf.sh  |  47 +++
5 files changed, 251 insertions(+), 7 deletions(-)

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 1/5] worktree.c: add find_worktree_by_path()

2016-05-30 Thread Duy Nguyen
On Mon, May 23, 2016 at 11:11 AM, Eric Sunshine  wrote:
> On Sun, May 22, 2016 at 6:43 AM, Nguyễn Thái Ngọc Duy  
> wrote:
>> So far we haven't needed to identify an existing worktree from command
>> line. Future commands such as lock or move will need it. There are of
>> course other options for identifying a worktree, for example by branch
>> or even by internal id. They may be added later if proved useful.
>
> Beyond the above methods for specifying a worktree, [1] adds
> $(basename $path) as a possibility if not ambiguous. Excerpt:
>
> For git-worktree commands such as "lock", "mv", "remove", it
> likely will be nice to allow people to specify the linked
> worktree not only by path, but also by tag, and possibly even by
> $(basename $path) if not ambiguous.

Thanks. This changes things. Turns out my "find_by_path" API  cannot
be easily extended to support identifying worktrees some other way. I
will need to pass the command line argument as-is, so that this
"find_worktree" (by any means) has enough info to decide (especially
about ambiguation).
-- 
Duy
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[RFC] Triangular Workflow UI improvments

2016-05-30 Thread Jordan DE GEA

> Le 27 mai 2016 à 09:32, Philip Oakley  a écrit :
> For me, the first step would be to actually document a (the?) Triangular 
> Workflow in the documentation, so we are all taking about the same broad 
> method.
> 
> At the moment there is a choice (assuming a ithub like service) of either 
> clone and then fork, or fork and clone the fork, which leave the user with 
> different fixups of their config's being required, so describing the easier 
> one would help folk.
> 
> Likewise there are missing terms such as for the third place (the personal 
> fork) that is neither the upstream, nor the local repo. Making sure the 
> terminology is crisp and clean will greatly ease any implementation issues. 
> And then there are the possible workflows...

Yes, you’re right. 

Writing a complete tutorial is a good idea. 

I start this job and the content will depend 
on the result of this discussion. 

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[RFC] Triangular Workflow UI improvement

2016-05-30 Thread Jordan DE GEA
>> We are working on full implementation of triangular workflow feature.
>> For now, the main options available are: 
>>   - branch..pushRemote
>>   - remote.pushDefault
>> And only setable by hands. 
> 
> And once it is set, you do not have to worry about it.  I am not
> sure per-branch thing is all that useful, unless you are always
> working on a single branch like 'master', but the latter would be
> just set once and forget about it.

We got the example of our fork of git(e.g. GitHub). 
We want : 
 - some branches that only fetch from git/git and push to our/git
 - the others branches fetch from and push to our/git. 
For now, we have to change branch..remote and 
branch..pushRemote(or remote.pushDefault). 
With branch..fetchRemote, we only have to set this one. 
fetchRemote is more explicit than remote for that case. 

Another point is that many commands can erase the old 
value of branch..remote. 

>> Context:
>>  - One main remote repository, e.g. git/git.
>>  - A remote fork (e.g. a GitHub fork) of git/git, e.g. me/git.
>>  - A local clone of me/git on the machine
>> Purposes:
>>  - the local branch master has to fetch to git/git by default
>>  - the local branch master has to push to me/git by default
> 
> Wouldn't remote.pushDefault be the single thing you need to set just
> once and forget about it?  Why would your users even want to do
> these things …

remote.pushDefault overrides branch..remote for all branches. 
The goal is to give an easily understandable config for 
complex configuration. 
Having all the configuration in the same part (e.g [branch « master »]) 
is easier to understand and edit. 

> 
>>  c. add `git fetch --set-default` in order to set remote.fetchDefault
>>  d. add `git fetch --set-remote` in order to set 
>> branch..fetchRemote
>>  e. add `git pull --set-default` in order to set remote.fetchDefault
>>  f. add `git pull --set-remote` in order to set branch..fetchRemote
>>  a. add `git push --set-default` in order to set remote.pushDefault
>>  b. add `git push --set-remote` in order to set branch..pushRemote
> 
> ... just to configure many variables every time they work on a new
> branch?

branch..pushRemote and all these options are optional.
Options in commands may not be added. It just gives user 
friendly interaction with these options.  



To conclude: 
This feature is more about configuration clarity than possibility itself. 

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 2/3] perf: make the tests work in worktrees

2016-05-30 Thread Johannes Schindelin
Hi René,

On Sun, 29 May 2016, René Scharfe wrote:

> Am 13.05.2016 um 15:25 schrieb Johannes Schindelin:
> > This patch makes perf-lib.sh more robust so that it can run correctly
> > even inside a worktree. For example, it assumed that $GIT_DIR/objects is
> > the objects directory (which is not the case for worktrees) and it used
> > the commondir file verbatim, even if it contained a relative path.
> > 
> > Furthermore, the setup code expected `git rev-parse --git-dir` to spit
> > out a relative path, which is also not true for worktrees. Let's just
> > change the code to accept both relative and absolute paths, by avoiding
> > the `cd` into the copied working directory.
> > 
> > Signed-off-by: Johannes Schindelin 
> > ---
> >   t/perf/perf-lib.sh | 14 +++---
> >   1 file changed, 7 insertions(+), 7 deletions(-)
> > 
> > diff --git a/t/perf/perf-lib.sh b/t/perf/perf-lib.sh
> > index 9fa0706..5ef1744 100644
> > --- a/t/perf/perf-lib.sh
> > +++ b/t/perf/perf-lib.sh
> > @@ -80,22 +80,22 @@ test_perf_create_repo_from () {
> > error "bug in the test script: not 2 parameters to test-create-repo"
> > repo="$1"
> > source="$2"
> > -   source_git=$source/$(cd "$source" && git rev-parse --git-dir)
> > +   source_git="$(git -C "$source" rev-parse --git-dir)"
> > +   objects_dir="$(git -C "$source" rev-parse --git-path objects)"
> > mkdir -p "$repo/.git"
> > (
> > -   cd "$repo/.git" &&
> > -   { cp -Rl "$source_git/objects" . 2>/dev/null ||
> > -   cp -R "$source_git/objects" .; } &&
> > +   { cp -Rl "$objects_dir" "$repo/.git/" 2>/dev/null ||
> > +   cp -R "$objects_dir" "$repo/.git/"; } &&
> > for stuff in "$source_git"/*; do
> > case "$stuff" in
> > -   */objects|*/hooks|*/config)
> > +   */objects|*/hooks|*/config|*/commondir)
> > ;;
> > *)
> > -   cp -R "$stuff" . || exit 1
> > +   cp -R "$stuff" "$repo/.git/" || exit 1
> > ;;
> > esac
> > done &&
> > -   cd .. &&
> > +   cd "$repo" &&
> > git init -q && {
> > test_have_prereq SYMLINKS ||
> > git config core.symlinks false
> > 
> 
> This breaks perf for the non-worktree case:

Oh drats!

> lsr@debian:~/src/git/t/perf$ make
> rm -rf test-results
> ./run
> === Running 12 tests in this tree ===
> cp: cannot stat '.git/objects': No such file or directory
> error: failed to copy repository '/home/lsr/src/git/t/..' to '/tmp/trash 
> directory.p-perf-lib-sanity'
> cp: cannot stat '.git/objects': No such file or directory
> error: failed to copy repository '/home/lsr/src/git/t/..' to '/tmp/trash 
> directory.p0001-rev-list'
> ...
> 
> Here's a fix:
> 
> -- >8 --
> Subject: perf: make the tests work without a worktree
> 
> In regular repositories $source_git and $objects_dir contain relative
> paths based on $source.  Go there to allow cp to resolve them.
> 
> Signed-off-by: Rene Scharfe 
> ---
>  t/perf/perf-lib.sh | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/t/perf/perf-lib.sh b/t/perf/perf-lib.sh
> index 5ef1744..1888790 100644
> --- a/t/perf/perf-lib.sh
> +++ b/t/perf/perf-lib.sh
> @@ -84,6 +84,7 @@ test_perf_create_repo_from () {
>   objects_dir="$(git -C "$source" rev-parse --git-path objects)"
>   mkdir -p "$repo/.git"
>   (
> + cd "$source" &&

I fear that interacts badly with the `cd "$repo"` I introduced later
(replacing a `cd ..`)...

Ciao,
Dscho

[RFC] Triangular Workflow UI improvement

2016-05-30 Thread Jordan DE GEA
In my last message, I forgot to add some important 
contributors linked to this feature. Some of you 
already replied to me. 
I will answer shortly. Sorry for the noise. 

> 
> We are working on full implementation of triangular workflow feature.
> For now, the main options available are: 
>- branch..pushRemote
>- remote.pushDefault
> And only setable by hands. 
> 
> As it can be difficult to understand, here is what we want to do. 
> 
> 
> Context:
>   - One main remote repository, e.g. git/git.
>   - A remote fork (e.g. a GitHub fork) of git/git, e.g. me/git.
>   - A local clone of me/git on the machine
> 
> Purposes:
>   - the local branch master has to fetch to git/git by default
>   - the local branch master has to push to me/git by default
> 
> Configuration wanted:
>   - Add a remote to git/git e.g. `git remote add ...`
>   - Set the fetch remote for branch as default. 
> 
> For now, we can do that by setting: 
>   - branch..remote to git/git
>   - branch..pushRemote to me/git
> But many options set `branch..remote`, a suitable solution is to
> implement an option for the fetch for example. 
> 
> 
> Here is what we want to implement: 
> 
> 1. 
>   a. add the config var: remote.fetchDefault
>   b. add the config var: branch..fetchRemote
>   c. add `git fetch --set-default` in order to set remote.fetchDefault
>   d. add `git fetch --set-remote` in order to set 
> branch..fetchRemote
>   e. add `git pull --set-default` in order to set remote.fetchDefault
>   f. add `git pull --set-remote` in order to set branch..fetchRemote
> 
> 2. 
>   a. add `git push --set-default` in order to set remote.pushDefault
>   b. add `git push --set-remote` in order to set branch..pushRemote
> 
> 
> What's your opinion about this feature ?


--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 05/13] remote rm: handle symbolic refs correctly

2016-05-30 Thread Michael Haggerty
In the modern world of reference backends, it is not OK to delete a
symref by unlink()ing the file directly. This must be done via the refs
API.

We do so by adding the symref to the list of references to delete along
with the non-symbolic references, then calling delete_refs() with the
new flags option set to REF_NODEREF.

Signed-off-by: Michael Haggerty 
---
 builtin/remote.c | 6 +-
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/builtin/remote.c b/builtin/remote.c
index 1bbf9b4..c4b4d67 100644
--- a/builtin/remote.c
+++ b/builtin/remote.c
@@ -539,10 +539,6 @@ static int add_branch_for_removal(const char *refname,
return 0;
}
 
-   /* make sure that symrefs are deleted */
-   if (flags & REF_ISSYMREF)
-   return unlink(git_path("%s", refname));
-
string_list_append(branches->branches, refname);
 
return 0;
@@ -788,7 +784,7 @@ static int rm(int argc, const char **argv)
strbuf_release();
 
if (!result)
-   result = delete_refs(, 0);
+   result = delete_refs(, REF_NODEREF);
string_list_clear(, 0);
 
if (skipped.nr) {
-- 
2.8.1

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 02/13] do_for_each_ref(): move docstring to the header file

2016-05-30 Thread Michael Haggerty
Signed-off-by: Michael Haggerty 
---
 refs/files-backend.c |  9 -
 refs/refs-internal.h | 10 +-
 2 files changed, 9 insertions(+), 10 deletions(-)

diff --git a/refs/files-backend.c b/refs/files-backend.c
index 1230dfb..68db3e8 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -1878,15 +1878,6 @@ static int do_for_each_entry(struct ref_cache *refs, 
const char *base,
return retval;
 }
 
-/*
- * Call fn for each reference in the specified ref_cache for which the
- * refname begins with base.  If trim is non-zero, then trim that many
- * characters off the beginning of each refname before passing the
- * refname to fn.  flags can be DO_FOR_EACH_INCLUDE_BROKEN to include
- * broken references in the iteration.  If fn ever returns a non-zero
- * value, stop the iteration and return that value; otherwise, return
- * 0.
- */
 int do_for_each_ref(const char *submodule, const char *base,
each_ref_fn fn, int trim, int flags, void *cb_data)
 {
diff --git a/refs/refs-internal.h b/refs/refs-internal.h
index 1bb3d87..b4dd545 100644
--- a/refs/refs-internal.h
+++ b/refs/refs-internal.h
@@ -249,7 +249,15 @@ int rename_ref_available(const char *oldname, const char 
*newname);
 #define DO_FOR_EACH_INCLUDE_BROKEN 0x01
 
 /*
- * The common backend for the for_each_*ref* functions
+ * Call fn for each reference in the specified submodule for which the
+ * refname begins with base. If trim is non-zero, then trim that many
+ * characters off the beginning of each refname before passing the
+ * refname to fn. flags can be DO_FOR_EACH_INCLUDE_BROKEN to include
+ * broken references in the iteration. If fn ever returns a non-zero
+ * value, stop the iteration and return that value; otherwise, return
+ * 0.
+ *
+ * This is the common backend for the for_each_*ref* functions.
  */
 int do_for_each_ref(const char *submodule, const char *base,
each_ref_fn fn, int trim, int flags, void *cb_data);
-- 
2.8.1

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 09/13] refs: introduce an iterator interface

2016-05-30 Thread Michael Haggerty
Currently, the API for iterating over references is via a family of
for_each_ref()-type functions that invoke a callback function for each
selected reference. All of these eventually call do_for_each_ref(),
which knows how to do one thing: iterate in parallel through two
ref_caches, one for loose and one for packed refs, giving loose
references precedence over packed refs. This is rather complicated code,
and is quite specialized to the files backend. It also requires callers
to encapsulate their work into a callback function, which often means
that they have to define and use a "cb_data" struct to manage their
context.

The current design is already bursting at the seams, and will become
even more awkward in the upcoming world of multiple reference storage
backends:

* Per-worktree vs. shared references are currently handled via a kludge
  in git_path() rather than iterating over each part of the reference
  namespace separately and merging the results. This kludge will cease
  to work when we have multiple reference storage backends.

* The current scheme is inflexible. What if we sometimes want to bypass
  the ref_cache, or use it only for packed or only for loose refs? What
  if we want to store symbolic refs in one type of storage backend and
  non-symbolic ones in another?

In the future, each reference backend will need to define its own way of
iterating over references. The crux of the problem with the current
design is that it is impossible to compose for_each_ref()-style
iterations, because the flow of control is owned by the for_each_ref()
function. There is nothing that a caller can do but iterate through all
references in a single burst, so there is no way for it to interleave
references from multiple backends and present the result to the rest of
the world as a single compound backend.

This commit introduces a new iteration primitive for references: a
ref_iterator. A ref_iterator is a polymorphic object that a reference
storage backend can be asked to instantiate. There are three functions
that can be applied to a ref_iterator:

* ref_iterator_advance(): move to the next reference in the iteration
* ref_iterator_abort(): end the iteration before it is exhausted
* ref_iterator_peel(): peel the reference currently being looked at

Iterating using a ref_iterator leaves the flow of control in the hands
of the caller, which means that ref_iterators from multiple
sources (e.g., loose and packed refs) can be composed and presented to
the world as a single compound ref_iterator.

It also means that the backend code for implementing reference iteration
will sometimes be more complicated. For example, the
cache_ref_iterator (which iterates over a ref_cache) can't use the C
stack to recurse; instead, it must manage its own stack internally as
explicit data structures. There is also a lot of boilerplate connected
with object-oriented programming in C.

Eventually, end-user callers will be able to be written in a more
natural way—managing their own flow of control rather than having to
work via callbacks. Since there will only be a few reference backends
but there are many consumers of this API, this is a good tradeoff.

More importantly, we gain composability, and especially the possibility
of writing interchangeable parts that can work with any ref_iterator.

For example, merge_ref_iterator implements a generic way of merging the
contents of any two ref_iterators. It is used to merge loose + packed
refs as part of the implementation of the files_ref_iterator. But it
will also be possible to use it to merge other pairs of reference
sources (e.g., per-worktree vs. shared refs).

Another example is prefix_ref_iterator, which can be used to trim a
prefix off the front of reference names before presenting them to the
caller (e.g., "refs/heads/master" -> "master").

In this patch, we introduce the iterator abstraction and many utilities,
and implement a reference iterator for the files ref storage backend.
(I've written several other obvious utilities, for example a generic way
to filter references being iterated over. These will probably be useful
in the future. But they are not needed for this patch series, so I am
not including them at this time.)

In a moment we will rewrite do_for_each_ref() to work via reference
iterators (allowing some special-purpose code to be discarded), and do
something similar for reflogs. In future patch series, we will expose
the ref_iterator abstraction in the public refs API so that callers can
use it directly.

Implementation note: I tried abstracting this a layer further to allow
generic iterators (over arbitrary types of objects) and generic
utilities like a generic merge_iterator. But the implementation in C was
very cumbersome, involving (in my opinion) too much boilerplate and too
much unsafe casting, some of which would have had to be done on the
caller side. However, I did put a few iterator-related constants in a
top-level header file, iterator.h, as they will be 

[PATCH 04/13] delete_refs(): add a flags argument

2016-05-30 Thread Michael Haggerty
This will be useful for passing REF_NODEREF through.

Signed-off-by: Michael Haggerty 
---
 builtin/fetch.c  | 2 +-
 builtin/remote.c | 4 ++--
 refs.h   | 5 +++--
 refs/files-backend.c | 4 ++--
 4 files changed, 8 insertions(+), 7 deletions(-)

diff --git a/builtin/fetch.c b/builtin/fetch.c
index f8455bd..b55c83c 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -806,7 +806,7 @@ static int prune_refs(struct refspec *refs, int ref_count, 
struct ref *ref_map,
for (ref = stale_refs; ref; ref = ref->next)
string_list_append(, ref->name);
 
-   result = delete_refs();
+   result = delete_refs(, 0);
string_list_clear(, 0);
}
 
diff --git a/builtin/remote.c b/builtin/remote.c
index fda5c2e..1bbf9b4 100644
--- a/builtin/remote.c
+++ b/builtin/remote.c
@@ -788,7 +788,7 @@ static int rm(int argc, const char **argv)
strbuf_release();
 
if (!result)
-   result = delete_refs();
+   result = delete_refs(, 0);
string_list_clear(, 0);
 
if (skipped.nr) {
@@ -1303,7 +1303,7 @@ static int prune_remote(const char *remote, int dry_run)
string_list_sort(_to_prune);
 
if (!dry_run)
-   result |= delete_refs(_to_prune);
+   result |= delete_refs(_to_prune, 0);
 
for_each_string_list_item(item, ) {
const char *refname = item->util;
diff --git a/refs.h b/refs.h
index 21874f0..6d515a4 100644
--- a/refs.h
+++ b/refs.h
@@ -274,9 +274,10 @@ int delete_ref(const char *refname, const unsigned char 
*old_sha1,
 /*
  * Delete the specified references. If there are any problems, emit
  * errors but attempt to keep going (i.e., the deletes are not done in
- * an all-or-nothing transaction).
+ * an all-or-nothing transaction). flags is passed through to
+ * ref_transaction_delete().
  */
-int delete_refs(struct string_list *refnames);
+int delete_refs(struct string_list *refnames, unsigned int flags);
 
 /** Delete a reflog */
 int delete_reflog(const char *refname);
diff --git a/refs/files-backend.c b/refs/files-backend.c
index 5af7441..a0d09f4 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -2403,7 +2403,7 @@ static int delete_ref_loose(struct ref_lock *lock, int 
flag, struct strbuf *err)
return 0;
 }
 
-int delete_refs(struct string_list *refnames)
+int delete_refs(struct string_list *refnames, unsigned int flags)
 {
struct strbuf err = STRBUF_INIT;
int i, result = 0;
@@ -2432,7 +2432,7 @@ int delete_refs(struct string_list *refnames)
for (i = 0; i < refnames->nr; i++) {
const char *refname = refnames->items[i].string;
 
-   if (delete_ref(refname, NULL, 0))
+   if (delete_ref(refname, NULL, flags))
result |= error(_("could not remove reference %s"), 
refname);
}
 
-- 
2.8.1

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 03/13] refs: use name "prefix" consistently

2016-05-30 Thread Michael Haggerty
In the context of the for_each_ref() functions, call the prefix that
references must start with "prefix". (In some places it was called
"base".) This is clearer, and also prevents confusion with another
planned use of the word "base".

Signed-off-by: Michael Haggerty 
---
 refs/files-backend.c | 24 
 refs/refs-internal.h | 14 +++---
 2 files changed, 19 insertions(+), 19 deletions(-)

diff --git a/refs/files-backend.c b/refs/files-backend.c
index 68db3e8..5af7441 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -542,7 +542,7 @@ static struct ref_entry *current_ref;
 typedef int each_ref_entry_fn(struct ref_entry *entry, void *cb_data);
 
 struct ref_entry_cb {
-   const char *base;
+   const char *prefix;
int trim;
int flags;
each_ref_fn *fn;
@@ -559,7 +559,7 @@ static int do_one_ref(struct ref_entry *entry, void 
*cb_data)
struct ref_entry *old_current_ref;
int retval;
 
-   if (!starts_with(entry->name, data->base))
+   if (!starts_with(entry->name, data->prefix))
return 0;
 
if (!(data->flags & DO_FOR_EACH_INCLUDE_BROKEN) &&
@@ -1824,12 +1824,12 @@ int peel_ref(const char *refname, unsigned char *sha1)
 
 /*
  * Call fn for each reference in the specified ref_cache, omitting
- * references not in the containing_dir of base.  fn is called for all
- * references, including broken ones.  If fn ever returns a non-zero
+ * references not in the containing_dir of prefix. Call fn for all
+ * references, including broken ones. If fn ever returns a non-zero
  * value, stop the iteration and return that value; otherwise, return
  * 0.
  */
-static int do_for_each_entry(struct ref_cache *refs, const char *base,
+static int do_for_each_entry(struct ref_cache *refs, const char *prefix,
 each_ref_entry_fn fn, void *cb_data)
 {
struct packed_ref_cache *packed_ref_cache;
@@ -1846,8 +1846,8 @@ static int do_for_each_entry(struct ref_cache *refs, 
const char *base,
 * disk.
 */
loose_dir = get_loose_refs(refs);
-   if (base && *base) {
-   loose_dir = find_containing_dir(loose_dir, base, 0);
+   if (prefix && *prefix) {
+   loose_dir = find_containing_dir(loose_dir, prefix, 0);
}
if (loose_dir)
prime_ref_dir(loose_dir);
@@ -1855,8 +1855,8 @@ static int do_for_each_entry(struct ref_cache *refs, 
const char *base,
packed_ref_cache = get_packed_ref_cache(refs);
acquire_packed_ref_cache(packed_ref_cache);
packed_dir = get_packed_ref_dir(packed_ref_cache);
-   if (base && *base) {
-   packed_dir = find_containing_dir(packed_dir, base, 0);
+   if (prefix && *prefix) {
+   packed_dir = find_containing_dir(packed_dir, prefix, 0);
}
 
if (packed_dir && loose_dir) {
@@ -1878,14 +1878,14 @@ static int do_for_each_entry(struct ref_cache *refs, 
const char *base,
return retval;
 }
 
-int do_for_each_ref(const char *submodule, const char *base,
+int do_for_each_ref(const char *submodule, const char *prefix,
each_ref_fn fn, int trim, int flags, void *cb_data)
 {
struct ref_entry_cb data;
struct ref_cache *refs;
 
refs = get_ref_cache(submodule);
-   data.base = base;
+   data.prefix = prefix;
data.trim = trim;
data.flags = flags;
data.fn = fn;
@@ -1896,7 +1896,7 @@ int do_for_each_ref(const char *submodule, const char 
*base,
if (ref_paranoia)
data.flags |= DO_FOR_EACH_INCLUDE_BROKEN;
 
-   return do_for_each_entry(refs, base, do_one_ref, );
+   return do_for_each_entry(refs, prefix, do_one_ref, );
 }
 
 /*
diff --git a/refs/refs-internal.h b/refs/refs-internal.h
index b4dd545..8ad02d8 100644
--- a/refs/refs-internal.h
+++ b/refs/refs-internal.h
@@ -250,16 +250,16 @@ int rename_ref_available(const char *oldname, const char 
*newname);
 
 /*
  * Call fn for each reference in the specified submodule for which the
- * refname begins with base. If trim is non-zero, then trim that many
- * characters off the beginning of each refname before passing the
- * refname to fn. flags can be DO_FOR_EACH_INCLUDE_BROKEN to include
- * broken references in the iteration. If fn ever returns a non-zero
- * value, stop the iteration and return that value; otherwise, return
- * 0.
+ * refname begins with prefix. If trim is non-zero, then trim that
+ * many characters off the beginning of each refname before passing
+ * the refname to fn. flags can be DO_FOR_EACH_INCLUDE_BROKEN to
+ * include broken references in the iteration. If fn ever returns a
+ * non-zero value, stop the iteration and return that value;
+ * otherwise, return 0.
  *
  * This is the common backend for the for_each_*ref* functions.
  */
-int do_for_each_ref(const char *submodule, const char *base,
+int do_for_each_ref(const char 

[PATCH 01/13] refs: remove unnecessary "extern" keywords

2016-05-30 Thread Michael Haggerty
There's continuing work in this area, so clean up unneeded "extern"
keywords rather than schlepping them around. Also split up some overlong
lines and add parameter names in a couple of places.

Signed-off-by: Michael Haggerty 
---
 refs.h | 132 +++--
 1 file changed, 72 insertions(+), 60 deletions(-)

diff --git a/refs.h b/refs.h
index 9230d47..21874f0 100644
--- a/refs.h
+++ b/refs.h
@@ -52,19 +52,19 @@
 #define RESOLVE_REF_NO_RECURSE 0x02
 #define RESOLVE_REF_ALLOW_BAD_NAME 0x04
 
-extern const char *resolve_ref_unsafe(const char *refname, int resolve_flags,
- unsigned char *sha1, int *flags);
+const char *resolve_ref_unsafe(const char *refname, int resolve_flags,
+  unsigned char *sha1, int *flags);
 
-extern char *resolve_refdup(const char *refname, int resolve_flags,
-   unsigned char *sha1, int *flags);
+char *resolve_refdup(const char *refname, int resolve_flags,
+unsigned char *sha1, int *flags);
 
-extern int read_ref_full(const char *refname, int resolve_flags,
-unsigned char *sha1, int *flags);
-extern int read_ref(const char *refname, unsigned char *sha1);
+int read_ref_full(const char *refname, int resolve_flags,
+ unsigned char *sha1, int *flags);
+int read_ref(const char *refname, unsigned char *sha1);
 
-extern int ref_exists(const char *refname);
+int ref_exists(const char *refname);
 
-extern int is_branch(const char *refname);
+int is_branch(const char *refname);
 
 /*
  * If refname is a non-symbolic reference that refers to a tag object,
@@ -74,24 +74,25 @@ extern int is_branch(const char *refname);
  * Symbolic references are considered unpeelable, even if they
  * ultimately resolve to a peelable tag.
  */
-extern int peel_ref(const char *refname, unsigned char *sha1);
+int peel_ref(const char *refname, unsigned char *sha1);
 
 /**
  * Resolve refname in the nested "gitlink" repository that is located
  * at path.  If the resolution is successful, return 0 and set sha1 to
  * the name of the object; otherwise, return a non-zero value.
  */
-extern int resolve_gitlink_ref(const char *path, const char *refname, unsigned 
char *sha1);
+int resolve_gitlink_ref(const char *path, const char *refname,
+   unsigned char *sha1);
 
 /*
  * Return true iff abbrev_name is a possible abbreviation for
  * full_name according to the rules defined by ref_rev_parse_rules in
  * refs.c.
  */
-extern int refname_match(const char *abbrev_name, const char *full_name);
+int refname_match(const char *abbrev_name, const char *full_name);
 
-extern int dwim_ref(const char *str, int len, unsigned char *sha1, char **ref);
-extern int dwim_log(const char *str, int len, unsigned char *sha1, char **ref);
+int dwim_ref(const char *str, int len, unsigned char *sha1, char **ref);
+int dwim_log(const char *str, int len, unsigned char *sha1, char **ref);
 
 /*
  * A ref_transaction represents a collection of ref updates
@@ -182,38 +183,45 @@ typedef int each_ref_fn(const char *refname,
  * modifies the reference also returns a nonzero value to immediately
  * stop the iteration.
  */
-extern int head_ref(each_ref_fn fn, void *cb_data);
-extern int for_each_ref(each_ref_fn fn, void *cb_data);
-extern int for_each_ref_in(const char *prefix, each_ref_fn fn, void *cb_data);
-extern int for_each_fullref_in(const char *prefix, each_ref_fn fn, void 
*cb_data, unsigned int broken);
-extern int for_each_tag_ref(each_ref_fn fn, void *cb_data);
-extern int for_each_branch_ref(each_ref_fn fn, void *cb_data);
-extern int for_each_remote_ref(each_ref_fn fn, void *cb_data);
-extern int for_each_replace_ref(each_ref_fn fn, void *cb_data);
-extern int for_each_glob_ref(each_ref_fn fn, const char *pattern, void 
*cb_data);
-extern int for_each_glob_ref_in(each_ref_fn fn, const char *pattern, const 
char *prefix, void *cb_data);
+int head_ref(each_ref_fn fn, void *cb_data);
+int for_each_ref(each_ref_fn fn, void *cb_data);
+int for_each_ref_in(const char *prefix, each_ref_fn fn, void *cb_data);
+int for_each_fullref_in(const char *prefix, each_ref_fn fn, void *cb_data,
+   unsigned int broken);
+int for_each_tag_ref(each_ref_fn fn, void *cb_data);
+int for_each_branch_ref(each_ref_fn fn, void *cb_data);
+int for_each_remote_ref(each_ref_fn fn, void *cb_data);
+int for_each_replace_ref(each_ref_fn fn, void *cb_data);
+int for_each_glob_ref(each_ref_fn fn, const char *pattern, void *cb_data);
+int for_each_glob_ref_in(each_ref_fn fn, const char *pattern,
+const char *prefix, void *cb_data);
 
-extern int head_ref_submodule(const char *submodule, each_ref_fn fn, void 
*cb_data);
-extern int for_each_ref_submodule(const char *submodule, each_ref_fn fn, void 
*cb_data);
-extern int for_each_ref_in_submodule(const char *submodule, const char *prefix,

[PATCH 10/13] do_for_each_ref(): reimplement using reference iteration

2016-05-30 Thread Michael Haggerty
Use the reference iterator interface to implement do_for_each_ref().
Delete a bunch of code supporting the old for_each_ref() implementation.
And now that do_for_each_ref() is generic code (it is no longer tied to
the files backend), move it to refs.c.

The implementation is via a new function, do_for_each_ref_iterator(),
which takes a reference iterator as argument and calls a callback
function for each of the references in the iterator.

This change requires the current_ref performance hack for peel_ref() to
be implemented via ref_iterator_peel() rather than peel_entry() because
we don't have a ref_entry handy (it is hidden under three layers:
file_ref_iterator, merge_ref_iterator, and cache_ref_iterator). So:

* do_for_each_ref_iterator() records the active iterator in
  current_ref_iter while it is running.

* peel_ref() checks whether current_ref_iter is pointing at the
  requested reference. If so, it asks the iterator to peel the
  reference (which it can do efficiently via its "peel" virtual
  function). For extra safety, we do the optimization only if the
  refname *addresses* are the same, not only if the refname *strings*
  are the same, to forestall possible mixups between refnames that come
  from different ref_iterators.

Please note that this optimization of peel_ref() is only available when
iterating via do_for_each_ref_iterator() (including all of the
for_each_ref() functions, which call it indirectly). It would be
complicated to implement a similar optimization when iterating directly
using a reference iterator, because multiple reference iterators can be
in use at the same time, with interleaved calls to
ref_iterator_advance(). (In fact we do exactly that in
merge_ref_iterator.)

But that is not necessary. peel_ref() is only called while iterating
over references. Callers who iterate using the for_each_ref() functions
benefit from the optimization described above. Callers who iterate using
reference iterators directly have access to the ref_iterator, so they
can call ref_iterator_peel() themselves to get an analogous optimization
in a more straightforward manner.

If we rewrite all callers to use the reference iteration API, then we
can remove the current_ref_iter hack permanently.

Signed-off-by: Michael Haggerty 
---
 refs.c   |  20 +
 refs/files-backend.c | 206 ++-
 refs/iterator.c  |  29 
 refs/refs-internal.h |  33 ++---
 4 files changed, 76 insertions(+), 212 deletions(-)

diff --git a/refs.c b/refs.c
index 842c5c7..814cad3 100644
--- a/refs.c
+++ b/refs.c
@@ -1120,6 +1120,26 @@ int head_ref(each_ref_fn fn, void *cb_data)
return head_ref_submodule(NULL, fn, cb_data);
 }
 
+/*
+ * Call fn for each reference in the specified submodule for which the
+ * refname begins with prefix. If trim is non-zero, then trim that
+ * many characters off the beginning of each refname before passing
+ * the refname to fn. flags can be DO_FOR_EACH_INCLUDE_BROKEN to
+ * include broken references in the iteration. If fn ever returns a
+ * non-zero value, stop the iteration and return that value;
+ * otherwise, return 0.
+ */
+static int do_for_each_ref(const char *submodule, const char *prefix,
+  each_ref_fn fn, int trim, int flags, void *cb_data)
+{
+   struct ref_iterator *iter;
+
+   iter = files_ref_iterator_begin(submodule, prefix, flags);
+   iter = prefix_ref_iterator_begin(iter, prefix, trim);
+
+   return do_for_each_ref_iterator(iter, fn, cb_data);
+}
+
 int for_each_ref(each_ref_fn fn, void *cb_data)
 {
return do_for_each_ref(NULL, "", fn, 0, 0, cb_data);
diff --git a/refs/files-backend.c b/refs/files-backend.c
index dbf1587..c48a006 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -542,53 +542,8 @@ static int entry_resolves_to_object(struct ref_entry 
*entry)
  >u.value.oid, entry->flag);
 }
 
-/*
- * current_ref is a performance hack: when iterating over references
- * using the for_each_ref*() functions, current_ref is set to the
- * current reference's entry before calling the callback function.  If
- * the callback function calls peel_ref(), then peel_ref() first
- * checks whether the reference to be peeled is the current reference
- * (it usually is) and if so, returns that reference's peeled version
- * if it is available.  This avoids a refname lookup in a common case.
- */
-static struct ref_entry *current_ref;
-
 typedef int each_ref_entry_fn(struct ref_entry *entry, void *cb_data);
 
-struct ref_entry_cb {
-   const char *prefix;
-   int trim;
-   int flags;
-   each_ref_fn *fn;
-   void *cb_data;
-};
-
-/*
- * Handle one reference in a do_for_each_ref*()-style iteration,
- * calling an each_ref_fn for each entry.
- */
-static int do_one_ref(struct ref_entry *entry, void *cb_data)
-{
-   struct ref_entry_cb *data = cb_data;
-   struct ref_entry 

[PATCH 06/13] get_ref_cache(): only create an instance if there is a submodule

2016-05-30 Thread Michael Haggerty
If there is not a nonbare repository where a submodule is supposedly
located, then don't instantiate a ref_cache for it.

The analogous check can be removed from resolve_gitlink_ref().

Signed-off-by: Michael Haggerty 
---
This doesn't actually reduce the number of ref_cache instances
generated by out test suite, but it is a more logical place for
the check that was added in

  a2d5156c resolve_gitlink_ref: ignore non-repository paths

 refs/files-backend.c | 33 ++---
 1 file changed, 22 insertions(+), 11 deletions(-)

diff --git a/refs/files-backend.c b/refs/files-backend.c
index a0d09f4..142c977 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -954,15 +954,26 @@ static struct ref_cache *lookup_ref_cache(const char 
*submodule)
 
 /*
  * Return a pointer to a ref_cache for the specified submodule. For
- * the main repository, use submodule==NULL. The returned structure
- * will be allocated and initialized but not necessarily populated; it
- * should not be freed.
+ * the main repository, use submodule==NULL; such a call cannot fail.
+ * For a submodule, the submodule must exist and be a nonbare
+ * repository, otherwise return NULL.
+ *
+ * The returned structure will be allocated and initialized but not
+ * necessarily populated; it should not be freed.
  */
 static struct ref_cache *get_ref_cache(const char *submodule)
 {
struct ref_cache *refs = lookup_ref_cache(submodule);
-   if (!refs)
-   refs = create_ref_cache(submodule);
+
+   if (!refs) {
+   struct strbuf submodule_sb = STRBUF_INIT;
+
+   strbuf_addstr(_sb, submodule);
+   if (is_nonbare_repository_dir(_sb))
+   refs = create_ref_cache(submodule);
+   strbuf_release(_sb);
+   }
+
return refs;
 }
 
@@ -1341,13 +1352,10 @@ int resolve_gitlink_ref(const char *path, const char 
*refname, unsigned char *sh
return -1;
 
strbuf_add(, path, len);
-   refs = lookup_ref_cache(submodule.buf);
+   refs = get_ref_cache(submodule.buf);
if (!refs) {
-   if (!is_nonbare_repository_dir()) {
-   strbuf_release();
-   return -1;
-   }
-   refs = create_ref_cache(submodule.buf);
+   strbuf_release();
+   return -1;
}
strbuf_release();
 
@@ -1885,6 +1893,9 @@ int do_for_each_ref(const char *submodule, const char 
*prefix,
struct ref_cache *refs;
 
refs = get_ref_cache(submodule);
+   if (!refs)
+   return 0;
+
data.prefix = prefix;
data.trim = trim;
data.flags = flags;
-- 
2.8.1

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 08/13] ref_resolves_to_object(): new function

2016-05-30 Thread Michael Haggerty
Extract new function ref_resolves_to_object() from
entry_resolves_to_object(). It can be used even if there is no ref_entry
at hand.

Signed-off-by: Michael Haggerty 
---
 refs/files-backend.c | 33 +++--
 1 file changed, 23 insertions(+), 10 deletions(-)

diff --git a/refs/files-backend.c b/refs/files-backend.c
index 1a46f32..8ab4d5f 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -513,19 +513,32 @@ static void sort_ref_dir(struct ref_dir *dir)
 }
 
 /*
- * Return true iff the reference described by entry can be resolved to
- * an object in the database.  Emit a warning if the referred-to
- * object does not exist.
+ * Return true if refname, which has the specified oid and flags, can
+ * be resolved to an object in the database. If the referred-to object
+ * does not exist, emit a warning and return false.
+ */
+static int ref_resolves_to_object(const char *refname,
+ const struct object_id *oid,
+ unsigned int flags)
+{
+   if (flags & REF_ISBROKEN)
+   return 0;
+   if (!has_sha1_file(oid->hash)) {
+   error("%s does not point to a valid object!", refname);
+   return 0;
+   }
+   return 1;
+}
+
+/*
+ * Return true if the reference described by entry can be resolved to
+ * an object in the database; otherwise, emit a warning and return
+ * false.
  */
 static int entry_resolves_to_object(struct ref_entry *entry)
 {
-   if (entry->flag & REF_ISBROKEN)
-   return 0;
-   if (!has_sha1_file(entry->u.value.oid.hash)) {
-   error("%s does not point to a valid object!", entry->name);
-   return 0;
-   }
-   return 1;
+   return ref_resolves_to_object(entry->name,
+ >u.value.oid, entry->flag);
 }
 
 /*
-- 
2.8.1

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 11/13] for_each_reflog(): don't abort for bad references

2016-05-30 Thread Michael Haggerty
If there is a file under "$GIT_DIR/logs" with no corresponding
reference, the old code was emitting an error message, aborting the
reflog iteration, and returning -1. But

* None of the callers was checking the exit value

* The callers all want to find all legitimate reflogs (sometimes for the
  purpose of determining object reachability!) and wouldn't benefit from
  a truncated iteration anyway.

So instead, emit an error message and skip the "broken" reflog, but
continue with the iteration.

Signed-off-by: Michael Haggerty 
---
 refs/files-backend.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/refs/files-backend.c b/refs/files-backend.c
index c48a006..a7cc0e2 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -3325,7 +3325,7 @@ static int do_for_each_reflog(struct strbuf *name, 
each_ref_fn fn, void *cb_data
struct object_id oid;
 
if (read_ref_full(name->buf, 0, oid.hash, NULL))
-   retval = error("bad ref for %s", 
name->buf);
+   error("bad ref for %s", name->buf);
else
retval = fn(name->buf, , 0, 
cb_data);
}
-- 
2.8.1

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 00/13] Reference iterators

2016-05-30 Thread Michael Haggerty
This patch series implements a new iteration paradigm for iterating
over references using iterators. This approach was proposed earlier as
an RFC [1].

The justification for this change is laid out in the RFC [1] and in
the commit message for patch 09/13 [2]. Please refer to those, as I
won't repeat them here.

There are several obvious followup steps that are not included in this
patch series (because they don't help with the initial transition to
pluggable reference backends). I've written prototypes of several of
these:

* The iterator interface could be made public and callers could start
  using it directly.

* A filter_ref_iterator could let references be filtered based on a
  callback function (this would be useful, for example, for
  for_each_glob_ref()).

* A single_ref_iterator could "iterate" over a single reference (e.g.,
  HEAD).

* A series_ref_iterator could iterate over multiple iterators, for
  callers that want to operate on, say, HEAD plus all branches.

* The ref_cache could be fed from an iterator, to better decouple
  caching from reading packed and loose references, and to make it
  easy to use caching with other reference storage backends.

* Per-worktree refs could be overlaid on top of shared references
  using merge_ref_iterator rather than mixing them up in the same
  ref_cache.

* The dir_iterator could be used in more places; for example, when
  reading loose references from disk.

Table of contents of changes:

* The first eight patches are cleanups.

  * Patch 05/13 fixes a code path that unlinks symrefs directly
instead of using the refs API.

* Patch 09/13 is the most important part of this series. It introduces
  not only reference iterators, but also (1) a pattern that other
  iterator interfaces can follow (along with some useful constants in
  iterator.h), and (2) a pattern for building OO code with
  inheritance.

* Patch 10/13 actually uses the new reference iteration mechanism.

* Patch 11/13 avoids aborting reflog iterations if an unexpected file
  is found under `$GIT_DIR/logs`. This fixes a bug that could cause
  objects needed by reflogs to be pruned, breaking the reflogs.

* Patch 12/13 adds an iterator interface for iterating over
  directories (using the same model as patch 09/13).

* Patch 13/13 implements for_each_reflog() on top of an iterator
  interface (essentially the analogue of patch 10/13, but for
  reflogs).

Note that it is not necessary to rebuild for_each_reflog_ent() on top
of iterators at this time, because that function deals with only a
single reference at a time. Therefore, composability is not important
here (for example, it won't have to deal with multiple refs backends
at the same time).

This patch series applies on top of mh/split-under-lock. It is also
available from my GitHub repository [3] as branch "ref-iterators".

I haven't yet completely rebased the ref-store changes (virtualization
of the refs API) on top of all of these changes, but I will work on
that next.

Michael

[1] http://thread.gmane.org/gmane.comp.version-control.git/290409
[2] 
http://mid.gmane.org/89634d216544d1102dafd5d18247bff2581d48a8.1464537050.git.mhag...@alum.mit.edu
[3] https://github.com/mhagger/git

Michael Haggerty (13):
  refs: remove unnecessary "extern" keywords
  do_for_each_ref(): move docstring to the header file
  refs: use name "prefix" consistently
  delete_refs(): add a flags argument
  remote rm: handle symbolic refs correctly
  get_ref_cache(): only create an instance if there is a submodule
  entry_resolves_to_object(): rename function from
ref_resolves_to_object()
  ref_resolves_to_object(): new function
  refs: introduce an iterator interface
  do_for_each_ref(): reimplement using reference iteration
  for_each_reflog(): don't abort for bad references
  dir_iterator: new API for iterating over a directory tree
  for_each_reflog(): reimplement using iterators

 Makefile |   2 +
 builtin/fetch.c  |   2 +-
 builtin/remote.c |   8 +-
 dir-iterator.c   | 180 +++
 dir-iterator.h   |  86 +++
 iterator.h   |  81 +++
 refs.c   |  20 ++
 refs.h   | 139 +++-
 refs/files-backend.c | 630 +++
 refs/iterator.c  | 376 ++
 refs/refs-internal.h | 225 +-
 11 files changed, 1427 insertions(+), 322 deletions(-)
 create mode 100644 dir-iterator.c
 create mode 100644 dir-iterator.h
 create mode 100644 iterator.h
 create mode 100644 refs/iterator.c

-- 
2.8.1

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 12/13] dir_iterator: new API for iterating over a directory tree

2016-05-30 Thread Michael Haggerty
The iterator interface is modeled on that for references, though no
vtable is necessary because there is (so far?) only one type of
dir_iterator.

There are obviously a lot of features that could easily be added to this
class:

* Skip/include directory paths in the iteration
* Shallow/deep iteration
* Letting the caller decide which subdirectories to recurse into (e.g.,
  via a dir_iterator_advance_into() function)
* Option to iterate in sorted order
* Option to iterate over directory paths before vs. after their contents

But these are not needed for the current patch series, so I refrain.

Signed-off-by: Michael Haggerty 
---
 Makefile   |   1 +
 dir-iterator.c | 180 +
 dir-iterator.h |  86 +++
 3 files changed, 267 insertions(+)
 create mode 100644 dir-iterator.c
 create mode 100644 dir-iterator.h

diff --git a/Makefile b/Makefile
index ac8f365..b4ffc11 100644
--- a/Makefile
+++ b/Makefile
@@ -722,6 +722,7 @@ LIB_OBJS += diff-lib.o
 LIB_OBJS += diff-no-index.o
 LIB_OBJS += diff.o
 LIB_OBJS += dir.o
+LIB_OBJS += dir-iterator.o
 LIB_OBJS += editor.o
 LIB_OBJS += entry.o
 LIB_OBJS += environment.o
diff --git a/dir-iterator.c b/dir-iterator.c
new file mode 100644
index 000..e1d60f0
--- /dev/null
+++ b/dir-iterator.c
@@ -0,0 +1,180 @@
+#include "cache.h"
+#include "dir.h"
+#include "iterator.h"
+#include "dir-iterator.h"
+
+struct dir_iterator_level {
+   int initialized;
+
+   DIR *dir;
+
+   /*
+* The length of the directory part of refname at this level
+* (including the trailing '/'):
+*/
+   size_t prefix_len;
+
+   /*
+* The last action that has been taken with the current entry
+* (needed for directories, which have to be included in the
+* iteration and also iterated into):
+*/
+   enum {
+   DIR_STATE_ITER,
+   DIR_STATE_RECURSE
+   } dir_state;
+};
+
+/*
+ * The full data structure used to manage the internal directory
+ * iteration state. It includes members that are not part of the
+ * public interface.
+ */
+struct dir_iterator_int {
+   struct dir_iterator base;
+
+   /*
+* The number of levels currently on the stack. This is always
+* at least 1, because when it becomes zero the iteration is
+* ended and this struct is freed.
+*/
+   size_t levels_nr;
+
+   /* The number of levels that have been allocated on the stack */
+   size_t levels_alloc;
+
+   /*
+* A stack of levels. levels[0] is the uppermost directory
+* that will be included in this iteration.
+*/
+   struct dir_iterator_level *levels;
+};
+
+int dir_iterator_advance(struct dir_iterator *dir_iterator)
+{
+   struct dir_iterator_int *iter =
+   (struct dir_iterator_int *)dir_iterator;
+
+   while (1) {
+   struct dir_iterator_level *level =
+   >levels[iter->levels_nr - 1];
+   struct dirent *de;
+
+   if (!level->initialized) {
+   if (!is_dir_sep(iter->base.path.buf[iter->base.path.len 
- 1]))
+   strbuf_addch(>base.path, '/');
+   level->prefix_len = iter->base.path.len;
+
+   /* opendir() errors are handled below */
+   level->dir = opendir(iter->base.path.buf);
+
+   level->initialized = 1;
+   } else if (S_ISDIR(iter->base.st.st_mode)) {
+   if (level->dir_state == DIR_STATE_ITER) {
+   /*
+* The directory was just iterated
+* over; now prepare to iterate into
+* it.
+*/
+   level->dir_state = DIR_STATE_RECURSE;
+   ALLOC_GROW(iter->levels, iter->levels_nr + 1,
+  iter->levels_alloc);
+   level = >levels[iter->levels_nr++];
+   level->initialized = 0;
+   continue;
+   } else {
+   /*
+* The directory has already been
+* iterated over and iterated into;
+* we're done with it.
+*/
+   }
+   }
+
+   if (!level->dir) {
+   /*
+* This level is exhausted (or wasn't opened
+* successfully); pop up a level.
+*/
+   if (--iter->levels_nr == 0) {
+   return dir_iterator_abort(dir_iterator);
+   }
+

[PATCH 13/13] for_each_reflog(): reimplement using iterators

2016-05-30 Thread Michael Haggerty
Allow references with reflogs to be iterated over using a ref_iterator.
The latter is implemented as a files_reflog_iterator, which in turn uses
dir_iterator to read the "logs" directory.

Note that reflog iteration doesn't correctly handle per-worktree
reflogs (either before or after this patch).

Signed-off-by: Michael Haggerty 
---
 refs/files-backend.c | 113 ---
 refs/refs-internal.h |   7 
 2 files changed, 78 insertions(+), 42 deletions(-)

diff --git a/refs/files-backend.c b/refs/files-backend.c
index a7cc0e2..2e0d006 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -2,6 +2,7 @@
 #include "../refs.h"
 #include "refs-internal.h"
 #include "../iterator.h"
+#include "../dir-iterator.h"
 #include "../lockfile.h"
 #include "../object.h"
 #include "../dir.h"
@@ -3292,60 +3293,88 @@ int for_each_reflog_ent(const char *refname, 
each_reflog_ent_fn fn, void *cb_dat
strbuf_release();
return ret;
 }
-/*
- * Call fn for each reflog in the namespace indicated by name.  name
- * must be empty or end with '/'.  Name will be used as a scratch
- * space, but its contents will be restored before return.
- */
-static int do_for_each_reflog(struct strbuf *name, each_ref_fn fn, void 
*cb_data)
+
+struct files_reflog_iterator {
+   struct ref_iterator base;
+
+   struct dir_iterator *dir_iterator;
+   struct object_id oid;
+};
+
+static int files_reflog_iterator_advance(struct ref_iterator *ref_iterator)
 {
-   DIR *d = opendir(git_path("logs/%s", name->buf));
-   int retval = 0;
-   struct dirent *de;
-   int oldlen = name->len;
+   struct files_reflog_iterator *iter =
+   (struct files_reflog_iterator *)ref_iterator;
+   struct dir_iterator *diter = iter->dir_iterator;
+   int ok;
 
-   if (!d)
-   return name->len ? errno : 0;
+   while ((ok = dir_iterator_advance(diter)) == ITER_OK) {
+   int flags;
 
-   while ((de = readdir(d)) != NULL) {
-   struct stat st;
-
-   if (de->d_name[0] == '.')
+   if (!S_ISREG(diter->st.st_mode))
+   continue;
+   if (diter->basename[0] == '.')
continue;
-   if (ends_with(de->d_name, ".lock"))
+   if (ends_with(diter->basename, ".lock"))
continue;
-   strbuf_addstr(name, de->d_name);
-   if (stat(git_path("logs/%s", name->buf), ) < 0) {
-   ; /* silently ignore */
-   } else {
-   if (S_ISDIR(st.st_mode)) {
-   strbuf_addch(name, '/');
-   retval = do_for_each_reflog(name, fn, cb_data);
-   } else {
-   struct object_id oid;
 
-   if (read_ref_full(name->buf, 0, oid.hash, NULL))
-   error("bad ref for %s", name->buf);
-   else
-   retval = fn(name->buf, , 0, 
cb_data);
-   }
-   if (retval)
-   break;
+   if (read_ref_full(diter->relative_path, 0,
+ iter->oid.hash, )) {
+   error("bad ref for %s", diter->path.buf);
+   continue;
}
-   strbuf_setlen(name, oldlen);
+
+   iter->base.refname = diter->relative_path;
+   iter->base.oid = >oid;
+   iter->base.flags = flags;
+   return ITER_OK;
}
-   closedir(d);
-   return retval;
+
+   iter->dir_iterator = NULL;
+   if (ref_iterator_abort(ref_iterator) == ITER_ERROR)
+   ok = ITER_ERROR;
+   return ok;
+}
+
+static int files_reflog_iterator_peel(struct ref_iterator *ref_iterator,
+  struct object_id *peeled)
+{
+   die("BUG: ref_iterator_peel() called for reflog_iterator");
+}
+
+static int files_reflog_iterator_abort(struct ref_iterator *ref_iterator)
+{
+   struct files_reflog_iterator *iter =
+   (struct files_reflog_iterator *)ref_iterator;
+   int ok = ITER_DONE;
+
+   if (iter->dir_iterator)
+   ok = dir_iterator_abort(iter->dir_iterator);
+
+   base_ref_iterator_free(ref_iterator);
+   return ok;
+}
+
+struct ref_iterator_vtable files_reflog_iterator_vtable = {
+   files_reflog_iterator_advance,
+   files_reflog_iterator_peel,
+   files_reflog_iterator_abort
+};
+
+struct ref_iterator *files_reflog_iterator_begin(void)
+{
+   struct files_reflog_iterator *iter = xcalloc(1, sizeof(*iter));
+   struct ref_iterator *ref_iterator = >base;
+
+   base_ref_iterator_init(ref_iterator, _reflog_iterator_vtable);
+   iter->dir_iterator = 

[PATCH 07/13] entry_resolves_to_object(): rename function from ref_resolves_to_object()

2016-05-30 Thread Michael Haggerty
Free up the old name for a more general purpose.

Signed-off-by: Michael Haggerty 
---
 refs/files-backend.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/refs/files-backend.c b/refs/files-backend.c
index 142c977..1a46f32 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -517,7 +517,7 @@ static void sort_ref_dir(struct ref_dir *dir)
  * an object in the database.  Emit a warning if the referred-to
  * object does not exist.
  */
-static int ref_resolves_to_object(struct ref_entry *entry)
+static int entry_resolves_to_object(struct ref_entry *entry)
 {
if (entry->flag & REF_ISBROKEN)
return 0;
@@ -563,7 +563,7 @@ static int do_one_ref(struct ref_entry *entry, void 
*cb_data)
return 0;
 
if (!(data->flags & DO_FOR_EACH_INCLUDE_BROKEN) &&
- !ref_resolves_to_object(entry))
+ !entry_resolves_to_object(entry))
return 0;
 
/* Store the old value, in case this is a recursive call: */
@@ -2228,7 +2228,7 @@ static int pack_if_possible_fn(struct ref_entry *entry, 
void *cb_data)
return 0;
 
/* Do not pack symbolic or broken refs: */
-   if ((entry->flag & REF_ISSYMREF) || !ref_resolves_to_object(entry))
+   if ((entry->flag & REF_ISSYMREF) || !entry_resolves_to_object(entry))
return 0;
 
/* Add a packed ref cache entry equivalent to the loose entry. */
-- 
2.8.1

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] README.md: format CLI commands with code syntax

2016-05-30 Thread Matthieu Moy
Benjamin Dopplinger  writes:

> CLI commands which are mentioned in the readme are now formatted with
> the Markdown code syntax to make the documentation more readable.
>
> Signed-off-by: Benjamin Dopplinger 
> ---
>  README.md | 10 +-
>  1 file changed, 5 insertions(+), 5 deletions(-)

Looks good to me, thanks.

-- 
Matthieu Moy
http://www-verimag.imag.fr/~moy/
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html