date:20160219

Re: [PATCH v5 12/27] refs: forbid cross-backend ref renames

2016-02-19 Thread Duy Nguyen

On Thu, Feb 18, 2016 at 12:17 PM, David Turner  wrote:
> This would be pretty weird, but since it will break, we should prevent
> it.
>
> Signed-off-by: David Turner 
> ---
>  refs.c | 6 ++
>  1 file changed, 6 insertions(+)
>
> diff --git a/refs.c b/refs.c
> index f5754f2..8eb04da 100644
> --- a/refs.c
> +++ b/refs.c
> @@ -1306,5 +1306,11 @@ int delete_refs(struct string_list *refnames)
>
>  int rename_ref(const char *oldref, const char *newref, const char *logmsg)
>  {
> +   if ((ref_type(oldref) == REF_TYPE_NORMAL) !=
> +   (ref_type(newref) == REF_TYPE_NORMAL)) {
> +   error(_("Both ref arguments to rename_ref must be normal "
> +   "(or both must be per-worktree/pseudorefs)"));
> +   return -1;

You can do return error(...);

> +   }
> return the_refs_backend->rename_ref(oldref, newref, logmsg);

LMDB backend can't deal with per-worktree rename. So either forbid
per-worktree rename here too, or fall back to files backend.
-- 
Duy
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v5 25/27] refs: add LMDB refs storage backend

2016-02-19 Thread Duy Nguyen

> On Fri, 2016-02-19 at 09:54 +0700, Duy Nguyen wrote:
>> On Fri, Feb 19, 2016 at 3:23 AM, David Turner <
>> dtur...@twopensource.com> wrote:
>> > > > +static int read_per_worktree_ref(const char *submodule, const
>> > > > char
>> > > > *refname,
>> > > > +struct MDB_val *val, int
>> > > > *needs_free)
>> > >
>> > > From what I read, I suspect these _per_worktree functions will be
>> > > identical for the next backend. Should we just hand over the job
>> > > for
>> > > files backend? For all entry points that may deal with per
>> > > -worktree
>> > > refs, e.g. lmdb_resolve_ref_unsafe, can we check ref_type() first
>> > > thing, if it's per-worktree we call
>> > > refs_be_files.resolve_ref_unsafe()
>> > > instead?  It could even be done at frontend level,
>> > > e.g. refs.c:resolve_ref_unsafe().
>> > >
>> > > Though I may be talking rubbish here because I don't know how
>> > > whether
>> > > it has anything to do with transactions.
>> >
>> > The reason I did it this way is that some ref chains cross backend
>> > boundaries (e.g. HEAD -> refs/heads/master).  But if we have other
>> > backends later, we could generalize.
>>
>> Crossing backends should go through frontend again, imo. But I don't
>> really know if it's efficient.
>
> It's pretty tricky to maintain state (e.g. count of symref redirects)
> across that barrier.  So I'm not sure how to do it cleanly.

I notice files backend does pretty much the same thing. "files"
backend looks more like two backends combined in one, one is files,
the other is packed-refs. And it looks like we could solve it by
providing a lower level api, read_raw_ref() or something, that
retrieves the ref without any validation or link following. More on
this later.

>> > > I'm not sure I get this comment. D/F conflicts are no longer a
>> > > thing
>> > > for lmdb backend, right?
>> >
>> > I'm trying to avoid the lmdb backend creating a set of refs that
>> > the
>> > files backend can't handle.  This would make collaboration with
>> > other
>> > versions of git more difficult.
>>
>> It already is. If you create refs "foo" and "FOO" on case sensitive
>> file system and clone it on a case-insensitive one, you face the same
>> problem. We may have an optional configuration knob to prevent
>> incompatibilities with files backend, but I think that should be done
>> (and enforced if necessary) outside backends.
>
> Sure, the current state isn't perfect, but why make it worse?

I see it from a different angle. The current state isn't perfect, but
we will be moving to a better future where "files" backend may
eventually be deprecated. Why hold back?

But this line of reasoning only works if we have a new backend capable
of replacing "files" without regressions or introducing new
dependency. Which is why I suggest a new backend [1] (or implement
Shawn's RefTree if it's proven as good with small repos)

I have no problem if you want to stay strictly compatible with "files"
though.

[1] http://thread.gmane.org/gmane.comp.version-control.git/285893/focus=286654

On Fri, Feb 19, 2016 at 05:49:46PM -0500, David Turner wrote:
> > 
> > This code looks a lot like near the end of resolve_ref_1(). Maybe we
> > could share the code in refs/backend-common.c or something and call
> > here instead?
> 
> Something like the following?
> 
> commit aad6b84fd1869f6e1cf6ed15bcece0c2f6429e9d
> Author: David Turner 
> Date:   Thu Feb 18 17:09:29 2016 -0500
> 
> refs: break out some functions from resolve_ref_1
> 
> A bunch of resolve_ref_1 is not backend-specific, so we can
> break it out into separate internal functions that other
> backends can use.
...
> 
> I'm not sure I like it, because it breaks out these weird tiny
> functions that take a lot of arguments.  But maybe it's worth it?  What
> do you think?

OK how about we keep resolve_ref_1() whole and split real backend code
out? Something like these three patches (only built, did not test). A
bit ugly with continue_symlink, but it's just demonstration.

commit ef46fcdc62ef89fd5260ca054cd1d98f9f2d7c2b
Author: Nguyễn Thái Ngọc Duy 
Date:   Sat Feb 20 09:18:45 2016 +0700

refs/files: move ref I/O code out of resolve_refs_1()

diff --git a/refs/files-backend.c b/refs/files-backend.c
index 4bddfb3..f54f2ae 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -1407,6 +1407,95 @@ static int resolve_missing_loose_ref(const char *refname,
}
 }
 
+static const char *continue_normal_ref = "read_ref returns a normal ref";
+static const char *continue_symlink = "read_ref returns a symlink";
+
+/*
+ * Read a ref from backend. Returning any values except
+ * continue_normal_ref or continue_symlink ends resolve_ref_1()
+ * execution. If the return value is not NULL, sha1 and flags must be
+ * updated correctly. except REF_ISBROKEN which is set by
+ * resolve_ref_1().
+ *
+ * If continue_* is returned, sb_contents must contain the ref data.
+ */
+static

Re: [PATCHv14 5/7] git submodule update: have a dedicated helper for cloning

2016-02-19 Thread Jonathan Nieder

Stefan Beller wrote:
> On Fri, Feb 19, 2016 at 3:07 PM, Jonathan Nieder  wrote:

[...]
>>> + strbuf_addf(err, "Skipping submodule '%s'\n",
>>> + displaypath);
>>
>> Does the caller expect a newline at the end of err?
>>
>> In the refs code that uses an err strbuf, the convention is to
>> not end the message with a newline.  That way, a function like
>> die() can insert a newline and messages are guaranteed to be
>> newline-terminated even if someone is sloppy and does the wrong thing
>> when generating an error.
>
> Oh! So we need to add new lines ourselves here.

Now I'm confused.  The code in this patch is inconsistent.  I was
recommending consistently leaving out the \n.

It looks like pp_start_one takes the content of err without adding
a \n.  That's a bug in pp_start_one and pp_collect_finished IMHO.

For example, default_task_finished and default_start_failure don't
put a newline don't put a newline at the end of the message.  I
don't think that's a bug --- they're doing the right thing, but
pp_start_one and pp_collect_finished is just not handling it
correctly.

>>> + if (pp->reference)
>>> + argv_array_push(>args, pp->reference);
>>> + if (pp->depth)
>>> + argv_array_push(>args, pp->depth);
>>
>> What does 'cp' stand for mean?  Would a name like 'child' work?
>
> child process ? (any code which had a struct child_process referred to
> it as *cp)

Output from 'git grep --F -e "child_process *"' finds variables with
various names, corresponding to what kind of child it is.

[...]
>> Is this the 'list' subcommand?
>
> no. :(

No problem --- that's what review is for.

[...]
>>> + if (pp.print_unmatched) {
>>> + printf("#unmatched\n");
>>
>> I'm still confused about this.  I think a comment where
>> 'print_unmatched' is declared would probably clear it up.
>
> Probably this is a too literal translation from shell to C.
> By printing #unmatched the shell on the other end of the pipe
> of this helper program knows to just stop and error out.
>
> So this is telling the downstream program to stop.
>
> Maybe we can name the variable 'quickstop' ?
> 'abort_and_exit' ?

Interesting.

Would it work for the helper to exit at that point with nonzero status?

Lacking "set -o pipefail", it's a little messy to handle in the shell,
but it's possible:

{
git submodule--helper \
--foo \
--bar \
--baz ||
... handle error, e.g. by printing something
that the other end of the pipe wants to see ...
} |
...

[...]
>> (just curious) why are these saved up and printed all at once instead
>> of being printed as it goes?
>
> To have a clean abort path, i.e. we need to be done with all the parallel 
> stuff
> before we start on doing local on-disk stuff.

Interesting.

That's subtle.  Probably worth a comment so people know not to break
it.  (E.g.

/*
 * We saved the output and splat it all at once now.
 * That means:
 * - the listener does not have to interleave their (checkout)
 *   work with our fetching.  The writes involved in a
 *   checkout involve more straightforward sequential I/O.
 * - the listener can avoid doing any work if fetching failed.
 */

).

Thinking out loud: the other side could write their output to a
temporary file (e.g.  under .git) and save us the trouble of buffering
it.  But the list of submodules and statuses is not likely to be huge
--- buffering doesn't hurt much.

[...]
> In a next step, we can do the checkout in parallel if there is nothing to 
> assume
> to lead to trouble. i.e. update strategy is checkout and the submodule is
> in a clean state at the tip of a branch.

Somebody tried parallelizing file output during checkout and found it
sped up the checkout on some systems by a small amount.  But then
later operations to read back the files were much slower.  It seems
that the output pattern affected the layout of the files on disk in a
bad way.

I'm not too afraid.  Just saying that the benefit of parallel checkout
would be something to measure.

A bigger worry is that checkout might be I/O bound and not benefit
much from parallelism.

> All problems which need to be solved by the user should be presented in
> a sequential way, i.e. present one problem and then full stop.
> That seems to be more encouraging as if we'd throw a "Here are a dozen
> broken submodule repositories which we expect the user fixes up".

It depends on the problem --- sometimes people want to see a list of
errors and fix them all (hence tools like "make -k").  I agree that
stop-on-first-error is a good default and that it's not worth worrying
about introducing --keep-going until someone asks for it.

Thanks for your thoughtfulness,
Jonathan
--
To unsubscribe from this list: send

Re: [PATCH] submodule: Fetch the direct sha1 first

2016-02-19 Thread Junio C Hamano

Stefan Beller  writes:

> On Fri, Feb 19, 2016 at 2:29 PM, Junio C Hamano  wrote:
>> Stefan Beller  writes:
>>
>>> Doing a 'git fetch' only and not the fetch for the specific sha1 would be
>>> incorrect?
>>
>> I thought that was what you are attempting to address.
>
> Yep. In an ideal world I would imagine it would look like
>
> if $sha1 doesn't exist:
> fetch $sha1
> if server did not support fetching direct sha1:
> fallback to fetch 

It should look more like this:

if $sha1's history and objects are incomplete:
fetch ;# normally just like we have done before
if $sha1's history and objects are still incomplete:
fetch $sha1

as existing users already expect that commits and objects that are
reachable from tips of refs configured to be fetched in the
submodule via its configured refspecs are available after this part
of the code runs, regardless of this "Gerrit reviews may not have
arrived to branches yet" issue.  The first "normal" fetch ensures
that the expectation is met.

> Would it make sense in case of broken histories to not fetch
> (specially if the user asked to not fetch) and rather repair by
> making it a shallow repository?

Commits whose ancestors, trees and/or blobs are incomplete can and
do exist in a perfectly healthy repository and there is no breakage
in the history as long as such commits are not reachable from any of
the refs.

You can for example make a small fetch from 'pu' today, that results
in unpack-objects to be run instead of index-pack, and then make
another fetch from 'pu', making these loose objects unreachable from
anywhere.  Maybe there were 5 commits worth of objects in the
original transfer, and the objects necessary for the bottom 2 were
pruned away while the tip one still in the repository [*1*].

"cat-file -e" may find that the tip commit is there, but "rev-list
--objects $oldtip --not --all" will find that the old tip of pu that
is left behind is incomplete and cannot be safely used (e.g. "git
log -p" would fail).  The "$sha1's history and objects are
incomplete" check aka "quickfetch()" test is a way to avoid getting
fooled by an object that passes "cat-file -e" test.

I am not sure if it is feasible, given such an island of commits and
associated objects, to craft a proper "shallow" boundary after the
fact.  It should be doable with extra code, but I do not think there
is a canned support at the plumbing level (you can obviously
construct it with "cat-file -e" and following the inter object links
yourself).

This "fetch" is in a cmd_update() codepath, whose purpose is "Update
each submodule path to correct revision, using clone and checkout as
needed", so I am not sure "to not fetch, specically if the user
asked to not fetch" makes much sense in the first place.

[Footnote]

*1* A canonical example used to be a commit walker that fetches from
the tip of a branch that is ahead of us by 5 commits, gets
killed after fetching and storing the tip commit object and some
of its trees and blobs, and before successfully fetching and
storing all the necessary objects, e.g. the parent commits and
its trees and blobs.  That would leave a disconnected island of
objects that are not anchored by any ref.

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCHv14 5/7] git submodule update: have a dedicated helper for cloning

2016-02-19 Thread Stefan Beller

On Fri, Feb 19, 2016 at 3:07 PM, Jonathan Nieder  wrote:
>
> I still have trouble reading this patch (patch 5).  Some musing below
> to figure out what could change to make it more readable (perhaps in a
> patch on top).

After this review (I added replies from bottom to top, the same way I
write my code ;)
I think we need another reroll at least.

>> +#define SUBMODULE_UPDATE_CLONE_INIT {0, 0, 0, NULL, NULL, NULL, NULL, 
>> MODULE_LIST_INIT, STRING_LIST_INIT_DUP, SUBMODULE_UPDATE_STRATEGY_INIT}
>
> I think these struct fields need some comments.  It's not clear what
> most of them represent.

ok.

>
> [...]
>> +static int update_clone_inspect_next_task(struct child_process *cp,
>
> What is being inspected here?  What does the return value represent?
>
> Would a name like 'prepare_to_clone_next_submodule' make sense?  A
> comment could say that 'ce' points to the candidate submodule to clone,
> that the return value represents whether we want to clone it, and
> that the first parameter is an output parameter representing a command
> to run to carry out the clone.
>

ok.

>
> It's common to call submodule_from_path with null_sha1 as a parameter
> but I have trouble continuing to remember what that means.  Maybe
> there should be a separate function that handles that?  As a
> side-effect, the name and docstring of that function could explain
> what it means, which I still am not sure about. :)

Good idea!

> if (pp->update.type == SM_UPDATE_NONE
> || (pp->update.type == SM_UPDATE_UNSPECIFIED
> && sub->update_strategy.type == SM_UPDATE_NONE)) {
>
> What does pp stand for?

parallel processes similar to cp standing for child process.

>
> Is the --update parameter ever set to 'none'?  What does it mean
> when someone sets it to none?

In the initialization. This is to preserve precedence.
Yeah there is a test for --update=none.

>
>> + strbuf_addf(err, "Skipping submodule '%s'\n",
>> + displaypath);
>
> Does the caller expect a newline at the end of err?
>
> In the refs code that uses an err strbuf, the convention is to
> not end the message with a newline.  That way, a function like
> die() can insert a newline and messages are guaranteed to be
> newline-terminated even if someone is sloppy and does the wrong thing
> when generating an error.

Oh! So we need to add new lines ourselves here.

>> + if (pp->reference)
>> + argv_array_push(>args, pp->reference);
>> + if (pp->depth)
>> + argv_array_push(>args, pp->depth);
>
> What does 'cp' stand for mean?  Would a name like 'child' work?

child process ? (any code which had a struct child_process referred to
it as *cp)


>
> The 'count' variable is more of a cursor than a count.  Would a name +
> comment like
>
> /* index into 'list' representing the next submodule to consider 
> cloning */
> int current;
>
> work?
>

I think that's better.

>
> [...]
>> +static int update_clone(int argc, const char **argv, const char *prefix)
>> +{
>> + const char *update = NULL;
>> + struct string_list_item *item;
>> + struct submodule_update_clone pp = SUBMODULE_UPDATE_CLONE_INIT;
>> +
>> + struct option module_list_options[] = {
>
> The name looks stale.
>
>> + OPT_STRING(0, "prefix", ,
>> +N_("path"),
>> +N_("path into the working tree")),
>> + OPT_STRING(0, "recursive_prefix", _prefix,
>> +N_("path"),
>> +N_("path into the working tree, across nested "
>> +   "submodule boundaries")),
>
> What do these options represent?  I'm used to the 'prefix' parameter to
> a command coming from git machinery that remembers what the path to the
> original cwd was relative to the worktree or repository root.  Here
> there's an option to set it --- is that intentional?  Would setting the
> environment variable GIT_PREFIX (that git already knows how to respect)
> work in its place?
>
> What is recursive_prefix relative to?
>
> Nit: it would be more idiomatic to use a dash in place of an underscore
> in the second one's name.  But this is an internal interface so it
> doesn't matter much.
>
>> + OPT_STRING(0, "update", ,
>> +N_("string"),
>> +N_("update command for submodules")),
>
> This one is confusing to me because while the script supports --rebase /
> --merge / --checkout this option seems to be more general.
>
> If the help string said '(rebase, merge, or checkout)' then I wouldn't
> mind.
>
>> + OPT_STRING(0, "reference", , "",
>> +N_("Use the local reference repository "
>> +   "instead of a full clone")),
>
> Is this allowed to be relative?  If so, what is it relative to?
>
> [...]
>> + OPT__QUIET(, N_("do't print cloning

Re: [PATCH] submodule: Fetch the direct sha1 first

2016-02-19 Thread Stefan Beller

On Fri, Feb 19, 2016 at 2:29 PM, Junio C Hamano  wrote:
> Stefan Beller  writes:
>
>> Doing a 'git fetch' only and not the fetch for the specific sha1 would be
>> incorrect?
>
> I thought that was what you are attempting to address.

Yep. In an ideal world I would imagine it would look like

if $sha1 doesn't exist:
fetch $sha1
if server did not support fetching direct sha1:
fallback to fetch 

and I thought this would be small enough to be expressed
using the || and &&.

It would be slightly more complicated for first doing the fetch
and then refetching if the sha1 is still missing as the condition
is more complicated to express.

>
>> ('git fetch' with no args finishes successfully, so no fallback is
>> triggered. But we are not sure if we obtained the sha1, so we need to
>> check if we have the sha1 by doing a local check and then try to get the sha1
>> again if we don't have it locally.
>
> Yes, that is what I meant in the "In the opposite fallback order"
> suggestion.
   (clear_local_git_env; cd "$sm_path" &&
 + remote_name=$(get_default_remote)
   ( (rev=$(git rev-list -n 1 $sha1 
 --not --all 2>/dev/null) &&
 -  test -z "$rev") || git-fetch)) ||
 +  test -z "$rev") || git-fetch 
 $remote_name $rev
>>>
>>> Regardless of the "fallback order" issue, I do not think $rev is a
>>> correct thing to fetch here.  The superproject binds $sha1 to its
>>> tree, and you would be checking that out, so shouldn't you be
>>> fetching that commit?
>>
>> Both $sha1 and $rev are in the submodule (because
>> 'git submodule--helper list' puts out the sha1 as the
>> submodule sha1). $rev is either empty or equal to $sha1
>> in my understanding of "rev-list $sha1 --not --all".
>
> Not quite.  The rev-list command expects [*1*] one of three outcomes
> in the original construct:
>
>  * The repository does not know anything about $sha1; the command
>fails, rev is left empty, but thanks to &&, git-fetch runs.

ok. "git cat-file -e" would be able to replace this case.

>
>  * The repository has $sha1 but the history behind it is not
>complete.  While digging from $sha1 following the parent chain,
>it would hit a missing object and fails, rev may or may not be
>empty, but thanks to &&, git-fetch runs.

which I read as a broken shallow clone or a half-way gc'ed repository.
(git fetch repairs that? ok.)

An intact shallow clone which has enough history to contain sha1 should
not be deepened in my understanding of "submodule update".

Rereading the man page for  "git cat-file -e", the output of
"cat-file -e" is the same as in the first case, and we want to also fetch.

>
>  * The repository has $sha1 and its history is all connected.  The
>command succeeds.  If $sha1 is not connected to any of the refs,
>however, that commit may be shown and stored in $rev.  In this
>case, "$rev" happens to be the same as "$sha1".

So it would be possible to checkout $sha1 as a detached HEAD,
but it's not strongly protected against gc. (I assume gc will not
touch objects reachable from HEAD, but not referenced by any ref,
but HEAD can change in a heart beat, so it is not as strong of a protection
as having a branch include that $sha1).

>
> As this "fetch" is run in order to make sure that the history behind
> $sha1 is complete in the submodule repository, so that detaching the
> HEAD at that commit will give the user a useful repository and its
> working tree, the check the code is doing in the original is already
> flawed.  If $sha1 and its ancestry is complete in the repository,
> rev-list would succeed, and if $sha1 is ahead of any of the refs,
> the original code still runs "git fetch", which is not necessary for
> the purpose of detaching the head at $sha1.  On the other hand, by
> using "-n 1", it can cause rev-list stop before discovering a gap in
> history behind $sha1, allowing "git fetch" to be skipped when it
> should be run to fill the gap in the history.
>
> To be complete, the rev-list command line should also run with
> "--objects"; after all, a commit walker fetch may have downloaded
> commit chain completely but haven't fetched necessary trees and
> blobs when it was killed, and "rev-list $sha1 --not --all" would not
> catch such a breakage without "--objects".

So 'cat-file -e' doesn't sound like it would back-test the history at all,
so it doesn't sound like a sufficient replacement.

>
>> Oh! Looking at that I suspect the
>> "test -z $(git rev-list -n 1 $sha1 --not --all 2>/dev/null)"
>> and "git cat-file -e" are serving the same purpose here and should just
>> indicate if the given sha1 is present or not.
>
> That is the simplest explanation why the original "rev-list"
> invocation is already wrong.  It should do an equivalent of
>

Re: [PATCH v5 25/27] refs: add LMDB refs storage backend

2016-02-19 Thread Junio C Hamano

David Turner  writes:

> Something like the following?
>
> commit aad6b84fd1869f6e1cf6ed15bcece0c2f6429e9d
> Author: David Turner 
> Date:   Thu Feb 18 17:09:29 2016 -0500
>
> refs: break out some functions from resolve_ref_1
> 
> A bunch of resolve_ref_1 is not backend-specific, so we can
> break it out into separate internal functions that other
> backends can use.
> 
> Signed-off-by: David Turner 
> 
> diff --git a/refs.c b/refs.c
> index c9fa34d..680c2a5 100644
> --- a/refs.c
> +++ b/refs.c
> @@ -1221,6 +1221,66 @@ int for_each_rawref(each_ref_fn fn, void
> *cb_data)
>  DO_FOR_EACH_INCLUDE_BROKEN, cb_data);
>  }
>  
> +int parse_simple_ref(const char *buf, unsigned char *sha1, unsigned
> int *flags, int bad_name)
> +{
> + /*
> +  * Please note that FETCH_HEAD has a second
> +  * line containing other data.
> +  */

This comment is not quite correct; the reason why the latter half of
this condition is more convoluted than just !buf[40] is not because
FETCH_HEAD has a second line, but it has an additional info at the
tail on the first line.

Also the caller is expected to have already checked for "ref:"
prefix to decide not to call this.

> + if (get_sha1_hex(buf, sha1) ||
> + (buf[40] != '\0' && !isspace(buf[40]))) {

> +/*
> + * Parse a refname out of the contents of a symref into a provided
> + * strbuf.  Return a pointer to the strbuf's contents.
> + */
> +char *parse_symref_data(const char *buf, struct strbuf *sb_refname)
> +{
> + buf += 4;
> + while (isspace(*buf))
> + buf++;

This is expecting to be called by somebody who read "ref:..."  into
buf and the caller must decide by checking the "ref:" prefix before
calling into this function.

> ...  I'm not sure I like it, because it breaks out these weird
> tiny functions that take a lot of arguments.  But maybe it's worth
> it?  What do you think?

I wasn't Cc'ed but if you ask me, I do not think I can say I like
it, either.  Somehow it smells that the responsibility of inspecting
data and doing different things based on normal vs symbolic ref is
split between the caller and the callees at a wrong level.  The
caller also is responsible to rtrim the line before calling the
latter function if I am reading the code correctly, which smells
inconsistent given that an equivalent ltrim() is done by the "skip
the leading spaces" loop inside.

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCHv14 5/7] git submodule update: have a dedicated helper for cloning

2016-02-19 Thread Jonathan Nieder

Hi,

Stefan Beller wrote:

> This introduces a new helper function in git submodule--helper
> which takes care of cloning all submodules, which we want to
> parallelize eventually.

Patches 1-4 are still
Reviewed-by: Jonathan Nieder 

I still have trouble reading this patch (patch 5).  Some musing below
to figure out what could change to make it more readable (perhaps in a
patch on top).

[...]
> --- a/builtin/submodule--helper.c
> +++ b/builtin/submodule--helper.c
> @@ -255,6 +255,234 @@ static int module_clone(int argc, const char **argv, 
> const char *prefix)
>   return 0;
>  }
>  
> +struct submodule_update_clone {
> + /* states */
> + int count;
> + int print_unmatched;
> + /* configuration */
> + int quiet;
> + const char *reference;
> + const char *depth;
> + const char *recursive_prefix;
> + const char *prefix;
> + struct module_list list;
> + struct string_list projectlines;
> + struct submodule_update_strategy update;
> + struct pathspec pathspec;
> +};
> +#define SUBMODULE_UPDATE_CLONE_INIT {0, 0, 0, NULL, NULL, NULL, NULL, 
> MODULE_LIST_INIT, STRING_LIST_INIT_DUP, SUBMODULE_UPDATE_STRATEGY_INIT}

I think these struct fields need some comments.  It's not clear what
most of them represent.

[...]
> +static int update_clone_inspect_next_task(struct child_process *cp,

What is being inspected here?  What does the return value represent?

Would a name like 'prepare_to_clone_next_submodule' make sense?  A
comment could say that 'ce' points to the candidate submodule to clone,
that the return value represents whether we want to clone it, and
that the first parameter is an output parameter representing a command
to run to carry out the clone.

[...]
> + if (ce_stage(ce)) {
> + if (pp->recursive_prefix)
> + strbuf_addf(err, "Skipping unmerged submodule %s/%s\n",
> + pp->recursive_prefix, ce->name);
> + else
> + strbuf_addf(err, "Skipping unmerged submodule %s\n",
> + ce->name);

Nit: this would be easier to scan with braces.

[...]
> + sub = submodule_from_path(null_sha1, ce->name);

It's common to call submodule_from_path with null_sha1 as a parameter
but I have trouble continuing to remember what that means.  Maybe
there should be a separate function that handles that?  As a
side-effect, the name and docstring of that function could explain
what it means, which I still am not sure about. :)

> + if (pp->recursive_prefix)
> + displaypath = relative_path(pp->recursive_prefix,
> + ce->name, _sb);

Nit: could use braces.

> + else
> + displaypath = ce->name;
> +
> + if (pp->update.type == SM_UPDATE_NONE ||
> + (pp->update.type == SM_UPDATE_UNSPECIFIED &&
> +  sub->update_strategy.type == SM_UPDATE_NONE)) {

Nit: this might be more readable with the operators starting each
line:

if (pp->update.type == SM_UPDATE_NONE
|| (pp->update.type == SM_UPDATE_UNSPECIFIED
&& sub->update_strategy.type == SM_UPDATE_NONE)) {

What does pp stand for?

Is the --update parameter ever set to 'none'?  What does it mean
when someone sets it to none?

> + strbuf_addf(err, "Skipping submodule '%s'\n",
> + displaypath);

Does the caller expect a newline at the end of err?

In the refs code that uses an err strbuf, the convention is to
not end the message with a newline.  That way, a function like
die() can insert a newline and messages are guaranteed to be
newline-terminated even if someone is sloppy and does the wrong thing
when generating an error.

[...]
> + if (needs_cloning) {

Could de-indent:

if (!needs_cloning)
goto cleanup;

> + cp->git_cmd = 1;
> + cp->no_stdin = 1;
> + cp->stdout_to_stderr = 1;
> + cp->err = -1;
> + argv_array_push(>args, "submodule--helper");
> + argv_array_push(>args, "clone");
> + if (pp->quiet)
> + argv_array_push(>args, "--quiet");
> +
> + if (pp->prefix)
> + argv_array_pushl(>args, "--prefix", pp->prefix, 
> NULL);
> +

Odd whitespace.  I think it would be fine to remove the blank lines.

> + argv_array_pushl(>args, "--path", sub->path, NULL);
> + argv_array_pushl(>args, "--name", sub->name, NULL);
> + argv_array_pushl(>args, "--url", url, NULL);
> + if (pp->reference)
> + argv_array_push(>args, pp->reference);
> + if (pp->depth)
> + argv_array_push(>args, pp->depth);

What does 'cp' stand for mean?  Would a name like 'child' work?

[...]
> +static int update_clone_get_next_task(struct child_process *cp,
> +   struct strbuf *err,
>

Re: [PATCH v5 25/27] refs: add LMDB refs storage backend

2016-02-19 Thread David Turner

On Thu, 2016-02-18 at 15:50 +0700, Duy Nguyen wrote:
> Caveat: I did not study how to use lmdb. I just guessed what it does
> based on function names. I don't know much about refs handling either
> (especially since the transaction thing is introduced)
> 
> > diff --git a/Documentation/technical/refs-lmdb-backend.txt
> > b/Documentation/technical/refs-lmdb-backend.txt
> > new file mode 100644
> > index 000..eb81465
> > --- /dev/null
> > +++ b/Documentation/technical/refs-lmdb-backend.txt
> > +Reflog values are in the same format as the original files-based
> > +reflog, including the trailing LF. The date in the reflog value
> > +matches the date in the timestamp field.
> 
> ..except that SHA-1s are stored in raw values instead of hex strings.
> 
> > diff --git a/Documentation/technical/repository-version.txt
> > b/Documentation/technical/repository-version.txt
> > index 00ad379..fca5ecd 100644
> > --- a/Documentation/technical/repository-version.txt
> > +++ b/Documentation/technical/repository-version.txt
> > @@ -86,3 +86,8 @@ for testing format-1 compatibility.
> >  When the config key `extensions.preciousObjects` is set to `true`,
> >  objects in the repository MUST NOT be deleted (e.g., by `git
> > -prune` or
> >  `git repack -d`).
> > +
> > +`refStorage`
> > +
> > +This extension allows the use of alternate ref storage backends. 
> >  The
> > +only defined value is `lmdb`.
> 
> refStorage accepts empty string and `files` as well, should probably
> be worth mentioning.
> 
> > diff --git a/refs/lmdb-backend.c b/refs/lmdb-backend.c
> > +#include "../cache.h"
> > +#include 
> > +#include "../object.h"
> > +#include "../refs.h"
> > +#include "refs-internal.h"
> > +#include "../tag.h"
> > +#include "../lockfile.h"
> 
> I'm quite sure we don't need "../". We have plenty of source files in
> subdirs and many of them (haven't checked all) just go with
> #include "cache.h".
> 
> > +struct lmdb_transaction transaction;
> 
> static?
> 
> > +
> > +static int in_write_transaction(void)
> > +{
> > +   return transaction.txn && !(transaction.flags &
> > MDB_RDONLY);
> > +}
> > +
> > +static void init_env(MDB_env **env, const char *path)
> > +{
> > +   int ret;
> > +   if (*env)
> > +   return;
> > +
> > +   assert(path);
> > +
> > +   ret = mdb_env_create(env);
> > +   if (ret)
> > +   die("BUG: mdb_env_create failed: %s",
> > mdb_strerror(ret));
> > +   ret = mdb_env_set_maxreaders(*env, 1000);
> > +   if (ret)
> > +   die("BUG: mdb_env_set_maxreaders failed: %s",
> > mdb_strerror(ret));
> > +   ret = mdb_env_set_mapsize(*env, (1<<30));
> > +   if (ret)
> > +   die("BUG: mdb_set_mapsize failed: %s",
> > mdb_strerror(ret));
> > +   ret = mdb_env_open(*env, path, 0 , 0664);
> 
> This permission makes me wonder if we need adjust_shared_perm() here
> and some other places.
> 
> > +   if (ret)
> > +   die("BUG: mdb_env_open (%s) failed: %s", path,
> > +   mdb_strerror(ret));
> > +}
> > +
> > +static int lmdb_init_db(int shared, struct strbuf *err)
> > +{
> > +   /*
> > +* To create a db, all we need to do is make a directory
> > for
> > +* it to live in; lmdb will do the rest.
> > +*/
> > +
> > +   if (!db_path)
> > +   db_path =
> > xstrdup(real_path(git_path("refs.lmdb")));
> > +
> > +   if (mkdir(db_path, 0775) && errno != EEXIST) {
> > +   strbuf_addf(err, "%s", strerror(errno));
> 
> maybe strbuf_addstr, unless want to add something more, "mkdir
> failed"?
> 
> > +static int read_per_worktree_ref(const char *submodule, const char
> > *refname,
> > +struct MDB_val *val, int
> > *needs_free)
> 
> From what I read, I suspect these _per_worktree functions will be
> identical for the next backend. Should we just hand over the job for
> files backend? For all entry points that may deal with per-worktree
> refs, e.g. lmdb_resolve_ref_unsafe, can we check ref_type() first
> thing, if it's per-worktree we call
> refs_be_files.resolve_ref_unsafe()
> instead?  It could even be done at frontend level,
> e.g. refs.c:resolve_ref_unsafe().
> 
> Though I may be talking rubbish here because I don't know how whether
> it has anything to do with transactions.
> 
> > +{
> > +   struct strbuf sb = STRBUF_INIT;
> > +   struct strbuf path = STRBUF_INIT;
> > +   struct stat st;
> > +   int ret = -1;
> > +
> > +   submodule_path(, submodule, refname);
> > +
> > +#ifndef NO_SYMLINK_HEAD
> 
> It started with the compiler warns about unused "st" when this macro
> is defined. Which makes me wonder, should we do something like this
> to
> make sure this code compiles unconditionally?
> 
> +#ifndef NO_SYMLINK_HEAD
> +   int no_symlink_head = 0;
> +#else
> +   int no_symlink_head = 1;
> +#endif
> ...
> +   if (!no_symlink_head) {
> ...
> 
> 
> > +int lmdb_transaction_begin_flags(struct strbuf *err, unsigned int
> > flags)
> 
> static?
> 
> > +#define MAXDEPTH 5
> > +
> > +static const char *parse_ref_data(struct

Re: [PATCH] git-p4.py: Make submit working on bare repository

2016-02-19 Thread Junio C Hamano

Amadeusz Żołnowski  writes:

> git-p4 can be successfully used from bare repository (which may act as a
> bridge between Perforce repository and pure Git repositories). After
> submitting changelist to Perforce, git-p4 performs unconditional rebase
> which obviously fails.
>
> Perform rebase only on non-bare repositories, so submit command can be
> successful on bare repository.

It is obvious that an attempt to run rebase would fail in a bare
repository, and skipping it would obviously make it not fail.

I think that part is well understood.

What is unclear is what the ramification of _not_ rebasing after
submitting is.

In other words, why do we have to rebase after submitting when we
are in a non-bare repository?  There must be a reason behind it,
i.e. "If we do not rebase, then the repository would be in a state
where future operations like X and Y do not work correctly because
of Z".

And why does that same reason Z not apply when we submit from a bare
repository?

A possible explanation might be that X and Y are operations that
happen only in a non-bare repository that we do not have to worry
about happening in a bare repository after we finish submitting.

But these X, Y and Z are left unexplained--that is what is unclear
in the original proposed log message, and it is still unclear in the
above update.

Thanks.

> Signed-off-by: Amadeusz Żołnowski 
> ---
>  git-p4.py | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/git-p4.py b/git-p4.py
> index c33dece..e00cd02 100755
> --- a/git-p4.py
> +++ b/git-p4.py
> @@ -2059,8 +2059,9 @@ class P4Submit(Command, P4UserMap):
>  sync.branch = self.branch
>  sync.run([])
>  
> -rebase = P4Rebase()
> -rebase.rebase()
> +if not gitConfigBool("core.bare"):
> +rebase = P4Rebase()
> +rebase.rebase()
>  
>  else:
>  if len(applied) == 0:
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] submodule: Fetch the direct sha1 first

2016-02-19 Thread Junio C Hamano

Stefan Beller  writes:

> Doing a 'git fetch' only and not the fetch for the specific sha1 would be
> incorrect?

I thought that was what you are attempting to address.

> ('git fetch' with no args finishes successfully, so no fallback is
> triggered. But we are not sure if we obtained the sha1, so we need to
> check if we have the sha1 by doing a local check and then try to get the sha1
> again if we don't have it locally.

Yes, that is what I meant in the "In the opposite fallback order"
suggestion.
>>>   (clear_local_git_env; cd "$sm_path" &&
>>> + remote_name=$(get_default_remote)
>>>   ( (rev=$(git rev-list -n 1 $sha1 
>>> --not --all 2>/dev/null) &&
>>> -  test -z "$rev") || git-fetch)) ||
>>> +  test -z "$rev") || git-fetch 
>>> $remote_name $rev
>>
>> Regardless of the "fallback order" issue, I do not think $rev is a
>> correct thing to fetch here.  The superproject binds $sha1 to its
>> tree, and you would be checking that out, so shouldn't you be
>> fetching that commit?
>
> Both $sha1 and $rev are in the submodule (because
> 'git submodule--helper list' puts out the sha1 as the
> submodule sha1). $rev is either empty or equal to $sha1
> in my understanding of "rev-list $sha1 --not --all".

Not quite.  The rev-list command expects [*1*] one of three outcomes
in the original construct:

 * The repository does not know anything about $sha1; the command
   fails, rev is left empty, but thanks to &&, git-fetch runs.

 * The repository has $sha1 but the history behind it is not
   complete.  While digging from $sha1 following the parent chain,
   it would hit a missing object and fails, rev may or may not be
   empty, but thanks to &&, git-fetch runs.

 * The repository has $sha1 and its history is all connected.  The
   command succeeds.  If $sha1 is not connected to any of the refs,
   however, that commit may be shown and stored in $rev.  In this
   case, "$rev" happens to be the same as "$sha1".

As this "fetch" is run in order to make sure that the history behind
$sha1 is complete in the submodule repository, so that detaching the
HEAD at that commit will give the user a useful repository and its
working tree, the check the code is doing in the original is already
flawed.  If $sha1 and its ancestry is complete in the repository,
rev-list would succeed, and if $sha1 is ahead of any of the refs,
the original code still runs "git fetch", which is not necessary for
the purpose of detaching the head at $sha1.  On the other hand, by
using "-n 1", it can cause rev-list stop before discovering a gap in
history behind $sha1, allowing "git fetch" to be skipped when it
should be run to fill the gap in the history.

To be complete, the rev-list command line should also run with
"--objects"; after all, a commit walker fetch may have downloaded
commit chain completely but haven't fetched necessary trees and
blobs when it was killed, and "rev-list $sha1 --not --all" would not
catch such a breakage without "--objects".

> Oh! Looking at that I suspect the
> "test -z $(git rev-list -n 1 $sha1 --not --all 2>/dev/null)"
> and "git cat-file -e" are serving the same purpose here and should just
> indicate if the given sha1 is present or not.

That is the simplest explanation why the original "rev-list"
invocation is already wrong.  It should do an equivalent of
builtin/fetch.c::quickfetch() to ensure that $sha1 is something that
is complete, i.e. could be anchored with a ref if we wanted to,
before deciding to avoid running "git fetch".

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] submodule: Fetch the direct sha1 first

2016-02-19 Thread Stefan Beller

On Fri, Feb 19, 2016 at 1:13 PM, Junio C Hamano  wrote:
> Stefan Beller  writes:
>
>> When reviewing a change in Gerrit, which also updates a submodule,
>> a common review practice is to download and cherry-pick the patch locally
>> to test it. However when testing it locally, the 'git submodule update'
>> may fail fetching the correct submodule sha1 as the corresponding commit
>> in the submodule is not yet part of the project history, but also just a
>> proposed change.
>>
>> To ease this, try fetching by sha1 first and when that fails (in case of
>> servers which do not allow fetching by sha1), fall back to the default
>> behavior we already have.
>>
>> Signed-off-by: Stefan Beller 
>> ---
>>
>> I think it's best to apply this on origin/master, there is no collision
>> with sb/submodule-parallel-update.
>>
>> Also I do not see a good way to test this both in correctness as well
>> as performance degeneration. If the first git fetch fails, the second
>> fetch is executed, so it should behave as before this patch w.r.t. 
>> correctness.
>>
>> Regarding performance, the first fetch should fail quite fast iff the fetch
>> fails and then continue with the normal fetch. In case the first fetch works
>> fine getting the exact sha1, the fetch should be faster than a default fetch
>> as potentially less data needs to be fetched.
>
> "The fetch should be faster" may not be making a good trade-off
> overall--people may have depended on the branches configured to be
> fetched to be fetched after this codepath is exercised, but now if
> the commit bound to the superproject tree happens to be complete,
> even though it is not anchored by any remote tracking ref (hence the
> next GC may clobber it), the fetch of other branches will not
> happen.
>
> My knee-jerk reaction is that the order of fallback is probably the
> other way around.  That is, try "git fetch" as before, check again
> if the commit bound to the superproject tree is now complete, and
> fallback to fetch that commit with an extra "git fetch".

I thought about that and assumed we'd need to have an option for
fetch like "--try-to-get-sha1", which depending on the servers capabilities
would just add that sha1 to the "wants" during fetching negotiation if the
server supports it, otherwise just fetch normally.

Doing a 'git fetch' only and not the fetch for the specific sha1 would be
incorrect? ('git fetch' with no args finishes successfully, so no fallback is
triggered. But we are not sure if we obtained the sha1, so we need to
check if we have the sha1 by doing a local check and then try to get the sha1
again if we don't have it locally. So doing the reverse order would be
more code here for correctness.

>
> Jens, what do you think?
>
>>  git-submodule.sh | 3 ++-
>>  1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/git-submodule.sh b/git-submodule.sh
>> index 9bc5c5f..ee0b985 100755
>> --- a/git-submodule.sh
>> +++ b/git-submodule.sh
>> @@ -746,8 +746,9 @@ Maybe you want to use 'update --init'?")"
>>   # Run fetch only if $sha1 isn't present or it
>>   # is not reachable from a ref.
>>   (clear_local_git_env; cd "$sm_path" &&
>> + remote_name=$(get_default_remote)
>>   ( (rev=$(git rev-list -n 1 $sha1 --not 
>> --all 2>/dev/null) &&
>> -  test -z "$rev") || git-fetch)) ||
>> +  test -z "$rev") || git-fetch 
>> $remote_name $rev
>
> Regardless of the "fallback order" issue, I do not think $rev is a
> correct thing to fetch here.  The superproject binds $sha1 to its
> tree, and you would be checking that out, so shouldn't you be
> fetching that commit?

Both $sha1 and $rev are in the submodule (because
'git submodule--helper list' puts out the sha1 as the
submodule sha1). $rev is either empty or equal to $sha1
in my understanding of "rev-list $sha1 --not --all". However for
readability maybe we want to write:

(clear_local_git_env; cd "$sm_path" &&
test -z $(git rev-list -n 1 $sha1 --not --all 2>/dev/null) ||
git fetch $(get_default_remote) $sha1 ||
git fetch ||
die ...
)

So in case you want to the other order, I'd propose

(clear_local_git_env; cd "$sm_path" &&
test -z $(git rev-list -n 1 $sha1 --not --all 2>/dev/null) ||
git fetch ||
(git cat-file -e $sha1 && git fetch $(get_default_remote) $sha1) ||
die ...
)

Oh! Looking at that I suspect the
"test -z $(git rev-list -n 1 $sha1 --not --all 2>/dev/null)"
and "git cat-file -e" are serving the same purpose here and should just
indicate if the given sha1 is present or not.

So we could reduce it further to

(clear_local_git_env; cd "$sm_path" &&
git cat-file -e $sha1 || git fetch ||
(git cat-file -e $sha1 &&

[PATCH] git-p4.py: Make submit working on bare repository

2016-02-19 Thread Amadeusz Żołnowski

git-p4 can be successfully used from bare repository (which may act as a
bridge between Perforce repository and pure Git repositories). After
submitting changelist to Perforce, git-p4 performs unconditional rebase
which obviously fails.

Perform rebase only on non-bare repositories, so submit command can be
successful on bare repository.

Signed-off-by: Amadeusz Żołnowski 
---
 git-p4.py | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/git-p4.py b/git-p4.py
index c33dece..e00cd02 100755
--- a/git-p4.py
+++ b/git-p4.py
@@ -2059,8 +2059,9 @@ class P4Submit(Command, P4UserMap):
 sync.branch = self.branch
 sync.run([])
 
-rebase = P4Rebase()
-rebase.rebase()
+if not gitConfigBool("core.bare"):
+rebase = P4Rebase()
+rebase.rebase()
 
 else:
 if len(applied) == 0:
-- 
2.7.0

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] exec_cmd.c: use find_last_dir_sep() for code simplification

2016-02-19 Thread Junio C Hamano

Alexander Kuleshov  writes:

> We are trying to extract dirname from argv0 in the git_extract_argv0_path().
> But in the same time, the  provides find_last_dir_sep()
> to get dirname from a given path.  Let's use it instead of loop for the code
> simplification.
>
> Signed-off-by: Alexander Kuleshov 
> ---

Looks correct.  Thanks.

>  exec_cmd.c | 6 ++
>  1 file changed, 2 insertions(+), 4 deletions(-)
>
> diff --git a/exec_cmd.c b/exec_cmd.c
> index e85f0fd..680b257 100644
> --- a/exec_cmd.c
> +++ b/exec_cmd.c
> @@ -43,12 +43,10 @@ const char *git_extract_argv0_path(const char *argv0)
>  
>   if (!argv0 || !*argv0)
>   return NULL;
> - slash = argv0 + strlen(argv0);
>  
> - while (argv0 <= slash && !is_dir_sep(*slash))
> - slash--;
> + slash = find_last_dir_sep(argv0);
>  
> - if (slash >= argv0) {
> + if (slash) {
>   argv0_path = xstrndup(argv0, slash - argv0);
>   return slash + 1;
>   }
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] submodule: Fetch the direct sha1 first

2016-02-19 Thread Junio C Hamano

Stefan Beller  writes:

> When reviewing a change in Gerrit, which also updates a submodule,
> a common review practice is to download and cherry-pick the patch locally
> to test it. However when testing it locally, the 'git submodule update'
> may fail fetching the correct submodule sha1 as the corresponding commit
> in the submodule is not yet part of the project history, but also just a
> proposed change.
>
> To ease this, try fetching by sha1 first and when that fails (in case of
> servers which do not allow fetching by sha1), fall back to the default
> behavior we already have.
>
> Signed-off-by: Stefan Beller 
> ---
>
> I think it's best to apply this on origin/master, there is no collision
> with sb/submodule-parallel-update.
>
> Also I do not see a good way to test this both in correctness as well
> as performance degeneration. If the first git fetch fails, the second
> fetch is executed, so it should behave as before this patch w.r.t. 
> correctness.
>
> Regarding performance, the first fetch should fail quite fast iff the fetch
> fails and then continue with the normal fetch. In case the first fetch works
> fine getting the exact sha1, the fetch should be faster than a default fetch
> as potentially less data needs to be fetched.

"The fetch should be faster" may not be making a good trade-off
overall--people may have depended on the branches configured to be
fetched to be fetched after this codepath is exercised, but now if
the commit bound to the superproject tree happens to be complete,
even though it is not anchored by any remote tracking ref (hence the
next GC may clobber it), the fetch of other branches will not
happen.

My knee-jerk reaction is that the order of fallback is probably the
other way around.  That is, try "git fetch" as before, check again
if the commit bound to the superproject tree is now complete, and
fallback to fetch that commit with an extra "git fetch".

Jens, what do you think?

>  git-submodule.sh | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/git-submodule.sh b/git-submodule.sh
> index 9bc5c5f..ee0b985 100755
> --- a/git-submodule.sh
> +++ b/git-submodule.sh
> @@ -746,8 +746,9 @@ Maybe you want to use 'update --init'?")"
>   # Run fetch only if $sha1 isn't present or it
>   # is not reachable from a ref.
>   (clear_local_git_env; cd "$sm_path" &&
> + remote_name=$(get_default_remote)
>   ( (rev=$(git rev-list -n 1 $sha1 --not 
> --all 2>/dev/null) &&
> -  test -z "$rev") || git-fetch)) ||
> +  test -z "$rev") || git-fetch 
> $remote_name $rev

Regardless of the "fallback order" issue, I do not think $rev is a
correct thing to fetch here.  The superproject binds $sha1 to its
tree, and you would be checking that out, so shouldn't you be
fetching that commit?
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCHv14 0/7] Expose submodule parallelism to the user

2016-02-19 Thread Junio C Hamano

Looks good.  I didn't notice these unnecessary blank lines while
reading the previous rounds, and it is good to see them go.

Let's wait for a few days and merge them to 'next'.  David's ref
backend topic textually depends on this, and we'd better give it a
solid foundation to build on soon.




--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: GSoC 2016: applications open, deadline = Fri, 19/2

2016-02-19 Thread Junio C Hamano

Lars Schneider  writes:

> If this mode is enabled then Git shall print a warning message before 
> running a potentially destructive command. In addition to the warning 
> Git shall print a command that would reverse the operation if possible. 
> Most of the information to reverse an operation is already available 
> via git reflog. However, the task here is to make this information more 
> easily accessible to Git beginners.
>
> --
>
> Does this go into the direction of "making the powerful tool harder to
> misuse"?

Not really, if done without a real thinking.  The approach risks to
make the powerful tool harder to use, not just misuse.

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: GSoC 2016: applications open, deadline = Fri, 19/2

2016-02-19 Thread Junio C Hamano

Matthieu Moy  writes:

>> We have these "powerful" tools for a reason.  After making a mess
>> experimenting with your working tree files, "reset --hard" is the
>> best tool to go back to the known-good state,
>
> I disagree with that. This reminds me a discussion I had with a student
> a few years ago:
>
>   student: how do a clear all changes from my worktree?
>   me: git reset --hard
>
> the next day:
>
>   student: OK, now, how do I get my changes back?
>   me: ...!
>
> There's almost no situation where reset --hard is the best tool.

I obviously have to disagree.  After maknig a mess experimenting,
when you want to discard all that, "reset --hard" is the best
tool--the situation of your student may be quite different but you
didn't make it clear what s/he wanted to salvage.  In any case, I
wasn't asking about "clear all changes for now, to be salvaged
later".

The "experimenting" would include mergy operations like "am -3" and
"cherry-pick".  "After queuing a topic and trying it in isolation,
an attempt to merge to the baseline results in quite a mess, and I
give up"--there is nothing to salvage.

And obviously, "stash" is not useful in such a situation.  You could
use "tar cf ../saved .", though.

> Now, another issue with the proposed core.isbeginner is compatibility
> with scripts. 

Yes.

> Dangerous commands are often plumbing, and a beginner may
> still want to use scripts or other porcelain on top of it. Typically, I
> think this rules out "git reset --hard" which is legitimate in scripts.

I agree that an "under core.isbeginner, the command will always be
refused" change can be written without thinking and it will be
useless for anything that has ledigimate uses (like, but not limited
to, being used in scripts) [*1*].

But not so fast.

If you can figure out when "git reset --hard" is legitimate based
*NOT* only on the fact that it is driven by a script, but on what
kind of modifications to the working tree contents, the index
contents and the refs are about to be made by the command, then
"core.isbeginner" can be a permission for the command to spend extra
cycles to examine the situation carefully to decide to selectively
go ahead, warn and go ahead, or refuse.

That of course takes a real thinking.  

[Footnote]

*1* I'd refuse to take a patch to make scripted Porcelains that make
legit calls to "powerful" tools export GIT_SCRIPT_IS_RUNNING_YOU
environment variable as a workaround for such a kludge.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: git submodule should honor "-c credential.helper" command line argument

2016-02-19 Thread Jacob Keller

On Fri, Feb 19, 2016 at 9:33 AM, Junio C Hamano  wrote:
> Jeff King  writes:
>
>> That being said, I am not sure this is the right solution. In the thread
>> I linked earlier[1], Jens indicated he would prefer not to blindly share
>> config with the submodules, and I think I agree. Or are you proposing to
>> pick and choose the keys in GIT_CONFIG_PARAMETERS, and whitelist
>> credential.*?
>
> Yes, I think it is sensible not to propagate by default, and pass
> ones that are absolutely safe, sane and necessary by whitelisting.

Yes I agree. I'm trying to think of a reasonable way to implement this
as part of a submodule--helper.. something like whitelist-config or
something, which would just output the value of GIT_CONFIG_PARAMETERS
after whitelisting. It would be up to the shell code to then use this
inside a subshell.

Regards,
Jake
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: Test failures with GNU grep 2.23

2016-02-19 Thread Jeff King

On Fri, Feb 19, 2016 at 07:23:11PM +, John Keeping wrote:

> I suspect that any grep that lacks "-a" also lacks binary file handling
> that will break these tests.  I found a Solaris grep that doesn't
> support "-a" and it treats these files as text.
> 
> From that perspective, it would be better to have a central place that
> deals with figuring out how to get grep to work for us.  Perhaps we need
> test_grep to get this right.  We already have test_cmp_bin() as a thin
> wrapper around cmp so I don't think this is completely unprecedented.

I think 99% of the time we are using grep for ascii text. As evidenced
by the number of test failures we see with the new grep, it is a small
minority that feed binary gibberish. I'd prefer if "-a" handling didn't
need to pollute anything outside of this narrow range of tests (and as
with my prereq suggestion, I am even find just skipping this narrow
range of tests on platforms with no "-a", though falling back to running
without "-a" is fine if it works).

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: Test failures with GNU grep 2.23

2016-02-19 Thread John Keeping

On Fri, Feb 19, 2016 at 09:38:17AM -0800, Junio C Hamano wrote:
> Jeff King  writes:
> 
> > Yeah, I'd worry that "-a" is not portable. OTOH, BSD grep seems to have
> > it, so between that and GNU, I think most systems are covered. We could
> > do:
> >
> >   test_lazy_prereq GREP_A '
> > echo foo | grep -a foo
> >   '
> >
> > and mark these tests with it. I'd also be happy to skip that step and
> > just do it if and when somebody actually complains about a system
> > without it (I wouldn't be surprised if most people on antique systems
> > end up installing GNU grep anyway).
> >
> > Another option might be using "sed -ne '/^author/p'" or similar. But
> > that may very well just be trading one portability problem for another.
> 
> Would $PERL help, I wonder?

I suspect that any grep that lacks "-a" also lacks binary file handling
that will break these tests.  I found a Solaris grep that doesn't
support "-a" and it treats these files as text.

>From that perspective, it would be better to have a central place that
deals with figuring out how to get grep to work for us.  Perhaps we need
test_grep to get this right.  We already have test_cmp_bin() as a thin
wrapper around cmp so I don't think this is completely unprecedented.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 13/21] fast-import: simplify allocation in start_packfile

2016-02-19 Thread Jeff King

On Fri, Feb 19, 2016 at 09:48:54AM -0800, Junio C Hamano wrote:

> Jeff King  writes:
> 
> > This function allocate a packed_git flex-array, and adds a
> 
> s/allocate//, right?

Yes, thanks.

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: Test failures with GNU grep 2.23

2016-02-19 Thread Jeff King

On Fri, Feb 19, 2016 at 09:38:17AM -0800, Junio C Hamano wrote:

> Jeff King  writes:
> 
> > Yeah, I'd worry that "-a" is not portable. OTOH, BSD grep seems to have
> > it, so between that and GNU, I think most systems are covered. We could
> > do:
> >
> >   test_lazy_prereq GREP_A '
> > echo foo | grep -a foo
> >   '
> >
> > and mark these tests with it. I'd also be happy to skip that step and
> > just do it if and when somebody actually complains about a system
> > without it (I wouldn't be surprised if most people on antique systems
> > end up installing GNU grep anyway).
> >
> > Another option might be using "sed -ne '/^author/p'" or similar. But
> > that may very well just be trading one portability problem for another.
> 
> Would $PERL help, I wonder?

It would, though I think you would need to call `binmode` to make it
reliable. I was hesitant to suggest it, because I seem to recall some
resistance to more perl dependencies in the test suite, but I think we
may be past the point of no return there, anyway.

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCHv13 5/7] git submodule update: have a dedicated helper for cloning

2016-02-19 Thread Stefan Beller

On Fri, Feb 19, 2016 at 11:08 AM, Jeff King  wrote:
> On Fri, Feb 19, 2016 at 12:20:48PM -0500, Eric Sunshine wrote:
>
>> >>> + argv_array_pushl(>args, "--path", sub->path, NULL);
>> >>> + argv_array_pushl(>args, "--name", sub->name, NULL);
>> >>> + argv_array_pushl(>args, "--url", strdup(url), NULL);
>> >>
>> >> No need to strdup() here; argv_array handles its own memory, so this
>> >> just leaks (and if we were keeping it, it should be xstrdup(), of
>> >> course).
>> >
>> > We cannot remove the strdup as the url is a local variable we read in from
>> > git_config_get_string and the local variable is going out of scope before 
>> > the
>> > child process ends?
>> >
>> > I'll change it to xstrdup then.
>>
>> When Peff said "argv_array handles its own memory", he meant that it
>> does xstrdup() itself, so there's no need for you to do so a second
>> time manually (leaking a string as a consequence).
>
> Exactly. :) Sorry for not being clear the first time around.

Sorry for being confused to easily. I should have just read the code of
argv_array_pushl before answering. In the resend I removed the strdup.

Thanks,
Stefan

>
> -Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v5 25/27] refs: add LMDB refs storage backend

2016-02-19 Thread David Turner

On Fri, 2016-02-19 at 09:54 +0700, Duy Nguyen wrote:
> On Fri, Feb 19, 2016 at 3:23 AM, David Turner <
> dtur...@twopensource.com> wrote:
> > > > +static int read_per_worktree_ref(const char *submodule, const
> > > > char
> > > > *refname,
> > > > +struct MDB_val *val, int
> > > > *needs_free)
> > > 
> > > From what I read, I suspect these _per_worktree functions will be
> > > identical for the next backend. Should we just hand over the job
> > > for
> > > files backend? For all entry points that may deal with per
> > > -worktree
> > > refs, e.g. lmdb_resolve_ref_unsafe, can we check ref_type() first
> > > thing, if it's per-worktree we call
> > > refs_be_files.resolve_ref_unsafe()
> > > instead?  It could even be done at frontend level,
> > > e.g. refs.c:resolve_ref_unsafe().
> > > 
> > > Though I may be talking rubbish here because I don't know how
> > > whether
> > > it has anything to do with transactions.
> > 
> > The reason I did it this way is that some ref chains cross backend
> > boundaries (e.g. HEAD -> refs/heads/master).  But if we have other
> > backends later, we could generalize.
> 
> Crossing backends should go through frontend again, imo. But I don't
> really know if it's efficient.

It's pretty tricky to maintain state (e.g. count of symref redirects)
across that barrier.  So I'm not sure how to do it cleanly.

> > > > +static int lmdb_create_symref(const char *ref_target,
> > > > + const char *refs_heads_master,
> > > > + const char *logmsg)
> > > > +{
> > > > +
> > > ...
> > > > +   mdb_put_or_die(, , , 0);
> > > > +
> > > > +   /* TODO: Don't create ref d/f conflicts */
> > > 
> > > I'm not sure I get this comment. D/F conflicts are no longer a
> > > thing
> > > for lmdb backend, right?
> > 
> > I'm trying to avoid the lmdb backend creating a set of refs that
> > the
> > files backend can't handle.  This would make collaboration with
> > other
> > versions of git more difficult.
> 
> It already is. If you create refs "foo" and "FOO" on case sensitive
> file system and clone it on a case-insensitive one, you face the same
> problem. We may have an optional configuration knob to prevent
> incompatibilities with files backend, but I think that should be done
> (and enforced if necessary) outside backends.

Sure, the current state isn't perfect, but why make it worse? 
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCHv13 5/7] git submodule update: have a dedicated helper for cloning

2016-02-19 Thread Jeff King

On Fri, Feb 19, 2016 at 12:20:48PM -0500, Eric Sunshine wrote:

> >>> + argv_array_pushl(>args, "--path", sub->path, NULL);
> >>> + argv_array_pushl(>args, "--name", sub->name, NULL);
> >>> + argv_array_pushl(>args, "--url", strdup(url), NULL);
> >>
> >> No need to strdup() here; argv_array handles its own memory, so this
> >> just leaks (and if we were keeping it, it should be xstrdup(), of
> >> course).
> >
> > We cannot remove the strdup as the url is a local variable we read in from
> > git_config_get_string and the local variable is going out of scope before 
> > the
> > child process ends?
> >
> > I'll change it to xstrdup then.
> 
> When Peff said "argv_array handles its own memory", he meant that it
> does xstrdup() itself, so there's no need for you to do so a second
> time manually (leaking a string as a consequence).

Exactly. :) Sorry for not being clear the first time around.

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH] submodule: Fetch the direct sha1 first

2016-02-19 Thread Stefan Beller

When reviewing a change in Gerrit, which also updates a submodule,
a common review practice is to download and cherry-pick the patch locally
to test it. However when testing it locally, the 'git submodule update'
may fail fetching the correct submodule sha1 as the corresponding commit
in the submodule is not yet part of the project history, but also just a
proposed change.

To ease this, try fetching by sha1 first and when that fails (in case of
servers which do not allow fetching by sha1), fall back to the default
behavior we already have.

Signed-off-by: Stefan Beller 
---

I think it's best to apply this on origin/master, there is no collision
with sb/submodule-parallel-update.

Also I do not see a good way to test this both in correctness as well
as performance degeneration. If the first git fetch fails, the second
fetch is executed, so it should behave as before this patch w.r.t. correctness.

Regarding performance, the first fetch should fail quite fast iff the fetch
fails and then continue with the normal fetch. In case the first fetch works
fine getting the exact sha1, the fetch should be faster than a default fetch
as potentially less data needs to be fetched.

Thanks,
Stefan

 git-submodule.sh | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/git-submodule.sh b/git-submodule.sh
index 9bc5c5f..ee0b985 100755
--- a/git-submodule.sh
+++ b/git-submodule.sh
@@ -746,8 +746,9 @@ Maybe you want to use 'update --init'?")"
# Run fetch only if $sha1 isn't present or it
# is not reachable from a ref.
(clear_local_git_env; cd "$sm_path" &&
+   remote_name=$(get_default_remote)
( (rev=$(git rev-list -n 1 $sha1 --not 
--all 2>/dev/null) &&
-test -z "$rev") || git-fetch)) ||
+test -z "$rev") || git-fetch 
$remote_name $rev || git-fetch)) ||
die "$(eval_gettext "Unable to fetch in 
submodule path '\$displaypath'")"
fi
 
-- 
2.7.0.rc0.34.ga06e0b3.dirty

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] git-p4.py: Don't try to rebase on submit from bare repository

2016-02-19 Thread Eric Sunshine

On Fri, Feb 19, 2016 at 1:27 PM, Amadeusz Żołnowski
 wrote:
> git-p4 can be successfully used from bare repository (which acts as a
> bridge between Perforce repository and pure Git repositories). On submit
> git-p4 performs unconditional rebase. Do rebase only on non-bare
> repositories.

As a person who does not use Perforce, it is not obvious to me from
the commit message why this change is beneficial or even what the
consequences are. Will Perforce users understand this change given
only the explanation above? If not, perhaps it would be helpful to
expand the commit message to explain more thoroughly the impact of
this change and why it is a good idea.

Thanks.

> Signed-off-by: Amadeusz Żołnowski 
> ---
>  git-p4.py | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/git-p4.py b/git-p4.py
> index c33dece..e00cd02 100755
> --- a/git-p4.py
> +++ b/git-p4.py
> @@ -2059,8 +2059,9 @@ class P4Submit(Command, P4UserMap):
>  sync.branch = self.branch
>  sync.run([])
>
> -rebase = P4Rebase()
> -rebase.rebase()
> +if not gitConfigBool("core.bare"):
> +rebase = P4Rebase()
> +rebase.rebase()
>
>  else:
>  if len(applied) == 0:
> --
> 2.7.0
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: GSoC 2016: Microproject

2016-02-19 Thread Mehul Jain

On Fri, Feb 19, 2016 at 11:20 PM, Stefan Beller  wrote:
> Yes, most likely t/t5521-pull-options.sh or t/t5520-pull.sh would be the right
> place as judging from the file name of the tests.

Thanks for specifying the files. I think t/t5520-pull.sh line 258 will
be the best place to write test for this case.

Mehul Jain
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH] git-p4.py: Don't try to rebase on submit from bare repository

2016-02-19 Thread Amadeusz Żołnowski

git-p4 can be successfully used from bare repository (which acts as a
bridge between Perforce repository and pure Git repositories). On submit
git-p4 performs unconditional rebase. Do rebase only on non-bare
repositories.

Signed-off-by: Amadeusz Żołnowski 
---
 git-p4.py | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/git-p4.py b/git-p4.py
index c33dece..e00cd02 100755
--- a/git-p4.py
+++ b/git-p4.py
@@ -2059,8 +2059,9 @@ class P4Submit(Command, P4UserMap):
 sync.branch = self.branch
 sync.run([])
 
-rebase = P4Rebase()
-rebase.rebase()
+if not gitConfigBool("core.bare"):
+rebase = P4Rebase()
+rebase.rebase()
 
 else:
 if len(applied) == 0:
-- 
2.7.0

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v6 0/4] config: add '--sources' option to print the source of a config value

2016-02-19 Thread Junio C Hamano

larsxschnei...@gmail.com writes:

> From: Lars Schneider 
>
> diff to v5:
> * rename 'type' to 'origin_type' as 'type' is too broad a word in the context
>   of configuration file (Thanks to the reviewers Junio and Dscho)
> * add dedicated patch to rename git_config_from_buf to git_config_from_mem
>   (this change was part of the series since v4 as suggested by Junio but
>   hidden in the "config: add 'origin_type'" patch)
>
> Thanks,
> Lars

Overall this round looks good.

I personally prefer 'stdin' to be spelled out as '(the) standard
input' in end-user facing messages, e.g. the expected message in
this piece

test_expect_success 'invalid stdin config' '
echo "fatal: bad config line 1 in stdin " >expect &&

to say "line 1 in the standard input", but that is a quite minor
point and just a preference.

Thanks.

Let's queue this version and move it forward to 'next' soonish.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCHv14 4/7] submodule update: direct error message to stderr

2016-02-19 Thread Stefan Beller

Reroute the error message for specified but initialized submodules
to stderr instead of stdout.

Signed-off-by: Stefan Beller 
Signed-off-by: Junio C Hamano 
---
 git-submodule.sh   | 4 ++--
 t/t7400-submodule-basic.sh | 4 ++--
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/git-submodule.sh b/git-submodule.sh
index 9bc5c5f..9ee86d4 100755
--- a/git-submodule.sh
+++ b/git-submodule.sh
@@ -693,7 +693,7 @@ cmd_update()
 
if test "$update_module" = "none"
then
-   echo "Skipping submodule '$displaypath'"
+   echo >&2 "Skipping submodule '$displaypath'"
continue
fi
 
@@ -702,7 +702,7 @@ cmd_update()
# Only mention uninitialized submodules when its
# path have been specified
test "$#" != "0" &&
-   say "$(eval_gettext "Submodule path '\$displaypath' not 
initialized
+   say >&2 "$(eval_gettext "Submodule path '\$displaypath' 
not initialized
 Maybe you want to use 'update --init'?")"
continue
fi
diff --git a/t/t7400-submodule-basic.sh b/t/t7400-submodule-basic.sh
index 540771c..5991e3c 100755
--- a/t/t7400-submodule-basic.sh
+++ b/t/t7400-submodule-basic.sh
@@ -462,7 +462,7 @@ test_expect_success 'update --init' '
git config --remove-section submodule.example &&
test_must_fail git config submodule.example.url &&
 
-   git submodule update init > update.out &&
+   git submodule update init 2> update.out &&
cat update.out &&
test_i18ngrep "not initialized" update.out &&
test_must_fail git rev-parse --resolve-git-dir init/.git &&
@@ -480,7 +480,7 @@ test_expect_success 'update --init from subdirectory' '
mkdir -p sub &&
(
cd sub &&
-   git submodule update ../init >update.out &&
+   git submodule update ../init 2>update.out &&
cat update.out &&
test_i18ngrep "not initialized" update.out &&
test_must_fail git rev-parse --resolve-git-dir ../init/.git &&
-- 
2.7.0.rc0.34.ga06e0b3.dirty

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCHv14 7/7] clone: allow an explicit argument for parallel submodule clones

2016-02-19 Thread Stefan Beller

Just pass it along to "git submodule update", which may pick reasonable
defaults if you don't specify an explicit number.

Signed-off-by: Stefan Beller 
Signed-off-by: Junio C Hamano 
---
 Documentation/git-clone.txt |  6 +-
 builtin/clone.c | 19 +--
 t/t7406-submodule-update.sh | 15 +++
 3 files changed, 33 insertions(+), 7 deletions(-)

diff --git a/Documentation/git-clone.txt b/Documentation/git-clone.txt
index 6bf000d..6db7b6d 100644
--- a/Documentation/git-clone.txt
+++ b/Documentation/git-clone.txt
@@ -14,7 +14,7 @@ SYNOPSIS
  [-o ] [-b ] [-u ] [--reference ]
  [--dissociate] [--separate-git-dir ]
  [--depth ] [--[no-]single-branch]
- [--recursive | --recurse-submodules] [--] 
+ [--recursive | --recurse-submodules] [--jobs ] [--] 
  []
 
 DESCRIPTION
@@ -221,6 +221,10 @@ objects from the source repository into a pack in the 
cloned repository.
The result is Git repository can be separated from working
tree.
 
+-j ::
+--jobs ::
+   The number of submodules fetched at the same time.
+   Defaults to the `submodule.fetchJobs` option.
 
 ::
The (possibly remote) repository to clone from.  See the
diff --git a/builtin/clone.c b/builtin/clone.c
index a0b3cd9..b004fb4 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -50,6 +50,7 @@ static int option_progress = -1;
 static struct string_list option_config;
 static struct string_list option_reference;
 static int option_dissociate;
+static int max_jobs = -1;
 
 static struct option builtin_clone_options[] = {
OPT__VERBOSITY(_verbosity),
@@ -72,6 +73,8 @@ static struct option builtin_clone_options[] = {
N_("initialize submodules in the clone")),
OPT_BOOL(0, "recurse-submodules", _recursive,
N_("initialize submodules in the clone")),
+   OPT_INTEGER('j', "jobs", _jobs,
+   N_("number of submodules cloned in parallel")),
OPT_STRING(0, "template", _template, N_("template-directory"),
   N_("directory from which templates will be used")),
OPT_STRING_LIST(0, "reference", _reference, N_("repo"),
@@ -95,10 +98,6 @@ static struct option builtin_clone_options[] = {
OPT_END()
 };
 
-static const char *argv_submodule[] = {
-   "submodule", "update", "--init", "--recursive", NULL
-};
-
 static const char *get_repo_path_1(struct strbuf *path, int *is_bundle)
 {
static char *suffix[] = { "/.git", "", ".git/.git", ".git" };
@@ -724,8 +723,16 @@ static int checkout(void)
err |= run_hook_le(NULL, "post-checkout", sha1_to_hex(null_sha1),
   sha1_to_hex(sha1), "1", NULL);
 
-   if (!err && option_recursive)
-   err = run_command_v_opt(argv_submodule, RUN_GIT_CMD);
+   if (!err && option_recursive) {
+   struct argv_array args = ARGV_ARRAY_INIT;
+   argv_array_pushl(, "submodule", "update", "--init", 
"--recursive", NULL);
+
+   if (max_jobs != -1)
+   argv_array_pushf(, "--jobs=%d", max_jobs);
+
+   err = run_command_v_opt(args.argv, RUN_GIT_CMD);
+   argv_array_clear();
+   }
 
return err;
 }
diff --git a/t/t7406-submodule-update.sh b/t/t7406-submodule-update.sh
index 7fd5142..090891e 100755
--- a/t/t7406-submodule-update.sh
+++ b/t/t7406-submodule-update.sh
@@ -786,4 +786,19 @@ test_expect_success 'submodule update can be run in 
parallel' '
 grep "9 tasks" trace.out
)
 '
+
+test_expect_success 'git clone passes the parallel jobs config on to 
submodules' '
+   test_when_finished "rm -rf super4" &&
+   GIT_TRACE=$(pwd)/trace.out git clone --recurse-submodules --jobs 7 . 
super4 &&
+   grep "7 tasks" trace.out &&
+   rm -rf super4 &&
+   git config --global submodule.fetchJobs 8 &&
+   GIT_TRACE=$(pwd)/trace.out git clone --recurse-submodules . super4 &&
+   grep "8 tasks" trace.out &&
+   rm -rf super4 &&
+   GIT_TRACE=$(pwd)/trace.out git clone --recurse-submodules --jobs 9 . 
super4 &&
+   grep "9 tasks" trace.out &&
+   rm -rf super4
+'
+
 test_done
-- 
2.7.0.rc0.34.ga06e0b3.dirty

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCHv14 3/7] fetching submodules: respect `submodule.fetchJobs` config option

2016-02-19 Thread Stefan Beller

This allows to configure fetching and updating in parallel
without having the command line option.

This moved the responsibility to determine how many parallel processes
to start from builtin/fetch to submodule.c as we need a way to communicate
"The user did not specify the number of parallel processes in the command
line options" in the builtin fetch. The submodule code takes care of
the precedence (CLI > config > default).

Signed-off-by: Stefan Beller 
Signed-off-by: Junio C Hamano 
---
 Documentation/config.txt|  6 ++
 builtin/fetch.c |  2 +-
 submodule.c | 16 +++-
 submodule.h |  2 ++
 t/t5526-fetch-submodules.sh | 14 ++
 5 files changed, 38 insertions(+), 2 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 2d06b11..3b02732 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -2646,6 +2646,12 @@ submodule..ignore::
"--ignore-submodules" option. The 'git submodule' commands are not
affected by this setting.
 
+submodule.fetchJobs::
+   Specifies how many submodules are fetched/cloned at the same time.
+   A positive integer allows up to that number of submodules fetched
+   in parallel. A value of 0 will give some reasonable default.
+   If unset, it defaults to 1.
+
 tag.sort::
This variable controls the sort ordering of tags when displayed by
linkgit:git-tag[1]. Without the "--sort=" option provided, the
diff --git a/builtin/fetch.c b/builtin/fetch.c
index 586840d..5aa1c2d 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -37,7 +37,7 @@ static int prune = -1; /* unspecified */
 static int all, append, dry_run, force, keep, multiple, update_head_ok, 
verbosity;
 static int progress = -1, recurse_submodules = RECURSE_SUBMODULES_DEFAULT;
 static int tags = TAGS_DEFAULT, unshallow, update_shallow;
-static int max_children = 1;
+static int max_children = -1;
 static const char *depth;
 static const char *upload_pack;
 static struct strbuf default_rla = STRBUF_INIT;
diff --git a/submodule.c b/submodule.c
index b38dd51..051f722 100644
--- a/submodule.c
+++ b/submodule.c
@@ -15,6 +15,7 @@
 #include "thread-utils.h"
 
 static int config_fetch_recurse_submodules = RECURSE_SUBMODULES_ON_DEMAND;
+static int parallel_jobs = 1;
 static struct string_list changed_submodule_paths;
 static int initialized_fetch_ref_tips;
 static struct sha1_array ref_tips_before_fetch;
@@ -169,7 +170,12 @@ void set_diffopt_flags_from_submodule_config(struct 
diff_options *diffopt,
 
 int submodule_config(const char *var, const char *value, void *cb)
 {
-   if (starts_with(var, "submodule."))
+   if (!strcmp(var, "submodule.fetchjobs")) {
+   parallel_jobs = git_config_int(var, value);
+   if (parallel_jobs < 0)
+   die(_("negative values not allowed for 
submodule.fetchJobs"));
+   return 0;
+   } else if (starts_with(var, "submodule."))
return parse_submodule_config_option(var, value);
else if (!strcmp(var, "fetch.recursesubmodules")) {
config_fetch_recurse_submodules = 
parse_fetch_recurse_submodules_arg(var, value);
@@ -772,6 +778,9 @@ int fetch_populated_submodules(const struct argv_array 
*options,
argv_array_push(, "--recurse-submodules-default");
/* default value, "--submodule-prefix" and its value are added later */
 
+   if (max_parallel_jobs < 0)
+   max_parallel_jobs = parallel_jobs;
+
calculate_changed_submodule_paths();
run_processes_parallel(max_parallel_jobs,
   get_next_submodule,
@@ -1118,3 +1127,8 @@ void connect_work_tree_and_git_dir(const char *work_tree, 
const char *git_dir)
strbuf_release(_path);
free((void *)real_work_tree);
 }
+
+int parallel_submodules(void)
+{
+   return parallel_jobs;
+}
diff --git a/submodule.h b/submodule.h
index 3464500..3166608 100644
--- a/submodule.h
+++ b/submodule.h
@@ -26,6 +26,7 @@ struct submodule_update_strategy {
enum submodule_update_type type;
const char *command;
 };
+#define SUBMODULE_UPDATE_STRATEGY_INIT {SM_UPDATE_UNSPECIFIED, NULL}
 
 int is_staging_gitmodules_ok(void);
 int update_path_in_gitmodules(const char *oldpath, const char *newpath);
@@ -57,5 +58,6 @@ int find_unpushed_submodules(unsigned char new_sha1[20], 
const char *remotes_nam
struct string_list *needs_pushing);
 int push_unpushed_submodules(unsigned char new_sha1[20], const char 
*remotes_name);
 void connect_work_tree_and_git_dir(const char *work_tree, const char *git_dir);
+int parallel_submodules(void);
 
 #endif
diff --git a/t/t5526-fetch-submodules.sh b/t/t5526-fetch-submodules.sh
index 1241146..954d0e4 100755
--- a/t/t5526-fetch-submodules.sh
+++ b/t/t5526-fetch-submodules.sh
@@ -471,4 +471,18 @@ test_expect_success "don't fetch submodule when newly

[PATCHv14 0/7] Expose submodule parallelism to the user

2016-02-19 Thread Stefan Beller

Thanks Junio, Jeff and Eric for reviewing once again!

* fixes as proposed by Junio for readability in parse_submodule_update_strategy
* Do not leak the `url`, as found by Jeff.
  (I dug into the code of argv_array_pushl and you're obviously correct)

Thanks,
Stefan

Interdiff to v13:

diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
index 65bdc14..c435c53 100644
--- a/builtin/submodule--helper.c
+++ b/builtin/submodule--helper.c
@@ -356,13 +356,11 @@ static int update_clone_inspect_next_task(struct 
child_process *cp,
 
argv_array_pushl(>args, "--path", sub->path, NULL);
argv_array_pushl(>args, "--name", sub->name, NULL);
-   argv_array_pushl(>args, "--url", strdup(url), NULL);
+   argv_array_pushl(>args, "--url", url, NULL);
if (pp->reference)
argv_array_push(>args, pp->reference);
if (pp->depth)
argv_array_push(>args, pp->depth);
-
-
}
 
 cleanup:
diff --git a/submodule.c b/submodule.c
index b54d92d..051f722 100644
--- a/submodule.c
+++ b/submodule.c
@@ -219,7 +219,7 @@ void gitmodules_config(void)
 int parse_submodule_update_strategy(const char *value,
struct submodule_update_strategy *dst)
 {
-   free((void*)dst->command);
+   free((void *)dst->command);
dst->command = NULL;
if (!strcmp(value, "none"))
dst->type = SM_UPDATE_NONE;
@@ -229,9 +229,9 @@ int parse_submodule_update_strategy(const char *value,
dst->type = SM_UPDATE_REBASE;
else if (!strcmp(value, "merge"))
dst->type = SM_UPDATE_MERGE;
-   else if (value[0] == '!') {
+   else if (skip_prefix(value, "!", )) {
dst->type = SM_UPDATE_COMMAND;
-   dst->command = xstrdup(value + 1);
+   dst->command = xstrdup(value);
} else
return -1;
return 0;

Stefan Beller (7):
  submodule-config: keep update strategy around
  submodule-config: drop check against NULL
  fetching submodules: respect `submodule.fetchJobs` config option
  submodule update: direct error message to stderr
  git submodule update: have a dedicated helper for cloning
  submodule update: expose parallelism to the user
  clone: allow an explicit argument for parallel submodule clones

 Documentation/config.txt|   6 +
 Documentation/git-clone.txt |   6 +-
 Documentation/git-submodule.txt |   7 +-
 builtin/clone.c |  19 +++-
 builtin/fetch.c |   2 +-
 builtin/submodule--helper.c | 237 
 git-submodule.sh|  54 -
 submodule-config.c  |  19 +++-
 submodule-config.h  |   2 +
 submodule.c |  37 ++-
 submodule.h |  18 +++
 t/t5526-fetch-submodules.sh |  14 +++
 t/t7400-submodule-basic.sh  |   4 +-
 t/t7406-submodule-update.sh |  27 +
 14 files changed, 403 insertions(+), 49 deletions(-)

-- 
2.7.0.rc0.34.ga06e0b3.dirty

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCHv14 2/7] submodule-config: drop check against NULL

2016-02-19 Thread Stefan Beller

Adhere to the common coding style of Git and not check explicitly
for NULL throughout the file. There are still other occurrences in the
code base but that is usually inside of conditions with side effects.

Signed-off-by: Stefan Beller 
Signed-off-by: Junio C Hamano 
---
 submodule-config.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/submodule-config.c b/submodule-config.c
index a5cd2ee..9fa2269 100644
--- a/submodule-config.c
+++ b/submodule-config.c
@@ -267,7 +267,7 @@ static int parse_config(const char *var, const char *value, 
void *data)
if (!strcmp(item.buf, "path")) {
if (!value)
ret = config_error_nonbool(var);
-   else if (!me->overwrite && submodule->path != NULL)
+   else if (!me->overwrite && submodule->path)
warn_multiple_config(me->commit_sha1, submodule->name,
"path");
else {
@@ -291,7 +291,7 @@ static int parse_config(const char *var, const char *value, 
void *data)
} else if (!strcmp(item.buf, "ignore")) {
if (!value)
ret = config_error_nonbool(var);
-   else if (!me->overwrite && submodule->ignore != NULL)
+   else if (!me->overwrite && submodule->ignore)
warn_multiple_config(me->commit_sha1, submodule->name,
"ignore");
else if (strcmp(value, "untracked") &&
@@ -307,7 +307,7 @@ static int parse_config(const char *var, const char *value, 
void *data)
} else if (!strcmp(item.buf, "url")) {
if (!value) {
ret = config_error_nonbool(var);
-   } else if (!me->overwrite && submodule->url != NULL) {
+   } else if (!me->overwrite && submodule->url) {
warn_multiple_config(me->commit_sha1, submodule->name,
"url");
} else {
-- 
2.7.0.rc0.34.ga06e0b3.dirty

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCHv14 1/7] submodule-config: keep update strategy around

2016-02-19 Thread Stefan Beller

Currently submodule..update is only handled by git-submodule.sh.
C code will start to need to make use of that value as more of the
functionality of git-submodule.sh moves into library code in C.

Add the update field to 'struct submodule' and populate it so it can
be read as sm->update or from sm->update_command.

Signed-off-by: Stefan Beller 
Signed-off-by: Junio C Hamano 
---
 submodule-config.c | 13 +
 submodule-config.h |  2 ++
 submodule.c| 21 +
 submodule.h| 16 
 4 files changed, 52 insertions(+)

diff --git a/submodule-config.c b/submodule-config.c
index afe0ea8..a5cd2ee 100644
--- a/submodule-config.c
+++ b/submodule-config.c
@@ -59,6 +59,7 @@ static void free_one_config(struct submodule_entry *entry)
 {
free((void *) entry->config->path);
free((void *) entry->config->name);
+   free((void *) entry->config->update_strategy.command);
free(entry->config);
 }
 
@@ -194,6 +195,8 @@ static struct submodule *lookup_or_create_by_name(struct 
submodule_cache *cache,
 
submodule->path = NULL;
submodule->url = NULL;
+   submodule->update_strategy.type = SM_UPDATE_UNSPECIFIED;
+   submodule->update_strategy.command = NULL;
submodule->fetch_recurse = RECURSE_SUBMODULES_NONE;
submodule->ignore = NULL;
 
@@ -311,6 +314,16 @@ static int parse_config(const char *var, const char 
*value, void *data)
free((void *) submodule->url);
submodule->url = xstrdup(value);
}
+   } else if (!strcmp(item.buf, "update")) {
+   if (!value)
+   ret = config_error_nonbool(var);
+   else if (!me->overwrite &&
+submodule->update_strategy.type != 
SM_UPDATE_UNSPECIFIED)
+   warn_multiple_config(me->commit_sha1, submodule->name,
+"update");
+   else if (parse_submodule_update_strategy(value,
+>update_strategy) < 0)
+   die(_("invalid value for %s"), var);
}
 
strbuf_release();
diff --git a/submodule-config.h b/submodule-config.h
index 9061e4e..092ebfc 100644
--- a/submodule-config.h
+++ b/submodule-config.h
@@ -2,6 +2,7 @@
 #define SUBMODULE_CONFIG_CACHE_H
 
 #include "hashmap.h"
+#include "submodule.h"
 #include "strbuf.h"
 
 /*
@@ -14,6 +15,7 @@ struct submodule {
const char *url;
int fetch_recurse;
const char *ignore;
+   struct submodule_update_strategy update_strategy;
/* the sha1 blob id of the responsible .gitmodules file */
unsigned char gitmodules_sha1[20];
 };
diff --git a/submodule.c b/submodule.c
index b83939c..b38dd51 100644
--- a/submodule.c
+++ b/submodule.c
@@ -210,6 +210,27 @@ void gitmodules_config(void)
}
 }
 
+int parse_submodule_update_strategy(const char *value,
+   struct submodule_update_strategy *dst)
+{
+   free((void *)dst->command);
+   dst->command = NULL;
+   if (!strcmp(value, "none"))
+   dst->type = SM_UPDATE_NONE;
+   else if (!strcmp(value, "checkout"))
+   dst->type = SM_UPDATE_CHECKOUT;
+   else if (!strcmp(value, "rebase"))
+   dst->type = SM_UPDATE_REBASE;
+   else if (!strcmp(value, "merge"))
+   dst->type = SM_UPDATE_MERGE;
+   else if (skip_prefix(value, "!", )) {
+   dst->type = SM_UPDATE_COMMAND;
+   dst->command = xstrdup(value);
+   } else
+   return -1;
+   return 0;
+}
+
 void handle_ignore_submodules_arg(struct diff_options *diffopt,
  const char *arg)
 {
diff --git a/submodule.h b/submodule.h
index cbc0003..3464500 100644
--- a/submodule.h
+++ b/submodule.h
@@ -13,6 +13,20 @@ enum {
RECURSE_SUBMODULES_ON = 2
 };
 
+enum submodule_update_type {
+   SM_UPDATE_UNSPECIFIED = 0,
+   SM_UPDATE_CHECKOUT,
+   SM_UPDATE_REBASE,
+   SM_UPDATE_MERGE,
+   SM_UPDATE_NONE,
+   SM_UPDATE_COMMAND
+};
+
+struct submodule_update_strategy {
+   enum submodule_update_type type;
+   const char *command;
+};
+
 int is_staging_gitmodules_ok(void);
 int update_path_in_gitmodules(const char *oldpath, const char *newpath);
 int remove_path_from_gitmodules(const char *path);
@@ -21,6 +35,8 @@ void set_diffopt_flags_from_submodule_config(struct 
diff_options *diffopt,
const char *path);
 int submodule_config(const char *var, const char *value, void *cb);
 void gitmodules_config(void);
+int parse_submodule_update_strategy(const char *value,
+   struct submodule_update_strategy *dst);
 void handle_ignore_submodules_arg(struct diff_options *diffopt, const char *);
 void show_submodule_summary(FILE *f, const char *path,
const char *line_prefix,
-- 
2.7.0.rc0.34.ga06e0b3.dirty

--

[PATCHv14 6/7] submodule update: expose parallelism to the user

2016-02-19 Thread Stefan Beller

Expose possible parallelism either via the "--jobs" CLI parameter or
the "submodule.fetchJobs" setting.

By having the variable initialized to -1, we make sure 0 can be passed
into the parallel processing machine, which will then pick as many parallel
workers as there are CPUs.

Signed-off-by: Stefan Beller 
Signed-off-by: Junio C Hamano 
---
 Documentation/git-submodule.txt |  7 ++-
 builtin/submodule--helper.c | 16 
 git-submodule.sh|  9 +
 t/t7406-submodule-update.sh | 12 
 4 files changed, 39 insertions(+), 5 deletions(-)

diff --git a/Documentation/git-submodule.txt b/Documentation/git-submodule.txt
index 1572f05..13adebf 100644
--- a/Documentation/git-submodule.txt
+++ b/Documentation/git-submodule.txt
@@ -16,7 +16,7 @@ SYNOPSIS
 'git submodule' [--quiet] deinit [-f|--force] [--] ...
 'git submodule' [--quiet] update [--init] [--remote] [-N|--no-fetch]
  [-f|--force] [--rebase|--merge] [--reference ]
- [--depth ] [--recursive] [--] [...]
+ [--depth ] [--recursive] [--jobs ] [--] [...]
 'git submodule' [--quiet] summary [--cached|--files] [(-n|--summary-limit) ]
  [commit] [--] [...]
 'git submodule' [--quiet] foreach [--recursive] 
@@ -377,6 +377,11 @@ for linkgit:git-clone[1]'s `--reference` and `--shared` 
options carefully.
clone with a history truncated to the specified number of revisions.
See linkgit:git-clone[1]
 
+-j ::
+--jobs ::
+   This option is only valid for the update command.
+   Clone new submodules in parallel with as many jobs.
+   Defaults to the `submodule.fetchJobs` option.
 
 ...::
Paths to submodule(s). When specified this will restrict the command
diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
index c356aaf..c435c53 100644
--- a/builtin/submodule--helper.c
+++ b/builtin/submodule--helper.c
@@ -422,6 +422,7 @@ static int update_clone_task_finished(int result,
 static int update_clone(int argc, const char **argv, const char *prefix)
 {
const char *update = NULL;
+   int max_jobs = -1;
struct string_list_item *item;
struct submodule_update_clone pp = SUBMODULE_UPDATE_CLONE_INIT;
 
@@ -442,6 +443,8 @@ static int update_clone(int argc, const char **argv, const 
char *prefix)
OPT_STRING(0, "depth", , "",
   N_("Create a shallow clone truncated to the "
  "specified number of revisions")),
+   OPT_INTEGER('j', "jobs", _jobs,
+   N_("parallel jobs")),
OPT__QUIET(, N_("do't print cloning progress")),
OPT_END()
};
@@ -467,10 +470,15 @@ static int update_clone(int argc, const char **argv, 
const char *prefix)
gitmodules_config();
/* Overlay the parsed .gitmodules file with .git/config */
git_config(submodule_config, NULL);
-   run_processes_parallel(1, update_clone_get_next_task,
- update_clone_start_failure,
- update_clone_task_finished,
- );
+
+   if (max_jobs < 0)
+   max_jobs = parallel_submodules();
+
+   run_processes_parallel(max_jobs,
+  update_clone_get_next_task,
+  update_clone_start_failure,
+  update_clone_task_finished,
+  );
 
if (pp.print_unmatched) {
printf("#unmatched\n");
diff --git a/git-submodule.sh b/git-submodule.sh
index 9f554fb..10c5af9 100755
--- a/git-submodule.sh
+++ b/git-submodule.sh
@@ -645,6 +645,14 @@ cmd_update()
--depth=*)
depth=$1
;;
+   -j|--jobs)
+   case "$2" in '') usage ;; esac
+   jobs="--jobs=$2"
+   shift
+   ;;
+   --jobs=*)
+   jobs=$1
+   ;;
--)
shift
break
@@ -670,6 +678,7 @@ cmd_update()
${update:+--update "$update"} \
${reference:+--reference "$reference"} \
${depth:+--depth "$depth"} \
+   ${jobs:+$jobs} \
"$@" | {
err=
while read mode sha1 stage just_cloned sm_path
diff --git a/t/t7406-submodule-update.sh b/t/t7406-submodule-update.sh
index dda3929..7fd5142 100755
--- a/t/t7406-submodule-update.sh
+++ b/t/t7406-submodule-update.sh
@@ -774,4 +774,16 @@ test_expect_success 'submodule update --recursive drops 
module name before recur
 test_i18ngrep "Submodule path .deeper/submodule/subsubmodule.: checked 
out" actual
)
 '
+
+test_expect_success 'submodule update can be run in parallel' '
+   (cd super2 &&
+

[PATCHv14 5/7] git submodule update: have a dedicated helper for cloning

2016-02-19 Thread Stefan Beller

This introduces a new helper function in git submodule--helper
which takes care of cloning all submodules, which we want to
parallelize eventually.

Some tests (such as empty URL, update_mode=none) are required in the
helper to make the decision for cloning. These checks have been
moved into the C function as well (no need to repeat them in the
shell script).

Signed-off-by: Stefan Beller 
Signed-off-by: Junio C Hamano 
---
 builtin/submodule--helper.c | 229 
 git-submodule.sh|  45 +++--
 2 files changed, 240 insertions(+), 34 deletions(-)

diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
index f4c3eff..c356aaf 100644
--- a/builtin/submodule--helper.c
+++ b/builtin/submodule--helper.c
@@ -255,6 +255,234 @@ static int module_clone(int argc, const char **argv, 
const char *prefix)
return 0;
 }
 
+struct submodule_update_clone {
+   /* states */
+   int count;
+   int print_unmatched;
+   /* configuration */
+   int quiet;
+   const char *reference;
+   const char *depth;
+   const char *recursive_prefix;
+   const char *prefix;
+   struct module_list list;
+   struct string_list projectlines;
+   struct submodule_update_strategy update;
+   struct pathspec pathspec;
+};
+#define SUBMODULE_UPDATE_CLONE_INIT {0, 0, 0, NULL, NULL, NULL, NULL, 
MODULE_LIST_INIT, STRING_LIST_INIT_DUP, SUBMODULE_UPDATE_STRATEGY_INIT}
+
+static int update_clone_inspect_next_task(struct child_process *cp,
+ struct strbuf *err,
+ struct submodule_update_clone *pp,
+ void **pp_task_cb,
+ const struct cache_entry *ce)
+{
+   const struct submodule *sub = NULL;
+   struct strbuf displaypath_sb = STRBUF_INIT;
+   struct strbuf sb = STRBUF_INIT;
+   const char *displaypath = NULL;
+   char *url = NULL;
+   int needs_cloning = 0;
+
+   if (ce_stage(ce)) {
+   if (pp->recursive_prefix)
+   strbuf_addf(err, "Skipping unmerged submodule %s/%s\n",
+   pp->recursive_prefix, ce->name);
+   else
+   strbuf_addf(err, "Skipping unmerged submodule %s\n",
+   ce->name);
+   goto cleanup;
+   }
+
+   sub = submodule_from_path(null_sha1, ce->name);
+
+   if (pp->recursive_prefix)
+   displaypath = relative_path(pp->recursive_prefix,
+   ce->name, _sb);
+   else
+   displaypath = ce->name;
+
+   if (pp->update.type == SM_UPDATE_NONE ||
+   (pp->update.type == SM_UPDATE_UNSPECIFIED &&
+sub->update_strategy.type == SM_UPDATE_NONE)) {
+   strbuf_addf(err, "Skipping submodule '%s'\n",
+   displaypath);
+   goto cleanup;
+   }
+
+   /*
+* Looking up the url in .git/config.
+* We must not fall back to .gitmodules as we only want
+* to process configured submodules.
+*/
+   strbuf_reset();
+   strbuf_addf(, "submodule.%s.url", sub->name);
+   git_config_get_string(sb.buf, );
+   if (!url) {
+   /*
+* Only mention uninitialized submodules when its
+* path have been specified
+*/
+   if (pp->pathspec.nr)
+   strbuf_addf(err, _("Submodule path '%s' not 
initialized\n"
+   "Maybe you want to use 'update --init'?"),
+   displaypath);
+   goto cleanup;
+   }
+
+   strbuf_reset();
+   strbuf_addf(, "%s/.git", ce->name);
+   needs_cloning = !file_exists(sb.buf);
+
+   strbuf_reset();
+   strbuf_addf(, "%06o %s %d %d\t%s\n", ce->ce_mode,
+   sha1_to_hex(ce->sha1), ce_stage(ce),
+   needs_cloning, ce->name);
+   string_list_append(>projectlines, sb.buf);
+
+   if (needs_cloning) {
+   cp->git_cmd = 1;
+   cp->no_stdin = 1;
+   cp->stdout_to_stderr = 1;
+   cp->err = -1;
+   argv_array_push(>args, "submodule--helper");
+   argv_array_push(>args, "clone");
+   if (pp->quiet)
+   argv_array_push(>args, "--quiet");
+
+   if (pp->prefix)
+   argv_array_pushl(>args, "--prefix", pp->prefix, 
NULL);
+
+   argv_array_pushl(>args, "--path", sub->path, NULL);
+   argv_array_pushl(>args, "--name", sub->name, NULL);
+   argv_array_pushl(>args, "--url", url, NULL);
+   if (pp->reference)
+   argv_array_push(>args, pp->reference);
+

Re: GSoC 2016: Microproject

2016-02-19 Thread Stefan Beller

On Fri, Feb 19, 2016 at 9:39 AM, Mehul Jain  wrote:
> On Fri, Feb 19, 2016 at 6:33 PM, Matthieu Moy
>  wrote:
>> Hi,
>>
>> This is a double-post. You've posted almost the same message under the
>> title "GSoC 2016". Nothing serious, but attention to details is
>> important if you want to give a good image of yourself.
>
> I'm sorry of this kind of behavior. It was due to a misunderstanding by my 
> side.
>
>> I don't have all the code in mind, but I think it is. In this situation,
>> you always end up with a variable telling Git what to do (I guess,
>> autostash here), and this variable is set according to the configuration
>> and the command-line.
>>
>> You should be careful about the precedence: command-line should override
>> the configuration. And the default behavior should be used if neither
>> the command-line nor the configuration specified otherwise.
>
> Thanks for the suggestion.
> I've started the work on patch and did the change in the code which
> were necessary for Microproject. I have run the test by using "make",
> and there was no failure of any test.
> I've a doubt regarding tests. Here as "git pull" will now understand
> the argument that  "--[no-]autostash" means "rebase.autostash" should
> be set false for current execution of command "git pull --rebase". So
> do I have to write a test for this new option?
>

Yes, most likely t/t5521-pull-options.sh or t/t5520-pull.sh would be the right
place as judging from the file name of the tests.

Thanks,
Stefan

> Mehul Jain
> --
> To unsubscribe from this list: send the line "unsubscribe git" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 13/21] fast-import: simplify allocation in start_packfile

2016-02-19 Thread Junio C Hamano

Jeff King  writes:

> This function allocate a packed_git flex-array, and adds a

s/allocate//, right?

> mysterious 2 bytes to the length of the pack_name field. One
> is for the trailing NUL, but the other has no purpose. This
> is probably cargo-culted from add_packed_git, which gets the
> ".idx" path and needed to allocate enough space to hold the
> matching ".pack" (though since 48bcc1c, we calculate the
> size there differently).
>
> This site, however, is using the raw path of a tempfile, and
> does not need the extra byte. We can just replace the
> allocation with FLEX_ALLOC_STR, which handles the allocation
> and the NUL for us.
>
> Signed-off-by: Jeff King 
> ---
>  fast-import.c | 5 +
>  1 file changed, 1 insertion(+), 4 deletions(-)
>
> diff --git a/fast-import.c b/fast-import.c
> index 3053bb8..9fc7093 100644
> --- a/fast-import.c
> +++ b/fast-import.c
> @@ -865,15 +865,12 @@ static void start_packfile(void)
>  {
>   static char tmp_file[PATH_MAX];
>   struct packed_git *p;
> - int namelen;
>   struct pack_header hdr;
>   int pack_fd;
>  
>   pack_fd = odb_mkstemp(tmp_file, sizeof(tmp_file),
> "pack/tmp_pack_XX");
> - namelen = strlen(tmp_file) + 2;
> - p = xcalloc(1, sizeof(*p) + namelen);
> - xsnprintf(p->pack_name, namelen, "%s", tmp_file);
> + FLEX_ALLOC_STR(p, pack_name, tmp_file);
>   p->pack_fd = pack_fd;
>   p->do_not_close = 1;
>   pack_file = sha1fd(pack_fd, p->pack_name);
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: GSoC 2016: Microproject

2016-02-19 Thread Mehul Jain

On Fri, Feb 19, 2016 at 6:33 PM, Matthieu Moy
 wrote:
> Hi,
>
> This is a double-post. You've posted almost the same message under the
> title "GSoC 2016". Nothing serious, but attention to details is
> important if you want to give a good image of yourself.

I'm sorry of this kind of behavior. It was due to a misunderstanding by my side.

> I don't have all the code in mind, but I think it is. In this situation,
> you always end up with a variable telling Git what to do (I guess,
> autostash here), and this variable is set according to the configuration
> and the command-line.
>
> You should be careful about the precedence: command-line should override
> the configuration. And the default behavior should be used if neither
> the command-line nor the configuration specified otherwise.

Thanks for the suggestion.
I've started the work on patch and did the change in the code which
were necessary for Microproject. I have run the test by using "make",
and there was no failure of any test.
I've a doubt regarding tests. Here as "git pull" will now understand
the argument that  "--[no-]autostash" means "rebase.autostash" should
be set false for current execution of command "git pull --rebase". So
do I have to write a test for this new option?

Mehul Jain
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: Test failures with GNU grep 2.23

2016-02-19 Thread Junio C Hamano

Jeff King  writes:

> Yeah, I'd worry that "-a" is not portable. OTOH, BSD grep seems to have
> it, so between that and GNU, I think most systems are covered. We could
> do:
>
>   test_lazy_prereq GREP_A '
>   echo foo | grep -a foo
>   '
>
> and mark these tests with it. I'd also be happy to skip that step and
> just do it if and when somebody actually complains about a system
> without it (I wouldn't be surprised if most people on antique systems
> end up installing GNU grep anyway).
>
> Another option might be using "sed -ne '/^author/p'" or similar. But
> that may very well just be trading one portability problem for another.

Would $PERL help, I wonder?
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: git submodule should honor "-c credential.helper" command line argument

2016-02-19 Thread Junio C Hamano

Jeff King  writes:

> That being said, I am not sure this is the right solution. In the thread
> I linked earlier[1], Jens indicated he would prefer not to blindly share
> config with the submodules, and I think I agree. Or are you proposing to
> pick and choose the keys in GIT_CONFIG_PARAMETERS, and whitelist
> credential.*?

Yes, I think it is sensible not to propagate by default, and pass
ones that are absolutely safe, sane and necessary by whitelisting.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: Test failures with GNU grep 2.23

2016-02-19 Thread Eric Sunshine

On Fri, Feb 19, 2016 at 6:59 AM, Jeff King  wrote:
> On Sun, Feb 07, 2016 at 04:25:40PM +, John Keeping wrote:
>> The following patch fixes the tests for me, but I wonder if "-a" is
>> supported on all target platforms (it's not in POSIX, which specifies
>> that the "input files shall be text files") or whether we should do
>> something more comprehensive to provide sane_{e,f,}grep which guarantee
>> to treat input as text.
>>
>> I also tried setting POSIXLY_CORRECT but that doesn't affect the
>> text/binary decision.
>
> Yeah, I'd worry that "-a" is not portable. OTOH, BSD grep seems to have
> it, so between that and GNU, I think most systems are covered.

Mac OS X grep seems to support -a and tests in t8005 still pass with
-a added to the egrep invocations.

> We could
> do:
>
>   test_lazy_prereq GREP_A '
> echo foo | grep -a foo
>   '
>
> and mark these tests with it. I'd also be happy to skip that step and
> just do it if and when somebody actually complains about a system
> without it (I wouldn't be surprised if most people on antique systems
> end up installing GNU grep anyway).
>
> Another option might be using "sed -ne '/^author/p'" or similar. But
> that may very well just be trading one portability problem for another.
>
> I also wondered whether we could get away without grepping at all here.
> But the blame output has a bunch of cruft we don't care about; I think
> the readability of the tests would suffer if we tried to match the whole
> thing in a test_cmp.
>
> -Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCHv13 5/7] git submodule update: have a dedicated helper for cloning

2016-02-19 Thread Eric Sunshine

On Fri, Feb 19, 2016 at 11:47 AM, Stefan Beller  wrote:
> On Fri, Feb 19, 2016 at 4:03 AM, Jeff King  wrote:
>> On Thu, Feb 18, 2016 at 03:33:16PM -0800, Stefan Beller wrote:
>>
>>> + if (needs_cloning) {
>>> + cp->git_cmd = 1;
>>> + cp->no_stdin = 1;
>>> + cp->stdout_to_stderr = 1;
>>> + cp->err = -1;
>>> + argv_array_push(>args, "submodule--helper");
>>> + argv_array_push(>args, "clone");
>>> + if (pp->quiet)
>>> + argv_array_push(>args, "--quiet");
>>> +
>>> + if (pp->prefix)
>>> + argv_array_pushl(>args, "--prefix", pp->prefix, 
>>> NULL);
>>> +
>>> + argv_array_pushl(>args, "--path", sub->path, NULL);
>>> + argv_array_pushl(>args, "--name", sub->name, NULL);
>>> + argv_array_pushl(>args, "--url", strdup(url), NULL);
>>
>> No need to strdup() here; argv_array handles its own memory, so this
>> just leaks (and if we were keeping it, it should be xstrdup(), of
>> course).
>
> We cannot remove the strdup as the url is a local variable we read in from
> git_config_get_string and the local variable is going out of scope before the
> child process ends?
>
> I'll change it to xstrdup then.

When Peff said "argv_array handles its own memory", he meant that it
does xstrdup() itself, so there's no need for you to do so a second
time manually (leaking a string as a consequence).
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] git-p4.py: Don't try to rebase on submit from bare repository

2016-02-19 Thread Junio C Hamano

Luke Diamand  writes:

> On 17 February 2016 at 22:46, Amadeusz Żołnowski  wrote:
>> git-p4 can be successfully used from bare repository (which acts as a
>> bridge between Perforce repository and pure Git repositories). On submit
>> git-p4 performs unconditional rebase. Do rebase only on non-bare
>> repositories.
>
> This looks obviously sensible and good to me, ack.
>
> Thanks!
> Luke

Luke, thanks for a prompt feedback.

Amadeusz, can you sign off your patch (cf. Documentation/SubmittingPatches)?



>
>
>
>> ---
>>  git-p4.py | 5 +++--
>>  1 file changed, 3 insertions(+), 2 deletions(-)
>>
>> diff --git a/git-p4.py b/git-p4.py
>> index c33dece..e00cd02 100755
>> --- a/git-p4.py
>> +++ b/git-p4.py
>> @@ -2059,8 +2059,9 @@ class P4Submit(Command, P4UserMap):
>>  sync.branch = self.branch
>>  sync.run([])
>>
>> -rebase = P4Rebase()
>> -rebase.rebase()
>> +if not gitConfigBool("core.bare"):
>> +rebase = P4Rebase()
>> +rebase.rebase()
>>
>>  else:
>>  if len(applied) == 0:
>> --
>> 2.7.0
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe git" in
>> the body of a message to majord...@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 05/20] rename_tmp_log(): use raceproof_create_file()

2016-02-19 Thread Junio C Hamano

Michael Haggerty  writes:

> Now, we have to consider the opposite case, namely that we are calling a
> non-buggy implementation of `rename()`, and we artificially change
> ENOTDIR to EISDIR. Can that cause any bad effects?
>
> I don't think so, because the case where a non-buggy implementation can
> yield ENOTDIR is a case, the consequent call to
> `remove_dir_recursively()` would fail with ENOTDIR too, and
> `raceproof_create_file()` would give up immediately.
>
> So I think everything is OK, though I admit that it is not especially
> elegant.

OK, thanks for a clear analysis.

I knew the original and the update were meant to (and would) behave
the same, but the workaround logic in the original looked iffy.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCHv13 5/7] git submodule update: have a dedicated helper for cloning

2016-02-19 Thread Stefan Beller

On Fri, Feb 19, 2016 at 4:03 AM, Jeff King  wrote:
> On Thu, Feb 18, 2016 at 03:33:16PM -0800, Stefan Beller wrote:
>
>> + if (needs_cloning) {
>> + cp->git_cmd = 1;
>> + cp->no_stdin = 1;
>> + cp->stdout_to_stderr = 1;
>> + cp->err = -1;
>> + argv_array_push(>args, "submodule--helper");
>> + argv_array_push(>args, "clone");
>> + if (pp->quiet)
>> + argv_array_push(>args, "--quiet");
>> +
>> + if (pp->prefix)
>> + argv_array_pushl(>args, "--prefix", pp->prefix, 
>> NULL);
>> +
>> + argv_array_pushl(>args, "--path", sub->path, NULL);
>> + argv_array_pushl(>args, "--name", sub->name, NULL);
>> + argv_array_pushl(>args, "--url", strdup(url), NULL);
>
> No need to strdup() here; argv_array handles its own memory, so this
> just leaks (and if we were keeping it, it should be xstrdup(), of
> course).

We cannot remove the strdup as the url is a local variable we read in from
git_config_get_string and the local variable is going out of scope before the
child process ends?

I'll change it to xstrdup then.

>
> -Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 05/20] rename_tmp_log(): use raceproof_create_file()

2016-02-19 Thread Michael Haggerty

On 02/17/2016 09:53 PM, Junio C Hamano wrote:
> Michael Haggerty  writes:
> 
>> Besides shortening the code, this saves an unnecessary call to
>> safe_create_leading_directories_const() in almost all cases.
>>
>> Signed-off-by: Michael Haggerty 
>> ---
>>  refs/files-backend.c | 76 
>> ++--
>>  1 file changed, 32 insertions(+), 44 deletions(-)
>>
>> diff --git a/refs/files-backend.c b/refs/files-backend.c
>> index a549942..e5f964c 100644
>> --- a/refs/files-backend.c
>> +++ b/refs/files-backend.c
>> @@ -2400,55 +2400,43 @@ out:
>>   */
>>  #define TMP_RENAMED_LOG  "logs/refs/.tmp-renamed-log"
>>  
>> +static int rename_tmp_log_callback(const char *path, void *cb)
>> +{
>> +int *true_errno = cb;
>> +
>> +if (rename(git_path(TMP_RENAMED_LOG), path)) {
>> +/*
>> + * rename(a, b) when b is an existing directory ought
>> + * to result in ISDIR, but Solaris 5.8 gives ENOTDIR.
>> + * Sheesh. Record the true errno for error reporting,
>> + * but report EISDIR to raceproof_create_file() so
>> + * that it knows to retry.
>> + */
>> +*true_errno = errno;
>> +if (errno==ENOTDIR)
>> +errno = EISDIR;
> 
> Style: SP on both sides of a binary operator.

Thanks; will fix.

> More importantly, is ENOTDIR expected only on a buggy platform?  

Here I was just mimicking the old behavior, which I think was correct,
but let's check more carefully...

> [ENOTDIR]
> A component of either path prefix names an existing file that is
> neither a directory nor a symbolic link to a directory; or the old
> argument names a directory and the new argument names a
> non-directory file; or the old argument contains at least one non-
>  character and ends with one or more trailing 
> characters and the last pathname component names an existing file
> that is neither a directory nor a symbolic link to a directory; or
> the old argument names an existing non-directory file and the new
> argument names a nonexistent file, contains at least one non-
>  character, and ends with one or more trailing 
> characters; or the new argument names an existing non-directory
> file, contains at least one non-  character, and ends with
> one or more trailing  characters.
> 
> i.e. when a leading component of "path" or TMP_RENAMED_LOG is an
> existing non-directory, we could get ENOTDIR on a valid system.
> 
> If another instance of Git created a file A/B when this process is
> trying to rename the temporary thing to its final location A/B/C,
> isn't that the errno we would see here?
>
> [EISDIR]
> The new argument points to a directory and the old argument
> points to a file that is not a directory.
>
> Puzzled...

We just created TMP_RENAMED_LOG ourselves, so I don't think we need to
expect errors from that argument. (Though I don't recall that there is
any locking to prevent two `git branch -m` processes from clobbering
each others' temporary files. Oh well; renaming branches is relatively
rare and probably interactive, so I'll declare that potential problem to
be out of scope for this patch series.)

So let's consider the cases where we can get ENOTDIR for `path`:

> A component of either path prefix names an existing file that is
> neither a directory nor a symbolic link to a directory.

This can certainly happen for `path`, but it is not a case that can be
rescued by raceproof_create_file().

> or the old argument names a directory [...]

This is not the case.

> or the old argument contains [...]

Also not interesting.

> or the old argument names an existing non-directory file and the new
> argument names a nonexistent file, contains at least one non-
>  character, and ends with one or more trailing 
> characters

The new argument doesn't end with trailing  characters, so this
can't happen.

> or the new argument names an existing non-directory
> file, contains at least one non-  character, and ends with
> one or more trailing  characters

Ditto.

So while it is true that a non-buggy implementation can give ENOTDIR, it
is for a case that we can't rescue. So if it weren't for the buggy
implementation, we could just leave ENOTDIR un-handled.

Now, we have to consider the opposite case, namely that we are calling a
non-buggy implementation of `rename()`, and we artificially change
ENOTDIR to EISDIR. Can that cause any bad effects?

I don't think so, because the case where a non-buggy implementation can
yield ENOTDIR is a case, the consequent call to
`remove_dir_recursively()` would fail with ENOTDIR too, and
`raceproof_create_file()` would give up immediately.

So I think everything is OK, though I admit that it is not especially
elegant. We could limit ourselves to doing the workaround only on
Solaris 5.8, but that seems like a lot of effort for not much benefit.
Or we

Greetings

2016-02-19 Thread Estelle Amadieu

Greetings,

I am Mrs Estelle Amadieu, I am 65 years old base in Cote d'Ivoire I write to 
relate to you of my intention to use my money 2.5 million dollars for charity 
work in your country. I was married to Late Chrestien Amadieu who was a 
contractor with the Government of Cote d'Ivoire before he died after few days 
in the hospital.I have been suffering from cancer I want to know if I can trust 
you to use these funds for charity/orphanage and 20% will be for you as 
compensation. Please contact me, so that I will give you more details.

Yours in the Lord,

Mrs Estelle Amadieu
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: GSoC 2016: Microproject

2016-02-19 Thread Matthieu Moy

Hi,

This is a double-post. You've posted almost the same message under the
title "GSoC 2016". Nothing serious, but attention to details is
important if you want to give a good image of yourself.

Mehul Jain  writes:

> Hello everyone,
>
> I'm Mehul Jain. I'm looking for participating in GSoC 2016.

Note that we are just submitting our application, but have no guarantee
that the Git organization will be accepted for this year's GSoC.

> I've started work on a Microproject" Teach git pull --rebase the
> --no-autostash" option. While looking at Git's source code I have made
> following observation: In the pull.c file
> 1.  git_config_get_bool( , ) search in the configuration key for the
> value of rebase.autostash, if found true then modify autostash's value
> to a non-zero number and thus making a stash to encounter the problem
> of dirty tree.
> 2. Here if in command line a flag "--no-autostash" is given then we
> can easily set the value of autostash = 0 and thus killing the process
> by die_on_unclean_work_tree(prefix).
> Is my observation is right?

I don't have all the code in mind, but I think it is. In this situation,
you always end up with a variable telling Git what to do (I guess,
autostash here), and this variable is set according to the configuration
and the command-line.

You should be careful about the precedence: command-line should override
the configuration. And the default behavior should be used if neither
the command-line nor the configuration specified otherwise.

To get an example, you can pick any pair (command-line option, config)
that work together, find the code implementing it and blame to find the
corresponding commit.

-- 
Matthieu Moy
http://www-verimag.imag.fr/~moy/
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: GSoC 2016: applications open, deadline = Fri, 19/2

2016-02-19 Thread Matthieu Moy

Lars Schneider  writes:

> Thanks for your elaborate response. I think I got your point and I
> tried to adjust my "beginner mode" proposal accordingly [1].

Now merged as
https://github.com/git/git.github.io/commit/6b8b5e19cdb221192dedd70ba3e207636f1cdab1

I've added a warning for students:

  Note that this project is not technically difficult, it requires a
  deep understanding of Git: how each command is meant to be used, what
  are the potential dangers, ... Reaching a solution that effectively
  protects beginners without harming anyone is much harder than it
  seems. See for example [this

thread](http://thread.gmane.org/gmane.comp.version-control.git/285893/focus=286614)
  for example potential objections. If chosen, this project should be
  discussed in depth on the list before and after the student
  application.

I just want to avoid students loosing their time writting silly
proposals (once you have seen what the majority of proposals looks like,
nothing surprises you anymore ;-) ).

-- 
Matthieu Moy
http://www-verimag.imag.fr/~moy/
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] merge_blobs: use strbuf instead of manually-sized mmfile_t

2016-02-19 Thread Stefan Frühwirth


On 2016-02-16 at 22:27 Junio C Hamano wrote:


Three, I know the existence of the program is not more than "we
could do something like this" illustration by Linus, and its output
is in no way _designed_ to be so.  We know today that it does not do


Well, then it is just really sad that the manpage doesn't say so. This 
should be corrected immediately in order to prevent someone to build 
more (e.g.) libraries on top of it.


Stefan
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] merge_blobs: use strbuf instead of manually-sized mmfile_t

2016-02-19 Thread Stefan Frühwirth


On 2016-02-16 at 21:35 Jeff King wrote:


Yeah, I agree there isn't a great solution in git here. Using "git
merge" is definitely wrong if you don't want to touch HEAD or have a
working directory.  If you _just_ care about doing the tree-level merge
without content-level merging inside blobs, you can do it in the index
like:

   export GIT_INDEX_FILE=temp.index
   base=$(git merge-base $ours $theirs)
   git read-tree -i -m --aggressive $base $ours $theirs

If you want to do content-level resolving on top of that, you can do it
with:

   git merge-index git-merge-one-file -a

though it will need a temp directory to write out conflicts and
resolved content.


That's an interesting alternative, I'll give it a try!


I don't think merge-tree is at all the wrong tool, in the sense that it
is being used as designed. But it is using merge code that is different
from literally the rest of git. That means you're going to run into
weird bugs (like this one), and places where it does not behave the
same.  This add/add case, for instance, is usually a conflict in a
normal git merge (try your test case with "git merge"), but merge-tree
tries to do a partial content merge with a base that never actually
existed[1].


Thank you for clarifying, I understand.


So I worry that merge-tree's existence is a disservice to people like
Chris, because there is no disclaimer that it is effectively
unmaintained.


I agree, I don't want to advocate continuing development under these 
conditions.



So merge-blobs.c:generate_common_file() is definitely buggy, but I think
the bug gets unintentionally canceled out by the follow-on three-way
merge. Which is...good, I guess?


Well I don't know how to handle all this with respect to my original 
problem, but that's completely off topic. Anyway: Thanks!


Stefan
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCHv13 5/7] git submodule update: have a dedicated helper for cloning

2016-02-19 Thread Jeff King

On Thu, Feb 18, 2016 at 03:33:16PM -0800, Stefan Beller wrote:

> + if (needs_cloning) {
> + cp->git_cmd = 1;
> + cp->no_stdin = 1;
> + cp->stdout_to_stderr = 1;
> + cp->err = -1;
> + argv_array_push(>args, "submodule--helper");
> + argv_array_push(>args, "clone");
> + if (pp->quiet)
> + argv_array_push(>args, "--quiet");
> +
> + if (pp->prefix)
> + argv_array_pushl(>args, "--prefix", pp->prefix, 
> NULL);
> +
> + argv_array_pushl(>args, "--path", sub->path, NULL);
> + argv_array_pushl(>args, "--name", sub->name, NULL);
> + argv_array_pushl(>args, "--url", strdup(url), NULL);

No need to strdup() here; argv_array handles its own memory, so this
just leaks (and if we were keeping it, it should be xstrdup(), of
course).

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: Test failures with GNU grep 2.23

2016-02-19 Thread Jeff King

On Sun, Feb 07, 2016 at 04:25:40PM +, John Keeping wrote:

> It seems that binary file detection has changed in GNU grep 2.23 as a
> result of commit 40ed879 (grep: fix bug with with invalid unibyte
> sequence).

I read this bug report a while ago when you posted it, but happily
ignored it until today, when my debian unstable system pulled in the new
version of grep. :)

> This causes a couple of test failures in t8005 and t9200 (the t9200 case
> is less obvious so I'm only including t8005 here):
> 
> -- >8 --
> $ ./t8005-blame-i18n.sh -v -i
> [snip]
> expecting success: 
> git blame --incremental file | \
> egrep "^(author|summary) " > actual &&
> test_cmp actual expected

Just a side note while we are touching these tests:

 - we probably should not pipe, so we check the exist code from
   git-blame

 - we usually flip the test_cmp file order, to show the difference from
   expectation when there is a failure

 - no space after ">" redirection :)

> The following patch fixes the tests for me, but I wonder if "-a" is
> supported on all target platforms (it's not in POSIX, which specifies
> that the "input files shall be text files") or whether we should do
> something more comprehensive to provide sane_{e,f,}grep which guarantee
> to treat input as text.
> 
> I also tried setting POSIXLY_CORRECT but that doesn't affect the
> text/binary decision.

Yeah, I'd worry that "-a" is not portable. OTOH, BSD grep seems to have
it, so between that and GNU, I think most systems are covered. We could
do:

  test_lazy_prereq GREP_A '
echo foo | grep -a foo
  '

and mark these tests with it. I'd also be happy to skip that step and
just do it if and when somebody actually complains about a system
without it (I wouldn't be surprised if most people on antique systems
end up installing GNU grep anyway).

Another option might be using "sed -ne '/^author/p'" or similar. But
that may very well just be trading one portability problem for another.

I also wondered whether we could get away without grepping at all here.
But the blame output has a bunch of cruft we don't care about; I think
the readability of the tests would suffer if we tried to match the whole
thing in a test_cmp.

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: GSoC 2016: applications open, deadline = Fri, 19/2

2016-02-19 Thread Thomas Gummerer

On 02/18, Lars Schneider wrote:
>
> On 17 Feb 2016, at 19:58, Matthieu Moy  wrote:
>
> > Lars Schneider  writes:
> >
> >> Coincidentally I started working on similar thing already (1) and I have
> >> lots of ideas around it.
> >
> > I guess it's time to start sharing these ideas then ;-).
> >
> > I think there's a lot to do. If we want to push this idea as a GSoC
> > project, we need:
> >
> > * A rough plan. We can't expect students to read a vague text like
> >  "let's make Git safer" and write a real proposal out of it.
> >
> > * A way to start this rough plan incrementally (i.e. first step should
> >  be easy and mergeable without waiting for next steps).
> >
> > Feel free to start writting an idea for
> > http://git.github.io/SoC-2016-Ideas/. It'd be nice to have a few more
> > ideas before Friday. We can polish them later if needed.
>
> I published my ideas here:
> https://github.com/git/git.github.io/pull/125/files

Sorry for posting my idea so late, but it took me a while to write
this all up, and life has a habit of getting in the way.  My idea goes
into a different direction than yours.

I do like the remote whitelist/blacklist project.

Junio pointed out to me off list that this is to complicated for a
GSoC project.  I kind of agree with that, but I wanted to see how this
could be split up, to completely convince myself as well.  And indeed,
the more I think about it the more risky it seems.

Below there are some thoughts on a potential design, in case someone
is interested, no code to back any of this up, sorry.

Everything proposed below should be hidden behind some configuration
variable, potentially one per command (?)

- start with git-clean.  It's well defined which files are cleaned
  from a repository when running the command.  Add them to a commit on
  the tip of the current branch.

  Start a new branch (or use the existing one if applicable) in
  refs/restore/history, and add a commit including a notes file.  The
  commit message contains the operation that was executed (clean in
  this case), and the hash of the commit we created which includes the
  cleaned files.

  Add a note to the commit, detailing from which command we come from,
  which files we added (not strictly necessary, as we can infer it
  from the parent commit).

  Useful in itself as the user can recover the files manually if
  needed, and can be sent as separate patch series.

  Potential problems:  Git has no way to track directories.  This can
  be mitigated by keeping the list of directories in the attached
  note.

- add a git recover command.  The command looks at This would look like `git 
recover
  `, where commit is the hash of the commit we saved before.

  This works by reading the note attached to the commit, figuring out
  which command was run before, and restoring the state we were in
  before.

  Potential problems: conflicts, but I think this can be solved by
  simply erroring out, at least in the first iteration.

- the next command could be git mv -f, git reset -f and friends.  It
  gets more tricky here, as we'll have to deal with the state of the
  files in the index.

  Analogous to git clean, the changes in the working tree are all
  staged and added to a new commit on the tip of the current branch.

  The note on this commit needs to contain the necessary data to
  rebuild the state in the index.  The format is more closely
  specified below.  We also need the corresponding changes in the
  git restore command.

  Restored files will be written to disk as racily smudged, so the
  contents are checked by git, as we lost the meta-data anyway.  This
  comes at a slight performance impact, but I think that's okay as we
  potentially saved the user a lot of time re-doing all the changes.

- git branch/tag --force.  Store the name and the old location of the
  branch in refs/restore/history.  There are no files lost with this
  operation, so no additional commits as for git clean or git reset
  etc. are needed.  The format of the commit depends on the exact
  operation that was forced, for exact format see below.

This treatment can't make all operations safe.  Any operation that
touches the remote is hard to undo as some users already might have
fetched the new state of the remote (e.g. git push -f).  Others such
as git-gc will inevitably delete information from the disk, but
changing that

There's more, but I don't think just writing up all commands without
any code would make any sense.

Formats:
- commits in refs/restore/history:
empty commits with the following commit message format for git-clean
and git-reset and friends:
$versionnumber\n
$command\n
$branchname\n
$sha1ofreferencedcommit\n

empty commits with the following commit message format for git branch
and friends
$versionnumber\n
$command\n (this includes the exact operation that was forced
(e.g. move, delete etc.)
$branchname\n
$sha1ThatWasReferencedByTheBranch\n

Re: GSoC 2016: applications open, deadline = now => submitted

2016-02-19 Thread Jeff King

On Fri, Feb 19, 2016 at 10:10:55AM +0100, Matthieu Moy wrote:

> > I think we can continue to modify up to the deadline (at least that has
> > been the case in years past).
> 
> Indeed: I've completed the application and it's still possible to modify
> (the form just says "You've completed this form", but there's no scary
> "submit and remove write access from now" button ;-)).
> 
> I had to modify the text a bit to fit within the new length limit (1000
> characters for most fields). The content of the form should be the same
> as what's currently on http://git.github.io/SoC-2016-Org-Application/.
> Please check.

I read over what is in Google's application system, and it all looks
good.

I did make one minor change, which is that I listed "Software Freedom
Conservancy" as a foundation of which we are a member. I don't think it
will make a difference either way to our application, but they may want
to have it in their system, as it becomes relevant later on for
invoicing the mentor payments.

Matthieu, thanks for coordinating the application effort this year.  And
thanks to everybody else for submitting ideas and microprojects.

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: Windows git bash - child processes see system PATH environment variable instead of user...

2016-02-19 Thread Johannes Schindelin

Hi Edward,

On Fri, 19 Feb 2016, Edward Marshall wrote:

> Hey, thanks for getting back to me. I subseuqently found the
> git-for-windows issues tracker on github so have posted an updated
> version of this there - not sure where is the best place for issues.

It is the best place for Windows-specific issues. Thanks for following up!

Ciao,
Johannes
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH 20/21] convert ewah/bitmap code to use xmalloc

2016-02-19 Thread Jeff King

This code was originally written with the idea that it could
be spun off into its own ewah library, and uses the
overrideable ewah_malloc to do allocations.

We plug in xmalloc as our ewah_malloc, of course. But over
the years the ewah code itself has become more entangled
with git, and the return value of many ewah_malloc sites is
not checked.

Let's just drop the level of indirection and use xmalloc and
friends directly. This saves a few lines, and will let us
adapt these sites to our more advanced malloc helpers.

Signed-off-by: Jeff King 
---
 ewah/bitmap.c  | 12 ++--
 ewah/ewah_bitmap.c |  9 +++--
 ewah/ewah_io.c | 10 ++
 ewah/ewok.h| 10 --
 4 files changed, 11 insertions(+), 30 deletions(-)

diff --git a/ewah/bitmap.c b/ewah/bitmap.c
index 47ad674..c88daa0 100644
--- a/ewah/bitmap.c
+++ b/ewah/bitmap.c
@@ -25,8 +25,8 @@
 
 struct bitmap *bitmap_new(void)
 {
-   struct bitmap *bitmap = ewah_malloc(sizeof(struct bitmap));
-   bitmap->words = ewah_calloc(32, sizeof(eword_t));
+   struct bitmap *bitmap = xmalloc(sizeof(struct bitmap));
+   bitmap->words = xcalloc(32, sizeof(eword_t));
bitmap->word_alloc = 32;
return bitmap;
 }
@@ -38,8 +38,8 @@ void bitmap_set(struct bitmap *self, size_t pos)
if (block >= self->word_alloc) {
size_t old_size = self->word_alloc;
self->word_alloc = block * 2;
-   self->words = ewah_realloc(self->words,
-   self->word_alloc * sizeof(eword_t));
+   self->words = xrealloc(self->words,
+ self->word_alloc * sizeof(eword_t));
 
memset(self->words + old_size, 0x0,
(self->word_alloc - old_size) * sizeof(eword_t));
@@ -102,7 +102,7 @@ struct bitmap *ewah_to_bitmap(struct ewah_bitmap *ewah)
while (ewah_iterator_next(, )) {
if (i >= bitmap->word_alloc) {
bitmap->word_alloc *= 1.5;
-   bitmap->words = ewah_realloc(
+   bitmap->words = xrealloc(
bitmap->words, bitmap->word_alloc * 
sizeof(eword_t));
}
 
@@ -134,7 +134,7 @@ void bitmap_or_ewah(struct bitmap *self, struct ewah_bitmap 
*other)
 
if (self->word_alloc < other_final) {
self->word_alloc = other_final;
-   self->words = ewah_realloc(self->words,
+   self->words = xrealloc(self->words,
self->word_alloc * sizeof(eword_t));
memset(self->words + original_size, 0x0,
(self->word_alloc - original_size) * sizeof(eword_t));
diff --git a/ewah/ewah_bitmap.c b/ewah/ewah_bitmap.c
index b522437..fcd465e 100644
--- a/ewah/ewah_bitmap.c
+++ b/ewah/ewah_bitmap.c
@@ -39,7 +39,7 @@ static inline void buffer_grow(struct ewah_bitmap *self, 
size_t new_size)
return;
 
self->alloc_size = new_size;
-   self->buffer = ewah_realloc(self->buffer,
+   self->buffer = xrealloc(self->buffer,
self->alloc_size * sizeof(eword_t));
self->rlw = self->buffer + (rlw_offset / sizeof(eword_t));
 }
@@ -282,11 +282,8 @@ struct ewah_bitmap *ewah_new(void)
 {
struct ewah_bitmap *self;
 
-   self = ewah_malloc(sizeof(struct ewah_bitmap));
-   if (self == NULL)
-   return NULL;
-
-   self->buffer = ewah_malloc(32 * sizeof(eword_t));
+   self = xmalloc(sizeof(struct ewah_bitmap));
+   self->buffer = xmalloc(32 * sizeof(eword_t));
self->alloc_size = 32;
 
ewah_clear(self);
diff --git a/ewah/ewah_io.c b/ewah/ewah_io.c
index 43481b9..4acff97 100644
--- a/ewah/ewah_io.c
+++ b/ewah/ewah_io.c
@@ -134,12 +134,9 @@ int ewah_read_mmap(struct ewah_bitmap *self, const void 
*map, size_t len)
self->buffer_size = self->alloc_size = get_be32(ptr);
ptr += sizeof(uint32_t);
 
-   self->buffer = ewah_realloc(self->buffer,
+   self->buffer = xrealloc(self->buffer,
self->alloc_size * sizeof(eword_t));
 
-   if (!self->buffer)
-   return -1;
-
/*
 * Copy the raw data for the bitmap as a whole chunk;
 * if we're in a little-endian platform, we'll perform
@@ -180,12 +177,9 @@ int ewah_deserialize(struct ewah_bitmap *self, int fd)
return -1;
 
self->buffer_size = self->alloc_size = (size_t)ntohl(word_count);
-   self->buffer = ewah_realloc(self->buffer,
+   self->buffer = xrealloc(self->buffer,
self->alloc_size * sizeof(eword_t));
 
-   if (!self->buffer)
-   return -1;
-
/** 64 bit x N -- compressed words */
buffer = self->buffer;
words_left = self->buffer_size;
diff --git a/ewah/ewok.h b/ewah/ewok.h
index 6e2c5e1..269a1a8 100644
--- a/ewah/ewok.h
+++ b/ewah/ewok.h
@@ -20,16 +20,6 @@
 #ifndef __EWOK_BITMAP_H__
 #define

[PATCH 21/21] ewah: convert to REALLOC_ARRAY, etc

2016-02-19 Thread Jeff King

Now that we're built around xmalloc and friends, we can use
helpers like REALLOC_ARRAY, ALLOC_GROW, and so on to make
the code shorter and protect against integer overflow.

Signed-off-by: Jeff King 
---
 ewah/bitmap.c  | 16 
 ewah/ewah_bitmap.c |  5 ++---
 ewah/ewah_io.c |  6 ++
 3 files changed, 8 insertions(+), 19 deletions(-)

diff --git a/ewah/bitmap.c b/ewah/bitmap.c
index c88daa0..7103cee 100644
--- a/ewah/bitmap.c
+++ b/ewah/bitmap.c
@@ -17,7 +17,7 @@
  * along with this program; if not, write to the Free Software
  * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301, 
USA.
  */
-#include "git-compat-util.h"
+#include "cache.h"
 #include "ewok.h"
 
 #define EWAH_MASK(x) ((eword_t)1 << (x % BITS_IN_EWORD))
@@ -38,9 +38,7 @@ void bitmap_set(struct bitmap *self, size_t pos)
if (block >= self->word_alloc) {
size_t old_size = self->word_alloc;
self->word_alloc = block * 2;
-   self->words = xrealloc(self->words,
- self->word_alloc * sizeof(eword_t));
-
+   REALLOC_ARRAY(self->words, self->word_alloc);
memset(self->words + old_size, 0x0,
(self->word_alloc - old_size) * sizeof(eword_t));
}
@@ -100,12 +98,7 @@ struct bitmap *ewah_to_bitmap(struct ewah_bitmap *ewah)
ewah_iterator_init(, ewah);
 
while (ewah_iterator_next(, )) {
-   if (i >= bitmap->word_alloc) {
-   bitmap->word_alloc *= 1.5;
-   bitmap->words = xrealloc(
-   bitmap->words, bitmap->word_alloc * 
sizeof(eword_t));
-   }
-
+   ALLOC_GROW(bitmap->words, i + 1, bitmap->word_alloc);
bitmap->words[i++] = blowup;
}
 
@@ -134,8 +127,7 @@ void bitmap_or_ewah(struct bitmap *self, struct ewah_bitmap 
*other)
 
if (self->word_alloc < other_final) {
self->word_alloc = other_final;
-   self->words = xrealloc(self->words,
-   self->word_alloc * sizeof(eword_t));
+   REALLOC_ARRAY(self->words, self->word_alloc);
memset(self->words + original_size, 0x0,
(self->word_alloc - original_size) * sizeof(eword_t));
}
diff --git a/ewah/ewah_bitmap.c b/ewah/ewah_bitmap.c
index fcd465e..2dc9c82 100644
--- a/ewah/ewah_bitmap.c
+++ b/ewah/ewah_bitmap.c
@@ -39,8 +39,7 @@ static inline void buffer_grow(struct ewah_bitmap *self, 
size_t new_size)
return;
 
self->alloc_size = new_size;
-   self->buffer = xrealloc(self->buffer,
-   self->alloc_size * sizeof(eword_t));
+   REALLOC_ARRAY(self->buffer, self->alloc_size);
self->rlw = self->buffer + (rlw_offset / sizeof(eword_t));
 }
 
@@ -283,8 +282,8 @@ struct ewah_bitmap *ewah_new(void)
struct ewah_bitmap *self;
 
self = xmalloc(sizeof(struct ewah_bitmap));
-   self->buffer = xmalloc(32 * sizeof(eword_t));
self->alloc_size = 32;
+   ALLOC_ARRAY(self->buffer, self->alloc_size);
 
ewah_clear(self);
return self;
diff --git a/ewah/ewah_io.c b/ewah/ewah_io.c
index 4acff97..61f6a43 100644
--- a/ewah/ewah_io.c
+++ b/ewah/ewah_io.c
@@ -134,8 +134,7 @@ int ewah_read_mmap(struct ewah_bitmap *self, const void 
*map, size_t len)
self->buffer_size = self->alloc_size = get_be32(ptr);
ptr += sizeof(uint32_t);
 
-   self->buffer = xrealloc(self->buffer,
-   self->alloc_size * sizeof(eword_t));
+   REALLOC_ARRAY(self->buffer, self->alloc_size);
 
/*
 * Copy the raw data for the bitmap as a whole chunk;
@@ -177,8 +176,7 @@ int ewah_deserialize(struct ewah_bitmap *self, int fd)
return -1;
 
self->buffer_size = self->alloc_size = (size_t)ntohl(word_count);
-   self->buffer = xrealloc(self->buffer,
-   self->alloc_size * sizeof(eword_t));
+   REALLOC_ARRAY(self->buffer, self->alloc_size);
 
/** 64 bit x N -- compressed words */
buffer = self->buffer;
-- 
2.7.1.577.gfed91b8
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH 15/21] test-path-utils: fix normalize_path_copy output buffer size

2016-02-19 Thread Jeff King

The normalize_path_copy function needs an output buffer that
is at least as long as its input (it may shrink the path,
but never expand it). However, this test program feeds it
static PATH_MAX-sized buffers, which have no relation to the
input size.

In the normalize_ceiling_entry case, we do at least check
the size against PATH_MAX and die(), but that case is even
more convoluted. We normalize into a fixed-size buffer, free
the original, and then replace it with a strdup'd copy of
the result. But normalize_path_copy explicitly allows
normalizing in-place, so we can simply do that.

Signed-off-by: Jeff King 
---
 test-path-utils.c | 15 ---
 1 file changed, 4 insertions(+), 11 deletions(-)

diff --git a/test-path-utils.c b/test-path-utils.c
index 6232dfe..ba805b3 100644
--- a/test-path-utils.c
+++ b/test-path-utils.c
@@ -8,21 +8,14 @@
  */
 static int normalize_ceiling_entry(struct string_list_item *item, void *unused)
 {
-   const char *ceil = item->string;
-   int len = strlen(ceil);
-   char buf[PATH_MAX+1];
+   char *ceil = item->string;
 
-   if (len == 0)
+   if (!*ceil)
die("Empty path is not supported");
-   if (len > PATH_MAX)
-   die("Path \"%s\" is too long", ceil);
if (!is_absolute_path(ceil))
die("Path \"%s\" is not absolute", ceil);
-   if (normalize_path_copy(buf, ceil) < 0)
+   if (normalize_path_copy(ceil, ceil) < 0)
die("Path \"%s\" could not be normalized", ceil);
-   len = strlen(buf);
-   free(item->string);
-   item->string = xstrdup(buf);
return 1;
 }
 
@@ -166,7 +159,7 @@ static struct test_data dirname_data[] = {
 int main(int argc, char **argv)
 {
if (argc == 3 && !strcmp(argv[1], "normalize_path_copy")) {
-   char *buf = xmalloc(PATH_MAX + 1);
+   char *buf = xmallocz(strlen(argv[2]));
int rv = normalize_path_copy(buf, argv[2]);
if (rv)
buf = "++failed++";
-- 
2.7.1.577.gfed91b8

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH 19/21] diff_populate_gitlink: use a strbuf

2016-02-19 Thread Jeff King

We allocate 100 bytes to hold the "Submodule commit ..."
text. This is enough, but it's not immediately obvious that
this is the case, and we have to repeat the magic 100 twice.

We could get away with xstrfmt here, but we want to know the
size, as well, so let's use a real strbuf. And while we're
here, we can clean up the logic around size_only. It
currently sets and clears the "data" field pointlessly, and
leaves the "should_free" flag on even after we have cleared
the data.

Signed-off-by: Jeff King 
---
 diff.c | 16 
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/diff.c b/diff.c
index 27d14a7..a70ec6e 100644
--- a/diff.c
+++ b/diff.c
@@ -2704,21 +2704,21 @@ static int reuse_worktree_file(const char *name, const 
unsigned char *sha1, int
 
 static int diff_populate_gitlink(struct diff_filespec *s, int size_only)
 {
-   int len;
-   char *data = xmalloc(100), *dirty = "";
+   struct strbuf buf = STRBUF_INIT;
+   char *dirty = "";
 
/* Are we looking at the work tree? */
if (s->dirty_submodule)
dirty = "-dirty";
 
-   len = snprintf(data, 100,
-  "Subproject commit %s%s\n", sha1_to_hex(s->sha1), dirty);
-   s->data = data;
-   s->size = len;
-   s->should_free = 1;
+   strbuf_addf(, "Subproject commit %s%s\n", sha1_to_hex(s->sha1), 
dirty);
+   s->size = buf.len;
if (size_only) {
s->data = NULL;
-   free(data);
+   strbuf_release();
+   } else {
+   s->data = strbuf_detach(, NULL);
+   s->should_free = 1;
}
return 0;
 }
-- 
2.7.1.577.gfed91b8

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH 18/21] transport_anonymize_url: use xstrfmt

2016-02-19 Thread Jeff King

This function uses xcalloc and two memcpy calls to
concatenate two strings. We can do this as an xstrfmt
one-liner, and then it is more clear that we are allocating
the correct amount of memory.

Signed-off-by: Jeff King 
---
 transport.c | 8 +++-
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/transport.c b/transport.c
index 3b4e644..ac857a9 100644
--- a/transport.c
+++ b/transport.c
@@ -1021,7 +1021,7 @@ int transport_disconnect(struct transport *transport)
  */
 char *transport_anonymize_url(const char *url)
 {
-   char *anon_url, *scheme_prefix, *anon_part;
+   char *scheme_prefix, *anon_part;
size_t anon_len, prefix_len = 0;
 
anon_part = strchr(url, '@');
@@ -1055,10 +1055,8 @@ char *transport_anonymize_url(const char *url)
goto literal_copy;
prefix_len = scheme_prefix - url + 3;
}
-   anon_url = xcalloc(1, 1 + prefix_len + anon_len);
-   memcpy(anon_url, url, prefix_len);
-   memcpy(anon_url + prefix_len, anon_part, anon_len);
-   return anon_url;
+   return xstrfmt("%.*s%.*s", (int)prefix_len, url,
+  (int)anon_len, anon_part);
 literal_copy:
return xstrdup(url);
 }
-- 
2.7.1.577.gfed91b8

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH 16/21] sequencer: simplify memory allocation of get_message

2016-02-19 Thread Jeff King

For a commit with sha1 "1234abcd" and subject "foo", this
function produces a struct with three strings:

 1. "foo"

 2. "1234abcd... foo"

 3. "parent of 1234abcd... foo"

It takes advantage of the fact that these strings are
subsets of each other, and allocates only _one_ string, with
pointers into the various parts. Unfortunately, this makes
the string allocation complicated and hard to follow.

Since we keep only one of these in memory at a time, we can
afford to simply allocate three strings. This lets us build
on tools like xstrfmt and avoid manual computation.

While we're here, we can also drop the ad-hoc
reimplementation of get_git_commit_encoding(), and simply
call that function.

Signed-off-by: Jeff King 
---
 sequencer.c | 29 ++---
 1 file changed, 10 insertions(+), 19 deletions(-)

diff --git a/sequencer.c b/sequencer.c
index 8048786..e66f2fe 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -124,42 +124,33 @@ static const char *action_name(const struct replay_opts 
*opts)
 
 struct commit_message {
char *parent_label;
-   const char *label;
-   const char *subject;
+   char *label;
+   char *subject;
const char *message;
 };
 
 static int get_message(struct commit *commit, struct commit_message *out)
 {
const char *abbrev, *subject;
-   int abbrev_len, subject_len;
-   char *q;
-
-   if (!git_commit_encoding)
-   git_commit_encoding = "UTF-8";
+   int subject_len;
 
-   out->message = logmsg_reencode(commit, NULL, git_commit_encoding);
+   out->message = logmsg_reencode(commit, NULL, 
get_commit_output_encoding());
abbrev = find_unique_abbrev(commit->object.oid.hash, DEFAULT_ABBREV);
-   abbrev_len = strlen(abbrev);
 
subject_len = find_commit_subject(out->message, );
 
-   out->parent_label = xmalloc(strlen("parent of ") + abbrev_len +
- strlen("... ") + subject_len + 1);
-   q = out->parent_label;
-   q = mempcpy(q, "parent of ", strlen("parent of "));
-   out->label = q;
-   q = mempcpy(q, abbrev, abbrev_len);
-   q = mempcpy(q, "... ", strlen("... "));
-   out->subject = q;
-   q = mempcpy(q, subject, subject_len);
-   *q = '\0';
+   out->subject = xmemdupz(subject, subject_len);
+   out->label = xstrfmt("%s... %s", abbrev, out->subject);
+   out->parent_label = xstrfmt("parent of %s", out->label);
+
return 0;
 }
 
 static void free_message(struct commit *commit, struct commit_message *msg)
 {
free(msg->parent_label);
+   free(msg->label);
+   free(msg->subject);
unuse_commit_buffer(commit, msg->message);
 }
 
-- 
2.7.1.577.gfed91b8

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH 14/21] fetch-pack: simplify add_sought_entry

2016-02-19 Thread Jeff King

We have two variants of this function, one that takes a
string and one that takes a ptr/len combo. But we only call
the latter with the length of a NUL-terminated string, so
our first simplification is to drop it in favor of the
string variant.

Since we know we have a string, we can also replace the
manual memory computation with a call to alloc_ref().

Furthermore, we can rely on get_oid_hex() to complain if it
hits the end of the string. That means we can simplify the
check for " " versus just "". Rather than
manage the ptr/len pair, we can just bump the start of our
string forward. The original code over-allocated based on
the original "namelen" (which wasn't _wrong_, but was simply
wasteful and confusing).

Signed-off-by: Jeff King 
---
 builtin/fetch-pack.c | 27 +--
 1 file changed, 9 insertions(+), 18 deletions(-)

diff --git a/builtin/fetch-pack.c b/builtin/fetch-pack.c
index 9b2a514..79a611f 100644
--- a/builtin/fetch-pack.c
+++ b/builtin/fetch-pack.c
@@ -10,33 +10,24 @@ static const char fetch_pack_usage[] =
 "[--include-tag] [--upload-pack=] [--depth=] "
 "[--no-progress] [--diag-url] [-v] [:] [...]";
 
-static void add_sought_entry_mem(struct ref ***sought, int *nr, int *alloc,
-const char *name, int namelen)
+static void add_sought_entry(struct ref ***sought, int *nr, int *alloc,
+const char *name)
 {
-   struct ref *ref = xcalloc(1, sizeof(*ref) + namelen + 1);
+   struct ref *ref;
struct object_id oid;
-   const int chunksz = GIT_SHA1_HEXSZ + 1;
 
-   if (namelen > chunksz && name[chunksz - 1] == ' ' &&
-   !get_oid_hex(name, )) {
-   oidcpy(>old_oid, );
-   name += chunksz;
-   namelen -= chunksz;
-   }
+   if (!get_oid_hex(name, ) && name[GIT_SHA1_HEXSZ] == ' ')
+   name += GIT_SHA1_HEXSZ + 1;
+   else
+   oidclr();
 
-   memcpy(ref->name, name, namelen);
-   ref->name[namelen] = '\0';
+   ref = alloc_ref(name);
+   oidcpy(>old_oid, );
(*nr)++;
ALLOC_GROW(*sought, *nr, *alloc);
(*sought)[*nr - 1] = ref;
 }
 
-static void add_sought_entry(struct ref ***sought, int *nr, int *alloc,
-const char *string)
-{
-   add_sought_entry_mem(sought, nr, alloc, string, strlen(string));
-}
-
 int cmd_fetch_pack(int argc, const char **argv, const char *prefix)
 {
int i, ret;
-- 
2.7.1.577.gfed91b8

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH 17/21] git-compat-util: drop mempcpy compat code

2016-02-19 Thread Jeff King

There are no callers of this left, as the last one was
dropped in the previous patch. And there are not likely to
be new ones, as the function has been around since 2010
without gaining any new callers.

Signed-off-by: Jeff King 
---
 git-compat-util.h | 9 -
 1 file changed, 9 deletions(-)

diff --git a/git-compat-util.h b/git-compat-util.h
index e71e615..8f0e76b 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -681,7 +681,6 @@ extern int git_vsnprintf(char *str, size_t maxsize,
 #ifdef __GLIBC_PREREQ
 #if __GLIBC_PREREQ(2, 1)
 #define HAVE_STRCHRNUL
-#define HAVE_MEMPCPY
 #endif
 #endif
 
@@ -695,14 +694,6 @@ static inline char *gitstrchrnul(const char *s, int c)
 }
 #endif
 
-#ifndef HAVE_MEMPCPY
-#define mempcpy gitmempcpy
-static inline void *gitmempcpy(void *dest, const void *src, size_t n)
-{
-   return (char *)memcpy(dest, src, n) + n;
-}
-#endif
-
 #ifdef NO_INET_PTON
 int inet_pton(int af, const char *src, void *dst);
 #endif
-- 
2.7.1.577.gfed91b8

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH 12/21] write_untracked_extension: use FLEX_ALLOC helper

2016-02-19 Thread Jeff King

We perform unchecked additions when computing the size of a
"struct ondisk_untracked_cache". This is unlikely to have an
integer overflow in practice, but we'd like to avoid this
dangerous pattern to make further audits easier.

Note that there's one subtlety here, though.  We protect
ourselves against a NULL exclude_per_dir entry in our
source, and avoid calling strlen() on it, keeping "len" at
0. But later, we unconditionally memcpy "len + 1" bytes to
get the trailing NUL byte. If we did have a NULL
exclude_per_dir, we would read from bogus memory.

As it turns out, though, we always create this field
pointing to a string literal, so there's no bug. We can just
get rid of the pointless extra conditional.

Signed-off-by: Jeff King 
---
 dir.c | 9 -
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/dir.c b/dir.c
index 2c91541..a4a9d9f 100644
--- a/dir.c
+++ b/dir.c
@@ -2360,16 +2360,15 @@ void write_untracked_extension(struct strbuf *out, 
struct untracked_cache *untra
struct ondisk_untracked_cache *ouc;
struct write_data wd;
unsigned char varbuf[16];
-   int len = 0, varint_len;
-   if (untracked->exclude_per_dir)
-   len = strlen(untracked->exclude_per_dir);
-   ouc = xmalloc(sizeof(*ouc) + len + 1);
+   int varint_len;
+   size_t len = strlen(untracked->exclude_per_dir);
+
+   FLEX_ALLOC_MEM(ouc, exclude_per_dir, untracked->exclude_per_dir, len);
stat_data_to_disk(>info_exclude_stat, 
>ss_info_exclude.stat);
stat_data_to_disk(>excludes_file_stat, 
>ss_excludes_file.stat);
hashcpy(ouc->info_exclude_sha1, untracked->ss_info_exclude.sha1);
hashcpy(ouc->excludes_file_sha1, untracked->ss_excludes_file.sha1);
ouc->dir_flags = htonl(untracked->dir_flags);
-   memcpy(ouc->exclude_per_dir, untracked->exclude_per_dir, len + 1);
 
varint_len = encode_varint(untracked->ident.len, varbuf);
strbuf_add(out, varbuf, varint_len);
-- 
2.7.1.577.gfed91b8

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH 13/21] fast-import: simplify allocation in start_packfile

2016-02-19 Thread Jeff King

This function allocate a packed_git flex-array, and adds a
mysterious 2 bytes to the length of the pack_name field. One
is for the trailing NUL, but the other has no purpose. This
is probably cargo-culted from add_packed_git, which gets the
".idx" path and needed to allocate enough space to hold the
matching ".pack" (though since 48bcc1c, we calculate the
size there differently).

This site, however, is using the raw path of a tempfile, and
does not need the extra byte. We can just replace the
allocation with FLEX_ALLOC_STR, which handles the allocation
and the NUL for us.

Signed-off-by: Jeff King 
---
 fast-import.c | 5 +
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/fast-import.c b/fast-import.c
index 3053bb8..9fc7093 100644
--- a/fast-import.c
+++ b/fast-import.c
@@ -865,15 +865,12 @@ static void start_packfile(void)
 {
static char tmp_file[PATH_MAX];
struct packed_git *p;
-   int namelen;
struct pack_header hdr;
int pack_fd;
 
pack_fd = odb_mkstemp(tmp_file, sizeof(tmp_file),
  "pack/tmp_pack_XX");
-   namelen = strlen(tmp_file) + 2;
-   p = xcalloc(1, sizeof(*p) + namelen);
-   xsnprintf(p->pack_name, namelen, "%s", tmp_file);
+   FLEX_ALLOC_STR(p, pack_name, tmp_file);
p->pack_fd = pack_fd;
p->do_not_close = 1;
pack_file = sha1fd(pack_fd, p->pack_name);
-- 
2.7.1.577.gfed91b8

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH 11/21] prepare_{git,shell}_cmd: use argv_array

2016-02-19 Thread Jeff King

These functions transform an existing argv into one suitable
for exec-ing or spawning via git or a shell. We can use an
argv_array in each to avoid dealing with manual counting and
allocation.

This also makes the memory allocation more clear and fixes
some leaks. In prepare_shell_cmd, we would sometimes
allocate a new string with "$@" in it and sometimes not,
meaning the caller could not correctly free it. On the
non-Windows side, we are in a child process which will
exec() or exit() immediately, so the leak isn't a big deal.
On Windows, though, we use spawn() from the parent process,
and leak a string for each shell command we run. On top of
that, the Windows code did not free the allocated argv array
at all (but does for the prepare_git_cmd case!).

By switching both of these functions to write into an
argv_array, we can consistently free the result as
appropriate.

Signed-off-by: Jeff King 
---
Note that I had to touch the Windows run-command code here, but I don't
actually have a platform to test it on.

 exec_cmd.c| 28 +++-
 exec_cmd.h|  4 +++-
 run-command.c | 60 +--
 3 files changed, 39 insertions(+), 53 deletions(-)

diff --git a/exec_cmd.c b/exec_cmd.c
index e85f0fd..cf442a9 100644
--- a/exec_cmd.c
+++ b/exec_cmd.c
@@ -1,6 +1,7 @@
 #include "cache.h"
 #include "exec_cmd.h"
 #include "quote.h"
+#include "argv-array.h"
 #define MAX_ARGS   32
 
 static const char *argv_exec_path;
@@ -107,32 +108,25 @@ void setup_path(void)
strbuf_release(_path);
 }
 
-const char **prepare_git_cmd(const char **argv)
+const char **prepare_git_cmd(struct argv_array *out, const char **argv)
 {
-   int argc;
-   const char **nargv;
-
-   for (argc = 0; argv[argc]; argc++)
-   ; /* just counting */
-   nargv = xmalloc(sizeof(*nargv) * (argc + 2));
-
-   nargv[0] = "git";
-   for (argc = 0; argv[argc]; argc++)
-   nargv[argc + 1] = argv[argc];
-   nargv[argc + 1] = NULL;
-   return nargv;
+   argv_array_push(out, "git");
+   argv_array_pushv(out, argv);
+   return out->argv;
 }
 
 int execv_git_cmd(const char **argv) {
-   const char **nargv = prepare_git_cmd(argv);
-   trace_argv_printf(nargv, "trace: exec:");
+   struct argv_array nargv = ARGV_ARRAY_INIT;
+
+   prepare_git_cmd(, argv);
+   trace_argv_printf(nargv.argv, "trace: exec:");
 
/* execvp() can only ever return if it fails */
-   sane_execvp("git", (char **)nargv);
+   sane_execvp("git", (char **)nargv.argv);
 
trace_printf("trace: exec failed: %s\n", strerror(errno));
 
-   free(nargv);
+   argv_array_clear();
return -1;
 }
 
diff --git a/exec_cmd.h b/exec_cmd.h
index 93b0c02..1f6b433 100644
--- a/exec_cmd.h
+++ b/exec_cmd.h
@@ -1,11 +1,13 @@
 #ifndef GIT_EXEC_CMD_H
 #define GIT_EXEC_CMD_H
 
+struct argv_array;
+
 extern void git_set_argv_exec_path(const char *exec_path);
 extern const char *git_extract_argv0_path(const char *path);
 extern const char *git_exec_path(void);
 extern void setup_path(void);
-extern const char **prepare_git_cmd(const char **argv);
+extern const char **prepare_git_cmd(struct argv_array *out, const char **argv);
 extern int execv_git_cmd(const char **argv); /* NULL terminated */
 LAST_ARG_MUST_BE_NULL
 extern int execl_git_cmd(const char *cmd, ...);
diff --git a/run-command.c b/run-command.c
index cdf0184..019f6d1 100644
--- a/run-command.c
+++ b/run-command.c
@@ -160,50 +160,41 @@ int sane_execvp(const char *file, char * const argv[])
return -1;
 }
 
-static const char **prepare_shell_cmd(const char **argv)
+static const char **prepare_shell_cmd(struct argv_array *out, const char 
**argv)
 {
-   int argc, nargc = 0;
-   const char **nargv;
-
-   for (argc = 0; argv[argc]; argc++)
-   ; /* just counting */
-   /* +1 for NULL, +3 for "sh -c" plus extra $0 */
-   nargv = xmalloc(sizeof(*nargv) * (argc + 1 + 3));
-
-   if (argc < 1)
+   if (!argv[0])
die("BUG: shell command is empty");
 
if (strcspn(argv[0], "|&;<>()$`\\\"' \t\n*?[#~=%") != strlen(argv[0])) {
 #ifndef GIT_WINDOWS_NATIVE
-   nargv[nargc++] = SHELL_PATH;
+   argv_array_push(out, SHELL_PATH);
 #else
-   nargv[nargc++] = "sh";
+   argv_array_push(out, "sh");
 #endif
-   nargv[nargc++] = "-c";
-
-   if (argc < 2)
-   nargv[nargc++] = argv[0];
-   else {
-   struct strbuf arg0 = STRBUF_INIT;
-   strbuf_addf(, "%s \"$@\"", argv[0]);
-   nargv[nargc++] = strbuf_detach(, NULL);
-   }
-   }
+   argv_array_push(out, "-c");
 
-   for (argc = 0; argv[argc]; argc++)
-   nargv[nargc++] = argv[argc];
-   nargv[nargc] = NULL;
+   /*
+* If we

[PATCH 08/21] use xmallocz to avoid size arithmetic

2016-02-19 Thread Jeff King

We frequently allocate strings as xmalloc(len + 1), where
the extra 1 is for the NUL terminator. This can be done more
simply with xmallocz, which also checks for integer
overflow.

There's no case where switching xmalloc(n+1) to xmallocz(n)
is wrong; the result is the same length, and malloc made no
guarantees about what was in the buffer anyway. But in some
cases, we can stop manually placing NUL at the end of the
allocated buffer. But that's only safe if it's clear that
the contents will always fill the buffer.

In each case where this patch does so, I manually examined
the control flow, and I tried to err on the side of caution.

Signed-off-by: Jeff King 
---
 builtin/check-ref-format.c | 2 +-
 builtin/merge-tree.c   | 2 +-
 builtin/worktree.c | 2 +-
 column.c   | 3 +--
 combine-diff.c | 4 +---
 config.c   | 4 +---
 dir.c  | 2 +-
 entry.c| 2 +-
 grep.c | 3 +--
 imap-send.c| 5 ++---
 ll-merge.c | 2 +-
 progress.c | 2 +-
 refs.c | 2 +-
 setup.c| 5 ++---
 strbuf.c   | 2 +-
 15 files changed, 17 insertions(+), 25 deletions(-)

diff --git a/builtin/check-ref-format.c b/builtin/check-ref-format.c
index fd915d5..eac4994 100644
--- a/builtin/check-ref-format.c
+++ b/builtin/check-ref-format.c
@@ -20,7 +20,7 @@ static const char builtin_check_ref_format_usage[] =
  */
 static char *collapse_slashes(const char *refname)
 {
-   char *ret = xmalloc(strlen(refname) + 1);
+   char *ret = xmallocz(strlen(refname));
char ch;
char prev = '/';
char *cp = ret;
diff --git a/builtin/merge-tree.c b/builtin/merge-tree.c
index d4f0cbd..ca57004 100644
--- a/builtin/merge-tree.c
+++ b/builtin/merge-tree.c
@@ -174,7 +174,7 @@ static struct merge_list *create_entry(unsigned stage, 
unsigned mode, const unsi
 
 static char *traverse_path(const struct traverse_info *info, const struct 
name_entry *n)
 {
-   char *path = xmalloc(traverse_path_len(info, n) + 1);
+   char *path = xmallocz(traverse_path_len(info, n));
return make_traverse_path(path, info, n);
 }
 
diff --git a/builtin/worktree.c b/builtin/worktree.c
index 475b958..0a45710 100644
--- a/builtin/worktree.c
+++ b/builtin/worktree.c
@@ -52,7 +52,7 @@ static int prune_worktree(const char *id, struct strbuf 
*reason)
return 1;
}
len = st.st_size;
-   path = xmalloc(len + 1);
+   path = xmallocz(len);
read_in_full(fd, path, len);
close(fd);
while (len && (path[len - 1] == '\n' || path[len - 1] == '\r'))
diff --git a/column.c b/column.c
index f9fda68..d55ead1 100644
--- a/column.c
+++ b/column.c
@@ -173,9 +173,8 @@ static void display_table(const struct string_list *list,
if (colopts & COL_DENSE)
shrink_columns();
 
-   empty_cell = xmalloc(initial_width + 1);
+   empty_cell = xmallocz(initial_width);
memset(empty_cell, ' ', initial_width);
-   empty_cell[initial_width] = '\0';
for (y = 0; y < data.rows; y++) {
for (x = 0; x < data.cols; x++)
if (display_cell(, initial_width, empty_cell, x, 
y))
diff --git a/combine-diff.c b/combine-diff.c
index a698016..890c415 100644
--- a/combine-diff.c
+++ b/combine-diff.c
@@ -1043,7 +1043,7 @@ static void show_patch_diff(struct combine_diff_path 
*elem, int num_parent,
elem->mode = canon_mode(S_IFLNK);
 
result_size = len;
-   result = xmalloc(len + 1);
+   result = xmallocz(len);
 
done = read_in_full(fd, result, len);
if (done < 0)
@@ -1051,8 +1051,6 @@ static void show_patch_diff(struct combine_diff_path 
*elem, int num_parent,
else if (done < len)
die("early EOF '%s'", elem->path);
 
-   result[len] = 0;
-
/* If not a fake symlink, apply filters, e.g. autocrlf 
*/
if (is_file) {
struct strbuf buf = STRBUF_INIT;
diff --git a/config.c b/config.c
index b95ac3a..e7b589a 100644
--- a/config.c
+++ b/config.c
@@ -1902,7 +1902,7 @@ static int git_config_parse_key_1(const char *key, char 
**store_key, int *basele
 * Validate the key and while at it, lower case it for matching.
 */
if (store_key)
-   *store_key = xmalloc(strlen(key) + 1);
+   *store_key = xmallocz(strlen(key));
 
dot = 0;
for (i = 0; key[i]; i++) {
@@ -1926,8 +1926,6 @@ static int git_config_parse_key_1(const char *key, char 
**store_key, int *basele
if (store_key)
(*store_key)[i] = c;
}
-   if (store_key)
-   (*store_key)[i] =

[PATCH 10/21] use st_add and st_mult for allocation size computation

2016-02-19 Thread Jeff King

If our size computation overflows size_t, we may allocate a
much smaller buffer than we expected and overflow it. It's
probably impossible to trigger an overflow in most of these
sites in practice, but it is easy enough convert their
additions and multiplications into overflow-checking
variants. This may be fixing real bugs, and it makes
auditing the code easier.

Signed-off-by: Jeff King 
---
 archive.c  |  4 ++--
 builtin/apply.c|  2 +-
 builtin/clean.c|  2 +-
 builtin/fetch.c|  2 +-
 builtin/index-pack.c   |  4 ++--
 builtin/merge.c|  2 +-
 builtin/mv.c   |  4 ++--
 builtin/receive-pack.c |  2 +-
 combine-diff.c | 14 +++---
 commit.c   |  2 +-
 compat/mingw.c |  4 ++--
 compat/qsort.c |  2 +-
 compat/setenv.c|  2 +-
 compat/win32/syslog.c  |  4 ++--
 diffcore-delta.c   |  6 --
 diffcore-rename.c  |  2 +-
 dir.c  |  4 ++--
 fast-import.c  |  2 +-
 refs.c |  2 +-
 remote.c   |  8 
 revision.c |  2 +-
 sha1_file.c| 20 +++-
 sha1_name.c|  5 ++---
 shallow.c  |  2 +-
 submodule.c|  6 +++---
 25 files changed, 56 insertions(+), 53 deletions(-)

diff --git a/archive.c b/archive.c
index 0687afa..5d735ae 100644
--- a/archive.c
+++ b/archive.c
@@ -171,8 +171,8 @@ static void queue_directory(const unsigned char *sha1,
unsigned mode, int stage, struct archiver_context *c)
 {
struct directory *d;
-   size_t len = base->len + 1 + strlen(filename) + 1;
-   d = xmalloc(sizeof(*d) + len);
+   size_t len = st_add4(base->len, 1, strlen(filename), 1);
+   d = xmalloc(st_add(sizeof(*d), len));
d->up  = c->bottom;
d->baselen = base->len;
d->mode= mode;
diff --git a/builtin/apply.c b/builtin/apply.c
index d61ac65..42c610e 100644
--- a/builtin/apply.c
+++ b/builtin/apply.c
@@ -2632,7 +2632,7 @@ static void update_image(struct image *img,
insert_count = postimage->len;
 
/* Adjust the contents */
-   result = xmalloc(img->len + insert_count - remove_count + 1);
+   result = xmalloc(st_add3(st_sub(img->len, remove_count), insert_count, 
1));
memcpy(result, img->buf, applied_at);
memcpy(result + applied_at, postimage->buf, postimage->len);
memcpy(result + applied_at + postimage->len,
diff --git a/builtin/clean.c b/builtin/clean.c
index 8229f7e..0371010 100644
--- a/builtin/clean.c
+++ b/builtin/clean.c
@@ -615,7 +615,7 @@ static int *list_and_choose(struct menu_opts *opts, struct 
menu_stuff *stuff)
nr += chosen[i];
}
 
-   result = xcalloc(nr + 1, sizeof(int));
+   result = xcalloc(st_add(nr, 1), sizeof(int));
for (i = 0; i < stuff->nr && j < nr; i++) {
if (chosen[i])
result[j++] = i;
diff --git a/builtin/fetch.c b/builtin/fetch.c
index 8e74213..373a89d 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -1110,7 +1110,7 @@ static int fetch_one(struct remote *remote, int argc, 
const char **argv)
if (argc > 0) {
int j = 0;
int i;
-   refs = xcalloc(argc + 1, sizeof(const char *));
+   refs = xcalloc(st_add(argc, 1), sizeof(const char *));
for (i = 0; i < argc; i++) {
if (!strcmp(argv[i], "tag")) {
i++;
diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index a60bcfa..193908a 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -1744,9 +1744,9 @@ int cmd_index_pack(int argc, const char **argv, const 
char *prefix)
 
curr_pack = open_pack_file(pack_name);
parse_pack_header();
-   objects = xcalloc(nr_objects + 1, sizeof(struct object_entry));
+   objects = xcalloc(st_add(nr_objects, 1), sizeof(struct object_entry));
if (show_stat)
-   obj_stat = xcalloc(nr_objects + 1, sizeof(struct object_stat));
+   obj_stat = xcalloc(st_add(nr_objects, 1), sizeof(struct 
object_stat));
ofs_deltas = xcalloc(nr_objects, sizeof(struct ofs_delta_entry));
parse_pack_objects(pack_sha1);
resolve_deltas();
diff --git a/builtin/merge.c b/builtin/merge.c
index b98a348..101ffef 100644
--- a/builtin/merge.c
+++ b/builtin/merge.c
@@ -939,7 +939,7 @@ static int setup_with_upstream(const char ***argv)
if (!branch->merge_nr)
die(_("No default upstream defined for the current branch."));
 
-   args = xcalloc(branch->merge_nr + 1, sizeof(char *));
+   args = xcalloc(st_add(branch->merge_nr, 1), sizeof(char *));
for (i = 0; i < branch->merge_nr; i++) {
if (!branch->merge[i]->dst)
die(_("No remote-tracking branch for %s from %s"),

[PATCH 09/21] convert trivial cases to FLEX_ARRAY macros

2016-02-19 Thread Jeff King

Using FLEX_ARRAY macros reduces the amount of manual
computation size we have to do. It also ensures we don't
overflow size_t, and it makes sure we write the same number
of bytes that we allocated.

Signed-off-by: Jeff King 
---
 attr.c   |  4 +---
 builtin/blame.c  |  4 +---
 builtin/help.c   |  9 +++--
 builtin/mktree.c |  9 +
 builtin/reflog.c |  7 ++-
 cache-tree.c |  4 +---
 combine-diff.c   |  4 +---
 diff.c   |  7 ++-
 dir.c| 16 +++-
 hashmap.c|  3 +--
 help.c   |  6 ++
 log-tree.c   |  5 ++---
 name-hash.c  |  3 +--
 ref-filter.c |  6 ++
 refs.c   |  6 ++
 refs/files-backend.c | 19 +--
 remote.c |  5 +
 17 files changed, 35 insertions(+), 82 deletions(-)

diff --git a/attr.c b/attr.c
index c83ec49..6537a43 100644
--- a/attr.c
+++ b/attr.c
@@ -93,9 +93,7 @@ static struct git_attr *git_attr_internal(const char *name, 
int len)
if (invalid_attr_name(name, len))
return NULL;
 
-   a = xmalloc(sizeof(*a) + len + 1);
-   memcpy(a->name, name, len);
-   a->name[len] = 0;
+   FLEX_ALLOC_MEM(a, name, name, len);
a->h = hval;
a->next = git_attr_hash[pos];
a->attr_nr = attr_nr++;
diff --git a/builtin/blame.c b/builtin/blame.c
index b4ed462..e982fb8 100644
--- a/builtin/blame.c
+++ b/builtin/blame.c
@@ -466,13 +466,11 @@ static void queue_blames(struct scoreboard *sb, struct 
origin *porigin,
 static struct origin *make_origin(struct commit *commit, const char *path)
 {
struct origin *o;
-   size_t pathlen = strlen(path) + 1;
-   o = xcalloc(1, sizeof(*o) + pathlen);
+   FLEX_ALLOC_STR(o, path, path);
o->commit = commit;
o->refcnt = 1;
o->next = commit->util;
commit->util = o;
-   memcpy(o->path, path, pathlen); /* includes NUL */
return o;
 }
 
diff --git a/builtin/help.c b/builtin/help.c
index 1cd0c1e..3c55ce4 100644
--- a/builtin/help.c
+++ b/builtin/help.c
@@ -171,12 +171,10 @@ static void exec_man_cmd(const char *cmd, const char 
*page)
 static void add_man_viewer(const char *name)
 {
struct man_viewer_list **p = _viewer_list;
-   size_t len = strlen(name);
 
while (*p)
p = &((*p)->next);
-   *p = xcalloc(1, (sizeof(**p) + len + 1));
-   memcpy((*p)->name, name, len); /* NUL-terminated by xcalloc */
+   FLEX_ALLOC_STR(*p, name, name);
 }
 
 static int supported_man_viewer(const char *name, size_t len)
@@ -190,9 +188,8 @@ static void do_add_man_viewer_info(const char *name,
   size_t len,
   const char *value)
 {
-   struct man_viewer_info_list *new = xcalloc(1, sizeof(*new) + len + 1);
-
-   memcpy(new->name, name, len); /* NUL-terminated by xcalloc */
+   struct man_viewer_info_list *new;
+   FLEX_ALLOC_MEM(new, name, name, len);
new->info = xstrdup(value);
new->next = man_viewer_info_list;
man_viewer_info_list = new;
diff --git a/builtin/mktree.c b/builtin/mktree.c
index a237caa..4282b62 100644
--- a/builtin/mktree.c
+++ b/builtin/mktree.c
@@ -19,16 +19,17 @@ static int alloc, used;
 static void append_to_tree(unsigned mode, unsigned char *sha1, char *path)
 {
struct treeent *ent;
-   int len = strlen(path);
+   size_t len = strlen(path);
if (strchr(path, '/'))
die("path %s contains slash", path);
 
-   ALLOC_GROW(entries, used + 1, alloc);
-   ent = entries[used++] = xmalloc(sizeof(**entries) + len + 1);
+   FLEX_ALLOC_MEM(ent, name, path, len);
ent->mode = mode;
ent->len = len;
hashcpy(ent->sha1, sha1);
-   memcpy(ent->name, path, len+1);
+
+   ALLOC_GROW(entries, used + 1, alloc);
+   entries[used++] = ent;
 }
 
 static int ent_compare(const void *a_, const void *b_)
diff --git a/builtin/reflog.c b/builtin/reflog.c
index 9980731..2d46b64 100644
--- a/builtin/reflog.c
+++ b/builtin/reflog.c
@@ -382,11 +382,9 @@ static int collect_reflog(const char *ref, const struct 
object_id *oid, int unus
 {
struct collected_reflog *e;
struct collect_reflog_cb *cb = cb_data;
-   size_t namelen = strlen(ref);
 
-   e = xmalloc(sizeof(*e) + namelen + 1);
+   FLEX_ALLOC_STR(e, reflog, ref);
hashcpy(e->sha1, oid->hash);
-   memcpy(e->reflog, ref, namelen + 1);
ALLOC_GROW(cb->e, cb->nr + 1, cb->alloc);
cb->e[cb->nr++] = e;
return 0;
@@ -411,8 +409,7 @@ static struct reflog_expire_cfg *find_cfg_ent(const char 
*pattern, size_t len)
ent->pattern[len] == '\0')
return ent;
 
-   ent = xcalloc(1, sizeof(*ent) + len + 1);
-   memcpy(ent->pattern, pattern, len);
+   FLEX_ALLOC_MEM(ent, pattern, pattern, len);

[PATCH 07/21] convert trivial cases to ALLOC_ARRAY

2016-02-19 Thread Jeff King

Each of these cases can be converted to use ALLOC_ARRAY or
REALLOC_ARRAY, which has two advantages:

  1. It automatically checks the array-size multiplication
 for overflow.

  2. It always uses sizeof(*array) for the element-size,
 so that it can never go out of sync with the declared
 type of the array.

Signed-off-by: Jeff King 
---
 alias.c  |  2 +-
 attr.c   |  2 +-
 bisect.c |  4 ++--
 builtin/blame.c  |  3 ++-
 builtin/clean.c  |  2 +-
 builtin/fast-export.c|  2 +-
 builtin/index-pack.c |  4 ++--
 builtin/merge-base.c |  2 +-
 builtin/mv.c |  3 ++-
 builtin/pack-objects.c   |  7 ---
 builtin/pack-redundant.c |  2 +-
 builtin/receive-pack.c   |  5 ++---
 column.c |  2 +-
 combine-diff.c   |  4 ++--
 commit.c |  2 +-
 compat/mingw.c   |  6 +++---
 diffcore-order.c |  4 ++--
 dir.c|  6 +++---
 fast-import.c|  5 +++--
 fsck.c   |  3 ++-
 graph.c  | 10 --
 khash.h  |  2 +-
 levenshtein.c|  8 +---
 line-log.c   |  8 
 notes.c  |  2 +-
 pack-check.c |  2 +-
 pack-revindex.c  | 12 
 pathspec.c   |  5 +++--
 remote-curl.c|  3 ++-
 sha1_file.c  |  4 ++--
 shallow.c|  6 +++---
 show-index.c |  3 ++-
 transport.c  |  2 +-
 xdiff-interface.c|  2 +-
 34 files changed, 75 insertions(+), 64 deletions(-)

diff --git a/alias.c b/alias.c
index a11229d..3b90397 100644
--- a/alias.c
+++ b/alias.c
@@ -23,7 +23,7 @@ int split_cmdline(char *cmdline, const char ***argv)
int src, dst, count = 0, size = 16;
char quoted = 0;
 
-   *argv = xmalloc(sizeof(**argv) * size);
+   ALLOC_ARRAY(*argv, size);
 
/* split alias_string */
(*argv)[count++] = cmdline;
diff --git a/attr.c b/attr.c
index 086c08d..c83ec49 100644
--- a/attr.c
+++ b/attr.c
@@ -799,7 +799,7 @@ int git_all_attrs(const char *path, int *num, struct 
git_attr_check **check)
++count;
}
*num = count;
-   *check = xmalloc(sizeof(**check) * count);
+   ALLOC_ARRAY(*check, count);
j = 0;
for (i = 0; i < attr_nr; i++) {
const char *value = check_all_attr[i].value;
diff --git a/bisect.c b/bisect.c
index 06ec54e..7996c29 100644
--- a/bisect.c
+++ b/bisect.c
@@ -708,10 +708,10 @@ static struct commit *get_commit_reference(const unsigned 
char *sha1)
 
 static struct commit **get_bad_and_good_commits(int *rev_nr)
 {
-   int len = 1 + good_revs.nr;
-   struct commit **rev = xmalloc(len * sizeof(*rev));
+   struct commit **rev;
int i, n = 0;
 
+   ALLOC_ARRAY(rev, 1 + good_revs.nr);
rev[n++] = get_commit_reference(current_bad_oid->hash);
for (i = 0; i < good_revs.nr; i++)
rev[n++] = get_commit_reference(good_revs.sha1[i]);
diff --git a/builtin/blame.c b/builtin/blame.c
index 55bf5fa..b4ed462 100644
--- a/builtin/blame.c
+++ b/builtin/blame.c
@@ -2059,7 +2059,8 @@ static int prepare_lines(struct scoreboard *sb)
for (p = buf; p < end; p = get_next_line(p, end))
num++;
 
-   sb->lineno = lineno = xmalloc(sizeof(*sb->lineno) * (num + 1));
+   ALLOC_ARRAY(sb->lineno, num + 1);
+   lineno = sb->lineno;
 
for (p = buf; p < end; p = get_next_line(p, end))
*lineno++ = p - buf;
diff --git a/builtin/clean.c b/builtin/clean.c
index 7b08237..8229f7e 100644
--- a/builtin/clean.c
+++ b/builtin/clean.c
@@ -543,7 +543,7 @@ static int *list_and_choose(struct menu_opts *opts, struct 
menu_stuff *stuff)
int eof = 0;
int i;
 
-   chosen = xmalloc(sizeof(int) * stuff->nr);
+   ALLOC_ARRAY(chosen, stuff->nr);
/* set chosen as uninitialized */
for (i = 0; i < stuff->nr; i++)
chosen[i] = -1;
diff --git a/builtin/fast-export.c b/builtin/fast-export.c
index 2471297..8164b58 100644
--- a/builtin/fast-export.c
+++ b/builtin/fast-export.c
@@ -1021,7 +1021,7 @@ int cmd_fast_export(int argc, const char **argv, const 
char *prefix)
const char **refspecs_str;
int i;
 
-   refspecs_str = xmalloc(sizeof(*refspecs_str) * 
refspecs_list.nr);
+   ALLOC_ARRAY(refspecs_str, refspecs_list.nr);
for (i = 0; i < refspecs_list.nr; i++)
refspecs_str[i] = refspecs_list.items[i].string;
 
diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index 6a01509..a60bcfa 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -1346,7 +1346,7 @@ static void fix_unresolved_deltas(struct sha1file *f)
 * before deltas depending on them, a good heuristic is to start
 * resolving deltas in the same order as

[PATCH 06/21] convert manual allocations to argv_array

2016-02-19 Thread Jeff King

There are many manual argv allocations that predate the
argv_array API. Switching to that API brings a few
advantages:

  1. We no longer have to manually compute the correct final
 array size (so it's one less thing we can screw up).

  2. In many cases we had to make a separate pass to count,
 then allocate, then fill in the array. Now we can do it
 in one pass, making the code shorter and easier to
 follow.

  3. argv_array handles memory ownership for us, making it
 more obvious when things should be free()d and and when
 not.

Most of these cases are pretty straightforward. In some, we
switch from "run_command_v" to "run_command" which lets us
directly use the argv_array embedded in "struct
child_process".

Signed-off-by: Jeff King 
---
 builtin/grep.c | 10 +-
 builtin/receive-pack.c | 12 +++-
 builtin/remote-ext.c   | 26 +-
 daemon.c   | 12 +---
 git.c  | 14 +-
 line-log.c | 26 ++
 remote-curl.c  | 23 +++
 7 files changed, 44 insertions(+), 79 deletions(-)

diff --git a/builtin/grep.c b/builtin/grep.c
index 8c516a9..aa7435f 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -365,17 +365,17 @@ static void append_path(struct grep_opt *opt, const void 
*data, size_t len)
 static void run_pager(struct grep_opt *opt, const char *prefix)
 {
struct string_list *path_list = opt->output_priv;
-   const char **argv = xmalloc(sizeof(const char *) * (path_list->nr + 1));
+   struct child_process child = CHILD_PROCESS_INIT;
int i, status;
 
for (i = 0; i < path_list->nr; i++)
-   argv[i] = path_list->items[i].string;
-   argv[path_list->nr] = NULL;
+   argv_array_push(, path_list->items[i].string);
+   child.dir = prefix;
+   child.use_shell = 1;
 
-   status = run_command_v_opt_cd_env(argv, RUN_USING_SHELL, prefix, NULL);
+   status = run_command();
if (status)
exit(status);
-   free(argv);
 }
 
 static int grep_cache(struct grep_opt *opt, const struct pathspec *pathspec, 
int cached)
diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index f2d6761..932afab 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -1031,7 +1031,6 @@ static void run_update_post_hook(struct command *commands)
 {
struct command *cmd;
int argc;
-   const char **argv;
struct child_process proc = CHILD_PROCESS_INIT;
const char *hook;
 
@@ -1044,21 +1043,16 @@ static void run_update_post_hook(struct command 
*commands)
if (!argc || !hook)
return;
 
-   argv = xmalloc(sizeof(*argv) * (2 + argc));
-   argv[0] = hook;
-
-   for (argc = 1, cmd = commands; cmd; cmd = cmd->next) {
+   argv_array_push(, hook);
+   for (cmd = commands; cmd; cmd = cmd->next) {
if (cmd->error_string || cmd->did_not_exist)
continue;
-   argv[argc] = xstrdup(cmd->ref_name);
-   argc++;
+   argv_array_push(, cmd->ref_name);
}
-   argv[argc] = NULL;
 
proc.no_stdin = 1;
proc.stdout_to_stderr = 1;
proc.err = use_sideband ? -1 : 0;
-   proc.argv = argv;
 
if (!start_command()) {
if (use_sideband)
diff --git a/builtin/remote-ext.c b/builtin/remote-ext.c
index e3cd25d..7457c74 100644
--- a/builtin/remote-ext.c
+++ b/builtin/remote-ext.c
@@ -114,30 +114,14 @@ static char *strip_escapes(const char *str, const char 
*service,
}
 }
 
-/* Should be enough... */
-#define MAXARGUMENTS 256
-
-static const char **parse_argv(const char *arg, const char *service)
+static void parse_argv(struct argv_array *out, const char *arg, const char 
*service)
 {
-   int arguments = 0;
-   int i;
-   const char **ret;
-   char *temparray[MAXARGUMENTS + 1];
-
while (*arg) {
-   char *expanded;
-   if (arguments == MAXARGUMENTS)
-   die("remote-ext command has too many arguments");
-   expanded = strip_escapes(arg, service, );
+   char *expanded = strip_escapes(arg, service, );
if (expanded)
-   temparray[arguments++] = expanded;
+   argv_array_push(out, expanded);
+   free(expanded);
}
-
-   ret = xmalloc((arguments + 1) * sizeof(char *));
-   for (i = 0; i < arguments; i++)
-   ret[i] = temparray[i];
-   ret[arguments] = NULL;
-   return ret;
 }
 
 static void send_git_request(int stdin_fd, const char *serv, const char *repo,
@@ -158,7 +142,7 @@ static int run_child(const char *arg, const char *service)
child.in = -1;
child.out = -1;
child.err = 0;
-   child.argv = parse_argv(arg, service);
+   parse_argv(, arg, service);

[PATCH 03/21] tree-diff: catch integer overflow in combine_diff_path allocation

2016-02-19 Thread Jeff King

A combine_diff_path struct has two "flex" members allocated
alongside the struct: a string to hold the pathname, and an
array of parent pointers. We use an "int" to compute this,
meaning we may easily overflow it if the pathname is
extremely long.

We can fix this by using size_t, and checking for overflow
with the st_add helper.

Signed-off-by: Jeff King 
---
 diff.h  | 4 ++--
 tree-diff.c | 4 ++--
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/diff.h b/diff.h
index 70b2d70..beafbbd 100644
--- a/diff.h
+++ b/diff.h
@@ -222,8 +222,8 @@ struct combine_diff_path {
} parent[FLEX_ARRAY];
 };
 #define combine_diff_path_size(n, l) \
-   (sizeof(struct combine_diff_path) + \
-sizeof(struct combine_diff_parent) * (n) + (l) + 1)
+   st_add4(sizeof(struct combine_diff_path), (l), 1, \
+   st_mult(sizeof(struct combine_diff_parent), (n)))
 
 extern void show_combined_diff(struct combine_diff_path *elem, int num_parent,
  int dense, struct rev_info *);
diff --git a/tree-diff.c b/tree-diff.c
index 290a1da..4dda9a1 100644
--- a/tree-diff.c
+++ b/tree-diff.c
@@ -124,8 +124,8 @@ static struct combine_diff_path *path_appendnew(struct 
combine_diff_path *last,
unsigned mode, const unsigned char *sha1)
 {
struct combine_diff_path *p;
-   int len = base->len + pathlen;
-   int alloclen = combine_diff_path_size(nparent, len);
+   size_t len = st_add(base->len, pathlen);
+   size_t alloclen = combine_diff_path_size(nparent, len);
 
/* if last->next is !NULL - it is a pre-allocated memory, we can reuse 
*/
p = last->next;
-- 
2.7.1.577.gfed91b8

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH 04/21] harden REALLOC_ARRAY and xcalloc against size_t overflow

2016-02-19 Thread Jeff King

REALLOC_ARRAY inherently involves a multiplication which can
overflow size_t, resulting in a much smaller buffer than we
think we've allocated. We can easily harden it by using
st_mult() to check for overflow.  Likewise, we can add
ALLOC_ARRAY to do the same thing for xmalloc calls.

xcalloc() should already be fine, because it takes the two
factors separately, assuming the system calloc actually
checks for overflow. However, before we even hit the system
calloc(), we do our memory_limit_check, which involves a
multiplication. Let's check for overflow ourselves so that
this limit cannot be bypassed.

Signed-off-by: Jeff King 
---
 git-compat-util.h | 3 ++-
 wrapper.c | 3 +++
 2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/git-compat-util.h b/git-compat-util.h
index 0c65033..55c073d 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -779,7 +779,8 @@ extern int odb_pack_keep(char *name, size_t namesz, const 
unsigned char *sha1);
 extern char *xgetcwd(void);
 extern FILE *fopen_for_writing(const char *path);
 
-#define REALLOC_ARRAY(x, alloc) (x) = xrealloc((x), (alloc) * sizeof(*(x)))
+#define ALLOC_ARRAY(x, alloc) (x) = xmalloc(st_mult((alloc), sizeof(*(x
+#define REALLOC_ARRAY(x, alloc) (x) = xrealloc((x), st_mult((alloc), 
sizeof(*(x
 
 static inline char *xstrdup_or_null(const char *str)
 {
diff --git a/wrapper.c b/wrapper.c
index 29a45d2..9afc1a0 100644
--- a/wrapper.c
+++ b/wrapper.c
@@ -152,6 +152,9 @@ void *xcalloc(size_t nmemb, size_t size)
 {
void *ret;
 
+   if (unsigned_mult_overflows(nmemb, size))
+   die("data too large to fit into virtual memory space");
+
memory_limit_check(size * nmemb, 0);
ret = calloc(nmemb, size);
if (!ret && (!nmemb || !size))
-- 
2.7.1.577.gfed91b8

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH 05/21] add helpers for allocating flex-array structs

2016-02-19 Thread Jeff King

Allocating a struct with a flex array is pretty simple in
practice: you over-allocate the struct, then copy some data
into the over-allocation. But it can be a slight pain to
make sure you're allocating and copying the right amounts.

This patch adds a few helpers to turn simple cases of
flex-array struct allocation into a one-liner that properly
checks for overflow. See the embedded documentation for
details.

Ideally we could provide a more flexible version that could
handle multiple strings, like:

  FLEX_ALLOC_FMT(ref, name, "%s%s", prefix, name);

But we have to implement this as a macro (because of the
offset calculation of the flex member), which means we would
need all compilers to support variadic macros.

Signed-off-by: Jeff King 
---
 git-compat-util.h | 62 +++
 1 file changed, 62 insertions(+)

diff --git a/git-compat-util.h b/git-compat-util.h
index 55c073d..e71e615 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -782,6 +782,68 @@ extern FILE *fopen_for_writing(const char *path);
 #define ALLOC_ARRAY(x, alloc) (x) = xmalloc(st_mult((alloc), sizeof(*(x
 #define REALLOC_ARRAY(x, alloc) (x) = xrealloc((x), st_mult((alloc), 
sizeof(*(x
 
+/*
+ * These functions help you allocate structs with flex arrays, and copy
+ * the data directly into the array. For example, if you had:
+ *
+ *   struct foo {
+ * int bar;
+ * char name[FLEX_ARRAY];
+ *   };
+ *
+ * you can do:
+ *
+ *   struct foo *f;
+ *   FLEX_ALLOC_MEM(f, name, src, len);
+ *
+ * to allocate a "foo" with the contents of "src" in the "name" field.
+ * The resulting struct is automatically zero'd, and the flex-array field
+ * is NUL-terminated (whether the incoming src buffer was or not).
+ *
+ * The FLEXPTR_* variants operate on structs that don't use flex-arrays,
+ * but do want to store a pointer to some extra data in the same allocated
+ * block. For example, if you have:
+ *
+ *   struct foo {
+ * char *name;
+ * int bar;
+ *   };
+ *
+ * you can do:
+ *
+ *   struct foo *f;
+ *   FLEX_ALLOC_STR(f, name, src);
+ *
+ * and "name" will point to a block of memory after the struct, which will be
+ * freed along with the struct (but the pointer can be repointed anywhere).
+ *
+ * The *_STR variants accept a string parameter rather than a ptr/len
+ * combination.
+ *
+ * Note that these macros will evaluate the first parameter multiple
+ * times, and it must be assignable as an lvalue.
+ */
+#define FLEX_ALLOC_MEM(x, flexname, buf, len) do { \
+   (x) = NULL; /* silence -Wuninitialized for offset calculation */ \
+   (x) = xalloc_flex(sizeof(*(x)), (char *)(&((x)->flexname)) - (char 
*)(x), (buf), (len)); \
+} while (0)
+#define FLEXPTR_ALLOC_MEM(x, ptrname, buf, len) do { \
+   (x) = xalloc_flex(sizeof(*(x)), sizeof(*(x)), (buf), (len)); \
+   (x)->ptrname = (void *)((x)+1); \
+} while(0)
+#define FLEX_ALLOC_STR(x, flexname, str) \
+   FLEX_ALLOC_MEM((x), flexname, (str), strlen(str))
+#define FLEXPTR_ALLOC_STR(x, ptrname, str) \
+   FLEXPTR_ALLOC_MEM((x), ptrname, (str), strlen(str))
+
+static inline void *xalloc_flex(size_t base_len, size_t offset,
+   const void *src, size_t src_len)
+{
+   unsigned char *ret = xcalloc(1, st_add3(base_len, src_len, 1));
+   memcpy(ret + offset, src, src_len);
+   return ret;
+}
+
 static inline char *xstrdup_or_null(const char *str)
 {
return str ? xstrdup(str) : NULL;
-- 
2.7.1.577.gfed91b8

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH 02/21] add helpers for detecting size_t overflow

2016-02-19 Thread Jeff King

Performing computations on size_t variables that we feed to
xmalloc and friends can be dangerous, as an integer overflow
can cause us to allocate a much smaller chunk than we
realized.

We already have unsigned_add_overflows(), but let's add
unsigned_mult_overflows() to that. Furthermore, rather than
have each site manually check and die on overflow, we can
provide some helpers that will:

  - promote the arguments to size_t, so that we know we are
doing our computation in the same size of integer that
will ultimately be fed to xmalloc

  - check and die on overflow

  - return the result so that computations can be done in
the parameter list of xmalloc.

These functions are a lot uglier to use than normal
arithmetic operators (you have to do "st_add(foo, bar)"
instead of "foo + bar"). To at least limit the damage, we
also provide multi-valued versions. So rather than:

  st_add(st_add(a, b), st_add(c, d));

you can write:

  st_add4(a, b, c, d);

This isn't nearly as elegant as a varargs function, but it's
a lot harder to get it wrong. You don't have to remember to
add a sentinel value at the end, and the compiler will
complain if you get the number of arguments wrong. This
patch adds only the numbered variants required to convert
the current code base; we can easily add more later if
needed.

Signed-off-by: Jeff King 
---
 git-compat-util.h | 34 ++
 1 file changed, 34 insertions(+)

diff --git a/git-compat-util.h b/git-compat-util.h
index 693a336..0c65033 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -96,6 +96,14 @@
 #define unsigned_add_overflows(a, b) \
 ((b) > maximum_unsigned_value_of_type(a) - (a))
 
+/*
+ * Returns true if the multiplication of "a" and "b" will
+ * overflow. The types of "a" and "b" must match and must be unsigned.
+ * Note that this macro evaluates "a" twice!
+ */
+#define unsigned_mult_overflows(a, b) \
+((a) && (b) > maximum_unsigned_value_of_type(a) / (a))
+
 #ifdef __GNUC__
 #define TYPEOF(x) (__typeof__(x))
 #else
@@ -713,6 +721,32 @@ extern void release_pack_memory(size_t);
 typedef void (*try_to_free_t)(size_t);
 extern try_to_free_t set_try_to_free_routine(try_to_free_t);
 
+static inline size_t st_add(size_t a, size_t b)
+{
+   if (unsigned_add_overflows(a, b))
+   die("size_t overflow: %"PRIuMAX" + %"PRIuMAX,
+   (uintmax_t)a, (uintmax_t)b);
+   return a + b;
+}
+#define st_add3(a,b,c)   st_add((a),st_add((b),(c)))
+#define st_add4(a,b,c,d) st_add((a),st_add3((b),(c),(d)))
+
+static inline size_t st_mult(size_t a, size_t b)
+{
+   if (unsigned_mult_overflows(a, b))
+   die("size_t overflow: %"PRIuMAX" * %"PRIuMAX,
+   (uintmax_t)a, (uintmax_t)b);
+   return a * b;
+}
+
+static inline size_t st_sub(size_t a, size_t b)
+{
+   if (a < b)
+   die("size_t underflow: %"PRIuMAX" - %"PRIuMAX,
+   (uintmax_t)a, (uintmax_t)b);
+   return a - b;
+}
+
 #ifdef HAVE_ALLOCA_H
 # include 
 # define xalloca(size)  (alloca(size))
-- 
2.7.1.577.gfed91b8

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH 01/21] reflog_expire_cfg: NUL-terminate pattern field

2016-02-19 Thread Jeff King

You can tweak the reflog expiration for a particular subset
of refs by configuring gc.foo.reflogexpire. We keep a linked
list of reflog_expire_cfg structs, each of which holds the
pattern and a "len" field for the length of the pattern. The
pattern itself is _not_ NUL-terminated.

However, we feed the pattern directly to wildmatch(), which
expects a NUL-terminated string, meaning it may keep reading
random junk after our struct.

We can fix this by allocating an extra byte for the NUL
(which is already zero because we use xcalloc). Let's also
drop the misleading "len" field, which is no longer
necessary. The existing use of "len" can be converted to use
strncmp().

Signed-off-by: Jeff King 
---
 builtin/reflog.c | 8 +++-
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/builtin/reflog.c b/builtin/reflog.c
index f39960e..9980731 100644
--- a/builtin/reflog.c
+++ b/builtin/reflog.c
@@ -396,7 +396,6 @@ static struct reflog_expire_cfg {
struct reflog_expire_cfg *next;
unsigned long expire_total;
unsigned long expire_unreachable;
-   size_t len;
char pattern[FLEX_ARRAY];
 } *reflog_expire_cfg, **reflog_expire_cfg_tail;
 
@@ -408,13 +407,12 @@ static struct reflog_expire_cfg *find_cfg_ent(const char 
*pattern, size_t len)
reflog_expire_cfg_tail = _expire_cfg;
 
for (ent = reflog_expire_cfg; ent; ent = ent->next)
-   if (ent->len == len &&
-   !memcmp(ent->pattern, pattern, len))
+   if (!strncmp(ent->pattern, pattern, len) &&
+   ent->pattern[len] == '\0')
return ent;
 
-   ent = xcalloc(1, (sizeof(*ent) + len));
+   ent = xcalloc(1, sizeof(*ent) + len + 1);
memcpy(ent->pattern, pattern, len);
-   ent->len = len;
*reflog_expire_cfg_tail = ent;
reflog_expire_cfg_tail = &(ent->next);
return ent;
-- 
2.7.1.577.gfed91b8

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH v2 0/21] hardening allocations against integer overflow

2016-02-19 Thread Jeff King

Here's a re-roll of jk/tighten-alloc series from:

  http://thread.gmane.org/gmane.comp.version-control.git/286253

This one fixes all of the minor typo/gramm-o problems from the first
round. As Eric noted, the change to reflog_expire_cfg is an actual
bug-fix. Rather than silently fixing it, I've bumped the fix out to its
own commit at the front of the series.

I also took a look at the raw computation in the ALLOC_ARRAY and
REALLOC_ARRAY lines, as well as the ones in ALLOC_GROW. In theory
something like this is dangerous:

  ALLOC_GROW(foo, nr_foo + 1, alloc_foo);
  foo[nr_foo++] = whatever;

if we overflow nr_foo, which is quite often an "int". In practice, I
think we're OK, for two reasons:

  1. Overflowing a signed "int" here is going to make it go negative
 (technically, it invokes undefined behavior, but let's be blindly
 pragmatic for a minute and assume twos-complement wrapping). On a
 system with a 64-bit size_t, that will try to allocate an enormous
 amount of memory and fail. On a 32 bit system, it will be only
 about 2GB.  But...

  2. We're talking about overflowing 2^31 counters here. And the counter
 is multiplied by the size of each object we're storing in the
 array. So even if we assume that foo is "char *", we know we've
 allocated close to 2GB already. On a 32-bit system, the subsequent
 2GB allocation is pretty much guaranteed to fail.

 On a 64-bit system, I suspect it's possible to convince some of
 these counters to wrap (e.g., storing an array of ints, we're
 talking about only 8GB; that's a lot, but plenty of machines,
 especially servers, can allocate that).

So I have a feeling we're mostly OK there, but the reasoning is
certainly hand-wavy and I'd like to do better. Just switching to:

  ALLOC_GROW(foo, st_add(nr_foo, 1), alloc_foo);
  foo[nr_foo++] = whatever;

doesn't quite cut it. We might succeed in the allocation, and it stays
big, which is good. But if nr_foo is an int, and we wrap to negative
values, we'll start writing to memory before "foo", corrupting the heap.

So I really think we need to look at each site (and there are a lot of
them) and start using size_t more consistently for these. Or
alternatively, have an int-sized version of st_add and use that, though
it's probably just as much work to convert it to a size_t, which IMHO is
more correct. I really wanted to make a type-agnostic version of
st_add(), but I don't think it's possible to do so portably. My best
attempts needed either typeof() or compiler intrinsics.

So I've punted on that for this series, because I'm not convinced there
are active problems, and it's quite a lot of work (and the patches will
be quite disruptive).

While pondering this, I also looked at what happens if an incoming
packfile claims to have 2^32 objects in its header. In index-pack we
actually read this into a signed "int". Which is kind of bad, but in
practice means we run into the "whoops, I can't allocate (size_t)-1
memory" problem and die. We could change this to a uint32_t (which is
what the actual incoming format supports), but I have a feeling that
makes things worse (if we actually manage to process that many objects,
we then start doing some other computations based on the number of
objects, all using ints; so at least as it is now, we bail early).

While peeking at some of these sites, though, I did realize that many of
the ones that became "ALLOC_ARRAY(foo + 1)" were doing so to make a
NULL-terminated argv list. So there are two new patches in this
iteration to switch them to argv_array (one to catch the mundane cases,
and one for a unique snowflake).

  [01/21]: reflog_expire_cfg: NUL-terminate pattern field
  [02/21]: add helpers for detecting size_t overflow
  [03/21]: tree-diff: catch integer overflow in combine_diff_path allocation
  [04/21]: harden REALLOC_ARRAY and xcalloc against size_t overflow
  [05/21]: add helpers for allocating flex-array structs
  [06/21]: convert manual allocations to argv_array
  [07/21]: convert trivial cases to ALLOC_ARRAY
  [08/21]: use xmallocz to avoid size arithmetic
  [09/21]: convert trivial cases to FLEX_ARRAY macros
  [10/21]: use st_add and st_mult for allocation size computation
  [11/21]: prepare_{git,shell}_cmd: use argv_array
  [12/21]: write_untracked_extension: use FLEX_ALLOC helper
  [13/21]: fast-import: simplify allocation in start_packfile
  [14/21]: fetch-pack: simplify add_sought_entry
  [15/21]: test-path-utils: fix normalize_path_copy output buffer size
  [16/21]: sequencer: simplify memory allocation of get_message
  [17/21]: git-compat-util: drop mempcpy compat code
  [18/21]: transport_anonymize_url: use xstrfmt
  [19/21]: diff_populate_gitlink: use a strbuf
  [20/21]: convert ewah/bitmap code to use xmalloc
  [21/21]: ewah: convert to REALLOC_ARRAY, etc

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at

Re: Windows git bash - child processes see system PATH environment variable instead of user...

2016-02-19 Thread Edward Marshall

Hey, thanks for getting back to me. I subseuqently found the
git-for-windows issues tracker on github so have posted an updated
version of this there - not sure where is the best place for issues.

I have also found that the cause was having
%VPROJECT%..\bin
inside my user PATH
but no VPROJECT environment variable.
Removing the path or adding the missing env var fixes the problem.
This suggests that something is failing to concatenate user PATH to
system PATH when it can't expand the non-existent environment
variable.

fwiw, I don't have a .profile file either.

Hopefully you can replicate the problem with the above info, but
please let me know if you still want me to run anything or if there is
anything else I can do to help.

Kind regards,

Edward Marshall


On 19 February 2016 at 11:08, Johannes Schindelin
 wrote:
> Hi Edward,
>
> On Wed, 17 Feb 2016, Edward Marshall wrote:
>
>> Edward@Edward-PC MINGW64 /f/Work
>> $ echo $PATH
>> ...(USER PATH)...
>
> I presume this is Git for Windows 2.7.1(2), installed via the default
> installer?
>
> And I also assume that you run this in Git Bash?
>
> Can you test whether this also happens in the PortableGit from
> https://github.com/git-for-windows/git/releases/latest? (These are
> self-extracting .7z archives, just install them somewhere else than your
> current installation, it won't touch your Start Menu or desktop icons.)
>
> The Git Bash is available in the portable installation as git-bash.exe in
> the top-leve directory.
>
>> Edward@Edward-PC MINGW64 /f/Work
>> $ cmd
>> Microsoft Windows [Version 6.1.7601]
>> Copyright (c) 2009 Microsoft Corporation. All rights reserved.
>>
>> F:\Work>echo %PATH%
>> echo %PATH%
>> ...(SYSTEM PATH)...
>>
>> The same is true of any child process (e.g. node.js) - they all see
>> SYSTEM PATH now instead of USER PATH.
>
> Are you sure that it is "instead of"? AFAICT both Git CMD and Git Bash
> should have the full PATH, i.e. both system and user PATH concatenated.
>
>> Running bash -l from a cmd window, has the problem.
>
> Now, this is interesting. Maybe something funky does happen in some
> strange place in the Bash profile.
>
> If you can reproduce this with the portable Git, it would be good to
> insert the line "set -x" at the top of %PORTABLEGIT%\etc\profile
> (that is the correct file name IIRC, it could also be bash.bash_profile,
> but I think the latter is included by the former). Then start bash -l
> again, and see whether you can find the place where it sets the PATH.
>
> To be certain, I would also recommend outputting the PATH with which Bash
> starts, either by inserting a line "echo $PATH" at the top of the profile
> file, or by starting bash without -l option.
>
>> I have no .bashrc or .bash_profile files on either system (no idea
>> what these are for but a colleague was trying to help diagnose the
>> problem - they ultimately came up empty).
>
> Could be a .profile, too, maybe.
>
>> Unfortunately I don't know what version of git I had before, and older
>> versions aren't offered for download so I can't trial and error.
>
> Older versions are still available (we did not remove any) from
> https://github.com/git-for-windows/git/releases (and if you look for the
> 1.x series that was using a completely different Bash/MSYS:
> https://github.com/msysgit/msysgit/releases).
>
> Ciao,
> Johannes
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: Windows git bash - child processes see system PATH environment variable instead of user...

2016-02-19 Thread Johannes Schindelin

Hi Edward,

On Wed, 17 Feb 2016, Edward Marshall wrote:

> Edward@Edward-PC MINGW64 /f/Work
> $ echo $PATH
> ...(USER PATH)...

I presume this is Git for Windows 2.7.1(2), installed via the default
installer?

And I also assume that you run this in Git Bash?

Can you test whether this also happens in the PortableGit from
https://github.com/git-for-windows/git/releases/latest? (These are
self-extracting .7z archives, just install them somewhere else than your
current installation, it won't touch your Start Menu or desktop icons.)

The Git Bash is available in the portable installation as git-bash.exe in
the top-leve directory.

> Edward@Edward-PC MINGW64 /f/Work
> $ cmd
> Microsoft Windows [Version 6.1.7601]
> Copyright (c) 2009 Microsoft Corporation. All rights reserved.
> 
> F:\Work>echo %PATH%
> echo %PATH%
> ...(SYSTEM PATH)...
> 
> The same is true of any child process (e.g. node.js) - they all see
> SYSTEM PATH now instead of USER PATH.

Are you sure that it is "instead of"? AFAICT both Git CMD and Git Bash
should have the full PATH, i.e. both system and user PATH concatenated.

> Running bash -l from a cmd window, has the problem.

Now, this is interesting. Maybe something funky does happen in some
strange place in the Bash profile.

If you can reproduce this with the portable Git, it would be good to
insert the line "set -x" at the top of %PORTABLEGIT%\etc\profile
(that is the correct file name IIRC, it could also be bash.bash_profile,
but I think the latter is included by the former). Then start bash -l
again, and see whether you can find the place where it sets the PATH.

To be certain, I would also recommend outputting the PATH with which Bash
starts, either by inserting a line "echo $PATH" at the top of the profile
file, or by starting bash without -l option.

> I have no .bashrc or .bash_profile files on either system (no idea
> what these are for but a colleague was trying to help diagnose the
> problem - they ultimately came up empty).

Could be a .profile, too, maybe.

> Unfortunately I don't know what version of git I had before, and older
> versions aren't offered for download so I can't trial and error.

Older versions are still available (we did not remove any) from
https://github.com/git-for-windows/git/releases (and if you look for the
1.x series that was using a completely different Bash/MSYS:
https://github.com/msysgit/msysgit/releases).

Ciao,
Johannes
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: GSoC 2016: applications open, libgit2 and git.git

2016-02-19 Thread Carlos Martín Nieto

On Fri, 2016-02-19 at 09:06 +0100, Matthieu Moy wrote:
> Carlos Martín Nieto  writes:
> 
> > We still have most of the same things open as for the 2014 list.
> > I'll
> > ask around to see if we have. Last year I wasn't involved in the
> > candidate selection but IIRC we didn't do a project as none of the
> > applications showed the candidates would be capable of doing the
> > project they were applying for.
> 
> OK. It's essentially too late to change this for this year, but next
> time we should discuss earlier about how we want to organize this
> git.git/libgit2 thing. For example, I think it would make little
> sense
> to have a git.git microproject and then apply for a libgit2 project
> since we have very different ways of interacting. And honnestly,
> right
> now the application is really git.git-centric so I don't think it
> attracts students towards libgit2. So, if you want to attract more
> students, we should work on that.
> 
> I tried to clarify the situation with libgit2:
> 
> https://github.com/git/git.github.io/commit/94d1747eb9621b3bc892be2f2
> 32338b7933ac271
> 
> Please say if you're happy/unhappy with what I wrote. PRs are still
> welcome after the deadline.

This is fine. Our projects file should also be noted for the
microprojects, but I'll write that up myself. I'm writing up the two
projects we've thought up as part of the ideas page and will send a PR
today.

   cmn

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] git-p4.py: Don't try to rebase on submit from bare repository

2016-02-19 Thread Luke Diamand

On 17 February 2016 at 22:46, Amadeusz Żołnowski  wrote:
> git-p4 can be successfully used from bare repository (which acts as a
> bridge between Perforce repository and pure Git repositories). On submit
> git-p4 performs unconditional rebase. Do rebase only on non-bare
> repositories.

This looks obviously sensible and good to me, ack.

Thanks!
Luke



> ---
>  git-p4.py | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/git-p4.py b/git-p4.py
> index c33dece..e00cd02 100755
> --- a/git-p4.py
> +++ b/git-p4.py
> @@ -2059,8 +2059,9 @@ class P4Submit(Command, P4UserMap):
>  sync.branch = self.branch
>  sync.run([])
>
> -rebase = P4Rebase()
> -rebase.rebase()
> +if not gitConfigBool("core.bare"):
> +rebase = P4Rebase()
> +rebase.rebase()
>
>  else:
>  if len(applied) == 0:
> --
> 2.7.0
>
> --
> To unsubscribe from this list: send the line "unsubscribe git" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: GSoC 2016: applications open, deadline = Fri, 19/2

2016-02-19 Thread Duy Nguyen

On Fri, Feb 19, 2016 at 2:17 PM, Matthieu Moy
 wrote:
>> with David's multiple ref backend work, we could have a third,
>> no-dependency backend. We can use index format to store refs.
>
> This sounds like an interesting but ambitious project for a GSoC. There
> are a lot of new stuff to understand for someone potentially new to
> Git's codebase. And it's hard to work incrementally: the result would
> hardly be mergeable before being almost finished.

On the other hand, the actual amount of code they write is roughly
about 1700 lines of refs/lmdb-backend.c. Which I guess can be written
in a month once you know what's going on, basically how refs are
handled (I think documents have been greatly improved), git object
manipulation and optionally index manipulation  (if we store in index
instead of trees). I think it's manageable. But then I haven't
interacted with students for a looong time.

> I think it's interesting to offer the idea, but there should be a
> warning for the student about the difficulties.

Yep.

> Would you be willing to (co-)mentor?

I can't guarantee I will not disappear for a couple months again like
last year. It depends on $DAY_JOB. So maybe co-mentor position, but my
other co-mentor should be ready for that situation.
-- 
Duy
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v2 14/25] shallow.c: implement a generic shallow boundary finder based on rev-list

2016-02-19 Thread Duy Nguyen

On Mon, Feb 08, 2016 at 01:09:24PM -0800, Junio C Hamano wrote:
> Nguyễn Thái Ngọc Duy   writes:
> 
> > Instead of a custom commit walker like get_shallow_commits(), this new
> > function uses rev-list to mark NOT_SHALLOW to all reachable commits,
> > except borders. The definition of reachable is to be defined by the
> > protocol later. This makes it more flexible to define shallow boundary.
> >
> > Note: if a commit has one NOT_SHALLOW parent and one SHALLOW parent,
> > then it's considered the boundary. Which means in the client side, this
> > commit has _no_ parents. This could lead to surprising cuts if we're not
> > careful.
> >
> > Another option is to include more commits and only mark commits whose
> > all parents are SHALLOW as boundary.
> 
> The second and third are greek to me at this point ;-) but hopefully
> they will become clear as we read on.

Yeah. Everything looks clearer with illustration. This should be a
better. The question is should we do something about it now, or leave
it as is.

I'm tempted to go with "the first way" in future (so add some comments
about this in is_repository_shallow, instead of leaving it as commit
message in this patch).

-- 8< --
The way we find find border is paint all reachable commits NOT_SHALLOW.
Any of them that "touches" commits without NOT_SHALLOW flag are
considered shallow (e.g. zero parents via grafting mechanism). Shallow
commits and their true parents are all marked SHALLOW. Then NOT_SHALLOW
is removed from shallow commits at the end.

There is interesting observation, though somewhat off topic for this
patch. In the following graph, "x" is unreachable commits. "b" is the
parent of "a".

   x -- a -- o
   //
 x -- b -- o

And as a result, "a" and "b" are both considered shallow commits. After
grafting occurs at the client side, what we see is

a -- o
/
  b -- o

Notice that because of grafting, "a" has zero parents, so "b" is no
longer a parent of "a".

This is unfortunate and may be solved in two ways. The first is change
the way shallow grafting works and keeps "b -- a" connection if "b"
exists and is a shallow commit.

The second way is produce this graph (at client side) instead

   x -- a -- o
   //
  b -- o

Which means we mark "x" as a shallow commit instead of "a".
-- 8< --
--
Duy
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: GSoC 2016: applications open, deadline = Fri, 19/2

2016-02-19 Thread Lars Schneider


On 18 Feb 2016, at 20:13, Junio C Hamano  wrote:

> Stefan Beller  writes:
> 
>> On Thu, Feb 18, 2016 at 12:41 AM, Lars Schneider
>>  wrote:
 Feel free to start writting an idea for
 http://git.github.io/SoC-2016-Ideas/. It'd be nice to have a few more
 ideas before Friday. We can polish them later if needed.
>>> 
>>> I published my ideas here:
>>> https://github.com/git/git.github.io/pull/125/files
>> 
>> I like the idea of a beginner mode, but on the other hand that looks
>> inflexible to me ;)
>> (What if I want to use rebase, but not reset --hard?)
> 
> That's simple.  You say "cd .. && rm -fr repo && git clone" and
> start from scratch ;-).
> 
> This whole "beginner should be limited to a 'safe' subset" is an
> unhealthy attitude.
> 
> Deciding what the 'safe' subset is must be done with a lot of
> thinking by people who intimately know what implications it has to
> ban each feature.  I do not think it would be a good fit for a
> project to give to a relatively new participant to the Git project.
> 
> For example, I think banning "worktree" feature from newbies may not
> be a bad idea, as you can work on a project without using "worktree"
> at all, and use of "worktree" would only subject you to bugs that do
> not exist when you do not use that feature.  The "shallow clone",
> "sparse checkout", and "untracked cache" fall into the same category
> for exactly the same reason.  The "submodule" feature might fall
> into the same category for the same reason, but that is not
> something you as a project participant can unilaterally decide, as
> the project you are working on may have already decided to use the
> feature, so it is harder to ban from the beginners.
> 
> But for the rest of really "core" part of Git, I do not think there
> is any such command that can be totally banned.
> 
> We have these "powerful" tools for a reason.  After making a mess
> experimenting with your working tree files, "reset --hard" is the
> best tool to go back to the known-good state, and robbing it from
> the users is not a sound approach to help them.  When "powerful"
> becomes "too powerful" is when a "powerful" tool is misused.  It is
> perhaps done by mistake or perhaps done by copying and pasting a
> solution from Interweb for a problem that does not match your
> situation without understanding what you are doing.
> 
> What is needed to help beginners is to make the powerful tool harder
> to misuse.  Of course, that would be a harder task, because you have
> to do a real thinking.
> 
> You do not have to do any thinking to say that "a blanket ban that
> hides these powerful tools behind the beginner mode" helps
> beginners, but I do not think it is solving what really matters.  At
> the same time, it just adds to the FUD, i.e. some commands are too
> powerful for their own good.

Thanks for your elaborate response. I think I got your point and I
tried to adjust my "beginner mode" proposal accordingly [1]. Here
is the relevant change:

If this mode is enabled then Git shall print a warning message before 
running a potentially destructive command. In addition to the warning 
Git shall print a command that would reverse the operation if possible. 
Most of the information to reverse an operation is already available 
via git reflog. However, the task here is to make this information more 
easily accessible to Git beginners.

--

Does this go into the direction of "making the powerful tool harder to
misuse"?

Thanks,
Lars
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH v6 2/4] rename git_config_from_buf to git_config_from_mem

2016-02-19 Thread larsxschneider

From: Lars Schneider 

This matches the naming used in the index_{fd,mem,...} functions.

Suggested-by: Junio C Hamano 
Signed-off-by: Lars Schneider 
---
 cache.h| 2 +-
 config.c   | 4 ++--
 submodule-config.c | 2 +-
 3 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/cache.h b/cache.h
index c63fcc1..6679bb4 100644
--- a/cache.h
+++ b/cache.h
@@ -1485,7 +1485,7 @@ struct git_config_source {
 typedef int (*config_fn_t)(const char *, const char *, void *);
 extern int git_default_config(const char *, const char *, void *);
 extern int git_config_from_file(config_fn_t fn, const char *, void *);
-extern int git_config_from_buf(config_fn_t fn, const char *name,
+extern int git_config_from_mem(config_fn_t fn, const char *name,
   const char *buf, size_t len, void *data);
 extern void git_config_push_parameter(const char *text);
 extern int git_config_from_parameters(config_fn_t fn, void *data);
diff --git a/config.c b/config.c
index 86a5eb2..36b0ddb 100644
--- a/config.c
+++ b/config.c
@@ -1096,7 +1096,7 @@ int git_config_from_file(config_fn_t fn, const char 
*filename, void *data)
return ret;
 }
 
-int git_config_from_buf(config_fn_t fn, const char *name, const char *buf,
+int git_config_from_mem(config_fn_t fn, const char *name, const char *buf,
size_t len, void *data)
 {
struct config_source top;
@@ -1132,7 +1132,7 @@ static int git_config_from_blob_sha1(config_fn_t fn,
return error("reference '%s' does not point to a blob", name);
}
 
-   ret = git_config_from_buf(fn, name, buf, size, data);
+   ret = git_config_from_mem(fn, name, buf, size, data);
free(buf);
 
return ret;
diff --git a/submodule-config.c b/submodule-config.c
index fe8ceab..b85a937 100644
--- a/submodule-config.c
+++ b/submodule-config.c
@@ -427,7 +427,7 @@ static const struct submodule *config_from(struct 
submodule_cache *cache,
parameter.commit_sha1 = commit_sha1;
parameter.gitmodules_sha1 = sha1;
parameter.overwrite = 0;
-   git_config_from_buf(parse_config, rev.buf, config, config_size,
+   git_config_from_mem(parse_config, rev.buf, config, config_size,
);
free(config);
 
-- 
2.5.1

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH v6 0/4] config: add '--sources' option to print the source of a config value

2016-02-19 Thread larsxschneider

From: Lars Schneider 

diff to v5:
* rename 'type' to 'origin_type' as 'type' is too broad a word in the context
  of configuration file (Thanks to the reviewers Junio and Dscho)
* add dedicated patch to rename git_config_from_buf to git_config_from_mem
  (this change was part of the series since v4 as suggested by Junio but
  hidden in the "config: add 'origin_type'" patch)

Thanks,
Lars

Lars Schneider (4):
  t: do not hide Git's exit code in tests using 'nul_to_q'
  rename git_config_from_buf to git_config_from_mem
  config: add 'origin_type' to config_source struct
  config: add '--show-origin' option to print the origin of a config
value

 Documentation/git-config.txt |  15 ++--
 builtin/config.c |  33 +
 cache.h  |   6 +-
 config.c |  36 +++---
 submodule-config.c   |   4 +-
 t/t1300-repo-config.sh   | 161 ++-
 t/t1308-config-set.sh|   4 +-
 t/t7008-grep-binary.sh   |   3 +-
 8 files changed, 236 insertions(+), 26 deletions(-)

--
2.5.1

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH v6 1/4] t: do not hide Git's exit code in tests using 'nul_to_q'

2016-02-19 Thread larsxschneider

From: Lars Schneider 

Git should not be on the left-hand side of a pipe, because it hides the exit
code, and we want to make sure git does not fail.

Fix all invocations of 'nul_to_q' (defined in /t/test-lib-functions.sh) using
this pattern. There is one more occurrence of the pattern in t9010-svn-fe.sh
which is too evolved to change it easily.

All remaining test code that does not adhere to the pattern can be found with
the following command:
git grep -E 'git.*[^|]\|($|[^|])'

Helped-by: Jeff King 
Signed-off-by: Lars Schneider 
---
 t/t1300-repo-config.sh | 6 --
 t/t7008-grep-binary.sh | 3 ++-
 2 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/t/t1300-repo-config.sh b/t/t1300-repo-config.sh
index 52678e7..1782add 100755
--- a/t/t1300-repo-config.sh
+++ b/t/t1300-repo-config.sh
@@ -957,13 +957,15 @@ Qsection.sub=section.val4
 Qsection.sub=section.val5Q
 EOF
 test_expect_success '--null --list' '
-   git config --null --list | nul_to_q >result &&
+   git config --null --list >result.raw &&
+   nul_to_q result &&
echo >>result &&
test_cmp expect result
 '
 
 test_expect_success '--null --get-regexp' '
-   git config --null --get-regexp "val[0-9]" | nul_to_q >result &&
+   git config --null --get-regexp "val[0-9]" >result.raw &&
+   nul_to_q result &&
echo >>result &&
test_cmp expect result
 '
diff --git a/t/t7008-grep-binary.sh b/t/t7008-grep-binary.sh
index b146406..9c9c378 100755
--- a/t/t7008-grep-binary.sh
+++ b/t/t7008-grep-binary.sh
@@ -141,7 +141,8 @@ test_expect_success 'grep respects not-binary diff 
attribute' '
test_cmp expect actual &&
echo "b diff" >.gitattributes &&
echo "b:binQary" >expect &&
-   git grep bin b | nul_to_q >actual &&
+   git grep bin b >actual.raw &&
+   nul_to_q actual &&
test_cmp expect actual
 '
 
-- 
2.5.1

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH v6 4/4] config: add '--show-origin' option to print the origin of a config value

2016-02-19 Thread larsxschneider

From: Lars Schneider 

If config values are queried using 'git config' (e.g. via --get,
--get-all, --get-regexp, or --list flag) then it is sometimes hard to
find the configuration file where the values were defined.

Teach 'git config' the '--show-origin' option to print the source
configuration file for every printed value.

Based-on-patch-by: Jeff King 
Signed-off-by: Lars Schneider 
---
 Documentation/git-config.txt |  15 +++--
 builtin/config.c |  33 ++
 t/t1300-repo-config.sh   | 147 +++
 3 files changed, 190 insertions(+), 5 deletions(-)

diff --git a/Documentation/git-config.txt b/Documentation/git-config.txt
index 2608ca7..6374997 100644
--- a/Documentation/git-config.txt
+++ b/Documentation/git-config.txt
@@ -9,18 +9,18 @@ git-config - Get and set repository or global options
 SYNOPSIS
 
 [verse]
-'git config' [] [type] [-z|--null] name [value [value_regex]]
+'git config' [] [type] [--show-origin] [-z|--null] name [value 
[value_regex]]
 'git config' [] [type] --add name value
 'git config' [] [type] --replace-all name value [value_regex]
-'git config' [] [type] [-z|--null] --get name [value_regex]
-'git config' [] [type] [-z|--null] --get-all name [value_regex]
-'git config' [] [type] [-z|--null] [--name-only] --get-regexp 
name_regex [value_regex]
+'git config' [] [type] [--show-origin] [-z|--null] --get name 
[value_regex]
+'git config' [] [type] [--show-origin] [-z|--null] --get-all name 
[value_regex]
+'git config' [] [type] [--show-origin] [-z|--null] [--name-only] 
--get-regexp name_regex [value_regex]
 'git config' [] [type] [-z|--null] --get-urlmatch name URL
 'git config' [] --unset name [value_regex]
 'git config' [] --unset-all name [value_regex]
 'git config' [] --rename-section old_name new_name
 'git config' [] --remove-section name
-'git config' [] [-z|--null] [--name-only] -l | --list
+'git config' [] [--show-origin] [-z|--null] [--name-only] -l | 
--list
 'git config' [] --get-color name [default]
 'git config' [] --get-colorbool name [stdout-is-tty]
 'git config' [] -e | --edit
@@ -194,6 +194,11 @@ See also <>.
Output only the names of config variables for `--list` or
`--get-regexp`.

+--show-origin::
+   Augment the output of all queried config options with the
+   origin type (file, stdin, blob, cmdline) and the actual origin
+   (config file path, ref, or blob id if applicable).
+
 --get-colorbool name [stdout-is-tty]::

Find the color setting for `name` (e.g. `color.diff`) and output
diff --git a/builtin/config.c b/builtin/config.c
index adc7727..a1a9b9a 100644
--- a/builtin/config.c
+++ b/builtin/config.c
@@ -3,6 +3,7 @@
 #include "color.h"
 #include "parse-options.h"
 #include "urlmatch.h"
+#include "quote.h"

 static const char *const builtin_config_usage[] = {
N_("git config []"),
@@ -27,6 +28,7 @@ static int actions, types;
 static const char *get_color_slot, *get_colorbool_slot;
 static int end_null;
 static int respect_includes = -1;
+static int show_origin;

 #define ACTION_GET (1<<0)
 #define ACTION_GET_ALL (1<<1)
@@ -81,6 +83,7 @@ static struct option builtin_config_options[] = {
OPT_BOOL('z', "null", _null, N_("terminate values with NUL byte")),
OPT_BOOL(0, "name-only", _values, N_("show variable names only")),
OPT_BOOL(0, "includes", _includes, N_("respect include 
directives on lookup")),
+   OPT_BOOL(0, "show-origin", _origin, N_("show origin of config 
(file, stdin, blob, cmdline)")),
OPT_END(),
 };

@@ -91,8 +94,28 @@ static void check_argc(int argc, int min, int max) {
usage_with_options(builtin_config_usage, builtin_config_options);
 }

+static void show_config_origin(struct strbuf *buf)
+{
+   const char term = end_null ? '\0' : '\t';
+
+   strbuf_addstr(buf, current_config_origin_type());
+   strbuf_addch(buf, ':');
+   if (end_null)
+   strbuf_addstr(buf, current_config_name());
+   else
+   quote_c_style(current_config_name(), buf, NULL, 0);
+   strbuf_addch(buf, term);
+}
+
 static int show_all_config(const char *key_, const char *value_, void *cb)
 {
+   if (show_origin) {
+   struct strbuf buf = STRBUF_INIT;
+   show_config_origin();
+   /* Use fwrite as "buf" can contain \0's if "end_null" is set. */
+   fwrite(buf.buf, 1, buf.len, stdout);
+   strbuf_release();
+   }
if (!omit_values && value_)
printf("%s%c%s%c", key_, delim, value_, term);
else
@@ -108,6 +131,8 @@ struct strbuf_list {

 static int format_config(struct strbuf *buf, const char *key_, const char 
*value_)
 {
+   if (show_origin)
+   show_config_origin(buf);
if (show_keys)
strbuf_addstr(buf, key_);
if (!omit_values) {
@@ -538,6 +563,14 @@ int cmd_config(int argc,

[PATCH v6 3/4] config: add 'origin_type' to config_source struct

2016-02-19 Thread larsxschneider

From: Lars Schneider 

Use the config origin_type to print more detailed error messages that
inform the user about the origin of a config error (file, stdin, blob).

Helped-by: Ramsay Jones 
Signed-off-by: Lars Schneider 
Acked-by: Jeff King 
---
 cache.h|  6 --
 config.c   | 36 +---
 submodule-config.c |  4 ++--
 t/t1300-repo-config.sh |  8 +++-
 t/t1308-config-set.sh  |  4 ++--
 5 files changed, 40 insertions(+), 18 deletions(-)

diff --git a/cache.h b/cache.h
index 6679bb4..ad7fcfc 100644
--- a/cache.h
+++ b/cache.h
@@ -1485,8 +1485,8 @@ struct git_config_source {
 typedef int (*config_fn_t)(const char *, const char *, void *);
 extern int git_default_config(const char *, const char *, void *);
 extern int git_config_from_file(config_fn_t fn, const char *, void *);
-extern int git_config_from_mem(config_fn_t fn, const char *name,
-  const char *buf, size_t len, void *data);
+extern int git_config_from_mem(config_fn_t fn, const char *origin_type,
+   const char *name, const char *buf, 
size_t len, void *data);
 extern void git_config_push_parameter(const char *text);
 extern int git_config_from_parameters(config_fn_t fn, void *data);
 extern void git_config(config_fn_t fn, void *);
@@ -1525,6 +1525,8 @@ extern const char *get_log_output_encoding(void);
 extern const char *get_commit_output_encoding(void);
 
 extern int git_config_parse_parameter(const char *, config_fn_t fn, void 
*data);
+extern const char *current_config_origin_type(void);
+extern const char *current_config_name(void);
 
 struct config_include_data {
int depth;
diff --git a/config.c b/config.c
index 36b0ddb..3be2cbc 100644
--- a/config.c
+++ b/config.c
@@ -24,6 +24,7 @@ struct config_source {
size_t pos;
} buf;
} u;
+   const char *origin_type;
const char *name;
const char *path;
int die_on_error;
@@ -471,9 +472,9 @@ static int git_parse_source(config_fn_t fn, void *data)
break;
}
if (cf->die_on_error)
-   die(_("bad config file line %d in %s"), cf->linenr, cf->name);
+   die(_("bad config line %d in %s %s"), cf->linenr, 
cf->origin_type, cf->name);
else
-   return error(_("bad config file line %d in %s"), cf->linenr, 
cf->name);
+   return error(_("bad config line %d in %s %s"), cf->linenr, 
cf->origin_type, cf->name);
 }
 
 static int parse_unit_factor(const char *end, uintmax_t *val)
@@ -588,9 +589,9 @@ static void die_bad_number(const char *name, const char 
*value)
if (!value)
value = "";
 
-   if (cf && cf->name)
-   die(_("bad numeric config value '%s' for '%s' in %s: %s"),
-   value, name, cf->name, reason);
+   if (cf && cf->origin_type && cf->name)
+   die(_("bad numeric config value '%s' for '%s' in %s %s: %s"),
+   value, name, cf->origin_type, cf->name, reason);
die(_("bad numeric config value '%s' for '%s': %s"), value, name, 
reason);
 }
 
@@ -1061,11 +1062,13 @@ static int do_config_from(struct config_source *top, 
config_fn_t fn, void *data)
 }
 
 static int do_config_from_file(config_fn_t fn,
-   const char *name, const char *path, FILE *f, void *data)
+   const char *origin_type, const char *name, const char *path, 
FILE *f,
+   void *data)
 {
struct config_source top;
 
top.u.file = f;
+   top.origin_type = origin_type;
top.name = name;
top.path = path;
top.die_on_error = 1;
@@ -1078,7 +1081,7 @@ static int do_config_from_file(config_fn_t fn,
 
 static int git_config_from_stdin(config_fn_t fn, void *data)
 {
-   return do_config_from_file(fn, "", NULL, stdin, data);
+   return do_config_from_file(fn, "stdin", "", NULL, stdin, data);
 }
 
 int git_config_from_file(config_fn_t fn, const char *filename, void *data)
@@ -1089,21 +1092,22 @@ int git_config_from_file(config_fn_t fn, const char 
*filename, void *data)
f = fopen(filename, "r");
if (f) {
flockfile(f);
-   ret = do_config_from_file(fn, filename, filename, f, data);
+   ret = do_config_from_file(fn, "file", filename, filename, f, 
data);
funlockfile(f);
fclose(f);
}
return ret;
 }
 
-int git_config_from_mem(config_fn_t fn, const char *name, const char *buf,
-   size_t len, void *data)
+int git_config_from_mem(config_fn_t fn, const char *origin_type,
+   const char *name, const char *buf, size_t len, void 
*data)
 {
struct config_source top;
 
top.u.buf.buf = buf;
top.u.buf.len = len;
top.u.buf.pos = 0;

Re: GSoC 2016: applications open, deadline = now => submitted

2016-02-19 Thread Matthieu Moy

Jeff King  writes:

> On Fri, Feb 19, 2016 at 09:09:50AM +0100, Matthieu Moy wrote:
>
>> Hi,
>> 
>> The deadline is tonight (19:00 UTC). I don't want to miss it, so I'll
>> submit the application very soon based on
>> http://git.github.io/SoC-2016-Org-Application/.
>
> I think we can continue to modify up to the deadline (at least that has
> been the case in years past).

Indeed: I've completed the application and it's still possible to modify
(the form just says "You've completed this form", but there's no scary
"submit and remove write access from now" button ;-)).

I had to modify the text a bit to fit within the new length limit (1000
characters for most fields). The content of the form should be the same
as what's currently on http://git.github.io/SoC-2016-Org-Application/.
Please check.

Thanks,

-- 
Matthieu Moy
http://www-verimag.imag.fr/~moy/
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

GSoC 2016: Microproject

2016-02-19 Thread Mehul Jain

Hello everyone,

I'm Mehul Jain. I'm looking for participating in GSoC 2016.

I've started work on a Microproject" Teach git pull --rebase the
--no-autostash" option. While looking at Git's source code I have made
following observation: In the pull.c file
1.  git_config_get_bool( , ) search in the configuration key for the
value of rebase.autostash, if found true then modify autostash's value
to a non-zero number and thus making a stash to encounter the problem
of dirty tree.
2. Here if in command line a flag "--no-autostash" is given then we
can easily set the value of autostash = 0 and thus killing the process
by die_on_unclean_work_tree(prefix).
Is my observation is right?

I'm new to open source projects and not much experienced at it. So
please correct/comment on any mistake that I made while trying to put
my point. I will also appreciate any comment/suggestion/criticism on
my observation.

Thanks
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH] exec_cmd.c: use find_last_dir_sep() for code simplification

2016-02-19 Thread Alexander Kuleshov

We are trying to extract dirname from argv0 in the git_extract_argv0_path().
But in the same time, the  provides find_last_dir_sep()
to get dirname from a given path.  Let's use it instead of loop for the code
simplification.

Signed-off-by: Alexander Kuleshov 
---
 exec_cmd.c | 6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/exec_cmd.c b/exec_cmd.c
index e85f0fd..680b257 100644
--- a/exec_cmd.c
+++ b/exec_cmd.c
@@ -43,12 +43,10 @@ const char *git_extract_argv0_path(const char *argv0)
 
if (!argv0 || !*argv0)
return NULL;
-   slash = argv0 + strlen(argv0);
 
-   while (argv0 <= slash && !is_dir_sep(*slash))
-   slash--;
+   slash = find_last_dir_sep(argv0);
 
-   if (slash >= argv0) {
+   if (slash) {
argv0_path = xstrndup(argv0, slash - argv0);
return slash + 1;
}
-- 
2.4.4.764.g5dbb725.dirty

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: GSoC 2016: applications open, deadline = now => submission

2016-02-19 Thread Jeff King

On Fri, Feb 19, 2016 at 09:09:50AM +0100, Matthieu Moy wrote:

> Hi,
> 
> The deadline is tonight (19:00 UTC). I don't want to miss it, so I'll
> submit the application very soon based on
> http://git.github.io/SoC-2016-Org-Application/.

I think we can continue to modify up to the deadline (at least that has
been the case in years past).

But I agree with getting something in place and iterating from there.

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: GSoC 2016: applications open, deadline = now => submission

2016-02-19 Thread Matthieu Moy

Hi,

The deadline is tonight (19:00 UTC). I don't want to miss it, so I'll
submit the application very soon based on
http://git.github.io/SoC-2016-Org-Application/.

Other pages can still be modified afterwards.

-- 
Matthieu Moy
http://www-verimag.imag.fr/~moy/
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: GSoC 2016: applications open, libgit2 and git.git

2016-02-19 Thread Matthieu Moy

Carlos Martín Nieto  writes:

> We still have most of the same things open as for the 2014 list. I'll
> ask around to see if we have. Last year I wasn't involved in the
> candidate selection but IIRC we didn't do a project as none of the
> applications showed the candidates would be capable of doing the
> project they were applying for.

OK. It's essentially too late to change this for this year, but next
time we should discuss earlier about how we want to organize this
git.git/libgit2 thing. For example, I think it would make little sense
to have a git.git microproject and then apply for a libgit2 project
since we have very different ways of interacting. And honnestly, right
now the application is really git.git-centric so I don't think it
attracts students towards libgit2. So, if you want to attract more
students, we should work on that.

I tried to clarify the situation with libgit2:

https://github.com/git/git.github.io/commit/94d1747eb9621b3bc892be2f232338b7933ac271

Please say if you're happy/unhappy with what I wrote. PRs are still
welcome after the deadline.

> I'll ask around to make sure people would be able to be mentors, but I
> think that we would still like to put forward a few projects (I can
> send a PR with the projects that we would still like to see to the 2016
> page).

We don't have this everywhere, but having a "potential mentors" field
for projects also helps (students, and us, at least to make sure we do
have mentors).

Cheers,

-- 
Matthieu Moy
http://www-verimag.imag.fr/~moy/
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

1 2 >

100 matches

Mail list logo