Re: [PATCH v4 1/1] commit-reach: properly peel tags and clear flags

2018-09-24 Thread Eric Sunshine
On Mon, Sep 24, 2018 at 4:58 PM Derrick Stolee via GitGitGadget
 wrote:
> diff --git a/commit-reach.c b/commit-reach.c
> @@ -544,20 +544,42 @@ int can_all_from_reach_with_flag(struct object_array 
> *from,
>  {
> +   from_one = deref_tag(the_repository, from_one,
> +"a from object", 0);
> +   if (!from_one || from_one->type != OBJ_COMMIT) {
> +   /* no way to tell if this is reachable by
> +* looking at the ancestry chain alone, so
> +* leave a note to ourselves not to worry about
> +* this object anymore.
> +*/

Style nit:

/*
 * Multi-line comment
 * formatting.
 */


Re: Git for games working group

2018-09-24 Thread John Austin
On Mon, Sep 24, 2018 at 12:58 PM Taylor Blau  wrote:
> I'm replying to this part of the email to note that this would cause Git
> LFS to have to do some extra work, since running 'git lfs install'
> already writes to .git/hooks/post-commit (ironically, to detect and
> unlock locks that we should have released).

Right, that should have been another bullet point. The fact that there
can only be one git hook is.. frustrating.

Perhaps, if LFS has an option to bundle global-graph, LFS could merge
the hooks when installing?

If you instead install global-graph after LFS, I think it should
probably attempt something like:
  -- first move the existing hook to a folder: post-commit.d/
  -- install the global-graph hook to post-commit.d/
  -- install a new hook at post-commit that simply calls all
executables in post-commit.d/

Not sure if this is something that's been discussed, since I know LFS
has a similar issue with existing hooks, but might be sensible.



Re: [PATCH 3/8] refs: new ref types to make per-worktree refs visible to all worktrees

2018-09-24 Thread Stefan Beller
On Sat, Sep 22, 2018 at 11:05 AM Nguyễn Thái Ngọc Duy  wrote:
>
> One of the problems with multiple worktree is accessing per-worktree
> refs of one worktree from another worktree. This was sort of solved by
> multiple ref store, where the code can open the ref store of another
> worktree and has access to the ref space of that worktree.
>
> The problem with this is reporting. "HEAD" in another ref space is
> also called "HEAD" like in the current ref space. In order to
> differentiate them, all the code must somehow carry the ref store
> around and print something like "HEAD from this ref store".
>
> But that is not feasible (or possible with a _lot_ of work). With the
> current design, we pass a reference around as a string (so called
> "refname"). Extending this design to pass a string _and_ a ref store
> is a nightmare, especially when handling extended SHA-1 syntax.
>
> So we do it another way. Instead of entering a separate ref space, we
> make refs from other worktrees available in the current ref space. So
> "HEAD" is always HEAD of the current worktree, but then we can have
> "worktrees/blah/HEAD" to denote HEAD from a worktree named
> "blah". This syntax coincidentally matches the underlying directory
> structure which makes implementation a bit easier.
>
> The main worktree has to be treated specially because well.. it's
> special from the beginning. So HEAD from the main worktree is
> acccessible via the name "main/HEAD" (we can't use
> "worktrees/main/HEAD" because "main" under "worktrees" is not
> reserved).
>
> This patch also makes it possible to specify refs from one worktree in
> another one, e.g.
>
> git log worktrees/foo/HEAD

This has strong similarities to remote refs:
Locally I may have a branch master, whose (stale local copy of its
distributed) counterpart is named origin/master.

It is also possible to have a working tree named origin
(just I like to name my worktree "git", when working on git.git),
how do we differentiate between the neighbor-worktree
"origin/master" and the remote-tracking branch "origin/master" ?

As the remote tracking branches are shared between all
worktree there is no need to differentiate between a
local-worktree remote tracking branch and a
neighbor-worktree remote tracking branch.

Now that you introduce the magic main working tree,
we also need to disallow working trees to be named "main",
i.e.
$ git worktree add main HEAD

produces

  $ ls .git/worktrees/
  main

How do we deal with that?


Re: [PATCH 2/8] Add a place for (not) sharing stuff between worktrees

2018-09-24 Thread Stefan Beller
On Sat, Sep 22, 2018 at 11:05 AM Nguyễn Thái Ngọc Duy  wrote:
>
> When multiple worktrees are used, we need rules to determine if
> something belongs to one worktree or all of them. Instead of keeping
> adding rules when new stuff comes, have a generic rule:
>
> - Inside $GIT_DIR, which is per-worktree by default, add
>   $GIT_DIR/common which is always shared. New features that want to
>   share stuff should put stuff under this directory.

So that /common is a directory and you have to use it specifically
in new code? That would be easy to overlook when coming up
with $GIT_DIR/foo for implementing the git-foo.

>
> - Inside refs/, which is shared by default except refs/bisect, add
>   refs/local/ which is per-worktree. We may eventually move
>   refs/bisect to this new location and remove the exception in refs
>   code.

That sounds dangerous to me. There is already a concept of
local and remote-tracking branches. So I would think that local
may soon become an overused word, (just like "index" today or
"recursive" to a lesser extend).

Could this special area be more explicit?
(refs/worktree-local/ ? or after peeking at the docs below
refs/un-common/ ?)


Re: [PATCH v2] mailmap: consistently normalize brian m. carlson's name

2018-09-24 Thread Jonathan Nieder
brian m. carlson wrote:
> On Mon, Sep 24, 2018 at 10:39:02AM -0700, Jonathan Nieder wrote:
>> brian m. carlson wrote:

>>> I think this commit message makes sense.
[...]
>>  What would it take to make the patch make
>> sense, too? ;-)
>
> I certainly didn't mean to imply a failing on your part for explaining
> the change adequately.  I've just always found the format confusing and
> I know others do, too.

No worries.  I took the opportunity because the patch isn't in "next"
yet so I was looking for a way to nudge it forward.  I think v2 is
simpler than v1.  Thanks for your help.

[...]
> This has been a really helpful explanation.  Thanks.
>
> Maybe I'll have some time over the next week or so to send a patch to
> the documentation to make it more understandable to past me.

Sounds good.  I think the (non-)handling of the 'name  name'
case is likely to be a bug.

>> How about this?
[...]
> Having read your explanation, this looks good.  Thanks for fixing this.

\o/ Thanks for looking it over.

Sincerely,
Jonathan


Re: [PATCH] submodule.c: Make get_superproject_working_tree() work when supermodule has unmerged changes of the submodule reference

2018-09-24 Thread Stefan Beller
On Mon, Sep 24, 2018 at 2:25 PM Sam McKelvie  wrote:

Thanks for writing a patch!
I wonder if we can shorten the subject and make it a bit more concise,
how about

submodule: Alllow staged changes for get_superproject_working_tree

or

rev-parse: allow staged submodule in --show-superproject-working-tree


> Previously, "fatal: BUG: returned path string doesn't match cwd?" is 
> displayed and
> git-rev-parse aborted.

Usually it is hard to read, continuing from the commit title to the body
of the commit message, as the title may be displayed differently or
somewhere else, so it would be good to restate it or rather state the
command that is broken, which we are trying to fix.

Invoking 'git rev-parse --show-superproject-working-tree' exits with

fatal: BUG ...

instead of displaying the superproject working tree when 

> The problem is due to the fact that when a merge of the submodule 
> reference is in progress,
> "git --stage —full-name ” returns three seperate 
> entries for the

"ls-files" is missing in that git invocation?

> submodule (one for each stage) rather than a single entry; e.g.,
>
> $ git ls-files --stage --full-name submodule-child-test
> 16 dbbd2766fa330fa741ea59bb38689fcc2d283ac5 1   
> submodule-child-test
> 16 f174d1dbfe863a59692c3bdae730a36f2a788c51 2   
> submodule-child-test
> 16 e6178f3a58b958543952e12824aa2106d560f21d 3   
> submodule-child-test
>
> The code in get_superproject_working_tree() expected exactly one entry to 
> be returned;
> this patch makes it use the first entry if multiple entries are returned.

The code looks good.

I wonder if we want to add a test for it, see t/t1500-rev-parse.sh
(at the very end 'showing the superproject correctly').

>
> Signed-off-by: Sam McKelvie 

Side note: the commit message seems to be indented,
but the patch applies fine, which makes me curious:
How did you produce the patch?
(Usually a patch would not apply as white spaces are
mangled in the code for example in some email setups)
(If I had to guess I could imagine that it is the output of
git-show as that indents the commit message usually,
see git-format-patch for a better tool)

Thanks for writing the patch!
Stefan


Re: [PATCH v2] mailmap: consistently normalize brian m. carlson's name

2018-09-24 Thread brian m. carlson
On Mon, Sep 24, 2018 at 10:39:02AM -0700, Jonathan Nieder wrote:
> Hi,
> 
> brian m. carlson wrote:
> 
> > I think this commit message makes sense.  I apparently still fail to
> > understand how the .mailmap format works, so I can't tell you if the
> > patch is correct.
> 
> Thanks for looking it over.  What would it take to make the patch make
> sense, too? ;-)

I certainly didn't mean to imply a failing on your part for explaining
the change adequately.  I've just always found the format confusing and
I know others do, too.

> Most mailmap entries are of the form
> 
>   Some Name 
> 
> which means "Wherever you see the email address someem...@example.com,
> canonicalize the author's name to Some Name".  We can use that:
> 
>   brian m. carlson 
> 
> When we see sand...@crustytoothpaste.ath.cx, we also want to
> canonicalize the email address.  For that, we can do
> 
>   brian m. carlson  
> 
> 
> There's only one person who has used these email addresses, so we
> don't have to do matching by name.  If we wanted to tighten the name
> normalization to match by name, I think we'd do something like
> 
>   brian m. carlson  Brian M. Carlson
> 
> but I can't get that to seem to have any effect when I test with the
> "git check-mailmap" command --- for example, "git check-mailmap 'Dana
> How '" does not map and "git check-mailmap
> 'Random Name '" maps to 'Dana L. How
> '.
> 
> The even tighter matching used in v1
> 
>   brian m. carlson  Brian M. Carlson 
> 
> 
> does work, but it's unnecessary complexity.  We don't need it.

This has been a really helpful explanation.  Thanks.

Maybe I'll have some time over the next week or so to send a patch to
the documentation to make it more understandable to past me.

> How about this?
> 
> Changes since v1:
> - loosened the matching to only look at email and ignore name
> - no other changes
> 
>  .mailmap | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/.mailmap b/.mailmap
> index f165222a78..bef3352b0d 100644
> --- a/.mailmap
> +++ b/.mailmap
> @@ -25,7 +25,7 @@ Ben Walton  
>  Benoit Sigoure  
>  Bernt Hansen  
>  Brandon Casey  
> -brian m. carlson  Brian M. Carlson 
> 
> +brian m. carlson 
>  brian m. carlson  
> 
>  Bryan Larsen  
>  Bryan Larsen  

Having read your explanation, this looks good.  Thanks for fixing this.
-- 
brian m. carlson: Houston, Texas, US
OpenPGP: https://keybase.io/bk2204


signature.asc
Description: PGP signature


Re: [RFC PATCH 3/3] diff: add --color-moved-ws=allow-mixed-indentation-change

2018-09-24 Thread Stefan Beller
On Mon, Sep 24, 2018 at 3:06 AM Phillip Wood  wrote:
>
> From: Phillip Wood 
>
> This adds another mode for highlighting lines that have moved with an
> indentation change. Unlike the existing
> --color-moved-ws=allow-indentation-change setting this mode uses the
> visible change in the indentation to group lines, rather than the
> indentation string.

Wow! Thanks for putting this RFC out.
My original vision was to be useful to python users as well,
which counts 1 tab as 8 spaces IIUC.

The "visual" indentation you mention here sounds like
a tab is counted as "up to the next position of (n-1) % 8",
i.e. stop at positions 8, 16, 24... which would not be pythonic,
but useful in e.g. our code base.

> This means it works with files that use a mix of
> tabs and spaces for indentation and can cope with whitespace errors
> where there is a space before a tab

Cool!

> (it's the job of
> --ws-error-highlight to deal with those errors, it should affect the
> move detection).

Not sure I understand this side note. So --ws-error-highlight can
highlight them, but the move detection should *not*(?) be affected
by the highlighted parts, or it should do things differently on
whether  --ws-error-highlight is given?

> It will also group the lines either
> side of a blank line if their indentation change matches so short
> lines followed by a blank line followed by more lines with the same
> indentation change will be correctly highlighted.

That sounds very useful (at least for my editor, that strips
blank lines to be empty lines), but I would think this feature is
worth its own commit/patch.

I wonder how much this feature is orthogonal to the existing
problem of detecting the moved indented blocks (existing
allow-indentation-change vs the new feature discussed first
above)

>
> This is a RFC as there are a number of questions about how to proceed
> from here:
>  1) Do we need a second option or should this implementation replace
> --color-moved-ws=allow-indentation-change. I'm unclear if that mode
> has any advantages for some people. There seems to have been an
> intention [1] to get it working with mixes of tabs and spaces but
> nothing ever came of it.

Oh, yeah, I was working on that, but dropped the ball.

I am not sure what the best end goal is, or if there are many different
modes that are useful to different target audiences.
My own itch at the time was (de-/)in-dented code from refactoring
patches for git.git and JGit (so Java, C, shell); and I think not hurting
python would also be good.

ignoring the mixture of ws seems like it would also cater free text or
other more exotic languages.

What is your use case, what kind of content do you process that
this patch would help you?

I am not overly attached to the current implementation of
 --color-moved-ws=allow-indentation-change,
and I think Junio has expressed the fear of "too many options"
already in this problem space, so if possible I would extend/replace
the current option.

>  2) If we keep two options what should this option be called, the name
> is long and ambiguous at the moment - mixed could refer to mixed
> indentation length rather than a mix of tabs and spaces.

Let's first read the code to have an opinion, or re-state the question
from above ("What is this used for?") as I could imagine one of the
modes could be "ws-pythonic" and allow for whitespace indentation
that would have the whole block count as an indented by the same
amount, (e.g. if you wrap a couple functions in python by a class).

> +++ b/diff.c
> @@ -304,7 +304,11 @@ static int parse_color_moved_ws(const char *arg)
> else if (!strcmp(sb.buf, "ignore-all-space"))
> ret |= XDF_IGNORE_WHITESPACE;
> else if (!strcmp(sb.buf, "allow-indentation-change"))
> -   ret |= COLOR_MOVED_WS_ALLOW_INDENTATION_CHANGE;
> +   ret = COLOR_MOVED_WS_ALLOW_INDENTATION_CHANGE |
> +(ret & 
> ~COLOR_MOVED_WS_ALLOW_MIXED_INDENTATION_CHANGE);

So this RFC lets "allow-indentation-change" override
"allow-mixed-indentation-change" and vice versa. That
also solves the issue of configuring one and giving the other
as a command line option. Nice.

> if ((ret & COLOR_MOVED_WS_ALLOW_INDENTATION_CHANGE) &&
> (ret & XDF_WHITESPACE_FLAGS))
> die(_("color-moved-ws: allow-indentation-change cannot be 
> combined with other white space modes"));
> +   else if ((ret & COLOR_MOVED_WS_ALLOW_MIXED_INDENTATION_CHANGE) &&
> +(ret & XDF_WHITESPACE_FLAGS))
> +   die(_("color-moved-ws: allow-mixed-indentation-change cannot 
> be combined with other white space modes"));

Do we want to open a bit mask for all indentation change options? e.g.
#define COLOR_MOVED_WS_INDENTATION_MASK \
(COLOR_MOVED_WS_ALLOW_INDENTATION_CHANGE | \
 COLOR_MOVED_WS_ALLOW_MIXED_INDENTATION_CHANGE)

> @@ -763,11 +770,65 @@ struct moved_entry 

Re: What's cooking in git.git (Sep 2018, #05; Mon, 24)

2018-09-24 Thread Derrick Stolee

On 9/24/2018 6:06 PM, Junio C Hamano wrote:

* ds/reachable (2018-09-21) 2 commits
   (merged to 'next' on 2018-09-21 at d4cd62108e)
  + commit-reach: fix memory and flag leaks
  + commit-reach: properly peel tags

  Recent update broke the reachability algorithm when refs (e.g.
  tags) that point at objects that are not commit were involved,
  which has been fixed.
  Hotfix for a topic already in 'master'.


Sorry that I took so long to reroll a v4. The changes from v3 to v4 are 
mostly cleanup, so I'll send those as a single commit later, and they 
don't need to be rushed to master as nothing is broken in the version 
you merged.




* ds/reachable-topo-order (2018-09-21) 7 commits
  - revision.c: refactor basic topo-order logic
  - revision.h: add whitespace in flag definitions
  - commit/revisions: bookkeeping before refactoring
  - revision.c: begin refactoring --topo-order logic
  - test-reach: add rev-list tests
  - test-reach: add run_three_modes method
  - prio-queue: add 'peek' operation

  The revision walker machinery learned to take advantage of the
  commit generation numbers stored in the commit-graph file.


I've had limited review of this topic that is very exciting (to me). I 
know that last commit is a bit daunting, but I'd love someone to take a 
look.




* ds/format-commit-graph-docs (2018-08-21) 2 commits
  - commit-graph.txt: improve formatting for asciidoc
  - Docs: Add commit-graph tech docs to Makefile

  Design docs for the commit-graph machinery is now made into HTML as
  well as text.

  Will discard.
  I am inclined to drop these, as I do not see much clarity in HTML
  output over the text source.  Opinions?


Just a reminder that we agreed to drop this series.

Thanks,

-Stolee



Re: [BUG] A part of an edge from an octopus merge gets colored, even with --color=never

2018-09-24 Thread Noam Postavsky
On Sat, 8 Sep 2018 at 12:13, Jeff King  wrote:

> Great (and sorry for the delayed response).

No problem, I know it's not the most urgent bug ever :)

> So I think this is doing the right thing. I'm not sure if there's a
> better way to explain "dashless" or not.

I've updated the comments and renamed a few variables, see if that helps.

> Do you feel comfortable trying to add something to the test suite for
> this?

Um, sort of. I managed to recast my script into the framework of the
other tests (see attached t4299-octopus.sh); it seems like it should
go into t4202-log.sh, but it's not clear to me how I can do this
without breaking all the other tests which expect a certain sequence
of commits.


t4299-octopus.sh
Description: Bourne shell script
From ade526d32f692cae06000bb413ff29dad3f6109e Mon Sep 17 00:00:00 2001
From: Noam Postavsky 
Date: Sat, 1 Sep 2018 20:07:16 -0400
Subject: [PATCH v4] log: Fix coloring of certain octupus merge shapes

For octopus merges where the first parent edge immediately merges into
the next column to the left:

| *-.
| |\ \
|/ / /

then the number of columns should be one less than the usual case:

| *-.
| |\ \
| | | *

Also refactor the code to iterate over columns rather than dashes,
building from an initial patch suggestion by Jeff King.

Signed-off-by: Noam Postavsky 
---
 graph.c | 56 +---
 1 file changed, 41 insertions(+), 15 deletions(-)

diff --git a/graph.c b/graph.c
index e1f6d3bdd..a3366f6da 100644
--- a/graph.c
+++ b/graph.c
@@ -842,27 +842,53 @@ static void graph_output_commit_char(struct git_graph *graph, struct strbuf *sb)
 }
 
 /*
- * Draw an octopus merge and return the number of characters written.
+ * Draw the horizontal dashes of an octopus merge and return the number of
+ * characters written.
  */
 static int graph_draw_octopus_merge(struct git_graph *graph,
 struct strbuf *sb)
 {
 	/*
-	 * Here dashless_commits represents the number of parents
-	 * which don't need to have dashes (because their edges fit
-	 * neatly under the commit).
-	 */
-	const int dashless_commits = 2;
-	int col_num, i;
-	int num_dashes =
-		((graph->num_parents - dashless_commits) * 2) - 1;
-	for (i = 0; i < num_dashes; i++) {
-		col_num = (i / 2) + dashless_commits + graph->commit_index;
-		strbuf_write_column(sb, >new_columns[col_num], '-');
+	 * Here dashless_parents represents the number of parents which don't
+	 * need to have dashes (the edges labeled "0" and "1").  And
+	 * dashful_parents are the remaining ones.
+	 *
+	 * | *---.
+	 * | |\ \ \
+	 * | | | | |
+	 * x 0 1 2 3
+	 *
+	 */
+	const int dashless_parents = 2;
+	int dashful_parents = graph->num_parents - dashless_parents;
+
+	/*
+	 * Usually, each parent gets its own column, like the diagram above, but
+	 * sometimes the first parent goes into an existing column, like this:
+	 *
+	 * | *---.
+	 * | |\ \ \
+	 * |/ / / /
+	 * x 0 1 2
+	 *
+	 * In which case there will be more parents than the delta of columns.
+	 */
+	int delta_cols = (graph->num_new_columns - graph->num_columns);
+	int parent_in_old_cols = graph->num_parents - delta_cols;
+
+	/*
+	 * In both cases, commit_index corresponds to the edge labeled "0".
+	 */
+	int first_col = graph->commit_index + dashless_parents
+	- parent_in_old_cols;
+
+	int i;
+	for (i = 0; i < dashful_parents; i++) {
+		strbuf_write_column(sb, >new_columns[i+first_col], '-');
+		strbuf_write_column(sb, >new_columns[i+first_col],
+i == dashful_parents-1 ? '.' : '-');
 	}
-	col_num = (i / 2) + dashless_commits + graph->commit_index;
-	strbuf_write_column(sb, >new_columns[col_num], '.');
-	return num_dashes + 1;
+	return 2 * dashful_parents;
 }
 
 static void graph_output_commit_line(struct git_graph *graph, struct strbuf *sb)
-- 
2.11.0



Re: git silently ignores include directive with single quotes

2018-09-24 Thread Stas Bekman
On 2018-09-24 02:08 PM, Ævar Arnfjörð Bjarmason wrote:
[...]

> Posting to this mailing list is generally how it's done

Thank you, Ævar, for clarifying that there is no issue tracker for the
git project.

> The best way to fix stuff in git that you can't interest others in is to
> do it yourself. Take a look at Documentation/SubmittingPatches in the
> git.git repository for how to do that.

Based on the initial rich discussion this post created I had a feeling
that there was a lot of interest. But you're correct, it's easier to
share one's thought and patches take a lot more effort and time.

> In particular, starting by clarifying the docs around this as I
> suggested upthread might be a good and easy start to your first
> contribution to git!

That's an excellent idea. Philip started the process and hopefully it
will lead to better documentation of the issue.

Thanks again.


-- 

Stas Bekman   <'><   <'><
https://stasosphere.com  https://chestofbooks.com
https://experientialsexlab.com https://stason.org
https://stasosphere.com/experience-life/my-books


Re: [RFC PATCH 1/3] xdiff-interface: make xdl_blankline() available

2018-09-24 Thread Stefan Beller
On Mon, Sep 24, 2018 at 3:06 AM Phillip Wood  wrote:
>
> From: Phillip Wood 
>
> This will be used by the move detection code.
>
> Signed-off-by: Phillip Wood 
> ---
>  xdiff-interface.c | 5 +
>  xdiff-interface.h | 5 +
>  2 files changed, 10 insertions(+)
>
> diff --git a/xdiff-interface.c b/xdiff-interface.c
> index 9315bc0ede..eceabfa72d 100644
> --- a/xdiff-interface.c
> +++ b/xdiff-interface.c
> @@ -308,6 +308,11 @@ int xdiff_compare_lines(const char *l1, long s1,
> return xdl_recmatch(l1, s1, l2, s2, flags);
>  }
>
> +int xdiff_is_blankline(const char *l1, long s1, long flags)
> +{
> +   return xdl_blankline(l1, s1, flags);
> +}
> +
>  int git_xmerge_style = -1;
>
>  int git_xmerge_config(const char *var, const char *value, void *cb)
> diff --git a/xdiff-interface.h b/xdiff-interface.h
> index 135fc05d72..d0008b016f 100644
> --- a/xdiff-interface.h
> +++ b/xdiff-interface.h
> @@ -45,4 +45,9 @@ extern int xdiff_compare_lines(const char *l1, long s1,
>   */
>  extern unsigned long xdiff_hash_string(const char *s, size_t len, long 
> flags);
>
> +/*
> + * Returns 1 if the line is blank, taking XDF_WHITESPACE_FLAGS into account

presumably in the flags field.

> + */
> +extern int xdiff_is_blankline(const char *s, long len, long flags);

We also have
int ws_blank_line(const char *, int, int)
that looks very similar, but works slightly differently.
grep.c has
static int is_empty_line(const char *bol, const char *eol)
{
   while (bol < eol && isspace(*bol))
   bol++;
   return bol == eol;
}

pretty.c also has a is_blank_line() (and some stale comment in
that file refers to it as is_empty_line, 77356122443
(pretty: make the skip_blank_lines() function public,
2016-06-22)

Would we be able to unify all these down to one or two
functions? (Maybe all can use the new xdiff function?)
It seems as if we're reinventing the wheel a couple times
in our code base.

Stefan


Re: [PATCH v2 3/3] transport.c: introduce core.alternateRefsPrefixes

2018-09-24 Thread Jeff King
On Mon, Sep 24, 2018 at 02:55:57PM -0700, Junio C Hamano wrote:

> Jeff King  writes:
> 
> > Do you have an opinion on whether for_each_alternate_refs() interface
> > should stop passing back refnames? By the "they may not even exist"
> > rationale in this sub-thread, I think it's probably foolish for any
> > caller to actually depend on the names being meaningful.
> 
> I personally do not mind they were all ".have" or unnamed.
> 
> The primary motivatgion behind for-each-alternate-refs was that we
> wanted to find more anchoring points to help the common ancestry
> negotiation and for-each-*-ref was the obvious way to do so; the
> user did not care anything about names.

Right, I think that is totally fine for the current uses. I guess my
question was: do you envision cutting the interface down to only the
oids to bite us in the future?

I was on the fence during past discussions, but I think I've come over
to the idea that the refnames actively confuse things.

-Peff


Re: [PATCH v5 1/9] submodule: add a print_config_from_gitmodules() helper

2018-09-24 Thread Stefan Beller
> > +int print_config_from_gitmodules(const char *key)
>
> I am thinking about adding  a "struct repository" argument to this
> function

Sounds like a good idea.


Re: [PATCH 0/1] Re: git silently ignores include directive with single quotes

2018-09-24 Thread Stas Bekman
On 2018-09-24 03:24 PM, Philip Oakley wrote:
> Rather than attaching the problem with code, I decided to simply update
> the config file documentation.
> 
> As the userbase expands the documentation will need to be more comprehensive
> about exclusions and omissions, along with better highlighting for core
> areas.
> 
> I would be useful if Stas could comment on whether these changes would
> have assisted in debugging the faulty config file. 

Thank you for writing this doc patch, Philip.

The documentation improvement would be most useful in conjunction with a
an improved debugging/tracing facility. So that a user can see what git
is seeing. Once a user sees that their configuration is broken then they
can peruse the improved documentation to find why it is broken. Without
the debugging ability, the docs would help but it'll be a much longer
journey, since words like:

"Single quotes are not special and form part of the variable's value."

aren't necessarily going to stand out as an indicator of a potential
problem, when you won't think twice that quotes could even be a suspect,
even though the docs say so explicitly.

A trace saying:

"./.git/'.gitconfig'" is not found

would speak volumes and be self-documenting.

In lieu of that, the docs would be need to have more examples.

Here are the potential expansions to the patch you shared:

1. "Single quotes are not special and form part of the variable's value.
For example, if the configuration includes:

  include = '.gitconfig'

then git will look for "'.gitconfig'", single quotes included. Also note
that it'll look for the file relative to "REPO/.git/", hence it'll look
for "REPO/.git/'.gitconfig'", which is most likely incorrect, since you
can't check in files under "REPO/.git/". The correct configuration for
including "REPO/.gitconfig" is:

  include = ../.gitconfig

2. Same with:

"Both the `include` and `includeIf` sections implicitly apply an 'if
found' condition to the given path names."

To a user this would be a difficult statement to make sense of. An
example would fix that:

"Both the `include` and `includeIf` sections implicitly apply an 'if
found' condition to the given path names. For example, if the
configuration includes:

  include = ../.gitconfig

and git finds "REPO/.gitconfig", it will include its configuration. If
git can't find it, it will silently ignore this include statement until
this file appears. It has been designed this way to allow for optional
user-specific configuration facilities."

Thank you.

-- 

Stas Bekman   <'><   <'><
https://stasosphere.com  https://chestofbooks.com
https://experientialsexlab.com https://stason.org
https://stasosphere.com/experience-life/my-books


[PATCH 0/1] Re: git silently ignores include directive with single quotes

2018-09-24 Thread Philip Oakley
Rather than attaching the problem with code, I decided to simply update
the config file documentation.

As the userbase expands the documentation will need to be more comprehensive
about exclusions and omissions, along with better highlighting for core
areas.

I would be useful if Stas could comment on whether these changes would
have assisted in debugging the faulty config file. 

Philip Oakley (1):
  config doc: highlight the name=value syntax

 Documentation/config.txt | 16 
 1 file changed, 12 insertions(+), 4 deletions(-)

-- 
2.17.1.windows.2



[PATCH 1/1] config doc: highlight the name=value syntax

2018-09-24 Thread Philip Oakley
Stas Bekman reported [1] that Git config was not accepting single quotes
around a filename as may have been expected by shell users.

Highlight the 'name = value' syntax with its own heading. Clarify that
single quotes are not special here. Also point to this paragraph in the
'include' section regarding pathnames.

In addition clarify that missing include file paths are not an error, but
rather an implicit 'if found' for include files.

[1] 
https://public-inbox.org/git/ca2b192e-1722-092e-2c54-d79d21a66...@stason.org/

Reported-by: Stas Bekman 
Signed-off-by: Philip Oakley 
---
 Documentation/config.txt | 16 
 1 file changed, 12 insertions(+), 4 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 1264d91fa3..b65fd6138d 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -19,8 +19,8 @@ characters and `-`, and must start with an alphabetic 
character.  Some
 variables may appear multiple times; we say then that the variable is
 multivalued.
 
-Syntax
-~~
+Config file Syntax
+~~
 
 The syntax is fairly flexible and permissive; whitespaces are mostly
 ignored.  The '#' and ';' characters begin comments to the end of line,
@@ -56,6 +56,9 @@ syntax, the subsection name is converted to lower-case and is 
also
 compared case sensitively. These subsection names follow the same
 restrictions as section names.
 
+Variable name/value syntax
+^^
+
 All the other lines (and the remainder of the line after the section
 header) are recognized as setting variables, in the form
 'name = value' (or just 'name', which is a short-hand to say that
@@ -69,7 +72,8 @@ stripped.  Leading whitespaces after 'name =', the remainder 
of the
 line after the first comment character '#' or ';', and trailing
 whitespaces of the line are discarded unless they are enclosed in
 double quotes.  Internal whitespaces within the value are retained
-verbatim.
+verbatim. Single quotes are not special and form part of the
+variable's value.
 
 Inside double quotes, double quote `"` and backslash `\` characters
 must be escaped: use `\"` for `"` and `\\` for `\`.
@@ -89,10 +93,14 @@ each other with the exception that `includeIf` sections may 
be ignored
 if their condition does not evaluate to true; see "Conditional includes"
 below.
 
+Both the `include` and `includeIf` sections implicitly apply an 'if found'
+condition to the given path names.
+
 You can include a config file from another by setting the special
 `include.path` (or `includeIf.*.path`) variable to the name of the file
 to be included. The variable takes a pathname as its value, and is
-subject to tilde expansion. These variables can be given multiple times.
+subject to tilde expansion and the value syntax detailed above.
+These variables can be given multiple times.
 
 The contents of the included file are inserted immediately, as if they
 had been found at the location of the include directive. If the value of the
-- 
2.17.1.windows.2



Re: [PATCH 1/1] read-cache: update index format default to v4

2018-09-24 Thread Junio C Hamano
SZEDER Gábor  writes:

> On Mon, Sep 24, 2018 at 02:15:30PM -0700, Derrick Stolee via GitGitGadget 
> wrote:
>> From: Derrick Stolee 
>> 
>> The index v4 format has been available since 2012 with 9d22778
>> "reach-cache.c: write prefix-compressed names in the index". Since
>> the format has been stable for so long, almost all versions of Git
>> in use today understand version 4, removing one barrier to upgrade
>> -- that someone may want to downgrade and needs a working repo.
>
> What about alternative implementations, like JGit, libgit2, etc.?

Good question.

Because the index-version of an index file is designed to be sticky,
repos that need to be accessed by other implementations can keep
whatever current version.  A new repo that need to be accessed by
them can be (forcibly) written in v2 and keep its v2ness, I would
think.

And that would serve as an incentive for the implementations to
catch up ;-)





Re: bug in 'git describe'?

2018-09-24 Thread Junio C Hamano
Sebastian Kuzminsky  writes:

> I've got two tiny git repos whose commit graphs are identical, but
> where 'git describe' gives different results.
> ...
> The histories differ only in the timestamps of the commits...

describe does take the commit timestamps into account, so it is
expected you would get different results out of an otherwise
identically looking graph.


What's cooking in git.git (Sep 2018, #05; Mon, 24)

2018-09-24 Thread Junio C Hamano
Here are the topics that have been cooking.  Commits prefixed with
'-' are only in 'pu' (proposed updates) while commits prefixed with
'+' are in 'next'.  The ones marked with '.' do not appear in any of
the integration branches, but I am still holding onto them.

The tip of 'master' has been updated with about 20 topics, the tip
of 'next' has been rewound, and 'maint' now prepares for Git 2.19.x
maintenance track.  A few topics have been kicked out of 'next' to
'pu' to be replaced with a newer iterations.

Please double check what's in and not in 'next', and when you are
rerolling a topic in flight, make sure to make patches incremental
if your topic is already in 'next'.

You can find the changes described here in the integration branches
of the repositories listed at

http://git-blame.blogspot.com/p/git-public-repositories.html

--
[Graduated to "master"]

* bp/mv-submodules-with-fsmonitor (2018-09-12) 1 commit
  (merged to 'next' on 2018-09-17 at 61c3dc4ebe)
 + git-mv: allow submodules and fsmonitor to work together

 When fsmonitor is in use, after operation on submodules updates
 .gitmodules, we lost track of the fact that we did so and relied on
 stale fsmonitor data.


* bw/protocol-v2 (2018-09-10) 1 commit
  (merged to 'next' on 2018-09-17 at 973a67bf55)
 + config: document value 2 for protocol.version

 Doc fix.


* ds/format-patch-range-diff-test (2018-09-12) 1 commit
  (merged to 'next' on 2018-09-17 at bd99e0e88c)
 + t3206-range-diff.sh: cover single-patch case

 A new test to cover "--range-diff" option of format-patch used for
 a single patch.


* ds/reachable (2018-09-21) 2 commits
  (merged to 'next' on 2018-09-21 at d4cd62108e)
 + commit-reach: fix memory and flag leaks
 + commit-reach: properly peel tags

 Recent update broke the reachability algorithm when refs (e.g.
 tags) that point at objects that are not commit were involved,
 which has been fixed.
 Hotfix for a topic already in 'master'.


* en/double-semicolon-fix (2018-09-05) 1 commit
  (merged to 'next' on 2018-09-17 at dc3847b728)
 + Remove superfluous trailing semicolons

 Code clean-up.


* en/rerere-multi-stage-1-fix (2018-09-11) 2 commits
  (merged to 'next' on 2018-09-17 at 07b9b319ab)
 + rerere: avoid buffer overrun
 + t4200: demonstrate rerere segfault on specially crafted merge

 A corner case bugfix in "git rerere" code.


* en/sequencer-empty-edit-result-aborts (2018-09-13) 1 commit
  (merged to 'next' on 2018-09-17 at 768dcf3cab)
 + sequencer: fix --allow-empty-message behavior, make it smarter

 "git rebase" etc. in Git 2.19 fails to abort when given an empty
 commit log message as result of editing, which has been corrected.


* en/update-ref-no-deref-stdin (2018-09-12) 2 commits
  (merged to 'next' on 2018-09-17 at a56c0a3003)
 + update-ref: allow --no-deref with --stdin
 + update-ref: fix type of update_flags variable to match its usage

 "git update-ref" learned to make both "--no-deref" and "--stdin"
 work at the same time.


* jk/dev-build-format-security (2018-09-11) 1 commit
  (merged to 'next' on 2018-09-17 at 36fbb6a88b)
 + config.mak.dev: add -Wformat-security

 Build tweak to help developers.


* jk/reopen-tempfile-truncate (2018-09-05) 1 commit
  (merged to 'next' on 2018-09-17 at 7c7a0608e0)
 + reopen_tempfile(): truncate opened file

 Fix for a long-standing bug that leaves the index file corrupt when
 it shrinks during a partial commit.


* js/mingw-o-append (2018-09-11) 2 commits
  (merged to 'next' on 2018-09-17 at 5b6e9be48e)
 + mingw: fix mingw_open_append to work with named pipes
 + t0051: test GIT_TRACE to a windows named pipe

 Further fix for O_APPEND emulation on Windows


* js/rebase-i-autosquash-fix (2018-09-04) 2 commits
  (merged to 'next' on 2018-09-17 at cec540d24b)
 + rebase -i: be careful to wrap up fixup/squash chains
 + rebase -i --autosquash: demonstrate a problem skipping the last squash

 "git rebase -i" did not clear the state files correctly when a run
 of "squash/fixup" is aborted and then the user manually amended the
 commit instead, which has been corrected.


* jt/lazy-object-fetch-fix (2018-09-13) 2 commits
  (merged to 'next' on 2018-09-17 at 321602b284)
 + fetch-object: set exact_oid when fetching
 + fetch-object: unify fetch_object[s] functions

 The code to backfill objects in lazily cloned repository did not
 work correctly, which has been corrected.


* ms/remote-error-message-update (2018-09-14) 1 commit
  (merged to 'next' on 2018-09-17 at d37a215b62)
 + builtin/remote: quote remote name on error to display empty name

 Update error messages given by "git remote" and make them consistent.


* nd/attr-pathspec-fix (2018-09-21) 1 commit
  (merged to 'next' on 2018-09-21 at 9f6b5a497f)
 + add: do not accept pathspec magic 'attr'

 "git add ':(attr:foo)'" is not supported and is supposed to be
 rejected while the command line arguments are parsed, but we fail
 to reject such a command line upfront.


* 

bug in 'git describe'?

2018-09-24 Thread Sebastian Kuzminsky
I think I've run in to a bug in 'git describe' (reproduced with git 
2.11.0, 2.16.1, and 2.19.0.221.g150f307af).


I've got two tiny git repos whose commit graphs are identical, but where 
'git describe' gives different results.



*   merge 1.1 into 2.0  (HEAD -> release-2.0)
|\
| *   merge boo into 1.1  (tag: release/1.1.1, release-1.1)
| |\
| | * dummy commit  (boo)
| * | dummy commit  (tag: release/1.1.0)
* | |   merge feature into 2.0
|\ \ \
| * | | dummy commit  (feature)
|/ / /
* | | dummy commit  (tag: release/2.0.1)
* | | dummy commit  (tag: release/2.0.0)
|/ /
* | dummy commit  (release-1.0)
|/
* dummy commit  (tag: release/1.0.0)


The tag 'release/1.0.0' is the first commit in history.

The histories differ only in the timestamps of the commits (and thus the 
SHAs of the commit objects).


Good repo: release/2.0.1-6-gbc33a04
Bad repo: release/2.0.1-8-g2c0a20c

Details, including full copies of both repos, here:

http://highlab.com/~seb/git-describe-bug/


--
Sebastian Kuzminsky


Re: [PATCH v2 3/3] transport.c: introduce core.alternateRefsPrefixes

2018-09-24 Thread Junio C Hamano
Jeff King  writes:

> Do you have an opinion on whether for_each_alternate_refs() interface
> should stop passing back refnames? By the "they may not even exist"
> rationale in this sub-thread, I think it's probably foolish for any
> caller to actually depend on the names being meaningful.

I personally do not mind they were all ".have" or unnamed.

The primary motivatgion behind for-each-alternate-refs was that we
wanted to find more anchoring points to help the common ancestry
negotiation and for-each-*-ref was the obvious way to do so; the
user did not care anything about names.


Re: [PATCH 1/1] read-cache: update index format default to v4

2018-09-24 Thread SZEDER Gábor
On Mon, Sep 24, 2018 at 02:15:30PM -0700, Derrick Stolee via GitGitGadget wrote:
> From: Derrick Stolee 
> 
> The index v4 format has been available since 2012 with 9d22778
> "reach-cache.c: write prefix-compressed names in the index". Since
> the format has been stable for so long, almost all versions of Git
> in use today understand version 4, removing one barrier to upgrade
> -- that someone may want to downgrade and needs a working repo.

What about alternative implementations, like JGit, libgit2, etc.?

> Despite being stable for a long time, this index version was never
> adopted as the default. This prefix-compressed version of the format
> can get significant space savings on repos with large working
> directories (which naturally tend to have deep nesting). This version
> is set as the default for some external tools, such as VFS for Git.
> Because of this external use, the format has had a lot of "testing in
> production" and also is subject to continuous integration in these
> environments.
> 
> Previously, to test version 4 indexes, we needed to run the test
> suite with GIT_TEST_INDEX_VERSION=4 (or TEST_GIT_INDEX_VERSION=4).
> 
> One potential, but short-term, downside is that we lose coverage of
> the version 3 indexes. The trade-off is that we may want to cover
> that version using GIT_TEST_INDEX_VERSION=3.
> 
> Signed-off-by: Derrick Stolee 
> ---
>  read-cache.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/read-cache.c b/read-cache.c
> index 372588260e..af6c8f2a67 100644
> --- a/read-cache.c
> +++ b/read-cache.c
> @@ -1484,7 +1484,7 @@ struct cache_entry *refresh_cache_entry(struct 
> cache_entry *ce,
>   * Index File I/O
>   */
>  
> -#define INDEX_FORMAT_DEFAULT 3
> +#define INDEX_FORMAT_DEFAULT 4
>  
>  static unsigned int get_index_format_default(void)
>  {
> -- 
> gitgitgadget


Re: [PATCH] ref-filter: don't look for objects when outside of a repository

2018-09-24 Thread Jeff King
On Mon, Sep 24, 2018 at 11:20:34PM +0200, SZEDER Gábor wrote:

> > Would we perhaps want to give the user a hint that the object is not
> > really missing, but rather that we're not in a repository? E.g.,
> > something like:
> > 
> >   if (!have_git_dir())
> > return strbuf_addf_ret(err, -1, "format specifier requires a 
> > repository");
> >   if (oid_object_info_extended(...))
> > return ...;
> > 
> > ?
> 
> I think it makes sense.
> 
> I wanted to preserve the error message, because the description of
> '--sort=' in 'Documentation/git-ls-remote.txt' explicitly
> mentions it, and I added the condition at this place because I didn't
> want to duplicate the construction of the error message.

Ah, I didn't realize we actually documented that. And perhaps it is more
consistent, too: you'd get different results from running "ls-remote"
outside a repository versus one that just doesn't have the objects from
the other side.

> However, if we go for a more informative error message, then wouldn't
> it be better to add this condition in populate_value() before it even
> calls get_object()?  Then we could also add the problematic format
> specifier to the error message (I think, but didn't actually check),
> just in case someone specified multiple sort keys.

Yeah, that probably would be a better place. Though your response also
has made me think that maybe just sticking with the "missing object"
response is reasonable. I don't have a strong opinion between the two.

-Peff


[PATCH] submodule.c: Make get_superproject_working_tree() work when supermodule has unmerged changes of the submodule reference

2018-09-24 Thread Sam McKelvie
Previously, "fatal: BUG: returned path string doesn't match cwd?" is 
displayed and
git-rev-parse aborted.

The problem is due to the fact that when a merge of the submodule reference 
is in progress,
"git --stage —full-name ” returns three seperate 
entries for the
submodule (one for each stage) rather than a single entry; e.g.,

$ git ls-files --stage --full-name submodule-child-test
16 dbbd2766fa330fa741ea59bb38689fcc2d283ac5 1   submodule-child-test
16 f174d1dbfe863a59692c3bdae730a36f2a788c51 2   submodule-child-test
16 e6178f3a58b958543952e12824aa2106d560f21d 3   submodule-child-test

The code in get_superproject_working_tree() expected exactly one entry to 
be returned;
this patch makes it use the first entry if multiple entries are returned.

Signed-off-by: Sam McKelvie 
---
 submodule.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/submodule.c b/submodule.c
index 33de6ee5f..5b9d5ad7e 100644
--- a/submodule.c
+++ b/submodule.c
@@ -1885,7 +1885,7 @@ const char *get_superproject_working_tree(void)
 * We're only interested in the name after the tab.
 */
super_sub = strchr(sb.buf, '\t') + 1;
-   super_sub_len = sb.buf + sb.len - super_sub - 1;
+   super_sub_len = strlen(super_sub);
 
if (super_sub_len > cwd_len ||
strcmp([cwd_len - super_sub_len], super_sub))
-- 
2.19.0.605.g01d371f74.dirty



Re: [PATCH] ref-filter: don't look for objects when outside of a repository

2018-09-24 Thread SZEDER Gábor
On Mon, Sep 24, 2018 at 02:17:22PM -0400, Jeff King wrote:
> On Sat, Sep 22, 2018 at 04:11:45PM +0200, SZEDER Gábor wrote:
 
> > diff --git a/ref-filter.c b/ref-filter.c
> > index e1bcb4ca8a..3555bc29e7 100644
> > --- a/ref-filter.c
> > +++ b/ref-filter.c
> > @@ -1473,7 +1473,8 @@ static int get_object(struct ref_array_item *ref, int 
> > deref, struct object **obj
> > oi->info.sizep = >size;
> > oi->info.typep = >type;
> > }
> > -   if (oid_object_info_extended(the_repository, >oid, >info,
> > +   if (!have_git_dir() ||
> > +   oid_object_info_extended(the_repository, >oid, >info,
> >  OBJECT_INFO_LOOKUP_REPLACE))
> > return strbuf_addf_ret(err, -1, _("missing object %s for %s"),
> >oid_to_hex(>oid), ref->refname);
> 
> Would we perhaps want to give the user a hint that the object is not
> really missing, but rather that we're not in a repository? E.g.,
> something like:
> 
>   if (!have_git_dir())
>   return strbuf_addf_ret(err, -1, "format specifier requires a 
> repository");
>   if (oid_object_info_extended(...))
>   return ...;
> 
> ?

I think it makes sense.

I wanted to preserve the error message, because the description of
'--sort=' in 'Documentation/git-ls-remote.txt' explicitly
mentions it, and I added the condition at this place because I didn't
want to duplicate the construction of the error message.

However, if we go for a more informative error message, then wouldn't
it be better to add this condition in populate_value() before it even
calls get_object()?  Then we could also add the problematic format
specifier to the error message (I think, but didn't actually check),
just in case someone specified multiple sort keys.




[PATCH 1/1] read-cache: update index format default to v4

2018-09-24 Thread Derrick Stolee via GitGitGadget
From: Derrick Stolee 

The index v4 format has been available since 2012 with 9d22778
"reach-cache.c: write prefix-compressed names in the index". Since
the format has been stable for so long, almost all versions of Git
in use today understand version 4, removing one barrier to upgrade
-- that someone may want to downgrade and needs a working repo.

Despite being stable for a long time, this index version was never
adopted as the default. This prefix-compressed version of the format
can get significant space savings on repos with large working
directories (which naturally tend to have deep nesting). This version
is set as the default for some external tools, such as VFS for Git.
Because of this external use, the format has had a lot of "testing in
production" and also is subject to continuous integration in these
environments.

Previously, to test version 4 indexes, we needed to run the test
suite with GIT_TEST_INDEX_VERSION=4 (or TEST_GIT_INDEX_VERSION=4).

One potential, but short-term, downside is that we lose coverage of
the version 3 indexes. The trade-off is that we may want to cover
that version using GIT_TEST_INDEX_VERSION=3.

Signed-off-by: Derrick Stolee 
---
 read-cache.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/read-cache.c b/read-cache.c
index 372588260e..af6c8f2a67 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -1484,7 +1484,7 @@ struct cache_entry *refresh_cache_entry(struct 
cache_entry *ce,
  * Index File I/O
  */
 
-#define INDEX_FORMAT_DEFAULT 3
+#define INDEX_FORMAT_DEFAULT 4
 
 static unsigned int get_index_format_default(void)
 {
-- 
gitgitgadget


[PATCH 0/1] read-cache: update index format default to v4

2018-09-24 Thread Derrick Stolee via GitGitGadget
After discussing this with several people off-list, I thought I would open
the question up to the list:

Should we update the default index version to v4?

The .git/index file stores the list of every path tracked by Git in the
working directory, including merge information, staged paths, and
information about the file system contents (based on modified time). The
only major update in v4 is that the paths are prefix-compressed. This
compression works best in repos with a lot of paths, especially deep paths.
For this reason, we set the index to v4 in VFS for Git.

Among VFS for Git contributors, we were talking about how the v4 format is
not covered by the test suite by default. We are working to increase the
number of CI builds that set extra GIT_TEST_* variables that we need.
However, I thought it worth having a discussion of whether this is a good
thing to recommend for all users of Git.

Personally, I'm not an expert here, but I am happy to start the
conversation.

Thanks, -Stolee

Derrick Stolee (1):
  read-cache: update index format default to v4

 read-cache.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)


base-commit: 53f9a3e157dbbc901a02ac2c73346d375e24978c
Published-As: 
https://github.com/gitgitgadget/git/releases/tags/pr-41%2Fderrickstolee%2Findex-v4-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git 
pr-41/derrickstolee/index-v4-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/41
-- 
gitgitgadget


Re: [PATCH v4 1/1] commit-reach: properly peel tags and clear flags

2018-09-24 Thread Jeff King
On Mon, Sep 24, 2018 at 01:57:52PM -0700, Derrick Stolee via GitGitGadget wrote:

> From: Derrick Stolee 
> 
> The can_all_from_reach_with_flag() algorithm was refactored in 4fbcca4e
> "commit-reach: make can_all_from_reach... linear" but incorrectly
> assumed that all objects provided were commits. During a fetch
> negotiation, ok_to_give_up() in upload-pack.c may provide unpeeled tags
> to the 'from' array. The current code creates a segfault.
> [...]

Thanks, this version looks good to me.

-Peff


Re: git silently ignores include directive with single quotes

2018-09-24 Thread Ævar Arnfjörð Bjarmason


On Sun, Sep 23 2018, Stas Bekman wrote:

> Apologies for I don't know how this project manages issues, so I'm not
> sure whether it is my responsibility to make sure this issue gets
> resolved, or do you have some tracking mechanism where you have it
> registered? There is no rush, I'm asking because the discussion about
> this issue has suddenly dropped about 2 weeks ago, hence my ping.

Posting to this mailing list is generally how it's done, see
https://github.com/git/git/blame/v2.19.0/README.md#L30-L37

Git's a project worked on by a bunch of people who're either doing it as
a hobby, or are otherwise busy chasing stuff they're planning to work
on.

That doesn't mean the issue you reported doesn't matter, just that
realistically we have thousands of issues big and small at any given
time, and any new reported issue competes with those. There's always too
much to do, and too little time to do it.

Personally, I'm interested enough in this to muse about how it could be
fixed / what sort of general issues it exposes, but not enough to tackle
it myself, the general silence for a couple of weeks means a lot of
people share that sentiment (or care even less).

That doesn't mean this issue doesn't matter, or that it couldn't be
addressed in some way. I just wanted to try to give you some fair &
realistic summary of what's going on.

The best way to fix stuff in git that you can't interest others in is to
do it yourself. Take a look at Documentation/SubmittingPatches in the
git.git repository for how to do that.

In particular, starting by clarifying the docs around this as I
suggested upthread might be a good and easy start to your first
contribution to git!

I hope that helps.


Re: [PATCH v3 1/5] CodingGuidelines: add shell piping guidelines

2018-09-24 Thread SZEDER Gábor
On Thu, Sep 20, 2018 at 06:43:27PM -0700, Matthew DeVore wrote:
> Add two guidelines:
> 
>  - pipe characters should appear at the end of lines, and not cause
>indentation
>  - pipes should be avoided when they swallow exit codes that can
>potentially fail
> ---
>  Documentation/CodingGuidelines | 27 +++
>  1 file changed, 27 insertions(+)
> 
> diff --git a/Documentation/CodingGuidelines b/Documentation/CodingGuidelines
> index 48aa4edfb..6d265327c 100644
> --- a/Documentation/CodingGuidelines
> +++ b/Documentation/CodingGuidelines
> @@ -118,6 +118,24 @@ For shell scripts specifically (not exhaustive):
>   do this
>   fi
>  
> + - If a command sequence joined with && or || or | spans multiple
> +   lines, put each command on a separate line and put && and || and |
> +   operators at the end of each line, rather than the start. This
> +   means you don't need to use \ to join lines, since the above
> +   operators imply the sequence isn't finished.
> +
> + (incorrect)
> + grep blob verify_pack_result \
> + | awk -f print_1.awk \
> + | sort >actual &&
> + ...
> +
> + (correct)
> + grep blob verify_pack_result |
> + awk -f print_1.awk |
> + sort >actual &&
> + ...
> +

The above are general shell coding style guidelines, so it makes sense
to include them in 'Documentation/CodingGuidelines'.


>   - We prefer "test" over "[ ... ]".
>  
>   - We do not write the noiseword "function" in front of shell
> @@ -163,6 +181,15 @@ For shell scripts specifically (not exhaustive):
>  
> does not have such a problem.
>  
> + - In a piped chain such as "grep blob objects | sort", the exit codes

Let's make an example with git in it, e.g. something like this:

  git cmd | grep important | sort

since just two lines below the new text mentions git crashing.

> +   returned by processes besides the last are ignored. This means that
> +   if git crashes at the beginning or middle of a chain, it may go
> +   undetected. Prefer writing the output of that command to a
> +   temporary file with '>' rather than pipe it.
> +
> + - The $(git ...) construct also discards git's exit code, so if the

This contruct is called command substitution, and it does preserve the
command's exit code, when the expanded text is assigned to a variable:

  $ var=$(exit 42) ; echo $?
  42

Note, however, that even in that case only the exit code of the last
command substitution is preserved:

  $ var=$(exit 1)foo$(exit 2)bar$(exit 3) ; echo $?
  3

> +   goal is to test that particular command, redirect its output to a
> +   temporary file rather than wrap it with $( ).

I find this a bit vague, and to me it implies that ignoring the exit
code of a git command that is not the main focus of the given test is
acceptable, e.g. (made up pseudo example):

  test_expect_success 'fetch gets what it should' '
git fetch $remote &&
test "$(git rev-parse just-fetched)" = $expected_oid
  '

In my opinion no tests should ignore the exit code of any git
command, ever.


These last two points, however, are specific to test scripts,
therefore I think they would be better placed in 't/README', where the
rest of the test-specific guidelines are.

>  For C programs:
>  
> -- 
> 2.19.0.444.g18242da7ef-goog
> 


Re: [PATCH v3 4/4] wt-status.c: Set the committable flag in the collect phase.

2018-09-24 Thread Junio C Hamano
Stephen Smith  writes:

> I can see three solutions and could support any of the three:
> 1) Move the free calls to run_status() and cmd_status().
> 2) Move the calls calls to wt_status_print since that is the last function 
> from wt_status.c that is called befor the structure goes out of scope in  
> run_status() and cmd_status().
> 3) Add a new wt_collect*() function to free the variables. This would have an 
> advantage that the free calls could be grouped in on place and not done in to 
> functions.  A second advantage is that the free calls would be located where 
> the pointers are initialized.  
>
> Personally I like solutions 1 and 3 over 2.
> What do others think?

I think freeing at the top level caller (i.e. #1) once it finished
using the information collected would make the most sense---it
initiated the collection, then fed the collected info to shower, and
now it knows it is done with the pieces of memory it used to make
these two parts communicate with each other.

And for keeping multiple "pieces of memory" as a unit, introducing a
helper is a good technique (i.e. #3); but I view that mostly as an
implementation detail of #1.

Thanks.



Re: [PATCH v2 3/3] transport.c: introduce core.alternateRefsPrefixes

2018-09-24 Thread Jeff King
On Mon, Sep 24, 2018 at 04:50:22PM -0400, Jeff King wrote:

> On Mon, Sep 24, 2018 at 01:32:26PM -0700, Junio C Hamano wrote:
> 
> > Jeff King  writes:
> > 
> > > So I think it's conceptually consistent to always show a subset.
> > 
> > OK.  Then I agree with you that it is a good approach to first adopt
> > core.* knobs that universally apply, and add specialized ones as
> > they are needed later.
> 
> Thanks. There's one other major decision for this series, I think.
> 
> Do you have an opinion on whether for_each_alternate_refs() interface
> should stop passing back refnames? By the "they may not even exist"
> rationale in this sub-thread, I think it's probably foolish for any
> caller to actually depend on the names being meaningful.
> 
> We need to decide now because the idea of which data is relevant is
> getting baked into the documented alternateRefsCmd output format.

Just to sketch it out further, I was thinking that we'd do something
like this at the front of Taylor's series (with the rest rebased as
appropriate on top).

-- >8 --
Subject: [PATCH] transport: drop refnames from for_each_alternate_ref

None of the current callers use the refname parameter we pass to their
callbacks. In theory somebody _could_ do so, but it's actually quite
weird if you think about it: it's a ref in somebody else's repository.
So the name has no meaning locally, and in fact there may be duplicates
if there are multiple alternates.

The users of this interface really only care about seeing some ref tips,
since that promises that the alternate has the full commit graph
reachable from there. So let's keep the information we pass back to the
bare minimum.

Signed-off-by: Jeff King 
---
 builtin/receive-pack.c | 3 +--
 fetch-pack.c   | 3 +--
 transport.c| 6 +++---
 transport.h| 2 +-
 4 files changed, 6 insertions(+), 8 deletions(-)

diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index a3bb13af10..39993f2bcf 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -281,8 +281,7 @@ static int show_ref_cb(const char *path_full, const struct 
object_id *oid,
return 0;
 }
 
-static void show_one_alternate_ref(const char *refname,
-  const struct object_id *oid,
+static void show_one_alternate_ref(const struct object_id *oid,
   void *data)
 {
struct oidset *seen = data;
diff --git a/fetch-pack.c b/fetch-pack.c
index 75047a4b2a..b643de143b 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -76,8 +76,7 @@ struct alternate_object_cache {
size_t nr, alloc;
 };
 
-static void cache_one_alternate(const char *refname,
-   const struct object_id *oid,
+static void cache_one_alternate(const struct object_id *oid,
void *vcache)
 {
struct alternate_object_cache *cache = vcache;
diff --git a/transport.c b/transport.c
index 1c76d64aba..2e0bc414d0 100644
--- a/transport.c
+++ b/transport.c
@@ -1336,7 +1336,7 @@ static void read_alternate_refs(const char *path,
cmd.git_cmd = 1;
argv_array_pushf(, "--git-dir=%s", path);
argv_array_push(, "for-each-ref");
-   argv_array_push(, "--format=%(objectname) %(refname)");
+   argv_array_push(, "--format=%(objectname)");
cmd.env = local_repo_env;
cmd.out = -1;
 
@@ -1348,13 +1348,13 @@ static void read_alternate_refs(const char *path,
struct object_id oid;
 
if (get_oid_hex(line.buf, ) ||
-   line.buf[GIT_SHA1_HEXSZ] != ' ') {
+   line.buf[GIT_SHA1_HEXSZ]) {
warning(_("invalid line while parsing alternate refs: 
%s"),
line.buf);
break;
}
 
-   cb(line.buf + GIT_SHA1_HEXSZ + 1, , data);
+   cb(, data);
}
 
fclose(fh);
diff --git a/transport.h b/transport.h
index 01e717c29e..9baeca2d7a 100644
--- a/transport.h
+++ b/transport.h
@@ -261,6 +261,6 @@ int transport_refs_pushed(struct ref *ref);
 void transport_print_push_status(const char *dest, struct ref *refs,
  int verbose, int porcelain, unsigned int *reject_reasons);
 
-typedef void alternate_ref_fn(const char *refname, const struct object_id 
*oid, void *);
+typedef void alternate_ref_fn(const struct object_id *oid, void *);
 extern void for_each_alternate_ref(alternate_ref_fn, void *);
 #endif
-- 
2.19.0.764.g0a058409ab



Re: [PATCH v5 9/9] submodule: support reading .gitmodules when it's not in the working tree

2018-09-24 Thread Stefan Beller
On Mon, Sep 24, 2018 at 3:20 AM Antonio Ospite  wrote:

> > The third call, however, looks at the nested submodule at
> > 'submodule/sub', which doesn't contain a '.gitmodules' file.  So this
> > function goes on with the second condition and calls
> > get_oid(GITMODULES_INDEX, ), which then appears to find the blob
> > in the _superproject's_ index.
> >
>
> You are correct.
>
> This is a limitation of the object store in git, there is no equivalent
> of get_oid() to get the oid from a specific repository and this affects
> config_with_options too when the config source is a blob.

Not yet, as there is a big push to pass-through an object-store object
or similar recently and rely less on global variables.
I am not sure I get to this code, though.

> This does not affect commands called via "git -C submodule_dir cmd"
> because in that case the chdir happens before the_repository is set up,
> for instance "git-submodule $SOMETHING --recursive" commands seem to
> change the working directory before the recursion.

For this it may be worth looking into the option
   --super-prefix=
  Currently for internal use only. Set a prefix which gives a
  path from above a repository down to its root. One use is
  to give submodules context about the superproject that
  invoked it.

the whole motion of moving to in-process deprecates this clunky
API to pass around strings to subprocesses.

> The test suite passes even after removing repo_read_gitmodules()
> entirely from builtin/grep.c, but I am still not confident that I get
> all the implication of why that call was originally added in commit
> f9ee2fcdfa (grep: recurse in-process using 'struct repository',
> 2017-08-02).

If you checkout that commit and remove the call to repo_read_gitmodules
and then call git-grep in a superproject with nested submodules, you
get a segfault.

On master (and deleting out that line) you do not get the segfault,
I think praise goes to ff6f1f564c4 (submodule-config: lazy-load a
repository's .gitmodules file, 2017-08-03) which happened shortly
after f9ee2fcdfa.

It showcased that it worked by converting ls-files, but left out grep.

So I think based on ff6f1f564c4 it is safe to remove all calls to
repo_read_gitmodules.

> Anyways, even if we removed the call we would prevent the problem from
> happening in the test suite, but not in the real world, in case non-leaf
> submodules without .gitmodules in their working tree.

Quite frankly I think grep was just overlooked in review of
https://public-inbox.org/git/20170803182000.179328-14-bmw...@google.com/

Stefan


Re: [PATCH] git help: promote 'git help -av'

2018-09-24 Thread Junio C Hamano
Jeff King  writes:

> I agree that "help -av" is likely to be more friendly. I kind of wonder
> if it should just be the default for "-a". Do we have any obligation not
> to change the format of that output?

I know that at least older versions of git-completion.bash (I think
it was up to 2.17.x series, but do not quote me on that) have parsed
"git help -a" output to obtain the list of commands.  So such a
change would impact those minority users who keep stale completion
script in their $HOME/.bashrc without ever update it, thinking it is
good enough.

I personally find "help -av" a bit too loud to my taste than plain
"-a", and more importantly, I look at "help -a" primarily to check
the last section "avaialble from elsewhere on your $PATH" to find
things like "clang-format", which I do not think is available
anywhere in "help -av", so I do not think "-av" can be promoted as
an upward-compatible replacement in its current form.

I probably am not a part of the target audience, though.




[PATCH v4 0/1] Properly peel tags in can_all_from_reach_with_flags()

2018-09-24 Thread Derrick Stolee via GitGitGadget
As Peff reported [1], the refactored can_all_from_reach_with_flags() method
does not properly peel tags. Since the helper method can_all_from_reach()
and code in t/helper/test-reach.c all peel tags before getting to this
method, it is not super-simple to create a test that demonstrates this.

I modified t/helper/test-reach.c to allow calling
can_all_from_reach_with_flags() directly, and added a test in
t6600-test-reach.sh that demonstrates the segfault without the fix.

For V2, I compared the loop that inspects the 'from' commits in commit
ba3ca1edce "commit-reach: move can_all_from_reach_with_flags" to the version
here and got the following diff:

3c3
< if (from_one->flags & assign_flag)
---
> if (!from_one || from_one->flags & assign_flag)
5c5,7
< from_one = deref_tag(the_repository, from_one, "a from 
object", 0);
---
>
> from_one = deref_tag(the_repository, from_one,
>  "a from object", 0);
14a17,22
>
> list[nr_commits] = (struct commit *)from_one;
> if (parse_commit(list[nr_commits]) ||
> list[nr_commits]->generation < min_generation)
> return 0; /* is this a leak? */
> nr_commits++;

This diff includes the early termination we had before 'deref_tag' and the
comment for why we can ignore non-commit objects.

[1] 
https://public-inbox.org/git/0bf9103c-9377-506b-7ad7-e5273d8e9...@gmail.com/T/#u

Derrick Stolee (1):
  commit-reach: properly peel tags and clear flags

 commit-reach.c| 44 +--
 t/helper/test-reach.c | 22 +-
 t/t6600-test-reach.sh | 30 +++--
 3 files changed, 79 insertions(+), 17 deletions(-)


base-commit: 6621c838743812aaba96e55cfec8524ea1144c2d
Published-As: 
https://github.com/gitgitgadget/git/releases/tags/pr-39%2Fderrickstolee%2Ftag-fix-v4
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git 
pr-39/derrickstolee/tag-fix-v4
Pull-Request: https://github.com/gitgitgadget/git/pull/39

Range-diff vs v3:

 1:  0a1e661271 ! 1:  a0a3cf0134 commit-reach: properly peel tags
 @@ -1,6 +1,6 @@
  Author: Derrick Stolee 
  
 -commit-reach: properly peel tags
 +commit-reach: properly peel tags and clear flags
  
  The can_all_from_reach_with_flag() algorithm was refactored in 
4fbcca4e
  "commit-reach: make can_all_from_reach... linear" but incorrectly
 @@ -14,6 +14,19 @@
  Correct the issue by peeling tags when investigating the initial list
  of objects in the 'from' array.
  
 +The can_all_from_reach_with_flag() method uses 'assign_flag' as a
 +value we can use to mark objects temporarily during our commit walk.
 +The intent is that these flags are removed from all objects before
 +returning. However, this is not the case.
 +
 +The 'from' array could also contain objects that are not commits, and
 +we mark those objects with 'assign_flag'. Add a loop to the 'cleanup'
 +section that removes these markers.
 +
 +Also, we forgot to free() the memory for 'list', so add that to the
 +'cleanup' section. Also, use a cleaner mechanism for clearing those
 +flags.
 +
  Signed-off-by: Jeff King 
  Signed-off-by: Derrick Stolee 
  
 @@ -74,10 +87,18 @@
   
   cleanup:
  - for (i = 0; i < from->nr; i++) {
 -+ for (i = 0; i < nr_commits; i++) {
 -  clear_commit_marks(list[i], RESULT);
 -  clear_commit_marks(list[i], assign_flag);
 -  }
 +- clear_commit_marks(list[i], RESULT);
 +- clear_commit_marks(list[i], assign_flag);
 +- }
 ++ clear_commit_marks_many(nr_commits, list, RESULT | assign_flag);
 ++ free(list);
 ++
 ++ for (i = 0; i < from->nr; i++)
 ++ from->objects[i].item->flags &= ~assign_flag;
 ++
 +  return result;
 + }
 + 
  
  diff --git a/t/helper/test-reach.c b/t/helper/test-reach.c
  --- a/t/helper/test-reach.c
 2:  b2e0ee4978 < -:  -- commit-reach: fix memory and flag leaks

-- 
gitgitgadget


[PATCH v4 1/1] commit-reach: properly peel tags and clear flags

2018-09-24 Thread Derrick Stolee via GitGitGadget
From: Derrick Stolee 

The can_all_from_reach_with_flag() algorithm was refactored in 4fbcca4e
"commit-reach: make can_all_from_reach... linear" but incorrectly
assumed that all objects provided were commits. During a fetch
negotiation, ok_to_give_up() in upload-pack.c may provide unpeeled tags
to the 'from' array. The current code creates a segfault.

Add a direct call to can_all_from_reach_with_flag() in 'test-tool reach'
and add a test in t6600-test-reach.sh that demonstrates this segfault.

Correct the issue by peeling tags when investigating the initial list
of objects in the 'from' array.

The can_all_from_reach_with_flag() method uses 'assign_flag' as a
value we can use to mark objects temporarily during our commit walk.
The intent is that these flags are removed from all objects before
returning. However, this is not the case.

The 'from' array could also contain objects that are not commits, and
we mark those objects with 'assign_flag'. Add a loop to the 'cleanup'
section that removes these markers.

Also, we forgot to free() the memory for 'list', so add that to the
'cleanup' section. Also, use a cleaner mechanism for clearing those
flags.

Signed-off-by: Jeff King 
Signed-off-by: Derrick Stolee 
---
 commit-reach.c| 44 +--
 t/helper/test-reach.c | 22 +-
 t/t6600-test-reach.sh | 30 +++--
 3 files changed, 79 insertions(+), 17 deletions(-)

diff --git a/commit-reach.c b/commit-reach.c
index 86715c103c..db84f85986 100644
--- a/commit-reach.c
+++ b/commit-reach.c
@@ -544,20 +544,42 @@ int can_all_from_reach_with_flag(struct object_array 
*from,
 {
struct commit **list = NULL;
int i;
+   int nr_commits;
int result = 1;
 
ALLOC_ARRAY(list, from->nr);
+   nr_commits = 0;
for (i = 0; i < from->nr; i++) {
-   list[i] = (struct commit *)from->objects[i].item;
+   struct object *from_one = from->objects[i].item;
 
-   if (parse_commit(list[i]) ||
-   list[i]->generation < min_generation)
-   return 0;
+   if (!from_one || from_one->flags & assign_flag)
+   continue;
+
+   from_one = deref_tag(the_repository, from_one,
+"a from object", 0);
+   if (!from_one || from_one->type != OBJ_COMMIT) {
+   /* no way to tell if this is reachable by
+* looking at the ancestry chain alone, so
+* leave a note to ourselves not to worry about
+* this object anymore.
+*/
+   from->objects[i].item->flags |= assign_flag;
+   continue;
+   }
+
+   list[nr_commits] = (struct commit *)from_one;
+   if (parse_commit(list[nr_commits]) ||
+   list[nr_commits]->generation < min_generation) {
+   result = 0;
+   goto cleanup;
+   }
+
+   nr_commits++;
}
 
-   QSORT(list, from->nr, compare_commits_by_gen);
+   QSORT(list, nr_commits, compare_commits_by_gen);
 
-   for (i = 0; i < from->nr; i++) {
+   for (i = 0; i < nr_commits; i++) {
/* DFS from list[i] */
struct commit_list *stack = NULL;
 
@@ -600,10 +622,12 @@ int can_all_from_reach_with_flag(struct object_array 
*from,
}
 
 cleanup:
-   for (i = 0; i < from->nr; i++) {
-   clear_commit_marks(list[i], RESULT);
-   clear_commit_marks(list[i], assign_flag);
-   }
+   clear_commit_marks_many(nr_commits, list, RESULT | assign_flag);
+   free(list);
+
+   for (i = 0; i < from->nr; i++)
+   from->objects[i].item->flags &= ~assign_flag;
+
return result;
 }
 
diff --git a/t/helper/test-reach.c b/t/helper/test-reach.c
index eb21103998..08d2ea68e8 100644
--- a/t/helper/test-reach.c
+++ b/t/helper/test-reach.c
@@ -31,6 +31,7 @@ int cmd__reach(int ac, const char **av)
struct object_id oid_A, oid_B;
struct commit *A, *B;
struct commit_list *X, *Y;
+   struct object_array X_obj = OBJECT_ARRAY_INIT;
struct commit **X_array;
int X_nr, X_alloc;
struct strbuf buf = STRBUF_INIT;
@@ -49,7 +50,8 @@ int cmd__reach(int ac, const char **av)
 
while (strbuf_getline(, stdin) != EOF) {
struct object_id oid;
-   struct object *o;
+   struct object *orig;
+   struct object *peeled;
struct commit *c;
if (buf.len < 3)
continue;
@@ -57,14 +59,14 @@ int cmd__reach(int ac, const char **av)
if (get_oid_committish(buf.buf + 2, ))
die("failed to resolve %s", buf.buf + 2);
 
-   o = parse_object(r, );
- 

git clone works in ide but not git bash CLI

2018-09-24 Thread David Brown

Howdy, I have a conundrum:

App: Spring Cloud Config Server
envvars: GIT_URL and SSH_KEY
IDE: Intellij 2018.2.4 Ultimate

When I use the IDE to assign the SSH_KEY value all is copacetic.

If I assign the envvar at the Git Bash CLI:

com.jcraft.jsch.JSchException: Auth fail

Any guesses?

Thanks and regards,



Re: [PATCH v2 3/3] transport.c: introduce core.alternateRefsPrefixes

2018-09-24 Thread Jeff King
On Mon, Sep 24, 2018 at 01:32:26PM -0700, Junio C Hamano wrote:

> Jeff King  writes:
> 
> > So I think it's conceptually consistent to always show a subset.
> 
> OK.  Then I agree with you that it is a good approach to first adopt
> core.* knobs that universally apply, and add specialized ones as
> they are needed later.

Thanks. There's one other major decision for this series, I think.

Do you have an opinion on whether for_each_alternate_refs() interface
should stop passing back refnames? By the "they may not even exist"
rationale in this sub-thread, I think it's probably foolish for any
caller to actually depend on the names being meaningful.

We need to decide now because the idea of which data is relevant is
getting baked into the documented alternateRefsCmd output format.

-Peff


Re: [PATCH v2 3/3] transport.c: introduce core.alternateRefsPrefixes

2018-09-24 Thread Junio C Hamano
Jeff King  writes:

> So I think it's conceptually consistent to always show a subset.

OK.  Then I agree with you that it is a good approach to first adopt
core.* knobs that universally apply, and add specialized ones as
they are needed later.

Thanks.


Re: [PATCH] git help: promote 'git help -av'

2018-09-24 Thread Jeff King
On Mon, Sep 24, 2018 at 03:20:00PM -0500, Taylor Blau wrote:

> On Mon, Sep 24, 2018 at 02:19:28PM -0400, Jeff King wrote:
> > On Sat, Sep 22, 2018 at 07:47:07PM +0200, Nguyễn Thái Ngọc Duy wrote:
> >
> > > When you type "git help" (or just "git") you are greeted with a list
> > > with commonly used commands and their short description and are
> > > suggested to use "git help -a" or "git help -g" for more details.
> > >
> > > "git help -av" would be more friendly and inline with what is shown
> > > with "git help" since it shows list of commands with description as
> > > well, and commands are properly grouped.
> >
> > I agree that "help -av" is likely to be more friendly. I kind of wonder
> > if it should just be the default for "-a". Do we have any obligation not
> > to change the format of that output?
> 
> I agree, though I'd like to clarify what you said before doing so
> wholeheartedly.
> 
> Did you mean that all existing uses of 'git help -a' should instead mean
> 'git help -av' (i.e., that '-a' after your proposed patch means the same
> as '-av' in revisions prior to this one?)

Yes, exactly. I think the vast majority of uses would prefer the
categorized list.

The obvious exceptions are:

 - you can't remember the name of the command so an alphabetized list is
   easier for sifting through (without having to re-sift for each
   category).

 - you need a machine-readable version of the list (e.g., for
   programmable completion). We have "git --list-cmds", but we may need
   to advertise it better and mark it non-experimental.

I dunno. Maybe it is not worth the effort. Duy's existing patch is an
easy one liner. ;)

-Peff


Re: [PATCH] git help: promote 'git help -av'

2018-09-24 Thread Taylor Blau
On Mon, Sep 24, 2018 at 02:19:28PM -0400, Jeff King wrote:
> On Sat, Sep 22, 2018 at 07:47:07PM +0200, Nguyễn Thái Ngọc Duy wrote:
>
> > When you type "git help" (or just "git") you are greeted with a list
> > with commonly used commands and their short description and are
> > suggested to use "git help -a" or "git help -g" for more details.
> >
> > "git help -av" would be more friendly and inline with what is shown
> > with "git help" since it shows list of commands with description as
> > well, and commands are properly grouped.
>
> I agree that "help -av" is likely to be more friendly. I kind of wonder
> if it should just be the default for "-a". Do we have any obligation not
> to change the format of that output?

I agree, though I'd like to clarify what you said before doing so
wholeheartedly.

Did you mean that all existing uses of 'git help -a' should instead mean
'git help -av' (i.e., that '-a' after your proposed patch means the same
as '-av' in revisions prior to this one?)

If so, I agree. I can't imagine a case where I'd like to provide '-a'
and _not_ '-v', so certainly the later should come as a two-for-one
deal. Less, more well-intentioned knobs seems a positive trend to me, so
I am for this idea.

Thanks,
Taylor


Re: [PATCH] Add an EditorConfig file

2018-09-24 Thread Junio C Hamano
"brian m. carlson"  writes:

>> But that is a response in a dream-world.  If there is no such tool,
>> I am perfectly OK if the plan is to manually keep them (loosely) in
>> sync.  I do not think it is good use of our time to try to come up
>> with such a tool (unless somebody is really interested in doing it,
>> that is).
>
> Would it be helpful if I sent a script that ran during CI to ensure they
> stayed in sync for the couple places where they overlap?  I'm happy to
> do so if you think it would be useful.

It may even be an overkill.

A comment addressed to those who edit one file to look at the other
file on both files would be sufficient, I suspect.


Re: Git for games working group

2018-09-24 Thread Taylor Blau
On Mon, Sep 24, 2018 at 08:34:44AM -0700, John Austin wrote:
> Perhaps git-global-graph is a decent name. GGG? G3? :). The structure
> right now in my head looks a bit like:
>
> Global Graph:
>  client - post-commit git hooks to push changes up to the GG

I'm replying to this part of the email to note that this would cause Git
LFS to have to do some extra work, since running 'git lfs install'
already writes to .git/hooks/post-commit (ironically, to detect and
unlock locks that we should have released).

I'm not immediately sure about how we'd resolve this, though I suspect
it would look like either of:

  - Git LFS knows how to install or _append_ hooks to a given location,
should one already exist at that path on disk, or

  - git-global-graph knows how to accommodate Git LFS, and can include a
line that calls 'git-lfs-post-commit(1)', perhaps via:

  $ git global-graph install --git-lfs=$(which git-lfs)

or similar.

> For LFS, The main points of integration with I see are:
> -- bundling of packages (optionally install this package with a
> normal LFS installation)
> -- `git lfs locks` integration. ie. integration with the read-only
> control of LFS

Sounds sane to me.

> > we strictly avoid using CGo
>
> What's the main reason for this? Build system complexity?

A couple of reasons. CGO is widely considered to be (1) slow and (2)
unsafe. For our purposes, this would almost be OK, except that it makes
it impossible for me to build cross-platform binaries without the
correct compilers installed.

Today, I build Git LFS for every pair in {Windows, Darwin, Linux,
FreeBSD} x {386, amd64} by running 'make release', and using CGO would
not allow me to do that.

Transitioning from Go to CGO during each call is notoriously expensive,
and concedes many of the benefits that leads us to choose Go in the
first place. (Although now that I write much more C than Go, I don't
think I would make the same argument today ;-).)

Thanks,
Taylor


Re: "git rev-parse --show-superproject-working-tree" fails with "fatal: BUG: returned path string doesn't match cwd?" if supermodule has unmerged changes of the submodule reference

2018-09-24 Thread Stefan Beller
On Mon, Sep 24, 2018 at 10:59 AM Sam McKelvie  wrote:
>
> I experienced this problem using git 2.17.1; however, from inspection of the 
> next branch, function get_superproject_working_tree() in submodule.c has not 
> changed in 2 years.
>
> I believe the problem is related to the fact that when a merge of the 
> submodule reference is in progress, "git --stage —full-name 
> ” returns three seperate entries for the submodule 
> (one for each stage) rather than a single entry; e.g.,
>
> $ git ls-files --stage --full-name submodule-child-test
> 16 dbbd2766fa330fa741ea59bb38689fcc2d283ac5 1   submodule-child-test
> 16 f174d1dbfe863a59692c3bdae730a36f2a788c51 2   submodule-child-test
> 16 e6178f3a58b958543952e12824aa2106d560f21d 3   submodule-child-test
>
> The code in get_superproject_working_tree() uses the “-z” option on ls-files, 
> so it expects null-byte termination between entries. However, the computation 
> of super_sub_len:
>
> super_sub_len = sb.buf + sb.len - super_sub - 1;
>
> will only work when there is exactly one entry returned. If this line is 
> changed to:
>
> super_sub_len = strlen(super_sub);
>
> then only the first returned entry is used, and the bug is resolved.
>
> strlen() should be safe to use here because strbuf_read ensures the result 
> buffer is null-terminated.

This is good analysis of the issue. Thanks for writing it up!
Would you also mind to send a patch fixing the problem?

I agree that using strlen should work. I do not recall why I
did not use it at the time of writing it.

Thanks,
Stefan


Re: [PATCH v3 2/2] commit-reach: fix memory and flag leaks

2018-09-24 Thread Jeff King
On Mon, Sep 24, 2018 at 01:25:12PM -0400, Derrick Stolee wrote:

> On 9/21/2018 7:58 PM, Jeff King wrote:
> > On Fri, Sep 21, 2018 at 08:05:27AM -0700, Derrick Stolee via GitGitGadget 
> > wrote:
> > 
> > > From: Derrick Stolee 
> > > 
> > > The can_all_from_reach_with_flag() method uses 'assign_flag' as a
> > > value we can use to mark objects temporarily during our commit walk.
> > > The intent is that these flags are removed from all objects before
> > > returning. However, this is not the case.
> > > 
> > > The 'from' array could also contain objects that are not commits, and
> > > we mark those objects with 'assign_flag'. Add a loop to the 'cleanup'
> > > section that removes these markers.
> > > 
> > > Also, we forgot to free() the memory for 'list', so add that to the
> > > 'cleanup' section.
> > Urgh, ignore most of my response to patch 1, then. I saw there was a
> > patch 2, but thought it was just handling the free().
> > 
> > The flag-clearing here makes perfect sense.
> 
> In my local branch, I ended up squashing this commit into the previous,
> because I discovered clear_commit_marks_many() making the cleanup section
> have this diff:
> 
> @@ -600,10 +622,12 @@ int can_all_from_reach_with_flag(struct object_array
> *from,
>     }
> 
>  cleanup:
> -   for (i = 0; i < from->nr; i++) {
> -   clear_commit_marks(list[i], RESULT);
> -   clear_commit_marks(list[i], assign_flag);
> -   }
> +   clear_commit_marks_many(nr_commits, list, RESULT | assign_flag);
> +   free(list);
> +
> +   for (i = 0; i < from->nr; i++)
> +   from->objects[i].item->flags &= ~assign_flag;
> +
>     return result;
>  }
> 
> With the bigger change in the larger set, there is less reason to do
> separate commits. (Plus, it confused you during review.)

Yeah, that looks better still. Thanks!

-Peff


Re: [PATCH 1/3] t7001: reformat to newer style

2018-09-24 Thread Stefan Beller
On Mon, Sep 24, 2018 at 6:31 AM Derrick Stolee  wrote:
>
> On 9/21/2018 7:58 PM, Stefan Beller wrote:
> > The old formatting style is a real hindrance of getting people up to speed
> > contributing as they use existing code as an example and follow that style.
> > So let's get rid of the old style and reformat it in our current style.
> I was suspicious of your automated approach catching everything, so I
> looked carefully at this patch. There are still a lot of things
> happening that we would not recommend doing in new tests.

Heh, thanks for calling that out. So we're looking at a full formatter
instead of a partial formatter that helps moving in the right direction now. :-/

I would prefer to use automation as much as possible for these tasks
to keep it easy to apply it at scale once a file is not touched by
master..pu any more.

When applying styles manually, there is sometimes a judgement call,
which would like to the inevitable bikeshedding that I'd prefer to avoid.

> > +test_expect_success 'moving the file out of subdirectory' '
> > + cd path0 && git mv COPYING ../path1/COPYING
> > +'
> Perhaps split this line on the &&?

In real modern testing, this could also be

git -C path0 mv ...

which would also fix the cd.. below and not needing
a subshell there either (using -C again).

Looking at this from a higher level, nowadays I would write
tests that have more lines in them, instead of having
one git command per test.

> > +test_expect_success 'moving to existing tracked target with trailing 
> > slash' '
> > + mkdir path2 &&
> > + >path2/file && git add path2/file &&
> This line in particular looks a bit strange. What is this doing? At
> least we should split the &&.

Yes.

> > +test_expect_success 'do not move directory over existing directory' '
> > + mkdir path0 && mkdir path0/path2 && test_must_fail git mv path2 path0
> > +'
>
> Split this line.

Okay, I'll go manually over these tests to adapt to new style.

Thanks for looking over the patch!
Stefan


Re: [PATCH] git help: promote 'git help -av'

2018-09-24 Thread Jeff King
On Sat, Sep 22, 2018 at 07:47:07PM +0200, Nguyễn Thái Ngọc Duy wrote:

> When you type "git help" (or just "git") you are greeted with a list
> with commonly used commands and their short description and are
> suggested to use "git help -a" or "git help -g" for more details.
> 
> "git help -av" would be more friendly and inline with what is shown
> with "git help" since it shows list of commands with description as
> well, and commands are properly grouped.

I agree that "help -av" is likely to be more friendly. I kind of wonder
if it should just be the default for "-a". Do we have any obligation not
to change the format of that output?

Assuming we want to keep "-a" and "-av" as-is, your patch seems quite
sensible to me.

-Peff


Re: [PATCH] ref-filter: don't look for objects when outside of a repository

2018-09-24 Thread Jeff King
On Sat, Sep 22, 2018 at 04:11:45PM +0200, SZEDER Gábor wrote:

> The command 'git ls-remote --sort=authordate ' segfaults when
> run outside of a repository, ever since the introduction of its
> '--sort' option in 1fb20dfd8e (ls-remote: create '--sort' option,
> 2018-04-09).
> 
> While in general the 'git ls-remote' command can be run outside of a
> repository just fine, its '--sort=' option with certain keys does
> require access to the referenced objects.  This sorting is implemented
> using the generic ref-filter sorting facility, which already handles
> missing objects gracefully with the appropriate 'missing object
> deadbeef for HEAD' message.  However, being generic means that it
> checks replace refs while trying to retrieve an object, and while
> doing so it accesses the 'git_replace_ref_base' variable, which has
> not been initialized and is still a NULL pointer when outside of a
> repository, thus causing the segfault.
> 
> Make ref-filter more careful and only attempt to retrieve an object
> when we are in a repository.  Also add a test to ensure that 'git
> ls-remote --sort' fails gracefully when executed outside of a
> repository.

This all makes sense, and I think your fix is going in the right
direction.

But...

> I'm not quite sure that this is the best place to add this check...
> but hey, it's a Saturday afternoon after all ;)

I also wonder about this. For refs, we already catch these cases at a
low-level and BUG(). That's better than a segfault, and I suspect we
should be doing the same here in oid_object_info_extended(). But that
just shifts the segfault to a BUG().

For the refs code, we've generally tried to catch things at a high-level
and report a more human-friendly error explaining the situation. So
doing the same thing here would mean adding code to ls-remote. But I
think the plumbing gets pretty tricky, since it has no way to ask
ref-filter "hey, are we doing to need to look at objects?".

That's a thing that I think ref-filter _should_ support (it knows it
after having parsed the format string). But it probably ought to come
along with other refactoring, and shouldn't hold up this fix.

So this probably _is_ a reasonable place to check it. However...

> diff --git a/ref-filter.c b/ref-filter.c
> index e1bcb4ca8a..3555bc29e7 100644
> --- a/ref-filter.c
> +++ b/ref-filter.c
> @@ -1473,7 +1473,8 @@ static int get_object(struct ref_array_item *ref, int 
> deref, struct object **obj
>   oi->info.sizep = >size;
>   oi->info.typep = >type;
>   }
> - if (oid_object_info_extended(the_repository, >oid, >info,
> + if (!have_git_dir() ||
> + oid_object_info_extended(the_repository, >oid, >info,
>OBJECT_INFO_LOOKUP_REPLACE))
>   return strbuf_addf_ret(err, -1, _("missing object %s for %s"),
>  oid_to_hex(>oid), ref->refname);

Would we perhaps want to give the user a hint that the object is not
really missing, but rather that we're not in a repository? E.g.,
something like:

  if (!have_git_dir())
return strbuf_addf_ret(err, -1, "format specifier requires a 
repository");
  if (oid_object_info_extended(...))
return ...;

?

-Peff


Re: [PATCH v2 3/3] transport.c: introduce core.alternateRefsPrefixes

2018-09-24 Thread Jeff King
On Mon, Sep 24, 2018 at 08:17:14AM -0700, Junio C Hamano wrote:

> Jeff King  writes:
> 
> > I was suggesting that check_everything_connected() is not strictly
> > transport-related, so would be inappropriate for transport.*, and we'd
> > need a more generic name. And my "either way" was that I could see
> > an argument that it _is_ transport related, since we only call it now
> > when receiving a pack. But that doesn't have to be the case, and
> > certainly implementing it with "rev-list --alternate-refs" muddies that
> > considerably.
> 
> Even after 7043c707 ("check_everything_connected: use a struct with
> named options", 2016-07-15) unified many into check_connected(),
> there still are different reasons why we call to find out about the
> connectivity, and I doubt we can afford to have a single knob that
> is shared both for transport and other kind of connectivity checks
> (like fsck or repack).  Do we want to be affected by "we pretend
> that these are the only refs exported from that alternate object
> store" when repacking and pruning only local objects and keep us
> rely on the alternate, for example?

Actually, yes, I think there is value in a single knob. At least that's
what I'd want for our (GitHub's) use case.

Remember that these alternate refs might not exist at all (the
alternates mechanism can work with just a bare "objects" directory,
unconnected from a real git repo). So I think anything using them has to
view it as a "best effort" optimization: we might or might not know
about some ref tips that might or might not cover the whole set of
objects in the alternate. They're the things we _guarantee_ that the
alternate has full connectivity for, and it might have more.

So I think it's conceptually consistent to always show a subset. I did
qualify with "for our use case" because some people might be primarily
concerned with the bandwidth of sending .haves across the network.
Whereas at our scale, even enumerating them at all is prohibitively
expensive.

One thing we could do is add a "core" config now (whether it's in core.*
or wherever). And then if later somebody wants receive-pack to behave
differently, we have an out: we can add transfer.alternateRefsCommand or
even receive.alternateRefsCommand that take precedence in those
situations.

Of course we could add the more restricted ones now, and add the "core"
one later as new uses grow. But that's more work now, since we'd have to
plumb through that context to the for_each_alternate_ref() interface.
I'd rather punt on that work until later (because I suspect that "later"
will never actually come).

> In any case it is good that these configuration variables are
> defined on _our_ side, not in the alternate---it means that we do
> not have to worry about the case where the alternateRefsCommand lies
> and tells us that an object that the alternate does not actually
> have exists at a tip of a ref in an attempt to confuse us, etc.

Yes. It also makes it easy to use "git -c" to override the scheme if you
want to (as opposed to mucking with on-disk files in the alternate).

-Peff


"git rev-parse --show-superproject-working-tree" fails with "fatal: BUG: returned path string doesn't match cwd?" if supermodule has unmerged changes of the submodule reference

2018-09-24 Thread Sam McKelvie
I experienced this problem using git 2.17.1; however, from inspection of the 
next branch, function get_superproject_working_tree() in submodule.c has not 
changed in 2 years.

I believe the problem is related to the fact that when a merge of the submodule 
reference is in progress, "git --stage —full-name ” 
returns three seperate entries for the submodule (one for each stage) rather 
than a single entry; e.g.,

$ git ls-files --stage --full-name submodule-child-test
16 dbbd2766fa330fa741ea59bb38689fcc2d283ac5 1   submodule-child-test
16 f174d1dbfe863a59692c3bdae730a36f2a788c51 2   submodule-child-test
16 e6178f3a58b958543952e12824aa2106d560f21d 3   submodule-child-test

The code in get_superproject_working_tree() uses the “-z” option on ls-files, 
so it expects null-byte termination between entries. However, the computation 
of super_sub_len:

super_sub_len = sb.buf + sb.len - super_sub - 1;

will only work when there is exactly one entry returned. If this line is 
changed to:

super_sub_len = strlen(super_sub);

then only the first returned entry is used, and the bug is resolved.

strlen() should be safe to use here because strbuf_read ensures the result 
buffer is null-terminated.

Thanks,
Sam McKelvie




[PATCH v2] mailmap: consistently normalize brian m. carlson's name

2018-09-24 Thread Jonathan Nieder
v2.18.0-rc0~70^2 (mailmap: update brian m. carlson's email address,
2018-05-08) changed the mailmap to map

  sand...@crustytoothpaste.ath.cx
   -> brian m. carlson 

instead of

  sand...@crustytoothpaste.net
-> brian m. carlson 

That means the mapping

  Brian M. Carlson 
-> brian m. carlson 

is redundant, so we can remove it.  More importantly, it means that
the identity "Brian M. Carlson " used in
some commits is not normalized any more.  Add a mapping for it.

Noticed while updating Debian's Git packaging, which uses "git
shortlog --no-merges" to produce a list of changes in each version,
grouped by author's (normalized) name.

Signed-off-by: Jonathan Nieder 
---
Hi,

brian m. carlson wrote:

> I think this commit message makes sense.  I apparently still fail to
> understand how the .mailmap format works, so I can't tell you if the
> patch is correct.

Thanks for looking it over.  What would it take to make the patch make
sense, too? ;-)

Most mailmap entries are of the form

Some Name 

which means "Wherever you see the email address someem...@example.com,
canonicalize the author's name to Some Name".  We can use that:

brian m. carlson 

When we see sand...@crustytoothpaste.ath.cx, we also want to
canonicalize the email address.  For that, we can do

brian m. carlson  


There's only one person who has used these email addresses, so we
don't have to do matching by name.  If we wanted to tighten the name
normalization to match by name, I think we'd do something like

brian m. carlson  Brian M. Carlson

but I can't get that to seem to have any effect when I test with the
"git check-mailmap" command --- for example, "git check-mailmap 'Dana
How '" does not map and "git check-mailmap
'Random Name '" maps to 'Dana L. How
'.

The even tighter matching used in v1

brian m. carlson  Brian M. Carlson 


does work, but it's unnecessary complexity.  We don't need it.

How about this?

Changes since v1:
- loosened the matching to only look at email and ignore name
- no other changes

 .mailmap | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/.mailmap b/.mailmap
index f165222a78..bef3352b0d 100644
--- a/.mailmap
+++ b/.mailmap
@@ -25,7 +25,7 @@ Ben Walton  
 Benoit Sigoure  
 Bernt Hansen  
 Brandon Casey  
-brian m. carlson  Brian M. Carlson 

+brian m. carlson 
 brian m. carlson  

 Bryan Larsen  
 Bryan Larsen  
-- 
2.19.0.444.g18242da7ef



I want you to Distribute my funds to less priviledge if interested reply now

2018-09-24 Thread Shill Sheila Johnson


Re: [PATCH v3 2/2] commit-reach: fix memory and flag leaks

2018-09-24 Thread Derrick Stolee

On 9/21/2018 7:58 PM, Jeff King wrote:

On Fri, Sep 21, 2018 at 08:05:27AM -0700, Derrick Stolee via GitGitGadget wrote:


From: Derrick Stolee 

The can_all_from_reach_with_flag() method uses 'assign_flag' as a
value we can use to mark objects temporarily during our commit walk.
The intent is that these flags are removed from all objects before
returning. However, this is not the case.

The 'from' array could also contain objects that are not commits, and
we mark those objects with 'assign_flag'. Add a loop to the 'cleanup'
section that removes these markers.

Also, we forgot to free() the memory for 'list', so add that to the
'cleanup' section.

Urgh, ignore most of my response to patch 1, then. I saw there was a
patch 2, but thought it was just handling the free().

The flag-clearing here makes perfect sense.


In my local branch, I ended up squashing this commit into the previous, 
because I discovered clear_commit_marks_many() making the cleanup 
section have this diff:


@@ -600,10 +622,12 @@ int can_all_from_reach_with_flag(struct 
object_array *from,

    }

 cleanup:
-   for (i = 0; i < from->nr; i++) {
-   clear_commit_marks(list[i], RESULT);
-   clear_commit_marks(list[i], assign_flag);
-   }
+   clear_commit_marks_many(nr_commits, list, RESULT | assign_flag);
+   free(list);
+
+   for (i = 0; i < from->nr; i++)
+   from->objects[i].item->flags &= ~assign_flag;
+
    return result;
 }

With the bigger change in the larger set, there is less reason to do 
separate commits. (Plus, it confused you during review.)


Thanks,

-Stolee




Re: Import/Export as a fast way to purge files from Git?

2018-09-24 Thread Elijah Newren
On Sun, Sep 23, 2018 at 6:08 AM Lars Schneider  wrote:
>
> Hi,
>
> I recently had to purge files from large Git repos (many files, many commits).
> The usual recommendation is to use `git filter-branch --index-filter` to purge
> files. However, this is *very* slow for large repos (e.g. it takes 45min to
> remove the `builtin` directory from git core). I realized that I can remove
> files *way* faster by exporting the repo, removing the file references,
> and then importing the repo (see Perl script below, it takes ~30sec to remove
> the `builtin` directory from git core). Do you see any problem with this
> approach?

It looks like others have pointed you at other tools, and you're
already shifting to that route.  But I think it's a useful question to
answer more generally, so for those that are really curious...


The basic approach is fine, though if you try to extend it much you
can run into a few possible edge/corner cases (more on that below).
I've been using this basic approach for years and even created a
mini-python library[1] designed specifically to allow people to create
"fast-filters", used as
   git fast-export  | your-fast-filter | git fast-import 

But that library didn't really take off; even I have rarely used it,
often opting for filter-branch despite its horrible performance or a
simple fast-export | long-sed-command | fast-import (with some extra
pre-checking to make sure the sed wouldn't unintentionally munge other
data).  BFG is great, as long as you're only interested in removing a
few big items, but otherwise doesn't seem very useful (to be fair,
it's very upfront about only wanting to solve that problem).
Recently, due to continuing questions on filter-branch and folks still
getting confused with it, I looked at existing tools, decided I didn't
think any quite fit, and started looking into converting
git_fast_filter into a filter-branch-like tool instead of just a
libary.  Found some bugs and missing features in fast-export along the
way (and have some patches I still need to send in).  But I kind of
got stuck -- if the tool is in python, will that limit adoption too
much?  It'd be kind of nice to have this tool in core git.  But I kind
of like leaving open the possibility of using it as a tool _or_ as a
library, the latter for the special cases where case-specific
programmatic filtering is needed.  But a developer-convenience library
makes almost no sense unless in a higher level language, such as
python.  I'm still trying to make up my mind about what I want (and
what others might want), and have been kind of blocking on that.  (If
others have opinions, I'm all ears.)


Anyway, the edge/corner cases you can watch out for:

  - Signed tags are a problem; you may need to specify
--signed-tags=strip to fast-export

  - References to other commits in your commit messages will now be
incorrect.  I think a good tool should either default to rewriting
commit ids in commit messages or at least have an option to do so
(BFG does this; filter-branch doesn't; fast-export format makes it
really hard for a filter based on it to do so)

  - If the paths you remove are the only paths modified in a commit,
the commit can become empty.  If you're only filtering a few paths
out, this might be nothing more than a minor inconvenience for you.
However, if you're trying to prune directories (and perhaps several
toplevel ones), then it can be extremely annoying to have a new
history with the vast majority of all commits being empty.
(filter-branch has an option for this; BFG does not; tools based on
fast-export output can do it with sufficient effort).

  - If you start pruning empty commits, you have to worry about
rewriting branches and tags to remaining parents.  This _might_ happen
for free depending on your history's structure and the fast-export
stream, but to be correct in general you will have to specify the new
commit for some branches or tags.

  - If you start pruning empty commits, you have to decide whether to
allow pruning of merge commits.  Your first reaction might be to not
allow it, but if one parent and its entire history are all pruned,
then transforming the merge commit to a normal commit and then
considering whether it is empty and allowing it to be pruned is much
better.

  - If you start pruning empty commits, you also have to worry about
history topology changing, beyond the all-ancestors-empty case above.
For example, the last non-empty commit in the ancestry of a merge on
both sides may be the same commit, making the merge-commit have the
same parent twice.  Should the duplicate parent be de-duped,
transforming the commit into a normal non-merge commit?  (I'd say yes
-- this commit is likely to be empty and prunable once you do so, but
I'm not sure everyone would agree with converting a merge commit to a
non-merge.)  Similarly, what if the rewritten parents of a merge have
one parent that is the direct ancestor of another?  Can the extra
unnecessary parent be removed as a 

Re: [ANNOUNCE] Git v2.8.3

2018-09-24 Thread adam22
 Thank you for sharing!  ip19216811.site   



--
Sent from: http://git.661346.n2.nabble.com/


Re: [PATCH] ref-filter: don't look for objects when outside of a repository

2018-09-24 Thread Junio C Hamano
SZEDER Gábor  writes:

> The command 'git ls-remote --sort=authordate ' segfaults when
> run outside of a repository, ever since the introduction of its
> '--sort' option in 1fb20dfd8e (ls-remote: create '--sort' option,
> 2018-04-09).
>
> While in general the 'git ls-remote' command can be run outside of a
> repository just fine, its '--sort=' option with certain keys does
> require access to the referenced objects.  This sorting is implemented
> using the generic ref-filter sorting facility, which already handles
> missing objects gracefully with the appropriate 'missing object
> deadbeef for HEAD' message.  However, being generic means that it
> checks replace refs while trying to retrieve an object, and while
> doing so it accesses the 'git_replace_ref_base' variable, which has
> not been initialized and is still a NULL pointer when outside of a
> repository, thus causing the segfault.
>
> Make ref-filter more careful and only attempt to retrieve an object
> when we are in a repository.  Also add a test to ensure that 'git
> ls-remote --sort' fails gracefully when executed outside of a
> repository.

OK.  So by forcing get_object() return an error, we do the same to
populate_value() which in turn will make get_ref_atgom_value return
an error and cmp_ref_sorting() will notice and die.

I think that is the best we could do.

>
> Reported-by: H.Merijn Brand 
> Signed-off-by: SZEDER Gábor 
> ---
>
> I'm not quite sure that this is the best place to add this check...
> but hey, it's a Saturday afternoon after all ;)
>
>  ref-filter.c | 3 ++-
>  t/t5512-ls-remote.sh | 6 ++
>  2 files changed, 8 insertions(+), 1 deletion(-)
>
> diff --git a/ref-filter.c b/ref-filter.c
> index e1bcb4ca8a..3555bc29e7 100644
> --- a/ref-filter.c
> +++ b/ref-filter.c
> @@ -1473,7 +1473,8 @@ static int get_object(struct ref_array_item *ref, int 
> deref, struct object **obj
>   oi->info.sizep = >size;
>   oi->info.typep = >type;
>   }
> - if (oid_object_info_extended(the_repository, >oid, >info,
> + if (!have_git_dir() ||
> + oid_object_info_extended(the_repository, >oid, >info,
>OBJECT_INFO_LOOKUP_REPLACE))
>   return strbuf_addf_ret(err, -1, _("missing object %s for %s"),
>  oid_to_hex(>oid), ref->refname);
> diff --git a/t/t5512-ls-remote.sh b/t/t5512-ls-remote.sh
> index bc5703ff9b..7dd081da01 100755
> --- a/t/t5512-ls-remote.sh
> +++ b/t/t5512-ls-remote.sh
> @@ -302,4 +302,10 @@ test_expect_success 'ls-remote works outside repository' 
> '
>   nongit git ls-remote dst.git
>  '
>  
> +test_expect_success 'ls-remote --sort fails gracefully outside repository' '
> + # Use a sort key that requires access to the referenced objects.
> + nongit test_must_fail git ls-remote --sort=authordate 
> "$TRASH_DIRECTORY" 2>err &&
> + test_i18ngrep "^fatal: missing object" err
> +'
> +
>  test_done


Re: [PATCH 1/3] t7001: reformat to newer style

2018-09-24 Thread Junio C Hamano
Derrick Stolee  writes:

>> +test_expect_success 'moving the file back into subdirectory' '
>> +cd path0 && git mv ../path1/COPYING COPYING
>> +'
>
> Split at &&, use subshell?

Yes, I was almost going to point out the same, saying "'reformat to
newer style' is much larger than only changing how the test body is
formatted", but was debating myself, as a good "modernization patch"
needs both mechanical changes and manual/semantic clean-ups, and it
is very clear that these three patches deliberately limit themselves
to the former for easier verification.

It is relatively rare that files are not touched by any in-flight
topic in the codebase, which is a good opportunity to apply this
kind of wholesale clean-up, so I tend to agree that it is a shame
not to do the non-mechanical clean up in the same series.  Perhaps
the best way would be to keep these three mechanical steps as they
are, and then follow-up with non-mechanical clean-up like you
suggested.

>> +test_expect_success 'commiting the change' '
>> +cd .. && git commit -m move-in -a
>> +'
>
> Drop "cd .." (and the comments about being in path0)

... when the previous step moves to "git -C path0 mv ..." or
preferrably "(cd path0 && git mv ...)".


> [big snip]
>
>> +test_expect_success 'moving to existing tracked target with trailing slash' 
>> '
>> +mkdir path2 &&
>> +>path2/file && git add path2/file &&
> This line in particular looks a bit strange. What is this doing? At
> least we should split the &&.

Create an empty file by redirecting the output from a no-op into it,
and then adding the result.  I agree with you that this would be
easier to read on two lines.

>> +git mv path1/path0/ path2/ &&
>> +test_path_is_dir path2/path0/
>> +'


Re: [PATCH] receive-pack: update comment with check_everything_connected

2018-09-24 Thread Junio C Hamano
Jeff King  writes:

> That function is now called "check_connected()", but we forgot to update
> this comment in 7043c7071c (check_everything_connected: use a struct
> with named options, 2016-07-15).
>
> Signed-off-by: Jeff King 
> ---
> Just a minor annoyance I happened to notice while discussing in another
> thread. I notice both of us still called it check-everything-connected
> in our emails; old habits die hard, I suppose. ;)

Yup, and now I think I caught up ;-)  Thanks.

>
>  builtin/receive-pack.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
> index a3bb13af10..3b7432c8e4 100644
> --- a/builtin/receive-pack.c
> +++ b/builtin/receive-pack.c
> @@ -1834,7 +1834,7 @@ static void prepare_shallow_update(struct command 
> *commands,
>   /*
>* keep hooks happy by forcing a temporary shallow file via
>* env variable because we can't add --shallow-file to every
> -  * command. check_everything_connected() will be done with
> +  * command. check_connected() will be done with
>* true .git/shallow though.
>*/
>   setenv(GIT_SHALLOW_FILE_ENVIRONMENT, alt_shallow_file, 1);


[PATCH] fetch-pack: approximate no_dependents with filter

2018-09-24 Thread Jonathan Tan
Whenever a lazy fetch is performed for a tree object, any trees and
blobs it directly or indirectly references will be fetched as well.
There is a "no_dependents" argument in struct fetch_pack_args that
indicates that objects that the wanted object references need not be
sent, but it currently has no effect other than to inhibit usage of
object flags.

Extend the "no_dependents" argument to also exclude sending of objects
as much as the current protocol allows: when fetching a tree, all trees
it references will be sent (but not the blobs), and when fetching a
blob, it will still be sent. (If this mechanism is used to fetch a
commit or any other non-blob object, all referenced objects, except
blobs, will be sent.) The client neither needs to know or specify the
type of each object it wants.

The necessary code change is done in fetch_pack() instead of somewhere
closer to where the "filter" instruction is written to the wire so that
only one part of the code needs to be changed in order for users of all
protocol versions to benefit from this optimization.

Signed-off-by: Jonathan Tan 
---
This was prompted by a user at $DAY_JOB who had a partial clone
excluding trees, and had a workflow that only required tree objects (and
not blobs).

This will hopefully make partial clones excluding trees (with the
"tree:0" filter) a bit better, in that if an operation requires only
trees to be inspected, the required download is much smaller.
---
 fetch-pack.c | 14 ++
 fetch-pack.h |  7 +++
 t/t0410-partial-clone.sh | 41 
 3 files changed, 62 insertions(+)

diff --git a/fetch-pack.c b/fetch-pack.c
index 88a078e9b..c25b0f54c 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -1598,6 +1598,20 @@ struct ref *fetch_pack(struct fetch_pack_args *args,
if (nr_sought)
nr_sought = remove_duplicates_in_refs(sought, nr_sought);
 
+   if (args->no_dependents && !args->filter_options.choice) {
+   /*
+* The protocol does not support requesting that only the
+* wanted objects be sent, so approximate this by setting a
+* "blob:none" filter if no filter is already set. This works
+* for all object types: note that wanted blobs will still be
+* sent because they are directly specified as a "want".
+*
+* NEEDSWORK: Add an option in the protocol to request that
+* only the wanted objects be sent, and implement it.
+*/
+   parse_list_objects_filter(>filter_options, "blob:none");
+   }
+
if (!ref) {
packet_flush(fd[1]);
die(_("no matching remote head"));
diff --git a/fetch-pack.h b/fetch-pack.h
index 5b6e86880..43ec344d9 100644
--- a/fetch-pack.h
+++ b/fetch-pack.h
@@ -43,6 +43,13 @@ struct fetch_pack_args {
unsigned from_promisor:1;
 
/*
+* Attempt to fetch only the wanted objects, and not any objects
+* referred to by them. Due to protocol limitations, extraneous
+* objects may still be included. (When fetching non-blob
+* objects, only blobs are excluded; when fetching a blob, the
+* blob itself will still be sent. The client does not need to
+* know whether a wanted object is a blob or not.)
+*
 * If 1, fetch_pack() will also not modify any object flags.
 * This allows fetch_pack() to safely be called by any function,
 * regardless of which object flags it uses (if any).
diff --git a/t/t0410-partial-clone.sh b/t/t0410-partial-clone.sh
index 128130066..08a0c3651 100755
--- a/t/t0410-partial-clone.sh
+++ b/t/t0410-partial-clone.sh
@@ -170,6 +170,47 @@ test_expect_success 'fetching of missing objects' '
git verify-pack --verbose "$IDX" | grep "$HASH"
 '
 
+test_expect_success 'fetching of missing blobs works' '
+   rm -rf server repo &&
+   test_create_repo server &&
+   test_commit -C server foo &&
+   git -C server repack -a -d --write-bitmap-index &&
+
+   git clone "file://$(pwd)/server" repo &&
+   git hash-object repo/foo.t >blobhash &&
+   rm -rf repo/.git/objects/* &&
+
+   git -C server config uploadpack.allowanysha1inwant 1 &&
+   git -C server config uploadpack.allowfilter 1 &&
+   git -C repo config core.repositoryformatversion 1 &&
+   git -C repo config extensions.partialclone "origin" &&
+
+   git -C repo cat-file -p $(cat blobhash)
+'
+
+test_expect_success 'fetching of missing trees does not fetch blobs' '
+   rm -rf server repo &&
+   test_create_repo server &&
+   test_commit -C server foo &&
+   git -C server repack -a -d --write-bitmap-index &&
+
+   git clone "file://$(pwd)/server" repo &&
+   git -C repo rev-parse foo^{tree} >treehash &&
+   git hash-object repo/foo.t >blobhash &&
+   rm -rf repo/.git/objects/* &&
+
+   

Re: Git for games working group

2018-09-24 Thread John Austin
Perhaps git-global-graph is a decent name. GGG? G3? :). The structure
right now in my head looks a bit like:

Global Graph:
 client - post-commit git hooks to push changes up to the GG
 git server - just the standard git server configuration
 query server - replies with information about the current state of the GG

Locks Pre-Commit:
 client - pre-commit hook that makes requests to the GG query server

For cross-platform compatibility, the Global Graph client and the
Locks/Conflicts client are the pieces that need to be use-able on all
platforms. My goal is to keep these pieces as simple as possible. I'd
like to at least start prototyping these in Rust, hopefully in a way
that can either be easily ported or easily re-implemented in C later
on, once things are feature-frozen.

For LFS, The main points of integration with I see are:
-- bundling of packages (optionally install this package with a
normal LFS installation)
-- `git lfs locks` integration. ie. integration with the read-only
control of LFS

If we push more of the functionality into the gg query server, the
integration with `lfs locks` could be simple enough to be a couple of
web requests. That might help avoid integration issues.

> we strictly avoid using CGo
What's the main reason for this? Build system complexity?
On Mon, Sep 24, 2018 at 7:37 AM Taylor Blau  wrote:
>
> On Sun, Sep 23, 2018 at 12:53:58PM -0700, John Austin wrote:
> > On Sun, Sep 23, 2018 at 10:57 AM Randall S. Becker
> >  wrote:
> > >  I would even like to help with your effort and have non-unixy platforms 
> > > I'd like to do this on.
> > > Having this separate from git LFS is an even better idea IMO, and I would 
> > > suggest implementing this using the same set of build tools that git uses 
> > > so that it is broadly portable, unlike git LFS. Glad to help there too.
> >
> > Great to hear -- once the code is in a bit better shape I can open it
> > up on github. Cross platform is definitely one of my focuses. I'm
> > currently implementing in Rust because it targets the same space as C
> > and has great, near trivial, cross-platform support. What sorts of
> > platforms are you interested in? Windows is my first target because
> > that's where many game developers live.
>
> This would likely mean that Git LFS will have to reimplement it, since
> we strictly avoid using CGo (Go's mechanism to issue function calls to
> other languages).
>
> The upshot is that it likely shouldn't be too much effort for anybody,
> and the open-source community would get a Go implementation of the API,
> too.
>
> Thanks,
> Taylor
>



Re: [PATCH v2 3/3] transport.c: introduce core.alternateRefsPrefixes

2018-09-24 Thread Junio C Hamano
Jeff King  writes:

> I was suggesting that check_everything_connected() is not strictly
> transport-related, so would be inappropriate for transport.*, and we'd
> need a more generic name. And my "either way" was that I could see
> an argument that it _is_ transport related, since we only call it now
> when receiving a pack. But that doesn't have to be the case, and
> certainly implementing it with "rev-list --alternate-refs" muddies that
> considerably.

Even after 7043c707 ("check_everything_connected: use a struct with
named options", 2016-07-15) unified many into check_connected(),
there still are different reasons why we call to find out about the
connectivity, and I doubt we can afford to have a single knob that
is shared both for transport and other kind of connectivity checks
(like fsck or repack).  Do we want to be affected by "we pretend
that these are the only refs exported from that alternate object
store" when repacking and pruning only local objects and keep us
rely on the alternate, for example?

In any case it is good that these configuration variables are
defined on _our_ side, not in the alternate---it means that we do
not have to worry about the case where the alternateRefsCommand lies
and tells us that an object that the alternate does not actually
have exists at a tip of a ref in an attempt to confuse us, etc.


Re: git fetch behaves weirdely when run in a worktree

2018-09-24 Thread Duy Nguyen
On Sun, Sep 23, 2018 at 10:19 PM Kaartic Sivaraam
 wrote:
>
> Hi,
>
> I was actually trying to automae the building and installation of Git
> source code to reduce my burden. I tried to automate it with the help
> of a script that runs daily via cron and a separate worktree used only
> by the build script.y run
>
> The script typically fetches new changes for the next branch by running
> the following in the build worktree (which is not the main worktree):
>
>$ git fetch origin next
>
> I thought that would result in FETCH_HEAD pointing to the latest
> changes for origin/next if the command succeeded.
>
> Unfortunately, it seems to be behaving weirdely when run in a worktree.
> It sems to be behaving as if I ran 'git fetch origin'. To add to that
> confusion when I run
> ...
>Why is this weirdness happening when run in other worktrees?
>
>Why isn't 'git fetch   not fetching the changes for
>just the specified branch?
>
>Am I missing something?

Yes, some bugs. It behaves correctly for me. There must be something
strange that triggers this. What's your "git worktree list" (iow
anything strange there, duplicate worktrees perhaps)? Also please try
"git fetch" again with GIT_TRACE=1 and GIT_TRACE_SETUP=1. Hopefully we
could catch something with that.
-- 
Duy


Re: [PATCH] worktree: add per-worktree config files

2018-09-24 Thread Taylor Blau
On Sun, Sep 23, 2018 at 07:04:38PM +0200, Nguyễn Thái Ngọc Duy wrote:
> A new repo extension is added, worktreeConfig. When it is present:

I was initially scratching my head at why this should be a repository
extension, but...

>  - The special treatment for core.bare and core.worktree, to stay
>effective only in main worktree, is gone. These config files are
>supposed to be in config.worktree.

This seems to be sufficient justification for it. A destructive action
(such as migrating configuration from one location to another, as you
have implemented in the patch below) should be something that the user
opts into, rather than having happen automatically.

> This extension is most useful in multiple worktree setup because you
> now have an option to store per-worktree config (which is either
> .git/config.worktree for main worktree, or
> .git/worktrees/xx/config.worktree for linked ones).

This sounds quite useful for these situations.

> diff --git a/Documentation/config.txt b/Documentation/config.txt
> index 8d85d1a324..c24abf5871 100644
> --- a/Documentation/config.txt
> +++ b/Documentation/config.txt
> @@ -2,8 +2,9 @@ CONFIGURATION FILE
>  --
>
>  The Git configuration file contains a number of variables that affect
> -the Git commands' behavior. The `.git/config` file in each repository
> -is used to store the configuration for that repository, and
> +the Git commands' behavior. The files `.git/config` and optionally
> +`config.worktree` (see `extensions.worktreeConfig` below) are each
> +repository is used to store the configuration for that repository, and

Typo: 'are each'.

>  ENVIRONMENT
> diff --git a/Documentation/git-worktree.txt b/Documentation/git-worktree.txt
> index e2ee9fc21b..3f9112db56 100644
> --- a/Documentation/git-worktree.txt
> +++ b/Documentation/git-worktree.txt
> @@ -204,6 +204,43 @@ working trees, it can be used to identify worktrees. For 
> example if
>  you only have two working trees, at "/abc/def/ghi" and "/abc/def/ggg",
>  then "ghi" or "def/ghi" is enough to point to the former working tree.
>
> +CONFIGURATION FILE
> +--
> +By default, the repository "config" file is shared across all working
> +directories. If the config variables `core.bare` or `core.worktree`
> +are already present in the config file, they will be applied to the
> +main working directory only.
> +
> +In order to have configuration specific to working directories, you
> +can turn on "worktreeConfig" extension, e.g.:
> +
> +
> +$ git config extensions.worktreeConfig true
> +

Good, this matches my expectation from above.

> @@ -24,6 +25,7 @@ static char key_delim = ' ';
>  static char term = '\n';
>
>  static int use_global_config, use_system_config, use_local_config;
> +static int use_worktree_config;
>  static struct git_config_source given_config_source;
>  static int actions, type;
>  static char *default_value;
> @@ -123,6 +125,7 @@ static struct option builtin_config_options[] = {
>   OPT_BOOL(0, "global", _global_config, N_("use global config file")),
>   OPT_BOOL(0, "system", _system_config, N_("use system config file")),
>   OPT_BOOL(0, "local", _local_config, N_("use repository config 
> file")),
> + OPT_BOOL(0, "worktree", _worktree_config, N_("use per-worktree 
> config file")),
>   OPT_STRING('f', "file", _config_source.file, N_("file"), N_("use 
> given config file")),
>   OPT_STRING(0, "blob", _config_source.blob, N_("blob-id"), 
> N_("read config from given blob object")),
>   OPT_GROUP(N_("Action")),
> @@ -602,6 +605,7 @@ int cmd_config(int argc, const char **argv, const char 
> *prefix)
>PARSE_OPT_STOP_AT_NON_OPTION);
>
>   if (use_global_config + use_system_config + use_local_config +
> + use_worktree_config +
>   !!given_config_source.file + !!given_config_source.blob > 1) {

I feel like we're getting into "let's extract a function" territory,
here, since this line is growing in width. Perhaps:

  static int config_files_count()
  {
return use_global_config + use_system_config + use_local_config +
use_worktree_config +
!!given_config_source.file +
!!given_config_source.blob;
  }

Simplifying the call to:

> diff --git a/t/t2029-worktree-config.sh b/t/t2029-worktree-config.sh
> new file mode 100755
> index 00..4ebdf13cf9
> --- /dev/null
> +++ b/t/t2029-worktree-config.sh
> @@ -0,0 +1,82 @@
> +#!/bin/sh
> +
> +test_description="config file in multi worktree"
> +
> +. ./test-lib.sh
> +
> +cmp_config() {
> + if [ "$1" = "-C" ]; then
> + shift &&
> + GD="-C $1" &&
> + shift
> + else
> + GD=
> + fi &&
> + echo "$1" >expected &&
> + shift &&
> + git $GD config "$@" >actual &&
> + test_cmp expected actual
> +}

This cmp_config seems generally useful, perhaps beyond t2029. What do
you think about putting it in 

Re: Git for games working group

2018-09-24 Thread Taylor Blau
On Sun, Sep 23, 2018 at 12:53:58PM -0700, John Austin wrote:
> On Sun, Sep 23, 2018 at 10:57 AM Randall S. Becker
>  wrote:
> >  I would even like to help with your effort and have non-unixy platforms 
> > I'd like to do this on.
> > Having this separate from git LFS is an even better idea IMO, and I would 
> > suggest implementing this using the same set of build tools that git uses 
> > so that it is broadly portable, unlike git LFS. Glad to help there too.
>
> Great to hear -- once the code is in a bit better shape I can open it
> up on github. Cross platform is definitely one of my focuses. I'm
> currently implementing in Rust because it targets the same space as C
> and has great, near trivial, cross-platform support. What sorts of
> platforms are you interested in? Windows is my first target because
> that's where many game developers live.

This would likely mean that Git LFS will have to reimplement it, since
we strictly avoid using CGo (Go's mechanism to issue function calls to
other languages).

The upshot is that it likely shouldn't be too much effort for anybody,
and the open-source community would get a Go implementation of the API,
too.

Thanks,
Taylor


Re: Git for games working group

2018-09-24 Thread Taylor Blau
On Sun, Sep 23, 2018 at 01:56:37PM -0400, Randall S. Becker wrote:
> On September 23, 2018 1:29 PM, John Austin wrote:
> > I've been putting together a prototype file-locking implementation for a
> > system that plays better with git. What are everyone's thoughts on
> > something like the following? I'm tentatively labeling this system git-sync 
> > or
> > sync-server. There are two pieces:
> >
> > 1. A centralized repository called the Global Graph that contains the union 
> > git
> > commit graph for local developer repos. When Developer A makes a local
> > commit on branch 'feature', git-sync will automatically push that new commit
> > up to the global server, under a name-spaced
> > branch: 'developera_repoabcdef/feature'. This can be done silently as a
> > force push, and shouldn't ever interrupt the developer's workflow.
> > Simple http queries can be made to the Global Graph, such as "Which
> > commits descend from commit abcdefgh?"
> >
> > 2. A client-side tool that queries the Global Graph to determine when your
> > current changes are in conflict with another developer. It might ask "Are
> > there any commits I don't have locally that modify lockable_file.bin?". This
> > could either be on pre-commit, or for more security, be part of a read-only
> > marking system ala Git LFS. There wouldn't be any "lock" per say, rather, 
> > the
> > client could refuse to modify a file if it found other commits for that 
> > file in the
> > global graph.
> >
> > The key here is the separation of concerns. The Global Graph is fairly
> > dimwitted -- it doesn't know anything about file locking. But it provides a
> > layer of information from which we can implement file locking on the client
> > side (or perhaps other interesting systems).
> >
> > Thoughts?
>
> I'm encouraged of where this is going. I might suggest "sync" is the
> wrong name here, with "mutex" being slightly better - I would even
> like to help with your effort and have non-unixy platforms I'd like to
> do this on.
>
> Having this separate from git LFS is an even better idea IMO, and I
> would suggest implementing this using the same set of build tools that
> git uses so that it is broadly portable, unlike git LFS. Glad to help
> there too.

I think that this is the way that we would prefer it, too. Ideally users
outside of those who have Git LFS installed or those that are regular
users of it should be able to interoperate with those using the global
graph.

We're thinking a lot about what should go into the next major version of
Git LFS, v3.0.0, and this seems a good candidate to me. We'd also want
to figure out how to transition v2.0.0-era locks into the new global
graph, but that seems a topic for a later discussion.

Thanks,
Taylor


Re: [PATCH 1/3] t7001: reformat to newer style

2018-09-24 Thread Derrick Stolee

On 9/21/2018 7:58 PM, Stefan Beller wrote:

The old formatting style is a real hindrance of getting people up to speed
contributing as they use existing code as an example and follow that style.
So let's get rid of the old style and reformat it in our current style.
I was suspicious of your automated approach catching everything, so I 
looked carefully at this patch. There are still a lot of things 
happening that we would not recommend doing in new tests.


Reported-by: Derrick Stolee 
Reported-by: Jeff Hostetler 
Signed-off-by: Stefan Beller 
---
  t/t7001-mv.sh | 268 +-
  1 file changed, 134 insertions(+), 134 deletions(-)

diff --git a/t/t7001-mv.sh b/t/t7001-mv.sh
index 36b50d0b4c1..2251d24735c 100755
--- a/t/t7001-mv.sh
+++ b/t/t7001-mv.sh
@@ -3,74 +3,74 @@
  test_description='git mv in subdirs'
  . ./test-lib.sh
  
-test_expect_success \

-'prepare reference tree' \
-'mkdir path0 path1 &&
- cp "$TEST_DIRECTORY"/../COPYING path0/COPYING &&
- git add path0/COPYING &&
- git commit -m add -a'
+test_expect_success 'prepare reference tree' '
+   mkdir path0 path1 &&
+   cp "$TEST_DIRECTORY"/../COPYING path0/COPYING &&
+   git add path0/COPYING &&
+   git commit -m add -a
+'
  
-test_expect_success \

-'moving the file out of subdirectory' \
-'cd path0 && git mv COPYING ../path1/COPYING'
+test_expect_success 'moving the file out of subdirectory' '
+   cd path0 && git mv COPYING ../path1/COPYING
+'

Perhaps split this line on the &&?
  
  # in path0 currently

-test_expect_success \
-'commiting the change' \
-'cd .. && git commit -m move-out -a'
+test_expect_success 'commiting the change' '
+   cd .. && git commit -m move-out -a
+'


This "cd .." should probably be removed and use a subshell in the test 
above where we "cd path0".


  
-test_expect_success \

-'checking the commit' \
-'git diff-tree -r -M --name-status  HEAD^ HEAD >actual &&
-grep "^R100..*path0/COPYING..*path1/COPYING" actual'
+test_expect_success 'checking the commit' '
+   git diff-tree -r -M --name-status  HEAD^ HEAD >actual &&
+   grep "^R100..*path0/COPYING..*path1/COPYING" actual
+'
  
-test_expect_success \

-'moving the file back into subdirectory' \
-'cd path0 && git mv ../path1/COPYING COPYING'
+test_expect_success 'moving the file back into subdirectory' '
+   cd path0 && git mv ../path1/COPYING COPYING
+'


Split at &&, use subshell?



+test_expect_success 'commiting the change' '
+   cd .. && git commit -m move-in -a
+'


Drop "cd .." (and the comments about being in path0)

[big snip]


+test_expect_success 'moving to existing tracked target with trailing slash' '
+   mkdir path2 &&
+   >path2/file && git add path2/file &&
This line in particular looks a bit strange. What is this doing? At 
least we should split the &&.

+   git mv path1/path0/ path2/ &&
+   test_path_is_dir path2/path0/
+'
+
+test_expect_success 'clean up' '
+   git reset --hard
+'
+
+test_expect_success 'adding another file' '
+   cp "$TEST_DIRECTORY"/../README.md path0/README &&
+   git add path0/README &&
+   git commit -m add2 -a
+'
+
+test_expect_success 'moving whole subdirectory' '
+   git mv path0 path2
+'
+
+test_expect_success 'commiting the change' '
+   git commit -m dir-move -a
+'
+
+test_expect_success 'checking the commit' '
+   git diff-tree -r -M --name-status  HEAD^ HEAD >actual &&
+   grep "^R100..*path0/COPYING..*path2/COPYING" actual &&
+   grep "^R100..*path0/README..*path2/README" actual
+'
+
+test_expect_success 'succeed when source is a prefix of destination' '
+   git mv path2/COPYING path2/COPYING-renamed
+'
+
+test_expect_success 'moving whole subdirectory into subdirectory' '
+   git mv path2 path1
+'
+
+test_expect_success 'commiting the change' '
+   git commit -m dir-move -a
+'
+
+test_expect_success 'checking the commit' '
+   git diff-tree -r -M --name-status  HEAD^ HEAD >actual &&
+   grep "^R100..*path2/COPYING..*path1/path2/COPYING" actual &&
+   grep "^R100..*path2/README..*path1/path2/README" actual
+'
+
+test_expect_success 'do not move directory over existing directory' '
+   mkdir path0 && mkdir path0/path2 && test_must_fail git mv path2 path0
+'


Split this line.

Thanks,

-Stolee



Re: [PATCH v3 1/2] commit-reach: properly peel tags

2018-09-24 Thread Derrick Stolee

On 9/21/2018 7:56 PM, Jeff King wrote:

On Fri, Sep 21, 2018 at 08:05:26AM -0700, Derrick Stolee via GitGitGadget wrote:
+   if (!from_one || from_one->type != OBJ_COMMIT) {
+   /* no way to tell if this is reachable by
+* looking at the ancestry chain alone, so
+* leave a note to ourselves not to worry about
+* this object anymore.
+*/
A minor nit, but the original comment you restored here has a style
violation. Might be worth fixing up (but certainly not worth a re-roll
on its own).


@@ -600,7 +622,7 @@ int can_all_from_reach_with_flag(struct object_array *from,
}
  
  cleanup:

-   for (i = 0; i < from->nr; i++) {
+   for (i = 0; i < nr_commits; i++) {
clear_commit_marks(list[i], RESULT);
clear_commit_marks(list[i], assign_flag);

Also, a minor aside, but I think it would be slightly more efficient for
those final two lines to do:

   clear_commit_marks(list[i], RESULT | assign_flag);

Of course, that's totally orthogonal to this patch, but it may make you
feel better to offset the other round of clearing I'm suggesting. ;)


This is definitely a better thing to do, and I'll make the change in v4, 
along with the comment style cleanup above.


Thanks,

-Stolee




Re: [RFC PATCH 0/3] diff --color-moved-ws: allow mixed spaces and tabs in indentation change

2018-09-24 Thread Phillip Wood
On 24/09/2018 11:06, Phillip Wood wrote:
> From: Phillip Wood 
> 
> When trying out the new --color-moved-ws=allow-indentation-change I
> was disappointed to discover it did not work if the indentation
> contains a mix of spaces and tabs. This series adds a new option that
> does. It's and RFC as there are some open questions about how to
> proceed, see the last patch for details. Once there's a clearer way
> forward I'll reroll with some documentation changes.
> 

I should have said that this series is based on top of fab01ec52e
("diff: fix --color-moved-ws=allow-indentation-change", 2018-09-04), the
result merges into current master without conflicts.

Best Wishes

Phillip

> Phillip Wood (3):
>   xdiff-interface: make xdl_blankline() available
>   diff.c: remove unused variables
>   diff: add --color-moved-ws=allow-mixed-indentation-change
> 
>  diff.c | 124 -
>  diff.h |   1 +
>  t/t4015-diff-whitespace.sh |  89 ++
>  xdiff-interface.c  |   5 ++
>  xdiff-interface.h  |   5 ++
>  5 files changed, 208 insertions(+), 16 deletions(-)
> 



Re: [PATCH v5 1/9] submodule: add a print_config_from_gitmodules() helper

2018-09-24 Thread Antonio Ospite
On Mon, 17 Sep 2018 16:09:32 +0200
Antonio Ospite  wrote:

> Add a new print_config_from_gitmodules() helper function to print values
> from .gitmodules just like "git config -f .gitmodules" would.
> 
[...]

> +int print_config_from_gitmodules(const char *key)

I am thinking about adding  a "struct repository" argument to this
function

> +{
> + int ret;
> + char *store_key;
> +
> + ret = git_config_parse_key(key, _key, NULL);
> + if (ret < 0)
> + return CONFIG_INVALID_KEY;
> +
> + config_from_gitmodules(config_print_callback, the_repository, 
> store_key);

And use it here, to avoid another usage of "the_repository" when it's
not strictly necessary.

Ciao,
   Antonio

-- 
Antonio Ospite
https://ao2.it
https://twitter.com/ao2it

A: Because it messes up the order in which people normally read text.
   See http://en.wikipedia.org/wiki/Posting_style
Q: Why is top-posting such a bad thing?


Re: [PATCH v5 9/9] submodule: support reading .gitmodules when it's not in the working tree

2018-09-24 Thread Antonio Ospite
On Tue, 18 Sep 2018 19:12:57 +0200
SZEDER Gábor  wrote:

[...]
> On Mon, Sep 17, 2018 at 04:09:40PM +0200, Antonio Ospite wrote:
> > When the .gitmodules file is not available in the working tree, try
> > using the content from the index and from the current branch.
> 
> "from the index and from the current branch" of which repository?
>

I took a look, some comments below.

> > diff --git a/submodule-config.c b/submodule-config.c
> > index 61a555e920..bdb1d0e2c9 100644
> > --- a/submodule-config.c
> > +++ b/submodule-config.c
> 
> > @@ -603,8 +604,21 @@ static void submodule_cache_check_init(struct 
> > repository *repo)
> >  static void config_from_gitmodules(config_fn_t fn, struct repository 
> > *repo, void *data)
> >  {
[...]
> > +   file = repo_worktree_path(repo, GITMODULES_FILE);
> > +   if (file_exists(file))
> > +   config_source.file = file;
> > +   else if (get_oid(GITMODULES_INDEX, ) >= 0)
> > +   config_source.blob = GITMODULES_INDEX;
> > +   else if (get_oid(GITMODULES_HEAD, ) >= 0)
> > +   config_source.blob = GITMODULES_HEAD;
> > +
> 
> The repository used in t7814 contains nested submodules, which means
> that config_from_gitmodules() is invoked three times.
> 
> Now, the first two of those calls look at the superproject and at
> 'submodule', and find the existing files '.../trash
> directory.t7814-grep-recurse-submodules/.gitmodules' and '.../trash
> directory.t7814-grep-recurse-submodules/submodule/.gitmodules',
> respectively.  So far so good.
> 
> The third call, however, looks at the nested submodule at
> 'submodule/sub', which doesn't contain a '.gitmodules' file.  So this
> function goes on with the second condition and calls
> get_oid(GITMODULES_INDEX, ), which then appears to find the blob
> in the _superproject's_ index.
>

You are correct.

This is a limitation of the object store in git, there is no equivalent
of get_oid() to get the oid from a specific repository and this affects
config_with_options too when the config source is a blob.

This does not affect commands called via "git -C submodule_dir cmd"
because in that case the chdir happens before the_repository is set up,
for instance "git-submodule $SOMETHING --recursive" commands seem to
change the working directory before the recursion.

> > +   config_with_options(fn, data, _source, );
> > +
> > free(file);
> > }
> >  }
> I'm no expert on submodules, but my gut feeling says that this can't
> be right.  But if it _is_ right, then I would say that the commit
> message should explain in detail, why it is right.
> 

I agree it isn't right, I didn't consider the case of nested
submodules, but even if I did the current git design does not
allow to correctly solve the problem: "read config from blob of an
arbitrary repository".

So what to do for the time being?

The issue is there but in a "normal" scenario it is not causing any real
harm because:

  1. currently it is exposed "only" by git grep and nested submodules.

  2. the new mechanism works fine when reading the submodule config for
 the root repository.

  3. the new mechanism does not "usually" impact non-leaf nested
 submodules, because the .gitmodules file is "normally" there.

  4. git grep never *uses* the GITMODULES_INDEX erroneously read
 from the root project when scanning the _leaf_ nested submodule
 because there are no further submodules down the line, the
 following check fails in builtin/grep.c:

   if (recurse_submodules && S_ISGITLINK(entry.mode) ...
   ...

In fact, because of 4. the test suite passes even if the gitmodule
config is not correct for the leaf submodule.

Actually 4. makes me think that the repo_read_gitmodules() call in
builtin/grep.c might not be strictly necessary, its purpose seems to be
to *anticipate* reading the config of *child* submodules while still
processing the *current* submodule, the config for the *current*
submodule was already read from the superproject by the immediately
preceding repo_submodule_init(), via:

  repo_submodule_init()
submodule_from_path()
  gitmodules_read_check()
repo_read_gitmodules()

And this would happen anyway also for child submodules down the
recursion path if we removed repo_read_gitmodules() in builtin/grep.c,
the operation would be not protected by grep_read_lock() tho.

The test suite passes even after removing repo_read_gitmodules()
entirely from builtin/grep.c, but I am still not confident that I get
all the implication of why that call was originally added in commit
f9ee2fcdfa (grep: recurse in-process using 'struct repository',
2017-08-02).

Anyways, even if we removed the call we would prevent the problem from
happening in the test suite, but not in the real world, in case non-leaf
submodules without .gitmodules in their working tree.

To recap:
  - patch 9/9 exposes a problem with the object store but for now it's
only a potential 

[RFC PATCH 2/3] diff.c: remove unused variables

2018-09-24 Thread Phillip Wood
From: Phillip Wood 

The string lengths are not used in cmp_in_block_with_wsd() so lets
remove them.

Signed-off-by: Phillip Wood 
---
 diff.c | 8 ++--
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/diff.c b/diff.c
index 9393993e33..0a652e28d4 100644
--- a/diff.c
+++ b/diff.c
@@ -789,7 +789,6 @@ static int cmp_in_block_with_wsd(const struct diff_options 
*o,
 int n)
 {
struct emitted_diff_symbol *l = >emitted_symbols->buf[n];
-   int al = cur->es->len, cl = l->len;
const char *a = cur->es->line,
   *b = match->es->line,
   *c = l->line;
@@ -823,13 +822,10 @@ static int cmp_in_block_with_wsd(const struct 
diff_options *o,
 */
 
wslen = strlen(pmb->wsd->string);
-   if (pmb->wsd->current_longer) {
+   if (pmb->wsd->current_longer)
c += wslen;
-   cl -= wslen;
-   } else {
+   else
a += wslen;
-   al -= wslen;
-   }
 
if (strcmp(a, c))
return 1;
-- 
2.19.0



[RFC PATCH 3/3] diff: add --color-moved-ws=allow-mixed-indentation-change

2018-09-24 Thread Phillip Wood
From: Phillip Wood 

This adds another mode for highlighting lines that have moved with an
indentation change. Unlike the existing
--color-moved-ws=allow-indentation-change setting this mode uses the
visible change in the indentation to group lines, rather than the
indentation string. This means it works with files that use a mix of
tabs and spaces for indentation and can cope with whitespace errors
where there is a space before a tab (it's the job of
--ws-error-highlight to deal with those errors, it should affect the
move detection). It will also group the lines either
side of a blank line if their indentation change matches so short
lines followed by a blank line followed by more lines with the same
indentation change will be correctly highlighted.

This is a RFC as there are a number of questions about how to proceed
from here:
 1) Do we need a second option or should this implementation replace
--color-moved-ws=allow-indentation-change. I'm unclear if that mode
has any advantages for some people. There seems to have been an
intention [1] to get it working with mixes of tabs and spaces but
nothing ever came of it.
 2) If we keep two options what should this option be called, the name
is long and ambiguous at the moment - mixed could refer to mixed
indentation length rather than a mix of tabs and spaces.
 3) Should we support whitespace flags with this mode?
--ignore-space-at-eol and --ignore-cr-at eol would be fairly simple
to support and I can see a use for them, --ignore-all-space and
--ignore-space-change would need some changes to xdiff to allow them
to apply only after the indentation. I think --ignore-blank-lines
would need a bit of work to get it working as well. (Note the
existing mode does not support any of these flags either)

[1] 
https://public-inbox.org/git/CAGZ79kasAqE+=7ciVrdjoRdu0UFjVBr8Ma502nw+3hZL=eb...@mail.gmail.com/

Signed-off-by: Phillip Wood 
---
 diff.c | 122 +
 diff.h |   1 +
 t/t4015-diff-whitespace.sh |  89 +++
 3 files changed, 199 insertions(+), 13 deletions(-)

diff --git a/diff.c b/diff.c
index 0a652e28d4..45f33daa60 100644
--- a/diff.c
+++ b/diff.c
@@ -304,7 +304,11 @@ static int parse_color_moved_ws(const char *arg)
else if (!strcmp(sb.buf, "ignore-all-space"))
ret |= XDF_IGNORE_WHITESPACE;
else if (!strcmp(sb.buf, "allow-indentation-change"))
-   ret |= COLOR_MOVED_WS_ALLOW_INDENTATION_CHANGE;
+   ret = COLOR_MOVED_WS_ALLOW_INDENTATION_CHANGE |
+(ret & ~COLOR_MOVED_WS_ALLOW_MIXED_INDENTATION_CHANGE);
+   else if (!strcmp(sb.buf, "allow-mixed-indentation-change"))
+   ret = COLOR_MOVED_WS_ALLOW_MIXED_INDENTATION_CHANGE |
+(ret & ~COLOR_MOVED_WS_ALLOW_INDENTATION_CHANGE);
else
error(_("ignoring unknown color-moved-ws mode '%s'"), 
sb.buf);
 
@@ -314,6 +318,9 @@ static int parse_color_moved_ws(const char *arg)
if ((ret & COLOR_MOVED_WS_ALLOW_INDENTATION_CHANGE) &&
(ret & XDF_WHITESPACE_FLAGS))
die(_("color-moved-ws: allow-indentation-change cannot be 
combined with other white space modes"));
+   else if ((ret & COLOR_MOVED_WS_ALLOW_MIXED_INDENTATION_CHANGE) &&
+(ret & XDF_WHITESPACE_FLAGS))
+   die(_("color-moved-ws: allow-mixed-indentation-change cannot be 
combined with other white space modes"));
 
string_list_clear(, 0);
 
@@ -763,11 +770,65 @@ struct moved_entry {
  * comparision is longer than the second.
  */
 struct ws_delta {
-   char *string;
+   union {
+   int delta;
+   char *string;
+   };
unsigned int current_longer : 1;
+   unsigned int have_string : 1;
 };
 #define WS_DELTA_INIT { NULL, 0 }
 
+static int compute_mixed_ws_delta(const struct emitted_diff_symbol *a,
+ const struct emitted_diff_symbol *b,
+ int *delta)
+{
+   unsigned int i = 0, j = 0;
+   int la, lb;
+   int ta = a->flags & WS_TAB_WIDTH_MASK;
+   int tb = b->flags & WS_TAB_WIDTH_MASK;
+   const char *sa = a->line;
+   const char *sb = b->line;
+
+   if (xdiff_is_blankline(sa, a->len, 0) &&
+   xdiff_is_blankline(sb, b->len, 0)) {
+   *delta = INT_MIN;
+   return 1;
+   }
+
+   /* skip any \v \f \r at start of indentation */
+   while (sa[i] == '\f' || sa[i] == '\v' ||
+  (sa[i] == '\r' && i < a->len - 1))
+   i++;
+   while (sb[j] == '\f' || sb[j] == '\v' ||
+  (sb[j] == '\r' && j < b->len - 1))
+   j++;
+
+   for (la = 0; ; i++) {
+   if (sa[i] == ' ')
+   la++;
+   else if 

[RFC PATCH 0/3] diff --color-moved-ws: allow mixed spaces and tabs in indentation change

2018-09-24 Thread Phillip Wood
From: Phillip Wood 

When trying out the new --color-moved-ws=allow-indentation-change I
was disappointed to discover it did not work if the indentation
contains a mix of spaces and tabs. This series adds a new option that
does. It's and RFC as there are some open questions about how to
proceed, see the last patch for details. Once there's a clearer way
forward I'll reroll with some documentation changes.

Phillip Wood (3):
  xdiff-interface: make xdl_blankline() available
  diff.c: remove unused variables
  diff: add --color-moved-ws=allow-mixed-indentation-change

 diff.c | 124 -
 diff.h |   1 +
 t/t4015-diff-whitespace.sh |  89 ++
 xdiff-interface.c  |   5 ++
 xdiff-interface.h  |   5 ++
 5 files changed, 208 insertions(+), 16 deletions(-)

-- 
2.19.0



[RFC PATCH 1/3] xdiff-interface: make xdl_blankline() available

2018-09-24 Thread Phillip Wood
From: Phillip Wood 

This will be used by the move detection code.

Signed-off-by: Phillip Wood 
---
 xdiff-interface.c | 5 +
 xdiff-interface.h | 5 +
 2 files changed, 10 insertions(+)

diff --git a/xdiff-interface.c b/xdiff-interface.c
index 9315bc0ede..eceabfa72d 100644
--- a/xdiff-interface.c
+++ b/xdiff-interface.c
@@ -308,6 +308,11 @@ int xdiff_compare_lines(const char *l1, long s1,
return xdl_recmatch(l1, s1, l2, s2, flags);
 }
 
+int xdiff_is_blankline(const char *l1, long s1, long flags)
+{
+   return xdl_blankline(l1, s1, flags);
+}
+
 int git_xmerge_style = -1;
 
 int git_xmerge_config(const char *var, const char *value, void *cb)
diff --git a/xdiff-interface.h b/xdiff-interface.h
index 135fc05d72..d0008b016f 100644
--- a/xdiff-interface.h
+++ b/xdiff-interface.h
@@ -45,4 +45,9 @@ extern int xdiff_compare_lines(const char *l1, long s1,
  */
 extern unsigned long xdiff_hash_string(const char *s, size_t len, long flags);
 
+/*
+ * Returns 1 if the line is blank, taking XDF_WHITESPACE_FLAGS into account
+ */
+extern int xdiff_is_blankline(const char *s, long len, long flags);
+
 #endif
-- 
2.19.0