Re: commit-graph: change in "best" merge-base when ambiguous

2018-05-21 Thread Michael Haggerty
On 05/21/2018 08:10 PM, Derrick Stolee wrote:
> [...]
> In the Discussion section of the `git merge-base` docs [1], we have the
> following:
> 
>     When the history involves criss-cross merges, there can be more than
> one best common ancestor for two commits. For example, with this topology:
> 
>     ---1---o---A
>         \ /
>      X
>         / \
>     ---2---o---o---B
> 
>     both 1 and 2 are merge-bases of A and B. Neither one is better than
> the other (both are best merge bases). When the --all option is not
> given,     it is unspecified which best one is output.
> 
> This means our official documentation mentions that we do not have a
> concrete way to differentiate between these choices. This makes me think
> that this change in behavior is not a bug, but it _is_ a change in
> behavior. It's worth mentioning, but I don't think there is any value in
> making sure `git merge-base` returns the same output.
> 
> Does anyone disagree? Is this something we should solidify so we always
> have a "definitive" merge-base?
> [...]

This may be beyond the scope of what you are working on, but there are
significant advantages to selecting a "best" merge base from among the
candidates. Long ago [1] I proposed that the "best" merge base is the
merge base candidate that minimizes the number of non-merge commits that
are in

git rev-list $candidate..$branch

that are already in master:

git rev-list $master

(assuming merging branch into master), which is equivalent to choosing
the merge base that minimizes

git rev-list --count $candidate..$branch

In fact, this criterion is symmetric if you exchange branch ↔ master,
which is a nice property, and indeed generalizes pretty simply to
computing the merge base of more than two commits.

In that email I also included some data showing that the "best" merge
base almost always results in either the same or a shorter diff than the
more or less arbitrary algorithm that we currently use. Sometimes the
difference in diff length is dramatic.

To me it feels like the best *deterministic* merge base would be based
on the above criterion, maybe with first-parent reachability, commit
times, and SHA-1s used (in that order) to break ties.

I don't plan to work on the implementation of this idea myself (though
we've long used a script-based implementation of this algorithm
internally at GitHub).

Michael

[1] https://public-inbox.org/git/539a25bf.4060...@alum.mit.edu/
See the rest of the thread for more interesting discussion.
[2]
https://public-inbox.org/git/8a9b3f20-eed2-c59b-f7ea-3c68b3c30...@alum.mit.edu/
Higher in this thread, Junio proposes a different criterion.


Re: [PATCH v5 0/4] unpack_trees_options: free messages when done

2018-05-21 Thread Junio C Hamano
Junio C Hamano  writes:

> Martin Ågren  writes:
>
>> I've taken the six patches that Junio has queued and rebuilt the series
>> to get rid of the new and possibly bug-prone function that no-one uses
>> once the series is over.
>
> Hmph, this unfortunately depends on 'next', which means we cannot
> merge it down to 'maint' later to fix these leaks.  I guess it is
> not a huge deal, though.  We've lived with these message leaks for
> quite some time now and earth still kept rotating ;-)

Oh, what was I thinking.  This, just like its previous rounds, is on
top of bp/merge-rename-config^0 and it is expected *not* to be
mergeable to 'maint' (or 'master', for that matter, at least not
yet).

Will queue.  Thanks.


Re: why does "man git-clean" not mention files ignored by core.excludesFile?

2018-05-21 Thread Junio C Hamano
"Robert P. J. Day"  writes:

>   why is there no mention of files ignored via a user's
> core.excludesFile configuration?

IIUC, core.excludesFile is a much much later invention made long
after everybody lost interest in updating "git clean", let alone its
documentation.  The support for the configuration variable was added
to the internal API used to access exclude mechanism, so "clean",
together with other users of the same API, got it for free when it
was added, and nobody bothered to update the documentation of
"clean".

In other words, a short answer is because you haven't made it to
mention it ;-).



Re: [PATCH v5 0/4] unpack_trees_options: free messages when done

2018-05-21 Thread Junio C Hamano
Martin Ågren  writes:

> I've taken the six patches that Junio has queued and rebuilt the series
> to get rid of the new and possibly bug-prone function that no-one uses
> once the series is over.

Hmph, this unfortunately depends on 'next', which means we cannot
merge it down to 'maint' later to fix these leaks.  I guess it is
not a huge deal, though.  We've lived with these message leaks for
quite some time now and earth still kept rotating ;-)



Re: [PATCH] regex: do not call `regfree()` if compilation fails

2018-05-21 Thread Eric Sunshine
On Mon, May 21, 2018 at 2:43 PM, Stefan Beller  wrote:
> On Sun, May 20, 2018 at 3:50 AM, Martin Ågren  wrote:
>> It is apparently undefined behavior to call `regfree()` on a regex where
>> `regcomp()` failed. [...]
>>
>> diff --git a/diffcore-pickaxe.c b/diffcore-pickaxe.c
>> @@ -215,7 +215,6 @@ static void regcomp_or_die(regex_t *regex, const char 
>> *needle, int cflags)
>> /* The POSIX.2 people are surely sick */
>> char errbuf[1024];
>> regerror(err, regex, errbuf, 1024);
>> -   regfree(regex);
>> die("invalid regex: %s", errbuf);
>
> While the commit message is very clear why we supposedly introduce a leak 
> here,
> it is hard to be found from the source code (as we only delete code
> there, so digging
> for history is not obvious), so maybe
>
>  /* regfree(regex) is invalid here */
>
> instead?

The commit message doesn't say that we are supposedly introducing a
leak (and, indeed, no resources should have been allocated to the
'regex' upon failed compile). It's saying that removing this call
potentially avoids a crash under some implementations.

Given that the very next line is die(), and that the function name has
"_or_die" in it, I'm not sure that an in-code comment about regfree()
being invalid upon failed compile would be all that helpful; indeed,
it could be confusing, causing the reader to wonder why that is
significant if we're just dying anyhow. I find that the patch, as is,
clarifies rather than muddles the situation.


Re: which files are "known to git"?

2018-05-21 Thread Junio C Hamano
Jonathan Nieder  writes:

> My understanding was the same as Elijah's.
>
> I would be in favor of a patch that replaces the phrase "known to Git"
> in Git's documentation with something less confusing.

One possible twist I recall was that normally we only pay attention
to the index (i.e. the term "tracked files" means "paths shown in
ls-files (no args) output"), but there was a desire to take a union
of paths in the index and paths in the HEAD (the obvious difference
is those that are removed from the index), and we may have called
these "known to Git" in the discussion to distinguish them from
"paths in the index".

Clearly the phrase we are discussing (e.g. the ones used in "git
clean" documentation) has been used _without_ such a desire but
merely is used carelessly and confusingly.  So I am all for finding
"known to Git" and replace them with "tracked" and/or "added to the
index" when the phrase is not used to mean "union of paths in the
index and in the HEAD".

Thanks.



Re: [PATCH v2 02/18] Add a new builtin: branch-diff

2018-05-21 Thread Junio C Hamano
Jeff King  writes:

>> I haven't really been following all of the discussion but from what I
>> can tell the point of this command is to generate a diff based on two
>> different versions of a series, so why not call it 'series-diff'? :)
>
> That's OK with me, though I prefer "range" as I think we use that term
> elsewhere ("series" is usually part of "patch series", but many people
> do not use a workflow with that term).

FWIW, I am OK with either, with a bit of preference to "range" over
"series".  As long as this stays to be an independent command (as
opposed to be made into a new mode to existing command) and the
command name is not overly hard to type, I am OK with anything ;-)


Re: [PATCH 00/18] Add `branch-diff`, a `tbdiff` lookalike

2018-05-21 Thread Junio C Hamano
Johannes Schindelin  writes:

>> In the picture, the three pre-context lines that are all indented by
>> a HT are prefixed by a SP, and that is prefixed by a '+' sign of the
>> outer diff.
>
> Yep, that's exactly it.
>
> The way tbdiff did it was to parse the diff and re-roll the coloring on
> its own. I am not really keen on doing that in `branch --diff`, too.

Are you saying that these are "whitespace errors" getting painted?
It is somewhat odd because my whitespace errors are configured to be
painted in "reverse blue".  Perhaps you are forcing the internal
diff not to pay attention to the end-user configuration---which
actually does make sense, as reusing of "git diff" to take "diff of
diff" is a mere implementation detail.

In any case, the "whitespace errors" in "diff of diff" are mostly
distracting.

> I was wondering from the get-go whether it would make sense to make
> --dual-color the default.
>
> But now I wonder whether there is actually *any* use in `--color` without
> `--dual-color`.
>
> What do you think? Should I make the dual color mode the *only* color
> mode?

Sorry but you are asking a good question to a wrong person.

I normally do not seek much useful information in colored output, so
my reaction would not be very useful.  Non dual-color mode irritates
me due to the false whitespace errors, and dual-color mode irritates
me because it looks sufficiently different from tbdiff output that I
am used to see.



Re: [PATCH 1/2] remote-curl: accept all encoding supported by curl

2018-05-21 Thread Jonathan Nieder
Hi,

Stefan Beller wrote:
> On Mon, May 21, 2018 at 4:40 PM, Brandon Williams  wrote:

>> Configure curl to accept all encoding which curl supports instead of
>> only accepting gzip responses.
>
> This partially reverts aa90b9697f9 (Enable info/refs gzip decompression
> in HTTP client, 2012-09-19), as that specifically called out deflate not being
> a good option. Is that worth mentioning in the commit message?

More specifically, it mentions the wasted 9 extra bytes from including
"deflate, " on the Accept-Encoding line.  I think the extra bandwidth
usage will be okay. :)

[...]
>> -   curl_easy_setopt(slot->curl, CURLOPT_ENCODING, "gzip");
>> +   curl_easy_setopt(slot->curl, CURLOPT_ENCODING, "");
>
> Looking at the code here, this succeeds if enough memory is available.
> There is no check if the given parameter is part of
> Curl_all_content_encodings();
> https://github.com/curl/curl/blob/e66cca046cef20d00fba89260dfa6b4a3997233d/lib/setopt.c#L429
> https://github.com/curl/curl/blob/c675c40295045d4988eeb6291c54eb48f138822f/lib/content_encoding.c#L686
>
> which may be worth checking first?

By "this" are you referring to the preimage or the postimage?  Are you
suggesting a change in git or in libcurl?

Curl_all_content_encodings() is an internal function in libcurl, so
I'm assuming the latter.

Thanks,
Jonathan


[PATCH v2 4/5] merge-recursive: rename conflict_rename_*() family of functions

2018-05-21 Thread Elijah Newren
These functions were added because processing of these conflicts needed
to be deferred until process_entry() in order to get D/F conflicts and
such right.  The number of these has grown over time, and now include
some whose name is misleading:
  * conflict_rename_normal() is for handling normal file renames; a
typical rename may need content merging, but we expect conflicts
from that to be more the exception than the rule.
  * conflict_rename_via_dir() will not be a conflict; it was just an
add that turned into a move due to directory rename detection.
(If there was a file in the way of the move, that would have been
detected and reported earlier.)
  * conflict_rename_rename_2to1 and conflict_rename_add (the latter
of which doesn't exist yet but has been submitted before and I
intend to resend) technically might not be conflicts if the
colliding paths happen to match exactly.
Rename this family of functions to handle_rename_*().

Also rename handle_renames() to detect_and_process_renames() both to make
it clearer what it does, and to differentiate it as a pre-processing step
from all the handle_rename_*() functions which are called from
process_entry().

Acked-by: Johannes Schindelin 
Signed-off-by: Elijah Newren 
---
 merge-recursive.c | 86 +++
 1 file changed, 43 insertions(+), 43 deletions(-)

diff --git a/merge-recursive.c b/merge-recursive.c
index e2b0db0645..270a4d2aad 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -1224,10 +1224,10 @@ static int merge_file_one(struct merge_options *o,
return merge_file_1(o, , , , path, branch1, branch2, mfi);
 }
 
-static int conflict_rename_via_dir(struct merge_options *o,
-  struct diff_filepair *pair,
-  const char *rename_branch,
-  const char *other_branch)
+static int handle_rename_via_dir(struct merge_options *o,
+struct diff_filepair *pair,
+const char *rename_branch,
+const char *other_branch)
 {
/*
 * Handle file adds that need to be renamed due to directory rename
@@ -1329,10 +1329,10 @@ static int handle_change_delete(struct merge_options *o,
return ret;
 }
 
-static int conflict_rename_delete(struct merge_options *o,
- struct diff_filepair *pair,
- const char *rename_branch,
- const char *delete_branch)
+static int handle_rename_delete(struct merge_options *o,
+   struct diff_filepair *pair,
+   const char *rename_branch,
+   const char *delete_branch)
 {
const struct diff_filespec *orig = pair->one;
const struct diff_filespec *dest = pair->two;
@@ -1434,8 +1434,8 @@ static int handle_file(struct merge_options *o,
return ret;
 }
 
-static int conflict_rename_rename_1to2(struct merge_options *o,
-  struct rename_conflict_info *ci)
+static int handle_rename_rename_1to2(struct merge_options *o,
+struct rename_conflict_info *ci)
 {
/* One file was renamed in both branches, but to different names. */
struct diff_filespec *one = ci->pair1->one;
@@ -1496,8 +1496,8 @@ static int conflict_rename_rename_1to2(struct 
merge_options *o,
return 0;
 }
 
-static int conflict_rename_rename_2to1(struct merge_options *o,
-  struct rename_conflict_info *ci)
+static int handle_rename_rename_2to1(struct merge_options *o,
+struct rename_conflict_info *ci)
 {
/* Two files, a & b, were renamed to the same thing, c. */
struct diff_filespec *a = ci->pair1->one;
@@ -2231,7 +2231,7 @@ static void apply_directory_rename_modifications(struct 
merge_options *o,
 * "NOTE" in update_stages(), doing so will modify the current
 * in-memory index which will break calls to would_lose_untracked()
 * that we need to make.  Instead, we need to just make sure that
-* the various conflict_rename_*() functions update the index
+* the various handle_rename_*() functions update the index
 * explicitly rather than relying on unpack_trees() to have done it.
 */
get_tree_entry(>object.oid,
@@ -2635,12 +2635,12 @@ static void initial_cleanup_rename(struct 
diff_queue_struct *pairs,
free(pairs);
 }
 
-static int handle_renames(struct merge_options *o,
- struct tree *common,
- struct tree *head,
- struct tree *merge,
- struct string_list *entries,
- struct rename_info *ri)
+static int 

[PATCH v2 5/5] merge-recursive: add pointer about unduly complex looking code

2018-05-21 Thread Elijah Newren
handle_change_delete() has a block of code displaying one of four nearly
identical messages.  Each contains about half a dozen variable interpolations,
which use nearly identical variables as well.  Someone trying to parse this
may be slowed down trying to parse the differences and why they are here; help
them out by adding a comment explaining the differences.

Further, point out that this code structure isn't collapsed into something
more concise and readable for the programmer, because we want to keep full
messages intact in order to make translators' jobs much easier.

Signed-off-by: Elijah Newren 
---
 merge-recursive.c | 15 +++
 1 file changed, 15 insertions(+)

diff --git a/merge-recursive.c b/merge-recursive.c
index 270a4d2aad..28f39b3e9f 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -1290,6 +1290,21 @@ static int handle_change_delete(struct merge_options *o,
if (!ret)
ret = update_file(o, 0, o_oid, o_mode, update_path);
} else {
+   /*
+* Despite the four nearly duplicate messages and argument
+* lists below and the ugliness of the nested if-statements,
+* having complete messages makes the job easier for
+* translators.
+*
+* The slight variance among the cases is due to the fact
+* that:
+*   1) directory/file conflicts (in effect if
+*  !alt_path) could cause us to need to write the
+*  file to a different path.
+*   2) renames (in effect if !old_path) could mean that
+*  there are two names for the path that the user
+*  may know the file by.
+*/
if (!alt_path) {
if (!old_path) {
output(o, 1, _("CONFLICT (%s/delete): %s 
deleted in %s "
-- 
2.17.0.847.g20b8963732



[PATCH v2 0/5] merge-recursive code cleanups

2018-05-21 Thread Elijah Newren
This patch series contains several small code cleanups for merge-recursive.
I have removed a couple small cleanup chunks in order to avoid conflicts
with any other in-flight topics in pu (namely, nd/commit-util-to-slab and
sb/submodule-merge-in-merge-recursive).  I may resend those later
separately.

The series was made on top of next (specifically commit 4fb28f7b1639
("Merge branch 'sb/object-store-replace' into next")); it will not
apply to master.


Changes since v1:
  * As suggested by Dscho, replaced patch 5 with a pointer about the
fact that the code is that way to help translators
  * Small rewordings or paragraph splittings suggested by Dscho
  * Added Dscho's Acked-by on relevant patches (2-4)


Elijah Newren (5):
  merge-recursive: fix miscellaneous grammar error in comment
  merge-recursive: fix numerous argument alignment issues
  merge-recursive: clarify the rename_dir/RENAME_DIR meaning
  merge-recursive: rename conflict_rename_*() family of functions
  merge-recursive: add pointer about unduly complex looking code

 merge-recursive.c | 152 ++
 1 file changed, 87 insertions(+), 65 deletions(-)


branch-diff:
 1: 1af1c7df17 =  1: 96225eddbb merge-recursive: fix miscellaneous grammar 
error in comment
 2: 8ed7c8b588 !  2: c63f72e96d merge-recursive: fix numerous argument 
alignment issues
@@ -5,6 +5,7 @@
 Various refactorings throughout the code have left lots of alignment
 issues that were driving me crazy; fix them.
 
+Acked-by: Johannes Schindelin 
 Signed-off-by: Elijah Newren 
 
 diff --git a/merge-recursive.c b/merge-recursive.c
 3: e96b2cd9ae !  3: 3d95e8b756 merge-recursive: clarify the 
rename_dir/RENAME_DIR meaning
@@ -4,14 +4,17 @@
 
 We had an enum of rename types which included RENAME_DIR; this name 
felt
 misleading since it was not about an entire directory but was a status 
for
-each individual file add that occurred within a renamed directory.  
Since
-this type is for signifying that the files in question were being 
renamed
-due to directory rename detection, rename this enum value to
-RENAME_VIA_DIR.  Make a similar change to the conflict_rename_dir()
-function, and add a comment to the top of that function explaining its
-purpose (it may not be quite as obvious as for the other
-conflict_rename_*() functions).
+each individual file add that occurred within a renamed directory.
 
+Since this type is for signifying that the files in question were being
+renamed due to directory rename detection, rename this enum value to
+RENAME_VIA_DIR.
+
+Make a similar change to the conflict_rename_dir() function, and add a
+comment to the top of that function explaining its purpose (it may not 
be
+quite as obvious as for the other conflict_rename_*() functions).
+
+Acked-by: Johannes Schindelin 
 Signed-off-by: Elijah Newren 
 
 diff --git a/merge-recursive.c b/merge-recursive.c
@@ -42,7 +45,7 @@
 +  /*
 +   * Handle file adds that need to be renamed due to directory rename
 +   * detection.  This differs from handle_rename_normal, because
-+   * there is no content merge to do; just move the path into the
++   * there is no content merge to do; just move the file into the
 +   * desired final location.
 +   */
const struct diff_filespec *dest = pair->two;
 4: 4155613a6b !  4: dfd6ab22db merge-recursive: rename conflict_rename_*() 
family of functions
@@ -24,6 +24,7 @@
 from all the handle_rename_*() functions which are called from
 process_entry().
 
+Acked-by: Johannes Schindelin 
 Signed-off-by: Elijah Newren 
 
 diff --git a/merge-recursive.c b/merge-recursive.c
 5: 20b8963732 < --: -- merge-recursive: simplify handle_change_delete
--: -- >  5: 122629ef7a merge-recursive: add pointer about unduly 
complex looking code

-- 
2.17.0.847.g20b8963732



[PATCH v2 1/5] merge-recursive: fix miscellaneous grammar error in comment

2018-05-21 Thread Elijah Newren
Signed-off-by: Elijah Newren 
---
 merge-recursive.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/merge-recursive.c b/merge-recursive.c
index 35df695fa4..c430fd72bc 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -531,7 +531,7 @@ static void record_df_conflict_files(struct merge_options 
*o,
 struct string_list *entries)
 {
/* If there is a D/F conflict and the file for such a conflict
-* currently exist in the working tree, we want to allow it to be
+* currently exists in the working tree, we want to allow it to be
 * removed to make room for the corresponding directory if needed.
 * The files underneath the directories of such D/F conflicts will
 * be processed before the corresponding file involved in the D/F
-- 
2.17.0.847.g20b8963732



[PATCH v2 2/5] merge-recursive: fix numerous argument alignment issues

2018-05-21 Thread Elijah Newren
Various refactorings throughout the code have left lots of alignment
issues that were driving me crazy; fix them.

Acked-by: Johannes Schindelin 
Signed-off-by: Elijah Newren 
---
 merge-recursive.c | 47 ---
 1 file changed, 24 insertions(+), 23 deletions(-)

diff --git a/merge-recursive.c b/merge-recursive.c
index c430fd72bc..01306c87eb 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -308,8 +308,8 @@ static void output_commit_title(struct merge_options *o, 
struct commit *commit)
 }
 
 static int add_cacheinfo(struct merge_options *o,
-   unsigned int mode, const struct object_id *oid,
-   const char *path, int stage, int refresh, int options)
+unsigned int mode, const struct object_id *oid,
+const char *path, int stage, int refresh, int options)
 {
struct cache_entry *ce;
int ret;
@@ -409,8 +409,8 @@ struct tree *write_tree_from_memory(struct merge_options *o)
 }
 
 static int save_files_dirs(const struct object_id *oid,
-   struct strbuf *base, const char *path,
-   unsigned int mode, int stage, void *context)
+  struct strbuf *base, const char *path,
+  unsigned int mode, int stage, void *context)
 {
struct path_hashmap_entry *entry;
int baselen = base->len;
@@ -1257,13 +1257,13 @@ static int conflict_rename_dir(struct merge_options *o,
 }
 
 static int handle_change_delete(struct merge_options *o,
-const char *path, const char *old_path,
-const struct object_id *o_oid, int o_mode,
-const struct object_id *changed_oid,
-int changed_mode,
-const char *change_branch,
-const char *delete_branch,
-const char *change, const char *change_past)
+   const char *path, const char *old_path,
+   const struct object_id *o_oid, int o_mode,
+   const struct object_id *changed_oid,
+   int changed_mode,
+   const char *change_branch,
+   const char *delete_branch,
+   const char *change, const char *change_past)
 {
char *alt_path = NULL;
const char *update_path = path;
@@ -1324,9 +1324,9 @@ static int handle_change_delete(struct merge_options *o,
 }
 
 static int conflict_rename_delete(struct merge_options *o,
-  struct diff_filepair *pair,
-  const char *rename_branch,
-  const char *delete_branch)
+ struct diff_filepair *pair,
+ const char *rename_branch,
+ const char *delete_branch)
 {
const struct diff_filespec *orig = pair->one;
const struct diff_filespec *dest = pair->two;
@@ -1429,7 +1429,7 @@ static int handle_file(struct merge_options *o,
 }
 
 static int conflict_rename_rename_1to2(struct merge_options *o,
-   struct rename_conflict_info *ci)
+  struct rename_conflict_info *ci)
 {
/* One file was renamed in both branches, but to different names. */
struct diff_filespec *one = ci->pair1->one;
@@ -1491,7 +1491,7 @@ static int conflict_rename_rename_1to2(struct 
merge_options *o,
 }
 
 static int conflict_rename_rename_2to1(struct merge_options *o,
-   struct rename_conflict_info *ci)
+  struct rename_conflict_info *ci)
 {
/* Two files, a & b, were renamed to the same thing, c. */
struct diff_filespec *a = ci->pair1->one;
@@ -2710,7 +2710,8 @@ static struct object_id *stage_oid(const struct object_id 
*oid, unsigned mode)
 }
 
 static int read_oid_strbuf(struct merge_options *o,
-   const struct object_id *oid, struct strbuf *dst)
+  const struct object_id *oid,
+  struct strbuf *dst)
 {
void *buf;
enum object_type type;
@@ -2763,10 +2764,10 @@ static int blob_unchanged(struct merge_options *opt,
 }
 
 static int handle_modify_delete(struct merge_options *o,
-const char *path,
-struct object_id *o_oid, int o_mode,
-struct object_id *a_oid, int a_mode,
-struct object_id *b_oid, int b_mode)
+   const char *path,
+   struct object_id *o_oid, int o_mode,
+   

[PATCH v2 3/5] merge-recursive: clarify the rename_dir/RENAME_DIR meaning

2018-05-21 Thread Elijah Newren
We had an enum of rename types which included RENAME_DIR; this name felt
misleading since it was not about an entire directory but was a status for
each individual file add that occurred within a renamed directory.

Since this type is for signifying that the files in question were being
renamed due to directory rename detection, rename this enum value to
RENAME_VIA_DIR.

Make a similar change to the conflict_rename_dir() function, and add a
comment to the top of that function explaining its purpose (it may not be
quite as obvious as for the other conflict_rename_*() functions).

Acked-by: Johannes Schindelin 
Signed-off-by: Elijah Newren 
---
 merge-recursive.c | 28 +---
 1 file changed, 17 insertions(+), 11 deletions(-)

diff --git a/merge-recursive.c b/merge-recursive.c
index 01306c87eb..e2b0db0645 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -180,7 +180,7 @@ static int oid_eq(const struct object_id *a, const struct 
object_id *b)
 
 enum rename_type {
RENAME_NORMAL = 0,
-   RENAME_DIR,
+   RENAME_VIA_DIR,
RENAME_DELETE,
RENAME_ONE_FILE_TO_ONE,
RENAME_ONE_FILE_TO_TWO,
@@ -1224,11 +1224,17 @@ static int merge_file_one(struct merge_options *o,
return merge_file_1(o, , , , path, branch1, branch2, mfi);
 }
 
-static int conflict_rename_dir(struct merge_options *o,
-  struct diff_filepair *pair,
-  const char *rename_branch,
-  const char *other_branch)
+static int conflict_rename_via_dir(struct merge_options *o,
+  struct diff_filepair *pair,
+  const char *rename_branch,
+  const char *other_branch)
 {
+   /*
+* Handle file adds that need to be renamed due to directory rename
+* detection.  This differs from handle_rename_normal, because
+* there is no content merge to do; just move the file into the
+* desired final location.
+*/
const struct diff_filespec *dest = pair->two;
 
if (!o->call_depth && would_lose_untracked(dest->path)) {
@@ -2498,7 +2504,7 @@ static int process_renames(struct merge_options *o,
 
if (oid_eq(_other.oid, _oid) &&
ren1->add_turned_into_rename) {
-   setup_rename_conflict_info(RENAME_DIR,
+   setup_rename_conflict_info(RENAME_VIA_DIR,
   ren1->pair,
   NULL,
   branch1,
@@ -2944,12 +2950,12 @@ static int process_entry(struct merge_options *o,
 b_oid, b_mode,
 conflict_info);
break;
-   case RENAME_DIR:
+   case RENAME_VIA_DIR:
clean_merge = 1;
-   if (conflict_rename_dir(o,
-   conflict_info->pair1,
-   conflict_info->branch1,
-   conflict_info->branch2))
+   if (conflict_rename_via_dir(o,
+   conflict_info->pair1,
+   conflict_info->branch1,
+   conflict_info->branch2))
clean_merge = -1;
break;
case RENAME_DELETE:
-- 
2.17.0.847.g20b8963732



Re: symbol string_list_appendf() unused

2018-05-21 Thread Ramsay Jones


On 22/05/18 00:59, Junio C Hamano wrote:
> Ramsay Jones  writes:
> 
>> I strongly suspect that I haven't followed the discussion on
>> the list closely enough, but your 'ma/unpack-trees-free-msgs'
>> branch in 'pu', seems to define string_list_appendf() but then
>> never call it. This is despite commit 40ebd6c7b0 ("string-list:
>> provide `string_list_appendf()`", 2018-05-20) claiming that:
>> 'The next commit will add a user'. ;-)
>>
>> Have I missed something?
> 
> Yes, I pushed out a tentative "how about doing it this way" update
> that goes alongside his version, making some solutions (including
> the function you found) he had obsoleted, yet without removing
> them.  That is what you saw on 'pu'.
> 
> There is a reroll by Martin that ties all the loose ends.

Ah, OK, sorry for the noise.

Thanks!

ATB,
Ramsay Jones



Re: [PATCH 1/2] remote-curl: accept all encoding supported by curl

2018-05-21 Thread Stefan Beller
On Mon, May 21, 2018 at 4:40 PM, Brandon Williams  wrote:
> Configure curl to accept all encoding which curl supports instead of
> only accepting gzip responses.

This partially reverts aa90b9697f9 (Enable info/refs gzip decompression
in HTTP client, 2012-09-19), as that specifically called out deflate not being
a good option. Is that worth mentioning in the commit message?

> This is necessary to fix a bug when using an installation of curl which
> doesn't support gzip.  Since curl doesn't do any checking to verify that
> it supports the encoding set when calling 'curl_easy_setopt()', curl can
> end up sending an "Accept-Encoding" header indicating that it supports
> a particular encoding when in fact it doesn't.  Instead when the empty
> string "" is used when setting `CURLOPT_ENCODING`, curl will send an
> "Accept-Encoding" header containing only the encoding methods curl
> supports.
>
> Signed-off-by: Brandon Williams 
> ---
>  http.c  | 2 +-
>  remote-curl.c   | 2 +-
>  t/t5551-http-fetch-smart.sh | 4 ++--
>  3 files changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/http.c b/http.c
> index fed13b216..709150fc7 100644
> --- a/http.c
> +++ b/http.c
> @@ -1788,7 +1788,7 @@ static int http_request(const char *url,
>
> curl_easy_setopt(slot->curl, CURLOPT_URL, url);
> curl_easy_setopt(slot->curl, CURLOPT_HTTPHEADER, headers);
> -   curl_easy_setopt(slot->curl, CURLOPT_ENCODING, "gzip");
> +   curl_easy_setopt(slot->curl, CURLOPT_ENCODING, "");
>
> ret = run_one_slot(slot, );
>
> diff --git a/remote-curl.c b/remote-curl.c
> index ceb05347b..565bba104 100644
> --- a/remote-curl.c
> +++ b/remote-curl.c
> @@ -684,7 +684,7 @@ static int post_rpc(struct rpc_state *rpc)
> curl_easy_setopt(slot->curl, CURLOPT_NOBODY, 0);
> curl_easy_setopt(slot->curl, CURLOPT_POST, 1);
> curl_easy_setopt(slot->curl, CURLOPT_URL, rpc->service_url);
> -   curl_easy_setopt(slot->curl, CURLOPT_ENCODING, "gzip");
> +   curl_easy_setopt(slot->curl, CURLOPT_ENCODING, "");

Looking at the code here, this succeeds if enough memory is available.
There is no check if the given parameter is part of
Curl_all_content_encodings();
https://github.com/curl/curl/blob/e66cca046cef20d00fba89260dfa6b4a3997233d/lib/setopt.c#L429
https://github.com/curl/curl/blob/c675c40295045d4988eeb6291c54eb48f138822f/lib/content_encoding.c#L686

which may be worth checking first?

>
> if (large_request) {
> /* The request body is large and the size cannot be predicted.
> diff --git a/t/t5551-http-fetch-smart.sh b/t/t5551-http-fetch-smart.sh
> index f5721b4a5..39c65482c 100755
> --- a/t/t5551-http-fetch-smart.sh
> +++ b/t/t5551-http-fetch-smart.sh
> @@ -26,14 +26,14 @@ setup_askpass_helper
>  cat >exp <  > GET /smart/repo.git/info/refs?service=git-upload-pack HTTP/1.1
>  > Accept: */*
> -> Accept-Encoding: gzip
> +> Accept-Encoding: deflate, gzip
>  > Pragma: no-cache
>  < HTTP/1.1 200 OK
>  < Pragma: no-cache
>  < Cache-Control: no-cache, max-age=0, must-revalidate
>  < Content-Type: application/x-git-upload-pack-advertisement
>  > POST /smart/repo.git/git-upload-pack HTTP/1.1
> -> Accept-Encoding: gzip
> +> Accept-Encoding: deflate, gzip
>  > Content-Type: application/x-git-upload-pack-request
>  > Accept: application/x-git-upload-pack-result
>  > Content-Length: xxx
> --
> 2.17.0.441.gb46fe60e1d-goog
>


Re: [PATCH 1/2] remote-curl: accept all encoding supported by curl

2018-05-21 Thread Jonathan Nieder
Hi,

Brandon Williams wrote:

> Subject: remote-curl: accept all encoding supported by curl

nit: s/encoding/encodings

> Configure curl to accept all encoding which curl supports instead of
> only accepting gzip responses.

Likewise.

> This is necessary to fix a bug when using an installation of curl which
> doesn't support gzip.  Since curl doesn't do any checking to verify that
> it supports the encoding set when calling 'curl_easy_setopt()', curl can
> end up sending an "Accept-Encoding" header indicating that it supports
> a particular encoding when in fact it doesn't.  Instead when the empty
> string "" is used when setting `CURLOPT_ENCODING`, curl will send an
> "Accept-Encoding" header containing only the encoding methods curl
> supports.
>
> Signed-off-by: Brandon Williams 

Thanks for the analysis and fix.

Reported-by: Anton Golubev 

Also ccing the reporter so we can hopefully get a tested-by.  Anton,
can you test this patch and let us know how it goes?  You can apply it
as follows:

  curl \
https://public-inbox.org/git/20180521234004.142548-1-bmw...@google.com/raw \
>patch.txt
  git am -3 patch.txt

Brandon, can the commit message also say a little more about the
motivating context and symptoms?

  $ curl --version
  curl 7.52.1 (arm-openwrt-linux-gnu) libcurl/7.52.1 mbedTLS/2.6.0
  Protocols: file ftp ftps http https
  Features: IPv6 Largefile SSL

The issue is that when curl is built without the "zlib" feature, since
v1.8.0-rc0~14^2 (Enable info/refs gzip decompression in HTTP client,
2012-09-19) we end up requesting "gzip" encoding anyway despite
libcurl not being able to decode it.  Worse, instead of getting a
clear error message indicating so, we end up falling back to "dumb"
http, producing a confusing and difficult to debug result.

> ---
>  http.c  | 2 +-
>  remote-curl.c   | 2 +-
>  t/t5551-http-fetch-smart.sh | 4 ++--
>  3 files changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/http.c b/http.c
> index fed13b216..709150fc7 100644
> --- a/http.c
> +++ b/http.c
> @@ -1788,7 +1788,7 @@ static int http_request(const char *url,
>  
>   curl_easy_setopt(slot->curl, CURLOPT_URL, url);
>   curl_easy_setopt(slot->curl, CURLOPT_HTTPHEADER, headers);
> - curl_easy_setopt(slot->curl, CURLOPT_ENCODING, "gzip");
> + curl_easy_setopt(slot->curl, CURLOPT_ENCODING, "");
>  
>   ret = run_one_slot(slot, );
>  
> diff --git a/remote-curl.c b/remote-curl.c
> index ceb05347b..565bba104 100644
> --- a/remote-curl.c
> +++ b/remote-curl.c
> @@ -684,7 +684,7 @@ static int post_rpc(struct rpc_state *rpc)
>   curl_easy_setopt(slot->curl, CURLOPT_NOBODY, 0);
>   curl_easy_setopt(slot->curl, CURLOPT_POST, 1);
>   curl_easy_setopt(slot->curl, CURLOPT_URL, rpc->service_url);
> - curl_easy_setopt(slot->curl, CURLOPT_ENCODING, "gzip");
> + curl_easy_setopt(slot->curl, CURLOPT_ENCODING, "");
>  
>   if (large_request) {
>   /* The request body is large and the size cannot be predicted.

Makes sense.

> diff --git a/t/t5551-http-fetch-smart.sh b/t/t5551-http-fetch-smart.sh
> index f5721b4a5..39c65482c 100755
> --- a/t/t5551-http-fetch-smart.sh
> +++ b/t/t5551-http-fetch-smart.sh
> @@ -26,14 +26,14 @@ setup_askpass_helper
>  cat >exp <  > GET /smart/repo.git/info/refs?service=git-upload-pack HTTP/1.1
>  > Accept: */*
> -> Accept-Encoding: gzip
> +> Accept-Encoding: deflate, gzip
>  > Pragma: no-cache
>  < HTTP/1.1 200 OK
>  < Pragma: no-cache
>  < Cache-Control: no-cache, max-age=0, must-revalidate
>  < Content-Type: application/x-git-upload-pack-advertisement
>  > POST /smart/repo.git/git-upload-pack HTTP/1.1
> -> Accept-Encoding: gzip
> +> Accept-Encoding: deflate, gzip
>  > Content-Type: application/x-git-upload-pack-request
>  > Accept: application/x-git-upload-pack-result
>  > Content-Length: xxx

If libcurl gains support for another encoding in the future, this test
would start failing.  Can we make the matching less strict?  For
example, how about something like the following for squashing in?

Thanks,
Jonathan

diff --git i/t/t5551-http-fetch-smart.sh w/t/t5551-http-fetch-smart.sh
index 39c65482ce..913089b144 100755
--- i/t/t5551-http-fetch-smart.sh
+++ w/t/t5551-http-fetch-smart.sh
@@ -26,14 +26,14 @@ setup_askpass_helper
 cat >exp < GET /smart/repo.git/info/refs?service=git-upload-pack HTTP/1.1
 > Accept: */*
-> Accept-Encoding: deflate, gzip
+> Accept-Encoding: ENCODINGS
 > Pragma: no-cache
 < HTTP/1.1 200 OK
 < Pragma: no-cache
 < Cache-Control: no-cache, max-age=0, must-revalidate
 < Content-Type: application/x-git-upload-pack-advertisement
 > POST /smart/repo.git/git-upload-pack HTTP/1.1
-> Accept-Encoding: deflate, gzip
+> Accept-Encoding: ENCODINGS
 > Content-Type: application/x-git-upload-pack-request
 > Accept: application/x-git-upload-pack-result
 > Content-Length: xxx
@@ -79,8 +79,13 @@ test_expect_success 'clone http repository' '
/^< 

Re: symbol string_list_appendf() unused

2018-05-21 Thread Junio C Hamano
Ramsay Jones  writes:

> I strongly suspect that I haven't followed the discussion on
> the list closely enough, but your 'ma/unpack-trees-free-msgs'
> branch in 'pu', seems to define string_list_appendf() but then
> never call it. This is despite commit 40ebd6c7b0 ("string-list:
> provide `string_list_appendf()`", 2018-05-20) claiming that:
> 'The next commit will add a user'. ;-)
>
> Have I missed something?

Yes, I pushed out a tentative "how about doing it this way" update
that goes alongside his version, making some solutions (including
the function you found) he had obsoleted, yet without removing
them.  That is what you saw on 'pu'.

There is a reroll by Martin that ties all the loose ends.

Thanks for paying great attention to the details. Always
appreciated.



symbol string_list_appendf() unused

2018-05-21 Thread Ramsay Jones
Hi Martin,

I strongly suspect that I haven't followed the discussion on
the list closely enough, but your 'ma/unpack-trees-free-msgs'
branch in 'pu', seems to define string_list_appendf() but then
never call it. This is despite commit 40ebd6c7b0 ("string-list:
provide `string_list_appendf()`", 2018-05-20) claiming that:
'The next commit will add a user'. ;-)

Have I missed something?

ATB,
Ramsay Jones



[PATCH 2/2] remote-curl: accept compressed responses with protocol v2

2018-05-21 Thread Brandon Williams
Configure curl to accept compressed responses when using protocol v2 by
setting `CURLOPT_ENCODING` to "", which indicates that curl should send
an "Accept-Encoding" header with all supported compression encodings.

Signed-off-by: Brandon Williams 
---
 remote-curl.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/remote-curl.c b/remote-curl.c
index 565bba104..99b0bedc6 100644
--- a/remote-curl.c
+++ b/remote-curl.c
@@ -1259,6 +1259,7 @@ static int proxy_request(struct proxy_state *p)
 
slot = get_active_slot();
 
+   curl_easy_setopt(slot->curl, CURLOPT_ENCODING, "");
curl_easy_setopt(slot->curl, CURLOPT_NOBODY, 0);
curl_easy_setopt(slot->curl, CURLOPT_POST, 1);
curl_easy_setopt(slot->curl, CURLOPT_URL, p->service_url);
-- 
2.17.0.441.gb46fe60e1d-goog



[PATCH 1/2] remote-curl: accept all encoding supported by curl

2018-05-21 Thread Brandon Williams
Configure curl to accept all encoding which curl supports instead of
only accepting gzip responses.

This is necessary to fix a bug when using an installation of curl which
doesn't support gzip.  Since curl doesn't do any checking to verify that
it supports the encoding set when calling 'curl_easy_setopt()', curl can
end up sending an "Accept-Encoding" header indicating that it supports
a particular encoding when in fact it doesn't.  Instead when the empty
string "" is used when setting `CURLOPT_ENCODING`, curl will send an
"Accept-Encoding" header containing only the encoding methods curl
supports.

Signed-off-by: Brandon Williams 
---
 http.c  | 2 +-
 remote-curl.c   | 2 +-
 t/t5551-http-fetch-smart.sh | 4 ++--
 3 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/http.c b/http.c
index fed13b216..709150fc7 100644
--- a/http.c
+++ b/http.c
@@ -1788,7 +1788,7 @@ static int http_request(const char *url,
 
curl_easy_setopt(slot->curl, CURLOPT_URL, url);
curl_easy_setopt(slot->curl, CURLOPT_HTTPHEADER, headers);
-   curl_easy_setopt(slot->curl, CURLOPT_ENCODING, "gzip");
+   curl_easy_setopt(slot->curl, CURLOPT_ENCODING, "");
 
ret = run_one_slot(slot, );
 
diff --git a/remote-curl.c b/remote-curl.c
index ceb05347b..565bba104 100644
--- a/remote-curl.c
+++ b/remote-curl.c
@@ -684,7 +684,7 @@ static int post_rpc(struct rpc_state *rpc)
curl_easy_setopt(slot->curl, CURLOPT_NOBODY, 0);
curl_easy_setopt(slot->curl, CURLOPT_POST, 1);
curl_easy_setopt(slot->curl, CURLOPT_URL, rpc->service_url);
-   curl_easy_setopt(slot->curl, CURLOPT_ENCODING, "gzip");
+   curl_easy_setopt(slot->curl, CURLOPT_ENCODING, "");
 
if (large_request) {
/* The request body is large and the size cannot be predicted.
diff --git a/t/t5551-http-fetch-smart.sh b/t/t5551-http-fetch-smart.sh
index f5721b4a5..39c65482c 100755
--- a/t/t5551-http-fetch-smart.sh
+++ b/t/t5551-http-fetch-smart.sh
@@ -26,14 +26,14 @@ setup_askpass_helper
 cat >exp < GET /smart/repo.git/info/refs?service=git-upload-pack HTTP/1.1
 > Accept: */*
-> Accept-Encoding: gzip
+> Accept-Encoding: deflate, gzip
 > Pragma: no-cache
 < HTTP/1.1 200 OK
 < Pragma: no-cache
 < Cache-Control: no-cache, max-age=0, must-revalidate
 < Content-Type: application/x-git-upload-pack-advertisement
 > POST /smart/repo.git/git-upload-pack HTTP/1.1
-> Accept-Encoding: gzip
+> Accept-Encoding: deflate, gzip
 > Content-Type: application/x-git-upload-pack-request
 > Accept: application/x-git-upload-pack-result
 > Content-Length: xxx
-- 
2.17.0.441.gb46fe60e1d-goog



Re: [PATCH v4 00/28] Hash-independent tests

2018-05-21 Thread brian m. carlson
On Mon, May 21, 2018 at 09:17:39AM -0400, Ben Peart wrote:
> Do you plan to update those tests that hard code the SHA of the index file
> itself?  For example, t1700-split-index.sh has hard coded values for the SHA
> and currently only supports different hard coded values for V4 indexes vs
> others.

I have some additional test changes that use the SHA1 prerequisite,
plus some that fix issues due to the use of a different hash algorithm
or a longer hash algorithm.

Ultimately, I plan to discard the SHA1 prerequisite and update those
tests to use a translation table to look up the correct values depending
on the hash algorithm in use.  However, we haven't decided on the actual
algorithm that NewHash will be yet, so I can't write those translation
tables at this point.

I plan to start a discussion about the choice of algorithm soon, since
we'll soon need to make a decision in order to progress, but I need to
do some research and testing in order to have sufficient facts to
present to the list.
-- 
brian m. carlson: Houston, Texas, US
OpenPGP: https://keybase.io/bk2204


signature.asc
Description: PGP signature


Re: [RFC PATCH] fetch-pack: space out sent "haves" in negotiation

2018-05-21 Thread Stefan Beller
Hi Jonathan,

On Mon, May 21, 2018 at 1:43 PM, Jonathan Tan  wrote:
> I was thinking about fetch negotiation in some non-ideal situations
> (specifically, when the client repo contains two or more independent
> branches that meet only somewhere far in the past) and thought about
> skipping over intermediate commits, using exponentially larger skips as
> we proceed, when generating "have" lines. This is in the hope of
> reducing the bandwidth and roundtrips needed when fetching, and does not
> require a modification to the server.

In an ideal world, the server and client would both estimate the potential
reduction of the packfile to send, and base the decision if to continue
negotiating on the trade off if the packfile reduction savings are greater
than the cost of negotiation (in terms of bandwidth or time).
(e.g. the server could keep track of the "potential largest packfile to
sent" as well as the "potential smallest packfile to sent" given the
state of negotiation. And as soon as the difference between those
two packs is smaller than the size of one round of negotiation,
it is better to stop and just sent the large file).

You state that you do not want to change the server side, and stick to
the current protocol, which makes this ideal world scenario moot, but
shifts the problem to "picking haves more intelligently".

> I'm not sure if this is the best way,

I think it is the best for a short term gain, as the picking algorithm is
not part of the protocol, so it can be easily extended/reverted/improved
as we go. So I would continue this way.

> (1) The implementation that I have
>
> This patch contains some drop-in code that passes all existing tests,
> but the new negotiation algorithm is not tested.
>
> To mitigate the effect of skipping, I included functionality wherein
> the client will retry the commits in a skip if the server ACKs the
> destination of the skip, but this is currently imperfect - in
> particular, the server might end the negotiation early, and the commits
> retried in my current implementation are a superset due to the fact that
> I didn't store the commits in the skip.

So we start with exponential hops, fall back to linear probing and then
"make off by one errors" in the linear probes?

> (2) Possible future work for my implementation
>
> Since each sent commit maintains pointers to sent descendants and sent
> ancestors (strictly speaking, only the "close" ones - to find all of
> them, you need the transitive closure), this can be used for some sort
> of error correction when, during a stateless RPC negotiation, the server
> (which may be a group of eventually consistent servers behind a load
> balancer) reports that it no longer has a commit that it said it had.
> For example, we could in this case mark that commit as "they_have=NO"
> and for all its closest ancestors, set it to "they_have=YES" unless they
> in turn have a descendant with "they_have=YES" or
> "they_have=HAVE_DESCENDANT".
>
> (3) Other ways of improving negotiation
>
> If we're prepared to commit-walk a significant part of the entire local
> repo (as we are, in the situation I described in the first paragraph),

> and if we have access to corresponding remote-tracking information,

This is a dangerous assumption, as not everyone is having a 1:1 relationship
with their remote server (for e.g. code review), but there are these triangle
workflows in the kernel community for example, where you push in one
remote direction and (re-)obtain the history merged into the bigger picture
from another remote. And these two remotes are not special to each other
on the client side.

>  fetch-negotiator.c | 309 +
>  fetch-negotiator.h |  40 ++

This patch is moving the algorithm driving the selection of new
commits to pick to
a new file, but there is no new algorithm, yet?
As hinted at from (1), this is smarter than what we did before by
picking commits
non-linearly but with some sort of exponential back off, how does it end the
exponential phase?

The way forward out of RFC state, might be to separate the introduction of a new
improved algorithm and the refactoring. So first move code literally into the
fetch-negotiator file, and then add improvements in there, or is it
just not worth
the refactoring and directly put in the new algorithm?

Another use case we discussed was "open-ended bisection", where you know
you are in a bad state and "once upon a time it worked", and now you are tasked
to find the offending commit. To find such a commit, you probably
would also start
with such an exponential back off until you run into a "good frontier"
of commits
and then use conventional bisect to narrow down the exact commit.


> +enum they_have {
> +   /*
> +* We do not know if the server has this commit, or we know that the
> +* server does not have this commit.
> +*/
> +   NO,
> +
> +   /*
> +* The server has this 

Re: commit-graph: change in "best" merge-base when ambiguous

2018-05-21 Thread Stefan Beller
On Mon, May 21, 2018 at 2:50 PM, Jeff King  wrote:
> On Mon, May 21, 2018 at 11:33:11AM -0700, Elijah Newren wrote:
>
>> > In t6024-recursive-merge.sh, we have the following commit structure:
>> >
>> > # 1 - A - D - F
>> > #   \   X   /
>> > # B   X
>> > #   X   \
>> > # 2 - C - E - G
>> >
>> > When merging F to G, there are two "best" merge-bases, A and C. With
>> > core.commitGraph=false, 'git merge-base F G' returns A, while it returns C
>> > when core.commitGraph=true. This is due to the new walk order when using
>> > generation numbers, although I have not dug deep into the code to point out
>> > exactly where the choice between A and C is made. Likely it's just whatever
>> > order they are inserted into a list.
>>
>> Ooh, interesting.
>>
>> Just a guess, but could it be related to relative ordering of
>> committer timestamps?  Ordering of committer timestamps apparently
>> affects order of merge-bases returned to merge-recursive, and although
>> that shouldn't have mattered, a few bugs meant that it did and the
>> order ended up determining what contents a successful merge would
>> have.  See this recent post:
>>
>> https://public-inbox.org/git/CABPp-BFc1OLYKzS5rauOehvEugPc0oGMJp-NMEAmVMW7QR=4...@mail.gmail.com/
>>
>> The fact that the merge was successful for both orderings of merge
>> bases was the real bug, though; it should have detected and reported a
>> conflict both ways.
>
> Traditionally we've inserted commits into the walk queue in commit-date
> ordering, but with identical dates it may depend on the order in which
> you reach the commits. Many of the tests are particularly bad for
> showing this off because they do not use test_tick, and so you end up
> with a bunch of commits with identical timestamps.
>
> If we're just using generation numbers for queue ordering, we're even
> more likely to hit these cases, since they're expected to increase along
> parallel branches at roughly the same rate. It's probably a good idea to
> have some tie-breakers to make things more deterministic (walk order
> shouldn't matter, but it can be confusing if we sometimes use one order
> and sometimes the other).
>
> Even ordering by {generation, timestamp} isn't quite enough, since you
> could still tie there. Perhaps {generation, timestamp, hash} would be a
> sensible ordering?

The hash sounds reasonable as the definite tie breaker.

git merge-base is documented as "Find as good common ancestors
as possible for a merge", so in case we do not require the tie
breaking to be cheap, we could go by "smallest diff output"
of the two diffs against the potential merge commit.

Though I don't think this is really optimal for performance reasons.


Re: commit-graph: change in "best" merge-base when ambiguous

2018-05-21 Thread Jacob Keller
On Mon, May 21, 2018 at 2:54 PM, Jeff King  wrote:
> Yes, I think this is clearly a case where all of the single merge-bases
> we could show are equally good. And I don't think we should promise to
> show a particular one, but I _do_ think it's friendly for us to have
> deterministic tie-breakers (we certainly don't now).
>
> -Peff

Right. I think we should probably have some mechanism that will allow
us to always give the same answer, even if it's somewhat arbitrary.

Regards,
Jake


Re: [PATCHv4 1/1] git-p4: add unshelve command

2018-05-21 Thread SZEDER Gábor
On Tue, May 22, 2018 at 12:09 AM, Luke Diamand  wrote:
> On 21 May 2018 at 22:39, SZEDER Gábor  wrote:
>>> diff --git a/t/t9832-unshelve.sh b/t/t9832-unshelve.sh
>
> ...
> ...
>
>>> +'
>>
>> This test fails on my box and on Travis CI (in all four standard Linux
>> and OSX build jobs) with:
>>

>> File "/home/szeder/src/git/git-p4", line 2467, in cmp_shelved
>>   return ret["status"] == "identical"
>>   KeyError: 'status'
>>   error: last command exited with $?=1
>>   not ok 4 - create shelved changelist
>
> It works fine for me - but given where it's failing, my first
> suspicion would be p4 client version (or server) differences.
>
> I'm using 2015.1 for server and client. Could you check which version
> you are using?

Well, sending a bug report and not including the version used...
totally rookie move.  Sorry :)

I downloaded the same version used by the Travis CI Linux build jobs,
which is:

  Rev. P4/LINUX26X86_64/2016.2/1598668 (2017/12/08).
  Rev. P4D/LINUX26X86_64/2016.2/1648692 (2018/04/16).

The OSX build jobs used whatever current version they got via
homebrew:

  Rev. P4/DARWIN90X86_64/2017.2/1611521 (2018/01/21).
  Rev. P4D/DARWIN90X86_64/2017.2/1650199 (2018/04/18).


Proposal

2018-05-21 Thread Miss Zeliha Omer Faruk



Hello

Greetings to you please i have a business proposal for you contact me
for more detailes asap thanks.

Best Regards,
Miss.Zeliha ömer faruk
Esentepe Mahallesi Büyükdere
Caddesi Kristal Kule Binasi
No:215
Sisli - Istanbul, Turkey



Re: [PATCHv4 1/1] git-p4: add unshelve command

2018-05-21 Thread Luke Diamand
On 21 May 2018 at 23:09, Luke Diamand  wrote:
> On 21 May 2018 at 22:39, SZEDER Gábor  wrote:
>>> diff --git a/t/t9832-unshelve.sh b/t/t9832-unshelve.sh
>
> ...
> ...
>
>>> +'
>>
>> This test fails on my box and on Travis CI (in all four standard Linux
>> and OSX build jobs) with:
>>
>>   + cd /home/szeder/src/git/t/trash directory.t9832-unshelve/cli
>>   + p4 edit file1
>>   //depot/file1#1 - opened for edit
>>   + echo a change
>>   + echo new file
>>   + p4 add file2
>>   //depot/file2#1 - opened for add
>>   + p4 delete file_to_delete
>>   //depot/file_to_delete#1 - opened for delete
>>   + p4 opened
>>   //depot/file1#1 - edit default change (text)
>>   //depot/file2#1 - add default change (text)
>>   //depot/file_to_delete#1 - delete default change (text)
>>   + p4 shelve -i
>>   Change 3 created with 3 open file(s).
>>   Shelving files for change 3.
>>   edit //depot/file1#1
>>   add //depot/file2#1
>>   delete //depot/file_to_delete#1
>>   Change 3 files shelved.
>>   + cd /home/szeder/src/git/t/trash directory.t9832-unshelve/git
>>   + last_shelved_change
>>   + p4 changes -s shelved -m1
>>   + cut -d   -f 2
>>   + change=3
>>   + git p4 unshelve 3
>>   Traceback (most recent call last):
>> File "/home/szeder/src/git/git-p4", line 3975, in 
>>   main()
>> File "/home/szeder/src/git/git-p4", line 3969, in main
>>   if not cmd.run(args):
>> File "/home/szeder/src/git/git-p4", line 3851, in run
>>   sync.importChanges(changes, shelved=True,
>>   origin_revision=origin_revision)
>> File "/home/szeder/src/git/git-p4", line 3296, in importChanges
>>   files = self.extractFilesFromCommit(description, shelved, change,
>>   origin_revision)
>> File "/home/szeder/src/git/git-p4", line 2496, in
>>   extractFilesFromCommit
>>   not self.cmp_shelved(path, file["rev"], origin_revision):
>> File "/home/szeder/src/git/git-p4", line 2467, in cmp_shelved
>>   return ret["status"] == "identical"
>>   KeyError: 'status'
>>   error: last command exited with $?=1
>>   not ok 4 - create shelved changelist
>
> It works fine for me - but given where it's failing, my first
> suspicion would be p4 client version (or server) differences.
>
> I'm using 2015.1 for server and client. Could you check which version
> you are using?

In fact, no need. It works fine with 2015.1 but 2017.1 fails with this
error. Sigh. I'll reroll.

Luke


>
> Thanks,
> Luke


Re: [PATCHv4 1/1] git-p4: add unshelve command

2018-05-21 Thread Luke Diamand
On 21 May 2018 at 22:39, SZEDER Gábor  wrote:
>> diff --git a/t/t9832-unshelve.sh b/t/t9832-unshelve.sh

...
...

>> +'
>
> This test fails on my box and on Travis CI (in all four standard Linux
> and OSX build jobs) with:
>
>   + cd /home/szeder/src/git/t/trash directory.t9832-unshelve/cli
>   + p4 edit file1
>   //depot/file1#1 - opened for edit
>   + echo a change
>   + echo new file
>   + p4 add file2
>   //depot/file2#1 - opened for add
>   + p4 delete file_to_delete
>   //depot/file_to_delete#1 - opened for delete
>   + p4 opened
>   //depot/file1#1 - edit default change (text)
>   //depot/file2#1 - add default change (text)
>   //depot/file_to_delete#1 - delete default change (text)
>   + p4 shelve -i
>   Change 3 created with 3 open file(s).
>   Shelving files for change 3.
>   edit //depot/file1#1
>   add //depot/file2#1
>   delete //depot/file_to_delete#1
>   Change 3 files shelved.
>   + cd /home/szeder/src/git/t/trash directory.t9832-unshelve/git
>   + last_shelved_change
>   + p4 changes -s shelved -m1
>   + cut -d   -f 2
>   + change=3
>   + git p4 unshelve 3
>   Traceback (most recent call last):
> File "/home/szeder/src/git/git-p4", line 3975, in 
>   main()
> File "/home/szeder/src/git/git-p4", line 3969, in main
>   if not cmd.run(args):
> File "/home/szeder/src/git/git-p4", line 3851, in run
>   sync.importChanges(changes, shelved=True,
>   origin_revision=origin_revision)
> File "/home/szeder/src/git/git-p4", line 3296, in importChanges
>   files = self.extractFilesFromCommit(description, shelved, change,
>   origin_revision)
> File "/home/szeder/src/git/git-p4", line 2496, in
>   extractFilesFromCommit
>   not self.cmp_shelved(path, file["rev"], origin_revision):
> File "/home/szeder/src/git/git-p4", line 2467, in cmp_shelved
>   return ret["status"] == "identical"
>   KeyError: 'status'
>   error: last command exited with $?=1
>   not ok 4 - create shelved changelist

It works fine for me - but given where it's failing, my first
suspicion would be p4 client version (or server) differences.

I'm using 2015.1 for server and client. Could you check which version
you are using?

Thanks,
Luke


Re: commit-graph: change in "best" merge-base when ambiguous

2018-05-21 Thread Jeff King
On Mon, May 21, 2018 at 02:10:54PM -0400, Derrick Stolee wrote:

> In the Discussion section of the `git merge-base` docs [1], we have the
> following:
> 
>     When the history involves criss-cross merges, there can be more than one
> best common ancestor for two commits. For example, with this topology:
> 
>     ---1---o---A
>         \ /
>      X
>         / \
>     ---2---o---o---B
> 
>     both 1 and 2 are merge-bases of A and B. Neither one is better than the
> other (both are best merge bases). When the --all option is not given,    
> it is unspecified which best one is output.
> 
> This means our official documentation mentions that we do not have a
> concrete way to differentiate between these choices. This makes me think
> that this change in behavior is not a bug, but it _is_ a change in behavior.
> It's worth mentioning, but I don't think there is any value in making sure
> `git merge-base` returns the same output.
> 
> Does anyone disagree? Is this something we should solidify so we always have
> a "definitive" merge-base?

Heh, I should have read your whole original message before responding,
not just the part that Elijah quoted.

Yes, I think this is clearly a case where all of the single merge-bases
we could show are equally good. And I don't think we should promise to
show a particular one, but I _do_ think it's friendly for us to have
deterministic tie-breakers (we certainly don't now).

-Peff


Re: [PATCH v2 02/18] Add a new builtin: branch-diff

2018-05-21 Thread Jeff King
On Mon, May 21, 2018 at 02:40:57PM -0700, Brandon Williams wrote:

> > > > Most fellow German software engineers (who seem to have a knack for
> > > > idiotically long variable/function names) would now probably suggest:
> > > >
> > > > git compare-patch-series-with-revised-patch-series
> > > 
> > > or short:
> > > 
> > >   revision-compare
> > >   compare-revs
> > >   com-revs
> > > 
> > >   revised-diff
> > >   revise-diff
> > >   revised-compare
> > > 
> > >   diff-revise
> 
> I haven't really been following all of the discussion but from what I
> can tell the point of this command is to generate a diff based on two
> different versions of a series, so why not call it 'series-diff'? :)

That's OK with me, though I prefer "range" as I think we use that term
elsewhere ("series" is usually part of "patch series", but many people
do not use a workflow with that term).

-Peff


Re: commit-graph: change in "best" merge-base when ambiguous

2018-05-21 Thread Jeff King
On Mon, May 21, 2018 at 11:33:11AM -0700, Elijah Newren wrote:

> > In t6024-recursive-merge.sh, we have the following commit structure:
> >
> > # 1 - A - D - F
> > #   \   X   /
> > # B   X
> > #   X   \
> > # 2 - C - E - G
> >
> > When merging F to G, there are two "best" merge-bases, A and C. With
> > core.commitGraph=false, 'git merge-base F G' returns A, while it returns C
> > when core.commitGraph=true. This is due to the new walk order when using
> > generation numbers, although I have not dug deep into the code to point out
> > exactly where the choice between A and C is made. Likely it's just whatever
> > order they are inserted into a list.
>
> Ooh, interesting.
> 
> Just a guess, but could it be related to relative ordering of
> committer timestamps?  Ordering of committer timestamps apparently
> affects order of merge-bases returned to merge-recursive, and although
> that shouldn't have mattered, a few bugs meant that it did and the
> order ended up determining what contents a successful merge would
> have.  See this recent post:
> 
> https://public-inbox.org/git/CABPp-BFc1OLYKzS5rauOehvEugPc0oGMJp-NMEAmVMW7QR=4...@mail.gmail.com/
> 
> The fact that the merge was successful for both orderings of merge
> bases was the real bug, though; it should have detected and reported a
> conflict both ways.

Traditionally we've inserted commits into the walk queue in commit-date
ordering, but with identical dates it may depend on the order in which
you reach the commits. Many of the tests are particularly bad for
showing this off because they do not use test_tick, and so you end up
with a bunch of commits with identical timestamps.

If we're just using generation numbers for queue ordering, we're even
more likely to hit these cases, since they're expected to increase along
parallel branches at roughly the same rate. It's probably a good idea to
have some tie-breakers to make things more deterministic (walk order
shouldn't matter, but it can be confusing if we sometimes use one order
and sometimes the other).

Even ordering by {generation, timestamp} isn't quite enough, since you
could still tie there. Perhaps {generation, timestamp, hash} would be a
sensible ordering?

As for this specific case, even with the current code asking for `git
merge-base G F` will return the other answer. This is clearly a case
with multiple merge bases, and I'd expect "merge-base --all" to return
both (and actually it shows "B" as well, which makes sense). In the
non-all case, there is no "best", so we're free to show any.

-Peff


Re: [PATCH v2 02/18] Add a new builtin: branch-diff

2018-05-21 Thread Stefan Beller
On Mon, May 21, 2018 at 2:40 PM, Brandon Williams  wrote:
revised-compare
>> >
>> >   diff-revise
>
> I haven't really been following all of the discussion but from what I
> can tell the point of this command is to generate a diff based on two
> different versions of a series, so why not call it 'series-diff'? :)

Upon mentioning series-diff, I misheard Brandon and thought he proposed

   serious-diff

:-)
Stefan


Re: [PATCH v2 02/18] Add a new builtin: branch-diff

2018-05-21 Thread Brandon Williams
On 05/21, Jeff King wrote:
> On Mon, May 21, 2018 at 10:56:47AM -0700, Stefan Beller wrote:
> 
> > > It is very much about
> > > comparing two *ranges of* revisions, and not just any ranges, no. Those
> > > ranges need to be so related as to contain mostly identical changes.
> > 
> > range-diff, eh?
> > 
> > > Most fellow German software engineers (who seem to have a knack for
> > > idiotically long variable/function names) would now probably suggest:
> > >
> > > git compare-patch-series-with-revised-patch-series
> > 
> > or short:
> > 
> >   revision-compare
> >   compare-revs
> >   com-revs
> > 
> >   revised-diff
> >   revise-diff
> >   revised-compare
> > 
> >   diff-revise

I haven't really been following all of the discussion but from what I
can tell the point of this command is to generate a diff based on two
different versions of a series, so why not call it 'series-diff'? :)

-- 
Brandon Williams


Re: [PATCHv4 1/1] git-p4: add unshelve command

2018-05-21 Thread SZEDER Gábor
> diff --git a/t/t9832-unshelve.sh b/t/t9832-unshelve.sh
> new file mode 100755
> index 00..cca2dec536
> --- /dev/null
> +++ b/t/t9832-unshelve.sh
> @@ -0,0 +1,153 @@
> +#!/bin/sh
> +
> +last_shelved_change() {
> + p4 changes -s shelved -m1 | cut -d " " -f 2
> +}
> +
> +test_description='git p4 unshelve'
> +
> +. ./lib-git-p4.sh
> +
> +test_expect_success 'start p4d' '
> + start_p4d
> +'
> +
> +test_expect_success 'init depot' '
> + (
> + cd "$cli" &&
> + echo file1 >file1 &&
> + p4 add file1 &&
> + p4 submit -d "change 1"
> + : >file_to_delete &&
> + p4 add file_to_delete &&
> + p4 submit -d "file to delete"
> + )
> +'
> +
> +test_expect_success 'initial clone' '
> + git p4 clone --dest="$git" //depot/@all
> +'
> +
> +test_expect_success 'create shelved changelist' '
> + (
> + cd "$cli" &&
> + p4 edit file1 &&
> + echo "a change" >>file1 &&
> + echo "new file" >file2 &&
> + p4 add file2 &&
> + p4 delete file_to_delete &&
> + p4 opened &&
> + p4 shelve -i < +Change: new
> +Description:
> + Test commit
> +
> + Further description
> +Files:
> + //depot/file1
> + //depot/file2
> + //depot/file_to_delete
> +EOF
> +
> + ) &&
> + (
> + cd "$git" &&
> + change=$(last_shelved_change) &&
> + git p4 unshelve $change &&
> + git show refs/remotes/p4/unshelved/$change | grep -q "Further 
> description" &&
> + git cherry-pick refs/remotes/p4/unshelved/$change &&
> + test_path_is_file file2 &&
> + test_cmp file1 "$cli"/file1 &&
> + test_cmp file2 "$cli"/file2 &&
> + test_path_is_missing file_to_delete
> + )
> +'

This test fails on my box and on Travis CI (in all four standard Linux
and OSX build jobs) with:

  + cd /home/szeder/src/git/t/trash directory.t9832-unshelve/cli
  + p4 edit file1
  //depot/file1#1 - opened for edit
  + echo a change
  + echo new file
  + p4 add file2
  //depot/file2#1 - opened for add
  + p4 delete file_to_delete
  //depot/file_to_delete#1 - opened for delete
  + p4 opened
  //depot/file1#1 - edit default change (text)
  //depot/file2#1 - add default change (text)
  //depot/file_to_delete#1 - delete default change (text)
  + p4 shelve -i
  Change 3 created with 3 open file(s).
  Shelving files for change 3.
  edit //depot/file1#1
  add //depot/file2#1
  delete //depot/file_to_delete#1
  Change 3 files shelved.
  + cd /home/szeder/src/git/t/trash directory.t9832-unshelve/git
  + last_shelved_change
  + p4 changes -s shelved -m1
  + cut -d   -f 2
  + change=3
  + git p4 unshelve 3
  Traceback (most recent call last):
File "/home/szeder/src/git/git-p4", line 3975, in 
  main()
File "/home/szeder/src/git/git-p4", line 3969, in main
  if not cmd.run(args):
File "/home/szeder/src/git/git-p4", line 3851, in run
  sync.importChanges(changes, shelved=True,
  origin_revision=origin_revision)
File "/home/szeder/src/git/git-p4", line 3296, in importChanges
  files = self.extractFilesFromCommit(description, shelved, change,
  origin_revision)
File "/home/szeder/src/git/git-p4", line 2496, in
  extractFilesFromCommit
  not self.cmp_shelved(path, file["rev"], origin_revision):
File "/home/szeder/src/git/git-p4", line 2467, in cmp_shelved
  return ret["status"] == "identical"
  KeyError: 'status'
  error: last command exited with $?=1
  not ok 4 - create shelved changelist

The next test fails with the same backtrace, too, so I won't include
it.

> +
> +test_expect_success 'update shelved changelist and re-unshelve' '
> + test_when_finished cleanup_git &&
> + (
> + cd "$cli" &&
> + change=$(last_shelved_change) &&
> + echo "file3" >file3 &&
> + p4 add -c $change file3 &&
> + p4 shelve -i -r < +Change: $change
> +Description:
> + Test commit
> +
> + Further description
> +Files:
> + //depot/file1
> + //depot/file2
> + //depot/file3
> + //depot/file_to_delete
> +EOF
> + p4 describe $change
> + ) &&
> + (
> + cd "$git" &&
> + change=$(last_shelved_change) &&
> + git p4 unshelve $change &&
> + git diff refs/remotes/p4/unshelved/$change.0 
> refs/remotes/p4/unshelved/$change | grep -q file3
> + )
> +'


Re: [PATCH v5 0/4] unpack_trees_options: free messages when done

2018-05-21 Thread Jeff King
On Mon, May 21, 2018 at 04:54:24PM +0200, Martin Ågren wrote:

> That is, I've replaced the `string_list_appendf()`-patch with Junio's
> `argv_push*()`-patch, then squashed Junio's "redoing the 4/4"-patch into
> patch 4/4 -- with the exception of keeping the `memset(opts->msgs, ...)`
> which I suspect was mistakenly dropped.
> 
> Again, thanks for all the helpful comments and patches pointing me in
> the right direction.

I like it. Thanks for seeing it through. That was a lot of
back-and-forth for a small cleanup, but I hope we've established a
pattern that can be used elsewhere.

-Peff


Re: [PATCH 1/5] http: use strbufs instead of fixed buffers

2018-05-21 Thread Stefan Beller
On Mon, May 21, 2018 at 12:41 PM, Jeff King  wrote:
> On Mon, May 21, 2018 at 11:11:51AM -0700, Stefan Beller wrote:
>
>> Hi Jeff,
>>
>> On Fri, May 18, 2018 at 6:56 PM, Jeff King  wrote:
>>
>> > @@ -2421,4 +2426,5 @@ void release_http_object_request(struct 
>> > http_object_request *freq)
>> ..
>> > +   strbuf_release(>tmpfile);
>>
>> Do we need an equivalent in release_http_pack_request as well?
>
> Yes, but isn't there one?
>
> From the original patch:
>
> --- a/http.c
> +++ b/http.c
> @@ -2082,6 +2082,7 @@ void release_http_pack_request(struct http_pack_request 
> *preq)
> preq->packfile = NULL;
> }
> preq->slot = NULL;
> +   strbuf_release(>tmpfile);
> free(preq->url);
> free(preq);
>  }

which is at the very top, but I was not looking this far up,
but looked for functions order that my editor had.

Clearly I have to practice code reviews, still.

Sorry for the noise.

Stefan


[RFC PATCH] fetch-pack: space out sent "haves" in negotiation

2018-05-21 Thread Jonathan Tan
I was thinking about fetch negotiation in some non-ideal situations
(specifically, when the client repo contains two or more independent
branches that meet only somewhere far in the past) and thought about
skipping over intermediate commits, using exponentially larger skips as
we proceed, when generating "have" lines. This is in the hope of
reducing the bandwidth and roundtrips needed when fetching, and does not
require a modification to the server.

I'm not sure if this is the best way, however, so I'm wrapping up and
writing what I have now. I'll talk about (1) the implementation that I
have, (2) possible future work for my implementation, and (3) other,
possibly better, ways that negotiation could be improved instead.

(1) The implementation that I have

This patch contains some drop-in code that passes all existing tests,
but the new negotiation algorithm is not tested.

To mitigate the effect of skipping, I included functionality wherein
the client will retry the commits in a skip if the server ACKs the
destination of the skip, but this is currently imperfect - in
particular, the server might end the negotiation early, and the commits
retried in my current implementation are a superset due to the fact that
I didn't store the commits in the skip.

(2) Possible future work for my implementation

Since each sent commit maintains pointers to sent descendants and sent
ancestors (strictly speaking, only the "close" ones - to find all of
them, you need the transitive closure), this can be used for some sort
of error correction when, during a stateless RPC negotiation, the server
(which may be a group of eventually consistent servers behind a load
balancer) reports that it no longer has a commit that it said it had.
For example, we could in this case mark that commit as "they_have=NO"
and for all its closest ancestors, set it to "they_have=YES" unless they
in turn have a descendant with "they_have=YES" or
"they_have=HAVE_DESCENDANT".

(3) Other ways of improving negotiation

If we're prepared to commit-walk a significant part of the entire local
repo (as we are, in the situation I described in the first paragraph),
and if we have access to corresponding remote-tracking information,
one other way of improving negotiation might be to limit the "have"s we
sent to ancestors or descendants of the corresponding remote-tracking
tips.

This can be done simultaneously with the approach in this patch, but if
we were to evaluate only one first, the
ancestor-or-descendant-of-remote-tracking-tip approach might be the
better one to do first.

Signed-off-by: Jonathan Tan 
---
 Makefile   |   1 +
 fetch-negotiator.c | 309 +
 fetch-negotiator.h |  40 ++
 fetch-pack.c   | 174 ++---
 object.h   |   1 +
 5 files changed, 392 insertions(+), 133 deletions(-)
 create mode 100644 fetch-negotiator.c
 create mode 100644 fetch-negotiator.h

diff --git a/Makefile b/Makefile
index ad880d1fc5..8bbedfa521 100644
--- a/Makefile
+++ b/Makefile
@@ -859,6 +859,7 @@ LIB_OBJS += ewah/ewah_bitmap.o
 LIB_OBJS += ewah/ewah_io.o
 LIB_OBJS += ewah/ewah_rlw.o
 LIB_OBJS += exec-cmd.o
+LIB_OBJS += fetch-negotiator.o
 LIB_OBJS += fetch-object.o
 LIB_OBJS += fetch-pack.o
 LIB_OBJS += fsck.o
diff --git a/fetch-negotiator.c b/fetch-negotiator.c
new file mode 100644
index 00..58975e1c37
--- /dev/null
+++ b/fetch-negotiator.c
@@ -0,0 +1,309 @@
+#include "cache.h"
+#include "commit.h"
+#include "fetch-negotiator.h"
+
+#define NO_THE_INDEX_COMPATIBILITY_MACROS
+
+/* Remember to update object flag allocation in object.h */
+/*
+ * This commit entered the "candidates" priority queue (and may still be in
+ * it).
+ */
+#define SEEN (1u << 6)
+/*
+ * This commit was returned from fetch_negotiator_next().
+ */
+#define EMITTED (1u << 7)
+
+enum they_have {
+   /*
+* We do not know if the server has this commit, or we know that the
+* server does not have this commit.
+*/
+   NO,
+
+   /*
+* The server has this commit, and we do not know (or did not keep
+* track of) whether it has any of its descendants.
+*/
+   YES,
+
+   /*
+* The server has at least one of this commit's descendants, and that
+* descendant is marked with YES. When resending "have" lines, we do
+* not need to resend this commit, because doing so is redundant.
+*/
+   HAVE_DESCENDANT
+};
+
+struct sent_commit {
+   struct commit *commit;
+   enum they_have they_have;
+
+   /*
+* To obtain all sent ancestors of this commit, calculate the
+* transitive closure obtained by following these pointers.
+*/
+   struct sent_commit **sent_ancestors;
+   size_t sent_ancestor_nr, sent_ancestor_alloc;
+
+   /*
+* To obtain all sent descendants of this commit, calculate the
+* transitive closure obtained by following these 

Re: [PATCH v2 02/18] Add a new builtin: branch-diff

2018-05-21 Thread Jeff King
On Mon, May 21, 2018 at 10:56:47AM -0700, Stefan Beller wrote:

> > It is very much about
> > comparing two *ranges of* revisions, and not just any ranges, no. Those
> > ranges need to be so related as to contain mostly identical changes.
> 
> range-diff, eh?
> 
> > Most fellow German software engineers (who seem to have a knack for
> > idiotically long variable/function names) would now probably suggest:
> >
> > git compare-patch-series-with-revised-patch-series
> 
> or short:
> 
>   revision-compare
>   compare-revs
>   com-revs
> 
>   revised-diff
>   revise-diff
>   revised-compare
> 
>   diff-revise

I still like "range diff", but I think something around "revise" is a
good line of thought, too. Because it implies that we expect the two
ranges to be composed of almost-the-same commits.

That leads to another use case where I think focusing on topic branches
(or even branches at all) would be a misnomer. Imagine I cherry-pick a
bunch of commits with:

  git cherry-pick -10 $old_commit

I might then want to see how the result differs with something like:

  git range-diff $old_commit~10..$old_commit HEAD~10..HEAD

I wouldn't think of this as a topic-branch operation, but just as
comparing two sequences of commits. I guess "revise" isn't strictly
accurate here either, as I'm not revising. But I do assume the two
ranges share some kind of mapping of patches.

-Peff

PS I wish there were a nicer syntax to do that. Perhaps
   "git range-diff -10 $old_commit HEAD" could work, though occasionally
   the two ranges are not the same length (e.g., if you ended up
   skipping one of the cherry-picked commits). Anyway, those kind of
   niceties can easily come later on top. :)


Re: [PATCH 1/5] http: use strbufs instead of fixed buffers

2018-05-21 Thread Jeff King
On Mon, May 21, 2018 at 11:11:51AM -0700, Stefan Beller wrote:

> Hi Jeff,
> 
> On Fri, May 18, 2018 at 6:56 PM, Jeff King  wrote:
> 
> > @@ -2421,4 +2426,5 @@ void release_http_object_request(struct 
> > http_object_request *freq)
> ..
> > +   strbuf_release(>tmpfile);
> 
> Do we need an equivalent in release_http_pack_request as well?

Yes, but isn't there one?

>From the original patch:

--- a/http.c
+++ b/http.c
@@ -2082,6 +2082,7 @@ void release_http_pack_request(struct http_pack_request 
*preq)
preq->packfile = NULL;
}
preq->slot = NULL;
+   strbuf_release(>tmpfile);
free(preq->url);
free(preq);
 }

Or am I missing something?

-Peff


Re: [PATCH] t: make many tests depend less on the refs being files

2018-05-21 Thread Stefan Beller
On Sun, May 20, 2018 at 10:51 PM, Christian Couder
 wrote:
> From: David Turner 
>
> So that they work under alternate ref storage backends.

Sometimes I have a disconnect between the subject and the commit
message, (e.g. in an email reader the subject is not displayed accurately on
top of the message).

So I would prefer if the first part of the body message is an actual
sentence, and
not a continuum from the subject.

Maybe elaborate a bit more:

  The current tests are very focused on the file system representation of
  the loose and packed refs code.  As there are plans to implement other
  ref storage systems, migrate most tests to a form that test the intent of the
  refs storage system instead of it internals. The internals of the loose and
  packed refs are tested in , whereas the tests in this patch focus
  on testing other aspects.

>
> This will be really needed when such alternate ref storage backends are
> developed. But this could already help by making clear to readers that
> some tests do not depend on which ref backend is used.

Ah, this is what I picked up already in the suggested edit above. :/

> This patch just takes care of many low hanging fruits. It does not try
> to completely solves the issue.
>
> Signed-off-by: David Turner 
> Signed-off-by: Christian Couder 



> ---
>
> Thanks for all the great feedback regarding implementing reftable [1].
>
> Looking at David Turner's tests in [2], it seems that they could indeed
> be already valuable, so let's start by extracting most of the simple
> improvements they make.

Thanks for tackling refstables!

Stefan


Re: [PATCH 02/11] repository: introduce repo_read_index_or_die

2018-05-21 Thread Stefan Beller
Hi Brandon,

>> One of the reviewers of the series questioned the overall goal of the
>> series as we want to move away from _die() in top level code but this
>> series moves more towards it.
>
> I've heard every once in a while that we want to move toward this but I
> don't believe there is an actual effort along those lines just yet.  For
> that to be the case we would need a clearly defined error handling
> methodology (aside from the existing "die"ing behavior), which we don't
> currently have.

We have the example in the refs code, which I would want to
imitate. :)

/*
 * Return 0 if a reference named refname could be created without
 * conflicting with the name of an existing reference. Otherwise,
 * return a negative value and write an explanation to err. [...]
 */

int refs_verify_refname_available(struct ref_store *refs, ...
struct strbuf *err);

extern int refs_init_db(struct strbuf *err);

But it is true that there is no active effort currently being pushed.

Thanks,
Stefan


Re: [RFC/PATCH 1/7] rerere: unify error message when read_cache fails

2018-05-21 Thread Stefan Beller
On Sun, May 20, 2018 at 2:12 PM, Thomas Gummerer  wrote:
> We have multiple different variants of the error message we show to
> the user if 'read_cache' fails.  The "Could not read index" variant we
> are using in 'rerere.c' is currently not used anywhere in translated
> form.
>
> As a subsequent commit will mark all output that comes from 'rerere.c'
> for translation, make the life of the translators a little bit easier
> by using a string that is used elsewhere, and marked for translation
> there, and thus most likely already translated.
>
> "index file corrupt" seems to be the most common error message we show
> when 'read_cache' fails, so use that here as well.
>
> Signed-off-by: Thomas Gummerer 
> ---
>
> "index file corrupt" is also what Stefan chose for his series unifying
> these error messages (and 'die'ing, which I'm not sure is the right
> thing to do here as also mentioned in my reply to [1]).  I'm happy to
> drop this if we decide to go with that series.

Acked-by: 

I'd happily have this patch instead of the one in my series.

I was about to ask for translation, but the commit message hints
at a follow up patch marking this for translation, so I'll read on.

Stefan


Re: [PATCH v2 03/12] commit-graph: test that 'verify' finds corruption

2018-05-21 Thread Jakub Narebski
Derrick Stolee  writes:

> Add test cases to t5318-commit-graph.sh that corrupt the commit-graph
> file and check that the 'git commit-graph verify' command fails. These
> tests verify the header and chunk information is checked carefully.
>
> Helped-by: Martin Ågren 
> Signed-off-by: Derrick Stolee 
> ---
>  t/t5318-commit-graph.sh | 53 
> +
>  1 file changed, 53 insertions(+)
>
> diff --git a/t/t5318-commit-graph.sh b/t/t5318-commit-graph.sh
> index 6ca451dfd2..0cb88232fa 100755
> --- a/t/t5318-commit-graph.sh
> +++ b/t/t5318-commit-graph.sh
> @@ -240,4 +240,57 @@ test_expect_success 'git commit-graph verify' '
>   git commit-graph verify >output
>  '
>  
> +# usage: corrupt_data   []
> +corrupt_data() {
> + file=$1
> + pos=$2
> + data="${3:-\0}"
> + printf "$data" | dd of="$file" bs=1 seek="$pos" conv=notrunc
> +}

First, if we do this that way (and not by adding a test helper), the use
of this function should be, I think, protected using appropriate test
prerequisite.  Not everyone has 'dd' tool installed, for example on
MS Windows.

Second, the commit-graph file format has H-byte HASH-checksum of all of
the contents excluding checksum trailer.  It feels like any corruption
should have been caught by checksum test; thus to actually test that
contents is verified we should adjust checksum too, e.g. with sha1sum if
available or with test helper... oh, actually we have t/helper/test-sha1.
Unfortulately, it looks like it has no docs (beside commit message).

> +
> +test_expect_success 'detect bad signature' '
> + cd "$TRASH_DIRECTORY/full" &&

This 'cd' outside subshell and withou accompanying change back feels a
bit strange to me.

> + cp $objdir/info/commit-graph commit-graph-backup &&
> + test_when_finished mv commit-graph-backup $objdir/info/commit-graph &&
> + corrupt_data $objdir/info/commit-graph 0 "\0" &&

So 'CGPH' signature is currupted into '\0GPH'.

> + test_must_fail git commit-graph verify 2>err &&
> + grep -v "^\+" err > verify-errors &&

Minor nit: redirection should be cuddled to the file, i.e.:

  + grep -v "^\+" err >verify-errors &&

A question: why do you filter-out lines starting with "+" here?

> + test_line_count = 1 verify-errors &&
> + grep "graph signature" verify-errors

If messages from 'git commit-graph verify' can be localized (are
translatable), then it should be i18n_grep, isn't it?

> +'
> +
> +test_expect_success 'detect bad version number' '
> + cd "$TRASH_DIRECTORY/full" &&
> + cp $objdir/info/commit-graph commit-graph-backup &&
> + test_when_finished mv commit-graph-backup $objdir/info/commit-graph &&
> + corrupt_data $objdir/info/commit-graph 4 "\02" &&

All right, so we replace commit-graph format version 1 ("\01") with
version 2 ("\02").  First, why 2 and not 0?  Second, is "\02" portable?

> + test_must_fail git commit-graph verify 2>err &&
> + grep -v "^\+" err > verify-errors &&
> + test_line_count = 1 verify-errors &&

The above three lines is common across all test cases; I wonder if it
would be possible to extract it into function, to avoid code
duplication.

> + grep "graph version" verify-errors
> +'
> +
> +test_expect_success 'detect bad hash version' '
> + cd "$TRASH_DIRECTORY/full" &&
> + cp $objdir/info/commit-graph commit-graph-backup &&
> + test_when_finished mv commit-graph-backup $objdir/info/commit-graph &&
> + corrupt_data $objdir/info/commit-graph 5 "\02" &&

All right, so we change / corrupt hash version from value of 1, which
means SHA-1, to value of 2... which would soon meen NewHash.  Why not
"\777" (i.e. 0xff)?

> + test_must_fail git commit-graph verify 2>err &&
> + grep -v "^\+" err > verify-errors &&
> + test_line_count = 1 verify-errors &&
> + grep "hash version" verify-errors
> +'


Note: all of the above tests check in load_commit_graph_one(), not the
one in verify_commit_graph().  Just FYI.

> +
> +test_expect_success 'detect too small chunk-count' '
> + cd "$TRASH_DIRECTORY/full" &&
> + cp $objdir/info/commit-graph commit-graph-backup &&
> + test_when_finished mv commit-graph-backup $objdir/info/commit-graph &&
> + corrupt_data $objdir/info/commit-graph 6 "\01" &&
> + test_must_fail git commit-graph verify 2>err &&
> + grep -v "^\+" err > verify-errors &&
> + test_line_count = 2 verify-errors &&
> + grep "missing the OID Lookup chunk" verify-errors &&
> + grep "missing the Commit Data chunk" verify-errors

This feels too implementation specific.  We should have at least two
chunks missing (there are 3 required chunks, and number of chunks was
changed to 1), but commit-graph format specification does not state that
OID Fanout must be first, and thus it is two remaining required chunks
that would be missing.

> +'
> +
>  test_done

One test that I would like to see 

Re: [PATCH 02/11] repository: introduce repo_read_index_or_die

2018-05-21 Thread Brandon Williams
On 05/21, Stefan Beller wrote:
> On Fri, May 18, 2018 at 11:37 PM, Duy Nguyen  wrote:
> > On Wed, May 16, 2018 at 03:21:09PM -0700, Stefan Beller wrote:
> >> A common pattern with the repo_read_index function is to die if the return
> >> of repo_read_index is negative.  Move this pattern into a function.
> >>
> >> This patch replaces the calls of repo_read_index with its _or_die version,
> >> if the error message is exactly the same.
> >>
> >> Signed-off-by: Stefan Beller 
> >> ---
> >>  builtin/add.c   | 3 +--
> >>  builtin/check-ignore.c  | 7 ---
> >>  builtin/clean.c | 4 ++--
> >>  builtin/mv.c| 3 +--
> >>  builtin/reset.c | 3 +--
> >>  builtin/rm.c| 3 +--
> >>  builtin/submodule--helper.c | 3 +--
> >
> > 'git grep -w -A3 read_cache' tells me you're missing (*)
> 
> > (*) yes yes you did mention "if the error message is exactly the
> > same". But these look like good candicates to convert anyway
> >
> > builtin/diff-tree.c:if (read_cache() < 0)
> > builtin/diff-tree.c-die(_("index file corrupt"));
> >
> 
> I would expect this to be covered in a follow up patch as I did look
> for (read_cache() < 0) specifically.
> 
> > builtin/merge-ours.c:   if (read_cache() < 0)
> > builtin/merge-ours.c:   die_errno("read_cache failed");
> >
> > builtin/update-index.c: entries = read_cache();
> > builtin/update-index.c- if (entries < 0)
> > builtin/update-index.c- die("cache corrupted");
> >
> > merge-ours.c is interesting because it uses die_errno() version
> > instead. I think that might give inaccurate diagnostics because we do
> > not only fail when a libc/system call fails in read_cache(), so it
> > should be safe to just use die() here.
> >
> > update-index.c is also interesting because it actually saves the
> > return value. I don't think it's used anywhere, so you can just drop
> > that 'entries' variable. But perhaps it's good to make
> > repo_read_index_or_die() return the number of entries too.
> 
> Yeah this series left out all the interesting cases, as I just sent it out
> without much thought.
> 
> Returning the number of entries makes sense.
> 
> One of the reviewers of the series questioned the overall goal of the
> series as we want to move away from _die() in top level code but this
> series moves more towards it.

I've heard every once in a while that we want to move toward this but I
don't believe there is an actual effort along those lines just yet.  For
that to be the case we would need a clearly defined error handling
methodology (aside from the existing "die"ing behavior), which we don't
currently have.

> 
> I don't know.
> 
> Stefan

-- 
Brandon Williams


Re: [PATCH 07/11] rerere: use repo_read_index_or_die

2018-05-21 Thread Stefan Beller
Hi Thomas,

On Sun, May 20, 2018 at 10:45 AM, Thomas Gummerer  wrote:
> On 05/16, Stefan Beller wrote:
>> By switching to repo_read_index_or_die, we'll get a slightly different
>> error message ("index file corrupt") as well as localization of it.
>
> Apart from the slightly different error message, and the localization
> (both of which I think are a good thing), I notice that this also
> turns a "return error(...)" into a "die(...)".  I thought we were
> going more towards not 'die'ing in libgit.a code, and letting the
> caller handling the errors?  Either way I think this change should be
> described in the commit message.

oops, I will to drop or fix this patch from the series as I think the
not die()ing is a good idea for libgit.

> Also all other messages in 'rerere.c' are currently not translated.
> I'm currently working on a series that includes a patch to do just
> that (amongst some other changes to 'rerere'), which I'm hoping to
> send soon-ish, but the textual conflicts should we still want this
> patch should be easy to solve.

I did not mark this series well enough, it was a mere attempt to
"see how it goes", but was retracted as a whole[1] after Junio
dreaded some merge conflicts.

[1] 
https://public-inbox.org/git/CAGZ79kbvjoTq5079Ks+h2HNb+D99RELYPcJk2=pvzf9-y8d...@mail.gmail.com/

So do not feel bad about any merge conflicts in rerere, as this
series will not land any time soon.

Stefan


Re: [PATCH] regex: do not call `regfree()` if compilation fails

2018-05-21 Thread Stefan Beller
On Sun, May 20, 2018 at 3:50 AM, Martin Ågren  wrote:
> It is apparently undefined behavior to call `regfree()` on a regex where
> `regcomp()` failed. The language in [1] is a bit muddy, at least to me,
> but the clearest hint is this (`preg` is the `regex_t *`):
>
> Upon successful completion, the regcomp() function shall return 0.
> Otherwise, it shall return an integer value indicating an error as
> described in , and the content of preg is undefined.
>
> Funnily enough, there is also the `regerror()` function which should be
> given a pointer to such a "failed" `regex_t` -- the content of which
> would supposedly be undefined -- and which may investigate it to come up
> with a detailed error message.
>
> In any case, the example in that document shows how `regfree()` is not
> called after `regcomp()` fails.
>
> We have quite a few users of this API and most get this right. These
> three users do not.
>
> Several implementations can handle this just fine [2] and these code paths
> supposedly have not wreaked havoc or we'd have heard about it. (These
> are all in code paths where git got bad input and is just about to die
> anyway.) But let's just avoid the issue altogether.
>
> [1] http://pubs.opengroup.org/onlinepubs/9699919799/functions/regcomp.html
>
> [2] https://www.redhat.com/archives/libvir-list/2013-September/msg00262.html
>
> Researched-by: Eric Sunshine 
> Signed-off-byi Martin Ågren 
> ---
>
> On 14 May 2018 at 05:05, Eric Sunshine  wrote:
>> My research (for instance [1,2]) seems to indicate that it would be
>> better to avoid regfree() upon failed regcomp(), even though such a
>> situation is handled sanely in some implementations.
>>
>> [1]: https://www.redhat.com/archives/libvir-list/2013-September/msg00276.html
>> [2]: https://www.redhat.com/archives/libvir-list/2013-September/msg00273.html
>
> Thank you for researching this. I think it would make sense to get rid
> of the few places we have where we get this wrong (if our understanding
> of this being undefined is right).
>
>  diffcore-pickaxe.c | 1 -
>  grep.c | 2 --
>  2 files changed, 3 deletions(-)
>
> diff --git a/diffcore-pickaxe.c b/diffcore-pickaxe.c
> index 239ce5122b..800a899c86 100644
> --- a/diffcore-pickaxe.c
> +++ b/diffcore-pickaxe.c
> @@ -215,7 +215,6 @@ static void regcomp_or_die(regex_t *regex, const char 
> *needle, int cflags)
> /* The POSIX.2 people are surely sick */
> char errbuf[1024];
> regerror(err, regex, errbuf, 1024);
> -   regfree(regex);

While the commit message is very clear why we supposedly introduce a leak here,
it is hard to be found from the source code (as we only delete code
there, so digging
for history is not obvious), so maybe

 /* regfree(regex) is invalid here */

instead?

> die("invalid regex: %s", errbuf);
> }
>  }
> diff --git a/grep.c b/grep.c
> index 65b90c10a3..5e4f3f9a9d 100644
> --- a/grep.c
> +++ b/grep.c
> @@ -636,7 +636,6 @@ static void compile_fixed_regexp(struct grep_pat *p, 
> struct grep_opt *opt)
> if (err) {
> char errbuf[1024];
> regerror(err, >regexp, errbuf, sizeof(errbuf));
> -   regfree(>regexp);
> compile_regexp_failed(p, errbuf);
> }
>  }
> @@ -701,7 +700,6 @@ static void compile_regexp(struct grep_pat *p, struct 
> grep_opt *opt)
> if (err) {
> char errbuf[1024];
> regerror(err, >regexp, errbuf, 1024);
> -   regfree(>regexp);
> compile_regexp_failed(p, errbuf);
> }
>  }
> --
> 2.17.0.840.g5d83f92caf
>


Re: [PATCH 02/11] repository: introduce repo_read_index_or_die

2018-05-21 Thread Stefan Beller
On Fri, May 18, 2018 at 11:37 PM, Duy Nguyen  wrote:
> On Wed, May 16, 2018 at 03:21:09PM -0700, Stefan Beller wrote:
>> A common pattern with the repo_read_index function is to die if the return
>> of repo_read_index is negative.  Move this pattern into a function.
>>
>> This patch replaces the calls of repo_read_index with its _or_die version,
>> if the error message is exactly the same.
>>
>> Signed-off-by: Stefan Beller 
>> ---
>>  builtin/add.c   | 3 +--
>>  builtin/check-ignore.c  | 7 ---
>>  builtin/clean.c | 4 ++--
>>  builtin/mv.c| 3 +--
>>  builtin/reset.c | 3 +--
>>  builtin/rm.c| 3 +--
>>  builtin/submodule--helper.c | 3 +--
>
> 'git grep -w -A3 read_cache' tells me you're missing (*)

> (*) yes yes you did mention "if the error message is exactly the
> same". But these look like good candicates to convert anyway
>
> builtin/diff-tree.c:if (read_cache() < 0)
> builtin/diff-tree.c-die(_("index file corrupt"));
>

I would expect this to be covered in a follow up patch as I did look
for (read_cache() < 0) specifically.

> builtin/merge-ours.c:   if (read_cache() < 0)
> builtin/merge-ours.c:   die_errno("read_cache failed");
>
> builtin/update-index.c: entries = read_cache();
> builtin/update-index.c- if (entries < 0)
> builtin/update-index.c- die("cache corrupted");
>
> merge-ours.c is interesting because it uses die_errno() version
> instead. I think that might give inaccurate diagnostics because we do
> not only fail when a libc/system call fails in read_cache(), so it
> should be safe to just use die() here.
>
> update-index.c is also interesting because it actually saves the
> return value. I don't think it's used anywhere, so you can just drop
> that 'entries' variable. But perhaps it's good to make
> repo_read_index_or_die() return the number of entries too.

Yeah this series left out all the interesting cases, as I just sent it out
without much thought.

Returning the number of entries makes sense.

One of the reviewers of the series questioned the overall goal of the
series as we want to move away from _die() in top level code but this
series moves more towards it.

I don't know.

Stefan


Re: commit-graph: change in "best" merge-base when ambiguous

2018-05-21 Thread Elijah Newren
Hi,

On Mon, May 21, 2018 at 11:10 AM, Derrick Stolee  wrote:
> Hello all,
>
> While working on the commit-graph feature, I made a test commit that sets
> core.commitGraph and gc.commitGraph to true by default AND runs 'git
> commit-graph write --reachable' after each 'git commit' command. This helped
> me find instances in the test suite where the commit-graph feature changes
> existing functionality. Most of these were in regards to grafts,
> replace-objects, and shallow-clones (as expected) or when trying to find a
> corrupt or hidden commit (the commit-graph hides this corrupt/missing data).
> However, there was one interesting case that I'd like to mention on-list.
>
> In t6024-recursive-merge.sh, we have the following commit structure:
>
> # 1 - A - D - F
> #   \   X   /
> # B   X
> #   X   \
> # 2 - C - E - G
>
> When merging F to G, there are two "best" merge-bases, A and C. With
> core.commitGraph=false, 'git merge-base F G' returns A, while it returns C
> when core.commitGraph=true. This is due to the new walk order when using
> generation numbers, although I have not dug deep into the code to point out
> exactly where the choice between A and C is made. Likely it's just whatever
> order they are inserted into a list.

Ooh, interesting.

Just a guess, but could it be related to relative ordering of
committer timestamps?  Ordering of committer timestamps apparently
affects order of merge-bases returned to merge-recursive, and although
that shouldn't have mattered, a few bugs meant that it did and the
order ended up determining what contents a successful merge would
have.  See this recent post:

https://public-inbox.org/git/CABPp-BFc1OLYKzS5rauOehvEugPc0oGMJp-NMEAmVMW7QR=4...@mail.gmail.com/

The fact that the merge was successful for both orderings of merge
bases was the real bug, though; it should have detected and reported a
conflict both ways.


I'm not sure where else we have an accidental and incorrect dependence
on merge-base tie-breaker or ordering logic, but if it's like this
one, changing the tie-breaker should be okay.


Re: which files are "known to git"?

2018-05-21 Thread Robert P. J. Day
On Mon, 21 May 2018, Elijah Newren wrote:

> > can anyone refresh my memory if that happened here, and whether
> > that was the consensus after the discussion was over?
>
> Perhaps this: 
> https://public-inbox.org/git/EEC5BA1D5F274F02AE20FC269868FDEF@PhilipOakley/
> ?

  yup, that's it, thanks.

rday


Re: which files are "known to git"?

2018-05-21 Thread Stefan Beller
On Mon, May 21, 2018 at 10:40 AM, Robert P. J. Day
 wrote:
> On Mon, 21 May 2018, Elijah Newren wrote:
>
>> Hi Robert,
>>
>> I had always assumed prior to your email that 'known to Git' meant
>> 'tracked' or 'recorded in the index'...
>
>   i *know* i've been in this discussion before,

https://public-inbox.org/git/alpine.LFD.2.21.1711120430580.30032@localhost.localdomain/

via

git clone --mirror https://public-inbox.org/git git-ml && cd git-ml
git log --oneline --author=rpj...@crashcourse.ca
/ known # search for "known" in message subjects

I really value the public inbox to work as a git repo, as then you
can dig though it just as you dig through commits.


Re: which files are "known to git"?

2018-05-21 Thread Jonathan Nieder
Hi,

Robert P. J. Day wrote:

>   i did a quick search for that phrase in the current code base and
> came up with:
>
> builtin/difftool.c:   /* The symlink is unknown to Git so read from 
> the filesystem */
> dir.c:error("pathspec '%s' did not match any file(s) known to 
> git.",
> Documentation/git-rm.txt:removes only the paths that are known to Git.  
> Giving the name of
> Documentation/git-commit.txt:   be known to Git);
> Documentation/user-manual.txt:error: pathspec 
> '261dfac35cb99d380eb966e102c1197139f7fa24' did not match any file(s) known to 
> git.
> Documentation/gitattributes.txt:  Notice all types of potential 
> whitespace errors known to Git.
> Documentation/git-clean.txt:Normally, only files unknown to Git are removed, 
> but if the `-x`
> Documentation/RelNotes/1.8.2.1.txt: * The code to keep track of what 
> directory names are known to Git on
> Documentation/RelNotes/1.8.1.6.txt: * The code to keep track of what 
> directory names are known to Git on
> Documentation/RelNotes/2.9.0.txt:   known to Git.  They have been taught to 
> do the normalization.
> Documentation/RelNotes/2.8.4.txt:   known to Git.  They have been taught to 
> do the normalization.
> Documentation/RelNotes/1.8.3.txt: * The code to keep track of what directory 
> names are known to Git on
> t/t3005-ls-files-relative.sh: echo "error: pathspec $sq$f$sq 
> did not match any file(s) known to git."
> t/t3005-ls-files-relative.sh: echo "error: pathspec $sq$f$sq 
> did not match any file(s) known to git."
>
> so it's not like there's a *ton* of that, but still enough to want to
> get it right. should there be a precise definition for the phrase
> "known to git", or should that phrase simply be banned/replaced?

In my opinion: the latter.

It's not like the phrase represents some concept that we don't have
any other name for.  They're also known as "tracked files" and that
name is more intuitive.

Thanks,
Jonathan


Re: which files are "known to git"?

2018-05-21 Thread Elijah Newren
On Mon, May 21, 2018 at 10:40 AM, Robert P. J. Day
 wrote:
> On Mon, 21 May 2018, Elijah Newren wrote:
>
>> I had always assumed prior to your email that 'known to Git' meant
>> 'tracked' or 'recorded in the index'...
>
>   i *know* i've been in this discussion before, but i don't remember
> where, i *assume* it was on this list, and i recall someone (again,
> don't remember who) who opined that there are two categories of files
> that are "known to git":
>
>   1) files known in a *positive* sense, those being explicitly tracked
>   files, and
>
>   2) files known in a *negative* sense, as in explicitly ignored files
>
> can anyone refresh my memory if that happened here, and whether that
> was the consensus after the discussion was over?

Perhaps this: 
https://public-inbox.org/git/EEC5BA1D5F274F02AE20FC269868FDEF@PhilipOakley/
?

> If that's the
> definition that's being used, then this passage makes sense:
>
>   "Normally, only files unknown to Git are removed, but if the -x
>   option is specified, ignored files are also removed."
>
> that pretty clearly implies that ignored files are considered "known"
> to git.

Yes, _if_ that's the definition used, then that passage makes sense.
But if that's the definition used, then the other two passages I
pointed out in Documentation/git-commit.txt and
Documentation/git-rm.txt do NOT make sense and need to be rewritten.

Junio has already chimed in elsewhere on this thread and stated pretty
clearly that the intended meaning for 'known to Git' was just (1), not
(2), and even provided a suggested wording fix for
Documentation/git-clean.txt.  Putting that into a patch format and
submitting along with an update to Documentation/glossary-content.txt
as Duy suggested look like the two todos to me, though perhaps others
want to discuss ways to just avoid the phrase 'known to Git' (as
suggested by Jonathan).


Re: [PATCH 1/5] http: use strbufs instead of fixed buffers

2018-05-21 Thread Stefan Beller
Hi Jeff,

On Fri, May 18, 2018 at 6:56 PM, Jeff King  wrote:

> @@ -2421,4 +2426,5 @@ void release_http_object_request(struct 
> http_object_request *freq)
..
> +   strbuf_release(>tmpfile);

Do we need an equivalent in release_http_pack_request as well?


commit-graph: change in "best" merge-base when ambiguous

2018-05-21 Thread Derrick Stolee

Hello all,

While working on the commit-graph feature, I made a test commit that 
sets core.commitGraph and gc.commitGraph to true by default AND runs 
'git commit-graph write --reachable' after each 'git commit' command. 
This helped me find instances in the test suite where the commit-graph 
feature changes existing functionality. Most of these were in regards to 
grafts, replace-objects, and shallow-clones (as expected) or when trying 
to find a corrupt or hidden commit (the commit-graph hides this 
corrupt/missing data). However, there was one interesting case that I'd 
like to mention on-list.


In t6024-recursive-merge.sh, we have the following commit structure:

    # 1 - A - D - F
    #   \   X   /
    # B   X
    #   X   \
    # 2 - C - E - G

When merging F to G, there are two "best" merge-bases, A and C. With 
core.commitGraph=false, 'git merge-base F G' returns A, while it returns 
C when core.commitGraph=true. This is due to the new walk order when 
using generation numbers, although I have not dug deep into the code to 
point out exactly where the choice between A and C is made. Likely it's 
just whatever order they are inserted into a list.


In the Discussion section of the `git merge-base` docs [1], we have the 
following:


    When the history involves criss-cross merges, there can be more 
than one best common ancestor for two commits. For example, with this 
topology:


    ---1---o---A
        \ /
     X
        / \
    ---2---o---o---B

    both 1 and 2 are merge-bases of A and B. Neither one is better than 
the other (both are best merge bases). When the --all option is not 
given,     it is unspecified which best one is output.


This means our official documentation mentions that we do not have a 
concrete way to differentiate between these choices. This makes me think 
that this change in behavior is not a bug, but it _is_ a change in 
behavior. It's worth mentioning, but I don't think there is any value in 
making sure `git merge-base` returns the same output.


Does anyone disagree? Is this something we should solidify so we always 
have a "definitive" merge-base?


The biggest reason I think we should avoid sticking to the existing 
behavior is that the current behavior depends on the walk order. That 
means we would not be able to concretely define a tie-breaker without 
changing the existing behavior anyway.


Thanks,
-Stolee

[1] https://git-scm.com/docs/git-merge-base#_discussion



Re: which files are "known to git"?

2018-05-21 Thread Robert P. J. Day
On Mon, 21 May 2018, Jonathan Nieder wrote:

> Robert P. J. Day wrote:
> > On Mon, 21 May 2018, Elijah Newren wrote:
>
> >> Hi Robert,
> >>
> >> I had always assumed prior to your email that 'known to Git'
> >> meant 'tracked' or 'recorded in the index'...
> >
> >   i *know* i've been in this discussion before, but i don't
> > remember where, i *assume* it was on this list, and i recall
> > someone (again, don't remember who) who opined that there are two
> > categories of files that are "known to git":
>
> My understanding was the same as Elijah's.
>
> I would be in favor of a patch that replaces the phrase "known to
> Git" in Git's documentation with something less confusing.

  first, i want to apologize to everyone for opening this apparent can
of worms. (it's victoria day here in canada, and i intended to spend
it just puttering around with git-related minutiae, not encouraging
thought-provoking questions about the fundamental nature of git.)

  i did a quick search for that phrase in the current code base and
came up with:

builtin/difftool.c: /* The symlink is unknown to Git so read from 
the filesystem */
dir.c:  error("pathspec '%s' did not match any file(s) known to git.",
Documentation/git-rm.txt:removes only the paths that are known to Git.  Giving 
the name of
Documentation/git-commit.txt:   be known to Git);
Documentation/user-manual.txt:error: pathspec 
'261dfac35cb99d380eb966e102c1197139f7fa24' did not match any file(s) known to 
git.
Documentation/gitattributes.txt:Notice all types of potential 
whitespace errors known to Git.
Documentation/git-clean.txt:Normally, only files unknown to Git are removed, 
but if the `-x`
Documentation/RelNotes/1.8.2.1.txt: * The code to keep track of what directory 
names are known to Git on
Documentation/RelNotes/1.8.1.6.txt: * The code to keep track of what directory 
names are known to Git on
Documentation/RelNotes/2.9.0.txt:   known to Git.  They have been taught to do 
the normalization.
Documentation/RelNotes/2.8.4.txt:   known to Git.  They have been taught to do 
the normalization.
Documentation/RelNotes/1.8.3.txt: * The code to keep track of what directory 
names are known to Git on
t/t3005-ls-files-relative.sh:   echo "error: pathspec $sq$f$sq 
did not match any file(s) known to git."
t/t3005-ls-files-relative.sh:   echo "error: pathspec $sq$f$sq 
did not match any file(s) known to git."

so it's not like there's a *ton* of that, but still enough to want to
get it right. should there be a precise definition for the phrase
"known to git", or should that phrase simply be banned/replaced? i
have no idea, open to suggestions.

rday


Re: [PATCH v2 02/18] Add a new builtin: branch-diff

2018-05-21 Thread Stefan Beller
Hi Johannes,

>> (2) git diff --branch topic-v1...topic-v2
>
> From my point of view, `git diff --branch` indicates that I diff
> *branches*. Which is not really something that makes sense, and definitely
> not what this command is about.
>
> We are not comparing branches.
>
> We are comparing versions of the same branch.

I happen to have a messier workflow than you have, as I
develop a "resend" of a topic in a new branch (or I have to
restore the old sent topic from the reflog).

Now that I have the tool I also compare two branches,
namely, the branch that Junio queued
(origin/base..origin/sb/intelligent-name) vs the resend
that I had locally (origin/base..foo).

Next time I might compare Junios queued topic to the
local format-patch'es that I already annotated.

So in a way this diffs different versions of a topic, "diff-topic-versions".

>> It`s all still a comparison between two revisions (pointed to by
>> "topic-v1" and "topic-v2" branch heads in this specific example), but
>> it differs in what we are comparing - (1) set of files contained in
>> endpoints, or (2) set of revisions contained in (or "leading to")
>> endpoints.
>
> It is very much not about comparing *two* revisions.

I wonder if we can make the tool more intelligent to take two revisions
and it figures out the range by finding the base branch itself.
Probably as a follow up.


> It is very much about
> comparing two *ranges of* revisions, and not just any ranges, no. Those
> ranges need to be so related as to contain mostly identical changes.

range-diff, eh?

> Most fellow German software engineers (who seem to have a knack for
> idiotically long variable/function names) would now probably suggest:
>
> git compare-patch-series-with-revised-patch-series

or short:

  revision-compare
  compare-revs
  com-revs

  revised-diff
  revise-diff
  revised-compare

  diff-revise

> I hope you agree that that is better *and* worse than your suggestions,
> depending from what angle you look at it: it is better because it
> describes what the command is *actually* doing. But it is much worse at
> the same time because it is too long.

btw, you think very much in terms of *patch series*, but there are workflows
without patches (pull requests at Github et Al., changes in Gerrit),
and I would think the output of the tool under discussion would still be
useful.

In [1] Junio gives his use case, it is "before accepting them", which could
be comparing an mbox or patch files against a branch, or first building
up a local history on a detached head (and then wondering if to reset
the branch to the new history), which would be all in Git.

That use case still has 'patches' involved, but these are not the main
selling point for the tool, as you could turn patches into commits before
using this tool.

[1] https://public-inbox.org/git/xmqqvabh1ung@gitster-ct.c.googlers.com/

Thanks,
Stefan


Re: which files are "known to git"?

2018-05-21 Thread Jonathan Nieder
Robert P. J. Day wrote:
> On Mon, 21 May 2018, Elijah Newren wrote:

>> Hi Robert,
>>
>> I had always assumed prior to your email that 'known to Git' meant
>> 'tracked' or 'recorded in the index'...
>
>   i *know* i've been in this discussion before, but i don't remember
> where, i *assume* it was on this list, and i recall someone (again,
> don't remember who) who opined that there are two categories of files
> that are "known to git":

My understanding was the same as Elijah's.

I would be in favor of a patch that replaces the phrase "known to Git"
in Git's documentation with something less confusing.

Thanks,
Jonathan


Re: which files are "known to git"?

2018-05-21 Thread Robert P. J. Day
On Mon, 21 May 2018, Elijah Newren wrote:

> Hi Robert,
>
> I had always assumed prior to your email that 'known to Git' meant
> 'tracked' or 'recorded in the index'...

  i *know* i've been in this discussion before, but i don't remember
where, i *assume* it was on this list, and i recall someone (again,
don't remember who) who opined that there are two categories of files
that are "known to git":

  1) files known in a *positive* sense, those being explicitly tracked
  files, and

  2) files known in a *negative* sense, as in explicitly ignored files

can anyone refresh my memory if that happened here, and whether that
was the consensus after the discussion was over? if that's the
definition that's being used, then this passage makes sense:

  "Normally, only files unknown to Git are removed, but if the -x
  option is specified, ignored files are also removed."

that pretty clearly implies that ignored files are considered "known"
to git.

rday


Bug in update-index man page

2018-05-21 Thread Kirill Likhodedov

I've noticed the following text in the documentation pages for the 
`update-index` command:

   
   Files to act on. Note that files beginning with .  are discarded. 
This includes ./file and
   dir/./file. If you don't want this, then use cleaner names. The same 
applies to directories
   ending / and paths with //


It seems that the phrase "files beginning with .  are discarded." is incorrect, 
because update-index works correctly on files like `.gitignore`. It also seems 
that even names like `./file` are handled correctly after 
cfb0af1d50247e66ea1d46014650e60e9cfb87b9. Could you please verify and fix the 
doc if needed.

Thanks!
Kirill.

Re: What's cooking in git.git (May 2018, #02; Thu, 17)

2018-05-21 Thread Stefan Beller
>>  Many tests hardcode the raw object names, which would change once
>>  we migrate away from SHA-1.  While some of them must test against
>>  exact object names, most of them do not have to use hardcoded
>>  constants in the test.  The latter kind of tests have been updated
>>  to test the moral equivalent of the original without hardcoding the
>>  actual object names.
>>
>>  Will merge to 'next'.
>
> I think there was one minor change Stefan wanted out of this series.
> I'll send a reroll (and tbdiff) with just that change.

I was just unsure how we treat [e]grep in our test suite, so I asked
questions out of sheer curiosity. I have no objections to the change
made, though.

Thanks!


Re: [PATCH 5/5] merge-recursive: simplify handle_change_delete

2018-05-21 Thread Elijah Newren
Hi Dscho,

On Mon, May 21, 2018 at 6:41 AM, Johannes Schindelin
 wrote:
> On Sat, 19 May 2018, Elijah Newren wrote:
>> On Sat, May 19, 2018 at 12:32 AM, Johannes Sixt  wrote:

>> >
>> > Oh, there is a reason for the repeated message text: translations! Please 
>> > do
>> > not play sentence Lego with translated strings. The original code is
>> > preferable.
>>
>> Ah, translations; that makes sense now.  I'm still annoyed by the
>> code, but I retract patch 5 then.
>
> Maybe you can remove the temptation for others, too, my replacing your
> patch 5 with one that adds the code comment "No sentence Lego! Think of
> our poor translators and refrain from making their life miserable!" or
> some more appropriate one?

That sounds reasonable.  I'll make that change (plus your other
suggested changes on other patches; thanks for all the reviews) in the
next round.

I'm also considering adopting a setup_unpack_trees_porcelain() style
of handling (from unpack-trees.c) for all these messages in
merge-recursive.c, but that's a bigger change that I'd probably put in
a separate series if I decide to go with it.

Elijah


Re: issues(?) installing git-lfs via fedora "dnf" command

2018-05-21 Thread Robert P. J. Day
On Mon, 21 May 2018, Jonathan Nieder wrote:

> Hi,
>
> Robert P. J. Day wrote:
> > On Mon, 21 May 2018, Jonathan Nieder wrote:
>
> >> The packager should be able to find out whether it's an issue in
> >> git-lfs upstream and report it to that project if it is.  Git-lfs is
> >> not part of git.git; it's a separate project:
> >> https://github.com/git-lfs/git-lfs/blob/master/CONTRIBUTING.md
> >> I believe they use github's issue tracker to track bugs.
> >
> >   it would *appear* that this is a combination of both a git issue,
> > and a red hat packaging issue:
> >
> > https://bugzilla.redhat.com/show_bug.cgi?id=1580357
> > https://github.com/git-lfs/git-lfs/issues/3013
>
> Can you clarify?  Neither of those bugs points to a git issue if I
> understand correctly.  They may be a git-lfs issue, though.

  sorry, i mistyped, i should have said a "git-lfs issue".

rday


Re: issues(?) installing git-lfs via fedora "dnf" command

2018-05-21 Thread Jonathan Nieder
Hi,

Robert P. J. Day wrote:
> On Mon, 21 May 2018, Jonathan Nieder wrote:

>> The packager should be able to find out whether it's an issue in
>> git-lfs upstream and report it to that project if it is.  Git-lfs is
>> not part of git.git; it's a separate project:
>> https://github.com/git-lfs/git-lfs/blob/master/CONTRIBUTING.md
>> I believe they use github's issue tracker to track bugs.
>
>   it would *appear* that this is a combination of both a git issue,
> and a red hat packaging issue:
>
> https://bugzilla.redhat.com/show_bug.cgi?id=1580357
> https://github.com/git-lfs/git-lfs/issues/3013

Can you clarify?  Neither of those bugs points to a git issue if I
understand correctly.  They may be a git-lfs issue, though.

This kind of confusion is exactly why the Git project adopted a
trademark policy to prevent outside projects from identifying
themselves with Git: https://git-scm.com/about/trademark.
Unfortunately Git LFS's existence precedes that policy.  I believe
they got permission to continue using the name, confusing as it is.

Thanks,
Jonathan


Re: issues(?) installing git-lfs via fedora "dnf" command

2018-05-21 Thread Robert P. J. Day
On Mon, 21 May 2018, Jonathan Nieder wrote:

> Robert P. J. Day wrote:
>
> >   $ sudo dnf install git-lfs
> [...]
> >   Running transaction
> > Preparing:
> > Installing   : git-lfs-2.4.0-1.fc28.x86_64
> > Running scriptlet: git-lfs-2.4.0-1.fc28.x86_64
> >   Error: Failed to call git rev-parse --git-dir --show-toplevel: "fatal:
> >   not a git repository (or any of the parent directories): .git\n"
> [...]
> > is one supposed to be *in* a git repository when installing, because i
> > was in fact at the top level of my linux kernel source repo, so i'm
> > unclear on what that "Error" is trying to tell me. am i just being
> > clueless? is this something that i should submit as a fedora packaging
> > issue?
>
> Yes, this looks like something that should be reported as a Fedora
> packaging issue.
>
> The packager should be able to find out whether it's an issue in
> git-lfs upstream and report it to that project if it is.  Git-lfs is
> not part of git.git; it's a separate project:
> https://github.com/git-lfs/git-lfs/blob/master/CONTRIBUTING.md
> I believe they use github's issue tracker to track bugs.

  it would *appear* that this is a combination of both a git issue,
and a red hat packaging issue:

https://bugzilla.redhat.com/show_bug.cgi?id=1580357
https://github.com/git-lfs/git-lfs/issues/3013

rday


Re: issues(?) installing git-lfs via fedora "dnf" command

2018-05-21 Thread Jonathan Nieder
Robert P. J. Day wrote:

>   $ sudo dnf install git-lfs
[...]
>   Running transaction
> Preparing:
> Installing   : git-lfs-2.4.0-1.fc28.x86_64
> Running scriptlet: git-lfs-2.4.0-1.fc28.x86_64
>   Error: Failed to call git rev-parse --git-dir --show-toplevel: "fatal:
>   not a git repository (or any of the parent directories): .git\n"
[...]
> is one supposed to be *in* a git repository when installing, because i
> was in fact at the top level of my linux kernel source repo, so i'm
> unclear on what that "Error" is trying to tell me. am i just being
> clueless? is this something that i should submit as a fedora packaging
> issue?

Yes, this looks like something that should be reported as a Fedora
packaging issue.

The packager should be able to find out whether it's an issue in
git-lfs upstream and report it to that project if it is.  Git-lfs is
not part of git.git; it's a separate project:
https://github.com/git-lfs/git-lfs/blob/master/CONTRIBUTING.md
I believe they use github's issue tracker to track bugs.

Thanks,
Jonathan


Re: [PATCH 2/5] merge-recursive: fix numerous argument alignment issues

2018-05-21 Thread Elijah Newren
Hi Dscho,

On Mon, May 21, 2018 at 6:42 AM, Johannes Schindelin
 wrote:
> Hi Elijah,
>
> On Fri, 18 May 2018, Elijah Newren wrote:
>
>> Various refactorings throughout the code have left lots of alignment
>> issues that were driving me crazy; fix them.
>
> I hope you did not do that manually. What is your code formatting tool of
> choice?

Sorry to disappoint but it was manual.  I noticed and fixed one of
them many months ago, tossing it into a 'misc' branch.  Then ran
across another and added it.  When I hit the third, I was annoyed and
cleaned them all up -- and combined them with other changes into this
series.

However, it's hard to call this formatting entirely manual.  A quick
regex found the relevant sites pretty easily, and 'M-x indent-region'
(emacs) fixes the indentation for a block of lines all at once.  I
guess if I had taken the time to fix a few other emacs formatting
rules, I could have highlighted the whole file and ran C-M-\ (a.k.a.
indent-region), but didn't.

> The patch looks obviously good to me.

Thanks for taking a look!


why does "man git-clean" not mention files ignored by core.excludesFile?

2018-05-21 Thread Robert P. J. Day

  sort of related to my previous post, but in "man git-clean", one
reads:

  -e , --exclude=
 In addition to those found in .gitignore (per directory)
 and $GIT_DIR/info/exclude, also consider these patterns to
 be in the set of the ignore rules in effect.

  -x
 Don’t use the standard ignore rules read from .gitignore
 (per directory) and $GIT_DIR/info/exclude, but do still use
 the ignore rules given with -e options. This allows
 removing all untracked files, including build products.
 This can be used (possibly in conjunction with git reset)
 to create a pristine working directory to test a clean
 build.

  why is there no mention of files ignored via a user's
core.excludesFile configuration? those sections seem sufficiently
comprehensive to list all of the other ways to ignore files, is there
a reason that that config setting is not mentioned?

rday

Re: which files are "known to git"?

2018-05-21 Thread Junio C Hamano
On Tue, May 22, 2018 at 12:09 AM, Elijah Newren  wrote:
>
> I had always assumed prior to your email that 'known to Git' meant
> 'tracked' or 'recorded in the index'.

That's been my intention as well ;-)

> From Documentation/git-clean.txt:
>
> Normally, only files unknown to Git are removed, but if the `-x`
> option is specified, ignored files are also removed.

The above makes it sound as if "unknown to Git" is synonym
to "not marked as ignored via the exclude mechanism", which would
incorrectly imply "known to Git" is "marked as ignored via the exclude
mechanism".

Which is a sheer nonsense.

I think this is written while forgetting that "known to Git" was already a term
with a specific meaning, and used a confusing term unnecessarily loosely.

 "clean" removes files that are not in the index and are not marked as
ignored by default, but with "clean -x", the user can remove all files that
are not in the index, even the ones that are marked as ignored.

In the above version of description, "files that are not in the index" can
be replaced with "untracked files" and we can also say "files unknown to
Git" (if we want to), but the set of files "clean" operates by default is
narrower than "unknown to Git"--it is "unknown to Git and not marked as
ignored".


Re: which files are "known to git"?

2018-05-21 Thread Robert P. J. Day
On Mon, 21 May 2018, Elijah Newren wrote:

> Hi Robert,

> I had always assumed prior to your email that 'known to Git' meant
> 'tracked' or 'recorded in the index'.  However, a quick `git grep -i
> known.to.git` shows that we're actually not consistent by what we
> mean with this phrase.  A little test setup:
>
>   $ echo ignoreme >>.gitignore
>   $ git add .gitignore
>   $ git commit -m ignoreme
>   $ touch ignoreme
>   $ git ls-files -o
>   ignoreme
>   $ git ls-files -o --exclude-standard
>   $
>
> >From Documentation/git-clean.txt:
>
> Normally, only files unknown to Git are removed, but if the `-x`
> option is specified, ignored files are also removed.
>
>   This implies that ignored files are not 'unknown to Git', or fixing the
>   double negative, that ignored files are 'known to Git':
>   $ git clean -n
>   $ git clean -nx
>   Would remove ignoreme
>   $

  uh oh ... i'm just now remembering a discussion once upon a time
where this wasn't simply a double negative. IIRC (and someone else
help me out here), "known to git" also meant known *not* to be tracked
or something like that (as in, ignored files).

  anyone remember that conversation?

rday


Re: which files are "known to git"?

2018-05-21 Thread Robert P. J. Day
On Mon, 21 May 2018, Elijah Newren wrote:

> Hi Robert,
>
> On Mon, May 21, 2018 at 4:18 AM, Robert P. J. Day  
> wrote:
> >
> >   updating my git courseware and, since some man pages refer to files
> > "known to git", i just want to make sure i understand precisely which
> > files those are. AIUI, they would include:
> >
> >   * tracked files
> >   * ignored files
> >   * new files which have been staged but not yet committed
> >
> > is that it? are there others?
>
> Doesn't the first category of yours include the third?  I always
> read 'tracked' as 'in the index'.

 you're right, i was being redundant.

rday


Re: which files are "known to git"?

2018-05-21 Thread Duy Nguyen
On Mon, May 21, 2018 at 5:09 PM, Elijah Newren  wrote:
> Robert, since you're working on documentation of sorts anyway, would
> you like to propose some patches to fix things here?  I'm not entirely
> sure what to suggest, and we might need a random suggestion to get the
> discussion started before we figure out what we want here, but it'd be
> nice to fix this inconsistency.

Make sure to fix Documentation/glossary-content.txt too, Robert, if
you plan to improve documentation.
-- 
Duy


Re: which files are "known to git"?

2018-05-21 Thread Elijah Newren
Hi Robert,

On Mon, May 21, 2018 at 4:18 AM, Robert P. J. Day  wrote:
>
>   updating my git courseware and, since some man pages refer to files
> "known to git", i just want to make sure i understand precisely which
> files those are. AIUI, they would include:
>
>   * tracked files
>   * ignored files
>   * new files which have been staged but not yet committed
>
> is that it? are there others?

Doesn't the first category of yours include the third?  I always read
'tracked' as 'in the index'.

I had always assumed prior to your email that 'known to Git' meant
'tracked' or 'recorded in the index'.  However, a quick `git grep -i
known.to.git` shows that we're actually not consistent by what we mean
with this phrase.  A little test setup:

  $ echo ignoreme >>.gitignore
  $ git add .gitignore
  $ git commit -m ignoreme
  $ touch ignoreme
  $ git ls-files -o
  ignoreme
  $ git ls-files -o --exclude-standard
  $

>From Documentation/git-clean.txt:

Normally, only files unknown to Git are removed, but if the `-x`
option is specified, ignored files are also removed.

  This implies that ignored files are not 'unknown to Git', or fixing the
  double negative, that ignored files are 'known to Git':
  $ git clean -n
  $ git clean -nx
  Would remove ignoreme
  $

>From Documentation/git-commit.txt:

3. by listing files as arguments to the 'commit' command
   (without --interactive or --patch switch), in which
   case the commit will ignore changes staged in the index, and instead
   record the current content of the listed files (which must already
   be known to Git);

  This implies that only recorded-in-the-index files are known to Git:
  $ git commit -m testing ignoreme
  error: pathspec 'ignoreme' did not match any file(s) known to git.
  $

>From Documentation/git-rm.txt:

The  list given to the command can be exact pathnames,
file glob patterns, or leading directory names.  The command
removes only the paths that are known to Git.  Giving the name of
a file that you have not told Git about does not remove that file.

  This also implies that only recorded-in-the-index files are known to Git:
  $ git rm ignoreme
  fatal: pathspec 'ignoreme' did not match any files
  $


I can't see any evidence of usage that suggests any more categories
than tracked and ignored, but whether ignored files are included in
the set of 'files known to Git' appears to depend on which man page
you are reading...which is rather unfortunate.

Robert, since you're working on documentation of sorts anyway, would
you like to propose some patches to fix things here?  I'm not entirely
sure what to suggest, and we might need a random suggestion to get the
discussion started before we figure out what we want here, but it'd be
nice to fix this inconsistency.


Elijah


Re: [PATCHv4 1/1] git-p4: add unshelve command

2018-05-21 Thread Luke Diamand
On 21 May 2018 at 08:07, Junio C Hamano  wrote:
> SZEDER Gábor  writes:
>
>>> diff --git a/t/t9832-unshelve.sh b/t/t9832-unshelve.sh
>>> new file mode 100755
>>> index 00..cca2dec536
>
> ... in short, I'd queue a fix-up on top like this to be later
> squashed after getting an ack?

That looks good to me, thanks! Ack.

Luke

>
>  t/t9832-unshelve.sh | 23 ---
>  1 file changed, 4 insertions(+), 19 deletions(-)
>
> diff --git a/t/t9832-unshelve.sh b/t/t9832-unshelve.sh
> index cca2dec536..3513abd21a 100755
> --- a/t/t9832-unshelve.sh
> +++ b/t/t9832-unshelve.sh
> @@ -1,6 +1,6 @@
>  #!/bin/sh
>
> -last_shelved_change() {
> +last_shelved_change () {
> p4 changes -s shelved -m1 | cut -d " " -f 2
>  }
>
> @@ -17,7 +17,7 @@ test_expect_success 'init depot' '
> cd "$cli" &&
> echo file1 >file1 &&
> p4 add file1 &&
> -   p4 submit -d "change 1"
> +   p4 submit -d "change 1" &&
> : >file_to_delete &&
> p4 add file_to_delete &&
> p4 submit -d "file to delete"
> @@ -120,29 +120,14 @@ EOF
> )
>  '
>
> -diff_adds_line() {
> -   text="$1" &&
> -   file="$2" &&
> -   grep -q "^+$text" $file || (echo "expected \"text\" $text not found 
> in $file" && exit 1)
> -}
> -
> -diff_excludes_line() {
> -   text="$1" &&
> -   file="$2" &&
> -   if grep -q "^+$text" $file; then
> -   echo "unexpected text \"$text\" found in $file" &&
> -   exit 1
> -   fi
> -}
> -
>  # Now try to unshelve it. git-p4 should refuse to do so.
>  test_expect_success 'try to unshelve the change' '
> test_when_finished cleanup_git &&
> (
> change=$(last_shelved_change) &&
> cd "$git" &&
> -   ! git p4 unshelve $change >out.txt 2>&1 &&
> -   grep -q "cannot unshelve" out.txt
> +   test_must_fail git p4 unshelve $change 2>err.txt &&
> +   grep -q "cannot unshelve" err.txt
> )
>  '
>


[PATCH v5 3/4] argv-array: return the pushed string from argv_push*()

2018-05-21 Thread Martin Ågren
From: Junio C Hamano 

Such an API change allows us to use an argv_array this way:

struct argv_array to_free = ARGV_ARRAY_INIT;
const char *msg;

if (some condition) {
msg = "constant string message";
... other logic ...
} else {
msg = argv_pushf(_free, "format %s", var);
}
... use "msg" ...
... do other things ...
argv_clear(_free);

Note that argv_array_pushl() and argv_array_pushv() are used to push
one or more strings with a single call, so we do not return any one
of these strings from these two functions in order to reduce the
chance to misuse the API.

Signed-off-by: Junio C Hamano 
Signed-off-by: Martin Ågren 
---
 argv-array.h | 4 ++--
 argv-array.c | 6 --
 2 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/argv-array.h b/argv-array.h
index 29056e49a1..715c93b246 100644
--- a/argv-array.h
+++ b/argv-array.h
@@ -12,9 +12,9 @@ struct argv_array {
 #define ARGV_ARRAY_INIT { empty_argv, 0, 0 }
 
 void argv_array_init(struct argv_array *);
-void argv_array_push(struct argv_array *, const char *);
+const char *argv_array_push(struct argv_array *, const char *);
 __attribute__((format (printf,2,3)))
-void argv_array_pushf(struct argv_array *, const char *fmt, ...);
+const char *argv_array_pushf(struct argv_array *, const char *fmt, ...);
 LAST_ARG_MUST_BE_NULL
 void argv_array_pushl(struct argv_array *, ...);
 void argv_array_pushv(struct argv_array *, const char **);
diff --git a/argv-array.c b/argv-array.c
index 5d370fa336..449dfc105a 100644
--- a/argv-array.c
+++ b/argv-array.c
@@ -21,12 +21,13 @@ static void argv_array_push_nodup(struct argv_array *array, 
const char *value)
array->argv[array->argc] = NULL;
 }
 
-void argv_array_push(struct argv_array *array, const char *value)
+const char *argv_array_push(struct argv_array *array, const char *value)
 {
argv_array_push_nodup(array, xstrdup(value));
+   return array->argv[array->argc - 1];
 }
 
-void argv_array_pushf(struct argv_array *array, const char *fmt, ...)
+const char *argv_array_pushf(struct argv_array *array, const char *fmt, ...)
 {
va_list ap;
struct strbuf v = STRBUF_INIT;
@@ -36,6 +37,7 @@ void argv_array_pushf(struct argv_array *array, const char 
*fmt, ...)
va_end(ap);
 
argv_array_push_nodup(array, strbuf_detach(, NULL));
+   return array->argv[array->argc - 1];
 }
 
 void argv_array_pushl(struct argv_array *array, ...)
-- 
2.17.0.840.g5d83f92caf



[PATCH v5 2/4] merge-recursive: provide pair of `unpack_trees_{start,finish}()`

2018-05-21 Thread Martin Ågren
From: Elijah Newren 

Rename `git_merge_trees()` to `unpack_trees_start()` and extract the
call to `discard_index()` into a new function `unpack_trees_finish()`.
As a result, these are called early resp. late in `merge_trees()`,
making the resource handling clearer. A later commit will expand on
that, teaching `..._finish()` to free more memory. (So rather than
moving the FIXME-comment, just drop it, since it will be addressed soon
enough.)

Also call `..._finish()` when `merge_trees()` returns early.

Signed-off-by: Elijah Newren 
Signed-off-by: Martin Ågren 
Signed-off-by: Junio C Hamano 
---
 merge-recursive.c | 29 +++--
 1 file changed, 15 insertions(+), 14 deletions(-)

diff --git a/merge-recursive.c b/merge-recursive.c
index 680e01226b..ddb0fa7369 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -337,10 +337,10 @@ static void init_tree_desc_from_tree(struct tree_desc 
*desc, struct tree *tree)
init_tree_desc(desc, tree->buffer, tree->size);
 }
 
-static int git_merge_trees(struct merge_options *o,
-  struct tree *common,
-  struct tree *head,
-  struct tree *merge)
+static int unpack_trees_start(struct merge_options *o,
+ struct tree *common,
+ struct tree *head,
+ struct tree *merge)
 {
int rc;
struct tree_desc t[3];
@@ -379,6 +379,11 @@ static int git_merge_trees(struct merge_options *o,
return rc;
 }
 
+static void unpack_trees_finish(struct merge_options *o)
+{
+   discard_index(>orig_index);
+}
+
 struct tree *write_tree_from_memory(struct merge_options *o)
 {
struct tree *result = NULL;
@@ -3088,13 +3093,14 @@ int merge_trees(struct merge_options *o,
return 1;
}
 
-   code = git_merge_trees(o, common, head, merge);
+   code = unpack_trees_start(o, common, head, merge);
 
if (code != 0) {
if (show(o, 4) || o->call_depth)
err(o, _("merging of trees %s and %s failed"),
oid_to_hex(>object.oid),
oid_to_hex(>object.oid));
+   unpack_trees_finish(o);
return -1;
}
 
@@ -3147,20 +3153,15 @@ int merge_trees(struct merge_options *o,
 
hashmap_free(>current_file_dir_set, 1);
 
-   if (clean < 0)
+   if (clean < 0) {
+   unpack_trees_finish(o);
return clean;
+   }
}
else
clean = 1;
 
-   /* Free the extra index left from git_merge_trees() */
-   /*
-* FIXME: Need to also free data allocated by
-* setup_unpack_trees_porcelain() tucked away in o->unpack_opts.msgs,
-* but the problem is that only half of it refers to dynamically
-* allocated data, while the other half points at static strings.
-*/
-   discard_index(>orig_index);
+   unpack_trees_finish(o);
 
if (o->call_depth && !(*result = write_tree_from_memory(o)))
return -1;
-- 
2.17.0.840.g5d83f92caf



[PATCH v5 4/4] unpack_trees_options: free messages when done

2018-05-21 Thread Martin Ågren
The strings allocated in `setup_unpack_trees_porcelain()` are never
freed. Provide a function `clear_unpack_trees_porcelain()` to do so and
call it where we use `setup_unpack_trees_porcelain()`. The only
non-trivial user is `unpack_trees_start()`, where we should place the
new call in `unpack_trees_finish()`.

We keep the string pointers in an array, mixing pointers to static
memory and memory that we allocate on the heap. We also keep several
copies of the individual pointers. So we need to make sure that we do
not free what we must not free and that we do not double-free. Let a
separate argv_array take ownership of all the strings we create so that
we can easily free them.

Zero the whole array of string pointers to make sure that we do not
leave any dangling pointers.

Note that we only take responsibility for the memory allocated in
`setup_unpack_trees_porcelain()` and not any other members of the
`struct unpack_trees_options`.

Helped-by: Junio C Hamano 
Signed-off-by: Junio C Hamano 
Signed-off-by: Martin Ågren 
---
 unpack-trees.h |  8 +++-
 builtin/checkout.c |  1 +
 merge-recursive.c  |  1 +
 merge.c|  3 +++
 unpack-trees.c | 17 ++---
 5 files changed, 26 insertions(+), 4 deletions(-)

diff --git a/unpack-trees.h b/unpack-trees.h
index 41178ada94..c2b434c606 100644
--- a/unpack-trees.h
+++ b/unpack-trees.h
@@ -2,7 +2,7 @@
 #define UNPACK_TREES_H
 
 #include "tree-walk.h"
-#include "string-list.h"
+#include "argv-array.h"
 
 #define MAX_UNPACK_TREES 8
 
@@ -33,6 +33,11 @@ enum unpack_trees_error_types {
 void setup_unpack_trees_porcelain(struct unpack_trees_options *opts,
  const char *cmd);
 
+/*
+ * Frees resources allocated by setup_unpack_trees_porcelain().
+ */
+void clear_unpack_trees_porcelain(struct unpack_trees_options *opts);
+
 struct unpack_trees_options {
unsigned int reset,
 merge,
@@ -57,6 +62,7 @@ struct unpack_trees_options {
struct pathspec *pathspec;
merge_fn_t fn;
const char *msgs[NB_UNPACK_TREES_ERROR_TYPES];
+   struct argv_array msgs_to_free;
/*
 * Store error messages in an array, each case
 * corresponding to a error message type
diff --git a/builtin/checkout.c b/builtin/checkout.c
index b49b582071..5cebe170fc 100644
--- a/builtin/checkout.c
+++ b/builtin/checkout.c
@@ -526,6 +526,7 @@ static int merge_working_tree(const struct checkout_opts 
*opts,
init_tree_desc([1], tree->buffer, tree->size);
 
ret = unpack_trees(2, trees, );
+   clear_unpack_trees_porcelain();
if (ret == -1) {
/*
 * Unpack couldn't do a trivial merge; either
diff --git a/merge-recursive.c b/merge-recursive.c
index ddb0fa7369..338f63a952 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -382,6 +382,7 @@ static int unpack_trees_start(struct merge_options *o,
 static void unpack_trees_finish(struct merge_options *o)
 {
discard_index(>orig_index);
+   clear_unpack_trees_porcelain(>unpack_opts);
 }
 
 struct tree *write_tree_from_memory(struct merge_options *o)
diff --git a/merge.c b/merge.c
index f123658e58..b433291d0c 100644
--- a/merge.c
+++ b/merge.c
@@ -130,8 +130,11 @@ int checkout_fast_forward(const struct object_id *head,
 
if (unpack_trees(nr_trees, t, )) {
rollback_lock_file(_file);
+   clear_unpack_trees_porcelain();
return -1;
}
+   clear_unpack_trees_porcelain();
+
if (write_locked_index(_index, _file, COMMIT_LOCK))
return error(_("unable to write new index file"));
return 0;
diff --git a/unpack-trees.c b/unpack-trees.c
index 79fd97074e..73a6dc1701 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -1,5 +1,6 @@
 #define NO_THE_INDEX_COMPATIBILITY_MACROS
 #include "cache.h"
+#include "argv-array.h"
 #include "repository.h"
 #include "config.h"
 #include "dir.h"
@@ -103,6 +104,8 @@ void setup_unpack_trees_porcelain(struct 
unpack_trees_options *opts,
const char **msgs = opts->msgs;
const char *msg;
 
+   argv_array_init(>msgs_to_free);
+
if (!strcmp(cmd, "checkout"))
msg = advice_commit_before_merge
  ? _("Your local changes to the following files would be 
overwritten by checkout:\n%%s"
@@ -119,7 +122,7 @@ void setup_unpack_trees_porcelain(struct 
unpack_trees_options *opts,
  "Please commit your changes or stash them before you 
%s.")
  : _("Your local changes to the following files would be 
overwritten by %s:\n%%s");
msgs[ERROR_WOULD_OVERWRITE] = msgs[ERROR_NOT_UPTODATE_FILE] =
-   xstrfmt(msg, cmd, cmd);
+   argv_array_pushf(>msgs_to_free, msg, cmd, cmd);
 
msgs[ERROR_NOT_UPTODATE_DIR] =
_("Updating the 

[PATCH v5 1/4] merge: setup `opts` later in `checkout_fast_forward()`

2018-05-21 Thread Martin Ågren
After we initialize the various fields in `opts` but before we actually
use them, we might return early. Move the initialization further down,
to immediately before we use `opts`.

This limits the scope of `opts` and will help a later commit fix a
memory leak without having to worry about those early returns.

This patch is best viewed using something like this (note the tab!):
--color-moved --anchored="  trees[nr_trees] = parse_tree_indirect"

Signed-off-by: Martin Ågren 
Signed-off-by: Junio C Hamano 
---
 merge.c | 32 +---
 1 file changed, 17 insertions(+), 15 deletions(-)

diff --git a/merge.c b/merge.c
index f06a4773d4..f123658e58 100644
--- a/merge.c
+++ b/merge.c
@@ -94,8 +94,24 @@ int checkout_fast_forward(const struct object_id *head,
return -1;
 
memset(, 0, sizeof(trees));
-   memset(, 0, sizeof(opts));
memset(, 0, sizeof(t));
+
+   trees[nr_trees] = parse_tree_indirect(head);
+   if (!trees[nr_trees++]) {
+   rollback_lock_file(_file);
+   return -1;
+   }
+   trees[nr_trees] = parse_tree_indirect(remote);
+   if (!trees[nr_trees++]) {
+   rollback_lock_file(_file);
+   return -1;
+   }
+   for (i = 0; i < nr_trees; i++) {
+   parse_tree(trees[i]);
+   init_tree_desc(t+i, trees[i]->buffer, trees[i]->size);
+   }
+
+   memset(, 0, sizeof(opts));
if (overwrite_ignore) {
memset(, 0, sizeof(dir));
dir.flags |= DIR_SHOW_IGNORED;
@@ -112,20 +128,6 @@ int checkout_fast_forward(const struct object_id *head,
opts.fn = twoway_merge;
setup_unpack_trees_porcelain(, "merge");
 
-   trees[nr_trees] = parse_tree_indirect(head);
-   if (!trees[nr_trees++]) {
-   rollback_lock_file(_file);
-   return -1;
-   }
-   trees[nr_trees] = parse_tree_indirect(remote);
-   if (!trees[nr_trees++]) {
-   rollback_lock_file(_file);
-   return -1;
-   }
-   for (i = 0; i < nr_trees; i++) {
-   parse_tree(trees[i]);
-   init_tree_desc(t+i, trees[i]->buffer, trees[i]->size);
-   }
if (unpack_trees(nr_trees, t, )) {
rollback_lock_file(_file);
return -1;
-- 
2.17.0.840.g5d83f92caf



[PATCH v5 0/4] unpack_trees_options: free messages when done

2018-05-21 Thread Martin Ågren
On 21 May 2018 at 02:25, Junio C Hamano  wrote:
> Junio C Hamano  writes:
>
>> I have a feeling that argv_array might be a better fit for the
>> purpose of keeping track of to_free[] strings in the context of this
>> series.  Moving away from string_list would allow us to sidestep the
>> storage ownership issues the API has, and we do not need the .util
>> thing string_list gives us (which is one distinct advantage string_list
>> has over argv_array, if the application needs that feature).
>>
>> We would need to make _pushf() and friends return "const char *" if
>> we go that route to make the resulting API more useful, though.
>
> ... and redoing the 4/4 patch using argv_array_pushf() makes the
> result look like this, which does not look too bad.

Thanks to Jacob, Junio and Peff for comments on the previous iteration.

I've taken the six patches that Junio has queued and rebuilt the series
to get rid of the new and possibly bug-prone function that no-one uses
once the series is over.

That is, I've replaced the `string_list_appendf()`-patch with Junio's
`argv_push*()`-patch, then squashed Junio's "redoing the 4/4"-patch into
patch 4/4 -- with the exception of keeping the `memset(opts->msgs, ...)`
which I suspect was mistakenly dropped.

Again, thanks for all the helpful comments and patches pointing me in
the right direction.

Martin

Elijah Newren (1):
  merge-recursive: provide pair of `unpack_trees_{start,finish}()`

Junio C Hamano (1):
  argv-array: return the pushed string from argv_push*()

Martin Ågren (2):
  merge: setup `opts` later in `checkout_fast_forward()`
  unpack_trees_options: free messages when done

 argv-array.h   |  4 ++--
 unpack-trees.h |  8 +++-
 argv-array.c   |  6 --
 builtin/checkout.c |  1 +
 merge-recursive.c  | 30 --
 merge.c| 35 ---
 unpack-trees.c | 17 ++---
 7 files changed, 64 insertions(+), 37 deletions(-)

-- 
2.17.0.840.g5d83f92caf



Re: [PATCH 4/5] merge-recursive: rename conflict_rename_*() family of functions

2018-05-21 Thread Johannes Schindelin
Hi Elijah,


On Fri, 18 May 2018, Elijah Newren wrote:

> These functions were added because processing of these conflicts needed
> to be deferred until process_entry() in order to get D/F conflicts and
> such right.  The number of these has grown over time, and now include
> some whose name is misleading:
>   * conflict_rename_normal() is for handling normal file renames; a
> typical rename may need content merging, but we expect conflicts
> from that to be more the exception than the rule.
>   * conflict_rename_via_dir() will not be a conflict; it was just an
> add that turned into a move due to directory rename detection.
> (If there was a file in the way of the move, that would have been
> detected and reported earlier.)
>   * conflict_rename_rename_2to1 and conflict_rename_add (the latter
> of which doesn't exist yet but has been submitted before and I
> intend to resend) technically might not be conflicts if the
> colliding paths happen to match exactly.
> Rename this family of functions to handle_rename_*().
> 
> Also rename handle_renames() to detect_and_process_renames() both to make
> it clearer what it does, and to differentiate it as a pre-processing step
> from all the handle_rename_*() functions which are called from
> process_entry().
> 
> Signed-off-by: Elijah Newren 

Makes sense, and the patch looks obviously correct to me.

Thanks,
Dscho


Re: [PATCH 3/5] merge-recursive: clarify the rename_dir/RENAME_DIR meaning

2018-05-21 Thread Johannes Schindelin
Hi Elijah,

On Fri, 18 May 2018, Elijah Newren wrote:

> We had an enum of rename types which included RENAME_DIR; this name felt
> misleading since it was not about an entire directory but was a status for
> each individual file add that occurred within a renamed directory.  Since
> this type is for signifying that the files in question were being renamed
> due to directory rename detection, rename this enum value to
> RENAME_VIA_DIR.  Make a similar change to the conflict_rename_dir()
> function, and add a comment to the top of that function explaining its
> purpose (it may not be quite as obvious as for the other
> conflict_rename_*() functions).
> 
> Signed-off-by: Elijah Newren 
> ---

Make sense. I think the reading flow could be improved a little bit by
splitting this paragraph into three.

> diff --git a/merge-recursive.c b/merge-recursive.c
> index 01306c87eb..d30085d9c7 100644
> --- a/merge-recursive.c
> +++ b/merge-recursive.c
> -static int conflict_rename_dir(struct merge_options *o,
> -struct diff_filepair *pair,
> -const char *rename_branch,
> -const char *other_branch)
> +static int conflict_rename_via_dir(struct merge_options *o,
> +struct diff_filepair *pair,
> +const char *rename_branch,
> +const char *other_branch)
>  {
> + /*
> +  * Handle file adds that need to be renamed due to directory rename
> +  * detection.  This differs from handle_rename_normal, because
> +  * there is no content merge to do; just move the path into the

Technically, we do not move the path, but the file.

The rest of the diff looks obviously correct to me.

Thanks,
Dscho


RE: which files are "known to git"?

2018-05-21 Thread Randall S. Becker
On May 21, 2018 7:19 AM, Robert P. J. Day:
>   updating my git courseware and, since some man pages refer to files
> "known to git", i just want to make sure i understand precisely which
files
> those are. AIUI, they would include:
> 
>   * tracked files
>   * ignored files
>   * new files which have been staged but not yet committed

You might want to consider git metadata/config/attribute files, hooks,
filters, etc., that may not be not formally part of a repository, but can be
required to ensure the content is complete.

Cheers,
Randall

-- Brief whoami:
 NonStop developer since approximately 2112884442
 UNIX developer since approximately 421664400
-- In my real life, I talk too much.





Re: [PATCH 2/5] merge-recursive: fix numerous argument alignment issues

2018-05-21 Thread Johannes Schindelin
Hi Elijah,

On Fri, 18 May 2018, Elijah Newren wrote:

> Various refactorings throughout the code have left lots of alignment
> issues that were driving me crazy; fix them.

I hope you did not do that manually. What is your code formatting tool of
choice?

The patch looks obviously good to me.

Ciao,
Dscho


Re: [PATCH 5/5] merge-recursive: simplify handle_change_delete

2018-05-21 Thread Johannes Schindelin
Hi Elijah,

On Sat, 19 May 2018, Elijah Newren wrote:

> On Sat, May 19, 2018 at 12:32 AM, Johannes Sixt  wrote:
> > Am 19.05.2018 um 04:07 schrieb Elijah Newren:
> >>
> >> There is really no need for four branches of nearly identical messages
> >> when we can store the differences into small variables before printing.
> >
> > Oh, there is a reason for the repeated message text: translations! Please do
> > not play sentence Lego with translated strings. The original code is
> > preferable.
> 
> Ah, translations; that makes sense now.  I'm still annoyed by the
> code, but I retract patch 5 then.

Maybe you can remove the temptation for others, too, my replacing your
patch 5 with one that adds the code comment "No sentence Lego! Think of
our poor translators and refrain from making their life miserable!" or
some more appropriate one?

Ciao,
Dscho


Re: [PATCH v4 00/28] Hash-independent tests

2018-05-21 Thread Ben Peart



On 5/20/2018 10:01 PM, brian m. carlson wrote:

This is part 2 in the series to make tests hash independent.



Do you plan to update those tests that hard code the SHA of the index 
file itself?  For example, t1700-split-index.sh has hard coded values 
for the SHA and currently only supports different hard coded values for 
V4 indexes vs others.


Re: [PATCH] t: make many tests depend less on the refs being files

2018-05-21 Thread Johannes Schindelin
Hi,

of course I messed up and did not fix the Cc: list... now fixed ;-)


On Mon, 21 May 2018, Johannes Schindelin wrote:

> Hi Chris,
> 
> On Mon, 21 May 2018, Christian Couder wrote:
> 
> > From: David Turner 
> 
> I vaguely remember that Dave suggested using a different email address
> these days...
> 
> *clicketyclick*
> 
> ah, yes:
> https://public-inbox.org/git/1474935093-26757-3-git-send-email-dtur...@twosigma.com/
> 
> So maybe update it here, too, to 
> 
>   From: David Turner 
> 
> ?
> 
> > So that they work under alternate ref storage backends.
> > 
> > This will be really needed when such alternate ref storage backends are
> > developed. But this could already help by making clear to readers that
> > some tests do not depend on which ref backend is used.
> > 
> > This patch just takes care of many low hanging fruits. It does not try
> > to completely solves the issue.
> > 
> > Signed-off-by: David Turner 
> > Signed-off-by: Christian Couder 
> > ---
> 
> This is great. I am all for making the tests better ;-)
> 
> > diff --git a/t/lib-t6000.sh b/t/lib-t6000.sh
> > index 3f2d873fec..b8567cdf94 100644
> > --- a/t/lib-t6000.sh
> > +++ b/t/lib-t6000.sh
> > @@ -4,11 +4,10 @@ mkdir -p .git/refs/tags
> >  
> >  >sed.script
> >  
> > -# Answer the sha1 has associated with the tag. The tag must exist in 
> > .git/refs/tags
> > +# Answer the sha1 has associated with the tag. The tag must exist under 
> > refs/tags
> >  tag () {
> > _tag=$1
> > -   test -f ".git/refs/tags/$_tag" || error "tag: \"$_tag\" does not exist"
> > -   cat ".git/refs/tags/$_tag"
> > +   git rev-parse --verify "refs/tags/$_tag" || error "tag: \"$_tag\" does 
> > not exist"
> 
> Line longer than 80 columns...
> 
> > diff --git a/t/t5500-fetch-pack.sh b/t/t5500-fetch-pack.sh
> > index 0680dec808..886a9e3b72 100755
> > --- a/t/t5500-fetch-pack.sh
> > +++ b/t/t5500-fetch-pack.sh
> > @@ -30,7 +30,7 @@ add () {
> > test_tick &&
> > commit=$(echo "$text" | git commit-tree $tree $parents) &&
> > eval "$name=$commit; export $name" &&
> > -   echo $commit > .git/refs/heads/$branch &&
> > +   git update-ref refs/heads/$branch $commit &&
> 
> I think we have to be careful here to quote both "refs/heads/$branch" and
> "$commit" here, as a bug that introduces spaces into $commit or $branch
> would have been caught earlier, but not now, unless we quote.
> 
> This goes for all introduced `update-ref` calls.
> 
> Maybe even for some `git rev-parse --verify` calls.
> 
> > diff --git a/t/t9903-bash-prompt.sh b/t/t9903-bash-prompt.sh
> > index 8f5c811dd7..c3b89ae783 100755
> > --- a/t/t9903-bash-prompt.sh
> > +++ b/t/t9903-bash-prompt.sh
> > @@ -148,7 +148,7 @@ test_expect_success 'prompt - inside .git directory' '
> >  test_expect_success 'prompt - deep inside .git directory' '
> > printf " (GIT_DIR!)" >expected &&
> > (
> > -   cd .git/refs/heads &&
> > +   cd .git/objects &&
> > __git_ps1 >"$actual"
> > ) &&
> > test_cmp expected "$actual"
> > -- 
> 
> This one looks wrong.
> 
> Upon cursory review, one might be tempted to assume that the file is now
> written into .git/objects/ instead of .git/refs/heads/. And the patch
> context provided in the email is not enough to see (gawd, I hate
> mail-based patch review, it really takes all my Git tools away from me).
> The trick is that `actual` points at an absolute path:
> 
>   #!/bin/sh
>   #
>   # Copyright (c) 2012 SZEDER Gábor
>   #
> 
>   test_description='test git-specific bash prompt functions'
> 
>   . ./lib-bash.sh
> 
>   . "$GIT_BUILD_DIR/contrib/completion/git-prompt.sh"
> 
>   actual="$TRASH_DIRECTORY/actual"
>   [...]
> 
> So the remaining question (which probably wants to be added to the commit
> message together with a hint that `actual` points at an absolute path) is:
> Why not `cd .git` instead?
> 
> Ciao,
> Dscho

Re: [PATCH 3/5] query_fsmonitor: use xsnprintf for formatting integers

2018-05-21 Thread Ben Peart



On 5/19/2018 4:27 AM, René Scharfe wrote:

Am 19.05.2018 um 03:57 schrieb Jeff King:

These formatted integers should always fit into their
64-byte buffers. Let's use xsnprintf() to add an assertion
that this is the case, which makes auditing for other
unchecked snprintfs() easier.


How about this instead?

-- >8 --
  
-	snprintf(ver, sizeof(ver), "%d", version);

-   snprintf(date, sizeof(date), "%" PRIuMAX, (uintmax_t)last_update);
-   argv[1] = ver;
-   argv[2] = date;
-   argv[3] = NULL;
-   cp.argv = argv;
+   argv_array_push(, core_fsmonitor);
+   argv_array_pushf(, "%d", version);
+   argv_array_pushf(, "%" PRIuMAX, (uintmax_t)last_update);


Looks good.  Simpler, cleaner, less error prone.


cp.use_shell = 1;
cp.dir = get_git_work_tree();
  



Re: [PATCH] t: make many tests depend less on the refs being files

2018-05-21 Thread Christian Couder
Hi Dscho,

On Mon, May 21, 2018 at 11:41 AM, Johannes Schindelin
 wrote:
>
> On Mon, 21 May 2018, Christian Couder wrote:
>
>> From: David Turner 
>
> I vaguely remember that Dave suggested using a different email address
> these days...
>
> *clicketyclick*
>
> ah, yes:
> https://public-inbox.org/git/1474935093-26757-3-git-send-email-dtur...@twosigma.com/

Yeah, I actually used "David Turner " in the --cc
option I gave to `git send-email`...

> So maybe update it here, too, to
>
> From: David Turner 

...but I thought it was better to keep the original author and
Signed-off-by as they are in the original commit:

https://github.com/dturner-tw/git/commit/0a3fa7fbd7f99280b5f128be3e1ed04e045da671

Now I am ok to update them if it is preferred.

>> So that they work under alternate ref storage backends.
>>
>> This will be really needed when such alternate ref storage backends are
>> developed. But this could already help by making clear to readers that
>> some tests do not depend on which ref backend is used.
>>
>> This patch just takes care of many low hanging fruits. It does not try
>> to completely solves the issue.
>>
>> Signed-off-by: David Turner 
>> Signed-off-by: Christian Couder 
>> ---
>
> This is great. I am all for making the tests better ;-)

;-)

>> diff --git a/t/lib-t6000.sh b/t/lib-t6000.sh
>> index 3f2d873fec..b8567cdf94 100644
>> --- a/t/lib-t6000.sh
>> +++ b/t/lib-t6000.sh
>> @@ -4,11 +4,10 @@ mkdir -p .git/refs/tags
>>
>>  >sed.script
>>
>> -# Answer the sha1 has associated with the tag. The tag must exist in 
>> .git/refs/tags
>> +# Answer the sha1 has associated with the tag. The tag must exist under 
>> refs/tags
>>  tag () {
>>   _tag=$1
>> - test -f ".git/refs/tags/$_tag" || error "tag: \"$_tag\" does not exist"
>> - cat ".git/refs/tags/$_tag"
>> + git rev-parse --verify "refs/tags/$_tag" || error "tag: \"$_tag\" does 
>> not exist"
>
> Line longer than 80 columns...

Ok, the 'error "..."' will be on another line in the next version.

>> diff --git a/t/t5500-fetch-pack.sh b/t/t5500-fetch-pack.sh
>> index 0680dec808..886a9e3b72 100755
>> --- a/t/t5500-fetch-pack.sh
>> +++ b/t/t5500-fetch-pack.sh
>> @@ -30,7 +30,7 @@ add () {
>>   test_tick &&
>>   commit=$(echo "$text" | git commit-tree $tree $parents) &&
>>   eval "$name=$commit; export $name" &&
>> - echo $commit > .git/refs/heads/$branch &&
>> + git update-ref refs/heads/$branch $commit &&
>
> I think we have to be careful here to quote both "refs/heads/$branch" and
> "$commit" here, as a bug that introduces spaces into $commit or $branch
> would have been caught earlier, but not now, unless we quote.
>
> This goes for all introduced `update-ref` calls.

Ok, they will all have quoted arguments in the next version.

> Maybe even for some `git rev-parse --verify` calls.

Ok, I will take a look at that.

>> diff --git a/t/t9903-bash-prompt.sh b/t/t9903-bash-prompt.sh
>> index 8f5c811dd7..c3b89ae783 100755
>> --- a/t/t9903-bash-prompt.sh
>> +++ b/t/t9903-bash-prompt.sh
>> @@ -148,7 +148,7 @@ test_expect_success 'prompt - inside .git directory' '
>>  test_expect_success 'prompt - deep inside .git directory' '
>>   printf " (GIT_DIR!)" >expected &&
>>   (
>> - cd .git/refs/heads &&
>> + cd .git/objects &&
>>   __git_ps1 >"$actual"
>>   ) &&
>>   test_cmp expected "$actual"
>> --
>
> This one looks wrong.
>
> Upon cursory review, one might be tempted to assume that the file is now
> written into .git/objects/ instead of .git/refs/heads/. And the patch
> context provided in the email is not enough to see (gawd, I hate
> mail-based patch review, it really takes all my Git tools away from me).
> The trick is that `actual` points at an absolute path:
>
> #!/bin/sh
> #
> # Copyright (c) 2012 SZEDER Gábor
> #
>
> test_description='test git-specific bash prompt functions'
>
> . ./lib-bash.sh
>
> . "$GIT_BUILD_DIR/contrib/completion/git-prompt.sh"
>
> actual="$TRASH_DIRECTORY/actual"
> [...]
>
> So the remaining question (which probably wants to be added to the commit
> message together with a hint that `actual` points at an absolute path) is:
> Why not `cd .git` instead?

I think anywhere inside ".git/" should work, so I guess
".git/refs/heads" was chosen to make sure that adding anything after
".git/" does not change the result.

I think I can drop this for now and change it later in its own commit
with related explanations.

Thanks,
Christian.


Re: [PATCH 0/2] Test improvements for 'sg/complete-paths'

2018-05-21 Thread Johannes Schindelin
Hi Junio,

On Mon, 21 May 2018, Johannes Schindelin wrote:

> On Fri, 18 May 2018, SZEDER Gábor wrote:
> 
> > > > So, I think for v2 I will rewrite these tests to call
> > > > __git_complete_index_file() directly instead of using
> > > > 'test_completion', and will include a test with spaces in path
> > > > names.
> > > 
> > > Quite well thought-out reasoning.  Thanks.
> > 
> > Unfortunately I couldn't get around to it soon enough, and now the topic
> > 'sg/complete-paths' is already in next, so here are those test
> > improvements on top.
> 
> I can verify that the weeks-long breakage of `pu` on Windows has been
> addressed, probably by this patch series.

Please note that as the branch that is fixed by these two patches was
already merged down to `next`, the Continuous Testing on Windows reports
that `next` is broken.

(I saw that breakage for over a week, but was too busy elsewhere to act on
it.)

It would be really lovely to see it fixed soon, so that other bugs cannot
be hidden by that breakage.

And I would also love to see sg/complete-paths to be merged down to
`master` *only* in conjunction with these two patches, not on its own
(because that would break the Continuous Testing on Windows of `master`,
which is something I really want us to avoid).

Thanks,
Dscho

Re: [PATCH 2/2] t9902-completion: exercise __git_complete_index_file() directly

2018-05-21 Thread Johannes Schindelin
Hi Gábor,

On Fri, 18 May 2018, SZEDER Gábor wrote:

> The tests added in 2f271cd9cf (t9902-completion: add tests
> demonstrating issues with quoted pathnames, 2018-05-08) and in
> 2ab6eab4fe (completion: improve handling quoted paths in 'git
> ls-files's output, 2018-03-28) have a few shortcomings:
> 
>   - All these test use the 'test_completion' helper function, thus
> they are exercising the whole completion machinery, although they
> are only interested in how git-aware path completion, specifically
> the __git_complete_index_file() function deals with unusual
> characters in pathnames and on the command line.
> 
>   - These tests can't satisfactorily test the case of pathnames
> containing spaces, because 'test_completion' gets the words on the
> command line as a single argument and it uses space as word
> separator.
> 
>   - Some of the tests are protected by different FUNNYNAMES_* prereqs
> depending on whether they put backslashes and double quotes or
> separator characters (FS, GS, RS, US) in pathnames, although a
> filesystem not allowing one likely doesn't allow the others
> either.
> 
>   - One of the tests operates on paths containing '|' and '&'
> characters without being protected by a FUNNYNAMES prereq, but
> some filesystems (notably on Windows) don't allow these characters
> in pathnames, either.
> 
> Replace these tests with basically equivalent, more focused tests that
> call __git_complete_index_file() directly.  Since this function only
> looks at the current word to be completed, i.e. the $cur variable, we
> can easily include pathnames containing spaces in the tests, so use
> such pathnames instead of pathnames containing '|' and '&'.  Finally,
> use only a single FUNNYNAMES prereq for all kinds of special
> characters.

Makes sense.

> diff --git a/t/t9902-completion.sh b/t/t9902-completion.sh
> index 955932174c..1b6d275254 100755
> --- a/t/t9902-completion.sh
> +++ b/t/t9902-completion.sh
> @@ -1209,6 +1209,124 @@ test_expect_success 'teardown after ref completion' '
>   git remote remove other
>  '
>  
> +
> +test_path_completion ()
> +{
> + test $# = 2 || error "bug in the test script: not 2 parameters to 
> test_path_completion"

Maybe shorten this to

test $# = 2 || error "BUG: test_path_completion requires 2 parameters"

in order to keep to the 80 columns/line limit?

> +
> + local cur="$1" expected="$2"

I thought `local` was a Bash-ism we tried to avoid in the test scripts.
But I guess this file is already littered with `local` keywords...

> + echo "$expected" >expected &&
> + (
> + # In the following tests calling this function we only
> + # care about how __git_complete_index_file() deals with
> + # unusual characters in path names.  By requesting only
> + # untracked files we dont have to bother adding any
> + # paths to the index in those tests.
> + __git_complete_index_file --others &&
> + print_comp
> + ) &&
> + test_cmp expected out
> +}
> +
> +test_expect_success 'setup for path completion tests' '
> + mkdir simple-dir \
> +   "spaces in dir" \
> +   árvíztűrő &&
> + touch simple-dir/simple-file \
> +   "spaces in dir/spaces in file" \
> +   "árvíztűrő/Сайн яваарай" &&
> + if test_have_prereq !MINGW &&
> +mkdir BS\\dir \
> +  '$'separators\034in\035dir'' &&
> +touch BS\\dir/DQ\"file \
> +  '$'separators\034in\035dir/sep\036in\037file''
> + then
> + test_set_prereq FUNNYNAMES
> + else
> + rm -rf BS\\dir '$'separators\034in\035dir''
> + fi
> +'
> +
> +test_expect_success '__git_complete_index_file - simple' '
> + test_path_completion simple simple-dir &&  # Bash is supposed to
> +# add the trailing /.
> + test_path_completion simple-dir/simple simple-dir/simple-file
> +'
> +
> +test_expect_success \
> +'__git_complete_index_file - escaped characters on cmdline' '
> + test_path_completion spac "spaces in dir" &&  # Bash will turn this
> +   # into "spaces\ in\ dir"
> + test_path_completion "spaces\\ i" \
> +  "spaces in dir" &&
> + test_path_completion "spaces\\ in\\ dir/s" \
> +  "spaces in dir/spaces in file" &&
> + test_path_completion "spaces\\ in\\ dir/spaces\\ i" \
> +  "spaces in dir/spaces in file"
> +'
> +
> +test_expect_success \
> +'__git_complete_index_file - quoted characters on cmdline' '
> + # Testing with an opening but without a corresponding closing
> + # double quote is important.
> + test_path_completion \"spac "spaces in dir" &&
> + test_path_completion "\"spaces i" \
> +  "spaces in dir" &&
> + 

Re: [PATCH] t: make many tests depend less on the refs being files

2018-05-21 Thread SZEDER Gábor
> > diff --git a/t/t9903-bash-prompt.sh b/t/t9903-bash-prompt.sh
> > index 8f5c811dd7..c3b89ae783 100755
> > --- a/t/t9903-bash-prompt.sh
> > +++ b/t/t9903-bash-prompt.sh
> > @@ -148,7 +148,7 @@ test_expect_success 'prompt - inside .git directory' '
> >  test_expect_success 'prompt - deep inside .git directory' '
> > printf " (GIT_DIR!)" >expected &&
> > (
> > -   cd .git/refs/heads &&
> > +   cd .git/objects &&
> > __git_ps1 >"$actual"
> > ) &&
> > test_cmp expected "$actual"
> 
> This one looks wrong.

It's right, though.

> Why not `cd .git` instead?

That case is covered in the previous test.

Once upon a time it mattered how deep we were in the .git directory
when running __git_ps1(), because it executed different branches of a
long-ish if-elif chain.  For the prompt it doesn't matter anymore,
because those ifs were eliminated, but for the completion script it
still does, which brings us to:

Christian!  There is a similar test, '__git_find_repo_path - parent is
a .git directory' in 't9902-completion.sh', which, too, performs a 'cd
.git/refs/heads'.



Re: [PATCH 0/2] Test improvements for 'sg/complete-paths'

2018-05-21 Thread Johannes Schindelin
Hi Gábor,

On Fri, 18 May 2018, SZEDER Gábor wrote:

> > > So, I think for v2 I will rewrite these tests to call
> > > __git_complete_index_file() directly instead of using
> > > 'test_completion', and will include a test with spaces in path
> > > names.
> > 
> > Quite well thought-out reasoning.  Thanks.
> 
> Unfortunately I couldn't get around to it soon enough, and now the topic
> 'sg/complete-paths' is already in next, so here are those test
> improvements on top.

I can verify that the weeks-long breakage of `pu` on Windows has been
addressed, probably by this patch series.

Ciao,
Dscho

which files are "known to git"?

2018-05-21 Thread Robert P. J. Day

  updating my git courseware and, since some man pages refer to files
"known to git", i just want to make sure i understand precisely which
files those are. AIUI, they would include:

  * tracked files
  * ignored files
  * new files which have been staged but not yet committed

is that it? are there others?

rday


Re: [PATCH v2 02/18] Add a new builtin: branch-diff

2018-05-21 Thread Johannes Schindelin
Hi Junio,

On Tue, 8 May 2018, Junio C Hamano wrote:

> Johannes Schindelin  writes:
> 
> > It would be easy to introduce, but I am wary about its usefulness.
> > Unless you re-generate the branch from patches (which I guess you do a
> > lot, but I don't), you are likely to compare incomplete patch series: say,
> > when you call `git rebase -i` to reword 05/18's commit message, your
> > command will only compare 05--18 of the patch series.
> 
> Well that is exactly the point of that "..@{1} @{1}..", which turned
> out to be very useful in practice at least for me when I am updating
> a topic with "rebase -i", and then reviewing what I did with tbdiff.
> 
> I do not want 01-04 in the above case as I already know I did not
> touch them.

And you are a seasoned veteran maintainer.

To the occasional contributor, this information is not obvious, and it is
not stored in their brain. It needs to be made explicit, which is why this
here command outputs those `abcdef = 012345` lines: it lists all the
commits, stating which ones are unchanged. In your 01-04 example, those
lines would be of the form `abcdef = abcdef`, of course.

Ciao,
Dscho


Re: [PATCH v2 02/18] Add a new builtin: branch-diff

2018-05-21 Thread Johannes Schindelin
Hi Buga,

On Mon, 7 May 2018, Igor Djordjevic wrote:

> On 07/05/2018 09:48, Jeff King wrote:
> > 
> > > > Let's, please, not fall into the trap of polluting git-branch with
> > > > utterly unrelated functionality, as has happened a few times with
> > > > other Git commands. Let's especially not do so merely for the sake of
> > > > tab-completion. git-branch is for branch management; it's not for
> > > > diff'ing.
> > >
> > > I totally disagree. `git branch` is *the* command to work with branches.
> > > Yes, you can manage branches. But you can also list them. And now you can
> > > also compare them.
> > 
> > One of the things I don't like about "git branch --diff" is that this
> > feature is not _just_ about branches at all. E.g., I could do:
> > 
> >   git tbdiff HEAD~10 HEAD~5 foo
> > 
> > Or even:
> > 
> >   git tbdiff v2.16.0 v2.17.0 my-rewritten-v2.17.0
> > 
> > Those arguments really are just commitishes, not necessarily branches.
> > One of the current interface rules for "git branch" is that the branch
> > names we hand it are interpreted _exactly_ as branch names. You cannot
> > "git branch -m v2.16.0", and there is no ambiguity in "git branch -d
> > foo" if "foo" is both a tag and a branch.
> > 
> > But this new mode does not fit the pattern at all.
> > 
> > If we were to attach this to an existing command, I think it has more to
> > do with "diff" than "branch". But I'm not sure we want to overload
> > "diff" either (which has traditionally been about two endpoints, and
> > does not really traverse at all, though arguably "foo...bar" is a bit of
> > a cheat :) ).
> > 
> > > > Of the suggestions thus far, Junio's git-topic-diff seems the least
> > > > worse, and doesn't suffer from tab-completion problems.
> > >
> > > Except that this is too limited a view.
> > 
> > Right, I agree with you. Topic branches are the intended use, but that's
> > not what it _does_, and obviously it can be applied in other cases. So
> > since "branch" is too specific, I think "topic branch" is even more so.
> > 
> > It's really "diff-history" or something, I think. That's not very
> > catchy, but I think the best name would imply that it was diffing a set
> > of commits (so even "diff-commit" would not be right, because that again
> > sounds like endpoints).
> 
> This is exactly what I feel as well, thanks for concise and 
> to-the-point spelling out.
> 
> From user interface perspective, I would expect something like this 
> to be possible (and natural):
> 
> (1) git diff topic-v1...topic-v2

No, we cannot. The `git diff topic-v1...topic-v2` invocation has worked
for a long time, and does something very different.

We should not even allow ourselves to think of such a breakage.

> (2) git diff --branch topic-v1...topic-v2

>From my point of view, `git diff --branch` indicates that I diff
*branches*. Which is not really something that makes sense, and definitely
not what this command is about.

We are not comparing branches.

We are comparing versions of the same branch.

> (1) is what we are all familiar with, providing a diff between two 
> revisions with focus on file changes, where (2) shifts focus to 
> history changes.
> 
> It`s all still a comparison between two revisions (pointed to by 
> "topic-v1" and "topic-v2" branch heads in this specific example), but 
> it differs in what we are comparing - (1) set of files contained in 
> endpoints, or (2) set of revisions contained in (or "leading to") 
> endpoints.

It is very much not about comparing *two* revisions. It is very much about
comparing two *ranges of* revisions, and not just any ranges, no. Those
ranges need to be so related as to contain mostly identical changes.
Otherwise, `git branch --diff` will spend a ton of time, just to come back
with a series of `-` lines followed by a series of `+` lines
(figuratively, not literally). Which would be stupid, to spend that much
time on something that `git rev-list --left-right topic1...topic2` would
have computed a lot faster.

> Hmm... what about `git diff --history`? :/ It does seem more "true" 
> to what it does, though I still like `git diff --branch` more 
> (catchier, indeed).

It certainly is catchier. But also a ton more puzzling.

I do not want to compare histories, after all. That would be like saying:
okay, topic1 and topic2 ended up at the same stage, but *how* did they
get there?

What I *want* to ask via the command implemented by this patch series is
the question: there was a set of patches previously, and now I have a set
of revised patches, what changed?

Most fellow German software engineers (who seem to have a knack for
idiotically long variable/function names) would now probably suggest:

git compare-patch-series-with-revised-patch-series

I hope you agree that that is better *and* worse than your suggestions,
depending from what angle you look at it: it is better because it
describes what the command is *actually* doing. But it is much worse at
the same time because it is too 

issues(?) installing git-lfs via fedora "dnf" command

2018-05-21 Thread Robert P. J. Day

  first, i realize this is almost entirely a fedora packaging issue
and i'm planning on reporting it there, but i just want to make sure
i'm not doing something stupid.

  knowing nothing about the git-lfs package, i thought i would install
it and just poke around out of sheer curiosity, so on my fully-updated
fedora 28 system, i ran:

  $ sudo dnf install git-lfs

which did in fact install that package, but also generated (abridged
version):

  ... snip ...
  Running transaction
Preparing:
Installing   : git-lfs-2.4.0-1.fc28.x86_64
Running scriptlet: git-lfs-2.4.0-1.fc28.x86_64
  Error: Failed to call git rev-parse --git-dir --show-toplevel: "fatal:
  not a git repository (or any of the parent directories): .git\n"

  Git LFS initialized.
Verifying: git-lfs-2.4.0-1.fc28.x86_64

  Installed:
git-lfs.x86_64 2.4.0-1.fc28

is one supposed to be *in* a git repository when installing, because i
was in fact at the top level of my linux kernel source repo, so i'm
unclear on what that "Error" is trying to tell me. am i just being
clueless? is this something that i should submit as a fedora packaging
issue?

rday


Opt-in MSPs Contact List

2018-05-21 Thread cheryl . goode

"Hi there,

Hope this mail finds you well.

I am writing this email to inform you that we have fresh updated Managed  
Service Providers-(MSPs) contacts to increase your 2018 sales.


You can use these database for direct mailing campaigns, email campaigns,  
cold calling, and to share newsletter.


List Contains: Names, Title, Email, Phone, Company Name, Company URL,  
Company physical address, Industry, website, Company Size, Revenue and  
Employee Etc.


• All company sizes available.
• Almost 17k+ technology/vendor users available.
• Very good coverage from US, EMEA, APAC and even LATAM.
• Customization as per 8-10 filters.

Please feel free to get back to me with your target criteria, if your  
criteria is different from the above mentioned applications let me know we  
have close to 17k+ technology installations.


Best Regards,
Cheryl Goode
Marketing Analyst

Note: This email is not expected to be a spam. It would be ideal if you  
acknowledge our conciliatory sentiments and answer in the subject taking  
with leave off to be expelled from our Mailing list. Why not try it/us out?

"


Re: [PATCH 01/18] Add a function to solve least-cost assignment problems

2018-05-21 Thread Johannes Schindelin
Hi Duy,

On Sun, 13 May 2018, Duy Nguyen wrote:

> On Thu, May 3, 2018 at 5:30 PM, Johannes Schindelin
>  wrote:
> > +   /* reduction transfer */
> > +   free_row = xmalloc(sizeof(int) * row_count);
> > +   for (int i = 0; i < row_count; i++) {
> 
> travis complains about this
> 
> 
> hungarian.c: In function ‘compute_assignment’:
> hungarian.c:47:11: error: redeclaration of ‘i’ with no linkage
>   for (int i = 0; i < row_count; i++) {
>^
> hungarian.c:21:6: note: previous declaration of ‘i’ was here
>   int i, j, phase;
>   ^
> hungarian.c:47:2: error: ‘for’ loop initial declarations are only
> allowed in C99 mode
>   for (int i = 0; i < row_count; i++) {
>   ^
> hungarian.c:47:2: note: use option -std=c99 or -std=gnu99 to compile your code

Yep, I fixed it locally already before seeing your mail. It is good to
know that some people pay attention, though!

Ciao,
Dscho

Re: [PATCH 00/18] Add `branch-diff`, a `tbdiff` lookalike

2018-05-21 Thread Johannes Schindelin
Hi Junio,

On Mon, 21 May 2018, Junio C Hamano wrote:

> I've been using both branch-diff and tbdiff while comparing rerolled
> topics before accepting them.  One obvious difference between the two is
> that the speed to compute pairing is quite different but that is
> expected ;-)

Yep.

It is also expected that the branch --diff actually works without you
having to make sure that the Python modules numpy and hungarian are built
correctly (which you Linux folks couldn't care less about because it is
done for ya).

I spent such a lot on time trying to get it to work on Windows that it
probably only now reaches parity with the time I spent on implementing
branch --diff.

> Another difference that is somewhat irritating is that a SP that
> leads a context line in a diff hunk that is introduced in the newer
> iteration is often painted in reverse, and I do not know what aspect
> of that single SP on these lines the branch-diff wants to pull my
> attention to.

Right. This is due to the fact that the diff does not realize that it
looks at pre/post images being diffs. And hence it does not understand
that the single space indicates a context line and is therefore not a
whitespace error before the tab.

> https://pasteboard.co/Hm9ujI7F.png
> 
> In the picture, the three pre-context lines that are all indented by
> a HT are prefixed by a SP, and that is prefixed by a '+' sign of the
> outer diff.

Yep, that's exactly it.

The way tbdiff did it was to parse the diff and re-roll the coloring on
its own. I am not really keen on doing that in `branch --diff`, too.

> We can use --dual-color mode to unsee these but it is somewhat
> frustrating not to know what the branch-diff program wants to tell
> me by highlighting the single SPs on these lines.

I was wondering from the get-go whether it would make sense to make
--dual-color the default.

But now I wonder whether there is actually *any* use in `--color` without
`--dual-color`.

What do you think? Should I make the dual color mode the *only* color
mode?

Ciao,
Dscho


  1   2   >