date:20180322

Re: [PATCH v6 00/11] nd/pack-objects-pack-struct updates

2018-03-22 Thread Jeff King

On Thu, Mar 22, 2018 at 10:46:09PM -0400, Jeff King wrote:

> > which begs the question, how much slower would it be if we
> > replaced the radix-sort with an in-place sort (e.g. heapsort).
> > 
> > I hacked up the patch below, just for fun. I don't have any
> > large repos (or enough disk space) to do any meaningful perf
> > tests, but I did at least compile it and it passes the test-suite.
> > (That is no guarantee that I haven't introduced bugs, of course!)
> 
> It might have been easier to just revert 8b8dfd5132 (pack-revindex:
> radix-sort the revindex, 2013-07-11). It even includes some performance
> numbers. :)
> 
> In short, no, I don't think we want to go back to a comparison-sort. The
> radix sort back then was around 4 times faster for linux.git. And that
> was when there were half as many objects in the repository, so the radix
> sort should continue to improve as the repo size grows.

I was curious whether my hand-waving there was true. It turns out that
it is: the radix sort has stayed about the same speed but the comparison
sort has gotten even slower. Here are best-of-five timings for "git
cat-file --batch-check='%(objectsize:disk)'", which does very little
besides generate the rev-index:

  [current master, using radix sort]
  real  0m0.104s
  user  0m0.088s
  sys   0m0.016s

  [reverting 8b8dfd5132, going back to qsort]
  real  0m1.193s
  user  0m1.176s
  sys   0m0.016s

So it's now a factor of 11. Yikes.

That number does match some napkin math. The radix sort uses four 16-bit
buckets, but can quit when after two rounds (because none of the offsets
is beyond 2^32). So it's essentially O(2n). Whereas the comparison sort
is O(n log n), and with n around 6M, that puts log(n) right around 22.

It's possible that some other comparison-based sort might be a little
more efficient than qsort, but I don't think you'll be able to beat the
algorithmic speedup.

The revert of 8b8dfd5132 is below for reference (it needed a few
conflict tweaks).

-Peff

---
diff --git a/pack-revindex.c b/pack-revindex.c
index ff5f62c033..c20aa9541b 100644
--- a/pack-revindex.c
+++ b/pack-revindex.c
@@ -15,102 +15,11 @@
  * get the object sha1 from the main index.
  */
 
-/*
- * This is a least-significant-digit radix sort.
- *
- * It sorts each of the "n" items in "entries" by its offset field. The "max"
- * parameter must be at least as large as the largest offset in the array,
- * and lets us quit the sort early.
- */
-static void sort_revindex(struct revindex_entry *entries, unsigned n, off_t 
max)
+static int cmp_offset(const void *a_, const void *b_)
 {
-   /*
-* We use a "digit" size of 16 bits. That keeps our memory
-* usage reasonable, and we can generally (for a 4G or smaller
-* packfile) quit after two rounds of radix-sorting.
-*/
-#define DIGIT_SIZE (16)
-#define BUCKETS (1 << DIGIT_SIZE)
-   /*
-* We want to know the bucket that a[i] will go into when we are using
-* the digit that is N bits from the (least significant) end.
-*/
-#define BUCKET_FOR(a, i, bits) (((a)[(i)].offset >> (bits)) & (BUCKETS-1))
-
-   /*
-* We need O(n) temporary storage. Rather than do an extra copy of the
-* partial results into "entries", we sort back and forth between the
-* real array and temporary storage. In each iteration of the loop, we
-* keep track of them with alias pointers, always sorting from "from"
-* to "to".
-*/
-   struct revindex_entry *tmp, *from, *to;
-   int bits;
-   unsigned *pos;
-
-   ALLOC_ARRAY(pos, BUCKETS);
-   ALLOC_ARRAY(tmp, n);
-   from = entries;
-   to = tmp;
-
-   /*
-* If (max >> bits) is zero, then we know that the radix digit we are
-* on (and any higher) will be zero for all entries, and our loop will
-* be a no-op, as everybody lands in the same zero-th bucket.
-*/
-   for (bits = 0; max >> bits; bits += DIGIT_SIZE) {
-   unsigned i;
-
-   memset(pos, 0, BUCKETS * sizeof(*pos));
-
-   /*
-* We want pos[i] to store the index of the last element that
-* will go in bucket "i" (actually one past the last element).
-* To do this, we first count the items that will go in each
-* bucket, which gives us a relative offset from the last
-* bucket. We can then cumulatively add the index from the
-* previous bucket to get the true index.
-*/
-   for (i = 0; i < n; i++)
-   pos[BUCKET_FOR(from, i, bits)]++;
-   for (i = 1; i < BUCKETS; i++)
-   pos[i] += pos[i-1];
-
-   /*
-* Now we can drop the elements into their correct buckets (in
-* our temporary array).  We iterate the pos counter backwards
-* to avoid using an extra index to count up. And since we

[PATCH v2] filter-branch: fix errors caused by refs that cannot be used with ^0

2018-03-22 Thread Yuki Kokubun

"git filter-branch -- --all" print unwanted error messages when refs that
cannot be used with ^0 exist. Such refs can be created by "git replace" with
trees or blobs. Also, "git tag" with trees or blobs can create such refs.
---
 git-filter-branch.sh | 14 --
 t/t7003-filter-branch.sh | 14 ++
 2 files changed, 26 insertions(+), 2 deletions(-)

diff --git a/git-filter-branch.sh b/git-filter-branch.sh
index 1b7e4b2cd..41efecb28 100755
--- a/git-filter-branch.sh
+++ b/git-filter-branch.sh
@@ -251,8 +251,18 @@ done < "$tempdir"/backup-refs
 
 # The refs should be updated if their heads were rewritten
 git rev-parse --no-flags --revs-only --symbolic-full-name \
-   --default HEAD "$@" > "$tempdir"/raw-heads || exit
-sed -e '/^^/d' "$tempdir"/raw-heads >"$tempdir"/heads
+   --default HEAD "$@" > "$tempdir"/raw-refs || exit
+while read ref
+do
+   case "$ref" in ^?*) continue ;; esac
+
+   if git rev-parse --verify "$ref"^0 >/dev/null 2>&1
+   then
+   echo "$ref"
+   else
+   warn "WARNING: not rewriting '$ref' (not a committish)"
+   fi
+done >"$tempdir"/heads <"$tempdir"/raw-refs
 
 test -s "$tempdir"/heads ||
die "You must specify a ref to rewrite."
diff --git a/t/t7003-filter-branch.sh b/t/t7003-filter-branch.sh
index 7cb60799b..04f79f32b 100755
--- a/t/t7003-filter-branch.sh
+++ b/t/t7003-filter-branch.sh
@@ -470,4 +470,18 @@ test_expect_success 'tree-filter deals with object name vs 
pathname ambiguity' '
git show HEAD:$ambiguous
 '
 
+test_expect_success 'rewrite repository including refs that point at 
non-commit object' '
+   test_when_finished "git reset --hard original" &&
+   tree=$(git rev-parse HEAD^{tree}) &&
+   test_when_finished "git replace -d $tree" &&
+   echo A >new &&
+   git add new &&
+   new_tree=$(git write-tree) &&
+   git replace $tree $new_tree &&
+   git tag -a -m "tag to a tree" treetag $new_tree &&
+   git reset --hard HEAD &&
+   git filter-branch -f -- --all >filter-output 2>&1 &&
+   ! fgrep fatal filter-output
+'
+
 test_done
-- 
2.16.2.18.g09cb46d

[RFC PATCH v4] rebase-interactive

2018-03-22 Thread Wink Saville

This is v4 of the first 2 patches of "[RFC PATCH vV n/9] rebase-interactive",
looking forward to any additional comments.


Wink Saville (2):
  rebase-interactive: Simplify pick_on_preserving_merges
  rebase: Update invocation of rebase dot-sourced scripts

 git-rebase--am.sh  | 11 ---
 git-rebase--interactive.sh | 28 +++-
 git-rebase--merge.sh   | 11 ---
 git-rebase.sh  |  2 ++
 4 files changed, 9 insertions(+), 43 deletions(-)

-- 
2.16.2

[RFC PATCH v4] rebase: Update invocation of rebase dot-sourced scripts

2018-03-22 Thread Wink Saville

The backend scriptlets for "git rebase" were structured in a
bit unusual way for historical reasons.  Originally, it was
designed in such a way that dot-sourcing them from "git
rebase" would be sufficient to invoke the specific backend.

When it was discovered that some shell implementations
(e.g. FreeBSD 9.x) misbehaved when exiting with a "return"
is executed at the top level of a dot-sourced script (the
original was expecting that the control returns to the next
command in "git rebase" after dot-sourcing the scriptlet).

To fix this issue the whole body of git-rebase--$backend.sh
was made into a shell function git_rebase__$backend and then
the last statement of the scriptlet would invoke the function.

Here the call is moved to "git rebase" side, instead of at the
end of each scriptlet.  This give us a more normal arrangement
where the scriptlet function library and allows multiple functions
to be implemented in a scriptlet.

Signed-off-by: Wink Saville 
Reviewed-by: Junio C Hamano 
Reviewed-by: Eric Sunsine 
---
 git-rebase--am.sh  | 11 ---
 git-rebase--interactive.sh | 11 ---
 git-rebase--merge.sh   | 11 ---
 git-rebase.sh  |  2 ++
 4 files changed, 2 insertions(+), 33 deletions(-)

diff --git a/git-rebase--am.sh b/git-rebase--am.sh
index be3f06892..e5fd6101d 100644
--- a/git-rebase--am.sh
+++ b/git-rebase--am.sh
@@ -4,15 +4,6 @@
 # Copyright (c) 2010 Junio C Hamano.
 #
 
-# The whole contents of this file is run by dot-sourcing it from
-# inside a shell function.  It used to be that "return"s we see
-# below were not inside any function, and expected to return
-# to the function that dot-sourced us.
-#
-# However, older (9.x) versions of FreeBSD /bin/sh misbehave on such a
-# construct and continue to run the statements that follow such a "return".
-# As a work-around, we introduce an extra layer of a function
-# here, and immediately call it after defining it.
 git_rebase__am () {
 
 case "$action" in
@@ -105,5 +96,3 @@ fi
 move_to_original_branch
 
 }
-# ... and then we call the whole thing.
-git_rebase__am
diff --git a/git-rebase--interactive.sh b/git-rebase--interactive.sh
index 561e2660e..213d75f43 100644
--- a/git-rebase--interactive.sh
+++ b/git-rebase--interactive.sh
@@ -740,15 +740,6 @@ get_missing_commit_check_level () {
printf '%s' "$check_level" | tr 'A-Z' 'a-z'
 }
 
-# The whole contents of this file is run by dot-sourcing it from
-# inside a shell function.  It used to be that "return"s we see
-# below were not inside any function, and expected to return
-# to the function that dot-sourced us.
-#
-# However, older (9.x) versions of FreeBSD /bin/sh misbehave on such a
-# construct and continue to run the statements that follow such a "return".
-# As a work-around, we introduce an extra layer of a function
-# here, and immediately call it after defining it.
 git_rebase__interactive () {
 
 case "$action" in
@@ -1029,5 +1020,3 @@ fi
 do_rest
 
 }
-# ... and then we call the whole thing.
-git_rebase__interactive
diff --git a/git-rebase--merge.sh b/git-rebase--merge.sh
index ceb715453..685f48ca4 100644
--- a/git-rebase--merge.sh
+++ b/git-rebase--merge.sh
@@ -104,15 +104,6 @@ finish_rb_merge () {
say All done.
 }
 
-# The whole contents of this file is run by dot-sourcing it from
-# inside a shell function.  It used to be that "return"s we see
-# below were not inside any function, and expected to return
-# to the function that dot-sourced us.
-#
-# However, older (9.x) versions of FreeBSD /bin/sh misbehave on such a
-# construct and continue to run the statements that follow such a "return".
-# As a work-around, we introduce an extra layer of a function
-# here, and immediately call it after defining it.
 git_rebase__merge () {
 
 case "$action" in
@@ -171,5 +162,3 @@ done
 finish_rb_merge
 
 }
-# ... and then we call the whole thing.
-git_rebase__merge
diff --git a/git-rebase.sh b/git-rebase.sh
index a1f6e5de6..4595a316a 100755
--- a/git-rebase.sh
+++ b/git-rebase.sh
@@ -196,7 +196,9 @@ run_specific_rebase () {
export GIT_EDITOR
autosquash=
fi
+   # Source the code and invoke it
. git-rebase--$type
+   git_rebase__$type
ret=$?
if test $ret -eq 0
then
-- 
2.16.2

[RFC PATCH v4] rebase-interactive: Simplify pick_on_preserving_merges

2018-03-22 Thread Wink Saville

Use compound if statement instead of nested if statements to
simplify pick_on_preserving_merges.

Signed-off-by: Wink Saville 
Reviewed-by: Junio C Hamano 
---
 git-rebase--interactive.sh | 17 +++--
 1 file changed, 7 insertions(+), 10 deletions(-)

diff --git a/git-rebase--interactive.sh b/git-rebase--interactive.sh
index 331c8dfea..561e2660e 100644
--- a/git-rebase--interactive.sh
+++ b/git-rebase--interactive.sh
@@ -307,17 +307,14 @@ pick_one_preserving_merges () {
esac
sha1=$(git rev-parse $sha1)
 
-   if test -f "$state_dir"/current-commit
+   if test -f "$state_dir"/current-commit && test "$fast_forward" = t
then
-   if test "$fast_forward" = t
-   then
-   while read current_commit
-   do
-   git rev-parse HEAD > 
"$rewritten"/$current_commit
-   done <"$state_dir"/current-commit
-   rm "$state_dir"/current-commit ||
-   die "$(gettext "Cannot write current commit's 
replacement sha1")"
-   fi
+   while read current_commit
+   do
+   git rev-parse HEAD > "$rewritten"/$current_commit
+   done <"$state_dir"/current-commit
+   rm "$state_dir"/current-commit ||
+   die "$(gettext "Cannot write current commit's 
replacement sha1")"
fi
 
echo $sha1 >> "$state_dir"/current-commit
-- 
2.16.2

Re: [PATCH v6 00/11] nd/pack-objects-pack-struct updates

2018-03-22 Thread Jeff King

On Fri, Mar 23, 2018 at 01:28:12AM +, Ramsay Jones wrote:

> > Of the used heap after your patches:
> > 
> >  - ~40% of that is from packlist_alloc()
> >  - ~17% goes to "struct object"
> >  - ~10% for the object.c hash table to store all the "struct object"
> >  - ~7% goes to the delta cache
> >  - ~7% goes to the pack revindex (actually, there's a duplicate 7%
> >there, too; I think our peak is when we're sorting the revindex
> >and have to keep two copies in memory at once)
> 
> which begs the question, how much slower would it be if we
> replaced the radix-sort with an in-place sort (e.g. heapsort).
> 
> I hacked up the patch below, just for fun. I don't have any
> large repos (or enough disk space) to do any meaningful perf
> tests, but I did at least compile it and it passes the test-suite.
> (That is no guarantee that I haven't introduced bugs, of course!)

It might have been easier to just revert 8b8dfd5132 (pack-revindex:
radix-sort the revindex, 2013-07-11). It even includes some performance
numbers. :)

In short, no, I don't think we want to go back to a comparison-sort. The
radix sort back then was around 4 times faster for linux.git. And that
was when there were half as many objects in the repository, so the radix
sort should continue to improve as the repo size grows.

The absolute time savings aren't huge for something as bulky as a
repack, so it's less exciting in this context. But it's also not that
much memory (7% of the peak here, but as I showed elsewhere, if we can
stop holding all of the "struct object" memory once we're done with it,
then this revindex stuff doesn't even factor into the peak).

-Peff

Re: [PATCH] filter-branch: consider refs can refer to an object other than commit or tag

2018-03-22 Thread Yuki Kokubun

> However, if we pre-filter to limit the refs in "$tempdir/heads" to
> those that are committish (i.e. those that pass "$ref^0") like the
> patch and subsequent discussion suggests, wouldn't we lose the
> warning for these replace refs and non-committish tags.  We perhaps
> could do something like:
> 
>   git rev-parse --no-flags ... >"$tempdir/raw-heads" || exit
> 
>   while read ref
>   do
>   case "$ref" in ^?*) continue ;; esac
>   if git rev-parse --verify "$ref^0" 2>/dev/null
> then
>   echo "$ref"
>   else
>   warn "WARNING: not rewriting '$ref' (not a committish)"
>   fi
>   done >"$tempdir/heads" <"$tempdir/raw-heads"
> 
> (note: the else clause is new, relative to my earlier suggestion).

I agree these suggestions.
I'm gonna send a new patch that follow it.

[PATCH] branch: implement shortcut to delete last branch

2018-03-22 Thread Aaron Greenberg

Add support for using the "-" shortcut to delete the last checked-out
branch. This functionality already exists for git-merge, git-checkout,
and git-revert.

Signed-off-by: Aaron Greenberg 
---
 builtin/branch.c  | 3 +++
 t/t3200-branch.sh | 9 +
 2 files changed, 12 insertions(+)

diff --git a/builtin/branch.c b/builtin/branch.c
index 6d0cea9..9e37078 100644
--- a/builtin/branch.c
+++ b/builtin/branch.c
@@ -221,6 +221,9 @@ static int delete_branches(int argc, const char **argv, int 
force, int kinds,
char *target = NULL;
int flags = 0;
 
+   if (!strcmp(argv[i], "-"))
+   argv[i] = "@{-1}";
+
strbuf_branchname(, argv[i], allowed_interpret);
free(name);
name = mkpathdup(fmt, bname.buf);
diff --git a/t/t3200-branch.sh b/t/t3200-branch.sh
index 6c0b7ea..a3ffd54 100755
--- a/t/t3200-branch.sh
+++ b/t/t3200-branch.sh
@@ -776,6 +776,15 @@ test_expect_success 'deleting currently checked out branch 
fails' '
test_must_fail git branch -d my7
 '
 
+test_expect_success 'test deleting last branch' '
+   git checkout -b my7.2 &&
+   git checkout  - &&
+   sha1=$(git rev-parse my7 | cut -c 1-7) &&
+   echo "Deleted branch my7.2 (was $sha1)." >expect &&
+   git branch -d - >actual 2>&1 &&
+   test_i18ncmp expect actual
+'
+
 test_expect_success 'test --track without .fetch entries' '
git branch --track my8 &&
test "$(git config branch.my8.remote)" &&
-- 
2.7.4

[PATCH] branch: implement shortcut to delete last branch

2018-03-22 Thread Aaron Greenberg

This patch gives git-branch the ability to delete the previous
checked-out branch using the "-" shortcut. This shortcut already exists
for git-checkout, git-merge, and git-revert. One of my common workflows
is to do some work on a local topic branch and push it to a remote,
where it gets merged in to 'master'. Then, I switch back to my local
master, fetch the remote master, and delete the previous topic branch.

$ git checkout -b topic-a
$ # Do some work...
$ git commit -am "Implement feature A"
$ git push origin topic-a

# 'origin/topic-a' gets merged into 'origin/master'

$ git checkout master
$ git branch -d topic-a
$ # With this patch, a user could simply type
$ git branch -d -

I think it's a useful shortcut for cleaning up a just-merged branch
(or a just switched-from branch.)

Re: [RFC PATCH v3.1 2/9 2/2] rebase-interactive: Do not automatically run code

2018-03-22 Thread Junio C Hamano

Wink Saville  writes:

> Currently I'm not rebasing the other commits (3..9)
> to reduce the amount of work I have to do in each
> review cycle, is that OK?

Yeah, I want to see others more heavily involved in this part of the
system to comment on your patches.  As to the organization of the
changes, I have some more opinions of my own, primarily regarding to
reviewability, but they are of secondary importance than reviews by
area experts.  I think it would be helpful to give them a target
that is not moving too rapidly ;-)

> Also, will you merge commits 1 and 2 before the other
> commits or is the procedure to merge the complete set
> at once?

I am inclined to take the early preliminary clean-up steps before
the remainder.

[GSOC]About the microproject related to CI

2018-03-22 Thread Zhibin Li


Hi all,

I'm Zhibin Li, an undergraduate from China and I'm interested in 
automated testing. Since the application deadline is coming, hope it's 
not too late for me to start with the microproject. If it's ok, I would 
like to take Git CI Improvements 4 as my starting point. But the 
description on the website shows less details so I wonder what am I 
supposed to do more specifically? Reporting the results or trying to 
figure out the how and why those results come out independently? It 
would be nice if you guys can tell me about any details.


Thanks,
Zhibin

Re: [PATCH v6 00/11] nd/pack-objects-pack-struct updates

2018-03-22 Thread Ramsay Jones



On 22/03/18 09:32, Jeff King wrote:
> On Wed, Mar 21, 2018 at 04:59:19PM +0100, Duy Nguyen wrote:
> 
>>> I hate to be a wet blanket, but am I the only one who is wondering
>>> whether the tradeoffs is worth it? 8% memory reduction doesn't seem
>>> mind-bogglingly good,
>>
>> AEvar measured RSS. If we count objects[] array alone, the saving is
>> 40% (136 bytes per entry down to 80). Some is probably eaten up by
>> mmap in rss.
> 
> Measuring actual heap usage with massif, I get before/after peak heaps
> of 1728 and 1346MB respectively when repacking linux.git. So that's ~22%
> savings overall.
> 
> Of the used heap after your patches:
> 
>  - ~40% of that is from packlist_alloc()
>  - ~17% goes to "struct object"
>  - ~10% for the object.c hash table to store all the "struct object"
>  - ~7% goes to the delta cache
>  - ~7% goes to the pack revindex (actually, there's a duplicate 7%
>there, too; I think our peak is when we're sorting the revindex
>and have to keep two copies in memory at once)

which begs the question, how much slower would it be if we
replaced the radix-sort with an in-place sort (e.g. heapsort).

I hacked up the patch below, just for fun. I don't have any
large repos (or enough disk space) to do any meaningful perf
tests, but I did at least compile it and it passes the test-suite.
(That is no guarantee that I haven't introduced bugs, of course!)

;-)

ATB,
Ramsay Jones

-- >8 --
Subject: [PATCH] pack-revindex: replace radix-sort with in-place heapsort

Signed-off-by: Ramsay Jones 
---
 pack-revindex.c | 60 +
 1 file changed, 60 insertions(+)

diff --git a/pack-revindex.c b/pack-revindex.c
index ff5f62c03..16f17eac1 100644
--- a/pack-revindex.c
+++ b/pack-revindex.c
@@ -15,6 +15,7 @@
  * get the object sha1 from the main index.
  */
 
+#ifdef DUMMY
 /*
  * This is a least-significant-digit radix sort.
  *
@@ -112,6 +113,65 @@ static void sort_revindex(struct revindex_entry *entries, 
unsigned n, off_t max)
 #undef BUCKETS
 #undef DIGIT_SIZE
 }
+#endif
+
+static inline void swap(struct revindex_entry *a, int i, int j)
+{
+   struct revindex_entry t;
+
+   t = a[i];
+   a[i] = a[j];
+   a[j] = t;
+}
+
+/*
+ * assume that elements first .. last (array index first-1 .. last-1) obey
+ * the partially ordered tree property, except possibly for the children of
+ * the first element. push down the first element until the partially
+ * ordered tree property is restored.
+ */
+static void push_down(struct revindex_entry *a, int first, int last)
+{
+   int parent = first;
+   int last_node = last / 2;
+
+   while (parent <= last_node) {
+
+   int left = 2 * parent;
+   int right = left + 1;
+   int biggest;
+
+   if (right > last) /* parent only has one child */
+   biggest = left;
+   else {
+   if (a[left-1].offset >= a[right-1].offset)
+   biggest = left;
+   else
+   biggest = right;
+
+   }
+
+   if (a[parent-1].offset >= a[biggest-1].offset)
+   break; /* partially ordered tree property, we're done */
+
+   /* push parent down */
+   swap(a, parent-1, biggest-1);
+   parent = biggest;
+   }
+}
+
+static void sort_revindex(struct revindex_entry *entries, unsigned n, off_t 
max)
+{
+   int i;
+
+   for (i = n/2; i > 0; i--)
+   push_down(entries, i, n);
+
+   for (i = n; i > 1; i--) {
+   swap(entries, 0, i-1);
+   push_down(entries, 1, i-1);
+   }
+}
 
 /*
  * Ordered list of offsets of objects in the pack.
-- 
2.16.0

Re: [GSoC] Convert “git stash” to builtin proposal

2018-03-22 Thread Paul-Sebastian Ungureanu

Hello,

On Tue, 2018-03-20 at 23:08 +0100, Christian Couder wrote:
> Hi,
> 
> On Tue, Mar 20, 2018 at 9:09 PM, Paul-Sebastian Ungureanu
>  wrote:
> > 
> > * Convert function: this step is basically makes up the goal of
> > this
> > project.
> 
> Could you explain a bit more how you would convert a function? Or
> could you explain for example how you would convert "git stash list"
> below?

In order to convert a command, all the functions which are used by the
command must be converted first. The conversion will start with the
bottom-level functions, which do not have any dependencies.

For example, to convert "git stash list", the parser will call
“list_stash”, which will call “have_stash”. The conversion of these
functions will be made in reverse order they were mentioned (have_stash
first and then list_stash).

It is very important to know the Git source well in order to avoid
reimplementing functionality. In this case “have_stash()” is somehow
already implemented as “get_oid(ref_stash, )”. 

> > I am expecting to submit a patch for every command that is
> > converted.
> > This way, I have well set milestones and results will be seen as
> > soon
> > as possible. Moreover, seeing my contributions getting merged will
> > be a
> > huge confidence boost to myself.
> > Each “convert” phase of the project below is not only about
> > converting
> > from Shell to C, but also ensuring that everything is documented,
> > there
> > are enough tests and there are no regressions.
> > 
> > 14th May - 20th May - Convert "git stash list"
> 
> For example could you explain a bit more which functions/commands you
> would create in which file and how you would call them to convert
> `git
> stash list`

The new C code will be found in stash-helper.c and will be used by git-
stash.sh until the full conversion is complete. As soon as the entire
conversion is done, stash-helper.c will be promoted to stash.c. Any
functionality that will be implemented, but is not strictly related to
git stash will reside in the appropriate files (for example if I had to
implement similar to get_oid, which is not related to git stash, but to
Git, then I would not implement it in stash-helper.c; anyway, I do not
believe I will encounter this situation that often).

In the updated version of the proposal [1], I included the ideas stated
before and more information about the procces of benchmarking. In
addition, to test my capabilities and to be sure that I am able to work
on a project of this kind, I tried to convert “git stash list” into a
built-in (the code can be found in proposal). 

[1] https://docs.google.com/document/d/1TtdD7zgUsEOszHG5HX1at4zHMDsPmSB
YWqydOPTTpV8/edit

Convert “git stash” to builtin

## Name and Contact Information
--
Hello! My name is Paul-Sebastian Ungureanu. I am currently a first year
Computer Science & Engineering student at Politehnica University of
Bucharest, Romania.

My email address is ungureanupaulsebast...@gmail.com and my phone
number is [CENSORED]. You can also find me on #git IRC channel as
ungps.

FLOSS manual recommends students to include in their proposal their
postal address and mention a relative as a emergency contact. My
permanent address is [CENSORED]. In case of an emergency, you may
contact my brother, [CENSORED] by email at [CENSORED] or by phone at
[CENSORED].

## Synopsis
--
Currently, many components of Git are still in the form of Shell or
Perl scripts. This choice makes sense especially when considering how
faster new features can be implemented in Shell and Perl scripts,
rather than writing them in C. However, in production, where Git runs
daily on millions of computers with distinct configurations (i.e.
different operating systems) some problems appear (i.e. POSIX-to-
Windows path conversion issues).

The idea of this project is to take “git-stash.sh” and reimplement it
in C. Apart from fixing compatibility issues and increasing the
performance by going one step closer to native code, I believe this is
an excellent excuse to fix long-standing bugs or implement new minor
features.

## Benefits to community
--
The essential benefit brought by rewriting Git commands is the
increased compatibility with a large number computers with distinct
configuration. I believe that this project can have a positive impact
on a large mass of developers around the world. By rewriting the code
behind some popular commands and making them “built-in”, developers
will have a better overall experience when using Git and get to focus
on what really matters to them.

As a side effect, there will be a number of other improvements:
increased performance, ability to rethink the design of some code that
suffered from patching along the time, have the chance to create new
features and fix old bugs.

In theory, switching from Bash or Shell scripts, Git will

Re: [RFC PATCH v3.1 2/9 2/2] rebase-interactive: Do not automatically run code

2018-03-22 Thread Wink Saville

On Thu, Mar 22, 2018 at 1:46 PM, Junio C Hamano  wrote:
> Wink Saville  writes:
>
>> At Junio's suggestion have git-rebase--am and git-rebase--merge work the
>> same way as git-rebase--interactive. This makes the code more consistent.
>
> I mumbled about making git_rebase__$type functions for all $type in
> my previous response, but that was done without even looking at
> git-rebase--$type.sh scriptlets.  It seems that they all shared the
> same structure (i.e. define git_rebase__$type function and then at
> the end clla it) and were consistent already.  It was the v3 that
> changed the calling convention only for interactive, which made it
> inconsistent.  If you are making git-rebase.sh call the helper shell
> function for all backend $type, you are keeping the existing
> consistency.
>
> This is no longer about "interactive" alone, though, and need to be
> retitled ;-)
>
>> Signed-off-by: Wink Saville 
>> ---
>>  git-rebase--am.sh  | 17 ++---
>>  git-rebase--interactive.sh |  8 +++-
>>  git-rebase--merge.sh   | 17 ++---
>>  git-rebase.sh  | 13 -
>>  4 files changed, 23 insertions(+), 32 deletions(-)
>>
>> diff --git a/git-rebase--am.sh b/git-rebase--am.sh
>> index be3f06892..47dc69ed9 100644
>> --- a/git-rebase--am.sh
>> +++ b/git-rebase--am.sh
>> @@ -4,17 +4,14 @@
>>  # Copyright (c) 2010 Junio C Hamano.
>>  #
>>
>> +# The whole contents of this file is loaded by dot-sourcing it from
>> +# inside another shell function, hence no shebang on the first line
>> +# and then the caller invokes git_rebase__am.
>
> Is this comment necessary?

Removed

>
>> +# Previously this file was sourced and it called itself to get this
>> +# was to get around a bug in older (9.x) versions of FreeBSD.
>
> ECANTPARSE.  But this probably is no longer needed here, even though
> it may make sense to explain why this comment is no longer relevant
> in the log message.  E.g.
>
> The backend scriptlets for "git rebase" are structured in a
> bit unusual way for historical reasons.  Originally, it was
> designed in such a way that dot-sourcing them from "git
> rebase" would be sufficient to invoke the specific backend.
> When it was discovered that some shell implementations
> (e.g. FreeBSD 9.x) misbehaved by exiting when "return" is
> executed at the top level of a dot-sourced script (the
> original was expecting that the control returns to the next
> command in "git rebase" after dot-sourcing the scriptlet),
> the whole body of git-rebase--$backend.sh was made into a
> shell function git_rebase__$backend and then the scriptlet
> was made to call this function at the end as a workaround.
>
> Move the call to "git rebase" side, instead of at the end of
> each scriptlet.  This would give us a more normal
> arrangement where a function library lives in a scriptlet
> that is dot-sourced, and then these helper functions are
> called by the script that dot-sourced the scriptlet.
>
> While at it, remove the large comment that explains why this
> rather unusual structure was used from these scriptlets.
>
> or something like that in the log message, and then we can get rid
> of these in-code comments, I would think.

Updated commit message

>>  git_rebase__am () {
>> -
>> +echo "git_rebase_am:+" 1>&5
>
> debuggin'?  I see similar stuff left in other parts (snipped) of
> this patch.

Removed debugging :(

Currently I'm not rebasing the other commits (3..9)
to reduce the amount of work I have to do in each
review cycle, is that OK?

Also, will you merge commits 1 and 2 before the other
commits or is the procedure to merge the complete set
at once?

[ANNOUNCE] Git v2.16.3

2018-03-22 Thread Junio C Hamano

The latest maintenance release Git v2.16.3 is now available at
the usual places.  It merges many small fixes and documentation
updates that have been in the 'master' branch for a few weeks.

The tarballs are found at:

https://www.kernel.org/pub/software/scm/git/

The following public repositories all have a copy of the 'v2.16.3'
tag and the 'maint' branch that the tag points at:

  url = https://kernel.googlesource.com/pub/scm/git/git
  url = git://repo.or.cz/alt-git.git
  url = https://github.com/gitster/git



Git v2.16.3 Release Notes
=

Fixes since v2.16.2
---

 * "git status" after moving a path in the working tree (hence making
   it appear "removed") and then adding with the -N option (hence
   making that appear "added") detected it as a rename, but did not
   report the  old and new pathnames correctly.

 * "git commit --fixup" did not allow "-m" option to be used
   at the same time; allow it to annotate resulting commit with more
   text.

 * When resetting the working tree files recursively, the working tree
   of submodules are now also reset to match.

 * Fix for a commented-out code to adjust it to a rather old API change
   around object ID.

 * When there are too many changed paths, "git diff" showed a warning
   message but in the middle of a line.

 * The http tracing code, often used to debug connection issues,
   learned to redact potentially sensitive information from its output
   so that it can be more safely sharable.

 * Crash fix for a corner case where an error codepath tried to unlock
   what it did not acquire lock on.

 * The split-index mode had a few corner case bugs fixed.

 * Assorted fixes to "git daemon".

 * Completion of "git merge -s" (in contrib/) did not work
   well in non-C locale.

 * Workaround for segfault with more recent versions of SVN.

 * Recently introduced leaks in fsck have been plugged.

 * Travis CI integration now builds the executable in 'script' phase
   to follow the established practice, rather than during
   'before_script' phase.  This allows the CI categorize the failures
   better ('failed' is project's fault, 'errored' is build
   environment's).

Also contains various documentation updates and code clean-ups.



Changes since v2.16.2 are as follows:

Ben Peart (1):
  fsmonitor: update documentation to remove reference to invalid config 
settings

Brandon Williams (1):
  oidmap: ensure map is initialized

Christian Ludwig (1):
  t9001: use existing helper in send-email test

Eric Sunshine (2):
  git-worktree.txt: fix missing ")" typo
  git-worktree.txt: fix indentation of example and text of 'add' command

Eric Wong (2):
  fsck: fix leak when traversing trees
  git-svn: control destruction order to avoid segfault

Genki Sky (1):
  test-lib.sh: unset XDG_CACHE_HOME

Jeff King (10):
  t5570: use ls-remote instead of clone for interp tests
  t/lib-git-daemon: record daemon log
  daemon: fix off-by-one in logging extended attributes
  daemon: handle NULs in extended attribute string
  t/lib-git-daemon: add network-protocol helpers
  daemon: fix length computation in newline stripping
  t0205: drop redundant test
  git-sh-i18n: check GETTEXT_POISON before USE_GETTEXT_SCHEME
  commit: drop uses of get_cached_commit_buffer()
  revision: drop --show-all option

Jonathan Tan (2):
  http: support cookie redaction when tracing
  http: support omitting data from traces

Juan F. Codagnone (1):
  mailinfo: avoid segfault when can't open files

Junio C Hamano (2):
  worktree: say that "add" takes an arbitrary commit in short-help
  Git 2.16.3

Kaartic Sivaraam (2):
  Doc/gitsubmodules: make some changes to improve readability and syntax
  Doc/git-submodule: improve readability and grammar of a sentence

Mathias Rav (1):
  files_initial_transaction_commit(): only unlock if locked

Motoki Seki (1):
  Documentation/gitsubmodules.txt: avoid non-ASCII apostrophes

Nguyễn Thái Ngọc Duy (12):
  t2203: test status output with porcelain v2 format
  Use DIFF_DETECT_RENAME for detect_rename assignments
  wt-status.c: coding style fix
  wt-status.c: catch unhandled diff status codes
  wt-status.c: rename rename-related fields in wt_status_change_data
  wt-status.c: handle worktree renames
  read-cache.c: change type of "temp" in write_shared_index()
  read-cache.c: move tempfile creation/cleanup out of write_shared_index
  diff.c: flush stdout before printing rename warnings
  read-cache: don't write index twice if we can't write shared index
  completion: fix completing merge strategies on non-C locales
  gitignore.txt: elaborate shell glob syntax

Ramsay Jones (2):
  config.mak.uname: remove SPARSE_FLAGS setting for cygwin

Donation made to you

2018-03-22 Thread Mrs.Elisabeth Schaeffler Thumann




--
Donation made to you.
 Donor: Mrs.Elisabeth Schaeffler Thumann.
 please Contact;(mrselisabeth.schaefflerth0...@gmail.com) for more 
details.

Donation made to you.

2018-03-22 Thread Mrs.Elisabeth Schaeffler Thumann




--
Donation made to you.
 Donor: Mrs.Elisabeth Schaeffler Thumann.
 please Contact;(mrselisabeth.schaefflerth0...@gmail.com) for more 
details.

Re: [RFC][GSoC] Project proposal: convert interactive rebase to C

2018-03-22 Thread Alban Gruin

Hi,

here is my second draft of my proposal. As last time, any feedback is
welcome :)

I did not write my phone number and address here for obvious reasons,
but they will be in the “about me” section of the final proposal.

Apart from that, do you think there is something to add?

---
ABSTRACT

git is a modular source control management software, and all of its
subcommands are programs on their own. A lot of them are written in C,
but a couple of them are shell or Perl scripts. This is the case of
git-rebase--interactive (or interactive rebase), which is a shell
script. Rewriting it in C would improve its performance, its
portability, and maybe its robustness.


ABOUT `git-rebase` and `git-rebase--interactive`

git-rebase allows to re-apply changes on top of another branch. For
instance, when a local branch and a remote branch have diverged,
git-rebase can re-unify them, applying each change made on the
local branch on top of the remote branch.

git-rebase--interactive is used to reorganize commits by reordering,
rewording, or squashing them. To achieve this purpose, git opens the
list of commits to be modified in a text editor (hence the
interactivity), as well as the actions to be performed for each of
them.


PROJECT GOALS

The goal of this project is to rewrite git-rebase--interactive in C
as it has been discussed on the git mailing list[1], for multiple
reasons :


Performance improvements
Shell scripts are inherently slow. That’s because each command is a
program by itself. So, for each command, the shell interpreter has to
spawn a new process and to load a new program (with fork() and
exec() syscalls), which is an expensive process.

Those commands can be other git commands. Sometimes, they are
wrappers to call internal C functions (eg. git-rebase--helper),
something shell scripts can’t do natively. These wrappers basically
parse the parameters, then start the appropriate function, which is
obviously slower than just calling a function from C.

Other commands can be POSIX utilities (eg. sed, cut, etc.). They
have their own problems (speed aside), namely portability.

Portability improvements
Shell scripts often relies on many of those POSIX utilities, which are
not necessarily natively available on all platforms (most notably,
Windows), or may have more or less features depending on the
implementation.

Although C is not perfect portability-wise, it’s still better than
shell scripts. For instance, the resulting binaries will not
necessarily depend on third-party programs or libraries.


RISKS

Of course, rewriting a piece of software takes time, and can lead to
regressions (ie. new bugs). To mitigate that risk, I should understand
well the functions I want to rewrite, run tests on a regular basis and
write new if needed, and of course discuss about my code with the
community during reviews.


APPROXIMATIVE TIMELINE

Community bonding -- April 23, 2018 - May 14, 2018
During the community bonding, I would like to dive into git’s
codebase, and to understand what git-rebase--interactive does under
the hood. At the same time, I’d communicate with the community and my
mentor, seeking for clarifications, and asking questions about how
things should or should not be done.

Weeks 1 & 2 -- May 14, 2018 - May 27, 2018
/From May 14 to 18, I have exams at the university, so I won’t be able
to work full time./

I would search for edge cases not covered by current tests and write
some if needed.

Week 3 -- May 28, 2018 - June 3, 2018
At the same time, I would refactor --preserve-merges in its own
shell script (as described in Dscho’s email[1]), if it has
not been deprecated or moved in the meantime. This operation is not
really tricky by itself, as --preserve-merges is about only 50 lines
of code into git_rebase__interactive().

Weeks 4 to 7 -- May 4, 2018 - July 1, 2018
Then, I would start to incrementally rewrite
git-rebase--interactive.sh functions in C, and move them
git-rebase--helper.c (as in commits 0cce4a2756[2] (rebase -i
-x: add exec commands via the rebase--helper) and b903674b35[3]
(bisect--helper: `is_expected_rev` & `check_expected_revs` shell
function in C)).

There is a lot of functions into git-rebase--interactive.sh to
rewrite. Most of them are small, and some of them are even wrappers
for a single command (eg. is_merge_commit()), so they shouldn’t be
really problematic.

A couple of them are quite long (eg. pick_one()), and will probably
be even longer once rewritten in C due to the low-level nature of the
language. They also tend to depend a lot on other smaller functions.

The plan here would be to start rewriting the smaller functions when
applicable (ie. they’re not a simple command wrapper) before
working on the biggest of them.

Week 8 -- July 2, 2018 - July 8, 2018
When all majors functions from git-rebase--interactive.sh have been
rewritten in C, I would retire the script in favor of a builtin.

Weeks 9 & 10 -- July 9, 2018 - July 22, 2018
I plan to spend theses two weeks to improve the

Re: The most efficient way to test if repositories share the same objects

2018-03-22 Thread Junio C Hamano

Konstantin Ryabitsev  writes:

> $ time git rev-list --max-parents=0 HEAD
> a101ad945113be3d7f283a181810d76897f0a0d6
> cd26f1bd6bf3c73cc5afe848677b430ab342a909
> be0e5c097fc206b863ce9fe6b3cfd6974b0110f4
> 1da177e4c3f41524e886b7f1b8a0c1fc7321cac2
>
> real0m6.311s
> user0m6.153s
> sys 0m0.110s
>
> If I try to do this for each of the 7700 heads, this will take roughly
> 12 hours.

Wouldn't it be more efficient to avoid doing so one-by-one?  
That is, wouldn't

rev-list --max-parents=0 --all

be a bit faster than

for-each-ref |
while read object type refname
do
rev-list --max-parents=0 $refname
done

I wonder?

Re: The most efficient way to test if repositories share the same objects

2018-03-22 Thread Konstantin Ryabitsev

On 03/22/18 15:35, Junio C Hamano wrote:
> I am not sure how Konstantin defines "the most efficient", but if it
> is "with the smallest number of bits exchanged between the
> repositories", then the answer would probably be to find the root
> commit(s) in each repository and if they share any common root(s).
> If there isn't then there is no hope to share objects between them,
> of course.

Hmm... so, this a cool idea that I'd like to use, but there are two
annoying gotchas:

1. I cannot assume that refs/heads/master is meaningful -- my problem is
actually with something like
https://source.codeaurora.org/quic/la/kernel/msm-3.18 -- you will find
that master is actually unborn and there are 7700 other heads (don't get
me started on that unless you're buying me a lot of drinks).

2. Even if there is a HEAD I know I can use, it's pretty slow on large
repos (e.g. linux.git):

$ time git rev-list --max-parents=0 HEAD
a101ad945113be3d7f283a181810d76897f0a0d6
cd26f1bd6bf3c73cc5afe848677b430ab342a909
be0e5c097fc206b863ce9fe6b3cfd6974b0110f4
1da177e4c3f41524e886b7f1b8a0c1fc7321cac2

real0m6.311s
user0m6.153s
sys 0m0.110s

If I try to do this for each of the 7700 heads, this will take roughly
12 hours.

My current strategy has been pretty much:

git -C msm-3.10.git show-ref --tags -s | sort -u > /tmp/refs1
git -C msm-3.18.git show-ref --tags -s | sort -u > /tmp/refs2

and then checking if an intersection of these matches at least half of
refs in either repo:

#/usr/bin/env python
import numpy

refs1 = numpy.array(open('/tmp/refs1').readlines())
refs2 = numpy.array(open('/tmp/refs2').readlines())

in_common = len(numpy.intersect1d(refs1, refs2))
if in_common > len(refs1)/2 or in_common > len(refs2)/2:
print('Lots of shared refs')
else:
print('None or too few shared refs')

This works well enough at least for those repos with lots of shared
tags, but will miss potentially large repos where there's only heads
that can be pointing at commits that aren't necessarily the same between
two repos.

Thanks for your help!

Best,
-- 
Konstantin Ryabitsev
Director, IT Infrastructure Security
The Linux Foundation

signature.asc
Description: OpenPGP digital signature

Re: [RFC PATCH v3.1 2/9 2/2] rebase-interactive: Do not automatically run code

2018-03-22 Thread Junio C Hamano

Wink Saville  writes:

> At Junio's suggestion have git-rebase--am and git-rebase--merge work the
> same way as git-rebase--interactive. This makes the code more consistent.

I mumbled about making git_rebase__$type functions for all $type in
my previous response, but that was done without even looking at
git-rebase--$type.sh scriptlets.  It seems that they all shared the
same structure (i.e. define git_rebase__$type function and then at
the end clla it) and were consistent already.  It was the v3 that
changed the calling convention only for interactive, which made it
inconsistent.  If you are making git-rebase.sh call the helper shell
function for all backend $type, you are keeping the existing
consistency.

This is no longer about "interactive" alone, though, and need to be
retitled ;-)

> Signed-off-by: Wink Saville 
> ---
>  git-rebase--am.sh  | 17 ++---
>  git-rebase--interactive.sh |  8 +++-
>  git-rebase--merge.sh   | 17 ++---
>  git-rebase.sh  | 13 -
>  4 files changed, 23 insertions(+), 32 deletions(-)
>
> diff --git a/git-rebase--am.sh b/git-rebase--am.sh
> index be3f06892..47dc69ed9 100644
> --- a/git-rebase--am.sh
> +++ b/git-rebase--am.sh
> @@ -4,17 +4,14 @@
>  # Copyright (c) 2010 Junio C Hamano.
>  #
>  
> +# The whole contents of this file is loaded by dot-sourcing it from
> +# inside another shell function, hence no shebang on the first line
> +# and then the caller invokes git_rebase__am.

Is this comment necessary?

> +# Previously this file was sourced and it called itself to get this
> +# was to get around a bug in older (9.x) versions of FreeBSD.

ECANTPARSE.  But this probably is no longer needed here, even though
it may make sense to explain why this comment is no longer relevant
in the log message.  E.g.

The backend scriptlets for "git rebase" are structured in a
bit unusual way for historical reasons.  Originally, it was
designed in such a way that dot-sourcing them from "git
rebase" would be sufficient to invoke the specific backend.
When it was discovered that some shell implementations
(e.g. FreeBSD 9.x) misbehaved by exiting when "return" is
executed at the top level of a dot-sourced script (the
original was expecting that the control returns to the next
command in "git rebase" after dot-sourcing the scriptlet),
the whole body of git-rebase--$backend.sh was made into a
shell function git_rebase__$backend and then the scriptlet
was made to call this function at the end as a workaround.

Move the call to "git rebase" side, instead of at the end of
each scriptlet.  This would give us a more normal
arrangement where a function library lives in a scriptlet
that is dot-sourced, and then these helper functions are
called by the script that dot-sourced the scriptlet.

While at it, remove the large comment that explains why this
rather unusual structure was used from these scriptlets.

or something like that in the log message, and then we can get rid
of these in-code comments, I would think.

>  git_rebase__am () {
> -
> +echo "git_rebase_am:+" 1>&5

debuggin'?  I see similar stuff left in other parts (snipped) of
this patch.

[RFC PATCH v3.1 2/9 2/2] rebase-interactive: Do not automatically run code

2018-03-22 Thread Wink Saville

At Junio's suggestion have git-rebase--am and git-rebase--merge work the
same way as git-rebase--interactive. This makes the code more consistent.

Signed-off-by: Wink Saville 
---
 git-rebase--am.sh  | 17 ++---
 git-rebase--interactive.sh |  8 +++-
 git-rebase--merge.sh   | 17 ++---
 git-rebase.sh  | 13 -
 4 files changed, 23 insertions(+), 32 deletions(-)

diff --git a/git-rebase--am.sh b/git-rebase--am.sh
index be3f06892..47dc69ed9 100644
--- a/git-rebase--am.sh
+++ b/git-rebase--am.sh
@@ -4,17 +4,14 @@
 # Copyright (c) 2010 Junio C Hamano.
 #
 
-# The whole contents of this file is run by dot-sourcing it from
-# inside a shell function.  It used to be that "return"s we see
-# below were not inside any function, and expected to return
-# to the function that dot-sourced us.
+# The whole contents of this file is loaded by dot-sourcing it from
+# inside another shell function, hence no shebang on the first line
+# and then the caller invokes git_rebase__am.
 #
-# However, older (9.x) versions of FreeBSD /bin/sh misbehave on such a
-# construct and continue to run the statements that follow such a "return".
-# As a work-around, we introduce an extra layer of a function
-# here, and immediately call it after defining it.
+# Previously this file was sourced and it called itself to get this
+# was to get around a bug in older (9.x) versions of FreeBSD.
 git_rebase__am () {
-
+echo "git_rebase_am:+" 1>&5
 case "$action" in
 continue)
git am --resolved --resolvemsg="$resolvemsg" \
@@ -105,5 +102,3 @@ fi
 move_to_original_branch
 
 }
-# ... and then we call the whole thing.
-git_rebase__am
diff --git a/git-rebase--interactive.sh b/git-rebase--interactive.sh
index 213d75f43..48f358333 100644
--- a/git-rebase--interactive.sh
+++ b/git-rebase--interactive.sh
@@ -740,8 +740,14 @@ get_missing_commit_check_level () {
printf '%s' "$check_level" | tr 'A-Z' 'a-z'
 }
 
+# The whole contents of this file is loaded by dot-sourcing it from
+# inside another shell function, hence no shebang on the first line
+# and then the caller invokes git_rebase__interactive.
+#
+# Previously this file was sourced and it called itself to get this
+# was to get around a bug in older (9.x) versions of FreeBSD.
 git_rebase__interactive () {
-
+echo "git_rebase_interactive:+" 1>&5
 case "$action" in
 continue)
if test ! -d "$rewritten"
diff --git a/git-rebase--merge.sh b/git-rebase--merge.sh
index ceb715453..71de80788 100644
--- a/git-rebase--merge.sh
+++ b/git-rebase--merge.sh
@@ -104,17 +104,14 @@ finish_rb_merge () {
say All done.
 }
 
-# The whole contents of this file is run by dot-sourcing it from
-# inside a shell function.  It used to be that "return"s we see
-# below were not inside any function, and expected to return
-# to the function that dot-sourced us.
+# The whole contents of this file is loaded by dot-sourcing it from
+# inside another shell function, hence no shebang on the first line
+# and then the caller invokes git_rebase__merge.
 #
-# However, older (9.x) versions of FreeBSD /bin/sh misbehave on such a
-# construct and continue to run the statements that follow such a "return".
-# As a work-around, we introduce an extra layer of a function
-# here, and immediately call it after defining it.
+# Previously this file was sourced and it called itself to get this
+# was to get around a bug in older (9.x) versions of FreeBSD.
 git_rebase__merge () {
-
+echo "git_rebase_merge:+" 1>&5
 case "$action" in
 continue)
read_state
@@ -171,5 +168,3 @@ done
 finish_rb_merge
 
 }
-# ... and then we call the whole thing.
-git_rebase__merge
diff --git a/git-rebase.sh b/git-rebase.sh
index c4ec7c21b..4595a316a 100755
--- a/git-rebase.sh
+++ b/git-rebase.sh
@@ -196,15 +196,10 @@ run_specific_rebase () {
export GIT_EDITOR
autosquash=
fi
-   if test "$type" = interactive
-   then
-   . git-rebase--interactive
-   git_rebase__interactive
-   ret=$?
-   else
-   . git-rebase--$type
-   ret=$?
-   fi
+   # Source the code and invoke it
+   . git-rebase--$type
+   git_rebase__$type
+   ret=$?
if test $ret -eq 0
then
finish_rebase
-- 
2.16.2

git: cover letter and automatic Cc: [was Re: [PATCH 23/45] sched: add do_sched_yield() helper; remove in-kernel call to sched_yield()]

2018-03-22 Thread Xose Vazquez Perez

Linus Torvalds wrote:

> On Thu, Mar 22, 2018 at 10:29 AM, Peter Zijlstra  wrote:
>>
>> But why !? Either Cc me on more of the series such that the whole makes
>> sense, or better yet, write a proper Changelog.
> 
> This is a common issue. We should encourage people to always send at
> least the cover-page to everybody who gets cc'd, even if they don't
> get the whole series.


git should be smart enough to do it automatically by itself.

Re: The most efficient way to test if repositories share the same objects

2018-03-22 Thread Ævar Arnfjörð Bjarmason


On Thu, Mar 22 2018, Junio C. Hamano wrote:

> Ævar Arnfjörð Bjarmason  writes:
>
>> But of course that'll just give you the tips. You could then use `git
>> cat-file --batch-check` on both ends to see what commits from the other
>> they report knowing about, in case they have branches that are
>> ahead/behind the other.
>
> I am not sure how you are envisioning to use "cat-file
> --batch-check" here.  Do you mean to take "rev-list --all" output
> from both and compare, or something?

By doing something like this:

(
cd /tmp &&
git clone http://github.com/gitster/git gitster-git;
git clone http://github.com/avar/git avar-git;
git -C gitster-git for-each-ref --format="%(object)" | git -C avar-git 
cat-file --batch-check|grep -E -o '(commit|missing)'|sort|uniq -c &&
git -C avar-git for-each-ref --format="%(object)" | git -C gitster-git 
cat-file --batch-check|grep -E -o '(commit|missing)'|sort|uniq -c
)

Which outputs:

673 commit
696 missing
374 commit
495 missing

Which of course, as noted, isn't going to be a good general solution in
all cases, but it's blindingly fast, and since the original question is
essentially that he's starting out with doing a rough equivalent of
that, but for tags only, maybe it'll work for his use-case.

> I am not sure how Konstantin defines "the most efficient", but if it
> is "with the smallest number of bits exchanged between the
> repositories", then the answer would probably be to find the root
> commit(s) in each repository and if they share any common root(s).
> If there isn't then there is no hope to share objects between them,
> of course.

Yes, that would probably be much better:

diff -ru <(git -C avar-git log --oneline --pretty=format:%H 
--max-parents=0) \
 <(git -C gitster-git log --oneline --pretty=format:%H 
--max-parents=0) &&
do_stuff_because_they_have_stuff_in_common

Re: The most efficient way to test if repositories share the same objects

2018-03-22 Thread Junio C Hamano

Ævar Arnfjörð Bjarmason  writes:

> But of course that'll just give you the tips. You could then use `git
> cat-file --batch-check` on both ends to see what commits from the other
> they report knowing about, in case they have branches that are
> ahead/behind the other.

I am not sure how you are envisioning to use "cat-file
--batch-check" here.  Do you mean to take "rev-list --all" output
from both and compare, or something?

I am not sure how Konstantin defines "the most efficient", but if it
is "with the smallest number of bits exchanged between the
repositories", then the answer would probably be to find the root
commit(s) in each repository and if they share any common root(s).
If there isn't then there is no hope to share objects between them,
of course.

Re: [RFC PATCH v3 2/9] Call git_rebase__interactive from run_specific_rebase

2018-03-22 Thread Wink Saville

On Thu, Mar 22, 2018 at 11:27 AM, Junio C Hamano  wrote:
> Wink Saville  writes:
>
>> Instead of indirectly invoking git_rebase__interactive this invokes
>> it directly after sourcing.
>>
>> Signed-off-by: Wink Saville 
>> ---
>>  git-rebase--interactive.sh | 11 ---
>>  git-rebase.sh  | 11 +--
>>  2 files changed, 9 insertions(+), 13 deletions(-)
>>
>> diff --git a/git-rebase--interactive.sh b/git-rebase--interactive.sh
>> index 561e2660e..213d75f43 100644
>> --- a/git-rebase--interactive.sh
>> +++ b/git-rebase--interactive.sh
>> @@ -740,15 +740,6 @@ get_missing_commit_check_level () {
>>   printf '%s' "$check_level" | tr 'A-Z' 'a-z'
>>  }
>>
>> -# The whole contents of this file is run by dot-sourcing it from
>> -# inside a shell function.  It used to be that "return"s we see
>> -# below were not inside any function, and expected to return
>> -# to the function that dot-sourced us.
>> -#
>> -# However, older (9.x) versions of FreeBSD /bin/sh misbehave on such a
>> -# construct and continue to run the statements that follow such a "return".
>> -# As a work-around, we introduce an extra layer of a function
>> -# here, and immediately call it after defining it.
>
> We still enclose the whole thing (including the returns that are
> problematic for older FreeBSD shells) in a shell function, so it's
> not like we are dropping the workaround for these systems.  It's
> just the caller of the function moved.
>
> I think the removal of this large comment is justifiable, but the
> structure still needs a bit of explanation, especially given that
> the caller in git-rebase.sh needs to treat this scriptlet a bit
> differently from others.
>
> If we were not in the (longer term) process of getting rid of
> git-rebase.sh, it might even make sense to port the same
> "dot-sourced scriptlet defines a shell function to be called, and
> the caller calls it after dot-sourcing it" pattern to other rebase
> backends, so that the calling side can be unified again to become
> something like:
>
> . git-rebase--$type
> git_rebase__$type
> ret=$?

I've gone with the above suggestion and added back some
of the header.

Re: The most efficient way to test if repositories share the same objects

2018-03-22 Thread Ævar Arnfjörð Bjarmason

On Thu, Mar 22 2018, Konstantin Ryabitsev wrote:

> What is the most efficient way to test if repoA and repoB share common
> commits? My goal is to automatically figure out if repoB can benefit
> from setting alternates to repoA and repacking. I currently do it by
> comparing the output of "show-ref --tags -s", but that does not work for
> repos without tags.

If you're using show-ref already to get the tag tips, you can use
for-each-ref to get all tips.

But of course that'll just give you the tips. You could then use `git
cat-file --batch-check` on both ends to see what commits from the other
they report knowing about, in case they have branches that are
ahead/behind the other.

[GSoC][PATCH v7] parse-options: do not show usage upon invalid option value

2018-03-22 Thread Paul-Sebastian Ungureanu

Usually, the usage should be shown only if the user does not know what
options are available. If the user specifies an invalid value, the user
is already aware of the available options. In this case, there is no
point in displaying the usage anymore.

This patch applies to "git tag --contains", "git branch --contains",
"git branch --points-at", "git for-each-ref --contains" and many more.

Signed-off-by: Paul-Sebastian Ungureanu 
---

The only differences between this patch and the previous one [1] are
related to t/t0041-usage.sh tests, based on Junio's feedback.

[1] 
https://public-inbox.org/git/20180320175005.18759-1-ungureanupaulsebast...@gmail.com/

 builtin/blame.c   |   1 +
 builtin/shortlog.c|   1 +
 builtin/update-index.c|   1 +
 parse-options.c   |  20 ---
 parse-options.h   |   1 +
 t/t0040-parse-options.sh  |   2 +-
 t/t0041-usage.sh  | 107 ++
 t/t3404-rebase-interactive.sh |   6 +-
 8 files changed, 125 insertions(+), 14 deletions(-)
 create mode 100755 t/t0041-usage.sh

diff --git a/builtin/blame.c b/builtin/blame.c
index 9dcb367b9..e8c6a4d6a 100644
--- a/builtin/blame.c
+++ b/builtin/blame.c
@@ -729,6 +729,7 @@ int cmd_blame(int argc, const char **argv, const char 
*prefix)
for (;;) {
switch (parse_options_step(, options, blame_opt_usage)) {
case PARSE_OPT_HELP:
+   case PARSE_OPT_ERROR:
exit(129);
case PARSE_OPT_DONE:
if (ctx.argv[0])
diff --git a/builtin/shortlog.c b/builtin/shortlog.c
index e29875b84..be4df6a03 100644
--- a/builtin/shortlog.c
+++ b/builtin/shortlog.c
@@ -283,6 +283,7 @@ int cmd_shortlog(int argc, const char **argv, const char 
*prefix)
for (;;) {
switch (parse_options_step(, options, shortlog_usage)) {
case PARSE_OPT_HELP:
+   case PARSE_OPT_ERROR:
exit(129);
case PARSE_OPT_DONE:
goto parse_done;
diff --git a/builtin/update-index.c b/builtin/update-index.c
index 58d1c2d28..34adf55a7 100644
--- a/builtin/update-index.c
+++ b/builtin/update-index.c
@@ -1059,6 +1059,7 @@ int cmd_update_index(int argc, const char **argv, const 
char *prefix)
break;
switch (parseopt_state) {
case PARSE_OPT_HELP:
+   case PARSE_OPT_ERROR:
exit(129);
case PARSE_OPT_NON_OPTION:
case PARSE_OPT_DONE:
diff --git a/parse-options.c b/parse-options.c
index 125e84f98..0f7059a8a 100644
--- a/parse-options.c
+++ b/parse-options.c
@@ -317,14 +317,16 @@ static int parse_long_opt(struct parse_opt_ctx_t *p, 
const char *arg,
return get_value(p, options, all_opts, flags ^ opt_flags);
}
 
-   if (ambiguous_option)
-   return error("Ambiguous option: %s "
+   if (ambiguous_option) {
+   error("Ambiguous option: %s "
"(could be --%s%s or --%s%s)",
arg,
(ambiguous_flags & OPT_UNSET) ?  "no-" : "",
ambiguous_option->long_name,
(abbrev_flags & OPT_UNSET) ?  "no-" : "",
abbrev_option->long_name);
+   return -3;
+   }
if (abbrev_option)
return get_value(p, abbrev_option, all_opts, abbrev_flags);
return -2;
@@ -476,7 +478,6 @@ int parse_options_step(struct parse_opt_ctx_t *ctx,
   const char * const usagestr[])
 {
int internal_help = !(ctx->flags & PARSE_OPT_NO_INTERNAL_HELP);
-   int err = 0;
 
/* we must reset ->opt, unknown short option leave it dangling */
ctx->opt = NULL;
@@ -505,7 +506,7 @@ int parse_options_step(struct parse_opt_ctx_t *ctx,
ctx->opt = arg + 1;
switch (parse_short_opt(ctx, options)) {
case -1:
-   goto show_usage_error;
+   return PARSE_OPT_ERROR;
case -2:
if (ctx->opt)
check_typos(arg + 1, options);
@@ -518,7 +519,7 @@ int parse_options_step(struct parse_opt_ctx_t *ctx,
while (ctx->opt) {
switch (parse_short_opt(ctx, options)) {
case -1:
-   goto show_usage_error;
+   return PARSE_OPT_ERROR;
case -2:
if (internal_help && *ctx->opt == 'h')
goto show_usage;
@@ -550,9 +551,11 @@ int parse_options_step(struct parse_opt_ctx_t

Re: [GSoC][PATCH v6] parse-options: do not show usage upon invalid option value

2018-03-22 Thread Paul-Sebastian Ungureanu

Hi,

Thank you a lot for your advice! I will keep in mind your words next
time I will send a patch. 

Best regards,
Paul Ungurenanu

Re: [RFC PATCH v3 2/9] Call git_rebase__interactive from run_specific_rebase

2018-03-22 Thread Junio C Hamano

Wink Saville  writes:

> Instead of indirectly invoking git_rebase__interactive this invokes
> it directly after sourcing.
>
> Signed-off-by: Wink Saville 
> ---
>  git-rebase--interactive.sh | 11 ---
>  git-rebase.sh  | 11 +--
>  2 files changed, 9 insertions(+), 13 deletions(-)
>
> diff --git a/git-rebase--interactive.sh b/git-rebase--interactive.sh
> index 561e2660e..213d75f43 100644
> --- a/git-rebase--interactive.sh
> +++ b/git-rebase--interactive.sh
> @@ -740,15 +740,6 @@ get_missing_commit_check_level () {
>   printf '%s' "$check_level" | tr 'A-Z' 'a-z'
>  }
>  
> -# The whole contents of this file is run by dot-sourcing it from
> -# inside a shell function.  It used to be that "return"s we see
> -# below were not inside any function, and expected to return
> -# to the function that dot-sourced us.
> -#
> -# However, older (9.x) versions of FreeBSD /bin/sh misbehave on such a
> -# construct and continue to run the statements that follow such a "return".
> -# As a work-around, we introduce an extra layer of a function
> -# here, and immediately call it after defining it.

We still enclose the whole thing (including the returns that are
problematic for older FreeBSD shells) in a shell function, so it's
not like we are dropping the workaround for these systems.  It's
just the caller of the function moved.

I think the removal of this large comment is justifiable, but the
structure still needs a bit of explanation, especially given that
the caller in git-rebase.sh needs to treat this scriptlet a bit
differently from others.

If we were not in the (longer term) process of getting rid of
git-rebase.sh, it might even make sense to port the same
"dot-sourced scriptlet defines a shell function to be called, and
the caller calls it after dot-sourcing it" pattern to other rebase
backends, so that the calling side can be unified again to become
something like:

. git-rebase--$type
git_rebase__$type
ret=$?




>  git_rebase__interactive () {
>  
>  case "$action" in
> @@ -1029,5 +1020,3 @@ fi
>  do_rest
>  
>  }
> -# ... and then we call the whole thing.
> -git_rebase__interactive
> diff --git a/git-rebase.sh b/git-rebase.sh
> index a1f6e5de6..c4ec7c21b 100755
> --- a/git-rebase.sh
> +++ b/git-rebase.sh
> @@ -196,8 +196,15 @@ run_specific_rebase () {
>   export GIT_EDITOR
>   autosquash=
>   fi
> - . git-rebase--$type
> - ret=$?
> + if test "$type" = interactive
> + then
> + . git-rebase--interactive
> + git_rebase__interactive
> + ret=$?
> + else
> + . git-rebase--$type
> + ret=$?
> + fi
>   if test $ret -eq 0
>   then
>   finish_rebase

Re: [PATCH] completion: add option completion for most builtin commands

2018-03-22 Thread Duy Nguyen

On Thu, Mar 22, 2018 at 6:56 PM, Junio C Hamano  wrote:
> Duy Nguyen  writes:
>
>> +__git_main_with_parseopt_helper='
>> + blame cat-file check-attr check-ignore
>> + check-mailmap checkout-index column count-objects fast-export
>> + hash-object index-pack interpret-trailers merge-file mktree
>> + pack-objects pack-refs prune prune-packed read-tree repack
>> + send-pack show-ref stripspace symbolic-ref update-index
>> + update-ref verify-commit verify-tag write-tree
>> +'
>> +__git_complete_command() {
>> + local command="$1"
>> + local completion_func="_git_${command//-/_}"
>> + if declare -f $completion_func >/dev/null 2>/dev/null; then
>> + $completion_func
>> + elif echo $__git_main_with_parseopt_helper | git grep -w "$command" 
>> >/dev/null; then
>
> "git grep"???

GIt has taught my fingers some bad habit.

>
> I imagined that you'd keep an associative shell array (we are not
> constrained by POSIX here) that can be used like so
>
> elif test -n "${__git_main_with_parseopt_helper[$command]}"; then
>

Great. We could even kill two existing _git_xxx functions because they
are too simple they could be replaced with this new code. I'll send
out a series later.
-- 
Duy

Re: [PATCH] completion: add option completion for most builtin commands

2018-03-22 Thread Junio C Hamano

Duy Nguyen  writes:

> +__git_main_with_parseopt_helper='
> + blame cat-file check-attr check-ignore
> + check-mailmap checkout-index column count-objects fast-export
> + hash-object index-pack interpret-trailers merge-file mktree
> + pack-objects pack-refs prune prune-packed read-tree repack
> + send-pack show-ref stripspace symbolic-ref update-index
> + update-ref verify-commit verify-tag write-tree
> +'
> +__git_complete_command() {
> + local command="$1"
> + local completion_func="_git_${command//-/_}"
> + if declare -f $completion_func >/dev/null 2>/dev/null; then
> + $completion_func
> + elif echo $__git_main_with_parseopt_helper | git grep -w "$command" 
> >/dev/null; then

"git grep"???

I imagined that you'd keep an associative shell array (we are not
constrained by POSIX here) that can be used like so

elif test -n "${__git_main_with_parseopt_helper[$command]}"; then

Of course, a more traditional way to write it without spawning grep
or pipe is

case " $__git_main_with_parseopt_helper " in
*" $command "*)
... Yes, $command is on the list ...
;;
*)
... No, $command is not on the list ...
;;
esac

Re: [GSoC] Regarding "Convert scripts to builtins"

2018-03-22 Thread Wink Saville

On Thu, Mar 22, 2018, 10:32 AM Johannes Schindelin
 wrote:
>
> Hi Wink,


>
> > Please see "[RFC PATCH 0/3] rebase-interactive" and
> > "[RFC PATCH v2 0/1] rebase-interactive: ...". I'm looking for
> > advice on how to proceed.
>
> Sadly, I had almost no time to spend on the Git mailing list today, but I
> will have a look tomorrow, okay?


NP, I totally understand and, of course, I now have a
version 3: "[RFC PATCH v3 0/9] rebase-interactive:"

Cheers,
Winthrop Lyon Saville III (I had to repsond with my full name although I
always use my nick name, Wink, just because Johannes seems so formal :)

[PATCH v2 3/3] sha1_name: use bsearch_pack() for abbreviations

2018-03-22 Thread Derrick Stolee

When computing abbreviation lengths for an object ID against a single
packfile, the method find_abbrev_len_for_pack() currently implements
binary search. This is one of several implementations. One issue with
this implementation is that it ignores the fanout table in the pack-
index.

Translate this binary search to use the existing bsearch_pack() method
that correctly uses a fanout table.

Due to the use of the fanout table, the abbreviation computation is
slightly faster than before. For a fully-repacked copy of the Linux
repo, the following 'git log' commands improved:

* git log --oneline --parents --raw
  Before: 59.2s
  After:  56.9s
  Rel %:  -3.8%

* git log --oneline --parents
  Before: 6.48s
  After:  5.91s
  Rel %: -8.9%

Signed-off-by: Derrick Stolee 
---
 sha1_name.c | 24 
 1 file changed, 4 insertions(+), 20 deletions(-)

diff --git a/sha1_name.c b/sha1_name.c
index 16e0003396..24894b3dbe 100644
--- a/sha1_name.c
+++ b/sha1_name.c
@@ -512,32 +512,16 @@ static void find_abbrev_len_for_pack(struct packed_git *p,
 struct min_abbrev_data *mad)
 {
int match = 0;
-   uint32_t num, last, first = 0;
+   uint32_t num, first = 0;
struct object_id oid;
+   const struct object_id *mad_oid;
 
if (open_pack_index(p) || !p->num_objects)
return;
 
num = p->num_objects;
-   last = num;
-   while (first < last) {
-   uint32_t mid = first + (last - first) / 2;
-   const unsigned char *current;
-   int cmp;
-
-   current = nth_packed_object_sha1(p, mid);
-   cmp = hashcmp(mad->oid->hash, current);
-   if (!cmp) {
-   match = 1;
-   first = mid;
-   break;
-   }
-   if (cmp > 0) {
-   first = mid + 1;
-   continue;
-   }
-   last = mid;
-   }
+   mad_oid = mad->oid;
+   match = bsearch_pack(mad_oid, p, );
 
/*
 * first is now the position in the packfile where we would insert
-- 
2.17.0.rc0

[PATCH v2 0/3] Use bsearch_hash() for abbreviations

2018-03-22 Thread Derrick Stolee

Thanks to Jonathan and Brian for the help with the proper way to handle
OIDs and existing callers to bsearch_hash(). This patch includes one
commit that Brian sent in the previous discussion (included again here
for completeness).

Derrick Stolee (2):
  packfile: define and use bsearch_pack()
  sha1_name: use bsearch_pack() for abbreviations

brian m. carlson (1):
  sha1_name: use bsearch_hash() for abbreviations

 packfile.c  | 42 ++
 packfile.h  |  8 
 sha1_name.c | 28 ++--
 3 files changed, 40 insertions(+), 38 deletions(-)


base-commit: 1a750441a7360b29fff7a414649ece1d35acaca6
-- 
2.17.0.rc0

[PATCH v2 2/3] packfile: define and use bsearch_pack()

2018-03-22 Thread Derrick Stolee

The method bsearch_hash() generalizes binary searches using a
fanout table. The only consumer is currently find_pack_entry_one().
It requires a bit of pointer arithmetic to align the fanout table
and the lookup table depending on the pack-index version.

Extract the pack-index pointer arithmetic to a new method,
bsearch_pack(), so this can be re-used in other code paths.

Signed-off-by: Derrick Stolee 
---
 packfile.c | 42 ++
 packfile.h |  8 
 2 files changed, 34 insertions(+), 16 deletions(-)

diff --git a/packfile.c b/packfile.c
index f26395ecab..69d3afda6c 100644
--- a/packfile.c
+++ b/packfile.c
@@ -1654,6 +1654,29 @@ void *unpack_entry(struct packed_git *p, off_t 
obj_offset,
return data;
 }
 
+int bsearch_pack(const struct object_id *oid, const struct packed_git *p, 
uint32_t *result)
+{
+   const unsigned char *index_fanout = p->index_data;
+   const unsigned char *index_lookup;
+   int index_lookup_width;
+
+   if (!index_fanout)
+   BUG("bsearch_pack called without a valid pack-index");
+
+   index_lookup = index_fanout + 4 * 256;
+   if (p->index_version == 1) {
+   index_lookup_width = 24;
+   index_lookup += 4;
+   } else {
+   index_lookup_width = 20;
+   index_fanout += 8;
+   index_lookup += 8;
+   }
+
+   return bsearch_hash(oid->hash, (const uint32_t*)index_fanout,
+   index_lookup, index_lookup_width, result);
+}
+
 const unsigned char *nth_packed_object_sha1(struct packed_git *p,
uint32_t n)
 {
@@ -1720,30 +1743,17 @@ off_t nth_packed_object_offset(const struct packed_git 
*p, uint32_t n)
 off_t find_pack_entry_one(const unsigned char *sha1,
  struct packed_git *p)
 {
-   const uint32_t *level1_ofs = p->index_data;
const unsigned char *index = p->index_data;
-   unsigned stride;
+   struct object_id oid;
uint32_t result;
 
if (!index) {
if (open_pack_index(p))
return 0;
-   level1_ofs = p->index_data;
-   index = p->index_data;
-   }
-   if (p->index_version > 1) {
-   level1_ofs += 2;
-   index += 8;
-   }
-   index += 4 * 256;
-   if (p->index_version > 1) {
-   stride = 20;
-   } else {
-   stride = 24;
-   index += 4;
}
 
-   if (bsearch_hash(sha1, level1_ofs, index, stride, ))
+   hashcpy(oid.hash, sha1);
+   if (bsearch_pack(, p, ))
return nth_packed_object_offset(p, result);
return 0;
 }
diff --git a/packfile.h b/packfile.h
index a7fca598d6..ec08cb2bb0 100644
--- a/packfile.h
+++ b/packfile.h
@@ -78,6 +78,14 @@ extern struct packed_git *add_packed_git(const char *path, 
size_t path_len, int
  */
 extern void check_pack_index_ptr(const struct packed_git *p, const void *ptr);
 
+/*
+ * Perform binary search on a pack-index for a given oid. Packfile is expected 
to
+ * have a valid pack-index.
+ *
+ * See 'bsearch_hash' for more information.
+ */
+int bsearch_pack(const struct object_id *oid, const struct packed_git *p, 
uint32_t *result);
+
 /*
  * Return the SHA-1 of the nth object within the specified packfile.
  * Open the index if it is not already open.  The return value points
-- 
2.17.0.rc0

[PATCH v2 1/3] sha1_name: convert struct min_abbrev_data to object_id

2018-03-22 Thread Derrick Stolee

From: "brian m. carlson" 

This structure is only written to in one place, where we already have a
struct object_id.  Convert the struct to use a struct object_id instead.

Signed-off-by: brian m. carlson 
---
 sha1_name.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/sha1_name.c b/sha1_name.c
index 39e911c8ba..16e0003396 100644
--- a/sha1_name.c
+++ b/sha1_name.c
@@ -480,7 +480,7 @@ struct min_abbrev_data {
unsigned int init_len;
unsigned int cur_len;
char *hex;
-   const unsigned char *hash;
+   const struct object_id *oid;
 };
 
 static inline char get_hex_char_from_oid(const struct object_id *oid,
@@ -526,7 +526,7 @@ static void find_abbrev_len_for_pack(struct packed_git *p,
int cmp;
 
current = nth_packed_object_sha1(p, mid);
-   cmp = hashcmp(mad->hash, current);
+   cmp = hashcmp(mad->oid->hash, current);
if (!cmp) {
match = 1;
first = mid;
@@ -603,7 +603,7 @@ int find_unique_abbrev_r(char *hex, const struct object_id 
*oid, int len)
mad.init_len = len;
mad.cur_len = len;
mad.hex = hex;
-   mad.hash = oid->hash;
+   mad.oid = oid;
 
find_abbrev_len_packed();
 

base-commit: 1a750441a7360b29fff7a414649ece1d35acaca6
-- 
2.17.0.rc0

Re: [PATCH] completion: add option completion for most builtin commands

2018-03-22 Thread Duy Nguyen

On Thu, Mar 22, 2018 at 10:11:53AM -0700, Junio C Hamano wrote:
> Duy Nguyen  writes:
> 
> >> And that pattern repeats throughout the patch.  I wonder if we can
> >> express the same a lot more concisely by updating the caller that
> >> calls these command specific helpers?
> >
> > Yeah. I almost went to just generate and eval these functions. But we
> > still need to keep a list of "bultin with --git-completion-helper"
> > support somewhere, and people may want to complete arguments without
> > double dashes (e.g. read-tree should take a ref...) which can't be
> > helped by --git-completion-helper.
> 
> Hmph, I actually did not have 'eval' in mind.
> 
> Rather, I was wondering if it is cleaner to update __git_main where
> it computes $completion_func by mangling $command and then calls
> it---instead call __gitcomp_builtin directly when the $command
> appears in such a "list of builtins that knows --completion-helper
> and no custom completion".

Ah. Something like this? Seems to work fine and definitely not as ugly
as eval.

-- 8< --
diff --git a/contrib/completion/git-completion.bash 
b/contrib/completion/git-completion.bash
index 6da95b8095..960e26e09d 100644
--- a/contrib/completion/git-completion.bash
+++ b/contrib/completion/git-completion.bash
@@ -3032,6 +3032,32 @@ _git_worktree ()
fi
 }
 
+__git_main_with_parseopt_helper='
+   blame cat-file check-attr check-ignore
+   check-mailmap checkout-index column count-objects fast-export
+   hash-object index-pack interpret-trailers merge-file mktree
+   pack-objects pack-refs prune prune-packed read-tree repack
+   send-pack show-ref stripspace symbolic-ref update-index
+   update-ref verify-commit verify-tag write-tree
+'
+__git_complete_command() {
+   local command="$1"
+   local completion_func="_git_${command//-/_}"
+   if declare -f $completion_func >/dev/null 2>/dev/null; then
+   $completion_func
+   elif echo $__git_main_with_parseopt_helper | git grep -w "$command" 
>/dev/null; then
+   case "$cur" in
+   --*)
+   __gitcomp_builtin "$command"
+   return 0
+   ;;
+   esac
+   return 0
+   else
+   return 1
+   fi
+}
+
 __git_main ()
 {
local i c=1 command __git_dir __git_repo_path
@@ -3091,14 +3117,12 @@ __git_main ()
return
fi
 
-   local completion_func="_git_${command//-/_}"
-   declare -f $completion_func >/dev/null 2>/dev/null && $completion_func 
&& return
+   __git_complete_command "$command" && return
 
local expansion=$(__git_aliased_command "$command")
if [ -n "$expansion" ]; then
words[1]=$expansion
-   completion_func="_git_${expansion//-/_}"
-   declare -f $completion_func >/dev/null 2>/dev/null && 
$completion_func
+   __git_complete_command "$expansion"
fi
 }
 
-- 8< --

Re: [GSoC] Regarding "Convert scripts to builtins"

2018-03-22 Thread Johannes Schindelin

Hi Wink,

On Wed, 21 Mar 2018, Wink Saville wrote:

> I plead guilty to being the preson refactoring --preserve-merges. But
> after reading this and seeing that --recreate-merges is coming and
> possibly git-rebase--* moving to C I'm worried I'd be messing things up.

Don't worry. We will just work together to avoid messing anything up.

> Also, Eric Sunshine felt my v1 changes causes the blame information to
> be obscured. So I created a v2 change which keeps everything in the
> git-rebase--interactive.sh.

Great!

> Please see "[RFC PATCH 0/3] rebase-interactive" and
> "[RFC PATCH v2 0/1] rebase-interactive: ...". I'm looking for
> advice on how to proceed.

Sadly, I had almost no time to spend on the Git mailing list today, but I
will have a look tomorrow, okay?

Ciao,
Johannes

Re: [PATCH] completion: add option completion for most builtin commands

2018-03-22 Thread Junio C Hamano

Duy Nguyen  writes:

>> And that pattern repeats throughout the patch.  I wonder if we can
>> express the same a lot more concisely by updating the caller that
>> calls these command specific helpers?
>
> Yeah. I almost went to just generate and eval these functions. But we
> still need to keep a list of "bultin with --git-completion-helper"
> support somewhere, and people may want to complete arguments without
> double dashes (e.g. read-tree should take a ref...) which can't be
> helped by --git-completion-helper.

Hmph, I actually did not have 'eval' in mind.

Rather, I was wondering if it is cleaner to update __git_main where
it computes $completion_func by mangling $command and then calls
it---instead call __gitcomp_builtin directly when the $command
appears in such a "list of builtins that knows --completion-helper
and no custom completion".

Re: [PATCH] completion: clear cached --options when sourcing the completion script

2018-03-22 Thread Duy Nguyen

On Thu, Mar 22, 2018 at 3:16 PM, SZEDER Gábor  wrote:
> The established way to update the completion script in an already
> running shell is to simply source it again: this brings in any new
> --options and features, and clears caching variables.  E.g. it clears
> the variables caching the list of (all|porcelain) git commands, so
> when they are later lazy-initialized again, then they will list and
> cache any newly installed commmands as well.
>
> Unfortunately, since d401f3debc (git-completion.bash: introduce
> __gitcomp_builtin, 2018-02-09) and subsequent patches this doesn't
> work for a lot of git commands' options.  To eliminate a lot of
> hard-to-maintain hard-coded lists of options, those commits changed
> the completion script to use a bunch of programmatically created and
> lazy-initialized variables to cache the options of those builtin
> porcelain commands that use parse-options.  These variables are not
> cleared upon sourcing the completion script, therefore they continue
> caching the old lists of options, even when some commands recently
> learned new options or when deprecated options were removed.
>
> Always 'unset' these variables caching the options of builtin commands
> when sourcing the completion script.

And here I have been happily unsetting these manually when I re-source
to test stuff, not thinking it as a bug. Thanks!

> Redirect 'unset's stderr to /dev/null, because ZSH's 'unset' complains
> if it's invoked without any arguments, i.e. no variables caching
> builtin's options are set.  This can happen, if someone were to source
> the completion script twice without completing any --options in
> between.  Bash stays silent in this case.
>
> Add tests to ensure that these variables are indeed cleared when the
> completion script is sourced; not just the variables caching options,
> but all other caching variables, i.e. the variables caching commands,
> porcelain commands and merge strategies as well.
>
> Signed-off-by: SZEDER Gábor 
> ---
>
> Fixes a recent regression introduced in 'nd/parseopt-completion'.
>
>  contrib/completion/git-completion.bash |  4 
>  t/t9902-completion.sh  | 31 ++
>  2 files changed, 35 insertions(+)
>
> diff --git a/contrib/completion/git-completion.bash 
> b/contrib/completion/git-completion.bash
> index 7c84eb1912..602352f952 100644
> --- a/contrib/completion/git-completion.bash
> +++ b/contrib/completion/git-completion.bash
> @@ -280,6 +280,10 @@ __gitcomp ()
> esac
>  }
>
> +# Clear the variables caching builtins' options when (re-)sourcing
> +# the completion script.
> +unset $(set |sed -ne 
> 's/^\(__gitcomp_builtin_[a-zA-Z0-9_][a-zA-Z0-9_]*\)=.*/\1/p') 2>/dev/null
> +
>  # This function is equivalent to
>  #
>  #__gitcomp "$(git xxx --git-completion-helper) ..."
> diff --git a/t/t9902-completion.sh b/t/t9902-completion.sh
> index e6485feb0a..4c86adadf2 100755
> --- a/t/t9902-completion.sh
> +++ b/t/t9902-completion.sh
> @@ -1497,4 +1497,35 @@ do
> '
>  done
>
> +test_expect_success 'sourcing the completion script clears cached commands' '
> +   __git_compute_all_commands &&
> +   verbose test -n "$__git_all_commands" &&
> +   . "$GIT_BUILD_DIR/contrib/completion/git-completion.bash" &&
> +   verbose test -z "$__git_all_commands"
> +'
> +
> +test_expect_success 'sourcing the completion script clears cached porcelain 
> commands' '
> +   __git_compute_porcelain_commands &&
> +   verbose test -n "$__git_porcelain_commands" &&
> +   . "$GIT_BUILD_DIR/contrib/completion/git-completion.bash" &&
> +   verbose test -z "$__git_porcelain_commands"
> +'
> +
> +test_expect_success 'sourcing the completion script clears cached merge 
> strategies' '
> +   __git_compute_merge_strategies &&
> +   verbose test -n "$__git_merge_strategies" &&
> +   . "$GIT_BUILD_DIR/contrib/completion/git-completion.bash" &&
> +   verbose test -z "$__git_merge_strategies"
> +'
> +
> +test_expect_success 'sourcing the completion script clears cached --options' 
> '
> +   __gitcomp_builtin checkout &&
> +   verbose test -n "$__gitcomp_builtin_checkout" &&
> +   __gitcomp_builtin notes_edit &&
> +   verbose test -n "$__gitcomp_builtin_notes_edit" &&
> +   . "$GIT_BUILD_DIR/contrib/completion/git-completion.bash" &&
> +   verbose test -z "$__gitcomp_builtin_checkout" &&
> +   verbose test -z "$__gitcomp_builtin_notes_edit"
> +'
> +
>  test_done
> --
> 2.17.0.rc0.103.gbdc5836ed3
>



-- 
Duy

Re: [PATCH v6 00/11] nd/pack-objects-pack-struct updates

2018-03-22 Thread Duy Nguyen

On Thu, Mar 22, 2018 at 12:52 PM, Jeff King  wrote:
> On Thu, Mar 22, 2018 at 11:57:27AM +0100, Duy Nguyen wrote:
>
>> On Thu, Mar 22, 2018 at 10:32 AM, Jeff King  wrote:
>> > That would still mean you could get into a broken state for serving
>> > fetches, but you could at least get out of it by running "git repack".
>>
>> I was puzzled by this "broken state" statement. But you were right! I
>> focused on the repack case and forgot about fetch/clone case. I will
>> probably just drop this patch for now. Then maybe revisit this some
>> time in fiture when I find out how to deal with this nicely.
>
> Here's a sketch of the "separate array" concept I mentioned before, in
> case that helps. Not tested at all beyond compiling.

Brilliant! Sorry I couldn't read your suggestion carefully this morning.
-- 
Duy

Re: [PATCH] filter-branch: consider refs can refer to an object other than commit or tag

2018-03-22 Thread Junio C Hamano

Yuki Kokubun  writes:

>> Yuki Kokubun  writes:
>> 
>> >> Yuki Kokubun  writes:
>> >> 
>> >> > "git filter-branch -- --all" can be confused when refs that refer to 
>> >> > objects
>> >> > other than commits or tags exists.
>> ...
>
> I meant the confusion is abnormal messages from the output of "git 
> filter-branch -- --all".

OK, so it is not that the program logic gets confused and ends up
performing a wrong rewrite, but mostly that it gives confusing
messages.

> For example, this is an output of "git filter-branch -- --all":
>
> Rewrite bcdbd016c77df3d5641a3cf820b2ed46ba7bf3b4 (5/5) (0 seconds passed, 
> remaining 0 predicted)
> WARNING: Ref 'refs/heads/master' is unchanged
> WARNING: Ref 'refs/heads/no-newline' is unchanged
> WARNING: Ref 'refs/heads/original' is unchanged

These are worth keeping, as I think existing users expect to see
them.

> error: object 1bf53b49c26465454e4ac377f2ed3f91bb1d6ac1 is a tree, not a commit
> error: object 1bf53b49c26465454e4ac377f2ed3f91bb1d6ac1 is a tree, not a commit
> fatal: ambiguous argument 
> 'refs/replace/8a2016f3730cad8309c110f819c855403ed0a5b9^0': unknown revision 
> or path not in the working tree.
> Use '--' to separate paths from revisions, like this:
> 'git  [...] -- [...]'
> WARNING: Ref 'refs/replace/8a2016f3730cad8309c110f819c855403ed0a5b9' is 
> unchanged
> WARNING: Ref 'refs/tags/add-file' is unchanged
> WARNING: Ref 'refs/tags/file' is unchanged
> error: object 1bf53b49c26465454e4ac377f2ed3f91bb1d6ac1 is a tree, not a commit
> error: object 1bf53b49c26465454e4ac377f2ed3f91bb1d6ac1 is a tree, not a commit
> fatal: ambiguous argument 'refs/tags/treetag^0': unknown revision or path not 
> in the working tree.
> Use '--' to separate paths from revisions, like this:
> 'git  [...] -- [...]'
> WARNING: Ref 'refs/tags/treetag' is unchanged

I think these warning messages should be kept, especially if we are
to keep the warning messages for the unchanged branches.  However,
the internal error messages are unwanted--these are implementation
details that reach the conclusion, i.e. the ref we were asked to
rewrite ended up being unchanged hence we did not touch it.

However, if we pre-filter to limit the refs in "$tempdir/heads" to
those that are committish (i.e. those that pass "$ref^0") like the
patch and subsequent discussion suggests, wouldn't we lose the
warning for these replace refs and non-committish tags.  We perhaps
could do something like:

git rev-parse --no-flags ... >"$tempdir/raw-heads" || exit

while read ref
do
case "$ref" in ^?*) continue ;; esac
if git rev-parse --verify "$ref^0" 2>/dev/null
then
echo "$ref"
else
warn "WARNING: not rewriting '$ref' (not a committish)"
fi
done >"$tempdir/heads" <"$tempdir/raw-heads"

(note: the else clause is new, relative to my earlier suggestion).

[RFC PATCH v3 3/9] Indent function git_rebase__interactive

2018-03-22 Thread Wink Saville

Signed-off-by: Wink Saville 
---
 git-rebase--interactive.sh | 432 ++---
 1 file changed, 215 insertions(+), 217 deletions(-)

diff --git a/git-rebase--interactive.sh b/git-rebase--interactive.sh
index 213d75f43..a79330f45 100644
--- a/git-rebase--interactive.sh
+++ b/git-rebase--interactive.sh
@@ -741,27 +741,26 @@ get_missing_commit_check_level () {
 }
 
 git_rebase__interactive () {
-
-case "$action" in
-continue)
-   if test ! -d "$rewritten"
-   then
-   exec git rebase--helper ${force_rebase:+--no-ff} 
$allow_empty_message \
-   --continue
-   fi
-   # do we have anything to commit?
-   if git diff-index --cached --quiet HEAD --
-   then
-   # Nothing to commit -- skip this commit
-
-   test ! -f "$GIT_DIR"/CHERRY_PICK_HEAD ||
-   rm "$GIT_DIR"/CHERRY_PICK_HEAD ||
-   die "$(gettext "Could not remove CHERRY_PICK_HEAD")"
-   else
-   if ! test -f "$author_script"
+   case "$action" in
+   continue)
+   if test ! -d "$rewritten"
then
-   gpg_sign_opt_quoted=${gpg_sign_opt:+$(git rev-parse 
--sq-quote "$gpg_sign_opt")}
-   die "$(eval_gettext "\
+   exec git rebase--helper ${force_rebase:+--no-ff} 
$allow_empty_message \
+   --continue
+   fi
+   # do we have anything to commit?
+   if git diff-index --cached --quiet HEAD --
+   then
+   # Nothing to commit -- skip this commit
+
+   test ! -f "$GIT_DIR"/CHERRY_PICK_HEAD ||
+   rm "$GIT_DIR"/CHERRY_PICK_HEAD ||
+   die "$(gettext "Could not remove CHERRY_PICK_HEAD")"
+   else
+   if ! test -f "$author_script"
+   then
+   gpg_sign_opt_quoted=${gpg_sign_opt:+$(git 
rev-parse --sq-quote "$gpg_sign_opt")}
+   die "$(eval_gettext "\
 You have staged changes in your working tree.
 If these changes are meant to be
 squashed into the previous commit, run:
@@ -776,197 +775,197 @@ In both cases, once you're done, continue with:
 
   git rebase --continue
 ")"
-   fi
-   . "$author_script" ||
-   die "$(gettext "Error trying to find the author 
identity to amend commit")"
-   if test -f "$amend"
-   then
-   current_head=$(git rev-parse --verify HEAD)
-   test "$current_head" = $(cat "$amend") ||
-   die "$(gettext "\
+   fi
+   . "$author_script" ||
+   die "$(gettext "Error trying to find the author 
identity to amend commit")"
+   if test -f "$amend"
+   then
+   current_head=$(git rev-parse --verify HEAD)
+   test "$current_head" = $(cat "$amend") ||
+   die "$(gettext "\
 You have uncommitted changes in your working tree. Please commit them
 first and then run 'git rebase --continue' again.")"
-   do_with_author git commit --amend --no-verify -F "$msg" 
-e \
-   ${gpg_sign_opt:+"$gpg_sign_opt"} 
$allow_empty_message ||
-   die "$(gettext "Could not commit staged 
changes.")"
-   else
-   do_with_author git commit --no-verify -F "$msg" -e \
-   ${gpg_sign_opt:+"$gpg_sign_opt"} 
$allow_empty_message ||
-   die "$(gettext "Could not commit staged 
changes.")"
+   do_with_author git commit --amend --no-verify 
-F "$msg" -e \
+   ${gpg_sign_opt:+"$gpg_sign_opt"} 
$allow_empty_message ||
+   die "$(gettext "Could not commit staged 
changes.")"
+   else
+   do_with_author git commit --no-verify -F "$msg" 
-e \
+   ${gpg_sign_opt:+"$gpg_sign_opt"} 
$allow_empty_message ||
+   die "$(gettext "Could not commit staged 
changes.")"
+   fi
fi
-   fi
 
-   if test -r "$state_dir"/stopped-sha
-   then
-   record_in_rewritten "$(cat "$state_dir"/stopped-sha)"
-   fi
+   if test -r "$state_dir"/stopped-sha
+   then
+   record_in_rewritten "$(cat "$state_dir"/stopped-sha)"
+   fi
 
-   require_clean_work_tree "rebase"
-   do_rest
-   return 0
-   ;;
-skip)
-   git rerere clear
+   require_clean_work_tree

[RFC PATCH v3 8/9] Remove unused code paths from git_rebaseinteractivepreserve_merges

2018-03-22 Thread Wink Saville

Since git_rebase__interactive__preserve_merges is now always called with
$preserve_merges = t we can remove the unused code paths.

Signed-off-by: Wink Saville 
---
 git-rebase--interactive.sh | 152 -
 1 file changed, 69 insertions(+), 83 deletions(-)

diff --git a/git-rebase--interactive.sh b/git-rebase--interactive.sh
index 346da0f67..ddbd126f2 100644
--- a/git-rebase--interactive.sh
+++ b/git-rebase--interactive.sh
@@ -982,100 +982,86 @@ git_rebase__interactive__preserve_merges () {
setup_reflog_action
init_basic_state
 
-   if test t = "$preserve_merges"
+   if test -z "$rebase_root"
then
-   if test -z "$rebase_root"
-   then
-   mkdir "$rewritten" &&
-   for c in $(git merge-base --all $orig_head $upstream)
-   do
-   echo $onto > "$rewritten"/$c ||
-   die "$(gettext "Could not init 
rewritten commits")"
-   done
-   else
-   mkdir "$rewritten" &&
-   echo $onto > "$rewritten"/root ||
+   mkdir "$rewritten" &&
+   for c in $(git merge-base --all $orig_head $upstream)
+   do
+   echo $onto > "$rewritten"/$c ||
die "$(gettext "Could not init rewritten 
commits")"
-   fi
-   # No cherry-pick because our first pass is to determine
-   # parents to rewrite and skipping dropped commits would
-   # prematurely end our probe
-   merges_option=
+   done
else
-   merges_option="--no-merges --cherry-pick"
+   mkdir "$rewritten" &&
+   echo $onto > "$rewritten"/root ||
+   die "$(gettext "Could not init rewritten commits")"
fi
 
+   # No cherry-pick because our first pass is to determine
+   # parents to rewrite and skipping dropped commits would
+   # prematurely end our probe
+   merges_option=
+
init_revisions_and_shortrevisions
 
-   if test t != "$preserve_merges"
-   then
-   git rebase--helper --make-script ${keep_empty:+--keep-empty} \
-   $revisions ${restrict_revision+^$restrict_revision} 
>"$todo" ||
-   die "$(gettext "Could not generate todo list")"
-   else
-   format=$(git config --get rebase.instructionFormat)
-   # the 'rev-list .. | sed' requires %m to parse; the instruction 
requires %H to parse
-   git rev-list $merges_option --format="%m%H ${format:-%s}" \
-   --reverse --left-right --topo-order \
-   $revisions ${restrict_revision+^$restrict_revision} | \
-   sed -n "s/^>//p" |
-   while read -r sha1 rest
-   do
+   format=$(git config --get rebase.instructionFormat)
+   # the 'rev-list .. | sed' requires %m to parse; the instruction 
requires %H to parse
+   git rev-list $merges_option --format="%m%H ${format:-%s}" \
+   --reverse --left-right --topo-order \
+   $revisions ${restrict_revision+^$restrict_revision} | \
+   sed -n "s/^>//p" |
+   while read -r sha1 rest
+   do
 
-   if test -z "$keep_empty" && is_empty_commit $sha1 && ! 
is_merge_commit $sha1
-   then
-   comment_out="$comment_char "
-   else
-   comment_out=
-   fi
+   if test -z "$keep_empty" && is_empty_commit $sha1 && ! 
is_merge_commit $sha1
+   then
+   comment_out="$comment_char "
+   else
+   comment_out=
+   fi
 
-   if test -z "$rebase_root"
-   then
-   preserve=t
-   for p in $(git rev-list --parents -1 $sha1 | 
cut -d' ' -s -f2-)
-   do
-   if test -f "$rewritten"/$p
-   then
-   preserve=f
-   fi
-   done
-   else
-   preserve=f
-   fi
-   if test f = "$preserve"
-   then
-   touch "$rewritten"/$sha1
-   printf '%s\n' "${comment_out}pick $sha1 $rest" 
>>"$todo"
-   fi
-   done
-   fi
+   if test -z "$rebase_root"
+   then
+   preserve=t
+   for p

[RFC PATCH v3 7/9] Remove unused code paths from git_rebase__interactive

2018-03-22 Thread Wink Saville

Since git_rebase__interactive is now never called with
$preserve_merges = t we can remove those code paths.

Signed-off-by: Wink Saville 
---
 git-rebase--interactive.sh | 95 ++
 1 file changed, 4 insertions(+), 91 deletions(-)

diff --git a/git-rebase--interactive.sh b/git-rebase--interactive.sh
index ab5513d80..346da0f67 100644
--- a/git-rebase--interactive.sh
+++ b/git-rebase--interactive.sh
@@ -961,100 +961,13 @@ git_rebase__interactive () {
setup_reflog_action
init_basic_state
 
-   if test t = "$preserve_merges"
-   then
-   if test -z "$rebase_root"
-   then
-   mkdir "$rewritten" &&
-   for c in $(git merge-base --all $orig_head $upstream)
-   do
-   echo $onto > "$rewritten"/$c ||
-   die "$(gettext "Could not init 
rewritten commits")"
-   done
-   else
-   mkdir "$rewritten" &&
-   echo $onto > "$rewritten"/root ||
-   die "$(gettext "Could not init rewritten 
commits")"
-   fi
-   # No cherry-pick because our first pass is to determine
-   # parents to rewrite and skipping dropped commits would
-   # prematurely end our probe
-   merges_option=
-   else
-   merges_option="--no-merges --cherry-pick"
-   fi
+   merges_option="--no-merges --cherry-pick"
 
init_revisions_and_shortrevisions
 
-   if test t != "$preserve_merges"
-   then
-   git rebase--helper --make-script ${keep_empty:+--keep-empty} \
-   $revisions ${restrict_revision+^$restrict_revision} 
>"$todo" ||
-   die "$(gettext "Could not generate todo list")"
-   else
-   format=$(git config --get rebase.instructionFormat)
-   # the 'rev-list .. | sed' requires %m to parse; the instruction 
requires %H to parse
-   git rev-list $merges_option --format="%m%H ${format:-%s}" \
-   --reverse --left-right --topo-order \
-   $revisions ${restrict_revision+^$restrict_revision} | \
-   sed -n "s/^>//p" |
-   while read -r sha1 rest
-   do
-
-   if test -z "$keep_empty" && is_empty_commit $sha1 && ! 
is_merge_commit $sha1
-   then
-   comment_out="$comment_char "
-   else
-   comment_out=
-   fi
-
-   if test -z "$rebase_root"
-   then
-   preserve=t
-   for p in $(git rev-list --parents -1 $sha1 | 
cut -d' ' -s -f2-)
-   do
-   if test -f "$rewritten"/$p
-   then
-   preserve=f
-   fi
-   done
-   else
-   preserve=f
-   fi
-   if test f = "$preserve"
-   then
-   touch "$rewritten"/$sha1
-   printf '%s\n' "${comment_out}pick $sha1 $rest" 
>>"$todo"
-   fi
-   done
-   fi
-
-   # Watch for commits that been dropped by --cherry-pick
-   if test t = "$preserve_merges"
-   then
-   mkdir "$dropped"
-   # Save all non-cherry-picked changes
-   git rev-list $revisions --left-right --cherry-pick | \
-   sed -n "s/^>//p" > "$state_dir"/not-cherry-picks
-   # Now all commits and note which ones are missing in
-   # not-cherry-picks and hence being dropped
-   git rev-list $revisions |
-   while read rev
-   do
-   if test -f "$rewritten"/$rev &&
-  ! sane_grep "$rev" "$state_dir"/not-cherry-picks 
>/dev/null
-   then
-   # Use -f2 because if rev-list is telling us 
this commit is
-   # not worthwhile, we don't want to track its 
multiple heads,
-   # just the history of its first-parent for 
others that will
-   # be rebasing on top of it
-   git rev-list --parents -1 $rev | cut -d' ' -s 
-f2 > "$dropped"/$rev
-   sha1=$(git rev-list -1 $rev)
-   sane_grep -v "^[a-z][a-z]* $sha1" <"$todo" > 
"${todo}2" ; mv "${todo}2" "$todo"
-

[RFC PATCH v3 5/9] Use new functions in git_rebase__interactive

2018-03-22 Thread Wink Saville

Use initiate_action, setup_reflog_action, init_basic_state,
init_revisions_and_shortrevisions and complete_action.

Signed-off-by: Wink Saville 
---
 git-rebase--interactive.sh | 187 ++---
 1 file changed, 8 insertions(+), 179 deletions(-)

diff --git a/git-rebase--interactive.sh b/git-rebase--interactive.sh
index b72f80ae8..2c10a7f1a 100644
--- a/git-rebase--interactive.sh
+++ b/git-rebase--interactive.sh
@@ -952,120 +952,15 @@ EOF
 }
 
 git_rebase__interactive () {
-   case "$action" in
-   continue)
-   if test ! -d "$rewritten"
-   then
-   exec git rebase--helper ${force_rebase:+--no-ff} 
$allow_empty_message \
-   --continue
-   fi
-   # do we have anything to commit?
-   if git diff-index --cached --quiet HEAD --
-   then
-   # Nothing to commit -- skip this commit
-
-   test ! -f "$GIT_DIR"/CHERRY_PICK_HEAD ||
-   rm "$GIT_DIR"/CHERRY_PICK_HEAD ||
-   die "$(gettext "Could not remove CHERRY_PICK_HEAD")"
-   else
-   if ! test -f "$author_script"
-   then
-   gpg_sign_opt_quoted=${gpg_sign_opt:+$(git 
rev-parse --sq-quote "$gpg_sign_opt")}
-   die "$(eval_gettext "\
-You have staged changes in your working tree.
-If these changes are meant to be
-squashed into the previous commit, run:
-
-  git commit --amend \$gpg_sign_opt_quoted
-
-If they are meant to go into a new commit, run:
-
-  git commit \$gpg_sign_opt_quoted
-
-In both cases, once you're done, continue with:
-
-  git rebase --continue
-")"
-   fi
-   . "$author_script" ||
-   die "$(gettext "Error trying to find the author 
identity to amend commit")"
-   if test -f "$amend"
-   then
-   current_head=$(git rev-parse --verify HEAD)
-   test "$current_head" = $(cat "$amend") ||
-   die "$(gettext "\
-You have uncommitted changes in your working tree. Please commit them
-first and then run 'git rebase --continue' again.")"
-   do_with_author git commit --amend --no-verify 
-F "$msg" -e \
-   ${gpg_sign_opt:+"$gpg_sign_opt"} 
$allow_empty_message ||
-   die "$(gettext "Could not commit staged 
changes.")"
-   else
-   do_with_author git commit --no-verify -F "$msg" 
-e \
-   ${gpg_sign_opt:+"$gpg_sign_opt"} 
$allow_empty_message ||
-   die "$(gettext "Could not commit staged 
changes.")"
-   fi
-   fi
-
-   if test -r "$state_dir"/stopped-sha
-   then
-   record_in_rewritten "$(cat "$state_dir"/stopped-sha)"
-   fi
-
-   require_clean_work_tree "rebase"
-   do_rest
-   return 0
-   ;;
-   skip)
-   git rerere clear
-
-   if test ! -d "$rewritten"
-   then
-   exec git rebase--helper ${force_rebase:+--no-ff} 
$allow_empty_message \
-   --continue
-   fi
-   do_rest
+   initiate_action "$action"
+   ret=$?
+   if test $ret = 0; then
return 0
-   ;;
-   edit-todo)
-   git stripspace --strip-comments <"$todo" >"$todo".new
-   mv -f "$todo".new "$todo"
-   collapse_todo_ids
-   append_todo_help
-   gettext "
-You are editing the todo file of an ongoing interactive rebase.
-To continue rebase after editing, run:
-git rebase --continue
-
-" | git stripspace --comment-lines >>"$todo"
-
-   git_sequence_editor "$todo" ||
-   die "$(gettext "Could not execute editor")"
-   expand_todo_ids
-
-   exit
-   ;;
-   show-current-patch)
-   exec git show REBASE_HEAD --
-   ;;
-   esac
-
-   comment_for_reflog start
-
-   if test ! -z "$switch_to"
-   then
-   GIT_REFLOG_ACTION="$GIT_REFLOG_ACTION: checkout $switch_to"
-   output git checkout "$switch_to" -- ||
-   die "$(eval_gettext "Could not checkout \$switch_to")"
-
-   comment_for_reflog start
fi
 
-   orig_head=$(git rev-parse --verify HEAD) || die "$(gettext "No HEAD?")"
-   mkdir -p "$state_dir" || die "$(eval_gettext "Could not create 
temporary \$state_dir")"
-   rm -f "$(git rev-parse

[RFC PATCH v3 9/9] Remove merges_option and a blank line

2018-03-22 Thread Wink Saville

merges_option is unused in git_rebase__interactive and always empty in
git_rebase__interactive__preserve_merges so it can be removed.

Signed-off-by: Wink Saville 
---
 git-rebase--interactive.sh | 10 +-
 1 file changed, 1 insertion(+), 9 deletions(-)

diff --git a/git-rebase--interactive.sh b/git-rebase--interactive.sh
index ddbd126f2..50323fc27 100644
--- a/git-rebase--interactive.sh
+++ b/git-rebase--interactive.sh
@@ -961,8 +961,6 @@ git_rebase__interactive () {
setup_reflog_action
init_basic_state
 
-   merges_option="--no-merges --cherry-pick"
-
init_revisions_and_shortrevisions
 
git rebase--helper --make-script ${keep_empty:+--keep-empty} \
@@ -996,22 +994,16 @@ git_rebase__interactive__preserve_merges () {
die "$(gettext "Could not init rewritten commits")"
fi
 
-   # No cherry-pick because our first pass is to determine
-   # parents to rewrite and skipping dropped commits would
-   # prematurely end our probe
-   merges_option=
-
init_revisions_and_shortrevisions
 
format=$(git config --get rebase.instructionFormat)
# the 'rev-list .. | sed' requires %m to parse; the instruction 
requires %H to parse
-   git rev-list $merges_option --format="%m%H ${format:-%s}" \
+   git rev-list --format="%m%H ${format:-%s}" \
--reverse --left-right --topo-order \
$revisions ${restrict_revision+^$restrict_revision} | \
sed -n "s/^>//p" |
while read -r sha1 rest
do
-
if test -z "$keep_empty" && is_empty_commit $sha1 && ! 
is_merge_commit $sha1
then
comment_out="$comment_char "
-- 
2.16.2

[RFC PATCH v3 4/9] Extract functions out of git_rebase__interactive

2018-03-22 Thread Wink Saville

The extracted functions are:
  - initiate_action
  - setup_reflog_action
  - init_basic_state
  - init_revisions_and_shortrevisions
  - complete_action

Signed-off-by: Wink Saville 
---
 git-rebase--interactive.sh | 211 +
 1 file changed, 211 insertions(+)

diff --git a/git-rebase--interactive.sh b/git-rebase--interactive.sh
index a79330f45..b72f80ae8 100644
--- a/git-rebase--interactive.sh
+++ b/git-rebase--interactive.sh
@@ -740,6 +740,217 @@ get_missing_commit_check_level () {
printf '%s' "$check_level" | tr 'A-Z' 'a-z'
 }
 
+# Initiate an action. If the cannot be any
+# further action it  may exec a command
+# or exit and not return.
+#
+# TODO: Consider a cleaner return model so it
+# never exits and always return 0 if process
+# is complete.
+#
+# Parameter 1 is the action to initiate.
+#
+# Returns 0 if the action was able to complete
+# and if 1 if further processing is required.
+initiate_action () {
+   case "$1" in
+   continue)
+   if test ! -d "$rewritten"
+   then
+   exec git rebase--helper ${force_rebase:+--no-ff} 
$allow_empty_message \
+   --continue
+   fi
+   # do we have anything to commit?
+   if git diff-index --cached --quiet HEAD --
+   then
+   # Nothing to commit -- skip this commit
+
+   test ! -f "$GIT_DIR"/CHERRY_PICK_HEAD ||
+   rm "$GIT_DIR"/CHERRY_PICK_HEAD ||
+   die "$(gettext "Could not remove CHERRY_PICK_HEAD")"
+   else
+   if ! test -f "$author_script"
+   then
+   gpg_sign_opt_quoted=${gpg_sign_opt:+$(git 
rev-parse --sq-quote "$gpg_sign_opt")}
+   die "$(eval_gettext "\
+You have staged changes in your working tree.
+If these changes are meant to be
+squashed into the previous commit, run:
+
+  git commit --amend \$gpg_sign_opt_quoted
+
+If they are meant to go into a new commit, run:
+
+  git commit \$gpg_sign_opt_quoted
+
+In both cases, once you're done, continue with:
+
+  git rebase --continue
+")"
+   fi
+   . "$author_script" ||
+   die "$(gettext "Error trying to find the author 
identity to amend commit")"
+   if test -f "$amend"
+   then
+   current_head=$(git rev-parse --verify HEAD)
+   test "$current_head" = $(cat "$amend") ||
+   die "$(gettext "\
+You have uncommitted changes in your working tree. Please commit them
+first and then run 'git rebase --continue' again.")"
+   do_with_author git commit --amend --no-verify 
-F "$msg" -e \
+   ${gpg_sign_opt:+"$gpg_sign_opt"} 
$allow_empty_message ||
+   die "$(gettext "Could not commit staged 
changes.")"
+   else
+   do_with_author git commit --no-verify -F "$msg" 
-e \
+   ${gpg_sign_opt:+"$gpg_sign_opt"} 
$allow_empty_message ||
+   die "$(gettext "Could not commit staged 
changes.")"
+   fi
+   fi
+
+   if test -r "$state_dir"/stopped-sha
+   then
+   record_in_rewritten "$(cat "$state_dir"/stopped-sha)"
+   fi
+
+   require_clean_work_tree "rebase"
+   do_rest
+   return 0
+   ;;
+   skip)
+   git rerere clear
+
+   if test ! -d "$rewritten"
+   then
+   exec git rebase--helper ${force_rebase:+--no-ff} 
$allow_empty_message \
+   --continue
+   fi
+   do_rest
+   return 0
+   ;;
+   edit-todo)
+   git stripspace --strip-comments <"$todo" >"$todo".new
+   mv -f "$todo".new "$todo"
+   collapse_todo_ids
+   append_todo_help
+   gettext "
+You are editing the todo file of an ongoing interactive rebase.
+To continue rebase after editing, run:
+git rebase --continue
+
+" | git stripspace --comment-lines >>"$todo"
+
+   git_sequence_editor "$todo" ||
+   die "$(gettext "Could not execute editor")"
+   expand_todo_ids
+
+   exit
+   ;;
+   show-current-patch)
+   exec git show REBASE_HEAD --
+   ;;
+   *)
+   return 1 # continue
+   ;;
+   esac
+}
+
+setup_reflog_action () {
+   comment_for_reflog start
+
+   if test ! -z "$switch_to"
+   then
+

[RFC PATCH v3 2/9] Call git_rebase__interactive from run_specific_rebase

2018-03-22 Thread Wink Saville

Instead of indirectly invoking git_rebase__interactive this invokes
it directly after sourcing.

Signed-off-by: Wink Saville 
---
 git-rebase--interactive.sh | 11 ---
 git-rebase.sh  | 11 +--
 2 files changed, 9 insertions(+), 13 deletions(-)

diff --git a/git-rebase--interactive.sh b/git-rebase--interactive.sh
index 561e2660e..213d75f43 100644
--- a/git-rebase--interactive.sh
+++ b/git-rebase--interactive.sh
@@ -740,15 +740,6 @@ get_missing_commit_check_level () {
printf '%s' "$check_level" | tr 'A-Z' 'a-z'
 }
 
-# The whole contents of this file is run by dot-sourcing it from
-# inside a shell function.  It used to be that "return"s we see
-# below were not inside any function, and expected to return
-# to the function that dot-sourced us.
-#
-# However, older (9.x) versions of FreeBSD /bin/sh misbehave on such a
-# construct and continue to run the statements that follow such a "return".
-# As a work-around, we introduce an extra layer of a function
-# here, and immediately call it after defining it.
 git_rebase__interactive () {
 
 case "$action" in
@@ -1029,5 +1020,3 @@ fi
 do_rest
 
 }
-# ... and then we call the whole thing.
-git_rebase__interactive
diff --git a/git-rebase.sh b/git-rebase.sh
index a1f6e5de6..c4ec7c21b 100755
--- a/git-rebase.sh
+++ b/git-rebase.sh
@@ -196,8 +196,15 @@ run_specific_rebase () {
export GIT_EDITOR
autosquash=
fi
-   . git-rebase--$type
-   ret=$?
+   if test "$type" = interactive
+   then
+   . git-rebase--interactive
+   git_rebase__interactive
+   ret=$?
+   else
+   . git-rebase--$type
+   ret=$?
+   fi
if test $ret -eq 0
then
finish_rebase
-- 
2.16.2

[RFC PATCH v3 6/9] Add and use git_rebaseinteractivepreserve_merges

2018-03-22 Thread Wink Saville

At the moment it's an exact copy of git_rebase__interactive except
the name has changed.

Signed-off-by: Wink Saville 
---
 git-rebase--interactive.sh | 108 +
 git-rebase.sh  |   7 ++-
 2 files changed, 114 insertions(+), 1 deletion(-)

diff --git a/git-rebase--interactive.sh b/git-rebase--interactive.sh
index 2c10a7f1a..ab5513d80 100644
--- a/git-rebase--interactive.sh
+++ b/git-rebase--interactive.sh
@@ -1058,3 +1058,111 @@ git_rebase__interactive () {
 
complete_action
 }
+
+git_rebase__interactive__preserve_merges () {
+   initiate_action "$action"
+   ret=$?
+   if test $ret = 0; then
+   return 0
+   fi
+
+   setup_reflog_action
+   init_basic_state
+
+   if test t = "$preserve_merges"
+   then
+   if test -z "$rebase_root"
+   then
+   mkdir "$rewritten" &&
+   for c in $(git merge-base --all $orig_head $upstream)
+   do
+   echo $onto > "$rewritten"/$c ||
+   die "$(gettext "Could not init 
rewritten commits")"
+   done
+   else
+   mkdir "$rewritten" &&
+   echo $onto > "$rewritten"/root ||
+   die "$(gettext "Could not init rewritten 
commits")"
+   fi
+   # No cherry-pick because our first pass is to determine
+   # parents to rewrite and skipping dropped commits would
+   # prematurely end our probe
+   merges_option=
+   else
+   merges_option="--no-merges --cherry-pick"
+   fi
+
+   init_revisions_and_shortrevisions
+
+   if test t != "$preserve_merges"
+   then
+   git rebase--helper --make-script ${keep_empty:+--keep-empty} \
+   $revisions ${restrict_revision+^$restrict_revision} 
>"$todo" ||
+   die "$(gettext "Could not generate todo list")"
+   else
+   format=$(git config --get rebase.instructionFormat)
+   # the 'rev-list .. | sed' requires %m to parse; the instruction 
requires %H to parse
+   git rev-list $merges_option --format="%m%H ${format:-%s}" \
+   --reverse --left-right --topo-order \
+   $revisions ${restrict_revision+^$restrict_revision} | \
+   sed -n "s/^>//p" |
+   while read -r sha1 rest
+   do
+
+   if test -z "$keep_empty" && is_empty_commit $sha1 && ! 
is_merge_commit $sha1
+   then
+   comment_out="$comment_char "
+   else
+   comment_out=
+   fi
+
+   if test -z "$rebase_root"
+   then
+   preserve=t
+   for p in $(git rev-list --parents -1 $sha1 | 
cut -d' ' -s -f2-)
+   do
+   if test -f "$rewritten"/$p
+   then
+   preserve=f
+   fi
+   done
+   else
+   preserve=f
+   fi
+   if test f = "$preserve"
+   then
+   touch "$rewritten"/$sha1
+   printf '%s\n' "${comment_out}pick $sha1 $rest" 
>>"$todo"
+   fi
+   done
+   fi
+
+   # Watch for commits that been dropped by --cherry-pick
+   if test t = "$preserve_merges"
+   then
+   mkdir "$dropped"
+   # Save all non-cherry-picked changes
+   git rev-list $revisions --left-right --cherry-pick | \
+   sed -n "s/^>//p" > "$state_dir"/not-cherry-picks
+   # Now all commits and note which ones are missing in
+   # not-cherry-picks and hence being dropped
+   git rev-list $revisions |
+   while read rev
+   do
+   if test -f "$rewritten"/$rev &&
+  ! sane_grep "$rev" "$state_dir"/not-cherry-picks 
>/dev/null
+   then
+   # Use -f2 because if rev-list is telling us 
this commit is
+   # not worthwhile, we don't want to track its 
multiple heads,
+   # just the history of its first-parent for 
others that will
+   # be rebasing on top of it
+   git rev-list --parents -1 $rev | cut -d' ' -s 
-f2 > "$dropped"/$rev
+

[RFC PATCH v3 0/9] rebase-interactive:

2018-03-22 Thread Wink Saville

I've incorporated review feed back to date. I'm split the change
into 9 commits with each commit do a single class of operation.

I've prepared these commits using github and have Travis-CI setup to
test my changes. Of the 9 commits 2 failed, the 1st and 5th commits, I
tested those two locally and they were fine. I then restarted the builds
on Travis-CI they finished fine so it seems the errors were spurious.

Wink Saville (9):
  Simplify pick_on_preserving_merges
  Call git_rebase__interactive from run_specific_rebase
  Indent function git_rebase__interactive
  Extract functions out of git_rebase__interactive
  Use new functions in git_rebase__interactive
  Add and use git_rebase__interactive__preserve_merges
  Remove unused code paths from git_rebase__interactive
  Remove unused code paths from git_rebase__interactive__preserve_merges
  Remove merges_option and a blank line

 git-rebase--interactive.sh | 407 -
 git-rebase.sh  |  16 +-
 2 files changed, 229 insertions(+), 194 deletions(-)

-- 
2.16.2

[RFC PATCH v3 1/9] Simplify pick_on_preserving_merges

2018-03-22 Thread Wink Saville

Signed-off-by: Wink Saville 
---
 git-rebase--interactive.sh | 17 +++--
 1 file changed, 7 insertions(+), 10 deletions(-)

diff --git a/git-rebase--interactive.sh b/git-rebase--interactive.sh
index 331c8dfea..561e2660e 100644
--- a/git-rebase--interactive.sh
+++ b/git-rebase--interactive.sh
@@ -307,17 +307,14 @@ pick_one_preserving_merges () {
esac
sha1=$(git rev-parse $sha1)
 
-   if test -f "$state_dir"/current-commit
+   if test -f "$state_dir"/current-commit && test "$fast_forward" = t
then
-   if test "$fast_forward" = t
-   then
-   while read current_commit
-   do
-   git rev-parse HEAD > 
"$rewritten"/$current_commit
-   done <"$state_dir"/current-commit
-   rm "$state_dir"/current-commit ||
-   die "$(gettext "Cannot write current commit's 
replacement sha1")"
-   fi
+   while read current_commit
+   do
+   git rev-parse HEAD > "$rewritten"/$current_commit
+   done <"$state_dir"/current-commit
+   rm "$state_dir"/current-commit ||
+   die "$(gettext "Cannot write current commit's 
replacement sha1")"
fi
 
echo $sha1 >> "$state_dir"/current-commit
-- 
2.16.2

RE: Bug With git rebase -p

2018-03-22 Thread Joseph Strauss

I meant to say that I installed 2.17.0-rc0, and it worked perfectly. Sorry for 
the ambiguity.

-Original Message-
From: Junio C Hamano [mailto:jch2...@gmail.com] On Behalf Of Junio C Hamano
Sent: Thursday, March 22, 2018 12:39 PM
To: Johannes Schindelin 
Cc: Joseph Strauss ; git@vger.kernel.org
Subject: Re: Bug With git rebase -p

Johannes Schindelin  writes:

> On Tue, 20 Mar 2018, Joseph Strauss wrote:
>
>> Perfect. Thank you.
>
> You are welcome.
>
> I am puzzled, though... does your message mean that you tested the Git 
> for Windows v2.17.0-rc0 installer and it did fix your problem? Or do 
> you simply assume that it does fix your problem because Junio & I 
> expect it to fix your problem?

Thanks for asking, as I was curious about the same thing after interpreting 
what Joseph said as "oh, perfect that there is a packaged thing I can readily 
test" (implying "I'll get back to you after seeing if it helps").

Re: [PATCH 0/2] rebase --recreate-merges --keep-empty: don't prune empty

2018-03-22 Thread Junio C Hamano

Phillip Wood  writes:

> On 20/03/18 19:32, Junio C Hamano wrote:
>
>> With or without the above plan, what we saw from you were a bit
>> messy to queue.  The --keep-empty fix series is based on 'maint',
>> while the --signoff series depends on changes that happened to
>> sequencer between 'maint' and 'master', but yet depends on the
>> former.
>
> Yes, that is awkward and unfortunate but the idea behind splitting them
> into two separate series was to have a single set of bug fixes in the
> history. The feature needed to be based on master, so if I'd had the bug
> fixes in the same series you'd of had to cherry-pick them to maint which
> would break branch/tag --contains. I'm not sure if that is a better option.

I said "a bit messy" but that was a statement of a fact, not a
complaint.  Sometimes, we cannot avoid that necessary solutions to
real-life problems must be messy.

I still think what you sent was the best organization, given the
constraints ;-).

Thanks.

The most efficient way to test if repositories share the same objects

2018-03-22 Thread Konstantin Ryabitsev

Hi, all:

What is the most efficient way to test if repoA and repoB share common
commits? My goal is to automatically figure out if repoB can benefit
from setting alternates to repoA and repacking. I currently do it by
comparing the output of "show-ref --tags -s", but that does not work for
repos without tags.

Best,
-- 
Konstantin Ryabitsev



signature.asc
Description: OpenPGP digital signature

UriFormatException output when interacting with a Git server

2018-03-22 Thread John Chesshir

I'm getting this error on a fresh install of git version 2.16.2.windows.1:

fatal: UriFormatException encountered.
   queryUrl

See this post I found, 
https://stackoverflow.com/questions/48775400/git-fatal-uriformatexception-encountered-actualurl
 for more details.  Note that the latest comments are mine, and describe the 
conditions on which I first saw the problem.  I'll warn you, it's weird.

Thank you,
John Chesshir

P.S.  I also found this older post, which appears related, but has clearly been 
fixed: 
https://superuser.com/questions/1114193/when-cloning-on-with-git-bash-on-windows-getting-fatal-uriformatexception-enco.
  Include just in case that helps get closer to the latest problem.

Re: Bug With git rebase -p

2018-03-22 Thread Junio C Hamano

Johannes Schindelin  writes:

> On Tue, 20 Mar 2018, Joseph Strauss wrote:
>
>> Perfect. Thank you.
>
> You are welcome.
>
> I am puzzled, though... does your message mean that you tested the Git for
> Windows v2.17.0-rc0 installer and it did fix your problem? Or do you
> simply assume that it does fix your problem because Junio & I expect it to
> fix your problem?

Thanks for asking, as I was curious about the same thing after
interpreting what Joseph said as "oh, perfect that there is a
packaged thing I can readily test" (implying "I'll get back to you
after seeing if it helps").

Re: [PATCH 1/1] Fix typo in merge-strategies documentation

2018-03-22 Thread Junio C Hamano

Johannes Schindelin  writes:

> On Mon, 19 Mar 2018, Junio C Hamano wrote:
>
>> David Pursehouse  writes:
>> 
>> > From: David Pursehouse 
>> >
>> > Signed-off-by: David Pursehouse 
>> > ---
>> 
>> I somehow had to stare at the patch for a few minutes, view it in
>> two Emacs buffers and run M-x compare-windows before I finally spot
>> the single-byte typofix.
>
> Pro-tip: git am && git show --color-words

Yeah, I know, but that would not work while I am wondering if the
patch is worth applying in the first place ;-)

Re: [PATCH] filter-branch: consider refs can refer to an object other than commit or tag

2018-03-22 Thread Yuki Kokubun

> Yuki Kokubun  writes:
> 
> >> Yuki Kokubun  writes:
> >> 
> >> > "git filter-branch -- --all" can be confused when refs that refer to 
> >> > objects
> >> > other than commits or tags exists.
> >> > Because "git rev-parse --all" that is internally used can return refs 
> >> > that
> >> > refer to an object other than commit or tag. But it is not considered in 
> >> > the
> >> > phase of updating refs.
> >> 
> >> Could you describe what the consequence of that is?  We have a ref
> >> that points directly at a blob object, or a ref that points at a tag
> >> object that points at a blob object.  The current code leaves both of
> >> these refs in "$tempdir/heads".  Then...?
> >
> > Sorry, this is my wrong.
> > I wrongly thought only refs/replace can point at a blob or tree object.
> 
> No need to be sorry.  You still need to describe what (bad things)
> happen if we do not filter out refs that do not point at committish
> in the proposed log message.  
> 
> IOW, can you elaborate and clarify your "can be confused" at the
> beginning?

I meant the confusion is abnormal messages from the output of "git 
filter-branch -- --all".
For example, this is an output of "git filter-branch -- --all":

Rewrite bcdbd016c77df3d5641a3cf820b2ed46ba7bf3b4 (5/5) (0 seconds passed, 
remaining 0 predicted)
WARNING: Ref 'refs/heads/master' is unchanged
WARNING: Ref 'refs/heads/no-newline' is unchanged
WARNING: Ref 'refs/heads/original' is unchanged
error: object 1bf53b49c26465454e4ac377f2ed3f91bb1d6ac1 is a tree, not a commit
error: object 1bf53b49c26465454e4ac377f2ed3f91bb1d6ac1 is a tree, not a commit
fatal: ambiguous argument 
'refs/replace/8a2016f3730cad8309c110f819c855403ed0a5b9^0': unknown revision or 
path not in the working tree.
Use '--' to separate paths from revisions, like this:
'git  [...] -- [...]'
WARNING: Ref 'refs/replace/8a2016f3730cad8309c110f819c855403ed0a5b9' is 
unchanged
WARNING: Ref 'refs/tags/add-file' is unchanged
WARNING: Ref 'refs/tags/file' is unchanged
error: object 1bf53b49c26465454e4ac377f2ed3f91bb1d6ac1 is a tree, not a commit
error: object 1bf53b49c26465454e4ac377f2ed3f91bb1d6ac1 is a tree, not a commit
fatal: ambiguous argument 'refs/tags/treetag^0': unknown revision or path not 
in the working tree.
Use '--' to separate paths from revisions, like this:
'git  [...] -- [...]'
WARNING: Ref 'refs/tags/treetag' is unchanged

You can see a lot of terrible messages such as "error" and "fatal".
But on the whole, the result of "git filter-branch -- --all" is not so abnormal.
So, this is a just problem about abonormal messages.

I think this messages should be suppressed.
How do you feel about it?

[PATCH] completion: clear cached --options when sourcing the completion script

2018-03-22 Thread SZEDER Gábor

The established way to update the completion script in an already
running shell is to simply source it again: this brings in any new
--options and features, and clears caching variables.  E.g. it clears
the variables caching the list of (all|porcelain) git commands, so
when they are later lazy-initialized again, then they will list and
cache any newly installed commmands as well.

Unfortunately, since d401f3debc (git-completion.bash: introduce
__gitcomp_builtin, 2018-02-09) and subsequent patches this doesn't
work for a lot of git commands' options.  To eliminate a lot of
hard-to-maintain hard-coded lists of options, those commits changed
the completion script to use a bunch of programmatically created and
lazy-initialized variables to cache the options of those builtin
porcelain commands that use parse-options.  These variables are not
cleared upon sourcing the completion script, therefore they continue
caching the old lists of options, even when some commands recently
learned new options or when deprecated options were removed.

Always 'unset' these variables caching the options of builtin commands
when sourcing the completion script.

Redirect 'unset's stderr to /dev/null, because ZSH's 'unset' complains
if it's invoked without any arguments, i.e. no variables caching
builtin's options are set.  This can happen, if someone were to source
the completion script twice without completing any --options in
between.  Bash stays silent in this case.

Add tests to ensure that these variables are indeed cleared when the
completion script is sourced; not just the variables caching options,
but all other caching variables, i.e. the variables caching commands,
porcelain commands and merge strategies as well.

Signed-off-by: SZEDER Gábor 
---

Fixes a recent regression introduced in 'nd/parseopt-completion'.

 contrib/completion/git-completion.bash |  4 
 t/t9902-completion.sh  | 31 ++
 2 files changed, 35 insertions(+)

diff --git a/contrib/completion/git-completion.bash 
b/contrib/completion/git-completion.bash
index 7c84eb1912..602352f952 100644
--- a/contrib/completion/git-completion.bash
+++ b/contrib/completion/git-completion.bash
@@ -280,6 +280,10 @@ __gitcomp ()
esac
 }
 
+# Clear the variables caching builtins' options when (re-)sourcing
+# the completion script.
+unset $(set |sed -ne 
's/^\(__gitcomp_builtin_[a-zA-Z0-9_][a-zA-Z0-9_]*\)=.*/\1/p') 2>/dev/null
+
 # This function is equivalent to
 #
 #__gitcomp "$(git xxx --git-completion-helper) ..."
diff --git a/t/t9902-completion.sh b/t/t9902-completion.sh
index e6485feb0a..4c86adadf2 100755
--- a/t/t9902-completion.sh
+++ b/t/t9902-completion.sh
@@ -1497,4 +1497,35 @@ do
'
 done
 
+test_expect_success 'sourcing the completion script clears cached commands' '
+   __git_compute_all_commands &&
+   verbose test -n "$__git_all_commands" &&
+   . "$GIT_BUILD_DIR/contrib/completion/git-completion.bash" &&
+   verbose test -z "$__git_all_commands"
+'
+
+test_expect_success 'sourcing the completion script clears cached porcelain 
commands' '
+   __git_compute_porcelain_commands &&
+   verbose test -n "$__git_porcelain_commands" &&
+   . "$GIT_BUILD_DIR/contrib/completion/git-completion.bash" &&
+   verbose test -z "$__git_porcelain_commands"
+'
+
+test_expect_success 'sourcing the completion script clears cached merge 
strategies' '
+   __git_compute_merge_strategies &&
+   verbose test -n "$__git_merge_strategies" &&
+   . "$GIT_BUILD_DIR/contrib/completion/git-completion.bash" &&
+   verbose test -z "$__git_merge_strategies"
+'
+
+test_expect_success 'sourcing the completion script clears cached --options' '
+   __gitcomp_builtin checkout &&
+   verbose test -n "$__gitcomp_builtin_checkout" &&
+   __gitcomp_builtin notes_edit &&
+   verbose test -n "$__gitcomp_builtin_notes_edit" &&
+   . "$GIT_BUILD_DIR/contrib/completion/git-completion.bash" &&
+   verbose test -z "$__gitcomp_builtin_checkout" &&
+   verbose test -z "$__gitcomp_builtin_notes_edit"
+'
+
 test_done
-- 
2.17.0.rc0.103.gbdc5836ed3

Re: [PATCH v2] filter-branch: use printf instead of echo -e

2018-03-22 Thread Michele Locati


Il 21/03/2018 22:50, Johannes Schindelin ha scritto:

Hi Michele,

On Mon, 19 Mar 2018, Michele Locati wrote:


[...]
--
2.16.2.windows.1


Yay!

Out of curiosity: did the CONTRIBUTING.md file help that was recently
turned into a guide how to contribute to Git (for Windows) by Derrick
Stolee?


Well, no. Here's what led me here...

The people behind the concrete5 CMS asked support for improving the 
following procedure:


concrete5 is stored in
https://github.com/concrete5/concrete5
In order to make it installable with Composer (a PHP package manager), 
we need to extract its "/concrete" directory, and push the results to 
https://github.com/concrete5/concrete5-core

We previously used "git filter-branch --all" with this script:
https://github.com/concrete5/core_splitter/blob/70879e676b95160f7fc5d0ffc22b8f7420b0580b/bin/splitcore

That script is executed every time someone pushes concrete5/concrete5, 
and it took very long time (3~5 minutes).


I noticed on the git-filter-branch manual page that there's a 
"--state-branch", and it seemed quite interesting.
So, I asked in git@vger.kernel.org how to use it, and the author of that 
feature (Ian Campbell) told me to look at some code he wrote.

So, I wrote
https://github.com/concrete5/incremental-filter-branch
which wraps "git filter-branch --filter-state", and it's used in
https://github.com/concrete5/core_splitter/blob/99bbbcea1ab90572a04864ccb92327532ab6a42c/bin/splitcore
(it not yet in production: Korvin wants to be sure everything works well 
before adopting it)
The time required for the operation dropped from 3~5 minutes to ~10 
seconds ;)


While writing that script, I noticed a couple of things that could be 
improved in "git-filter-branch", so I submitted

https://marc.info/?l=git=152111905428554=2
(so that the script could run without problems if there's nothing to do)
and
https://marc.info/?l=git=152147095416175=2
(so that  --filter-state could be used on Mac and other non-Linux systems).
How did I learn how to submit to git? I searched for "git src", landed 
on https://github.com/git/git, read Documentation/SubmittingPatches and 
used git send-email.


And yes, all that on Windows (and WSL).




Ciao,
Johannes


Ciao!

--
Michele

Re: [PATCH v6 00/11] nd/pack-objects-pack-struct updates

2018-03-22 Thread Jeff King

On Thu, Mar 22, 2018 at 11:57:27AM +0100, Duy Nguyen wrote:

> On Thu, Mar 22, 2018 at 10:32 AM, Jeff King  wrote:
> > That would still mean you could get into a broken state for serving
> > fetches, but you could at least get out of it by running "git repack".
> 
> I was puzzled by this "broken state" statement. But you were right! I
> focused on the repack case and forgot about fetch/clone case. I will
> probably just drop this patch for now. Then maybe revisit this some
> time in fiture when I find out how to deal with this nicely.

Here's a sketch of the "separate array" concept I mentioned before, in
case that helps. Not tested at all beyond compiling.

---
diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 4406af640f..e4e308b453 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -1090,7 +1090,7 @@ static void create_object_entry(const struct object_id 
*oid,
else
nr_result++;
if (found_pack) {
-   oe_set_in_pack(entry, found_pack);
+   oe_set_in_pack(_pack, entry, found_pack);
entry->in_pack_offset = found_offset;
}
 
diff --git a/pack-objects.h b/pack-objects.h
index 9f8e450e19..b94b9232fa 100644
--- a/pack-objects.h
+++ b/pack-objects.h
@@ -7,6 +7,8 @@
 #define OE_Z_DELTA_BITS16
 #define OE_DELTA_SIZE_BITS 31
 
+#define OE_IN_PACK_EXTENDED ((1 << OE_IN_PACK_BITS) - 1)
+
 /*
  * State flags for depth-first search used for analyzing delta cycles.
  *
@@ -111,8 +113,13 @@ struct packing_data {
uint32_t index_size;
 
unsigned int *in_pack_pos;
-   int in_pack_count;
-   struct packed_git *in_pack[1 << OE_IN_PACK_BITS];
+
+   struct packed_git **in_pack;
+   uint32_t in_pack_count;
+   size_t in_pack_alloc;
+
+   uint32_t *in_pack_extended;
+   size_t in_pack_extended_alloc;
 };
 
 struct object_entry *packlist_alloc(struct packing_data *pdata,
@@ -174,17 +181,13 @@ static inline void oe_set_in_pack_pos(const struct 
packing_data *pack,
 static inline unsigned int oe_add_pack(struct packing_data *pack,
   struct packed_git *p)
 {
-   if (pack->in_pack_count >= (1 << OE_IN_PACK_BITS))
-   die(_("too many packs to handle in one go. "
- "Please add .keep files to exclude\n"
- "some pack files and keep the number "
- "of non-kept files below %d."),
-   1 << OE_IN_PACK_BITS);
if (p) {
if (p->index > 0)
die("BUG: this packed is already indexed");
p->index = pack->in_pack_count;
}
+   ALLOC_GROW(pack->in_pack, pack->in_pack_count + 1,
+  pack->in_pack_alloc);
pack->in_pack[pack->in_pack_count] = p;
return pack->in_pack_count++;
 }
@@ -192,18 +195,28 @@ static inline unsigned int oe_add_pack(struct 
packing_data *pack,
 static inline struct packed_git *oe_in_pack(const struct packing_data *pack,
const struct object_entry *e)
 {
-   return pack->in_pack[e->in_pack_idx];
-
+   uint32_t idx = e->in_pack_idx;
+   if (idx == OE_IN_PACK_EXTENDED)
+   idx = pack->in_pack_extended[e - pack->objects];
+   return pack->in_pack[idx];
 }
 
-static inline void oe_set_in_pack(struct object_entry *e,
+static inline void oe_set_in_pack(struct packing_data *pack,
+ struct object_entry *e,
  struct packed_git *p)
 {
if (p->index <= 0)
die("BUG: found_pack should be NULL "
"instead of having non-positive index");
-   e->in_pack_idx = p->index;
-
+   else if (p->index < OE_IN_PACK_EXTENDED)
+   e->in_pack_idx = p->index;
+   else {
+   size_t index = e - pack->objects;
+   ALLOC_GROW(pack->in_pack_extended,
+  index, pack->in_pack_extended_alloc);
+   pack->in_pack_extended[index] = p->index;
+   e->in_pack_idx = OE_IN_PACK_EXTENDED;
+   }
 }
 
 static inline struct object_entry *oe_delta(

Re: [PATCH 7/7] diff-highlight: detect --graph by indent

2018-03-22 Thread Jeff King

On Thu, Mar 22, 2018 at 10:59:31AM +, Phillip Wood wrote:

> > +sub handle_line {
> > +   my $orig = shift;
> > +   local $_ = $orig;
> > +
> > +   # match a graph line that begins a commit
> > +   if (/^(?:$COLOR?\|$COLOR?[ ])* # zero or more leading "|" with space
> > +$COLOR?\*$COLOR?[ ]   # a "*" with its trailing space
> > + (?:$COLOR?\|$COLOR?[ ])* # zero or more trailing "|"
> > +[ ]*  # trailing whitespace for merges
> 
> Hi Peff, thanks for looking at this. I've only had a quick look through
> but I wonder if this will be confused by commit messages that contain
>   * bullet points
>   * like this

I think we should be OK because the commit messages are indented by 4
spaces, and we allow only single spaces amidst the graph-drawing bits
(and we respect the "*" only in those graph-drawing bits).

So I think you could fool it with something like:

  git log --graph --format='* oops!'

or maybe even:

  git log --graph --format='%B'

but not with a regular git-log format. Those final corner cases I don't
think we can overcome; it's just syntactically ambiguous. Unless perhaps
we traced the graph lines from the start of the output and knew how many
to expect, but that seems rather complicated.

Ultimately the best solution would be to avoid this parsing nonsense
entirely by teaching Git's internal diff to produce the highlighting
internally.

-Peff

PS I also eyeballed the results of "git log --graph -p --no-merges
  -1000" piped through the old and new versions. There are several
  changes, but they all look like improvements.

Re: [PATCH 0/2] rebase --recreate-merges --keep-empty: don't prune empty

2018-03-22 Thread Phillip Wood

On 20/03/18 19:32, Junio C Hamano wrote:
> Phillip Wood  writes:
> 
>> On 20/03/18 15:42, Johannes Schindelin wrote:
>> ...
>>> As indicated in another reply, I'd rather rebase the --recreate-merges
>>> patches on top of your --keep-empty patch series. This obviously means
>>> that I would fold essentially all of your 2/2 changes into my
>>> "rebase-helper --make-script: introduce a flag to recreate merges"
>>>
>>> The 1/2 (with s/failure/success/g) would then be added to the
>>> --recreate-merges patch series at the end.
>>>
>>> Would that be okay with you?
>>
>> Yes, that's fine, it would give a clearer history
> 
> With or without the above plan, what we saw from you were a bit
> messy to queue.  The --keep-empty fix series is based on 'maint',
> while the --signoff series depends on changes that happened to
> sequencer between 'maint' and 'master', but yet depends on the
> former.

Yes, that is awkward and unfortunate but the idea behind splitting them
into two separate series was to have a single set of bug fixes in the
history. The feature needed to be based on master, so if I'd had the bug
fixes in the same series you'd of had to cherry-pick them to maint which
would break branch/tag --contains. I'm not sure if that is a better option.

Best Wishes

Phillip

> In what I'll be pushing out at the end of today's integration run,
> I'll have two topics organized this way:
> 
>  - pw/rebase-keep-empty-fixes: built by applying the three
>'--keep-empty' patches on top of 'maint'.
> 
>  - pw/rebase-signoff: built by first merging the above to 0f57f731
>("Merge branch 'pw/sequencer-in-process-commit'", 2018-02-13) and
>then applying "rebase --signoff" series.
> 
> Also, I'll revert merge of Dscho's recreate-merges topic to 'next';
> doing so would probably have to invalidate a few evil merges I've
> been carrying to resolve conflicts between it and bc/object-id
> topic, so today's integration cycle may take a bit longer than
> usual.
>

Re: [PATCH 7/7] diff-highlight: detect --graph by indent

2018-03-22 Thread Phillip Wood

On 21/03/18 05:59, Jeff King wrote:
> This patch fixes a corner case where diff-highlight may
> scramble some diffs when combined with --graph.
> 
> Commit 7e4ffb4c17 (diff-highlight: add support for --graph
> output, 2016-08-29) taught diff-highlight to skip past the
> graph characters at the start of each line with this regex:
> 
>   ($COLOR?\|$COLOR?\s+)*
> 
> I.e., any series of pipes separated by and followed by
> arbitrary whitespace.  We need to match more than just a
> single space because the commit in question may be indented
> to accommodate other parts of the graph drawing. E.g.:
> 
>  * commit 1234abcd
>  | ...
>  | diff --git ...
> 
> has only a single space, but for the last commit before a
> fork:
> 
>  | | |
>  | * | commit 1234abcd
>  | |/  ...
>  | |   diff --git
> 
> the diff lines have more spaces between the pipes and the
> start of the diff.
> 
> However, when we soak up all of those spaces with the
> $GRAPH regex, we may accidentally include the leading space
> for a context line. That means we may consider the actual
> contents of a context line as part of the diff syntax. In
> other words, something like this:
> 
>normal context line
>   -old line
>   +new line
>-this is a context line with a leading dash
> 
> would cause us to see that final context line as a removal
> line, and we'd end up showing the hunk in the wrong order:
> 
>   normal context line
>   -old line
>-this is a context line with a leading dash
>   +new line
> 
> Instead, let's a be a little more clever about parsing the
> graph. We'll look for the actual "*" line that marks the
> start of a commit, and record the indentation we see there.
> Then we can skip past that indentation when checking whether
> the line is a hunk header, removal, addition, etc.
> 
> There is one tricky thing: the indentation in bytes may be
> different for various lines of the graph due to coloring.
> E.g., the "*" on a commit line is generally shown without
> color, but on the actual diff lines, it will be replaced
> with a colorized "|" character, adding several bytes. We
> work around this here by counting "visible" bytes. This is
> unfortunately a bit more expensive, making us about twice as
> slow to handle --graph output. But since this is meant to be
> used interactively anyway, it's tolerably fast (and the
> non-graph case is unaffected).
> 
> One alternative would be to search for hunk header lines and
> use their indentation (since they'd have the same colors as
> the diff lines which follow). But that just opens up
> different corner cases. If we see:
> 
>   | |@@ 1,2 1,3 @@
> 
> we cannot know if this is a real diff that has been
> indented due to the graph, or if it's a context line that
> happens to look like a diff header. We can only be sure of
> the indent on the "*" lines, since we know those don't
> contain arbitrary data (technically the user could include a
> bunch of extra indentation via --format, but that's rare
> enough to disregard).
> 
> Reported-by: Phillip Wood 
> Signed-off-by: Jeff King 
> ---
>  contrib/diff-highlight/DiffHighlight.pm   | 78 +++
>  .../diff-highlight/t/t9400-diff-highlight.sh  | 28 +++
>  2 files changed, 91 insertions(+), 15 deletions(-)
> 
> diff --git a/contrib/diff-highlight/DiffHighlight.pm 
> b/contrib/diff-highlight/DiffHighlight.pm
> index e07cd5931d..536754583b 100644
> --- a/contrib/diff-highlight/DiffHighlight.pm
> +++ b/contrib/diff-highlight/DiffHighlight.pm
> @@ -21,34 +21,82 @@ my $RESET = "\x1b[m";
>  my $COLOR = qr/\x1b\[[0-9;]*m/;
>  my $BORING = qr/$COLOR|\s/;
>  
> -# The patch portion of git log -p --graph should only ever have preceding | 
> and
> -# not / or \ as merge history only shows up on the commit line.
> -my $GRAPH = qr/$COLOR?\|$COLOR?\s+/;
> -
>  my @removed;
>  my @added;
>  my $in_hunk;
> +my $graph_indent = 0;
>  
>  our $line_cb = sub { print @_ };
>  our $flush_cb = sub { local $| = 1 };
>  
> -sub handle_line {
> +# Count the visible width of a string, excluding any terminal color 
> sequences.
> +sub visible_width {
>   local $_ = shift;
> + my $ret = 0;
> + while (length) {
> + if (s/^$COLOR//) {
> + # skip colors
> + } elsif (s/^.//) {
> + $ret++;
> + }
> + }
> + return $ret;
> +}
> +
> +# Return a substring of $str, omitting $len visible characters from the
> +# beginning, where terminal color sequences do not count as visible.
> +sub visible_substr {
> + my ($str, $len) = @_;
> + while ($len > 0) {
> + if ($str =~ s/^$COLOR//) {
> + next
> + }
> + $str =~ s/^.//;
> + $len--;
> + }
> + return $str;
> +}
> +
> +sub handle_line {
> + my $orig = shift;
> + local $_ = $orig;
> +
> + # match a graph line that begins a commit
> + if (/^(?:$COLOR?\|$COLOR?[ ])* # zero or more

Re: [PATCH v6 00/11] nd/pack-objects-pack-struct updates

2018-03-22 Thread Duy Nguyen

On Thu, Mar 22, 2018 at 10:32 AM, Jeff King  wrote:
> That would still mean you could get into a broken state for serving
> fetches, but you could at least get out of it by running "git repack".

I was puzzled by this "broken state" statement. But you were right! I
focused on the repack case and forgot about fetch/clone case. I will
probably just drop this patch for now. Then maybe revisit this some
time in fiture when I find out how to deal with this nicely.
-- 
Duy

Re: [PATCH v6 00/11] nd/pack-objects-pack-struct updates

2018-03-22 Thread Jeff King

On Thu, Mar 22, 2018 at 09:23:42AM +0100, Duy Nguyen wrote:

> > I wish you were right about the rarity, but it's unfortunately something
> > I have seen multiple times in the wild (and why I spent time optimizing
> > the many-packs case for pack-objects). Unfortunately I don't know how
> > often it actually comes up, because in theory running "git repack"
> > cleans it up without further ado. But after these patches, not so much.
> 
> It's good to know this case is real and I can start to fix it
> (assuming that the other concern about readability will not stop this
> series).
> 
> I think I'll try to fix this without involving repack. pack-objects
> can produce multiple packs, so if we have more than 16k pack files, we
> produce  one new pack per 16k old ones.

I suspect that's going to be hard given the structure of the code.

Could we perhaps just bump to an auxiliary storage in that case?  I.e.,
allocate the final index number to mean "look in this other table", and
then have another table of uint32 indices?

That would mean we can behave as we did previously, but just use a
little more memory in the uncommon >16k case.

-Peff

Re: [PATCH v6 00/11] nd/pack-objects-pack-struct updates

2018-03-22 Thread Jeff King

On Thu, Mar 22, 2018 at 05:32:12AM -0400, Jeff King wrote:

> So 27% of the total heap goes away if you switch to a separate rev-list.
> Though it's mostly just going to a different process, it does help peak
> because that process would have exited by the time we get to the
> revindex bits.
> 
> I suspect you could get the same effect by just teaching pack-objects to
> clear obj_hash and all of the allocated object structs. I think that
> should be safe to do as long as we clear _all_ of the objects, so there
> are no dangling pointers.

The patch below tries that. It's kind of hacky, but it drops my peak
heap for packing linux.git from 1336MB to 1129MB.

That's not quite as exciting as 27%, because it just moves our peak
earlier, to when we do have all of the object structs in memory (so the
savings are really just that we're not holding the revindex, etc at the
same time as the object structs).

But we also hold that peak for a lot shorter period, because we drop the
memory before we do any delta compression (which itself can be memory
hungry[1]), and don't hold onto it during the write phase (which can be
network-limited when serving a fetch). So during that write phase we're
holding only ~900MB instead of ~1250MB.

-Peff

[1] All of my timings are on noop repacks of a single pack, so there's
no actual delta compression. On average, it will use something like
"nr_threads * window * avg_blob_size". For a "normal" repo, that's
only a few megabytes. But the peak will depend on the large blobs,
so it could have some outsize cases. I don't know if it's worth
worrying about too much for this analysis.

---
Here's the patch. It's probably asking for trouble to have this kind of
clearing interface, as a surprising number of things may hold onto
pointers to objects (see the comment below about the bitmap code).

So maybe the separate process is less insane.

diff --git a/alloc.c b/alloc.c
index 12afadfacd..50d444a3b0 100644
--- a/alloc.c
+++ b/alloc.c
@@ -30,15 +30,23 @@ struct alloc_state {
int count; /* total number of nodes allocated */
int nr;/* number of nodes left in current allocation */
void *p;   /* first free node in current allocation */
+
+   /* book-keeping for clearing */
+   void *start;
+   struct alloc_state *prev;
 };
 
-static inline void *alloc_node(struct alloc_state *s, size_t node_size)
+static inline void *alloc_node(struct alloc_state **sp, size_t node_size)
 {
+   struct alloc_state *s = *sp;
void *ret;
 
-   if (!s->nr) {
+   if (!s || !s->nr) {
+   s = xmalloc(sizeof(*s));
s->nr = BLOCKING;
-   s->p = xmalloc(BLOCKING * node_size);
+   s->start = s->p = xmalloc(BLOCKING * node_size);
+   s->prev = *sp;
+   *sp = s;
}
s->nr--;
s->count++;
@@ -48,7 +56,7 @@ static inline void *alloc_node(struct alloc_state *s, size_t 
node_size)
return ret;
 }
 
-static struct alloc_state blob_state;
+static struct alloc_state *blob_state;
 void *alloc_blob_node(void)
 {
struct blob *b = alloc_node(_state, sizeof(struct blob));
@@ -56,7 +64,7 @@ void *alloc_blob_node(void)
return b;
 }
 
-static struct alloc_state tree_state;
+static struct alloc_state *tree_state;
 void *alloc_tree_node(void)
 {
struct tree *t = alloc_node(_state, sizeof(struct tree));
@@ -64,7 +72,7 @@ void *alloc_tree_node(void)
return t;
 }
 
-static struct alloc_state tag_state;
+static struct alloc_state *tag_state;
 void *alloc_tag_node(void)
 {
struct tag *t = alloc_node(_state, sizeof(struct tag));
@@ -72,7 +80,7 @@ void *alloc_tag_node(void)
return t;
 }
 
-static struct alloc_state object_state;
+static struct alloc_state *object_state;
 void *alloc_object_node(void)
 {
struct object *obj = alloc_node(_state, sizeof(union 
any_object));
@@ -80,7 +88,7 @@ void *alloc_object_node(void)
return obj;
 }
 
-static struct alloc_state commit_state;
+static struct alloc_state *commit_state;
 
 unsigned int alloc_commit_index(void)
 {
@@ -103,7 +111,7 @@ static void report(const char *name, unsigned int count, 
size_t size)
 }
 
 #define REPORT(name, type) \
-report(#name, name##_state.count, name##_state.count * sizeof(type) >> 10)
+report(#name, name##_state->count, name##_state->count * sizeof(type) >> 
10)
 
 void alloc_report(void)
 {
@@ -113,3 +121,22 @@ void alloc_report(void)
REPORT(tag, struct tag);
REPORT(object, union any_object);
 }
+
+static void alloc_clear(struct alloc_state **sp)
+{
+   while (*sp) {
+   struct alloc_state *s = *sp;
+   *sp = s->prev;
+   free(s->start);
+   free(s);
+   }
+}
+
+void alloc_clear_all(void)
+{
+   alloc_clear(_state);
+   alloc_clear(_state);
+   alloc_clear(_state);
+   alloc_clear(_state);
+   alloc_clear(_state);
+}
diff --git

Re: [PATCH v6 00/11] nd/pack-objects-pack-struct updates

2018-03-22 Thread Jeff King

On Wed, Mar 21, 2018 at 04:59:19PM +0100, Duy Nguyen wrote:

> > I hate to be a wet blanket, but am I the only one who is wondering
> > whether the tradeoffs is worth it? 8% memory reduction doesn't seem
> > mind-bogglingly good,
> 
> AEvar measured RSS. If we count objects[] array alone, the saving is
> 40% (136 bytes per entry down to 80). Some is probably eaten up by
> mmap in rss.

Measuring actual heap usage with massif, I get before/after peak heaps
of 1728 and 1346MB respectively when repacking linux.git. So that's ~22%
savings overall.

Of the used heap after your patches:

 - ~40% of that is from packlist_alloc()
 - ~17% goes to "struct object"
 - ~10% for the object.c hash table to store all the "struct object"
 - ~7% goes to the delta cache
 - ~7% goes to the pack revindex (actually, there's a duplicate 7%
   there, too; I think our peak is when we're sorting the revindex
   and have to keep two copies in memory at once)
 - ~5% goes to the packlist_find() hash table
 - ~3.5% for the get_object_details() sorting list (this is only held
 for a minute, but again, our peak comes during this sort, which
 in turn loads the revindex)

So 27% of the total heap goes away if you switch to a separate rev-list.
Though it's mostly just going to a different process, it does help peak
because that process would have exited by the time we get to the
revindex bits.

I suspect you could get the same effect by just teaching pack-objects to
clear obj_hash and all of the allocated object structs. I think that
should be safe to do as long as we clear _all_ of the objects, so there
are no dangling pointers.

> About the 16k limit (and some other limits as well), I'm making these
> patches with the assumption that large scale deployment probably will
> go with custom builds anyway. Adjusting the limits back should be
> quite easy while we can still provide reasonable defaults for most
> people.

I think this 16k limit is the thing I _most_ dislike about the series.
If we could tweak that case such that we always made forward progress, I
think I'd be a lot less nervous. I responded elsewhere in the thread
(before seeing that both Junio and you seemed aware of the issues ;) ),
but I think it would be acceptable to have git-repack enforce the limit.

That would still mean you could get into a broken state for serving
fetches, but you could at least get out of it by running "git repack".

-Peff

Dear Friend Please Respond

2018-03-22 Thread sandoo richard

From:MR.RICHARD SANDOO.
Bill and Exchange Manager
Micro Finance Bank Plc
Burkina Faso

Dear Friend,

I know that this mail will come to you as a surprise. I am MR.RICHARD
SANDOO.  and I am  the bill and Exchange manager in a Bank here in my
country .I Hope that you will not expose or betray this trust and
confident that i am about to Repose on you for the mutual benefit of
our both families.

I need your urgent assistance in transferring the sum of $15 million
Immediately to your account. The money has been dormant for years in
our Bank here without any body coming for it.

I want my bank to release the money to you as the nearest person to
our deceased customer the owner Of the account who died a long with
his supposed next of kin in an air Crash since July 2000.I don't want
the money to go into our Bank treasury as an abandoned fund. So this
is the reason why i contacted you, so that the bank can release the
money to you as the nearest person to the deceased customer.

Please i will like you to keep this proposal as a top secret . Upon
receipt of your reply, I will send you full details on how the
transfer will be executed and also note that you will have 40% of the
Above mentioned sum. Acknowledge receipt of this message in acceptance
of my mutual business endeavour by furnishing me with the following
information;

1. Your Full Names and Address...

2. Country... ..

3. Direct Telephone .

4. Occupation ..

Please send me your  information as soon as you reply   I will give
you full details on how you and me can claim the money from our bank.

I am waiting to receive your reply.

MR.RICHARD SANDOO
Micro Finance Bank ,
Burkina Faso

Re: [PATCH v2] routines to generate JSON data

2018-03-22 Thread Ævar Arnfjörð Bjarmason


On Wed, Mar 21 2018, g...@jeffhostetler.com wrote:

> So, I'm not sure we have a route to get UTF-8-clean data out of Git, and if
> we do it is beyond the scope of this patch series.
>
> So I think for our uses here, defining this as "JSON-like" is probably the
> best answer.  We write the strings as we received them (from the file system,
> the index, or whatever).  These strings are properly escaped WRT double
> quotes, backslashes, and control characters, so we shouldn't have an issue
> with decoders getting out of sync -- only with them rejecting non-UTF-8
> sequences.
>
> We could blindly \u encode each of the hi-bit characters, if that would
> help the parsers, but I don't want to do that right now.
>
> WRT binary data, I had not intended using this for binary data.  And without
> knowing what kinds or quantity of binary data we might use it for, I'd like
> to ignore this for now.

I agree we should just ignore this problem for now given the immediate
use-case.

Re: [PATCH v6 00/11] nd/pack-objects-pack-struct updates

2018-03-22 Thread Duy Nguyen

On Thu, Mar 22, 2018 at 9:07 AM, Jeff King  wrote:
> On Wed, Mar 21, 2018 at 05:31:14PM +0100, Ævar Arnfjörð Bjarmason wrote:
>
>> > [...]Yes, having that many packs is insane, but that's going to be
>> > small consolation to somebody whose automated maintenance program now
>> > craps out at 16k packs, when it previously would have just worked to
>> > fix the situation[...]
>>
>> That's going to be super rare (and probably nonexisting) edge case, but
>> (untested) I wonder if something like this on top would alleviate your
>> concerns, i.e. instead of dying we just take the first N packs up to our
>> limit:
>
> I wish you were right about the rarity, but it's unfortunately something
> I have seen multiple times in the wild (and why I spent time optimizing
> the many-packs case for pack-objects). Unfortunately I don't know how
> often it actually comes up, because in theory running "git repack"
> cleans it up without further ado. But after these patches, not so much.

It's good to know this case is real and I can start to fix it
(assuming that the other concern about readability will not stop this
series).

I think I'll try to fix this without involving repack. pack-objects
can produce multiple packs, so if we have more than 16k pack files, we
produce  one new pack per 16k old ones.
-- 
Duy

Re: [PATCH v6 00/11] nd/pack-objects-pack-struct updates

2018-03-22 Thread Jeff King

On Wed, Mar 21, 2018 at 05:31:14PM +0100, Ævar Arnfjörð Bjarmason wrote:

> > [...]Yes, having that many packs is insane, but that's going to be
> > small consolation to somebody whose automated maintenance program now
> > craps out at 16k packs, when it previously would have just worked to
> > fix the situation[...]
> 
> That's going to be super rare (and probably nonexisting) edge case, but
> (untested) I wonder if something like this on top would alleviate your
> concerns, i.e. instead of dying we just take the first N packs up to our
> limit:

I wish you were right about the rarity, but it's unfortunately something
I have seen multiple times in the wild (and why I spent time optimizing
the many-packs case for pack-objects). Unfortunately I don't know how
often it actually comes up, because in theory running "git repack"
cleans it up without further ado. But after these patches, not so much.

I'll admit that my experiences aren't necessarily typical of most git
users. But I wouldn't be surprised if other people hosting their own
repositories run into this, too (e.g., somebody pushing in a loop,
auto-gc disabled or clogged by something silly like the "too many loose
objects" warning).

> diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
> index 4406af640f..49d467ab2a 100644
> --- a/builtin/pack-objects.c
> +++ b/builtin/pack-objects.c
> @@ -1065,8 +1065,9 @@ static int want_object_in_pack(const struct 
> object_id *oid,
> 
> want = 1;
>  done:
> -   if (want && *found_pack && !(*found_pack)->index)
> -   oe_add_pack(_pack, *found_pack);
> +   if (want && *found_pack && !(*found_pack)->index) {
> +   if (oe_add_pack(_pack, *found_pack) == -1)
> +   return 0;

Something like this does seem like a much better fallback, as we'd make
forward progress instead of aborting (and exacerbating whatever caused
the packs to stack up in the first place).

I think the patch as-is does not work, though. You say "oops, too many
packs" and so the "yes we want this object" return becomes "no, we do
not want it". And it is not included in the resulting packfile.

But what happens after that? After pack-objects finishes, we return to
"git repack", which assumes that pack-objects packed everything it was
told to. And with "-d", it then _deletes_ the old packs, knowing that
anything of value was copied to the new pack. So with this patch, we'd
corrupt the repository if this code is ever hit.

You'd need some way to report back to "git repack" that the pack was
omitted. Or probably more sensibly, you'd need "git repack" to count up
the packs and make sure that it marks anybody beyond the limit manually
as .keep (presumably using Duy's new command-line option rather than
actually writing a file).

-Peff

[no subject]

2018-03-22 Thread bbenta

 hi Git


https://goo.gl/RLDyn8







Bbenta

Re: [PATCH v2] routines to generate JSON data

2018-03-22 Thread Jeff King

On Wed, Mar 21, 2018 at 07:28:26PM +, g...@jeffhostetler.com wrote:

> It includes a new "struct json_writer" which is used to guide the
> accumulation of JSON data -- knowing whether an object or array is
> currently being composed.  This allows error checking during construction.
> 
> It also allows construction of nested structures using an inline model (in
> addition to the original bottom-up composition).
> 
> The test helper has been updated to include both the original unit tests and
> a new scripting API to allow individual tests to be written directly in our
> t/t*.sh shell scripts.

Thanks for all of this. The changes look quite sensible to me (I do
still suspect we could do the "first_item" thing without having to
allocate, but I really like the assertions you were able to put in).

> So I think for our uses here, defining this as "JSON-like" is probably the
> best answer.  We write the strings as we received them (from the file system,
> the index, or whatever).  These strings are properly escaped WRT double
> quotes, backslashes, and control characters, so we shouldn't have an issue
> with decoders getting out of sync -- only with them rejecting non-UTF-8
> sequences.

Yeah, I think I've come to the same conclusion. My main goal in raising
it now was to see if there was some other format we might use before we
go too far down the JSON road. But as far as I can tell there really
isn't another good option.

> WRT binary data, I had not intended using this for binary data.  And without
> knowing what kinds or quantity of binary data we might use it for, I'd like
> to ignore this for now.

Yeah, I don't have any plans here either. I was thinking more about
things like author names and file paths.

-Peff

76 matches

Mail list logo