Re: [PATCH v6 00/11] nd/pack-objects-pack-struct updates
On Thu, Mar 22, 2018 at 10:46:09PM -0400, Jeff King wrote: > > which begs the question, how much slower would it be if we > > replaced the radix-sort with an in-place sort (e.g. heapsort). > > > > I hacked up the patch below, just for fun. I don't have any > > large repos (or enough disk space) to do any meaningful perf > > tests, but I did at least compile it and it passes the test-suite. > > (That is no guarantee that I haven't introduced bugs, of course!) > > It might have been easier to just revert 8b8dfd5132 (pack-revindex: > radix-sort the revindex, 2013-07-11). It even includes some performance > numbers. :) > > In short, no, I don't think we want to go back to a comparison-sort. The > radix sort back then was around 4 times faster for linux.git. And that > was when there were half as many objects in the repository, so the radix > sort should continue to improve as the repo size grows. I was curious whether my hand-waving there was true. It turns out that it is: the radix sort has stayed about the same speed but the comparison sort has gotten even slower. Here are best-of-five timings for "git cat-file --batch-check='%(objectsize:disk)'", which does very little besides generate the rev-index: [current master, using radix sort] real 0m0.104s user 0m0.088s sys 0m0.016s [reverting 8b8dfd5132, going back to qsort] real 0m1.193s user 0m1.176s sys 0m0.016s So it's now a factor of 11. Yikes. That number does match some napkin math. The radix sort uses four 16-bit buckets, but can quit when after two rounds (because none of the offsets is beyond 2^32). So it's essentially O(2n). Whereas the comparison sort is O(n log n), and with n around 6M, that puts log(n) right around 22. It's possible that some other comparison-based sort might be a little more efficient than qsort, but I don't think you'll be able to beat the algorithmic speedup. The revert of 8b8dfd5132 is below for reference (it needed a few conflict tweaks). -Peff --- diff --git a/pack-revindex.c b/pack-revindex.c index ff5f62c033..c20aa9541b 100644 --- a/pack-revindex.c +++ b/pack-revindex.c @@ -15,102 +15,11 @@ * get the object sha1 from the main index. */ -/* - * This is a least-significant-digit radix sort. - * - * It sorts each of the "n" items in "entries" by its offset field. The "max" - * parameter must be at least as large as the largest offset in the array, - * and lets us quit the sort early. - */ -static void sort_revindex(struct revindex_entry *entries, unsigned n, off_t max) +static int cmp_offset(const void *a_, const void *b_) { - /* -* We use a "digit" size of 16 bits. That keeps our memory -* usage reasonable, and we can generally (for a 4G or smaller -* packfile) quit after two rounds of radix-sorting. -*/ -#define DIGIT_SIZE (16) -#define BUCKETS (1 << DIGIT_SIZE) - /* -* We want to know the bucket that a[i] will go into when we are using -* the digit that is N bits from the (least significant) end. -*/ -#define BUCKET_FOR(a, i, bits) (((a)[(i)].offset >> (bits)) & (BUCKETS-1)) - - /* -* We need O(n) temporary storage. Rather than do an extra copy of the -* partial results into "entries", we sort back and forth between the -* real array and temporary storage. In each iteration of the loop, we -* keep track of them with alias pointers, always sorting from "from" -* to "to". -*/ - struct revindex_entry *tmp, *from, *to; - int bits; - unsigned *pos; - - ALLOC_ARRAY(pos, BUCKETS); - ALLOC_ARRAY(tmp, n); - from = entries; - to = tmp; - - /* -* If (max >> bits) is zero, then we know that the radix digit we are -* on (and any higher) will be zero for all entries, and our loop will -* be a no-op, as everybody lands in the same zero-th bucket. -*/ - for (bits = 0; max >> bits; bits += DIGIT_SIZE) { - unsigned i; - - memset(pos, 0, BUCKETS * sizeof(*pos)); - - /* -* We want pos[i] to store the index of the last element that -* will go in bucket "i" (actually one past the last element). -* To do this, we first count the items that will go in each -* bucket, which gives us a relative offset from the last -* bucket. We can then cumulatively add the index from the -* previous bucket to get the true index. -*/ - for (i = 0; i < n; i++) - pos[BUCKET_FOR(from, i, bits)]++; - for (i = 1; i < BUCKETS; i++) - pos[i] += pos[i-1]; - - /* -* Now we can drop the elements into their correct buckets (in -* our temporary array). We iterate the pos counter backwards -* to avoid using an extra index to count up. And since we
[PATCH v2] filter-branch: fix errors caused by refs that cannot be used with ^0
"git filter-branch -- --all" print unwanted error messages when refs that cannot be used with ^0 exist. Such refs can be created by "git replace" with trees or blobs. Also, "git tag" with trees or blobs can create such refs. --- git-filter-branch.sh | 14 -- t/t7003-filter-branch.sh | 14 ++ 2 files changed, 26 insertions(+), 2 deletions(-) diff --git a/git-filter-branch.sh b/git-filter-branch.sh index 1b7e4b2cd..41efecb28 100755 --- a/git-filter-branch.sh +++ b/git-filter-branch.sh @@ -251,8 +251,18 @@ done < "$tempdir"/backup-refs # The refs should be updated if their heads were rewritten git rev-parse --no-flags --revs-only --symbolic-full-name \ - --default HEAD "$@" > "$tempdir"/raw-heads || exit -sed -e '/^^/d' "$tempdir"/raw-heads >"$tempdir"/heads + --default HEAD "$@" > "$tempdir"/raw-refs || exit +while read ref +do + case "$ref" in ^?*) continue ;; esac + + if git rev-parse --verify "$ref"^0 >/dev/null 2>&1 + then + echo "$ref" + else + warn "WARNING: not rewriting '$ref' (not a committish)" + fi +done >"$tempdir"/heads <"$tempdir"/raw-refs test -s "$tempdir"/heads || die "You must specify a ref to rewrite." diff --git a/t/t7003-filter-branch.sh b/t/t7003-filter-branch.sh index 7cb60799b..04f79f32b 100755 --- a/t/t7003-filter-branch.sh +++ b/t/t7003-filter-branch.sh @@ -470,4 +470,18 @@ test_expect_success 'tree-filter deals with object name vs pathname ambiguity' ' git show HEAD:$ambiguous ' +test_expect_success 'rewrite repository including refs that point at non-commit object' ' + test_when_finished "git reset --hard original" && + tree=$(git rev-parse HEAD^{tree}) && + test_when_finished "git replace -d $tree" && + echo A >new && + git add new && + new_tree=$(git write-tree) && + git replace $tree $new_tree && + git tag -a -m "tag to a tree" treetag $new_tree && + git reset --hard HEAD && + git filter-branch -f -- --all >filter-output 2>&1 && + ! fgrep fatal filter-output +' + test_done -- 2.16.2.18.g09cb46d
[RFC PATCH v4] rebase-interactive
This is v4 of the first 2 patches of "[RFC PATCH vV n/9] rebase-interactive", looking forward to any additional comments. Wink Saville (2): rebase-interactive: Simplify pick_on_preserving_merges rebase: Update invocation of rebase dot-sourced scripts git-rebase--am.sh | 11 --- git-rebase--interactive.sh | 28 +++- git-rebase--merge.sh | 11 --- git-rebase.sh | 2 ++ 4 files changed, 9 insertions(+), 43 deletions(-) -- 2.16.2
[RFC PATCH v4] rebase: Update invocation of rebase dot-sourced scripts
The backend scriptlets for "git rebase" were structured in a bit unusual way for historical reasons. Originally, it was designed in such a way that dot-sourcing them from "git rebase" would be sufficient to invoke the specific backend. When it was discovered that some shell implementations (e.g. FreeBSD 9.x) misbehaved when exiting with a "return" is executed at the top level of a dot-sourced script (the original was expecting that the control returns to the next command in "git rebase" after dot-sourcing the scriptlet). To fix this issue the whole body of git-rebase--$backend.sh was made into a shell function git_rebase__$backend and then the last statement of the scriptlet would invoke the function. Here the call is moved to "git rebase" side, instead of at the end of each scriptlet. This give us a more normal arrangement where the scriptlet function library and allows multiple functions to be implemented in a scriptlet. Signed-off-by: Wink SavilleReviewed-by: Junio C Hamano Reviewed-by: Eric Sunsine --- git-rebase--am.sh | 11 --- git-rebase--interactive.sh | 11 --- git-rebase--merge.sh | 11 --- git-rebase.sh | 2 ++ 4 files changed, 2 insertions(+), 33 deletions(-) diff --git a/git-rebase--am.sh b/git-rebase--am.sh index be3f06892..e5fd6101d 100644 --- a/git-rebase--am.sh +++ b/git-rebase--am.sh @@ -4,15 +4,6 @@ # Copyright (c) 2010 Junio C Hamano. # -# The whole contents of this file is run by dot-sourcing it from -# inside a shell function. It used to be that "return"s we see -# below were not inside any function, and expected to return -# to the function that dot-sourced us. -# -# However, older (9.x) versions of FreeBSD /bin/sh misbehave on such a -# construct and continue to run the statements that follow such a "return". -# As a work-around, we introduce an extra layer of a function -# here, and immediately call it after defining it. git_rebase__am () { case "$action" in @@ -105,5 +96,3 @@ fi move_to_original_branch } -# ... and then we call the whole thing. -git_rebase__am diff --git a/git-rebase--interactive.sh b/git-rebase--interactive.sh index 561e2660e..213d75f43 100644 --- a/git-rebase--interactive.sh +++ b/git-rebase--interactive.sh @@ -740,15 +740,6 @@ get_missing_commit_check_level () { printf '%s' "$check_level" | tr 'A-Z' 'a-z' } -# The whole contents of this file is run by dot-sourcing it from -# inside a shell function. It used to be that "return"s we see -# below were not inside any function, and expected to return -# to the function that dot-sourced us. -# -# However, older (9.x) versions of FreeBSD /bin/sh misbehave on such a -# construct and continue to run the statements that follow such a "return". -# As a work-around, we introduce an extra layer of a function -# here, and immediately call it after defining it. git_rebase__interactive () { case "$action" in @@ -1029,5 +1020,3 @@ fi do_rest } -# ... and then we call the whole thing. -git_rebase__interactive diff --git a/git-rebase--merge.sh b/git-rebase--merge.sh index ceb715453..685f48ca4 100644 --- a/git-rebase--merge.sh +++ b/git-rebase--merge.sh @@ -104,15 +104,6 @@ finish_rb_merge () { say All done. } -# The whole contents of this file is run by dot-sourcing it from -# inside a shell function. It used to be that "return"s we see -# below were not inside any function, and expected to return -# to the function that dot-sourced us. -# -# However, older (9.x) versions of FreeBSD /bin/sh misbehave on such a -# construct and continue to run the statements that follow such a "return". -# As a work-around, we introduce an extra layer of a function -# here, and immediately call it after defining it. git_rebase__merge () { case "$action" in @@ -171,5 +162,3 @@ done finish_rb_merge } -# ... and then we call the whole thing. -git_rebase__merge diff --git a/git-rebase.sh b/git-rebase.sh index a1f6e5de6..4595a316a 100755 --- a/git-rebase.sh +++ b/git-rebase.sh @@ -196,7 +196,9 @@ run_specific_rebase () { export GIT_EDITOR autosquash= fi + # Source the code and invoke it . git-rebase--$type + git_rebase__$type ret=$? if test $ret -eq 0 then -- 2.16.2
[RFC PATCH v4] rebase-interactive: Simplify pick_on_preserving_merges
Use compound if statement instead of nested if statements to simplify pick_on_preserving_merges. Signed-off-by: Wink SavilleReviewed-by: Junio C Hamano --- git-rebase--interactive.sh | 17 +++-- 1 file changed, 7 insertions(+), 10 deletions(-) diff --git a/git-rebase--interactive.sh b/git-rebase--interactive.sh index 331c8dfea..561e2660e 100644 --- a/git-rebase--interactive.sh +++ b/git-rebase--interactive.sh @@ -307,17 +307,14 @@ pick_one_preserving_merges () { esac sha1=$(git rev-parse $sha1) - if test -f "$state_dir"/current-commit + if test -f "$state_dir"/current-commit && test "$fast_forward" = t then - if test "$fast_forward" = t - then - while read current_commit - do - git rev-parse HEAD > "$rewritten"/$current_commit - done <"$state_dir"/current-commit - rm "$state_dir"/current-commit || - die "$(gettext "Cannot write current commit's replacement sha1")" - fi + while read current_commit + do + git rev-parse HEAD > "$rewritten"/$current_commit + done <"$state_dir"/current-commit + rm "$state_dir"/current-commit || + die "$(gettext "Cannot write current commit's replacement sha1")" fi echo $sha1 >> "$state_dir"/current-commit -- 2.16.2
Re: [PATCH v6 00/11] nd/pack-objects-pack-struct updates
On Fri, Mar 23, 2018 at 01:28:12AM +, Ramsay Jones wrote: > > Of the used heap after your patches: > > > > - ~40% of that is from packlist_alloc() > > - ~17% goes to "struct object" > > - ~10% for the object.c hash table to store all the "struct object" > > - ~7% goes to the delta cache > > - ~7% goes to the pack revindex (actually, there's a duplicate 7% > >there, too; I think our peak is when we're sorting the revindex > >and have to keep two copies in memory at once) > > which begs the question, how much slower would it be if we > replaced the radix-sort with an in-place sort (e.g. heapsort). > > I hacked up the patch below, just for fun. I don't have any > large repos (or enough disk space) to do any meaningful perf > tests, but I did at least compile it and it passes the test-suite. > (That is no guarantee that I haven't introduced bugs, of course!) It might have been easier to just revert 8b8dfd5132 (pack-revindex: radix-sort the revindex, 2013-07-11). It even includes some performance numbers. :) In short, no, I don't think we want to go back to a comparison-sort. The radix sort back then was around 4 times faster for linux.git. And that was when there were half as many objects in the repository, so the radix sort should continue to improve as the repo size grows. The absolute time savings aren't huge for something as bulky as a repack, so it's less exciting in this context. But it's also not that much memory (7% of the peak here, but as I showed elsewhere, if we can stop holding all of the "struct object" memory once we're done with it, then this revindex stuff doesn't even factor into the peak). -Peff
Re: [PATCH] filter-branch: consider refs can refer to an object other than commit or tag
> However, if we pre-filter to limit the refs in "$tempdir/heads" to > those that are committish (i.e. those that pass "$ref^0") like the > patch and subsequent discussion suggests, wouldn't we lose the > warning for these replace refs and non-committish tags. We perhaps > could do something like: > > git rev-parse --no-flags ... >"$tempdir/raw-heads" || exit > > while read ref > do > case "$ref" in ^?*) continue ;; esac > if git rev-parse --verify "$ref^0" 2>/dev/null > then > echo "$ref" > else > warn "WARNING: not rewriting '$ref' (not a committish)" > fi > done >"$tempdir/heads" <"$tempdir/raw-heads" > > (note: the else clause is new, relative to my earlier suggestion). I agree these suggestions. I'm gonna send a new patch that follow it.
[PATCH] branch: implement shortcut to delete last branch
Add support for using the "-" shortcut to delete the last checked-out branch. This functionality already exists for git-merge, git-checkout, and git-revert. Signed-off-by: Aaron Greenberg--- builtin/branch.c | 3 +++ t/t3200-branch.sh | 9 + 2 files changed, 12 insertions(+) diff --git a/builtin/branch.c b/builtin/branch.c index 6d0cea9..9e37078 100644 --- a/builtin/branch.c +++ b/builtin/branch.c @@ -221,6 +221,9 @@ static int delete_branches(int argc, const char **argv, int force, int kinds, char *target = NULL; int flags = 0; + if (!strcmp(argv[i], "-")) + argv[i] = "@{-1}"; + strbuf_branchname(, argv[i], allowed_interpret); free(name); name = mkpathdup(fmt, bname.buf); diff --git a/t/t3200-branch.sh b/t/t3200-branch.sh index 6c0b7ea..a3ffd54 100755 --- a/t/t3200-branch.sh +++ b/t/t3200-branch.sh @@ -776,6 +776,15 @@ test_expect_success 'deleting currently checked out branch fails' ' test_must_fail git branch -d my7 ' +test_expect_success 'test deleting last branch' ' + git checkout -b my7.2 && + git checkout - && + sha1=$(git rev-parse my7 | cut -c 1-7) && + echo "Deleted branch my7.2 (was $sha1)." >expect && + git branch -d - >actual 2>&1 && + test_i18ncmp expect actual +' + test_expect_success 'test --track without .fetch entries' ' git branch --track my8 && test "$(git config branch.my8.remote)" && -- 2.7.4
[PATCH] branch: implement shortcut to delete last branch
This patch gives git-branch the ability to delete the previous checked-out branch using the "-" shortcut. This shortcut already exists for git-checkout, git-merge, and git-revert. One of my common workflows is to do some work on a local topic branch and push it to a remote, where it gets merged in to 'master'. Then, I switch back to my local master, fetch the remote master, and delete the previous topic branch. $ git checkout -b topic-a $ # Do some work... $ git commit -am "Implement feature A" $ git push origin topic-a # 'origin/topic-a' gets merged into 'origin/master' $ git checkout master $ git branch -d topic-a $ # With this patch, a user could simply type $ git branch -d - I think it's a useful shortcut for cleaning up a just-merged branch (or a just switched-from branch.)
Re: [RFC PATCH v3.1 2/9 2/2] rebase-interactive: Do not automatically run code
Wink Savillewrites: > Currently I'm not rebasing the other commits (3..9) > to reduce the amount of work I have to do in each > review cycle, is that OK? Yeah, I want to see others more heavily involved in this part of the system to comment on your patches. As to the organization of the changes, I have some more opinions of my own, primarily regarding to reviewability, but they are of secondary importance than reviews by area experts. I think it would be helpful to give them a target that is not moving too rapidly ;-) > Also, will you merge commits 1 and 2 before the other > commits or is the procedure to merge the complete set > at once? I am inclined to take the early preliminary clean-up steps before the remainder.
[GSOC]About the microproject related to CI
Hi all, I'm Zhibin Li, an undergraduate from China and I'm interested in automated testing. Since the application deadline is coming, hope it's not too late for me to start with the microproject. If it's ok, I would like to take Git CI Improvements 4 as my starting point. But the description on the website shows less details so I wonder what am I supposed to do more specifically? Reporting the results or trying to figure out the how and why those results come out independently? It would be nice if you guys can tell me about any details. Thanks, Zhibin
Re: [PATCH v6 00/11] nd/pack-objects-pack-struct updates
On 22/03/18 09:32, Jeff King wrote: > On Wed, Mar 21, 2018 at 04:59:19PM +0100, Duy Nguyen wrote: > >>> I hate to be a wet blanket, but am I the only one who is wondering >>> whether the tradeoffs is worth it? 8% memory reduction doesn't seem >>> mind-bogglingly good, >> >> AEvar measured RSS. If we count objects[] array alone, the saving is >> 40% (136 bytes per entry down to 80). Some is probably eaten up by >> mmap in rss. > > Measuring actual heap usage with massif, I get before/after peak heaps > of 1728 and 1346MB respectively when repacking linux.git. So that's ~22% > savings overall. > > Of the used heap after your patches: > > - ~40% of that is from packlist_alloc() > - ~17% goes to "struct object" > - ~10% for the object.c hash table to store all the "struct object" > - ~7% goes to the delta cache > - ~7% goes to the pack revindex (actually, there's a duplicate 7% >there, too; I think our peak is when we're sorting the revindex >and have to keep two copies in memory at once) which begs the question, how much slower would it be if we replaced the radix-sort with an in-place sort (e.g. heapsort). I hacked up the patch below, just for fun. I don't have any large repos (or enough disk space) to do any meaningful perf tests, but I did at least compile it and it passes the test-suite. (That is no guarantee that I haven't introduced bugs, of course!) ;-) ATB, Ramsay Jones -- >8 -- Subject: [PATCH] pack-revindex: replace radix-sort with in-place heapsort Signed-off-by: Ramsay Jones--- pack-revindex.c | 60 + 1 file changed, 60 insertions(+) diff --git a/pack-revindex.c b/pack-revindex.c index ff5f62c03..16f17eac1 100644 --- a/pack-revindex.c +++ b/pack-revindex.c @@ -15,6 +15,7 @@ * get the object sha1 from the main index. */ +#ifdef DUMMY /* * This is a least-significant-digit radix sort. * @@ -112,6 +113,65 @@ static void sort_revindex(struct revindex_entry *entries, unsigned n, off_t max) #undef BUCKETS #undef DIGIT_SIZE } +#endif + +static inline void swap(struct revindex_entry *a, int i, int j) +{ + struct revindex_entry t; + + t = a[i]; + a[i] = a[j]; + a[j] = t; +} + +/* + * assume that elements first .. last (array index first-1 .. last-1) obey + * the partially ordered tree property, except possibly for the children of + * the first element. push down the first element until the partially + * ordered tree property is restored. + */ +static void push_down(struct revindex_entry *a, int first, int last) +{ + int parent = first; + int last_node = last / 2; + + while (parent <= last_node) { + + int left = 2 * parent; + int right = left + 1; + int biggest; + + if (right > last) /* parent only has one child */ + biggest = left; + else { + if (a[left-1].offset >= a[right-1].offset) + biggest = left; + else + biggest = right; + + } + + if (a[parent-1].offset >= a[biggest-1].offset) + break; /* partially ordered tree property, we're done */ + + /* push parent down */ + swap(a, parent-1, biggest-1); + parent = biggest; + } +} + +static void sort_revindex(struct revindex_entry *entries, unsigned n, off_t max) +{ + int i; + + for (i = n/2; i > 0; i--) + push_down(entries, i, n); + + for (i = n; i > 1; i--) { + swap(entries, 0, i-1); + push_down(entries, 1, i-1); + } +} /* * Ordered list of offsets of objects in the pack. -- 2.16.0
Re: [GSoC] Convert “git stash” to builtin proposal
Hello, On Tue, 2018-03-20 at 23:08 +0100, Christian Couder wrote: > Hi, > > On Tue, Mar 20, 2018 at 9:09 PM, Paul-Sebastian Ungureanu >wrote: > > > > * Convert function: this step is basically makes up the goal of > > this > > project. > > Could you explain a bit more how you would convert a function? Or > could you explain for example how you would convert "git stash list" > below? In order to convert a command, all the functions which are used by the command must be converted first. The conversion will start with the bottom-level functions, which do not have any dependencies. For example, to convert "git stash list", the parser will call “list_stash”, which will call “have_stash”. The conversion of these functions will be made in reverse order they were mentioned (have_stash first and then list_stash). It is very important to know the Git source well in order to avoid reimplementing functionality. In this case “have_stash()” is somehow already implemented as “get_oid(ref_stash, )”. > > I am expecting to submit a patch for every command that is > > converted. > > This way, I have well set milestones and results will be seen as > > soon > > as possible. Moreover, seeing my contributions getting merged will > > be a > > huge confidence boost to myself. > > Each “convert” phase of the project below is not only about > > converting > > from Shell to C, but also ensuring that everything is documented, > > there > > are enough tests and there are no regressions. > > > > 14th May - 20th May - Convert "git stash list" > > For example could you explain a bit more which functions/commands you > would create in which file and how you would call them to convert > `git > stash list` The new C code will be found in stash-helper.c and will be used by git- stash.sh until the full conversion is complete. As soon as the entire conversion is done, stash-helper.c will be promoted to stash.c. Any functionality that will be implemented, but is not strictly related to git stash will reside in the appropriate files (for example if I had to implement similar to get_oid, which is not related to git stash, but to Git, then I would not implement it in stash-helper.c; anyway, I do not believe I will encounter this situation that often). In the updated version of the proposal [1], I included the ideas stated before and more information about the procces of benchmarking. In addition, to test my capabilities and to be sure that I am able to work on a project of this kind, I tried to convert “git stash list” into a built-in (the code can be found in proposal). [1] https://docs.google.com/document/d/1TtdD7zgUsEOszHG5HX1at4zHMDsPmSB YWqydOPTTpV8/edit Convert “git stash” to builtin ## Name and Contact Information -- Hello! My name is Paul-Sebastian Ungureanu. I am currently a first year Computer Science & Engineering student at Politehnica University of Bucharest, Romania. My email address is ungureanupaulsebast...@gmail.com and my phone number is [CENSORED]. You can also find me on #git IRC channel as ungps. FLOSS manual recommends students to include in their proposal their postal address and mention a relative as a emergency contact. My permanent address is [CENSORED]. In case of an emergency, you may contact my brother, [CENSORED] by email at [CENSORED] or by phone at [CENSORED]. ## Synopsis -- Currently, many components of Git are still in the form of Shell or Perl scripts. This choice makes sense especially when considering how faster new features can be implemented in Shell and Perl scripts, rather than writing them in C. However, in production, where Git runs daily on millions of computers with distinct configurations (i.e. different operating systems) some problems appear (i.e. POSIX-to- Windows path conversion issues). The idea of this project is to take “git-stash.sh” and reimplement it in C. Apart from fixing compatibility issues and increasing the performance by going one step closer to native code, I believe this is an excellent excuse to fix long-standing bugs or implement new minor features. ## Benefits to community -- The essential benefit brought by rewriting Git commands is the increased compatibility with a large number computers with distinct configuration. I believe that this project can have a positive impact on a large mass of developers around the world. By rewriting the code behind some popular commands and making them “built-in”, developers will have a better overall experience when using Git and get to focus on what really matters to them. As a side effect, there will be a number of other improvements: increased performance, ability to rethink the design of some code that suffered from patching along the time, have the chance to create new features and fix old bugs. In theory, switching from Bash or Shell scripts, Git will
Re: [RFC PATCH v3.1 2/9 2/2] rebase-interactive: Do not automatically run code
On Thu, Mar 22, 2018 at 1:46 PM, Junio C Hamanowrote: > Wink Saville writes: > >> At Junio's suggestion have git-rebase--am and git-rebase--merge work the >> same way as git-rebase--interactive. This makes the code more consistent. > > I mumbled about making git_rebase__$type functions for all $type in > my previous response, but that was done without even looking at > git-rebase--$type.sh scriptlets. It seems that they all shared the > same structure (i.e. define git_rebase__$type function and then at > the end clla it) and were consistent already. It was the v3 that > changed the calling convention only for interactive, which made it > inconsistent. If you are making git-rebase.sh call the helper shell > function for all backend $type, you are keeping the existing > consistency. > > This is no longer about "interactive" alone, though, and need to be > retitled ;-) > >> Signed-off-by: Wink Saville >> --- >> git-rebase--am.sh | 17 ++--- >> git-rebase--interactive.sh | 8 +++- >> git-rebase--merge.sh | 17 ++--- >> git-rebase.sh | 13 - >> 4 files changed, 23 insertions(+), 32 deletions(-) >> >> diff --git a/git-rebase--am.sh b/git-rebase--am.sh >> index be3f06892..47dc69ed9 100644 >> --- a/git-rebase--am.sh >> +++ b/git-rebase--am.sh >> @@ -4,17 +4,14 @@ >> # Copyright (c) 2010 Junio C Hamano. >> # >> >> +# The whole contents of this file is loaded by dot-sourcing it from >> +# inside another shell function, hence no shebang on the first line >> +# and then the caller invokes git_rebase__am. > > Is this comment necessary? Removed > >> +# Previously this file was sourced and it called itself to get this >> +# was to get around a bug in older (9.x) versions of FreeBSD. > > ECANTPARSE. But this probably is no longer needed here, even though > it may make sense to explain why this comment is no longer relevant > in the log message. E.g. > > The backend scriptlets for "git rebase" are structured in a > bit unusual way for historical reasons. Originally, it was > designed in such a way that dot-sourcing them from "git > rebase" would be sufficient to invoke the specific backend. > When it was discovered that some shell implementations > (e.g. FreeBSD 9.x) misbehaved by exiting when "return" is > executed at the top level of a dot-sourced script (the > original was expecting that the control returns to the next > command in "git rebase" after dot-sourcing the scriptlet), > the whole body of git-rebase--$backend.sh was made into a > shell function git_rebase__$backend and then the scriptlet > was made to call this function at the end as a workaround. > > Move the call to "git rebase" side, instead of at the end of > each scriptlet. This would give us a more normal > arrangement where a function library lives in a scriptlet > that is dot-sourced, and then these helper functions are > called by the script that dot-sourced the scriptlet. > > While at it, remove the large comment that explains why this > rather unusual structure was used from these scriptlets. > > or something like that in the log message, and then we can get rid > of these in-code comments, I would think. Updated commit message >> git_rebase__am () { >> - >> +echo "git_rebase_am:+" 1>&5 > > debuggin'? I see similar stuff left in other parts (snipped) of > this patch. Removed debugging :( Currently I'm not rebasing the other commits (3..9) to reduce the amount of work I have to do in each review cycle, is that OK? Also, will you merge commits 1 and 2 before the other commits or is the procedure to merge the complete set at once?
[ANNOUNCE] Git v2.16.3
The latest maintenance release Git v2.16.3 is now available at the usual places. It merges many small fixes and documentation updates that have been in the 'master' branch for a few weeks. The tarballs are found at: https://www.kernel.org/pub/software/scm/git/ The following public repositories all have a copy of the 'v2.16.3' tag and the 'maint' branch that the tag points at: url = https://kernel.googlesource.com/pub/scm/git/git url = git://repo.or.cz/alt-git.git url = https://github.com/gitster/git Git v2.16.3 Release Notes = Fixes since v2.16.2 --- * "git status" after moving a path in the working tree (hence making it appear "removed") and then adding with the -N option (hence making that appear "added") detected it as a rename, but did not report the old and new pathnames correctly. * "git commit --fixup" did not allow "-m" option to be used at the same time; allow it to annotate resulting commit with more text. * When resetting the working tree files recursively, the working tree of submodules are now also reset to match. * Fix for a commented-out code to adjust it to a rather old API change around object ID. * When there are too many changed paths, "git diff" showed a warning message but in the middle of a line. * The http tracing code, often used to debug connection issues, learned to redact potentially sensitive information from its output so that it can be more safely sharable. * Crash fix for a corner case where an error codepath tried to unlock what it did not acquire lock on. * The split-index mode had a few corner case bugs fixed. * Assorted fixes to "git daemon". * Completion of "git merge -s" (in contrib/) did not work well in non-C locale. * Workaround for segfault with more recent versions of SVN. * Recently introduced leaks in fsck have been plugged. * Travis CI integration now builds the executable in 'script' phase to follow the established practice, rather than during 'before_script' phase. This allows the CI categorize the failures better ('failed' is project's fault, 'errored' is build environment's). Also contains various documentation updates and code clean-ups. Changes since v2.16.2 are as follows: Ben Peart (1): fsmonitor: update documentation to remove reference to invalid config settings Brandon Williams (1): oidmap: ensure map is initialized Christian Ludwig (1): t9001: use existing helper in send-email test Eric Sunshine (2): git-worktree.txt: fix missing ")" typo git-worktree.txt: fix indentation of example and text of 'add' command Eric Wong (2): fsck: fix leak when traversing trees git-svn: control destruction order to avoid segfault Genki Sky (1): test-lib.sh: unset XDG_CACHE_HOME Jeff King (10): t5570: use ls-remote instead of clone for interp tests t/lib-git-daemon: record daemon log daemon: fix off-by-one in logging extended attributes daemon: handle NULs in extended attribute string t/lib-git-daemon: add network-protocol helpers daemon: fix length computation in newline stripping t0205: drop redundant test git-sh-i18n: check GETTEXT_POISON before USE_GETTEXT_SCHEME commit: drop uses of get_cached_commit_buffer() revision: drop --show-all option Jonathan Tan (2): http: support cookie redaction when tracing http: support omitting data from traces Juan F. Codagnone (1): mailinfo: avoid segfault when can't open files Junio C Hamano (2): worktree: say that "add" takes an arbitrary commit in short-help Git 2.16.3 Kaartic Sivaraam (2): Doc/gitsubmodules: make some changes to improve readability and syntax Doc/git-submodule: improve readability and grammar of a sentence Mathias Rav (1): files_initial_transaction_commit(): only unlock if locked Motoki Seki (1): Documentation/gitsubmodules.txt: avoid non-ASCII apostrophes Nguyễn Thái Ngọc Duy (12): t2203: test status output with porcelain v2 format Use DIFF_DETECT_RENAME for detect_rename assignments wt-status.c: coding style fix wt-status.c: catch unhandled diff status codes wt-status.c: rename rename-related fields in wt_status_change_data wt-status.c: handle worktree renames read-cache.c: change type of "temp" in write_shared_index() read-cache.c: move tempfile creation/cleanup out of write_shared_index diff.c: flush stdout before printing rename warnings read-cache: don't write index twice if we can't write shared index completion: fix completing merge strategies on non-C locales gitignore.txt: elaborate shell glob syntax Ramsay Jones (2): config.mak.uname: remove SPARSE_FLAGS setting for cygwin
Donation made to you
-- Donation made to you. Donor: Mrs.Elisabeth Schaeffler Thumann. please Contact;(mrselisabeth.schaefflerth0...@gmail.com) for more details.
Donation made to you.
-- Donation made to you. Donor: Mrs.Elisabeth Schaeffler Thumann. please Contact;(mrselisabeth.schaefflerth0...@gmail.com) for more details.
Re: [RFC][GSoC] Project proposal: convert interactive rebase to C
Hi, here is my second draft of my proposal. As last time, any feedback is welcome :) I did not write my phone number and address here for obvious reasons, but they will be in the “about me” section of the final proposal. Apart from that, do you think there is something to add? --- ABSTRACT git is a modular source control management software, and all of its subcommands are programs on their own. A lot of them are written in C, but a couple of them are shell or Perl scripts. This is the case of git-rebase--interactive (or interactive rebase), which is a shell script. Rewriting it in C would improve its performance, its portability, and maybe its robustness. ABOUT `git-rebase` and `git-rebase--interactive` git-rebase allows to re-apply changes on top of another branch. For instance, when a local branch and a remote branch have diverged, git-rebase can re-unify them, applying each change made on the local branch on top of the remote branch. git-rebase--interactive is used to reorganize commits by reordering, rewording, or squashing them. To achieve this purpose, git opens the list of commits to be modified in a text editor (hence the interactivity), as well as the actions to be performed for each of them. PROJECT GOALS The goal of this project is to rewrite git-rebase--interactive in C as it has been discussed on the git mailing list[1], for multiple reasons : Performance improvements Shell scripts are inherently slow. That’s because each command is a program by itself. So, for each command, the shell interpreter has to spawn a new process and to load a new program (with fork() and exec() syscalls), which is an expensive process. Those commands can be other git commands. Sometimes, they are wrappers to call internal C functions (eg. git-rebase--helper), something shell scripts can’t do natively. These wrappers basically parse the parameters, then start the appropriate function, which is obviously slower than just calling a function from C. Other commands can be POSIX utilities (eg. sed, cut, etc.). They have their own problems (speed aside), namely portability. Portability improvements Shell scripts often relies on many of those POSIX utilities, which are not necessarily natively available on all platforms (most notably, Windows), or may have more or less features depending on the implementation. Although C is not perfect portability-wise, it’s still better than shell scripts. For instance, the resulting binaries will not necessarily depend on third-party programs or libraries. RISKS Of course, rewriting a piece of software takes time, and can lead to regressions (ie. new bugs). To mitigate that risk, I should understand well the functions I want to rewrite, run tests on a regular basis and write new if needed, and of course discuss about my code with the community during reviews. APPROXIMATIVE TIMELINE Community bonding -- April 23, 2018 - May 14, 2018 During the community bonding, I would like to dive into git’s codebase, and to understand what git-rebase--interactive does under the hood. At the same time, I’d communicate with the community and my mentor, seeking for clarifications, and asking questions about how things should or should not be done. Weeks 1 & 2 -- May 14, 2018 - May 27, 2018 /From May 14 to 18, I have exams at the university, so I won’t be able to work full time./ I would search for edge cases not covered by current tests and write some if needed. Week 3 -- May 28, 2018 - June 3, 2018 At the same time, I would refactor --preserve-merges in its own shell script (as described in Dscho’s email[1]), if it has not been deprecated or moved in the meantime. This operation is not really tricky by itself, as --preserve-merges is about only 50 lines of code into git_rebase__interactive(). Weeks 4 to 7 -- May 4, 2018 - July 1, 2018 Then, I would start to incrementally rewrite git-rebase--interactive.sh functions in C, and move them git-rebase--helper.c (as in commits 0cce4a2756[2] (rebase -i -x: add exec commands via the rebase--helper) and b903674b35[3] (bisect--helper: `is_expected_rev` & `check_expected_revs` shell function in C)). There is a lot of functions into git-rebase--interactive.sh to rewrite. Most of them are small, and some of them are even wrappers for a single command (eg. is_merge_commit()), so they shouldn’t be really problematic. A couple of them are quite long (eg. pick_one()), and will probably be even longer once rewritten in C due to the low-level nature of the language. They also tend to depend a lot on other smaller functions. The plan here would be to start rewriting the smaller functions when applicable (ie. they’re not a simple command wrapper) before working on the biggest of them. Week 8 -- July 2, 2018 - July 8, 2018 When all majors functions from git-rebase--interactive.sh have been rewritten in C, I would retire the script in favor of a builtin. Weeks 9 & 10 -- July 9, 2018 - July 22, 2018 I plan to spend theses two weeks to improve the
Re: The most efficient way to test if repositories share the same objects
Konstantin Ryabitsevwrites: > $ time git rev-list --max-parents=0 HEAD > a101ad945113be3d7f283a181810d76897f0a0d6 > cd26f1bd6bf3c73cc5afe848677b430ab342a909 > be0e5c097fc206b863ce9fe6b3cfd6974b0110f4 > 1da177e4c3f41524e886b7f1b8a0c1fc7321cac2 > > real0m6.311s > user0m6.153s > sys 0m0.110s > > If I try to do this for each of the 7700 heads, this will take roughly > 12 hours. Wouldn't it be more efficient to avoid doing so one-by-one? That is, wouldn't rev-list --max-parents=0 --all be a bit faster than for-each-ref | while read object type refname do rev-list --max-parents=0 $refname done I wonder?
Re: The most efficient way to test if repositories share the same objects
On 03/22/18 15:35, Junio C Hamano wrote: > I am not sure how Konstantin defines "the most efficient", but if it > is "with the smallest number of bits exchanged between the > repositories", then the answer would probably be to find the root > commit(s) in each repository and if they share any common root(s). > If there isn't then there is no hope to share objects between them, > of course. Hmm... so, this a cool idea that I'd like to use, but there are two annoying gotchas: 1. I cannot assume that refs/heads/master is meaningful -- my problem is actually with something like https://source.codeaurora.org/quic/la/kernel/msm-3.18 -- you will find that master is actually unborn and there are 7700 other heads (don't get me started on that unless you're buying me a lot of drinks). 2. Even if there is a HEAD I know I can use, it's pretty slow on large repos (e.g. linux.git): $ time git rev-list --max-parents=0 HEAD a101ad945113be3d7f283a181810d76897f0a0d6 cd26f1bd6bf3c73cc5afe848677b430ab342a909 be0e5c097fc206b863ce9fe6b3cfd6974b0110f4 1da177e4c3f41524e886b7f1b8a0c1fc7321cac2 real0m6.311s user0m6.153s sys 0m0.110s If I try to do this for each of the 7700 heads, this will take roughly 12 hours. My current strategy has been pretty much: git -C msm-3.10.git show-ref --tags -s | sort -u > /tmp/refs1 git -C msm-3.18.git show-ref --tags -s | sort -u > /tmp/refs2 and then checking if an intersection of these matches at least half of refs in either repo: #/usr/bin/env python import numpy refs1 = numpy.array(open('/tmp/refs1').readlines()) refs2 = numpy.array(open('/tmp/refs2').readlines()) in_common = len(numpy.intersect1d(refs1, refs2)) if in_common > len(refs1)/2 or in_common > len(refs2)/2: print('Lots of shared refs') else: print('None or too few shared refs') This works well enough at least for those repos with lots of shared tags, but will miss potentially large repos where there's only heads that can be pointing at commits that aren't necessarily the same between two repos. Thanks for your help! Best, -- Konstantin Ryabitsev Director, IT Infrastructure Security The Linux Foundation signature.asc Description: OpenPGP digital signature
Re: [RFC PATCH v3.1 2/9 2/2] rebase-interactive: Do not automatically run code
Wink Savillewrites: > At Junio's suggestion have git-rebase--am and git-rebase--merge work the > same way as git-rebase--interactive. This makes the code more consistent. I mumbled about making git_rebase__$type functions for all $type in my previous response, but that was done without even looking at git-rebase--$type.sh scriptlets. It seems that they all shared the same structure (i.e. define git_rebase__$type function and then at the end clla it) and were consistent already. It was the v3 that changed the calling convention only for interactive, which made it inconsistent. If you are making git-rebase.sh call the helper shell function for all backend $type, you are keeping the existing consistency. This is no longer about "interactive" alone, though, and need to be retitled ;-) > Signed-off-by: Wink Saville > --- > git-rebase--am.sh | 17 ++--- > git-rebase--interactive.sh | 8 +++- > git-rebase--merge.sh | 17 ++--- > git-rebase.sh | 13 - > 4 files changed, 23 insertions(+), 32 deletions(-) > > diff --git a/git-rebase--am.sh b/git-rebase--am.sh > index be3f06892..47dc69ed9 100644 > --- a/git-rebase--am.sh > +++ b/git-rebase--am.sh > @@ -4,17 +4,14 @@ > # Copyright (c) 2010 Junio C Hamano. > # > > +# The whole contents of this file is loaded by dot-sourcing it from > +# inside another shell function, hence no shebang on the first line > +# and then the caller invokes git_rebase__am. Is this comment necessary? > +# Previously this file was sourced and it called itself to get this > +# was to get around a bug in older (9.x) versions of FreeBSD. ECANTPARSE. But this probably is no longer needed here, even though it may make sense to explain why this comment is no longer relevant in the log message. E.g. The backend scriptlets for "git rebase" are structured in a bit unusual way for historical reasons. Originally, it was designed in such a way that dot-sourcing them from "git rebase" would be sufficient to invoke the specific backend. When it was discovered that some shell implementations (e.g. FreeBSD 9.x) misbehaved by exiting when "return" is executed at the top level of a dot-sourced script (the original was expecting that the control returns to the next command in "git rebase" after dot-sourcing the scriptlet), the whole body of git-rebase--$backend.sh was made into a shell function git_rebase__$backend and then the scriptlet was made to call this function at the end as a workaround. Move the call to "git rebase" side, instead of at the end of each scriptlet. This would give us a more normal arrangement where a function library lives in a scriptlet that is dot-sourced, and then these helper functions are called by the script that dot-sourced the scriptlet. While at it, remove the large comment that explains why this rather unusual structure was used from these scriptlets. or something like that in the log message, and then we can get rid of these in-code comments, I would think. > git_rebase__am () { > - > +echo "git_rebase_am:+" 1>&5 debuggin'? I see similar stuff left in other parts (snipped) of this patch.
[RFC PATCH v3.1 2/9 2/2] rebase-interactive: Do not automatically run code
At Junio's suggestion have git-rebase--am and git-rebase--merge work the same way as git-rebase--interactive. This makes the code more consistent. Signed-off-by: Wink Saville--- git-rebase--am.sh | 17 ++--- git-rebase--interactive.sh | 8 +++- git-rebase--merge.sh | 17 ++--- git-rebase.sh | 13 - 4 files changed, 23 insertions(+), 32 deletions(-) diff --git a/git-rebase--am.sh b/git-rebase--am.sh index be3f06892..47dc69ed9 100644 --- a/git-rebase--am.sh +++ b/git-rebase--am.sh @@ -4,17 +4,14 @@ # Copyright (c) 2010 Junio C Hamano. # -# The whole contents of this file is run by dot-sourcing it from -# inside a shell function. It used to be that "return"s we see -# below were not inside any function, and expected to return -# to the function that dot-sourced us. +# The whole contents of this file is loaded by dot-sourcing it from +# inside another shell function, hence no shebang on the first line +# and then the caller invokes git_rebase__am. # -# However, older (9.x) versions of FreeBSD /bin/sh misbehave on such a -# construct and continue to run the statements that follow such a "return". -# As a work-around, we introduce an extra layer of a function -# here, and immediately call it after defining it. +# Previously this file was sourced and it called itself to get this +# was to get around a bug in older (9.x) versions of FreeBSD. git_rebase__am () { - +echo "git_rebase_am:+" 1>&5 case "$action" in continue) git am --resolved --resolvemsg="$resolvemsg" \ @@ -105,5 +102,3 @@ fi move_to_original_branch } -# ... and then we call the whole thing. -git_rebase__am diff --git a/git-rebase--interactive.sh b/git-rebase--interactive.sh index 213d75f43..48f358333 100644 --- a/git-rebase--interactive.sh +++ b/git-rebase--interactive.sh @@ -740,8 +740,14 @@ get_missing_commit_check_level () { printf '%s' "$check_level" | tr 'A-Z' 'a-z' } +# The whole contents of this file is loaded by dot-sourcing it from +# inside another shell function, hence no shebang on the first line +# and then the caller invokes git_rebase__interactive. +# +# Previously this file was sourced and it called itself to get this +# was to get around a bug in older (9.x) versions of FreeBSD. git_rebase__interactive () { - +echo "git_rebase_interactive:+" 1>&5 case "$action" in continue) if test ! -d "$rewritten" diff --git a/git-rebase--merge.sh b/git-rebase--merge.sh index ceb715453..71de80788 100644 --- a/git-rebase--merge.sh +++ b/git-rebase--merge.sh @@ -104,17 +104,14 @@ finish_rb_merge () { say All done. } -# The whole contents of this file is run by dot-sourcing it from -# inside a shell function. It used to be that "return"s we see -# below were not inside any function, and expected to return -# to the function that dot-sourced us. +# The whole contents of this file is loaded by dot-sourcing it from +# inside another shell function, hence no shebang on the first line +# and then the caller invokes git_rebase__merge. # -# However, older (9.x) versions of FreeBSD /bin/sh misbehave on such a -# construct and continue to run the statements that follow such a "return". -# As a work-around, we introduce an extra layer of a function -# here, and immediately call it after defining it. +# Previously this file was sourced and it called itself to get this +# was to get around a bug in older (9.x) versions of FreeBSD. git_rebase__merge () { - +echo "git_rebase_merge:+" 1>&5 case "$action" in continue) read_state @@ -171,5 +168,3 @@ done finish_rb_merge } -# ... and then we call the whole thing. -git_rebase__merge diff --git a/git-rebase.sh b/git-rebase.sh index c4ec7c21b..4595a316a 100755 --- a/git-rebase.sh +++ b/git-rebase.sh @@ -196,15 +196,10 @@ run_specific_rebase () { export GIT_EDITOR autosquash= fi - if test "$type" = interactive - then - . git-rebase--interactive - git_rebase__interactive - ret=$? - else - . git-rebase--$type - ret=$? - fi + # Source the code and invoke it + . git-rebase--$type + git_rebase__$type + ret=$? if test $ret -eq 0 then finish_rebase -- 2.16.2
git: cover letter and automatic Cc: [was Re: [PATCH 23/45] sched: add do_sched_yield() helper; remove in-kernel call to sched_yield()]
Linus Torvalds wrote: > On Thu, Mar 22, 2018 at 10:29 AM, Peter Zijlstrawrote: >> >> But why !? Either Cc me on more of the series such that the whole makes >> sense, or better yet, write a proper Changelog. > > This is a common issue. We should encourage people to always send at > least the cover-page to everybody who gets cc'd, even if they don't > get the whole series. git should be smart enough to do it automatically by itself.
Re: The most efficient way to test if repositories share the same objects
On Thu, Mar 22 2018, Junio C. Hamano wrote: > Ævar Arnfjörð Bjarmasonwrites: > >> But of course that'll just give you the tips. You could then use `git >> cat-file --batch-check` on both ends to see what commits from the other >> they report knowing about, in case they have branches that are >> ahead/behind the other. > > I am not sure how you are envisioning to use "cat-file > --batch-check" here. Do you mean to take "rev-list --all" output > from both and compare, or something? By doing something like this: ( cd /tmp && git clone http://github.com/gitster/git gitster-git; git clone http://github.com/avar/git avar-git; git -C gitster-git for-each-ref --format="%(object)" | git -C avar-git cat-file --batch-check|grep -E -o '(commit|missing)'|sort|uniq -c && git -C avar-git for-each-ref --format="%(object)" | git -C gitster-git cat-file --batch-check|grep -E -o '(commit|missing)'|sort|uniq -c ) Which outputs: 673 commit 696 missing 374 commit 495 missing Which of course, as noted, isn't going to be a good general solution in all cases, but it's blindingly fast, and since the original question is essentially that he's starting out with doing a rough equivalent of that, but for tags only, maybe it'll work for his use-case. > I am not sure how Konstantin defines "the most efficient", but if it > is "with the smallest number of bits exchanged between the > repositories", then the answer would probably be to find the root > commit(s) in each repository and if they share any common root(s). > If there isn't then there is no hope to share objects between them, > of course. Yes, that would probably be much better: diff -ru <(git -C avar-git log --oneline --pretty=format:%H --max-parents=0) \ <(git -C gitster-git log --oneline --pretty=format:%H --max-parents=0) && do_stuff_because_they_have_stuff_in_common
Re: The most efficient way to test if repositories share the same objects
Ævar Arnfjörð Bjarmasonwrites: > But of course that'll just give you the tips. You could then use `git > cat-file --batch-check` on both ends to see what commits from the other > they report knowing about, in case they have branches that are > ahead/behind the other. I am not sure how you are envisioning to use "cat-file --batch-check" here. Do you mean to take "rev-list --all" output from both and compare, or something? I am not sure how Konstantin defines "the most efficient", but if it is "with the smallest number of bits exchanged between the repositories", then the answer would probably be to find the root commit(s) in each repository and if they share any common root(s). If there isn't then there is no hope to share objects between them, of course.
Re: [RFC PATCH v3 2/9] Call git_rebase__interactive from run_specific_rebase
On Thu, Mar 22, 2018 at 11:27 AM, Junio C Hamanowrote: > Wink Saville writes: > >> Instead of indirectly invoking git_rebase__interactive this invokes >> it directly after sourcing. >> >> Signed-off-by: Wink Saville >> --- >> git-rebase--interactive.sh | 11 --- >> git-rebase.sh | 11 +-- >> 2 files changed, 9 insertions(+), 13 deletions(-) >> >> diff --git a/git-rebase--interactive.sh b/git-rebase--interactive.sh >> index 561e2660e..213d75f43 100644 >> --- a/git-rebase--interactive.sh >> +++ b/git-rebase--interactive.sh >> @@ -740,15 +740,6 @@ get_missing_commit_check_level () { >> printf '%s' "$check_level" | tr 'A-Z' 'a-z' >> } >> >> -# The whole contents of this file is run by dot-sourcing it from >> -# inside a shell function. It used to be that "return"s we see >> -# below were not inside any function, and expected to return >> -# to the function that dot-sourced us. >> -# >> -# However, older (9.x) versions of FreeBSD /bin/sh misbehave on such a >> -# construct and continue to run the statements that follow such a "return". >> -# As a work-around, we introduce an extra layer of a function >> -# here, and immediately call it after defining it. > > We still enclose the whole thing (including the returns that are > problematic for older FreeBSD shells) in a shell function, so it's > not like we are dropping the workaround for these systems. It's > just the caller of the function moved. > > I think the removal of this large comment is justifiable, but the > structure still needs a bit of explanation, especially given that > the caller in git-rebase.sh needs to treat this scriptlet a bit > differently from others. > > If we were not in the (longer term) process of getting rid of > git-rebase.sh, it might even make sense to port the same > "dot-sourced scriptlet defines a shell function to be called, and > the caller calls it after dot-sourcing it" pattern to other rebase > backends, so that the calling side can be unified again to become > something like: > > . git-rebase--$type > git_rebase__$type > ret=$? I've gone with the above suggestion and added back some of the header.
Re: The most efficient way to test if repositories share the same objects
On Thu, Mar 22 2018, Konstantin Ryabitsev wrote: > What is the most efficient way to test if repoA and repoB share common > commits? My goal is to automatically figure out if repoB can benefit > from setting alternates to repoA and repacking. I currently do it by > comparing the output of "show-ref --tags -s", but that does not work for > repos without tags. If you're using show-ref already to get the tag tips, you can use for-each-ref to get all tips. But of course that'll just give you the tips. You could then use `git cat-file --batch-check` on both ends to see what commits from the other they report knowing about, in case they have branches that are ahead/behind the other.
[GSoC][PATCH v7] parse-options: do not show usage upon invalid option value
Usually, the usage should be shown only if the user does not know what options are available. If the user specifies an invalid value, the user is already aware of the available options. In this case, there is no point in displaying the usage anymore. This patch applies to "git tag --contains", "git branch --contains", "git branch --points-at", "git for-each-ref --contains" and many more. Signed-off-by: Paul-Sebastian Ungureanu--- The only differences between this patch and the previous one [1] are related to t/t0041-usage.sh tests, based on Junio's feedback. [1] https://public-inbox.org/git/20180320175005.18759-1-ungureanupaulsebast...@gmail.com/ builtin/blame.c | 1 + builtin/shortlog.c| 1 + builtin/update-index.c| 1 + parse-options.c | 20 --- parse-options.h | 1 + t/t0040-parse-options.sh | 2 +- t/t0041-usage.sh | 107 ++ t/t3404-rebase-interactive.sh | 6 +- 8 files changed, 125 insertions(+), 14 deletions(-) create mode 100755 t/t0041-usage.sh diff --git a/builtin/blame.c b/builtin/blame.c index 9dcb367b9..e8c6a4d6a 100644 --- a/builtin/blame.c +++ b/builtin/blame.c @@ -729,6 +729,7 @@ int cmd_blame(int argc, const char **argv, const char *prefix) for (;;) { switch (parse_options_step(, options, blame_opt_usage)) { case PARSE_OPT_HELP: + case PARSE_OPT_ERROR: exit(129); case PARSE_OPT_DONE: if (ctx.argv[0]) diff --git a/builtin/shortlog.c b/builtin/shortlog.c index e29875b84..be4df6a03 100644 --- a/builtin/shortlog.c +++ b/builtin/shortlog.c @@ -283,6 +283,7 @@ int cmd_shortlog(int argc, const char **argv, const char *prefix) for (;;) { switch (parse_options_step(, options, shortlog_usage)) { case PARSE_OPT_HELP: + case PARSE_OPT_ERROR: exit(129); case PARSE_OPT_DONE: goto parse_done; diff --git a/builtin/update-index.c b/builtin/update-index.c index 58d1c2d28..34adf55a7 100644 --- a/builtin/update-index.c +++ b/builtin/update-index.c @@ -1059,6 +1059,7 @@ int cmd_update_index(int argc, const char **argv, const char *prefix) break; switch (parseopt_state) { case PARSE_OPT_HELP: + case PARSE_OPT_ERROR: exit(129); case PARSE_OPT_NON_OPTION: case PARSE_OPT_DONE: diff --git a/parse-options.c b/parse-options.c index 125e84f98..0f7059a8a 100644 --- a/parse-options.c +++ b/parse-options.c @@ -317,14 +317,16 @@ static int parse_long_opt(struct parse_opt_ctx_t *p, const char *arg, return get_value(p, options, all_opts, flags ^ opt_flags); } - if (ambiguous_option) - return error("Ambiguous option: %s " + if (ambiguous_option) { + error("Ambiguous option: %s " "(could be --%s%s or --%s%s)", arg, (ambiguous_flags & OPT_UNSET) ? "no-" : "", ambiguous_option->long_name, (abbrev_flags & OPT_UNSET) ? "no-" : "", abbrev_option->long_name); + return -3; + } if (abbrev_option) return get_value(p, abbrev_option, all_opts, abbrev_flags); return -2; @@ -476,7 +478,6 @@ int parse_options_step(struct parse_opt_ctx_t *ctx, const char * const usagestr[]) { int internal_help = !(ctx->flags & PARSE_OPT_NO_INTERNAL_HELP); - int err = 0; /* we must reset ->opt, unknown short option leave it dangling */ ctx->opt = NULL; @@ -505,7 +506,7 @@ int parse_options_step(struct parse_opt_ctx_t *ctx, ctx->opt = arg + 1; switch (parse_short_opt(ctx, options)) { case -1: - goto show_usage_error; + return PARSE_OPT_ERROR; case -2: if (ctx->opt) check_typos(arg + 1, options); @@ -518,7 +519,7 @@ int parse_options_step(struct parse_opt_ctx_t *ctx, while (ctx->opt) { switch (parse_short_opt(ctx, options)) { case -1: - goto show_usage_error; + return PARSE_OPT_ERROR; case -2: if (internal_help && *ctx->opt == 'h') goto show_usage; @@ -550,9 +551,11 @@ int parse_options_step(struct parse_opt_ctx_t
Re: [GSoC][PATCH v6] parse-options: do not show usage upon invalid option value
Hi, Thank you a lot for your advice! I will keep in mind your words next time I will send a patch. Best regards, Paul Ungurenanu
Re: [RFC PATCH v3 2/9] Call git_rebase__interactive from run_specific_rebase
Wink Savillewrites: > Instead of indirectly invoking git_rebase__interactive this invokes > it directly after sourcing. > > Signed-off-by: Wink Saville > --- > git-rebase--interactive.sh | 11 --- > git-rebase.sh | 11 +-- > 2 files changed, 9 insertions(+), 13 deletions(-) > > diff --git a/git-rebase--interactive.sh b/git-rebase--interactive.sh > index 561e2660e..213d75f43 100644 > --- a/git-rebase--interactive.sh > +++ b/git-rebase--interactive.sh > @@ -740,15 +740,6 @@ get_missing_commit_check_level () { > printf '%s' "$check_level" | tr 'A-Z' 'a-z' > } > > -# The whole contents of this file is run by dot-sourcing it from > -# inside a shell function. It used to be that "return"s we see > -# below were not inside any function, and expected to return > -# to the function that dot-sourced us. > -# > -# However, older (9.x) versions of FreeBSD /bin/sh misbehave on such a > -# construct and continue to run the statements that follow such a "return". > -# As a work-around, we introduce an extra layer of a function > -# here, and immediately call it after defining it. We still enclose the whole thing (including the returns that are problematic for older FreeBSD shells) in a shell function, so it's not like we are dropping the workaround for these systems. It's just the caller of the function moved. I think the removal of this large comment is justifiable, but the structure still needs a bit of explanation, especially given that the caller in git-rebase.sh needs to treat this scriptlet a bit differently from others. If we were not in the (longer term) process of getting rid of git-rebase.sh, it might even make sense to port the same "dot-sourced scriptlet defines a shell function to be called, and the caller calls it after dot-sourcing it" pattern to other rebase backends, so that the calling side can be unified again to become something like: . git-rebase--$type git_rebase__$type ret=$? > git_rebase__interactive () { > > case "$action" in > @@ -1029,5 +1020,3 @@ fi > do_rest > > } > -# ... and then we call the whole thing. > -git_rebase__interactive > diff --git a/git-rebase.sh b/git-rebase.sh > index a1f6e5de6..c4ec7c21b 100755 > --- a/git-rebase.sh > +++ b/git-rebase.sh > @@ -196,8 +196,15 @@ run_specific_rebase () { > export GIT_EDITOR > autosquash= > fi > - . git-rebase--$type > - ret=$? > + if test "$type" = interactive > + then > + . git-rebase--interactive > + git_rebase__interactive > + ret=$? > + else > + . git-rebase--$type > + ret=$? > + fi > if test $ret -eq 0 > then > finish_rebase
Re: [PATCH] completion: add option completion for most builtin commands
On Thu, Mar 22, 2018 at 6:56 PM, Junio C Hamanowrote: > Duy Nguyen writes: > >> +__git_main_with_parseopt_helper=' >> + blame cat-file check-attr check-ignore >> + check-mailmap checkout-index column count-objects fast-export >> + hash-object index-pack interpret-trailers merge-file mktree >> + pack-objects pack-refs prune prune-packed read-tree repack >> + send-pack show-ref stripspace symbolic-ref update-index >> + update-ref verify-commit verify-tag write-tree >> +' >> +__git_complete_command() { >> + local command="$1" >> + local completion_func="_git_${command//-/_}" >> + if declare -f $completion_func >/dev/null 2>/dev/null; then >> + $completion_func >> + elif echo $__git_main_with_parseopt_helper | git grep -w "$command" >> >/dev/null; then > > "git grep"??? GIt has taught my fingers some bad habit. > > I imagined that you'd keep an associative shell array (we are not > constrained by POSIX here) that can be used like so > > elif test -n "${__git_main_with_parseopt_helper[$command]}"; then > Great. We could even kill two existing _git_xxx functions because they are too simple they could be replaced with this new code. I'll send out a series later. -- Duy
Re: [PATCH] completion: add option completion for most builtin commands
Duy Nguyenwrites: > +__git_main_with_parseopt_helper=' > + blame cat-file check-attr check-ignore > + check-mailmap checkout-index column count-objects fast-export > + hash-object index-pack interpret-trailers merge-file mktree > + pack-objects pack-refs prune prune-packed read-tree repack > + send-pack show-ref stripspace symbolic-ref update-index > + update-ref verify-commit verify-tag write-tree > +' > +__git_complete_command() { > + local command="$1" > + local completion_func="_git_${command//-/_}" > + if declare -f $completion_func >/dev/null 2>/dev/null; then > + $completion_func > + elif echo $__git_main_with_parseopt_helper | git grep -w "$command" > >/dev/null; then "git grep"??? I imagined that you'd keep an associative shell array (we are not constrained by POSIX here) that can be used like so elif test -n "${__git_main_with_parseopt_helper[$command]}"; then Of course, a more traditional way to write it without spawning grep or pipe is case " $__git_main_with_parseopt_helper " in *" $command "*) ... Yes, $command is on the list ... ;; *) ... No, $command is not on the list ... ;; esac
Re: [GSoC] Regarding "Convert scripts to builtins"
On Thu, Mar 22, 2018, 10:32 AM Johannes Schindelinwrote: > > Hi Wink, > > > Please see "[RFC PATCH 0/3] rebase-interactive" and > > "[RFC PATCH v2 0/1] rebase-interactive: ...". I'm looking for > > advice on how to proceed. > > Sadly, I had almost no time to spend on the Git mailing list today, but I > will have a look tomorrow, okay? NP, I totally understand and, of course, I now have a version 3: "[RFC PATCH v3 0/9] rebase-interactive:" Cheers, Winthrop Lyon Saville III (I had to repsond with my full name although I always use my nick name, Wink, just because Johannes seems so formal :)
[PATCH v2 3/3] sha1_name: use bsearch_pack() for abbreviations
When computing abbreviation lengths for an object ID against a single packfile, the method find_abbrev_len_for_pack() currently implements binary search. This is one of several implementations. One issue with this implementation is that it ignores the fanout table in the pack- index. Translate this binary search to use the existing bsearch_pack() method that correctly uses a fanout table. Due to the use of the fanout table, the abbreviation computation is slightly faster than before. For a fully-repacked copy of the Linux repo, the following 'git log' commands improved: * git log --oneline --parents --raw Before: 59.2s After: 56.9s Rel %: -3.8% * git log --oneline --parents Before: 6.48s After: 5.91s Rel %: -8.9% Signed-off-by: Derrick Stolee--- sha1_name.c | 24 1 file changed, 4 insertions(+), 20 deletions(-) diff --git a/sha1_name.c b/sha1_name.c index 16e0003396..24894b3dbe 100644 --- a/sha1_name.c +++ b/sha1_name.c @@ -512,32 +512,16 @@ static void find_abbrev_len_for_pack(struct packed_git *p, struct min_abbrev_data *mad) { int match = 0; - uint32_t num, last, first = 0; + uint32_t num, first = 0; struct object_id oid; + const struct object_id *mad_oid; if (open_pack_index(p) || !p->num_objects) return; num = p->num_objects; - last = num; - while (first < last) { - uint32_t mid = first + (last - first) / 2; - const unsigned char *current; - int cmp; - - current = nth_packed_object_sha1(p, mid); - cmp = hashcmp(mad->oid->hash, current); - if (!cmp) { - match = 1; - first = mid; - break; - } - if (cmp > 0) { - first = mid + 1; - continue; - } - last = mid; - } + mad_oid = mad->oid; + match = bsearch_pack(mad_oid, p, ); /* * first is now the position in the packfile where we would insert -- 2.17.0.rc0
[PATCH v2 0/3] Use bsearch_hash() for abbreviations
Thanks to Jonathan and Brian for the help with the proper way to handle OIDs and existing callers to bsearch_hash(). This patch includes one commit that Brian sent in the previous discussion (included again here for completeness). Derrick Stolee (2): packfile: define and use bsearch_pack() sha1_name: use bsearch_pack() for abbreviations brian m. carlson (1): sha1_name: use bsearch_hash() for abbreviations packfile.c | 42 ++ packfile.h | 8 sha1_name.c | 28 ++-- 3 files changed, 40 insertions(+), 38 deletions(-) base-commit: 1a750441a7360b29fff7a414649ece1d35acaca6 -- 2.17.0.rc0
[PATCH v2 2/3] packfile: define and use bsearch_pack()
The method bsearch_hash() generalizes binary searches using a fanout table. The only consumer is currently find_pack_entry_one(). It requires a bit of pointer arithmetic to align the fanout table and the lookup table depending on the pack-index version. Extract the pack-index pointer arithmetic to a new method, bsearch_pack(), so this can be re-used in other code paths. Signed-off-by: Derrick Stolee--- packfile.c | 42 ++ packfile.h | 8 2 files changed, 34 insertions(+), 16 deletions(-) diff --git a/packfile.c b/packfile.c index f26395ecab..69d3afda6c 100644 --- a/packfile.c +++ b/packfile.c @@ -1654,6 +1654,29 @@ void *unpack_entry(struct packed_git *p, off_t obj_offset, return data; } +int bsearch_pack(const struct object_id *oid, const struct packed_git *p, uint32_t *result) +{ + const unsigned char *index_fanout = p->index_data; + const unsigned char *index_lookup; + int index_lookup_width; + + if (!index_fanout) + BUG("bsearch_pack called without a valid pack-index"); + + index_lookup = index_fanout + 4 * 256; + if (p->index_version == 1) { + index_lookup_width = 24; + index_lookup += 4; + } else { + index_lookup_width = 20; + index_fanout += 8; + index_lookup += 8; + } + + return bsearch_hash(oid->hash, (const uint32_t*)index_fanout, + index_lookup, index_lookup_width, result); +} + const unsigned char *nth_packed_object_sha1(struct packed_git *p, uint32_t n) { @@ -1720,30 +1743,17 @@ off_t nth_packed_object_offset(const struct packed_git *p, uint32_t n) off_t find_pack_entry_one(const unsigned char *sha1, struct packed_git *p) { - const uint32_t *level1_ofs = p->index_data; const unsigned char *index = p->index_data; - unsigned stride; + struct object_id oid; uint32_t result; if (!index) { if (open_pack_index(p)) return 0; - level1_ofs = p->index_data; - index = p->index_data; - } - if (p->index_version > 1) { - level1_ofs += 2; - index += 8; - } - index += 4 * 256; - if (p->index_version > 1) { - stride = 20; - } else { - stride = 24; - index += 4; } - if (bsearch_hash(sha1, level1_ofs, index, stride, )) + hashcpy(oid.hash, sha1); + if (bsearch_pack(, p, )) return nth_packed_object_offset(p, result); return 0; } diff --git a/packfile.h b/packfile.h index a7fca598d6..ec08cb2bb0 100644 --- a/packfile.h +++ b/packfile.h @@ -78,6 +78,14 @@ extern struct packed_git *add_packed_git(const char *path, size_t path_len, int */ extern void check_pack_index_ptr(const struct packed_git *p, const void *ptr); +/* + * Perform binary search on a pack-index for a given oid. Packfile is expected to + * have a valid pack-index. + * + * See 'bsearch_hash' for more information. + */ +int bsearch_pack(const struct object_id *oid, const struct packed_git *p, uint32_t *result); + /* * Return the SHA-1 of the nth object within the specified packfile. * Open the index if it is not already open. The return value points -- 2.17.0.rc0
[PATCH v2 1/3] sha1_name: convert struct min_abbrev_data to object_id
From: "brian m. carlson"This structure is only written to in one place, where we already have a struct object_id. Convert the struct to use a struct object_id instead. Signed-off-by: brian m. carlson --- sha1_name.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/sha1_name.c b/sha1_name.c index 39e911c8ba..16e0003396 100644 --- a/sha1_name.c +++ b/sha1_name.c @@ -480,7 +480,7 @@ struct min_abbrev_data { unsigned int init_len; unsigned int cur_len; char *hex; - const unsigned char *hash; + const struct object_id *oid; }; static inline char get_hex_char_from_oid(const struct object_id *oid, @@ -526,7 +526,7 @@ static void find_abbrev_len_for_pack(struct packed_git *p, int cmp; current = nth_packed_object_sha1(p, mid); - cmp = hashcmp(mad->hash, current); + cmp = hashcmp(mad->oid->hash, current); if (!cmp) { match = 1; first = mid; @@ -603,7 +603,7 @@ int find_unique_abbrev_r(char *hex, const struct object_id *oid, int len) mad.init_len = len; mad.cur_len = len; mad.hex = hex; - mad.hash = oid->hash; + mad.oid = oid; find_abbrev_len_packed(); base-commit: 1a750441a7360b29fff7a414649ece1d35acaca6 -- 2.17.0.rc0
Re: [PATCH] completion: add option completion for most builtin commands
On Thu, Mar 22, 2018 at 10:11:53AM -0700, Junio C Hamano wrote: > Duy Nguyenwrites: > > >> And that pattern repeats throughout the patch. I wonder if we can > >> express the same a lot more concisely by updating the caller that > >> calls these command specific helpers? > > > > Yeah. I almost went to just generate and eval these functions. But we > > still need to keep a list of "bultin with --git-completion-helper" > > support somewhere, and people may want to complete arguments without > > double dashes (e.g. read-tree should take a ref...) which can't be > > helped by --git-completion-helper. > > Hmph, I actually did not have 'eval' in mind. > > Rather, I was wondering if it is cleaner to update __git_main where > it computes $completion_func by mangling $command and then calls > it---instead call __gitcomp_builtin directly when the $command > appears in such a "list of builtins that knows --completion-helper > and no custom completion". Ah. Something like this? Seems to work fine and definitely not as ugly as eval. -- 8< -- diff --git a/contrib/completion/git-completion.bash b/contrib/completion/git-completion.bash index 6da95b8095..960e26e09d 100644 --- a/contrib/completion/git-completion.bash +++ b/contrib/completion/git-completion.bash @@ -3032,6 +3032,32 @@ _git_worktree () fi } +__git_main_with_parseopt_helper=' + blame cat-file check-attr check-ignore + check-mailmap checkout-index column count-objects fast-export + hash-object index-pack interpret-trailers merge-file mktree + pack-objects pack-refs prune prune-packed read-tree repack + send-pack show-ref stripspace symbolic-ref update-index + update-ref verify-commit verify-tag write-tree +' +__git_complete_command() { + local command="$1" + local completion_func="_git_${command//-/_}" + if declare -f $completion_func >/dev/null 2>/dev/null; then + $completion_func + elif echo $__git_main_with_parseopt_helper | git grep -w "$command" >/dev/null; then + case "$cur" in + --*) + __gitcomp_builtin "$command" + return 0 + ;; + esac + return 0 + else + return 1 + fi +} + __git_main () { local i c=1 command __git_dir __git_repo_path @@ -3091,14 +3117,12 @@ __git_main () return fi - local completion_func="_git_${command//-/_}" - declare -f $completion_func >/dev/null 2>/dev/null && $completion_func && return + __git_complete_command "$command" && return local expansion=$(__git_aliased_command "$command") if [ -n "$expansion" ]; then words[1]=$expansion - completion_func="_git_${expansion//-/_}" - declare -f $completion_func >/dev/null 2>/dev/null && $completion_func + __git_complete_command "$expansion" fi } -- 8< --
Re: [GSoC] Regarding "Convert scripts to builtins"
Hi Wink, On Wed, 21 Mar 2018, Wink Saville wrote: > I plead guilty to being the preson refactoring --preserve-merges. But > after reading this and seeing that --recreate-merges is coming and > possibly git-rebase--* moving to C I'm worried I'd be messing things up. Don't worry. We will just work together to avoid messing anything up. > Also, Eric Sunshine felt my v1 changes causes the blame information to > be obscured. So I created a v2 change which keeps everything in the > git-rebase--interactive.sh. Great! > Please see "[RFC PATCH 0/3] rebase-interactive" and > "[RFC PATCH v2 0/1] rebase-interactive: ...". I'm looking for > advice on how to proceed. Sadly, I had almost no time to spend on the Git mailing list today, but I will have a look tomorrow, okay? Ciao, Johannes
Re: [PATCH] completion: add option completion for most builtin commands
Duy Nguyenwrites: >> And that pattern repeats throughout the patch. I wonder if we can >> express the same a lot more concisely by updating the caller that >> calls these command specific helpers? > > Yeah. I almost went to just generate and eval these functions. But we > still need to keep a list of "bultin with --git-completion-helper" > support somewhere, and people may want to complete arguments without > double dashes (e.g. read-tree should take a ref...) which can't be > helped by --git-completion-helper. Hmph, I actually did not have 'eval' in mind. Rather, I was wondering if it is cleaner to update __git_main where it computes $completion_func by mangling $command and then calls it---instead call __gitcomp_builtin directly when the $command appears in such a "list of builtins that knows --completion-helper and no custom completion".
Re: [PATCH] completion: clear cached --options when sourcing the completion script
On Thu, Mar 22, 2018 at 3:16 PM, SZEDER Gáborwrote: > The established way to update the completion script in an already > running shell is to simply source it again: this brings in any new > --options and features, and clears caching variables. E.g. it clears > the variables caching the list of (all|porcelain) git commands, so > when they are later lazy-initialized again, then they will list and > cache any newly installed commmands as well. > > Unfortunately, since d401f3debc (git-completion.bash: introduce > __gitcomp_builtin, 2018-02-09) and subsequent patches this doesn't > work for a lot of git commands' options. To eliminate a lot of > hard-to-maintain hard-coded lists of options, those commits changed > the completion script to use a bunch of programmatically created and > lazy-initialized variables to cache the options of those builtin > porcelain commands that use parse-options. These variables are not > cleared upon sourcing the completion script, therefore they continue > caching the old lists of options, even when some commands recently > learned new options or when deprecated options were removed. > > Always 'unset' these variables caching the options of builtin commands > when sourcing the completion script. And here I have been happily unsetting these manually when I re-source to test stuff, not thinking it as a bug. Thanks! > Redirect 'unset's stderr to /dev/null, because ZSH's 'unset' complains > if it's invoked without any arguments, i.e. no variables caching > builtin's options are set. This can happen, if someone were to source > the completion script twice without completing any --options in > between. Bash stays silent in this case. > > Add tests to ensure that these variables are indeed cleared when the > completion script is sourced; not just the variables caching options, > but all other caching variables, i.e. the variables caching commands, > porcelain commands and merge strategies as well. > > Signed-off-by: SZEDER Gábor > --- > > Fixes a recent regression introduced in 'nd/parseopt-completion'. > > contrib/completion/git-completion.bash | 4 > t/t9902-completion.sh | 31 ++ > 2 files changed, 35 insertions(+) > > diff --git a/contrib/completion/git-completion.bash > b/contrib/completion/git-completion.bash > index 7c84eb1912..602352f952 100644 > --- a/contrib/completion/git-completion.bash > +++ b/contrib/completion/git-completion.bash > @@ -280,6 +280,10 @@ __gitcomp () > esac > } > > +# Clear the variables caching builtins' options when (re-)sourcing > +# the completion script. > +unset $(set |sed -ne > 's/^\(__gitcomp_builtin_[a-zA-Z0-9_][a-zA-Z0-9_]*\)=.*/\1/p') 2>/dev/null > + > # This function is equivalent to > # > #__gitcomp "$(git xxx --git-completion-helper) ..." > diff --git a/t/t9902-completion.sh b/t/t9902-completion.sh > index e6485feb0a..4c86adadf2 100755 > --- a/t/t9902-completion.sh > +++ b/t/t9902-completion.sh > @@ -1497,4 +1497,35 @@ do > ' > done > > +test_expect_success 'sourcing the completion script clears cached commands' ' > + __git_compute_all_commands && > + verbose test -n "$__git_all_commands" && > + . "$GIT_BUILD_DIR/contrib/completion/git-completion.bash" && > + verbose test -z "$__git_all_commands" > +' > + > +test_expect_success 'sourcing the completion script clears cached porcelain > commands' ' > + __git_compute_porcelain_commands && > + verbose test -n "$__git_porcelain_commands" && > + . "$GIT_BUILD_DIR/contrib/completion/git-completion.bash" && > + verbose test -z "$__git_porcelain_commands" > +' > + > +test_expect_success 'sourcing the completion script clears cached merge > strategies' ' > + __git_compute_merge_strategies && > + verbose test -n "$__git_merge_strategies" && > + . "$GIT_BUILD_DIR/contrib/completion/git-completion.bash" && > + verbose test -z "$__git_merge_strategies" > +' > + > +test_expect_success 'sourcing the completion script clears cached --options' > ' > + __gitcomp_builtin checkout && > + verbose test -n "$__gitcomp_builtin_checkout" && > + __gitcomp_builtin notes_edit && > + verbose test -n "$__gitcomp_builtin_notes_edit" && > + . "$GIT_BUILD_DIR/contrib/completion/git-completion.bash" && > + verbose test -z "$__gitcomp_builtin_checkout" && > + verbose test -z "$__gitcomp_builtin_notes_edit" > +' > + > test_done > -- > 2.17.0.rc0.103.gbdc5836ed3 > -- Duy
Re: [PATCH v6 00/11] nd/pack-objects-pack-struct updates
On Thu, Mar 22, 2018 at 12:52 PM, Jeff Kingwrote: > On Thu, Mar 22, 2018 at 11:57:27AM +0100, Duy Nguyen wrote: > >> On Thu, Mar 22, 2018 at 10:32 AM, Jeff King wrote: >> > That would still mean you could get into a broken state for serving >> > fetches, but you could at least get out of it by running "git repack". >> >> I was puzzled by this "broken state" statement. But you were right! I >> focused on the repack case and forgot about fetch/clone case. I will >> probably just drop this patch for now. Then maybe revisit this some >> time in fiture when I find out how to deal with this nicely. > > Here's a sketch of the "separate array" concept I mentioned before, in > case that helps. Not tested at all beyond compiling. Brilliant! Sorry I couldn't read your suggestion carefully this morning. -- Duy
Re: [PATCH] filter-branch: consider refs can refer to an object other than commit or tag
Yuki Kokubunwrites: >> Yuki Kokubun writes: >> >> >> Yuki Kokubun writes: >> >> >> >> > "git filter-branch -- --all" can be confused when refs that refer to >> >> > objects >> >> > other than commits or tags exists. >> ... > > I meant the confusion is abnormal messages from the output of "git > filter-branch -- --all". OK, so it is not that the program logic gets confused and ends up performing a wrong rewrite, but mostly that it gives confusing messages. > For example, this is an output of "git filter-branch -- --all": > > Rewrite bcdbd016c77df3d5641a3cf820b2ed46ba7bf3b4 (5/5) (0 seconds passed, > remaining 0 predicted) > WARNING: Ref 'refs/heads/master' is unchanged > WARNING: Ref 'refs/heads/no-newline' is unchanged > WARNING: Ref 'refs/heads/original' is unchanged These are worth keeping, as I think existing users expect to see them. > error: object 1bf53b49c26465454e4ac377f2ed3f91bb1d6ac1 is a tree, not a commit > error: object 1bf53b49c26465454e4ac377f2ed3f91bb1d6ac1 is a tree, not a commit > fatal: ambiguous argument > 'refs/replace/8a2016f3730cad8309c110f819c855403ed0a5b9^0': unknown revision > or path not in the working tree. > Use '--' to separate paths from revisions, like this: > 'git [...] -- [...]' > WARNING: Ref 'refs/replace/8a2016f3730cad8309c110f819c855403ed0a5b9' is > unchanged > WARNING: Ref 'refs/tags/add-file' is unchanged > WARNING: Ref 'refs/tags/file' is unchanged > error: object 1bf53b49c26465454e4ac377f2ed3f91bb1d6ac1 is a tree, not a commit > error: object 1bf53b49c26465454e4ac377f2ed3f91bb1d6ac1 is a tree, not a commit > fatal: ambiguous argument 'refs/tags/treetag^0': unknown revision or path not > in the working tree. > Use '--' to separate paths from revisions, like this: > 'git [...] -- [...]' > WARNING: Ref 'refs/tags/treetag' is unchanged I think these warning messages should be kept, especially if we are to keep the warning messages for the unchanged branches. However, the internal error messages are unwanted--these are implementation details that reach the conclusion, i.e. the ref we were asked to rewrite ended up being unchanged hence we did not touch it. However, if we pre-filter to limit the refs in "$tempdir/heads" to those that are committish (i.e. those that pass "$ref^0") like the patch and subsequent discussion suggests, wouldn't we lose the warning for these replace refs and non-committish tags. We perhaps could do something like: git rev-parse --no-flags ... >"$tempdir/raw-heads" || exit while read ref do case "$ref" in ^?*) continue ;; esac if git rev-parse --verify "$ref^0" 2>/dev/null then echo "$ref" else warn "WARNING: not rewriting '$ref' (not a committish)" fi done >"$tempdir/heads" <"$tempdir/raw-heads" (note: the else clause is new, relative to my earlier suggestion).
[RFC PATCH v3 3/9] Indent function git_rebase__interactive
Signed-off-by: Wink Saville--- git-rebase--interactive.sh | 432 ++--- 1 file changed, 215 insertions(+), 217 deletions(-) diff --git a/git-rebase--interactive.sh b/git-rebase--interactive.sh index 213d75f43..a79330f45 100644 --- a/git-rebase--interactive.sh +++ b/git-rebase--interactive.sh @@ -741,27 +741,26 @@ get_missing_commit_check_level () { } git_rebase__interactive () { - -case "$action" in -continue) - if test ! -d "$rewritten" - then - exec git rebase--helper ${force_rebase:+--no-ff} $allow_empty_message \ - --continue - fi - # do we have anything to commit? - if git diff-index --cached --quiet HEAD -- - then - # Nothing to commit -- skip this commit - - test ! -f "$GIT_DIR"/CHERRY_PICK_HEAD || - rm "$GIT_DIR"/CHERRY_PICK_HEAD || - die "$(gettext "Could not remove CHERRY_PICK_HEAD")" - else - if ! test -f "$author_script" + case "$action" in + continue) + if test ! -d "$rewritten" then - gpg_sign_opt_quoted=${gpg_sign_opt:+$(git rev-parse --sq-quote "$gpg_sign_opt")} - die "$(eval_gettext "\ + exec git rebase--helper ${force_rebase:+--no-ff} $allow_empty_message \ + --continue + fi + # do we have anything to commit? + if git diff-index --cached --quiet HEAD -- + then + # Nothing to commit -- skip this commit + + test ! -f "$GIT_DIR"/CHERRY_PICK_HEAD || + rm "$GIT_DIR"/CHERRY_PICK_HEAD || + die "$(gettext "Could not remove CHERRY_PICK_HEAD")" + else + if ! test -f "$author_script" + then + gpg_sign_opt_quoted=${gpg_sign_opt:+$(git rev-parse --sq-quote "$gpg_sign_opt")} + die "$(eval_gettext "\ You have staged changes in your working tree. If these changes are meant to be squashed into the previous commit, run: @@ -776,197 +775,197 @@ In both cases, once you're done, continue with: git rebase --continue ")" - fi - . "$author_script" || - die "$(gettext "Error trying to find the author identity to amend commit")" - if test -f "$amend" - then - current_head=$(git rev-parse --verify HEAD) - test "$current_head" = $(cat "$amend") || - die "$(gettext "\ + fi + . "$author_script" || + die "$(gettext "Error trying to find the author identity to amend commit")" + if test -f "$amend" + then + current_head=$(git rev-parse --verify HEAD) + test "$current_head" = $(cat "$amend") || + die "$(gettext "\ You have uncommitted changes in your working tree. Please commit them first and then run 'git rebase --continue' again.")" - do_with_author git commit --amend --no-verify -F "$msg" -e \ - ${gpg_sign_opt:+"$gpg_sign_opt"} $allow_empty_message || - die "$(gettext "Could not commit staged changes.")" - else - do_with_author git commit --no-verify -F "$msg" -e \ - ${gpg_sign_opt:+"$gpg_sign_opt"} $allow_empty_message || - die "$(gettext "Could not commit staged changes.")" + do_with_author git commit --amend --no-verify -F "$msg" -e \ + ${gpg_sign_opt:+"$gpg_sign_opt"} $allow_empty_message || + die "$(gettext "Could not commit staged changes.")" + else + do_with_author git commit --no-verify -F "$msg" -e \ + ${gpg_sign_opt:+"$gpg_sign_opt"} $allow_empty_message || + die "$(gettext "Could not commit staged changes.")" + fi fi - fi - if test -r "$state_dir"/stopped-sha - then - record_in_rewritten "$(cat "$state_dir"/stopped-sha)" - fi + if test -r "$state_dir"/stopped-sha + then + record_in_rewritten "$(cat "$state_dir"/stopped-sha)" + fi - require_clean_work_tree "rebase" - do_rest - return 0 - ;; -skip) - git rerere clear + require_clean_work_tree
[RFC PATCH v3 8/9] Remove unused code paths from git_rebase__interactive__preserve_merges
Since git_rebase__interactive__preserve_merges is now always called with $preserve_merges = t we can remove the unused code paths. Signed-off-by: Wink Saville--- git-rebase--interactive.sh | 152 - 1 file changed, 69 insertions(+), 83 deletions(-) diff --git a/git-rebase--interactive.sh b/git-rebase--interactive.sh index 346da0f67..ddbd126f2 100644 --- a/git-rebase--interactive.sh +++ b/git-rebase--interactive.sh @@ -982,100 +982,86 @@ git_rebase__interactive__preserve_merges () { setup_reflog_action init_basic_state - if test t = "$preserve_merges" + if test -z "$rebase_root" then - if test -z "$rebase_root" - then - mkdir "$rewritten" && - for c in $(git merge-base --all $orig_head $upstream) - do - echo $onto > "$rewritten"/$c || - die "$(gettext "Could not init rewritten commits")" - done - else - mkdir "$rewritten" && - echo $onto > "$rewritten"/root || + mkdir "$rewritten" && + for c in $(git merge-base --all $orig_head $upstream) + do + echo $onto > "$rewritten"/$c || die "$(gettext "Could not init rewritten commits")" - fi - # No cherry-pick because our first pass is to determine - # parents to rewrite and skipping dropped commits would - # prematurely end our probe - merges_option= + done else - merges_option="--no-merges --cherry-pick" + mkdir "$rewritten" && + echo $onto > "$rewritten"/root || + die "$(gettext "Could not init rewritten commits")" fi + # No cherry-pick because our first pass is to determine + # parents to rewrite and skipping dropped commits would + # prematurely end our probe + merges_option= + init_revisions_and_shortrevisions - if test t != "$preserve_merges" - then - git rebase--helper --make-script ${keep_empty:+--keep-empty} \ - $revisions ${restrict_revision+^$restrict_revision} >"$todo" || - die "$(gettext "Could not generate todo list")" - else - format=$(git config --get rebase.instructionFormat) - # the 'rev-list .. | sed' requires %m to parse; the instruction requires %H to parse - git rev-list $merges_option --format="%m%H ${format:-%s}" \ - --reverse --left-right --topo-order \ - $revisions ${restrict_revision+^$restrict_revision} | \ - sed -n "s/^>//p" | - while read -r sha1 rest - do + format=$(git config --get rebase.instructionFormat) + # the 'rev-list .. | sed' requires %m to parse; the instruction requires %H to parse + git rev-list $merges_option --format="%m%H ${format:-%s}" \ + --reverse --left-right --topo-order \ + $revisions ${restrict_revision+^$restrict_revision} | \ + sed -n "s/^>//p" | + while read -r sha1 rest + do - if test -z "$keep_empty" && is_empty_commit $sha1 && ! is_merge_commit $sha1 - then - comment_out="$comment_char " - else - comment_out= - fi + if test -z "$keep_empty" && is_empty_commit $sha1 && ! is_merge_commit $sha1 + then + comment_out="$comment_char " + else + comment_out= + fi - if test -z "$rebase_root" - then - preserve=t - for p in $(git rev-list --parents -1 $sha1 | cut -d' ' -s -f2-) - do - if test -f "$rewritten"/$p - then - preserve=f - fi - done - else - preserve=f - fi - if test f = "$preserve" - then - touch "$rewritten"/$sha1 - printf '%s\n' "${comment_out}pick $sha1 $rest" >>"$todo" - fi - done - fi + if test -z "$rebase_root" + then + preserve=t + for p
[RFC PATCH v3 7/9] Remove unused code paths from git_rebase__interactive
Since git_rebase__interactive is now never called with $preserve_merges = t we can remove those code paths. Signed-off-by: Wink Saville--- git-rebase--interactive.sh | 95 ++ 1 file changed, 4 insertions(+), 91 deletions(-) diff --git a/git-rebase--interactive.sh b/git-rebase--interactive.sh index ab5513d80..346da0f67 100644 --- a/git-rebase--interactive.sh +++ b/git-rebase--interactive.sh @@ -961,100 +961,13 @@ git_rebase__interactive () { setup_reflog_action init_basic_state - if test t = "$preserve_merges" - then - if test -z "$rebase_root" - then - mkdir "$rewritten" && - for c in $(git merge-base --all $orig_head $upstream) - do - echo $onto > "$rewritten"/$c || - die "$(gettext "Could not init rewritten commits")" - done - else - mkdir "$rewritten" && - echo $onto > "$rewritten"/root || - die "$(gettext "Could not init rewritten commits")" - fi - # No cherry-pick because our first pass is to determine - # parents to rewrite and skipping dropped commits would - # prematurely end our probe - merges_option= - else - merges_option="--no-merges --cherry-pick" - fi + merges_option="--no-merges --cherry-pick" init_revisions_and_shortrevisions - if test t != "$preserve_merges" - then - git rebase--helper --make-script ${keep_empty:+--keep-empty} \ - $revisions ${restrict_revision+^$restrict_revision} >"$todo" || - die "$(gettext "Could not generate todo list")" - else - format=$(git config --get rebase.instructionFormat) - # the 'rev-list .. | sed' requires %m to parse; the instruction requires %H to parse - git rev-list $merges_option --format="%m%H ${format:-%s}" \ - --reverse --left-right --topo-order \ - $revisions ${restrict_revision+^$restrict_revision} | \ - sed -n "s/^>//p" | - while read -r sha1 rest - do - - if test -z "$keep_empty" && is_empty_commit $sha1 && ! is_merge_commit $sha1 - then - comment_out="$comment_char " - else - comment_out= - fi - - if test -z "$rebase_root" - then - preserve=t - for p in $(git rev-list --parents -1 $sha1 | cut -d' ' -s -f2-) - do - if test -f "$rewritten"/$p - then - preserve=f - fi - done - else - preserve=f - fi - if test f = "$preserve" - then - touch "$rewritten"/$sha1 - printf '%s\n' "${comment_out}pick $sha1 $rest" >>"$todo" - fi - done - fi - - # Watch for commits that been dropped by --cherry-pick - if test t = "$preserve_merges" - then - mkdir "$dropped" - # Save all non-cherry-picked changes - git rev-list $revisions --left-right --cherry-pick | \ - sed -n "s/^>//p" > "$state_dir"/not-cherry-picks - # Now all commits and note which ones are missing in - # not-cherry-picks and hence being dropped - git rev-list $revisions | - while read rev - do - if test -f "$rewritten"/$rev && - ! sane_grep "$rev" "$state_dir"/not-cherry-picks >/dev/null - then - # Use -f2 because if rev-list is telling us this commit is - # not worthwhile, we don't want to track its multiple heads, - # just the history of its first-parent for others that will - # be rebasing on top of it - git rev-list --parents -1 $rev | cut -d' ' -s -f2 > "$dropped"/$rev - sha1=$(git rev-list -1 $rev) - sane_grep -v "^[a-z][a-z]* $sha1" <"$todo" > "${todo}2" ; mv "${todo}2" "$todo" -
[RFC PATCH v3 5/9] Use new functions in git_rebase__interactive
Use initiate_action, setup_reflog_action, init_basic_state, init_revisions_and_shortrevisions and complete_action. Signed-off-by: Wink Saville--- git-rebase--interactive.sh | 187 ++--- 1 file changed, 8 insertions(+), 179 deletions(-) diff --git a/git-rebase--interactive.sh b/git-rebase--interactive.sh index b72f80ae8..2c10a7f1a 100644 --- a/git-rebase--interactive.sh +++ b/git-rebase--interactive.sh @@ -952,120 +952,15 @@ EOF } git_rebase__interactive () { - case "$action" in - continue) - if test ! -d "$rewritten" - then - exec git rebase--helper ${force_rebase:+--no-ff} $allow_empty_message \ - --continue - fi - # do we have anything to commit? - if git diff-index --cached --quiet HEAD -- - then - # Nothing to commit -- skip this commit - - test ! -f "$GIT_DIR"/CHERRY_PICK_HEAD || - rm "$GIT_DIR"/CHERRY_PICK_HEAD || - die "$(gettext "Could not remove CHERRY_PICK_HEAD")" - else - if ! test -f "$author_script" - then - gpg_sign_opt_quoted=${gpg_sign_opt:+$(git rev-parse --sq-quote "$gpg_sign_opt")} - die "$(eval_gettext "\ -You have staged changes in your working tree. -If these changes are meant to be -squashed into the previous commit, run: - - git commit --amend \$gpg_sign_opt_quoted - -If they are meant to go into a new commit, run: - - git commit \$gpg_sign_opt_quoted - -In both cases, once you're done, continue with: - - git rebase --continue -")" - fi - . "$author_script" || - die "$(gettext "Error trying to find the author identity to amend commit")" - if test -f "$amend" - then - current_head=$(git rev-parse --verify HEAD) - test "$current_head" = $(cat "$amend") || - die "$(gettext "\ -You have uncommitted changes in your working tree. Please commit them -first and then run 'git rebase --continue' again.")" - do_with_author git commit --amend --no-verify -F "$msg" -e \ - ${gpg_sign_opt:+"$gpg_sign_opt"} $allow_empty_message || - die "$(gettext "Could not commit staged changes.")" - else - do_with_author git commit --no-verify -F "$msg" -e \ - ${gpg_sign_opt:+"$gpg_sign_opt"} $allow_empty_message || - die "$(gettext "Could not commit staged changes.")" - fi - fi - - if test -r "$state_dir"/stopped-sha - then - record_in_rewritten "$(cat "$state_dir"/stopped-sha)" - fi - - require_clean_work_tree "rebase" - do_rest - return 0 - ;; - skip) - git rerere clear - - if test ! -d "$rewritten" - then - exec git rebase--helper ${force_rebase:+--no-ff} $allow_empty_message \ - --continue - fi - do_rest + initiate_action "$action" + ret=$? + if test $ret = 0; then return 0 - ;; - edit-todo) - git stripspace --strip-comments <"$todo" >"$todo".new - mv -f "$todo".new "$todo" - collapse_todo_ids - append_todo_help - gettext " -You are editing the todo file of an ongoing interactive rebase. -To continue rebase after editing, run: -git rebase --continue - -" | git stripspace --comment-lines >>"$todo" - - git_sequence_editor "$todo" || - die "$(gettext "Could not execute editor")" - expand_todo_ids - - exit - ;; - show-current-patch) - exec git show REBASE_HEAD -- - ;; - esac - - comment_for_reflog start - - if test ! -z "$switch_to" - then - GIT_REFLOG_ACTION="$GIT_REFLOG_ACTION: checkout $switch_to" - output git checkout "$switch_to" -- || - die "$(eval_gettext "Could not checkout \$switch_to")" - - comment_for_reflog start fi - orig_head=$(git rev-parse --verify HEAD) || die "$(gettext "No HEAD?")" - mkdir -p "$state_dir" || die "$(eval_gettext "Could not create temporary \$state_dir")" - rm -f "$(git rev-parse
[RFC PATCH v3 9/9] Remove merges_option and a blank line
merges_option is unused in git_rebase__interactive and always empty in git_rebase__interactive__preserve_merges so it can be removed. Signed-off-by: Wink Saville--- git-rebase--interactive.sh | 10 +- 1 file changed, 1 insertion(+), 9 deletions(-) diff --git a/git-rebase--interactive.sh b/git-rebase--interactive.sh index ddbd126f2..50323fc27 100644 --- a/git-rebase--interactive.sh +++ b/git-rebase--interactive.sh @@ -961,8 +961,6 @@ git_rebase__interactive () { setup_reflog_action init_basic_state - merges_option="--no-merges --cherry-pick" - init_revisions_and_shortrevisions git rebase--helper --make-script ${keep_empty:+--keep-empty} \ @@ -996,22 +994,16 @@ git_rebase__interactive__preserve_merges () { die "$(gettext "Could not init rewritten commits")" fi - # No cherry-pick because our first pass is to determine - # parents to rewrite and skipping dropped commits would - # prematurely end our probe - merges_option= - init_revisions_and_shortrevisions format=$(git config --get rebase.instructionFormat) # the 'rev-list .. | sed' requires %m to parse; the instruction requires %H to parse - git rev-list $merges_option --format="%m%H ${format:-%s}" \ + git rev-list --format="%m%H ${format:-%s}" \ --reverse --left-right --topo-order \ $revisions ${restrict_revision+^$restrict_revision} | \ sed -n "s/^>//p" | while read -r sha1 rest do - if test -z "$keep_empty" && is_empty_commit $sha1 && ! is_merge_commit $sha1 then comment_out="$comment_char " -- 2.16.2
[RFC PATCH v3 4/9] Extract functions out of git_rebase__interactive
The extracted functions are: - initiate_action - setup_reflog_action - init_basic_state - init_revisions_and_shortrevisions - complete_action Signed-off-by: Wink Saville--- git-rebase--interactive.sh | 211 + 1 file changed, 211 insertions(+) diff --git a/git-rebase--interactive.sh b/git-rebase--interactive.sh index a79330f45..b72f80ae8 100644 --- a/git-rebase--interactive.sh +++ b/git-rebase--interactive.sh @@ -740,6 +740,217 @@ get_missing_commit_check_level () { printf '%s' "$check_level" | tr 'A-Z' 'a-z' } +# Initiate an action. If the cannot be any +# further action it may exec a command +# or exit and not return. +# +# TODO: Consider a cleaner return model so it +# never exits and always return 0 if process +# is complete. +# +# Parameter 1 is the action to initiate. +# +# Returns 0 if the action was able to complete +# and if 1 if further processing is required. +initiate_action () { + case "$1" in + continue) + if test ! -d "$rewritten" + then + exec git rebase--helper ${force_rebase:+--no-ff} $allow_empty_message \ + --continue + fi + # do we have anything to commit? + if git diff-index --cached --quiet HEAD -- + then + # Nothing to commit -- skip this commit + + test ! -f "$GIT_DIR"/CHERRY_PICK_HEAD || + rm "$GIT_DIR"/CHERRY_PICK_HEAD || + die "$(gettext "Could not remove CHERRY_PICK_HEAD")" + else + if ! test -f "$author_script" + then + gpg_sign_opt_quoted=${gpg_sign_opt:+$(git rev-parse --sq-quote "$gpg_sign_opt")} + die "$(eval_gettext "\ +You have staged changes in your working tree. +If these changes are meant to be +squashed into the previous commit, run: + + git commit --amend \$gpg_sign_opt_quoted + +If they are meant to go into a new commit, run: + + git commit \$gpg_sign_opt_quoted + +In both cases, once you're done, continue with: + + git rebase --continue +")" + fi + . "$author_script" || + die "$(gettext "Error trying to find the author identity to amend commit")" + if test -f "$amend" + then + current_head=$(git rev-parse --verify HEAD) + test "$current_head" = $(cat "$amend") || + die "$(gettext "\ +You have uncommitted changes in your working tree. Please commit them +first and then run 'git rebase --continue' again.")" + do_with_author git commit --amend --no-verify -F "$msg" -e \ + ${gpg_sign_opt:+"$gpg_sign_opt"} $allow_empty_message || + die "$(gettext "Could not commit staged changes.")" + else + do_with_author git commit --no-verify -F "$msg" -e \ + ${gpg_sign_opt:+"$gpg_sign_opt"} $allow_empty_message || + die "$(gettext "Could not commit staged changes.")" + fi + fi + + if test -r "$state_dir"/stopped-sha + then + record_in_rewritten "$(cat "$state_dir"/stopped-sha)" + fi + + require_clean_work_tree "rebase" + do_rest + return 0 + ;; + skip) + git rerere clear + + if test ! -d "$rewritten" + then + exec git rebase--helper ${force_rebase:+--no-ff} $allow_empty_message \ + --continue + fi + do_rest + return 0 + ;; + edit-todo) + git stripspace --strip-comments <"$todo" >"$todo".new + mv -f "$todo".new "$todo" + collapse_todo_ids + append_todo_help + gettext " +You are editing the todo file of an ongoing interactive rebase. +To continue rebase after editing, run: +git rebase --continue + +" | git stripspace --comment-lines >>"$todo" + + git_sequence_editor "$todo" || + die "$(gettext "Could not execute editor")" + expand_todo_ids + + exit + ;; + show-current-patch) + exec git show REBASE_HEAD -- + ;; + *) + return 1 # continue + ;; + esac +} + +setup_reflog_action () { + comment_for_reflog start + + if test ! -z "$switch_to" + then +
[RFC PATCH v3 2/9] Call git_rebase__interactive from run_specific_rebase
Instead of indirectly invoking git_rebase__interactive this invokes it directly after sourcing. Signed-off-by: Wink Saville--- git-rebase--interactive.sh | 11 --- git-rebase.sh | 11 +-- 2 files changed, 9 insertions(+), 13 deletions(-) diff --git a/git-rebase--interactive.sh b/git-rebase--interactive.sh index 561e2660e..213d75f43 100644 --- a/git-rebase--interactive.sh +++ b/git-rebase--interactive.sh @@ -740,15 +740,6 @@ get_missing_commit_check_level () { printf '%s' "$check_level" | tr 'A-Z' 'a-z' } -# The whole contents of this file is run by dot-sourcing it from -# inside a shell function. It used to be that "return"s we see -# below were not inside any function, and expected to return -# to the function that dot-sourced us. -# -# However, older (9.x) versions of FreeBSD /bin/sh misbehave on such a -# construct and continue to run the statements that follow such a "return". -# As a work-around, we introduce an extra layer of a function -# here, and immediately call it after defining it. git_rebase__interactive () { case "$action" in @@ -1029,5 +1020,3 @@ fi do_rest } -# ... and then we call the whole thing. -git_rebase__interactive diff --git a/git-rebase.sh b/git-rebase.sh index a1f6e5de6..c4ec7c21b 100755 --- a/git-rebase.sh +++ b/git-rebase.sh @@ -196,8 +196,15 @@ run_specific_rebase () { export GIT_EDITOR autosquash= fi - . git-rebase--$type - ret=$? + if test "$type" = interactive + then + . git-rebase--interactive + git_rebase__interactive + ret=$? + else + . git-rebase--$type + ret=$? + fi if test $ret -eq 0 then finish_rebase -- 2.16.2
[RFC PATCH v3 6/9] Add and use git_rebase__interactive__preserve_merges
At the moment it's an exact copy of git_rebase__interactive except the name has changed. Signed-off-by: Wink Saville--- git-rebase--interactive.sh | 108 + git-rebase.sh | 7 ++- 2 files changed, 114 insertions(+), 1 deletion(-) diff --git a/git-rebase--interactive.sh b/git-rebase--interactive.sh index 2c10a7f1a..ab5513d80 100644 --- a/git-rebase--interactive.sh +++ b/git-rebase--interactive.sh @@ -1058,3 +1058,111 @@ git_rebase__interactive () { complete_action } + +git_rebase__interactive__preserve_merges () { + initiate_action "$action" + ret=$? + if test $ret = 0; then + return 0 + fi + + setup_reflog_action + init_basic_state + + if test t = "$preserve_merges" + then + if test -z "$rebase_root" + then + mkdir "$rewritten" && + for c in $(git merge-base --all $orig_head $upstream) + do + echo $onto > "$rewritten"/$c || + die "$(gettext "Could not init rewritten commits")" + done + else + mkdir "$rewritten" && + echo $onto > "$rewritten"/root || + die "$(gettext "Could not init rewritten commits")" + fi + # No cherry-pick because our first pass is to determine + # parents to rewrite and skipping dropped commits would + # prematurely end our probe + merges_option= + else + merges_option="--no-merges --cherry-pick" + fi + + init_revisions_and_shortrevisions + + if test t != "$preserve_merges" + then + git rebase--helper --make-script ${keep_empty:+--keep-empty} \ + $revisions ${restrict_revision+^$restrict_revision} >"$todo" || + die "$(gettext "Could not generate todo list")" + else + format=$(git config --get rebase.instructionFormat) + # the 'rev-list .. | sed' requires %m to parse; the instruction requires %H to parse + git rev-list $merges_option --format="%m%H ${format:-%s}" \ + --reverse --left-right --topo-order \ + $revisions ${restrict_revision+^$restrict_revision} | \ + sed -n "s/^>//p" | + while read -r sha1 rest + do + + if test -z "$keep_empty" && is_empty_commit $sha1 && ! is_merge_commit $sha1 + then + comment_out="$comment_char " + else + comment_out= + fi + + if test -z "$rebase_root" + then + preserve=t + for p in $(git rev-list --parents -1 $sha1 | cut -d' ' -s -f2-) + do + if test -f "$rewritten"/$p + then + preserve=f + fi + done + else + preserve=f + fi + if test f = "$preserve" + then + touch "$rewritten"/$sha1 + printf '%s\n' "${comment_out}pick $sha1 $rest" >>"$todo" + fi + done + fi + + # Watch for commits that been dropped by --cherry-pick + if test t = "$preserve_merges" + then + mkdir "$dropped" + # Save all non-cherry-picked changes + git rev-list $revisions --left-right --cherry-pick | \ + sed -n "s/^>//p" > "$state_dir"/not-cherry-picks + # Now all commits and note which ones are missing in + # not-cherry-picks and hence being dropped + git rev-list $revisions | + while read rev + do + if test -f "$rewritten"/$rev && + ! sane_grep "$rev" "$state_dir"/not-cherry-picks >/dev/null + then + # Use -f2 because if rev-list is telling us this commit is + # not worthwhile, we don't want to track its multiple heads, + # just the history of its first-parent for others that will + # be rebasing on top of it + git rev-list --parents -1 $rev | cut -d' ' -s -f2 > "$dropped"/$rev +
[RFC PATCH v3 0/9] rebase-interactive:
I've incorporated review feed back to date. I'm split the change into 9 commits with each commit do a single class of operation. I've prepared these commits using github and have Travis-CI setup to test my changes. Of the 9 commits 2 failed, the 1st and 5th commits, I tested those two locally and they were fine. I then restarted the builds on Travis-CI they finished fine so it seems the errors were spurious. Wink Saville (9): Simplify pick_on_preserving_merges Call git_rebase__interactive from run_specific_rebase Indent function git_rebase__interactive Extract functions out of git_rebase__interactive Use new functions in git_rebase__interactive Add and use git_rebase__interactive__preserve_merges Remove unused code paths from git_rebase__interactive Remove unused code paths from git_rebase__interactive__preserve_merges Remove merges_option and a blank line git-rebase--interactive.sh | 407 - git-rebase.sh | 16 +- 2 files changed, 229 insertions(+), 194 deletions(-) -- 2.16.2
[RFC PATCH v3 1/9] Simplify pick_on_preserving_merges
Signed-off-by: Wink Saville--- git-rebase--interactive.sh | 17 +++-- 1 file changed, 7 insertions(+), 10 deletions(-) diff --git a/git-rebase--interactive.sh b/git-rebase--interactive.sh index 331c8dfea..561e2660e 100644 --- a/git-rebase--interactive.sh +++ b/git-rebase--interactive.sh @@ -307,17 +307,14 @@ pick_one_preserving_merges () { esac sha1=$(git rev-parse $sha1) - if test -f "$state_dir"/current-commit + if test -f "$state_dir"/current-commit && test "$fast_forward" = t then - if test "$fast_forward" = t - then - while read current_commit - do - git rev-parse HEAD > "$rewritten"/$current_commit - done <"$state_dir"/current-commit - rm "$state_dir"/current-commit || - die "$(gettext "Cannot write current commit's replacement sha1")" - fi + while read current_commit + do + git rev-parse HEAD > "$rewritten"/$current_commit + done <"$state_dir"/current-commit + rm "$state_dir"/current-commit || + die "$(gettext "Cannot write current commit's replacement sha1")" fi echo $sha1 >> "$state_dir"/current-commit -- 2.16.2
RE: Bug With git rebase -p
I meant to say that I installed 2.17.0-rc0, and it worked perfectly. Sorry for the ambiguity. -Original Message- From: Junio C Hamano [mailto:jch2...@gmail.com] On Behalf Of Junio C Hamano Sent: Thursday, March 22, 2018 12:39 PM To: Johannes SchindelinCc: Joseph Strauss ; git@vger.kernel.org Subject: Re: Bug With git rebase -p Johannes Schindelin writes: > On Tue, 20 Mar 2018, Joseph Strauss wrote: > >> Perfect. Thank you. > > You are welcome. > > I am puzzled, though... does your message mean that you tested the Git > for Windows v2.17.0-rc0 installer and it did fix your problem? Or do > you simply assume that it does fix your problem because Junio & I > expect it to fix your problem? Thanks for asking, as I was curious about the same thing after interpreting what Joseph said as "oh, perfect that there is a packaged thing I can readily test" (implying "I'll get back to you after seeing if it helps").
Re: [PATCH 0/2] rebase --recreate-merges --keep-empty: don't prune empty
Phillip Woodwrites: > On 20/03/18 19:32, Junio C Hamano wrote: > >> With or without the above plan, what we saw from you were a bit >> messy to queue. The --keep-empty fix series is based on 'maint', >> while the --signoff series depends on changes that happened to >> sequencer between 'maint' and 'master', but yet depends on the >> former. > > Yes, that is awkward and unfortunate but the idea behind splitting them > into two separate series was to have a single set of bug fixes in the > history. The feature needed to be based on master, so if I'd had the bug > fixes in the same series you'd of had to cherry-pick them to maint which > would break branch/tag --contains. I'm not sure if that is a better option. I said "a bit messy" but that was a statement of a fact, not a complaint. Sometimes, we cannot avoid that necessary solutions to real-life problems must be messy. I still think what you sent was the best organization, given the constraints ;-). Thanks.
The most efficient way to test if repositories share the same objects
Hi, all: What is the most efficient way to test if repoA and repoB share common commits? My goal is to automatically figure out if repoB can benefit from setting alternates to repoA and repacking. I currently do it by comparing the output of "show-ref --tags -s", but that does not work for repos without tags. Best, -- Konstantin Ryabitsev signature.asc Description: OpenPGP digital signature
UriFormatException output when interacting with a Git server
I'm getting this error on a fresh install of git version 2.16.2.windows.1: fatal: UriFormatException encountered. queryUrl See this post I found, https://stackoverflow.com/questions/48775400/git-fatal-uriformatexception-encountered-actualurl for more details. Note that the latest comments are mine, and describe the conditions on which I first saw the problem. I'll warn you, it's weird. Thank you, John Chesshir P.S. I also found this older post, which appears related, but has clearly been fixed: https://superuser.com/questions/1114193/when-cloning-on-with-git-bash-on-windows-getting-fatal-uriformatexception-enco. Include just in case that helps get closer to the latest problem.
Re: Bug With git rebase -p
Johannes Schindelinwrites: > On Tue, 20 Mar 2018, Joseph Strauss wrote: > >> Perfect. Thank you. > > You are welcome. > > I am puzzled, though... does your message mean that you tested the Git for > Windows v2.17.0-rc0 installer and it did fix your problem? Or do you > simply assume that it does fix your problem because Junio & I expect it to > fix your problem? Thanks for asking, as I was curious about the same thing after interpreting what Joseph said as "oh, perfect that there is a packaged thing I can readily test" (implying "I'll get back to you after seeing if it helps").
Re: [PATCH 1/1] Fix typo in merge-strategies documentation
Johannes Schindelinwrites: > On Mon, 19 Mar 2018, Junio C Hamano wrote: > >> David Pursehouse writes: >> >> > From: David Pursehouse >> > >> > Signed-off-by: David Pursehouse >> > --- >> >> I somehow had to stare at the patch for a few minutes, view it in >> two Emacs buffers and run M-x compare-windows before I finally spot >> the single-byte typofix. > > Pro-tip: git am && git show --color-words Yeah, I know, but that would not work while I am wondering if the patch is worth applying in the first place ;-)
Re: [PATCH] filter-branch: consider refs can refer to an object other than commit or tag
> Yuki Kokubunwrites: > > >> Yuki Kokubun writes: > >> > >> > "git filter-branch -- --all" can be confused when refs that refer to > >> > objects > >> > other than commits or tags exists. > >> > Because "git rev-parse --all" that is internally used can return refs > >> > that > >> > refer to an object other than commit or tag. But it is not considered in > >> > the > >> > phase of updating refs. > >> > >> Could you describe what the consequence of that is? We have a ref > >> that points directly at a blob object, or a ref that points at a tag > >> object that points at a blob object. The current code leaves both of > >> these refs in "$tempdir/heads". Then...? > > > > Sorry, this is my wrong. > > I wrongly thought only refs/replace can point at a blob or tree object. > > No need to be sorry. You still need to describe what (bad things) > happen if we do not filter out refs that do not point at committish > in the proposed log message. > > IOW, can you elaborate and clarify your "can be confused" at the > beginning? I meant the confusion is abnormal messages from the output of "git filter-branch -- --all". For example, this is an output of "git filter-branch -- --all": Rewrite bcdbd016c77df3d5641a3cf820b2ed46ba7bf3b4 (5/5) (0 seconds passed, remaining 0 predicted) WARNING: Ref 'refs/heads/master' is unchanged WARNING: Ref 'refs/heads/no-newline' is unchanged WARNING: Ref 'refs/heads/original' is unchanged error: object 1bf53b49c26465454e4ac377f2ed3f91bb1d6ac1 is a tree, not a commit error: object 1bf53b49c26465454e4ac377f2ed3f91bb1d6ac1 is a tree, not a commit fatal: ambiguous argument 'refs/replace/8a2016f3730cad8309c110f819c855403ed0a5b9^0': unknown revision or path not in the working tree. Use '--' to separate paths from revisions, like this: 'git [...] -- [...]' WARNING: Ref 'refs/replace/8a2016f3730cad8309c110f819c855403ed0a5b9' is unchanged WARNING: Ref 'refs/tags/add-file' is unchanged WARNING: Ref 'refs/tags/file' is unchanged error: object 1bf53b49c26465454e4ac377f2ed3f91bb1d6ac1 is a tree, not a commit error: object 1bf53b49c26465454e4ac377f2ed3f91bb1d6ac1 is a tree, not a commit fatal: ambiguous argument 'refs/tags/treetag^0': unknown revision or path not in the working tree. Use '--' to separate paths from revisions, like this: 'git [...] -- [...]' WARNING: Ref 'refs/tags/treetag' is unchanged You can see a lot of terrible messages such as "error" and "fatal". But on the whole, the result of "git filter-branch -- --all" is not so abnormal. So, this is a just problem about abonormal messages. I think this messages should be suppressed. How do you feel about it?
[PATCH] completion: clear cached --options when sourcing the completion script
The established way to update the completion script in an already running shell is to simply source it again: this brings in any new --options and features, and clears caching variables. E.g. it clears the variables caching the list of (all|porcelain) git commands, so when they are later lazy-initialized again, then they will list and cache any newly installed commmands as well. Unfortunately, since d401f3debc (git-completion.bash: introduce __gitcomp_builtin, 2018-02-09) and subsequent patches this doesn't work for a lot of git commands' options. To eliminate a lot of hard-to-maintain hard-coded lists of options, those commits changed the completion script to use a bunch of programmatically created and lazy-initialized variables to cache the options of those builtin porcelain commands that use parse-options. These variables are not cleared upon sourcing the completion script, therefore they continue caching the old lists of options, even when some commands recently learned new options or when deprecated options were removed. Always 'unset' these variables caching the options of builtin commands when sourcing the completion script. Redirect 'unset's stderr to /dev/null, because ZSH's 'unset' complains if it's invoked without any arguments, i.e. no variables caching builtin's options are set. This can happen, if someone were to source the completion script twice without completing any --options in between. Bash stays silent in this case. Add tests to ensure that these variables are indeed cleared when the completion script is sourced; not just the variables caching options, but all other caching variables, i.e. the variables caching commands, porcelain commands and merge strategies as well. Signed-off-by: SZEDER Gábor--- Fixes a recent regression introduced in 'nd/parseopt-completion'. contrib/completion/git-completion.bash | 4 t/t9902-completion.sh | 31 ++ 2 files changed, 35 insertions(+) diff --git a/contrib/completion/git-completion.bash b/contrib/completion/git-completion.bash index 7c84eb1912..602352f952 100644 --- a/contrib/completion/git-completion.bash +++ b/contrib/completion/git-completion.bash @@ -280,6 +280,10 @@ __gitcomp () esac } +# Clear the variables caching builtins' options when (re-)sourcing +# the completion script. +unset $(set |sed -ne 's/^\(__gitcomp_builtin_[a-zA-Z0-9_][a-zA-Z0-9_]*\)=.*/\1/p') 2>/dev/null + # This function is equivalent to # #__gitcomp "$(git xxx --git-completion-helper) ..." diff --git a/t/t9902-completion.sh b/t/t9902-completion.sh index e6485feb0a..4c86adadf2 100755 --- a/t/t9902-completion.sh +++ b/t/t9902-completion.sh @@ -1497,4 +1497,35 @@ do ' done +test_expect_success 'sourcing the completion script clears cached commands' ' + __git_compute_all_commands && + verbose test -n "$__git_all_commands" && + . "$GIT_BUILD_DIR/contrib/completion/git-completion.bash" && + verbose test -z "$__git_all_commands" +' + +test_expect_success 'sourcing the completion script clears cached porcelain commands' ' + __git_compute_porcelain_commands && + verbose test -n "$__git_porcelain_commands" && + . "$GIT_BUILD_DIR/contrib/completion/git-completion.bash" && + verbose test -z "$__git_porcelain_commands" +' + +test_expect_success 'sourcing the completion script clears cached merge strategies' ' + __git_compute_merge_strategies && + verbose test -n "$__git_merge_strategies" && + . "$GIT_BUILD_DIR/contrib/completion/git-completion.bash" && + verbose test -z "$__git_merge_strategies" +' + +test_expect_success 'sourcing the completion script clears cached --options' ' + __gitcomp_builtin checkout && + verbose test -n "$__gitcomp_builtin_checkout" && + __gitcomp_builtin notes_edit && + verbose test -n "$__gitcomp_builtin_notes_edit" && + . "$GIT_BUILD_DIR/contrib/completion/git-completion.bash" && + verbose test -z "$__gitcomp_builtin_checkout" && + verbose test -z "$__gitcomp_builtin_notes_edit" +' + test_done -- 2.17.0.rc0.103.gbdc5836ed3
Re: [PATCH v2] filter-branch: use printf instead of echo -e
Il 21/03/2018 22:50, Johannes Schindelin ha scritto: Hi Michele, On Mon, 19 Mar 2018, Michele Locati wrote: [...] -- 2.16.2.windows.1 Yay! Out of curiosity: did the CONTRIBUTING.md file help that was recently turned into a guide how to contribute to Git (for Windows) by Derrick Stolee? Well, no. Here's what led me here... The people behind the concrete5 CMS asked support for improving the following procedure: concrete5 is stored in https://github.com/concrete5/concrete5 In order to make it installable with Composer (a PHP package manager), we need to extract its "/concrete" directory, and push the results to https://github.com/concrete5/concrete5-core We previously used "git filter-branch --all" with this script: https://github.com/concrete5/core_splitter/blob/70879e676b95160f7fc5d0ffc22b8f7420b0580b/bin/splitcore That script is executed every time someone pushes concrete5/concrete5, and it took very long time (3~5 minutes). I noticed on the git-filter-branch manual page that there's a "--state-branch", and it seemed quite interesting. So, I asked in git@vger.kernel.org how to use it, and the author of that feature (Ian Campbell) told me to look at some code he wrote. So, I wrote https://github.com/concrete5/incremental-filter-branch which wraps "git filter-branch --filter-state", and it's used in https://github.com/concrete5/core_splitter/blob/99bbbcea1ab90572a04864ccb92327532ab6a42c/bin/splitcore (it not yet in production: Korvin wants to be sure everything works well before adopting it) The time required for the operation dropped from 3~5 minutes to ~10 seconds ;) While writing that script, I noticed a couple of things that could be improved in "git-filter-branch", so I submitted https://marc.info/?l=git=152111905428554=2 (so that the script could run without problems if there's nothing to do) and https://marc.info/?l=git=152147095416175=2 (so that --filter-state could be used on Mac and other non-Linux systems). How did I learn how to submit to git? I searched for "git src", landed on https://github.com/git/git, read Documentation/SubmittingPatches and used git send-email. And yes, all that on Windows (and WSL). Ciao, Johannes Ciao! -- Michele
Re: [PATCH v6 00/11] nd/pack-objects-pack-struct updates
On Thu, Mar 22, 2018 at 11:57:27AM +0100, Duy Nguyen wrote: > On Thu, Mar 22, 2018 at 10:32 AM, Jeff Kingwrote: > > That would still mean you could get into a broken state for serving > > fetches, but you could at least get out of it by running "git repack". > > I was puzzled by this "broken state" statement. But you were right! I > focused on the repack case and forgot about fetch/clone case. I will > probably just drop this patch for now. Then maybe revisit this some > time in fiture when I find out how to deal with this nicely. Here's a sketch of the "separate array" concept I mentioned before, in case that helps. Not tested at all beyond compiling. --- diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c index 4406af640f..e4e308b453 100644 --- a/builtin/pack-objects.c +++ b/builtin/pack-objects.c @@ -1090,7 +1090,7 @@ static void create_object_entry(const struct object_id *oid, else nr_result++; if (found_pack) { - oe_set_in_pack(entry, found_pack); + oe_set_in_pack(_pack, entry, found_pack); entry->in_pack_offset = found_offset; } diff --git a/pack-objects.h b/pack-objects.h index 9f8e450e19..b94b9232fa 100644 --- a/pack-objects.h +++ b/pack-objects.h @@ -7,6 +7,8 @@ #define OE_Z_DELTA_BITS16 #define OE_DELTA_SIZE_BITS 31 +#define OE_IN_PACK_EXTENDED ((1 << OE_IN_PACK_BITS) - 1) + /* * State flags for depth-first search used for analyzing delta cycles. * @@ -111,8 +113,13 @@ struct packing_data { uint32_t index_size; unsigned int *in_pack_pos; - int in_pack_count; - struct packed_git *in_pack[1 << OE_IN_PACK_BITS]; + + struct packed_git **in_pack; + uint32_t in_pack_count; + size_t in_pack_alloc; + + uint32_t *in_pack_extended; + size_t in_pack_extended_alloc; }; struct object_entry *packlist_alloc(struct packing_data *pdata, @@ -174,17 +181,13 @@ static inline void oe_set_in_pack_pos(const struct packing_data *pack, static inline unsigned int oe_add_pack(struct packing_data *pack, struct packed_git *p) { - if (pack->in_pack_count >= (1 << OE_IN_PACK_BITS)) - die(_("too many packs to handle in one go. " - "Please add .keep files to exclude\n" - "some pack files and keep the number " - "of non-kept files below %d."), - 1 << OE_IN_PACK_BITS); if (p) { if (p->index > 0) die("BUG: this packed is already indexed"); p->index = pack->in_pack_count; } + ALLOC_GROW(pack->in_pack, pack->in_pack_count + 1, + pack->in_pack_alloc); pack->in_pack[pack->in_pack_count] = p; return pack->in_pack_count++; } @@ -192,18 +195,28 @@ static inline unsigned int oe_add_pack(struct packing_data *pack, static inline struct packed_git *oe_in_pack(const struct packing_data *pack, const struct object_entry *e) { - return pack->in_pack[e->in_pack_idx]; - + uint32_t idx = e->in_pack_idx; + if (idx == OE_IN_PACK_EXTENDED) + idx = pack->in_pack_extended[e - pack->objects]; + return pack->in_pack[idx]; } -static inline void oe_set_in_pack(struct object_entry *e, +static inline void oe_set_in_pack(struct packing_data *pack, + struct object_entry *e, struct packed_git *p) { if (p->index <= 0) die("BUG: found_pack should be NULL " "instead of having non-positive index"); - e->in_pack_idx = p->index; - + else if (p->index < OE_IN_PACK_EXTENDED) + e->in_pack_idx = p->index; + else { + size_t index = e - pack->objects; + ALLOC_GROW(pack->in_pack_extended, + index, pack->in_pack_extended_alloc); + pack->in_pack_extended[index] = p->index; + e->in_pack_idx = OE_IN_PACK_EXTENDED; + } } static inline struct object_entry *oe_delta(
Re: [PATCH 7/7] diff-highlight: detect --graph by indent
On Thu, Mar 22, 2018 at 10:59:31AM +, Phillip Wood wrote: > > +sub handle_line { > > + my $orig = shift; > > + local $_ = $orig; > > + > > + # match a graph line that begins a commit > > + if (/^(?:$COLOR?\|$COLOR?[ ])* # zero or more leading "|" with space > > +$COLOR?\*$COLOR?[ ] # a "*" with its trailing space > > + (?:$COLOR?\|$COLOR?[ ])* # zero or more trailing "|" > > +[ ]* # trailing whitespace for merges > > Hi Peff, thanks for looking at this. I've only had a quick look through > but I wonder if this will be confused by commit messages that contain > * bullet points > * like this I think we should be OK because the commit messages are indented by 4 spaces, and we allow only single spaces amidst the graph-drawing bits (and we respect the "*" only in those graph-drawing bits). So I think you could fool it with something like: git log --graph --format='* oops!' or maybe even: git log --graph --format='%B' but not with a regular git-log format. Those final corner cases I don't think we can overcome; it's just syntactically ambiguous. Unless perhaps we traced the graph lines from the start of the output and knew how many to expect, but that seems rather complicated. Ultimately the best solution would be to avoid this parsing nonsense entirely by teaching Git's internal diff to produce the highlighting internally. -Peff PS I also eyeballed the results of "git log --graph -p --no-merges -1000" piped through the old and new versions. There are several changes, but they all look like improvements.
Re: [PATCH 0/2] rebase --recreate-merges --keep-empty: don't prune empty
On 20/03/18 19:32, Junio C Hamano wrote: > Phillip Woodwrites: > >> On 20/03/18 15:42, Johannes Schindelin wrote: >> ... >>> As indicated in another reply, I'd rather rebase the --recreate-merges >>> patches on top of your --keep-empty patch series. This obviously means >>> that I would fold essentially all of your 2/2 changes into my >>> "rebase-helper --make-script: introduce a flag to recreate merges" >>> >>> The 1/2 (with s/failure/success/g) would then be added to the >>> --recreate-merges patch series at the end. >>> >>> Would that be okay with you? >> >> Yes, that's fine, it would give a clearer history > > With or without the above plan, what we saw from you were a bit > messy to queue. The --keep-empty fix series is based on 'maint', > while the --signoff series depends on changes that happened to > sequencer between 'maint' and 'master', but yet depends on the > former. Yes, that is awkward and unfortunate but the idea behind splitting them into two separate series was to have a single set of bug fixes in the history. The feature needed to be based on master, so if I'd had the bug fixes in the same series you'd of had to cherry-pick them to maint which would break branch/tag --contains. I'm not sure if that is a better option. Best Wishes Phillip > In what I'll be pushing out at the end of today's integration run, > I'll have two topics organized this way: > > - pw/rebase-keep-empty-fixes: built by applying the three >'--keep-empty' patches on top of 'maint'. > > - pw/rebase-signoff: built by first merging the above to 0f57f731 >("Merge branch 'pw/sequencer-in-process-commit'", 2018-02-13) and >then applying "rebase --signoff" series. > > Also, I'll revert merge of Dscho's recreate-merges topic to 'next'; > doing so would probably have to invalidate a few evil merges I've > been carrying to resolve conflicts between it and bc/object-id > topic, so today's integration cycle may take a bit longer than > usual. >
Re: [PATCH 7/7] diff-highlight: detect --graph by indent
On 21/03/18 05:59, Jeff King wrote: > This patch fixes a corner case where diff-highlight may > scramble some diffs when combined with --graph. > > Commit 7e4ffb4c17 (diff-highlight: add support for --graph > output, 2016-08-29) taught diff-highlight to skip past the > graph characters at the start of each line with this regex: > > ($COLOR?\|$COLOR?\s+)* > > I.e., any series of pipes separated by and followed by > arbitrary whitespace. We need to match more than just a > single space because the commit in question may be indented > to accommodate other parts of the graph drawing. E.g.: > > * commit 1234abcd > | ... > | diff --git ... > > has only a single space, but for the last commit before a > fork: > > | | | > | * | commit 1234abcd > | |/ ... > | | diff --git > > the diff lines have more spaces between the pipes and the > start of the diff. > > However, when we soak up all of those spaces with the > $GRAPH regex, we may accidentally include the leading space > for a context line. That means we may consider the actual > contents of a context line as part of the diff syntax. In > other words, something like this: > >normal context line > -old line > +new line >-this is a context line with a leading dash > > would cause us to see that final context line as a removal > line, and we'd end up showing the hunk in the wrong order: > > normal context line > -old line >-this is a context line with a leading dash > +new line > > Instead, let's a be a little more clever about parsing the > graph. We'll look for the actual "*" line that marks the > start of a commit, and record the indentation we see there. > Then we can skip past that indentation when checking whether > the line is a hunk header, removal, addition, etc. > > There is one tricky thing: the indentation in bytes may be > different for various lines of the graph due to coloring. > E.g., the "*" on a commit line is generally shown without > color, but on the actual diff lines, it will be replaced > with a colorized "|" character, adding several bytes. We > work around this here by counting "visible" bytes. This is > unfortunately a bit more expensive, making us about twice as > slow to handle --graph output. But since this is meant to be > used interactively anyway, it's tolerably fast (and the > non-graph case is unaffected). > > One alternative would be to search for hunk header lines and > use their indentation (since they'd have the same colors as > the diff lines which follow). But that just opens up > different corner cases. If we see: > > | |@@ 1,2 1,3 @@ > > we cannot know if this is a real diff that has been > indented due to the graph, or if it's a context line that > happens to look like a diff header. We can only be sure of > the indent on the "*" lines, since we know those don't > contain arbitrary data (technically the user could include a > bunch of extra indentation via --format, but that's rare > enough to disregard). > > Reported-by: Phillip Wood> Signed-off-by: Jeff King > --- > contrib/diff-highlight/DiffHighlight.pm | 78 +++ > .../diff-highlight/t/t9400-diff-highlight.sh | 28 +++ > 2 files changed, 91 insertions(+), 15 deletions(-) > > diff --git a/contrib/diff-highlight/DiffHighlight.pm > b/contrib/diff-highlight/DiffHighlight.pm > index e07cd5931d..536754583b 100644 > --- a/contrib/diff-highlight/DiffHighlight.pm > +++ b/contrib/diff-highlight/DiffHighlight.pm > @@ -21,34 +21,82 @@ my $RESET = "\x1b[m"; > my $COLOR = qr/\x1b\[[0-9;]*m/; > my $BORING = qr/$COLOR|\s/; > > -# The patch portion of git log -p --graph should only ever have preceding | > and > -# not / or \ as merge history only shows up on the commit line. > -my $GRAPH = qr/$COLOR?\|$COLOR?\s+/; > - > my @removed; > my @added; > my $in_hunk; > +my $graph_indent = 0; > > our $line_cb = sub { print @_ }; > our $flush_cb = sub { local $| = 1 }; > > -sub handle_line { > +# Count the visible width of a string, excluding any terminal color > sequences. > +sub visible_width { > local $_ = shift; > + my $ret = 0; > + while (length) { > + if (s/^$COLOR//) { > + # skip colors > + } elsif (s/^.//) { > + $ret++; > + } > + } > + return $ret; > +} > + > +# Return a substring of $str, omitting $len visible characters from the > +# beginning, where terminal color sequences do not count as visible. > +sub visible_substr { > + my ($str, $len) = @_; > + while ($len > 0) { > + if ($str =~ s/^$COLOR//) { > + next > + } > + $str =~ s/^.//; > + $len--; > + } > + return $str; > +} > + > +sub handle_line { > + my $orig = shift; > + local $_ = $orig; > + > + # match a graph line that begins a commit > + if (/^(?:$COLOR?\|$COLOR?[ ])* # zero or more
Re: [PATCH v6 00/11] nd/pack-objects-pack-struct updates
On Thu, Mar 22, 2018 at 10:32 AM, Jeff Kingwrote: > That would still mean you could get into a broken state for serving > fetches, but you could at least get out of it by running "git repack". I was puzzled by this "broken state" statement. But you were right! I focused on the repack case and forgot about fetch/clone case. I will probably just drop this patch for now. Then maybe revisit this some time in fiture when I find out how to deal with this nicely. -- Duy
Re: [PATCH v6 00/11] nd/pack-objects-pack-struct updates
On Thu, Mar 22, 2018 at 09:23:42AM +0100, Duy Nguyen wrote: > > I wish you were right about the rarity, but it's unfortunately something > > I have seen multiple times in the wild (and why I spent time optimizing > > the many-packs case for pack-objects). Unfortunately I don't know how > > often it actually comes up, because in theory running "git repack" > > cleans it up without further ado. But after these patches, not so much. > > It's good to know this case is real and I can start to fix it > (assuming that the other concern about readability will not stop this > series). > > I think I'll try to fix this without involving repack. pack-objects > can produce multiple packs, so if we have more than 16k pack files, we > produce one new pack per 16k old ones. I suspect that's going to be hard given the structure of the code. Could we perhaps just bump to an auxiliary storage in that case? I.e., allocate the final index number to mean "look in this other table", and then have another table of uint32 indices? That would mean we can behave as we did previously, but just use a little more memory in the uncommon >16k case. -Peff
Re: [PATCH v6 00/11] nd/pack-objects-pack-struct updates
On Thu, Mar 22, 2018 at 05:32:12AM -0400, Jeff King wrote: > So 27% of the total heap goes away if you switch to a separate rev-list. > Though it's mostly just going to a different process, it does help peak > because that process would have exited by the time we get to the > revindex bits. > > I suspect you could get the same effect by just teaching pack-objects to > clear obj_hash and all of the allocated object structs. I think that > should be safe to do as long as we clear _all_ of the objects, so there > are no dangling pointers. The patch below tries that. It's kind of hacky, but it drops my peak heap for packing linux.git from 1336MB to 1129MB. That's not quite as exciting as 27%, because it just moves our peak earlier, to when we do have all of the object structs in memory (so the savings are really just that we're not holding the revindex, etc at the same time as the object structs). But we also hold that peak for a lot shorter period, because we drop the memory before we do any delta compression (which itself can be memory hungry[1]), and don't hold onto it during the write phase (which can be network-limited when serving a fetch). So during that write phase we're holding only ~900MB instead of ~1250MB. -Peff [1] All of my timings are on noop repacks of a single pack, so there's no actual delta compression. On average, it will use something like "nr_threads * window * avg_blob_size". For a "normal" repo, that's only a few megabytes. But the peak will depend on the large blobs, so it could have some outsize cases. I don't know if it's worth worrying about too much for this analysis. --- Here's the patch. It's probably asking for trouble to have this kind of clearing interface, as a surprising number of things may hold onto pointers to objects (see the comment below about the bitmap code). So maybe the separate process is less insane. diff --git a/alloc.c b/alloc.c index 12afadfacd..50d444a3b0 100644 --- a/alloc.c +++ b/alloc.c @@ -30,15 +30,23 @@ struct alloc_state { int count; /* total number of nodes allocated */ int nr;/* number of nodes left in current allocation */ void *p; /* first free node in current allocation */ + + /* book-keeping for clearing */ + void *start; + struct alloc_state *prev; }; -static inline void *alloc_node(struct alloc_state *s, size_t node_size) +static inline void *alloc_node(struct alloc_state **sp, size_t node_size) { + struct alloc_state *s = *sp; void *ret; - if (!s->nr) { + if (!s || !s->nr) { + s = xmalloc(sizeof(*s)); s->nr = BLOCKING; - s->p = xmalloc(BLOCKING * node_size); + s->start = s->p = xmalloc(BLOCKING * node_size); + s->prev = *sp; + *sp = s; } s->nr--; s->count++; @@ -48,7 +56,7 @@ static inline void *alloc_node(struct alloc_state *s, size_t node_size) return ret; } -static struct alloc_state blob_state; +static struct alloc_state *blob_state; void *alloc_blob_node(void) { struct blob *b = alloc_node(_state, sizeof(struct blob)); @@ -56,7 +64,7 @@ void *alloc_blob_node(void) return b; } -static struct alloc_state tree_state; +static struct alloc_state *tree_state; void *alloc_tree_node(void) { struct tree *t = alloc_node(_state, sizeof(struct tree)); @@ -64,7 +72,7 @@ void *alloc_tree_node(void) return t; } -static struct alloc_state tag_state; +static struct alloc_state *tag_state; void *alloc_tag_node(void) { struct tag *t = alloc_node(_state, sizeof(struct tag)); @@ -72,7 +80,7 @@ void *alloc_tag_node(void) return t; } -static struct alloc_state object_state; +static struct alloc_state *object_state; void *alloc_object_node(void) { struct object *obj = alloc_node(_state, sizeof(union any_object)); @@ -80,7 +88,7 @@ void *alloc_object_node(void) return obj; } -static struct alloc_state commit_state; +static struct alloc_state *commit_state; unsigned int alloc_commit_index(void) { @@ -103,7 +111,7 @@ static void report(const char *name, unsigned int count, size_t size) } #define REPORT(name, type) \ -report(#name, name##_state.count, name##_state.count * sizeof(type) >> 10) +report(#name, name##_state->count, name##_state->count * sizeof(type) >> 10) void alloc_report(void) { @@ -113,3 +121,22 @@ void alloc_report(void) REPORT(tag, struct tag); REPORT(object, union any_object); } + +static void alloc_clear(struct alloc_state **sp) +{ + while (*sp) { + struct alloc_state *s = *sp; + *sp = s->prev; + free(s->start); + free(s); + } +} + +void alloc_clear_all(void) +{ + alloc_clear(_state); + alloc_clear(_state); + alloc_clear(_state); + alloc_clear(_state); + alloc_clear(_state); +} diff --git
Re: [PATCH v6 00/11] nd/pack-objects-pack-struct updates
On Wed, Mar 21, 2018 at 04:59:19PM +0100, Duy Nguyen wrote: > > I hate to be a wet blanket, but am I the only one who is wondering > > whether the tradeoffs is worth it? 8% memory reduction doesn't seem > > mind-bogglingly good, > > AEvar measured RSS. If we count objects[] array alone, the saving is > 40% (136 bytes per entry down to 80). Some is probably eaten up by > mmap in rss. Measuring actual heap usage with massif, I get before/after peak heaps of 1728 and 1346MB respectively when repacking linux.git. So that's ~22% savings overall. Of the used heap after your patches: - ~40% of that is from packlist_alloc() - ~17% goes to "struct object" - ~10% for the object.c hash table to store all the "struct object" - ~7% goes to the delta cache - ~7% goes to the pack revindex (actually, there's a duplicate 7% there, too; I think our peak is when we're sorting the revindex and have to keep two copies in memory at once) - ~5% goes to the packlist_find() hash table - ~3.5% for the get_object_details() sorting list (this is only held for a minute, but again, our peak comes during this sort, which in turn loads the revindex) So 27% of the total heap goes away if you switch to a separate rev-list. Though it's mostly just going to a different process, it does help peak because that process would have exited by the time we get to the revindex bits. I suspect you could get the same effect by just teaching pack-objects to clear obj_hash and all of the allocated object structs. I think that should be safe to do as long as we clear _all_ of the objects, so there are no dangling pointers. > About the 16k limit (and some other limits as well), I'm making these > patches with the assumption that large scale deployment probably will > go with custom builds anyway. Adjusting the limits back should be > quite easy while we can still provide reasonable defaults for most > people. I think this 16k limit is the thing I _most_ dislike about the series. If we could tweak that case such that we always made forward progress, I think I'd be a lot less nervous. I responded elsewhere in the thread (before seeing that both Junio and you seemed aware of the issues ;) ), but I think it would be acceptable to have git-repack enforce the limit. That would still mean you could get into a broken state for serving fetches, but you could at least get out of it by running "git repack". -Peff
Dear Friend Please Respond
From:MR.RICHARD SANDOO. Bill and Exchange Manager Micro Finance Bank Plc Burkina Faso Dear Friend, I know that this mail will come to you as a surprise. I am MR.RICHARD SANDOO. and I am the bill and Exchange manager in a Bank here in my country .I Hope that you will not expose or betray this trust and confident that i am about to Repose on you for the mutual benefit of our both families. I need your urgent assistance in transferring the sum of $15 million Immediately to your account. The money has been dormant for years in our Bank here without any body coming for it. I want my bank to release the money to you as the nearest person to our deceased customer the owner Of the account who died a long with his supposed next of kin in an air Crash since July 2000.I don't want the money to go into our Bank treasury as an abandoned fund. So this is the reason why i contacted you, so that the bank can release the money to you as the nearest person to the deceased customer. Please i will like you to keep this proposal as a top secret . Upon receipt of your reply, I will send you full details on how the transfer will be executed and also note that you will have 40% of the Above mentioned sum. Acknowledge receipt of this message in acceptance of my mutual business endeavour by furnishing me with the following information; 1. Your Full Names and Address... 2. Country... .. 3. Direct Telephone . 4. Occupation .. Please send me your information as soon as you reply I will give you full details on how you and me can claim the money from our bank. I am waiting to receive your reply. MR.RICHARD SANDOO Micro Finance Bank , Burkina Faso
Re: [PATCH v2] routines to generate JSON data
On Wed, Mar 21 2018, g...@jeffhostetler.com wrote: > So, I'm not sure we have a route to get UTF-8-clean data out of Git, and if > we do it is beyond the scope of this patch series. > > So I think for our uses here, defining this as "JSON-like" is probably the > best answer. We write the strings as we received them (from the file system, > the index, or whatever). These strings are properly escaped WRT double > quotes, backslashes, and control characters, so we shouldn't have an issue > with decoders getting out of sync -- only with them rejecting non-UTF-8 > sequences. > > We could blindly \u encode each of the hi-bit characters, if that would > help the parsers, but I don't want to do that right now. > > WRT binary data, I had not intended using this for binary data. And without > knowing what kinds or quantity of binary data we might use it for, I'd like > to ignore this for now. I agree we should just ignore this problem for now given the immediate use-case.
Re: [PATCH v6 00/11] nd/pack-objects-pack-struct updates
On Thu, Mar 22, 2018 at 9:07 AM, Jeff Kingwrote: > On Wed, Mar 21, 2018 at 05:31:14PM +0100, Ævar Arnfjörð Bjarmason wrote: > >> > [...]Yes, having that many packs is insane, but that's going to be >> > small consolation to somebody whose automated maintenance program now >> > craps out at 16k packs, when it previously would have just worked to >> > fix the situation[...] >> >> That's going to be super rare (and probably nonexisting) edge case, but >> (untested) I wonder if something like this on top would alleviate your >> concerns, i.e. instead of dying we just take the first N packs up to our >> limit: > > I wish you were right about the rarity, but it's unfortunately something > I have seen multiple times in the wild (and why I spent time optimizing > the many-packs case for pack-objects). Unfortunately I don't know how > often it actually comes up, because in theory running "git repack" > cleans it up without further ado. But after these patches, not so much. It's good to know this case is real and I can start to fix it (assuming that the other concern about readability will not stop this series). I think I'll try to fix this without involving repack. pack-objects can produce multiple packs, so if we have more than 16k pack files, we produce one new pack per 16k old ones. -- Duy
Re: [PATCH v6 00/11] nd/pack-objects-pack-struct updates
On Wed, Mar 21, 2018 at 05:31:14PM +0100, Ævar Arnfjörð Bjarmason wrote: > > [...]Yes, having that many packs is insane, but that's going to be > > small consolation to somebody whose automated maintenance program now > > craps out at 16k packs, when it previously would have just worked to > > fix the situation[...] > > That's going to be super rare (and probably nonexisting) edge case, but > (untested) I wonder if something like this on top would alleviate your > concerns, i.e. instead of dying we just take the first N packs up to our > limit: I wish you were right about the rarity, but it's unfortunately something I have seen multiple times in the wild (and why I spent time optimizing the many-packs case for pack-objects). Unfortunately I don't know how often it actually comes up, because in theory running "git repack" cleans it up without further ado. But after these patches, not so much. I'll admit that my experiences aren't necessarily typical of most git users. But I wouldn't be surprised if other people hosting their own repositories run into this, too (e.g., somebody pushing in a loop, auto-gc disabled or clogged by something silly like the "too many loose objects" warning). > diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c > index 4406af640f..49d467ab2a 100644 > --- a/builtin/pack-objects.c > +++ b/builtin/pack-objects.c > @@ -1065,8 +1065,9 @@ static int want_object_in_pack(const struct > object_id *oid, > > want = 1; > done: > - if (want && *found_pack && !(*found_pack)->index) > - oe_add_pack(_pack, *found_pack); > + if (want && *found_pack && !(*found_pack)->index) { > + if (oe_add_pack(_pack, *found_pack) == -1) > + return 0; Something like this does seem like a much better fallback, as we'd make forward progress instead of aborting (and exacerbating whatever caused the packs to stack up in the first place). I think the patch as-is does not work, though. You say "oops, too many packs" and so the "yes we want this object" return becomes "no, we do not want it". And it is not included in the resulting packfile. But what happens after that? After pack-objects finishes, we return to "git repack", which assumes that pack-objects packed everything it was told to. And with "-d", it then _deletes_ the old packs, knowing that anything of value was copied to the new pack. So with this patch, we'd corrupt the repository if this code is ever hit. You'd need some way to report back to "git repack" that the pack was omitted. Or probably more sensibly, you'd need "git repack" to count up the packs and make sure that it marks anybody beyond the limit manually as .keep (presumably using Duy's new command-line option rather than actually writing a file). -Peff
[no subject]
hi Git https://goo.gl/RLDyn8 Bbenta
Re: [PATCH v2] routines to generate JSON data
On Wed, Mar 21, 2018 at 07:28:26PM +, g...@jeffhostetler.com wrote: > It includes a new "struct json_writer" which is used to guide the > accumulation of JSON data -- knowing whether an object or array is > currently being composed. This allows error checking during construction. > > It also allows construction of nested structures using an inline model (in > addition to the original bottom-up composition). > > The test helper has been updated to include both the original unit tests and > a new scripting API to allow individual tests to be written directly in our > t/t*.sh shell scripts. Thanks for all of this. The changes look quite sensible to me (I do still suspect we could do the "first_item" thing without having to allocate, but I really like the assertions you were able to put in). > So I think for our uses here, defining this as "JSON-like" is probably the > best answer. We write the strings as we received them (from the file system, > the index, or whatever). These strings are properly escaped WRT double > quotes, backslashes, and control characters, so we shouldn't have an issue > with decoders getting out of sync -- only with them rejecting non-UTF-8 > sequences. Yeah, I think I've come to the same conclusion. My main goal in raising it now was to see if there was some other format we might use before we go too far down the JSON road. But as far as I can tell there really isn't another good option. > WRT binary data, I had not intended using this for binary data. And without > knowing what kinds or quantity of binary data we might use it for, I'd like > to ignore this for now. Yeah, I don't have any plans here either. I was thinking more about things like author names and file paths. -Peff