Re: [PATCH] completion: add option completion for most builtin commands

2018-03-21 Thread Duy Nguyen
On Wed, Mar 21, 2018 at 9:59 PM, Junio C Hamano  wrote:
> Nguyễn Thái Ngọc Duy   writes:
>
>> These commands can take options and use parse-options so it's quite
>> easy to allow option completion. This does not pollute the command
>> name completion though. "git " will show you the same set as
>> before. This only kicks in when you type the correct command name.
>>
>> Some other builtin commands are not still added because either they
>> don't use parse-options, or they are deprecated, or they are those
>> -helper commands that are used to move some logic back in C for
>> sh-based commands.
>>
>> Signed-off-by: Nguyễn Thái Ngọc Duy 
>> ---
>>  contrib/completion/git-completion.bash | 276 +
>>  1 file changed, 276 insertions(+)
>
> Wow, just wow.  It however looks a lot of boilerplates, e.g. looking
> at these, we notice ...
>
>> +_git_blame() {
>> + case "$cur" in
>> + --*)
>> + __gitcomp_builtin blame
>> + return
>> + ;;
>> + esac
>> +}
>> +
>>
>> +_git_cat_file() {
>> + case "$cur" in
>> + --*)
>> + __gitcomp_builtin cat-file
>> + return
>> + ;;
>> + esac
>> +}
>> +
>> +_git_check_attr() {
>> + case "$cur" in
>> + --*)
>> + __gitcomp_builtin check-attr
>> + return
>> + ;;
>> + esac
>> +}
>
> ... the only thing we need for the above three is a table that says
> "use blame for blame, cat-file for cat_file, and check-attr for
> check_attr".
>
> And that pattern repeats throughout the patch.  I wonder if we can
> express the same a lot more concisely by updating the caller that
> calls these command specific helpers?
>

Yeah. I almost went to just generate and eval these functions. But we
still need to keep a list of "bultin with --git-completion-helper"
support somewhere, and people may want to complete arguments without
double dashes (e.g. read-tree should take a ref...) which can't be
helped by --git-completion-helper.
-- 
Duy


ITS ALL ABOUT FACEBOOK

2018-03-21 Thread Facebook Int'l


Hello,

Facebook is given out 14,000,000.USD (Fourteen Million Dollars) its all about 
14 Please, respond with your Unique Code (FB/BF14-13M5250UD) using your 
registration email, to the Verification Department at; 
dustinmoskovitz.facebo...@gmail.com

Dustin Moskovitz
Facebook Team
Copyright © 2018 Facebook Int'l


Re: [PATCH 40/44] packfile: allow prepare_packed_git to handle arbitrary repositories

2018-03-21 Thread Junio C Hamano
Brandon Williams  writes:

> On 03/03, Nguyễn Thái Ngọc Duy wrote:
>> From: Stefan Beller 
>> 
>> Signed-off-by: Stefan Beller 
>> Signed-off-by: Junio C Hamano 
>> Signed-off-by: Nguyễn Thái Ngọc Duy 
>> ---
>
> This is an invalid conversion.
>
>>  packfile.c | 18 +-
>>  packfile.h |  3 +--
>>  2 files changed, 10 insertions(+), 11 deletions(-)
>> 
>> diff --git a/packfile.c b/packfile.c
>> index 52febba932..2276e2ad26 100644
>> --- a/packfile.c
>> +++ b/packfile.c
>> @@ -882,19 +882,19 @@ static void prepare_packed_git_mru(struct repository 
>> *r)
>>  list_add_tail(>mru, >objects.packed_git_mru);
>>  }
>>  
>> -void prepare_packed_git_the_repository(void)
>> +void prepare_packed_git(struct repository *r)
>>  {
>>  struct alternate_object_database *alt;
>>  
>> -if (the_repository->objects.packed_git_initialized)
>> +if (r->objects.packed_git_initialized)
>>  return;
>> -prepare_packed_git_one(the_repository, get_object_directory(), 1);
>> -prepare_alt_odb(the_repository);
>> -for (alt = the_repository->objects.alt_odb_list; alt; alt = alt->next)
>> -prepare_packed_git_one(the_repository, alt->path, 0);
>> -rearrange_packed_git(the_repository);
>> -prepare_packed_git_mru(the_repository);
>> -the_repository->objects.packed_git_initialized = 1;
>> +prepare_packed_git_one(r, get_object_directory(), 1);
>
> Calling get_object_directory() returns the_repository's object dir,
> this needs to be replaced with r->objects.objectdir.

Nicely spotted.  I think this was inherited from the orginal,
e.g. the one from the end of last month

https://public-inbox.org/git/20180228010608.215505-9-sbel...@google.com/

also calls get_object_directory().

Thanks.


Re: [RFC][GSoC] Project proposal: convert interactive rebase to C

2018-03-21 Thread Johannes Schindelin
Hi Alban,

On Wed, 21 Mar 2018, Alban Gruin wrote:

> Le mardi 20 mars 2018 17:29:28 CET, vous avez écrit :
> 
> > Also, I have a hunch that there is actually almost nothing left to
> > rewrite after my sequencer improvements that made it into Git v2.13.0,
> > together with the upcoming changes (which are on top of the
> > --recreate-merges patch series, hence I did not send them to the
> > mailing list yet) in
> > https://github.com/dscho/git/commit/c261f17a4a3e
> 
> One year ago, you said[2] that converting this script "will fill up 3
> month, very easily". Is this not accurate anymore?

Let me read that mail ;-)

*goes and reads*

Well, I was talking about two different aspects to Ivan and to you. I
should have been clearer. So let me try again:

To convert `git-rebase--interactive.sh`, I think the most important part
is to factor out the preserve-merges code into its own script. After that,
there is little I can think of (apart from support for --root, which a
not-yet-contributed patch in my sequencer-shears branch on
https://github.com/dscho/git addresses) that still needs to be converted.
For somebody familiar with Git's source code, I would estimate one week
(and therefore 3 weeks would be a realistic estimate :-)).

Come to think of it, a better approach might be to leave the
preserve-merges stuff in, and teach `git-rebase.sh` to call the sequencer
directly for --interactive without --preserve-merges, then rename the
script to git-rebase--preserve.sh

The other aspect, the one I thought would take up to 3 months, easily, was
to convert the entirety of rebase -i into C. That would entail also the
option parsing, for which you would have to convert also git-rebase.sh
(and if you do not convert git-rebase--am.sh and git-rebase--merge.sh
first, you would then have to teach builtin/rebase.c to populate the
environment variables expected by those shell scripts while spawning
them).

I still think that the latter is too big a task for a single GSoC.

> I’ll send a new draft as soon as possible (hopefully this afternoon).

I look forward to reading it!

Ciao,
Johannes

Re: [PATCH 1/1] Fix typo in merge-strategies documentation

2018-03-21 Thread David Pursehouse
On Tue, Mar 20, 2018 at 1:51 AM, Junio C Hamano  wrote:
>
> I somehow had to stare at the patch for a few minutes, view it in
> two Emacs buffers and run M-x compare-windows before I finally spot
> the single-byte typofix.
>
> Will queue with a retitle.

[resending as plain text]

Thanks, and sorry for the confusion.  I'll provide a more detailed
commit message next time.


Re: [RFC PATCH v2 1/1] rebase-interactive: Add git_rebase__interactive__preserve_merges

2018-03-21 Thread Junio C Hamano
Wink Saville  writes:

>> Good code simplification that turns
>>
>> if A
>> if B
>> do this thing
>> fi
>> fi
>>
>> into
>>
>> if A & B
>> do this thing
>> fi
>
> Will be keeping this in a future commit

I think this one is simple enough to be in the "prelim clean-up"
change, so that you can reduce the size of the remaining stuff that
really needs focused review.

> But if that doesn't come to past, I believe my goal of simplication and 
> fixing:
>   "not ok 24 - exchange two commits with -p # TODO known breakage"
> In t3404-rebase-interactive.sh is worth while.

Oh, thanks for working on this.  


Re: [PATCH 00/44] reroll nd/remove-ignore-env.. sb/object-store and sb/packfiles..

2018-03-21 Thread Brandon Williams
On 03/03, Nguyễn Thái Ngọc Duy wrote:
> On Sat, Mar 3, 2018 at 9:54 AM, Duy Nguyen  wrote:
> > On Thu, Mar 1, 2018 at 2:09 AM, Junio C Hamano  wrote:
> >> Stefan Beller  writes:
> >>
> >>> On Wed, Feb 28, 2018 at 9:59 AM, Junio C Hamano  wrote:
>  Duy Nguyen  writes:
> 
> > Looking at the full-series diff though, it makes me wonder if we
> > should keep prepare_packed_git() private (i.e. how we initialize the
> > object store, packfile included, is a private matter). How about
> > something like this on top?
> 
>  Yup, that looks cleaner.
> >>>
> >>> I agree that it looks cleaner. So we plan on just putting
> >>> it on top of that series?
> >>
> >> We tend to avoid "oops, that was wrong and here is a band aid on
> >> top" for things that are still mushy, so it would be preferrable to
> >> get it fixed inline, especially if there are more changes to the
> >> other parts of the series coming.
> >
> > I agree with this. Stefan, if you're getting bored with rerolling
> > refactor patches, I can update this series and send out v2.
> 
> Since Stefan is traveling, I take this opportunity to reroll it.
> Unfortunately, I think the fix should go in 46cd557bd9 (object-store:
> move packed_git and packed_git_mru to object store - 2018-02-23) where
> we start removing the global "packed_git". But that's in
> sb/object-store, so.. I'm rerolling all three
> 
> 01/44 - 05/44: nd/remove-ignore-env-field
> 
>   This series is moved up top. After this the patch that touch
>   alternate-db in sha1_file.c looks natural because no env is involved
>   anymore
> 
>   I also take this opportunity to introduce a new patch 01/44 to avoid
>   struct initialization that makes it hard to read and update. Later
>   patches are also simplified thanks to this.
> 
> 06/44 - 32/44: sb/object-store
> 
>   06/44 is updated to introduce raw_object_store_init() instead of
>   RAW_OBJECT_STORE_INIT macro. This function is now used to initialize
>   both main repo and submodule ones.
> 
>   10/44 (moving "packed_git") also introduces two new access wrapper
>   get_packed_git() and get_packed_git_mru()
> 
> 33/44 - 44/44: sb/packfiles-in-repository
> 
>   The only thing new here is 44/44 which makes prepare_packed_git()
>   internal. get_packed_git() and get_packed_git_mru() introduced
>   earlier will call prepare_packed_git() automatically.
> 
> The whole thing is also available at
> 
> https://github.com/pclouds/git/tree/ignore-env-object-store-packfiles
> 
> And interdiff of all three, compared to what is currently in 'pu'.
> Looks pretty good in my opinon:

The only major issue I could find is with patch 40 and must be fixed
before this can be merged.  The rest of the series looks good.

-- 
Brandon Williams


Re: [PATCH 44/44] packfile: keep prepare_packed_git() private

2018-03-21 Thread Brandon Williams
On 03/03, Nguyễn Thái Ngọc Duy wrote:
> The reason callers have to call this is to make sure either packed_git
> or packed_git_mru pointers are initialized since we don't do that by
> default. Sometimes it's hard to see this connection between where the
> function is called and where packed_git pointer is used (sometimes in
> separate functions).
> 
> Keep this dependency internal because now all access to packed_git and
> packed_git_mru must go through get_xxx() wrappers.

Ahh now I understand the rational for trying to make it "private".

> 
> Signed-off-by: Nguyễn Thái Ngọc Duy 
> ---
>  builtin/count-objects.c  | 3 +--
>  builtin/fsck.c   | 2 --
>  builtin/gc.c | 1 -
>  builtin/pack-objects.c   | 1 -
>  builtin/pack-redundant.c | 2 --
>  fast-import.c| 1 -
>  http-backend.c   | 1 -
>  pack-bitmap.c| 1 -
>  packfile.c   | 5 -
>  packfile.h   | 1 -
>  server-info.c| 1 -
>  sha1_name.c  | 2 --
>  12 files changed, 5 insertions(+), 16 deletions(-)
> 
> diff --git a/builtin/count-objects.c b/builtin/count-objects.c
> index 2793c98ed3..ee6ae35244 100644
> --- a/builtin/count-objects.c
> +++ b/builtin/count-objects.c
> @@ -121,8 +121,7 @@ int cmd_count_objects(int argc, const char **argv, const 
> char *prefix)
>   struct strbuf loose_buf = STRBUF_INIT;
>   struct strbuf pack_buf = STRBUF_INIT;
>   struct strbuf garbage_buf = STRBUF_INIT;
> - if (!get_packed_git(the_repository))
> - prepare_packed_git(the_repository);
> +
>   for (p = get_packed_git(the_repository); p; p = p->next) {
>   if (!p->pack_local)
>   continue;
> diff --git a/builtin/fsck.c b/builtin/fsck.c
> index 7a3e323e9e..9911c52bc8 100644
> --- a/builtin/fsck.c
> +++ b/builtin/fsck.c
> @@ -726,8 +726,6 @@ int cmd_fsck(int argc, const char **argv, const char 
> *prefix)
>   uint32_t total = 0, count = 0;
>   struct progress *progress = NULL;
>  
> - prepare_packed_git(the_repository);
> -
>   if (show_progress) {
>   for (p = get_packed_git(the_repository); p;
>p = p->next) {
> diff --git a/builtin/gc.c b/builtin/gc.c
> index 560d58daec..be63bec09c 100644
> --- a/builtin/gc.c
> +++ b/builtin/gc.c
> @@ -173,7 +173,6 @@ static int too_many_packs(void)
>   if (gc_auto_pack_limit <= 0)
>   return 0;
>  
> - prepare_packed_git(the_repository);
>   for (cnt = 0, p = get_packed_git(the_repository); p; p = p->next) {
>   if (!p->pack_local)
>   continue;
> diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
> index 6020a7e230..435f091a69 100644
> --- a/builtin/pack-objects.c
> +++ b/builtin/pack-objects.c
> @@ -3151,7 +3151,6 @@ int cmd_pack_objects(int argc, const char **argv, const 
> char *prefix)
>   if (progress && all_progress_implied)
>   progress = 2;
>  
> - prepare_packed_git(the_repository);
>   if (ignore_packed_keep) {
>   struct packed_git *p;
>   for (p = get_packed_git(the_repository); p; p = p->next)
> diff --git a/builtin/pack-redundant.c b/builtin/pack-redundant.c
> index bf42e164eb..02b5f0becc 100644
> --- a/builtin/pack-redundant.c
> +++ b/builtin/pack-redundant.c
> @@ -630,8 +630,6 @@ int cmd_pack_redundant(int argc, const char **argv, const 
> char *prefix)
>   break;
>   }
>  
> - prepare_packed_git(the_repository);
> -
>   if (load_all_packs)
>   load_all();
>   else
> diff --git a/fast-import.c b/fast-import.c
> index 985eb2eccc..2298bfcdfd 100644
> --- a/fast-import.c
> +++ b/fast-import.c
> @@ -3472,7 +3472,6 @@ int cmd_main(int argc, const char **argv)
>   rc_free[i].next = _free[i + 1];
>   rc_free[cmd_save - 1].next = NULL;
>  
> - prepare_packed_git(the_repository);
>   start_packfile();
>   set_die_routine(die_nicely);
>   set_checkpoint_signal();
> diff --git a/http-backend.c b/http-backend.c
> index 659ddfb5f1..22d2e1668e 100644
> --- a/http-backend.c
> +++ b/http-backend.c
> @@ -518,7 +518,6 @@ static void get_info_packs(struct strbuf *hdr, char *arg)
>   size_t cnt = 0;
>  
>   select_getanyfile(hdr);
> - prepare_packed_git(the_repository);
>   for (p = get_packed_git(the_repository); p; p = p->next) {
>   if (p->pack_local)
>   cnt++;
> diff --git a/pack-bitmap.c b/pack-bitmap.c
> index 01c9cd1642..2a007b5539 100644
> --- a/pack-bitmap.c
> +++ b/pack-bitmap.c
> @@ -335,7 +335,6 @@ static int open_pack_bitmap(void)
>  
>   assert(!bitmap_git.map && !bitmap_git.loaded);
>  
> - prepare_packed_git(the_repository);
>   for (p = get_packed_git(the_repository); p; p = p->next) {
> 

Re: [PATCH] sha1_name: use bsearch_hash() for abbreviations

2018-03-21 Thread brian m. carlson
On Wed, Mar 21, 2018 at 09:24:06AM -0400, Derrick Stolee wrote:
> On 3/20/2018 6:25 PM, Jonathan Tan wrote:
> > On Tue, 20 Mar 2018 16:03:25 -0400
> > Derrick Stolee  wrote:
> > > One caveat about the patch: there is a place where I cast a sha1 hash
> > > into a struct object_id pointer. This is because the abbreviation code
> > > still uses 'const unsigned char *' instead of structs. I wanted to avoid
> > > a hashcpy() in these calls, but perhaps that is not too heavy a cost.
> > I recall a discussion that there were alignment issues with doing this,
> > but I might have be remembering wrongly - in my limited knowledge of C
> > alignment, both "unsigned char *" and "struct object_id *" have the same
> > constraints, but I'm not sure.
> 
> Adding Brian M. Carlson in the CC line for advice on how to do this
> translation between old sha1's and new object_ids. If it isn't safe, then we
> could do a hashcpy() until the translation makes it unnecessary.
> 
> I should have compared the two methods before sending the patch, but running
> the "git log --oneline --parents" test with a hashcpy() versus a cast has no
> measurable difference in performance for Linux. Probably best to do the
> safest thing here if there is no cost to perf.

There is no alignment difference.  The alignment of struct object_id is
going to be the same as the underlying hash.  My concern in the past has
been strict aliasing violations, which compilers can sometimes exploit
to generate incorrect code.

However, the bigger concern tends to be that when we switch to a new
hash function, we may extend struct object_id with a hash type byte.
The current hash function transition plan certainly makes this a likely
scenario.  In such a case, a cast would end reading past the end of the
underlying array should we read the type byte.

If this isn't a performance critical path, I'd recommend simply making a
copy.  I can clean up the definition of struct min_abbrev_data in a
future series, or I can do something like the following on top of the
last series I sent, which is in next (only compile tested).  If you're
willing to wait until it hits master, you can just drop the patch in.

-- >8 --
From  Mon Sep 17 00:00:00 2001
From: "brian m. carlson" 
Date: Wed, 21 Mar 2018 22:38:09 +
Subject: [PATCH] sha1_name: convert struct min_abbrev_data to object_id

This structure is only written to in one place, where we already have a
struct object_id.  Convert the struct to use a struct object_id instead.

Signed-off-by: brian m. carlson 
---
 sha1_name.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/sha1_name.c b/sha1_name.c
index 39e911c8ba..16e0003396 100644
--- a/sha1_name.c
+++ b/sha1_name.c
@@ -480,7 +480,7 @@ struct min_abbrev_data {
unsigned int init_len;
unsigned int cur_len;
char *hex;
-   const unsigned char *hash;
+   const struct object_id *oid;
 };
 
 static inline char get_hex_char_from_oid(const struct object_id *oid,
@@ -526,7 +526,7 @@ static void find_abbrev_len_for_pack(struct packed_git *p,
int cmp;
 
current = nth_packed_object_sha1(p, mid);
-   cmp = hashcmp(mad->hash, current);
+   cmp = hashcmp(mad->oid->hash, current);
if (!cmp) {
match = 1;
first = mid;
@@ -603,7 +603,7 @@ int find_unique_abbrev_r(char *hex, const struct object_id 
*oid, int len)
mad.init_len = len;
mad.cur_len = len;
mad.hex = hex;
-   mad.hash = oid->hash;
+   mad.oid = oid;
 
find_abbrev_len_packed();
 
-- >8 --
-- 
brian m. carlson / brian with sandals: Houston, Texas, US
https://www.crustytoothpaste.net/~bmc | My opinion only
OpenPGP: https://keybase.io/bk2204


signature.asc
Description: PGP signature


Re: [PATCH 40/44] packfile: allow prepare_packed_git to handle arbitrary repositories

2018-03-21 Thread Brandon Williams
On 03/03, Nguyễn Thái Ngọc Duy wrote:
> From: Stefan Beller 
> 
> Signed-off-by: Stefan Beller 
> Signed-off-by: Junio C Hamano 
> Signed-off-by: Nguyễn Thái Ngọc Duy 
> ---

This is an invalid conversion.

>  packfile.c | 18 +-
>  packfile.h |  3 +--
>  2 files changed, 10 insertions(+), 11 deletions(-)
> 
> diff --git a/packfile.c b/packfile.c
> index 52febba932..2276e2ad26 100644
> --- a/packfile.c
> +++ b/packfile.c
> @@ -882,19 +882,19 @@ static void prepare_packed_git_mru(struct repository *r)
>   list_add_tail(>mru, >objects.packed_git_mru);
>  }
>  
> -void prepare_packed_git_the_repository(void)
> +void prepare_packed_git(struct repository *r)
>  {
>   struct alternate_object_database *alt;
>  
> - if (the_repository->objects.packed_git_initialized)
> + if (r->objects.packed_git_initialized)
>   return;
> - prepare_packed_git_one(the_repository, get_object_directory(), 1);
> - prepare_alt_odb(the_repository);
> - for (alt = the_repository->objects.alt_odb_list; alt; alt = alt->next)
> - prepare_packed_git_one(the_repository, alt->path, 0);
> - rearrange_packed_git(the_repository);
> - prepare_packed_git_mru(the_repository);
> - the_repository->objects.packed_git_initialized = 1;
> + prepare_packed_git_one(r, get_object_directory(), 1);

Calling get_object_directory() returns the_repository's object dir,
this needs to be replaced with r->objects.objectdir.

> + prepare_alt_odb(r);
> + for (alt = r->objects.alt_odb_list; alt; alt = alt->next)
> + prepare_packed_git_one(r, alt->path, 0);
> + rearrange_packed_git(r);
> + prepare_packed_git_mru(r);
> + r->objects.packed_git_initialized = 1;
>  }
>  
>  void reprepare_packed_git_the_repository(void)
> diff --git a/packfile.h b/packfile.h
> index ab5046938c..3fd9092472 100644
> --- a/packfile.h
> +++ b/packfile.h
> @@ -34,8 +34,7 @@ extern struct packed_git *parse_pack_index(unsigned char 
> *sha1, const char *idx_
>  #define PACKDIR_FILE_GARBAGE 4
>  extern void (*report_garbage)(unsigned seen_bits, const char *path);
>  
> -#define prepare_packed_git(r) prepare_packed_git_##r()
> -extern void prepare_packed_git_the_repository(void);
> +extern void prepare_packed_git(struct repository *r);
>  #define reprepare_packed_git(r) reprepare_packed_git_##r()
>  extern void reprepare_packed_git_the_repository(void);
>  extern void install_packed_git(struct repository *r, struct packed_git 
> *pack);
> -- 
> 2.16.1.435.g8f24da2e1a
> 

-- 
Brandon Williams


Re: [PATCH 2/3] rebase -i --keep-empty: don't prune empty commits

2018-03-21 Thread Johannes Schindelin
Hi Junio,

On Tue, 20 Mar 2018, Junio C Hamano wrote:

> Johannes Schindelin  writes:
> 
> >> +  if (!keep_empty && is_empty)
> >>strbuf_addf(, "%c ", comment_line_char);
> 
> We are not trying to preserve an empty one, and have found an empty
> one, so we comment it out, and then...
> 
> >> +  if (is_empty || !(commit->object.flags & PATCHSAME)) {
> >
> > May I suggest inverting the logic here, to make the code more obvious and
> > also to avoid indenting the block even further?
> >
> > if (!is_empty && (commit->object.flags & PATCHSAME))
> > continue;
> 
> ... if a non-empty one that already appears in the upstream, we do
> not do anything to it.  There is no room for keep-empty or lack of
> it to affect what happens to these commits.
> 
> Otherwise the insn is emitted for the commit.
> 
> >> +  strbuf_addf(, "%s %s ", insn,
> >> +  oid_to_hex(>object.oid));
> >> +  pretty_print_commit(, commit, );
> >> +  strbuf_addch(, '\n');
> >> +  fputs(buf.buf, out);
> >> +  }
> 
> I tend to agree that the suggested structure is easier to follow
> than Phillip's version.
> 
> But I wonder if this is even easier to follow.  It makes it even
> more clear that patchsame commits that are not empty are discarded
> unconditionally.
> 
>   while ((commit = get_revision())) {
>   int is_empty  = is_original_commit_empty(commit);
>   if (!is_empty && (commit->object.flags & PATCHSAME))
>   continue;
>   strbuf_reset();
>   if (!keep_empty && is_empty)
>   strbuf_addf(, "%c ", comment_line_char);
>   strbuf_addf(, "%s %s ", insn,
>   oid_to_hex(>object.oid));
>   pretty_print_commit(, commit, );
>   strbuf_addch(, '\n');
>   fputs(buf.buf, out);
>   }
> 
> Or did I screw up the rewrite?

This looks correct. And the postimage is easier to follow than the one of
my suggested change.

My version is easier to review on the mailing list, of course, as it
minimizes the diff... ;-)

Ciao,
Dscho


Re: [PATCH 32/44] sha1_file: allow sha1_loose_object_info to handle arbitrary repositories

2018-03-21 Thread Brandon Williams
On 03/03, Nguyễn Thái Ngọc Duy wrote:
> From: Jonathan Nieder 
> 
> Signed-off-by: Stefan Beller 
> Signed-off-by: Jonathan Nieder 
> Signed-off-by: Junio C Hamano 
> Signed-off-by: Nguyễn Thái Ngọc Duy 

Patches up to here look sane.

> ---
>  sha1_file.c | 11 +--
>  1 file changed, 5 insertions(+), 6 deletions(-)
> 
> diff --git a/sha1_file.c b/sha1_file.c
> index c5a8b00aed..7066d4c9ce 100644
> --- a/sha1_file.c
> +++ b/sha1_file.c
> @@ -1149,10 +1149,9 @@ int parse_sha1_header(const char *hdr, unsigned long 
> *sizep)
>   return parse_sha1_header_extended(hdr, , 0);
>  }
>  
> -#define sha1_loose_object_info(r, s, o, f) sha1_loose_object_info_##r(s, o, 
> f)
> -static int sha1_loose_object_info_the_repository(const unsigned char *sha1,
> -  struct object_info *oi,
> -  int flags)
> +static int sha1_loose_object_info(struct repository *r,
> +   const unsigned char *sha1,
> +   struct object_info *oi, int flags)
>  {
>   int status = 0;
>   unsigned long mapsize;
> @@ -1176,14 +1175,14 @@ static int 
> sha1_loose_object_info_the_repository(const unsigned char *sha1,
>   if (!oi->typep && !oi->typename && !oi->sizep && !oi->contentp) {
>   const char *path;
>   struct stat st;
> - if (stat_sha1_file(the_repository, sha1, , ) < 0)
> + if (stat_sha1_file(r, sha1, , ) < 0)
>   return -1;
>   if (oi->disk_sizep)
>   *oi->disk_sizep = st.st_size;
>   return 0;
>   }
>  
> - map = map_sha1_file(the_repository, sha1, );
> + map = map_sha1_file(r, sha1, );
>   if (!map)
>   return -1;
>  
> -- 
> 2.16.1.435.g8f24da2e1a
> 

-- 
Brandon Williams


RE: Bug With git rebase -p

2018-03-21 Thread Johannes Schindelin
Hi Joseph,

On Tue, 20 Mar 2018, Joseph Strauss wrote:

> Perfect. Thank you.

You are welcome.

I am puzzled, though... does your message mean that you tested the Git for
Windows v2.17.0-rc0 installer and it did fix your problem? Or do you
simply assume that it does fix your problem because Junio & I expect it to
fix your problem?

Ciao,
Johannes


Re: [PATCH 1/1] Fix typo in merge-strategies documentation

2018-03-21 Thread Johannes Schindelin
Hi Junio,

On Mon, 19 Mar 2018, Junio C Hamano wrote:

> David Pursehouse  writes:
> 
> > From: David Pursehouse 
> >
> > Signed-off-by: David Pursehouse 
> > ---
> 
> I somehow had to stare at the patch for a few minutes, view it in
> two Emacs buffers and run M-x compare-windows before I finally spot
> the single-byte typofix.

Pro-tip: git am && git show --color-words

:-)

Ciao,
Dscho


Re: [PATCH 19/44] sha1_file: allow link_alt_odb_entries to handle arbitrary repositories

2018-03-21 Thread Brandon Williams
On 03/03, Nguyễn Thái Ngọc Duy wrote:
> From: Stefan Beller 
> 
> Actually this also allows read_info_alternates and link_alt_odb_entry to
> handle arbitrary repositories, but link_alt_odb_entries is the most
> interesting function in this set of functions, hence the commit subject.
> 
> These functions span a strongly connected component in the function
> graph, i.e. the recursive call chain might look like
> 
>   -> link_alt_odb_entries
> -> link_alt_odb_entry
>   -> read_info_alternates
> -> link_alt_odb_entries
> 
> That is why we need to convert all these functions at the same time.

This conversion looks good.

> 
> Signed-off-by: Jonathan Nieder 
> Signed-off-by: Stefan Beller 
> Signed-off-by: Junio C Hamano 
> Signed-off-by: Nguyễn Thái Ngọc Duy 
> ---
>  object-store.h |  4 
>  sha1_file.c| 36 
>  2 files changed, 20 insertions(+), 20 deletions(-)
> 
> diff --git a/object-store.h b/object-store.h
> index 3d0f8a87cb..20ba136c1f 100644
> --- a/object-store.h
> +++ b/object-store.h
> @@ -18,6 +18,10 @@ struct alternate_object_database {
>   char loose_objects_subdir_seen[256];
>   struct oid_array loose_objects_cache;
>  
> + /*
> +  * Path to the alternative object store. If this is a relative path,
> +  * it is relative to the current working directory.
> +  */
>   char path[FLEX_ARRAY];
>  };
>  #define prepare_alt_odb(r) prepare_alt_odb_##r()
> diff --git a/sha1_file.c b/sha1_file.c
> index 027e0f3741..f34eb69e39 100644
> --- a/sha1_file.c
> +++ b/sha1_file.c
> @@ -390,11 +390,10 @@ static int alt_odb_usable(struct raw_object_store *o,
>   * SHA1, an extra slash for the first level indirection, and the
>   * terminating NUL.
>   */
> -#define read_info_alternates(r, rb, d) read_info_alternates_##r(rb, d)
> -static void read_info_alternates_the_repository(const char *relative_base,
> - int depth);
> -#define link_alt_odb_entry(r, e, rb, d, n) link_alt_odb_entry_##r(e, rb, d, 
> n)
> -static int link_alt_odb_entry_the_repository(const char *entry,
> +static void read_info_alternates(struct repository *r,
> +  const char *relative_base,
> +  int depth);
> +static int link_alt_odb_entry(struct repository *r, const char *entry,
>   const char *relative_base, int depth, const char *normalized_objdir)
>  {
>   struct alternate_object_database *ent;
> @@ -420,7 +419,7 @@ static int link_alt_odb_entry_the_repository(const char 
> *entry,
>   while (pathbuf.len && pathbuf.buf[pathbuf.len - 1] == '/')
>   strbuf_setlen(, pathbuf.len - 1);
>  
> - if (!alt_odb_usable(_repository->objects, , 
> normalized_objdir)) {
> + if (!alt_odb_usable(>objects, , normalized_objdir)) {
>   strbuf_release();
>   return -1;
>   }
> @@ -428,12 +427,12 @@ static int link_alt_odb_entry_the_repository(const char 
> *entry,
>   ent = alloc_alt_odb(pathbuf.buf);
>  
>   /* add the alternate entry */
> - *the_repository->objects.alt_odb_tail = ent;
> - the_repository->objects.alt_odb_tail = &(ent->next);
> + *r->objects.alt_odb_tail = ent;
> + r->objects.alt_odb_tail = &(ent->next);
>   ent->next = NULL;
>  
>   /* recursively add alternates */
> - read_info_alternates(the_repository, pathbuf.buf, depth + 1);
> + read_info_alternates(r, pathbuf.buf, depth + 1);
>  
>   strbuf_release();
>   return 0;
> @@ -468,12 +467,8 @@ static const char *parse_alt_odb_entry(const char 
> *string,
>   return end;
>  }
>  
> -#define link_alt_odb_entries(r, a, s, rb, d) \
> - link_alt_odb_entries_##r(a, s, rb, d)
> -static void link_alt_odb_entries_the_repository(const char *alt,
> - int sep,
> - const char *relative_base,
> - int depth)
> +static void link_alt_odb_entries(struct repository *r, const char *alt,
> +  int sep, const char *relative_base, int depth)
>  {
>   struct strbuf objdirbuf = STRBUF_INIT;
>   struct strbuf entry = STRBUF_INIT;
> @@ -487,7 +482,7 @@ static void link_alt_odb_entries_the_repository(const 
> char *alt,
>   return;
>   }
>  
> - strbuf_add_absolute_path(, get_object_directory());
> + strbuf_add_absolute_path(, r->objects.objectdir);
>   if (strbuf_normalize_path() < 0)
>   die("unable to normalize object directory: %s",
>   objdirbuf.buf);
> @@ -496,15 +491,16 @@ static void link_alt_odb_entries_the_repository(const 
> char *alt,
>   alt = parse_alt_odb_entry(alt, sep, );
>   if (!entry.len)
>   continue;
> - 

Re: [PATCH 13/44] pack: move approximate object count to object store

2018-03-21 Thread Brandon Williams
On 03/04, Duy Nguyen wrote:
> On Sun, Mar 4, 2018 at 9:47 AM, Eric Sunshine  wrote:
> > On Sat, Mar 3, 2018 at 6:36 AM, Nguyễn Thái Ngọc Duy  
> > wrote:
> >> The approximate_object_count() function maintains a rough count of
> >> objects in a repository to estimate how long object name abbreviates
> >> should be.  Object names are scoped to a repository and the
> >> appropriate length may differ by repository, so the object count
> >> should not be global.
> >>
> >> Signed-off-by: Stefan Beller 
> >> Signed-off-by: Jonathan Nieder 
> >> Signed-off-by: Nguyễn Thái Ngọc Duy 
> >> ---
> >> diff --git a/packfile.c b/packfile.c
> >> @@ -813,8 +811,8 @@ static int approximate_object_count_valid;
> >>  unsigned long approximate_object_count(void)
> >>  {
> >> -   static unsigned long count;
> >> -   if (!approximate_object_count_valid) {
> >> +   if (!the_repository->objects.approximate_object_count_valid) {
> >> +   unsigned long count;
> >> struct packed_git *p;
> >>
> >> prepare_packed_git();
> >> @@ -824,8 +822,9 @@ unsigned long approximate_object_count(void)
> >> continue;
> >> count += p->num_objects;
> >> }
> >> +   the_repository->objects.approximate_object_count = count;
> >> }
> >> -   return count;
> >> +   return the_repository->objects.approximate_object_count;
> >>  }
> >> @@ -900,7 +899,7 @@ void prepare_packed_git(void)
> >>  void reprepare_packed_git(void)
> >>  {
> >> -   approximate_object_count_valid = 0;
> >> +   the_repository->objects.approximate_object_count_valid = 0;
> >
> > Not an issue specific to this patch, but where, how, when does
> > 'approximate_object_count_valid' ever get set to anything other than
> > 0? Even in the existing code (without this patch), there doesn't seem
> > to be anyplace which sets this to a non-zero value. Am I missing
> > something obvious (or subtle)?
> 
> Probably related to this
> https://public-inbox.org/git/20180226085508.ga30...@sigill.intra.peff.net/#t

Yeah as far as doing a conversion this should be fine, we can come by
and clean it up at a later point.

-- 
Brandon Williams


Re: [PATCH 12/44] pack: move prepare_packed_git_run_once to object store

2018-03-21 Thread Brandon Williams
On 03/03, Nguyễn Thái Ngọc Duy wrote:
> From: Stefan Beller 
> 
> Each repository's object store can be initialized independently, so
> they must not share a run_once variable.

Looks good.

> 
> Signed-off-by: Stefan Beller 
> Signed-off-by: Jonathan Nieder 
> Signed-off-by: Nguyễn Thái Ngọc Duy 
> ---
>  object-store.h | 6 ++
>  packfile.c | 7 +++
>  2 files changed, 9 insertions(+), 4 deletions(-)
> 
> diff --git a/object-store.h b/object-store.h
> index 1f3e66a3b8..b954396615 100644
> --- a/object-store.h
> +++ b/object-store.h
> @@ -92,6 +92,12 @@ struct raw_object_store {
>  
>   struct alternate_object_database *alt_odb_list;
>   struct alternate_object_database **alt_odb_tail;
> +
> + /*
> +  * Whether packed_git has already been populated with this repository's
> +  * packs.
> +  */
> + unsigned packed_git_initialized : 1;
>  };
>  
>  void raw_object_store_init(struct raw_object_store *o);
> diff --git a/packfile.c b/packfile.c
> index 1e38334ba2..caeab0f68c 100644
> --- a/packfile.c
> +++ b/packfile.c
> @@ -883,12 +883,11 @@ static void prepare_packed_git_mru(void)
>   list_add_tail(>mru, _repository->objects.packed_git_mru);
>  }
>  
> -static int prepare_packed_git_run_once = 0;
>  void prepare_packed_git(void)
>  {
>   struct alternate_object_database *alt;
>  
> - if (prepare_packed_git_run_once)
> + if (the_repository->objects.packed_git_initialized)
>   return;
>   prepare_packed_git_one(get_object_directory(), 1);
>   prepare_alt_odb();
> @@ -896,13 +895,13 @@ void prepare_packed_git(void)
>   prepare_packed_git_one(alt->path, 0);
>   rearrange_packed_git();
>   prepare_packed_git_mru();
> - prepare_packed_git_run_once = 1;
> + the_repository->objects.packed_git_initialized = 1;
>  }
>  
>  void reprepare_packed_git(void)
>  {
>   approximate_object_count_valid = 0;
> - prepare_packed_git_run_once = 0;
> + the_repository->objects.packed_git_initialized = 0;
>   prepare_packed_git();
>  }
>  
> -- 
> 2.16.1.435.g8f24da2e1a
> 

-- 
Brandon Williams


Re: [PATCH 11/44] object-store: close all packs upon clearing the object store

2018-03-21 Thread Brandon Williams
On 03/03, Nguyễn Thái Ngọc Duy wrote:
> From: Stefan Beller 
> 
> Signed-off-by: Stefan Beller 
> Signed-off-by: Junio C Hamano 
> Signed-off-by: Nguyễn Thái Ngọc Duy 

Looks good.

> ---
>  builtin/am.c   | 2 +-
>  builtin/clone.c| 2 +-
>  builtin/fetch.c| 2 +-
>  builtin/merge.c| 2 +-
>  builtin/receive-pack.c | 2 +-
>  object.c   | 7 +++
>  packfile.c | 4 ++--
>  packfile.h | 2 +-
>  8 files changed, 11 insertions(+), 12 deletions(-)
> 
> diff --git a/builtin/am.c b/builtin/am.c
> index 5bdd2d7578..4762a702e3 100644
> --- a/builtin/am.c
> +++ b/builtin/am.c
> @@ -1859,7 +1859,7 @@ static void am_run(struct am_state *state, int resume)
>*/
>   if (!state->rebasing) {
>   am_destroy(state);
> - close_all_packs();
> + close_all_packs(_repository->objects);
>   run_command_v_opt(argv_gc_auto, RUN_GIT_CMD);
>   }
>  }
> diff --git a/builtin/clone.c b/builtin/clone.c
> index 101c27a593..13cfaa6488 100644
> --- a/builtin/clone.c
> +++ b/builtin/clone.c
> @@ -1217,7 +1217,7 @@ int cmd_clone(int argc, const char **argv, const char 
> *prefix)
>   transport_disconnect(transport);
>  
>   if (option_dissociate) {
> - close_all_packs();
> + close_all_packs(_repository->objects);
>   dissociate_from_references();
>   }
>  
> diff --git a/builtin/fetch.c b/builtin/fetch.c
> index 8ee998ea2e..4d72efca78 100644
> --- a/builtin/fetch.c
> +++ b/builtin/fetch.c
> @@ -1478,7 +1478,7 @@ int cmd_fetch(int argc, const char **argv, const char 
> *prefix)
>  
>   string_list_clear(, 0);
>  
> - close_all_packs();
> + close_all_packs(_repository->objects);
>  
>   argv_array_pushl(_gc_auto, "gc", "--auto", NULL);
>   if (verbosity < 0)
> diff --git a/builtin/merge.c b/builtin/merge.c
> index 30264cfd7c..907ae44ab5 100644
> --- a/builtin/merge.c
> +++ b/builtin/merge.c
> @@ -411,7 +411,7 @@ static void finish(struct commit *head_commit,
>* We ignore errors in 'gc --auto', since the
>* user should see them.
>*/
> - close_all_packs();
> + close_all_packs(_repository->objects);
>   run_command_v_opt(argv_gc_auto, RUN_GIT_CMD);
>   }
>   }
> diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
> index b7ce7c7f52..ac478b7b99 100644
> --- a/builtin/receive-pack.c
> +++ b/builtin/receive-pack.c
> @@ -2026,7 +2026,7 @@ int cmd_receive_pack(int argc, const char **argv, const 
> char *prefix)
>   proc.git_cmd = 1;
>   proc.argv = argv_gc_auto;
>  
> - close_all_packs();
> + close_all_packs(_repository->objects);
>   if (!start_command()) {
>   if (use_sideband)
>   copy_to_sideband(proc.err, -1, NULL);
> diff --git a/object.c b/object.c
> index 83be6b1ecb..60ca76b285 100644
> --- a/object.c
> +++ b/object.c
> @@ -4,6 +4,7 @@
>  #include "tree.h"
>  #include "commit.h"
>  #include "tag.h"
> +#include "packfile.h"
>  
>  static struct object **obj_hash;
>  static int nr_objs, obj_hash_size;
> @@ -475,8 +476,6 @@ void raw_object_store_clear(struct raw_object_store *o)
>   o->alt_odb_tail = NULL;
>  
>   INIT_LIST_HEAD(>packed_git_mru);
> - /*
> -  * TODO: call close_all_packs once migrated to
> -  * take an object store argument
> -  */
> + close_all_packs(o);
> + o->packed_git = NULL;
>  }
> diff --git a/packfile.c b/packfile.c
> index d3c4a12ae0..1e38334ba2 100644
> --- a/packfile.c
> +++ b/packfile.c
> @@ -310,11 +310,11 @@ static void close_pack(struct packed_git *p)
>   close_pack_index(p);
>  }
>  
> -void close_all_packs(void)
> +void close_all_packs(struct raw_object_store *o)
>  {
>   struct packed_git *p;
>  
> - for (p = the_repository->objects.packed_git; p; p = p->next)
> + for (p = o->packed_git; p; p = p->next)
>   if (p->do_not_close)
>   die("BUG: want to close pack marked 'do-not-close'");
>   else
> diff --git a/packfile.h b/packfile.h
> index 76496226bb..5b1ce00f84 100644
> --- a/packfile.h
> +++ b/packfile.h
> @@ -66,7 +66,7 @@ extern void close_pack_index(struct packed_git *);
>  
>  extern unsigned char *use_pack(struct packed_git *, struct pack_window **, 
> off_t, unsigned long *);
>  extern void close_pack_windows(struct packed_git *);
> -extern void close_all_packs(void);
> +extern void close_all_packs(struct raw_object_store *o);
>  extern void unuse_pack(struct pack_window **);
>  extern void clear_delta_base_cache(void);
>  extern struct packed_git *add_packed_git(const char *path, size_t path_len, 
> int local);
> -- 
> 

Re: [PATCH 10/44] object-store: move packed_git and packed_git_mru to object store

2018-03-21 Thread Brandon Williams
On 03/03, Nguyễn Thái Ngọc Duy wrote:
> From: Stefan Beller 
> 
> In a process with multiple repositories open, packfile accessors
> should be associated to a single repository and not shared globally.
> Move packed_git and packed_git_mru into the_repository and adjust
> callers to reflect this.
> 
> [nd: while at there, wrap access to these two fields in get_packed_git()
> and get_packed_git_mru(). This allows us to lazily initialize these
> fields without caller doing that explicitly]
> 
> Patch generated by
> 
>  1. Moving the struct packed_git declaration to object-store.h
> and packed_git, packed_git_mru globals to struct object_store.
> 
>  2. Apply the following semantic patch to adjust callers:
> @@ @@
> - packed_git
> + the_repository->objects.packed_git
> 
> @@ @@
> - packed_git_mru
> + the_repository->objects.packed_git_mru

As Jonathan mentioned, this should probably be taken out of the commit
message because it doesn't reflect what the code actually does.  What it
actually does took me a second to figure out.  You're marking packed_git
as "private"...well C has no notion of private vs public fields in a
struct so it might be difficult to keep that convention, it also took me
a second to realize that it was only in the scope of packfile.c where it
was ok to reference it directly.  Maybe it'll be ok?  If we really
wanted something to be private we'd need it to be an opaque type
instead, which may be out of the scope of this code refactor.

> 
>  3. Applying line wrapping fixes from "make style" to break the
> resulting long lines.
> 
>  4. Adding missing #includes of repository.h where needed.
> 
>  5. As the packfiles are now owned by an objectstore, which is ephemeral
> unlike globals, we introduce memory leaks. So address them in
> raw_object_store_clear(). Defer freeing packed_git to the next
> patch due to the size of this patch.
> 
> Signed-off-by: Stefan Beller 
> Signed-off-by: Jonathan Nieder 
> Signed-off-by: Nguyễn Thái Ngọc Duy 
> ---
>  builtin/count-objects.c  |  5 ++--
>  builtin/fsck.c   |  6 +++--
>  builtin/gc.c |  3 ++-
>  builtin/pack-objects.c   | 20 
>  builtin/pack-redundant.c |  5 ++--
>  cache.h  | 29 
>  fast-import.c|  7 --
>  http-backend.c   |  5 ++--
>  object-store.h   | 28 +++
>  object.c |  7 ++
>  pack-bitmap.c|  3 ++-
>  packfile.c   | 49 
>  packfile.h   |  3 +++
>  server-info.c|  5 ++--
>  sha1_name.c  |  5 ++--
>  15 files changed, 107 insertions(+), 73 deletions(-)
> 
> diff --git a/builtin/count-objects.c b/builtin/count-objects.c
> index 33343818c8..5c7c3c6ae3 100644
> --- a/builtin/count-objects.c
> +++ b/builtin/count-objects.c
> @@ -7,6 +7,7 @@
>  #include "cache.h"
>  #include "config.h"
>  #include "dir.h"
> +#include "repository.h"
>  #include "builtin.h"
>  #include "parse-options.h"
>  #include "quote.h"
> @@ -120,9 +121,9 @@ int cmd_count_objects(int argc, const char **argv, const 
> char *prefix)
>   struct strbuf loose_buf = STRBUF_INIT;
>   struct strbuf pack_buf = STRBUF_INIT;
>   struct strbuf garbage_buf = STRBUF_INIT;
> - if (!packed_git)
> + if (!get_packed_git(the_repository))
>   prepare_packed_git();
> - for (p = packed_git; p; p = p->next) {
> + for (p = get_packed_git(the_repository); p; p = p->next) {
>   if (!p->pack_local)
>   continue;
>   if (open_pack_index(p))
> diff --git a/builtin/fsck.c b/builtin/fsck.c
> index b284a3a74e..7aca9699f6 100644
> --- a/builtin/fsck.c
> +++ b/builtin/fsck.c
> @@ -729,7 +729,8 @@ int cmd_fsck(int argc, const char **argv, const char 
> *prefix)
>   prepare_packed_git();
>  
>   if (show_progress) {
> - for (p = packed_git; p; p = p->next) {
> + for (p = get_packed_git(the_repository); p;
> +  p = p->next) {
>   if (open_pack_index(p))
>   continue;
>   total += p->num_objects;
> @@ -737,7 +738,8 @@ int cmd_fsck(int argc, const char **argv, const char 
> *prefix)
>  
>   progress = start_progress(_("Checking 
> objects"), total);
>   }
> - for (p = packed_git; p; p = p->next) {
> + for (p = get_packed_git(the_repository); p;
> +  p = p->next) {
>   /* verify gives error messages itself 

Re: [PATCH 09/44] object-store: free alt_odb_list

2018-03-21 Thread Brandon Williams
On 03/03, Eric Sunshine wrote:
> On Sat, Mar 3, 2018 at 6:36 AM, Nguyễn Thái Ngọc Duy  
> wrote:
> > Free the memory and reset alt_odb_{list, tail} to NULL.
> >
> > Signed-off-by: Stefan Beller 
> > Signed-off-by: Nguyễn Thái Ngọc Duy 
> > ---
> > diff --git a/object.c b/object.c
> > @@ -450,8 +450,26 @@ void raw_object_store_init(struct raw_object_store *o)
> > +static void free_alt_odb(struct alternate_object_database *alt)
> > +{
> > +   strbuf_release(>scratch);
> > +   oid_array_clear(>loose_objects_cache);
> > +}
> 
> This doesn't free the 'struct alternate_object_database' entry itself, right?
> 
> Is that intentional? Isn't the idea that this should free the entries too?

Yep it definitely should free the entry itself.  Should be fixed easy
enough by freeing the list entry after grabbing the next entry

> 
> > +static void free_alt_odbs(struct raw_object_store *o)
> > +{
> > +   while (o->alt_odb_list) {

struct alternate_object_database old = o->alt_odb_list;

> > +   free_alt_odb(o->alt_odb_list);
> > +   o->alt_odb_list = o->alt_odb_list->next;

free(old);

> > +   }
> > +}
> 
> Accessing an entry's 'next' member after invoking free_alt_odb() works
> because the entry itself hasn't been freed (as noted above).
> 
> Is leaking the entries themselves intentional?
> 
> >  void raw_object_store_clear(struct raw_object_store *o)
> >  {
> > FREE_AND_NULL(o->objectdir);
> > FREE_AND_NULL(o->alternate_db);
> > +
> > +   free_alt_odbs(o);
> > +   o->alt_odb_tail = NULL;
> >  }
> 
> The commit message talks about freeing memory and resetting
> alt_odb_list and alt_odb_tail, but this code only resets alt_odb_tail.

-- 
Brandon Williams


Re: [PATCH] stash: drop superfluos pathspec parameter

2018-03-21 Thread Junio C Hamano
Thomas Gummerer  writes:

> On 03/21, Thomas Gummerer wrote:
>>
>> Argh I just noticed we could drop the "$@" here, as this is no longer
>> the pathspec case.  It doesn't hurt anything, except it may be a bit
>> confusing when reading the code.
>> 
>> Although if we end up implementing 'git checkout --index ',
>> we'd have to add it back, but we do have a test covering this case, so
>> there's no worries about forgetting to add it back.
>
> Here's a patch for that.  Not sure it's worth doing, but since we're
> already touching the area, this may be a good cleanup.

OK, I can go either way, and since you have already written a
change, let's not waste it ;-)

Thanks.

>
> This is based on top of tg/stash-untracked-with-pathspec-fix.
>
> --- >8 ---
> Subject: [PATCH] stash: drop superfluos pathspec parameter
>
> Since 833622a945 ("stash push: avoid printing errors", 2018-03-19) we
> don't use the 'git clean' call for the pathspec case anymore.  The
> commit however forgot to remove the pathspec argument to the call.
> Remove the superfluos argument to make the code a little more obvious.
>
> Signed-off-by: Thomas Gummerer 
> ---
>  git-stash.sh | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/git-stash.sh b/git-stash.sh
> index 4e55f278bd..d31924aea3 100755
> --- a/git-stash.sh
> +++ b/git-stash.sh
> @@ -310,7 +310,7 @@ push_stash () {
>   test "$untracked" = "all" && CLEAN_X_OPTION=-x || 
> CLEAN_X_OPTION=
>   if test -n "$untracked" && test $# = 0
>   then
> - git clean --force --quiet -d $CLEAN_X_OPTION -- "$@"
> + git clean --force --quiet -d $CLEAN_X_OPTION
>   fi
>  
>   if test $# != 0


Re: [PATCH 06/44] repository: introduce raw object store field

2018-03-21 Thread Brandon Williams
On 03/03, Nguyễn Thái Ngọc Duy wrote:
> From: Stefan Beller 
> 
> The raw object store field will contain any objects needed for
> access to objects in a given repository.
> 
> This patch introduces the raw object store and populates it with the
> `objectdir`, which used to be part of the repository struct.
> 
> As the struct gains members, we'll also populate the function to clear
> the memory for these members.
> 
> In a later we'll introduce a struct object_parser, that will complement
> the object parsing in a repository struct: The raw object parser is the
> layer that will provide access to raw object content, while the higher
> level object parser code will parse raw objects and keeps track of
> parenthood and other object relationships using 'struct object'.
> For now only add the lower level to the repository struct.

Patch looks good. Only nit would be that I prefer the embedded struct to
be a pointer simply because I don't have to worry about putting '&' at
some points.  I know others probably feel the opposite so its not worth
changing for me :)

> 
> Signed-off-by: Stefan Beller 
> Signed-off-by: Jonathan Nieder 
> Signed-off-by: Nguyễn Thái Ngọc Duy 
> ---
>  builtin/grep.c |  2 +-
>  environment.c  |  5 +++--
>  object-store.h | 18 ++
>  object.c   | 10 ++
>  path.c |  2 +-
>  repository.c   | 14 +-
>  repository.h   | 10 --
>  sha1_file.c|  3 ++-
>  8 files changed, 48 insertions(+), 16 deletions(-)
>  create mode 100644 object-store.h
> 
> diff --git a/builtin/grep.c b/builtin/grep.c
> index 3ca4ac80d8..0f0c195705 100644
> --- a/builtin/grep.c
> +++ b/builtin/grep.c
> @@ -432,7 +432,7 @@ static int grep_submodule(struct grep_opt *opt, struct 
> repository *superproject,
>* object.
>*/
>   grep_read_lock();
> - add_to_alternates_memory(submodule.objectdir);
> + add_to_alternates_memory(submodule.objects.objectdir);
>   grep_read_unlock();
>  
>   if (oid) {
> diff --git a/environment.c b/environment.c
> index a5eaa97fb1..c05705e384 100644
> --- a/environment.c
> +++ b/environment.c
> @@ -14,6 +14,7 @@
>  #include "fmt-merge-msg.h"
>  #include "commit.h"
>  #include "argv-array.h"
> +#include "object-store.h"
>  
>  int trust_executable_bit = 1;
>  int trust_ctime = 1;
> @@ -270,9 +271,9 @@ const char *get_git_work_tree(void)
>  
>  char *get_object_directory(void)
>  {
> - if (!the_repository->objectdir)
> + if (!the_repository->objects.objectdir)
>   BUG("git environment hasn't been setup");
> - return the_repository->objectdir;
> + return the_repository->objects.objectdir;
>  }
>  
>  int odb_mkstemp(struct strbuf *template, const char *pattern)
> diff --git a/object-store.h b/object-store.h
> new file mode 100644
> index 00..69bb5ac065
> --- /dev/null
> +++ b/object-store.h
> @@ -0,0 +1,18 @@
> +#ifndef OBJECT_STORE_H
> +#define OBJECT_STORE_H
> +
> +struct raw_object_store {
> + /*
> +  * Path to the repository's object store.
> +  * Cannot be NULL after initialization.
> +  */
> + char *objectdir;
> +
> + /* Path to extra alternate object database if not NULL */
> + char *alternate_db;
> +};
> +
> +void raw_object_store_init(struct raw_object_store *o);
> +void raw_object_store_clear(struct raw_object_store *o);
> +
> +#endif /* OBJECT_STORE_H */
> diff --git a/object.c b/object.c
> index 9e6f9ff20b..e91711dd56 100644
> --- a/object.c
> +++ b/object.c
> @@ -445,3 +445,13 @@ void clear_commit_marks_all(unsigned int flags)
>   obj->flags &= ~flags;
>   }
>  }
> +
> +void raw_object_store_init(struct raw_object_store *o)
> +{
> + memset(o, 0, sizeof(*o));
> +}
> +void raw_object_store_clear(struct raw_object_store *o)
> +{
> + FREE_AND_NULL(o->objectdir);
> + FREE_AND_NULL(o->alternate_db);
> +}
> diff --git a/path.c b/path.c
> index da8b655730..81a42d9115 100644
> --- a/path.c
> +++ b/path.c
> @@ -382,7 +382,7 @@ static void adjust_git_path(const struct repository *repo,
>   strbuf_splice(buf, 0, buf->len,
> repo->index_file, strlen(repo->index_file));
>   else if (dir_prefix(base, "objects"))
> - replace_dir(buf, git_dir_len + 7, repo->objectdir);
> + replace_dir(buf, git_dir_len + 7, repo->objects.objectdir);
>   else if (git_hooks_path && dir_prefix(base, "hooks"))
>   replace_dir(buf, git_dir_len + 5, git_hooks_path);
>   else if (repo->different_commondir)
> diff --git a/repository.c b/repository.c
> index 62f52f47fc..34c0a7f180 100644
> --- a/repository.c
> +++ b/repository.c
> @@ -1,5 +1,6 @@
>  #include "cache.h"
>  #include "repository.h"
> +#include "object-store.h"
>  #include "config.h"
>  #include "submodule-config.h"
>  
> @@ -12,6 +13,7 @@ void initialize_the_repository(void)
>   the_repository = _repo;
>  
>

Re: [PATCH 05/44] repository: delete ignore_env member

2018-03-21 Thread Brandon Williams
On 03/03, Nguyễn Thái Ngọc Duy wrote:
> This variable was added because the repo_set_gitdir() was created to
> cover both submodule and main repos, but these two are initialized a
> bit differently so ignore_env == 0 means main repo, while ignore_env
> != 0 is submodules.
> 
> Since the difference part (env variables) has been moved out of
> repo_set_gitdir(), this function works the same way for both repo
> types and ignore_env is not needed anymore.
> 
> Signed-off-by: Nguyễn Thái Ngọc Duy 

Sweet! Thanks for getting rid of this.

> ---
>  repository.c | 2 --
>  repository.h | 9 -
>  2 files changed, 11 deletions(-)
> 
> diff --git a/repository.c b/repository.c
> index 04d85a2869..62f52f47fc 100644
> --- a/repository.c
> +++ b/repository.c
> @@ -140,8 +140,6 @@ static int repo_init(struct repository *repo,
>   struct repository_format format;
>   memset(repo, 0, sizeof(*repo));
>  
> - repo->ignore_env = 1;
> -
>   if (repo_init_gitdir(repo, gitdir))
>   goto error;
>  
> diff --git a/repository.h b/repository.h
> index 2bfbf762f3..e7127baffb 100644
> --- a/repository.h
> +++ b/repository.h
> @@ -75,15 +75,6 @@ struct repository {
>   const struct git_hash_algo *hash_algo;
>  
>   /* Configurations */
> - /*
> -  * Bit used during initialization to indicate if repository state (like
> -  * the location of the 'objectdir') should be read from the
> -  * environment.  By default this bit will be set at the begining of
> -  * 'repo_init()' so that all repositories will ignore the environment.
> -  * The exception to this is 'the_repository', which doesn't go through
> -  * the normal 'repo_init()' process.
> -  */
> - unsigned ignore_env:1;
>  
>   /* Indicate if a repository has a different 'commondir' from 'gitdir' */
>   unsigned different_commondir:1;
> -- 
> 2.16.1.435.g8f24da2e1a
> 

-- 
Brandon Williams


Re: [PATCH 04/44] sha1_file.c: move delayed getenv(altdb) back to setup_git_env()

2018-03-21 Thread Brandon Williams
On 03/03, Nguyễn Thái Ngọc Duy wrote:
> getenv() is supposed to work on the main repository only. This delayed
> getenv() code in sha1_file.c makes it more difficult to convert
> sha1_file.c to a generic object store that could be used by both
> submodule and main repositories.
> 
> Move the getenv() back in setup_git_env() where other env vars are
> also fetched.
> 
> Signed-off-by: Nguyễn Thái Ngọc Duy 

Looks good

> ---
>  environment.c | 1 +
>  repository.c  | 3 +++
>  repository.h  | 4 
>  sha1_file.c   | 6 +-
>  4 files changed, 9 insertions(+), 5 deletions(-)
> 
> diff --git a/environment.c b/environment.c
> index da3f7daa09..a5eaa97fb1 100644
> --- a/environment.c
> +++ b/environment.c
> @@ -174,6 +174,7 @@ void setup_git_env(const char *git_dir)
>   args.object_dir = getenv_safe(_free, DB_ENVIRONMENT);
>   args.graft_file = getenv_safe(_free, GRAFT_ENVIRONMENT);
>   args.index_file = getenv_safe(_free, INDEX_ENVIRONMENT);
> + args.alternate_db = getenv_safe(_free, ALTERNATE_DB_ENVIRONMENT);
>   repo_set_gitdir(the_repository, git_dir, );
>   argv_array_clear(_free);
>  
> diff --git a/repository.c b/repository.c
> index e65f4138a7..04d85a2869 100644
> --- a/repository.c
> +++ b/repository.c
> @@ -60,6 +60,8 @@ void repo_set_gitdir(struct repository *repo,
>   repo_set_commondir(repo, o->commondir);
>   expand_base_dir(>objectdir, o->object_dir,
>   repo->commondir, "objects");
> + free(repo->alternate_db);
> + repo->alternate_db = xstrdup_or_null(o->alternate_db);
>   expand_base_dir(>graft_file, o->graft_file,
>   repo->commondir, "info/grafts");
>   expand_base_dir(>index_file, o->index_file,
> @@ -215,6 +217,7 @@ void repo_clear(struct repository *repo)
>   FREE_AND_NULL(repo->gitdir);
>   FREE_AND_NULL(repo->commondir);
>   FREE_AND_NULL(repo->objectdir);
> + FREE_AND_NULL(repo->alternate_db);
>   FREE_AND_NULL(repo->graft_file);
>   FREE_AND_NULL(repo->index_file);
>   FREE_AND_NULL(repo->worktree);
> diff --git a/repository.h b/repository.h
> index 84aeac2825..2bfbf762f3 100644
> --- a/repository.h
> +++ b/repository.h
> @@ -26,6 +26,9 @@ struct repository {
>*/
>   char *objectdir;
>  
> + /* Path to extra alternate object database if not NULL */
> + char *alternate_db;
> +
>   /*
>* Path to the repository's graft file.
>* Cannot be NULL after initialization.
> @@ -93,6 +96,7 @@ struct set_gitdir_args {
>   const char *object_dir;
>   const char *graft_file;
>   const char *index_file;
> + const char *alternate_db;
>  };
>  
>  extern void repo_set_gitdir(struct repository *repo,
> diff --git a/sha1_file.c b/sha1_file.c
> index 831d9e7343..4af422e3cf 100644
> --- a/sha1_file.c
> +++ b/sha1_file.c
> @@ -667,15 +667,11 @@ int foreach_alt_odb(alt_odb_fn fn, void *cb)
>  
>  void prepare_alt_odb(void)
>  {
> - const char *alt;
> -
>   if (alt_odb_tail)
>   return;
>  
> - alt = getenv(ALTERNATE_DB_ENVIRONMENT);
> -
>   alt_odb_tail = _odb_list;
> - link_alt_odb_entries(alt, PATH_SEP, NULL, 0);
> + link_alt_odb_entries(the_repository->alternate_db, PATH_SEP, NULL, 0);
>  
>   read_info_alternates(get_object_directory(), 0);
>  }
> -- 
> 2.16.1.435.g8f24da2e1a
> 

-- 
Brandon Williams


Re: [PATCH 03/44] repository.c: delete dead functions

2018-03-21 Thread Brandon Williams
On 03/03, Nguyễn Thái Ngọc Duy wrote:
> Signed-off-by: Nguyễn Thái Ngọc Duy 
> Signed-off-by: Junio C Hamano 
> ---
>  repository.c | 25 -
>  1 file changed, 25 deletions(-)
> 
> diff --git a/repository.c b/repository.c
> index bb53b54b6d..e65f4138a7 100644
> --- a/repository.c
> +++ b/repository.c
> @@ -15,31 +15,6 @@ void initialize_the_repository(void)
>   repo_set_hash_algo(_repo, GIT_HASH_SHA1);
>  }
>  
> -static char *git_path_from_env(const char *envvar, const char *git_dir,
> -const char *path, int fromenv)
> -{
> - if (fromenv) {
> - const char *value = getenv(envvar);
> - if (value)
> - return xstrdup(value);
> - }
> -
> - return xstrfmt("%s/%s", git_dir, path);
> -}
> -
> -static int find_common_dir(struct strbuf *sb, const char *gitdir, int 
> fromenv)
> -{
> - if (fromenv) {
> - const char *value = getenv(GIT_COMMON_DIR_ENVIRONMENT);
> - if (value) {
> - strbuf_addstr(sb, value);
> - return 1;
> - }
> - }
> -
> - return get_common_dir_noenv(sb, gitdir);
> -}
> -
>  static void expand_base_dir(char **out, const char *in,
>   const char *base_dir, const char *def_in)
>  {
> -- 
> 2.16.1.435.g8f24da2e1a
> 

This patch needs to be squashed into the previous one, the build breaks
otherwise (unless built with -Werror=unused-function).

-- 
Brandon Williams


Re: [PATCH v5 2/3] stash push: avoid printing errors

2018-03-21 Thread Junio C Hamano
Thomas Gummerer  writes:

>> > diff --git a/git-stash.sh b/git-stash.sh
>> > index 4c92ec931f..5e06f96da5 100755
>> > --- a/git-stash.sh
>> > +++ b/git-stash.sh
>> > @@ -308,14 +308,16 @@ push_stash () {
>> >if test -z "$patch_mode"
>> >then
>> >test "$untracked" = "all" && CLEAN_X_OPTION=-x || 
>> > CLEAN_X_OPTION=
>> > -  if test -n "$untracked"
>> > +  if test -n "$untracked" && test $# = 0
>> >then
>> >git clean --force --quiet -d $CLEAN_X_OPTION -- "$@"
>
> Argh I just noticed we could drop the "$@" here, as this is no longer
> the pathspec case.  It doesn't hurt anything, except it may be a bit
> confusing when reading the code.

Yes, we could, but we do not have to.  I think it is fine to leave
it in especially if we envision that the codepaths for two cases
will be unified in the future.


Re: [PATCH 02/44] repository.c: move env-related setup code back to environment.c

2018-03-21 Thread Brandon Williams
On 03/19, Duy Nguyen wrote:
> On Mon, Mar 19, 2018 at 7:07 PM, Jonathan Tan  
> wrote:
> >> -extern void repo_set_gitdir(struct repository *repo, const char *path);
> >> +struct set_gitdir_args {
> >> + const char *commondir;
> >> + const char *object_dir;
> >> + const char *graft_file;
> >> + const char *index_file;
> >> +};
> >> +
> >> +extern void repo_set_gitdir(struct repository *repo,
> >> + const char *root,
> >> + const struct set_gitdir_args *optional);
> >
> > Optional: Reading this header file alone makes me think that the 3rd
> > argument can be NULL, but it actually can't. I would name it
> > "extra_args" and add a comment inside "struct set_gitdir_args"
> > explaining (e.g.):
> >
> >   /*
> >* Any of these fields may be NULL. If so, the name of that directory
> >* is instead derived from the root path of the repository.
> >*/
> 
> Yeah. I think Eric made the same comment. I'm still (very slowly) in
> the process of unifying the repo setup for the main repo and
> submodules, which hopefully may kill this function or replace it with
> something better. But it's too early to tell. Since this part of the
> series has landed in 'next', I'll post a fixup patch soon with your
> suggestion.
> -- 
> Duy

Yeah looking at the patch this is probably my only complaint.  The rest
looked good.

-- 
Brandon Williams


Re: [PATCH v2] filter-branch: use printf instead of echo -e

2018-03-21 Thread Johannes Schindelin
Hi Michele,

On Mon, 19 Mar 2018, Michele Locati wrote:

> [...]
> -- 
> 2.16.2.windows.1

Yay!

Out of curiosity: did the CONTRIBUTING.md file help that was recently
turned into a guide how to contribute to Git (for Windows) by Derrick
Stolee?

Ciao,
Johannes


[PATCH] stash: drop superfluos pathspec parameter (was: Re: [PATCH v5 2/3] stash push: avoid printing errors)

2018-03-21 Thread Thomas Gummerer
On 03/21, Thomas Gummerer wrote:
>
> Argh I just noticed we could drop the "$@" here, as this is no longer
> the pathspec case.  It doesn't hurt anything, except it may be a bit
> confusing when reading the code.
> 
> Although if we end up implementing 'git checkout --index ',
> we'd have to add it back, but we do have a test covering this case, so
> there's no worries about forgetting to add it back.

Here's a patch for that.  Not sure it's worth doing, but since we're
already touching the area, this may be a good cleanup.

This is based on top of tg/stash-untracked-with-pathspec-fix.

--- >8 ---
Subject: [PATCH] stash: drop superfluos pathspec parameter

Since 833622a945 ("stash push: avoid printing errors", 2018-03-19) we
don't use the 'git clean' call for the pathspec case anymore.  The
commit however forgot to remove the pathspec argument to the call.
Remove the superfluos argument to make the code a little more obvious.

Signed-off-by: Thomas Gummerer 
---
 git-stash.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/git-stash.sh b/git-stash.sh
index 4e55f278bd..d31924aea3 100755
--- a/git-stash.sh
+++ b/git-stash.sh
@@ -310,7 +310,7 @@ push_stash () {
test "$untracked" = "all" && CLEAN_X_OPTION=-x || 
CLEAN_X_OPTION=
if test -n "$untracked" && test $# = 0
then
-   git clean --force --quiet -d $CLEAN_X_OPTION -- "$@"
+   git clean --force --quiet -d $CLEAN_X_OPTION
fi
 
if test $# != 0
-- 
2.15.1.33.g3165b24a68



Re: [RFC PATCH v2 1/1] rebase-interactive: Add git_rebase__interactive__preserve_merges

2018-03-21 Thread Wink Saville
On Wed, Mar 21, 2018 at 12:42 PM, Junio C Hamano  wrote:
> Wink Saville  writes:
>
>> Refactor git_rebase__interactive__preserve_merges out of
>> git_rebase__interactive resulting in fewer conditionals making
>> both routines are simpler.
>>
>> Changed run_specific_rebase in git-rebase to dispatch to the appropriate
>> function after sourcing git-rebase--interactive.
>> ---
>
> I think this will become (more) reviewable if it is split into two
> patches (at least).  A preliminary patch that does the style changes
> and line wrapping (left below) as step #1, and all the rest on top
> as step #2.

Yes, I will break this into several commits

>
>> diff --git a/git-rebase--interactive.sh b/git-rebase--interactive.sh
>> index 331c8dfea..65ff75654 100644
>> --- a/git-rebase--interactive.sh
>> +++ b/git-rebase--interactive.sh
>> @@ -1,5 +1,7 @@
>> -# This shell script fragment is sourced by git-rebase to implement
>> -# its interactive mode.  "git rebase --interactive" makes it easy
>> +#!/bin/sh
>
> Addition of #! implies that this might be invoked as the top-level
> script; is that the case now?  I did not get such an impression from
> the proposed log message and it is still always dot-sourced (in
> which case #! gives a wrong signal to the readers).

Will remove adding the shebang

>> +# This shell script fragment is sourced by git-rebase--interactive
>> +# and git-rebase--interactive--preserve-merges in support of the
>> +# interactive mode.  "git rebase --interactive" makes it easy
>>  # to fix up commits in the middle of a series and rearrange commits.
>>  #
>>  # Copyright (c) 2006 Johannes E. Schindelin
>> @@ -7,6 +9,7 @@
>>  # The original idea comes from Eric W. Biederman, in
>>  # https://public-inbox.org/git/m1odwkyuf5.fsf...@ebiederm.dsl.xmission.com/
>>  #
>> +
>>  # The file containing rebase commands, comments, and empty lines.
>
> Why?

Will remove the blank line.

>> @@ -274,7 +290,8 @@ pick_one () {
>>
>>   case "$1" in -n) sha1=$2; ff= ;; *) sha1=$1 ;; esac
>>   case "$force_rebase" in '') ;; ?*) ff= ;; esac
>> - output git rev-parse --verify $sha1 || die "$(eval_gettext "Invalid 
>> commit name: \$sha1")"
>> + output git rev-parse --verify $sha1 ||
>> + die "$(eval_gettext "Invalid commit name: \$sha1")"
>
> Just linewrapping.

Will be leaving for now

>
>> @@ -287,8 +304,8 @@ pick_one () {
>>   ${gpg_sign_opt:+$(git rev-parse --sq-quote 
>> "$gpg_sign_opt")} \
>>   "$strategy_args" $empty_args $ff "$@"
>>
>> - # If cherry-pick dies it leaves the to-be-picked commit unrecorded. 
>> Reschedule
>> - # previous task so this commit is not lost.
>> + # If cherry-pick dies it leaves the to-be-picked commit unrecorded.
>> + # Reschedule previous task so this commit is not lost.
>
> Ditto.

Will be leaving for now

>
>> @@ -307,17 +324,15 @@ pick_one_preserving_merges () {
>>   esac
>>   sha1=$(git rev-parse $sha1)
>>
>> - if test -f "$state_dir"/current-commit
>> + if test -f "$state_dir"/current-commit && test "$fast_forward" = t
>>   then
>> - if test "$fast_forward" = t
>> - then
>> - while read current_commit
>> - do
>> - git rev-parse HEAD > 
>> "$rewritten"/$current_commit
>> - done <"$state_dir"/current-commit
>> - rm "$state_dir"/current-commit ||
>> - die "$(gettext "Cannot write current commit's 
>> replacement sha1")"
>> - fi
>> + while read current_commit
>> + do
>> + git rev-parse HEAD > "$rewritten"/$current_commit
>> + done <"$state_dir"/current-commit
>> + rm "$state_dir"/current-commit ||
>> + die "$(gettext \
>> + "Cannot write current commit's replacement sha1")"
>>   fi
>
> Good code simplification that turns
>
> if A
> if B
> do this thing
> fi
> fi
>
> into
>
> if A & B
> do this thing
> fi

Will be keeping this in a future commit

>
>> @@ -337,10 +352,10 @@ pick_one_preserving_merges () {
>>   if test -f "$rewritten"/$p
>>   then
>>   new_p=$(cat "$rewritten"/$p)
>> -
>> - # If the todo reordered commits, and our parent is 
>> marked for
>> - # rewriting, but hasn't been gotten to yet, assume the 
>> user meant to
>> - # drop it on top of the current HEAD
>> + # If the todo reordered commits, and our parent is
>> + # marked for rewriting, but hasn't been gotten to yet,
>> + # assume the user meant to drop it on top of the
>> + # current HEAD
>
> Just line wrapping.

Will be 

Re: [PATCH 01/44] repository: initialize the_repository in main()

2018-03-21 Thread Brandon Williams
On 03/03, Nguyễn Thái Ngọc Duy wrote:
> This simplifies initialization of struct repository and anything
> inside. Easier to read. Easier to add/remove fields.
> 
> Everything will go through main() common-main.c so this should cover all
> programs, including t/helper.
> 
> Signed-off-by: Nguyễn Thái Ngọc Duy 
> ---
>  common-main.c |  2 ++
>  repository.c  | 18 +-
>  repository.h  |  2 +-
>  3 files changed, 16 insertions(+), 6 deletions(-)
> 
> diff --git a/common-main.c b/common-main.c
> index 6a689007e7..7d716d5a54 100644
> --- a/common-main.c
> +++ b/common-main.c
> @@ -34,6 +34,8 @@ int main(int argc, const char **argv)
>  
>   git_setup_gettext();
>  
> + initialize_the_repository();
> +
>   attr_start();
>  
>   git_extract_argv0_path(argv[0]);
> diff --git a/repository.c b/repository.c
> index 4ffbe9bc94..0eddf28fcd 100644
> --- a/repository.c
> +++ b/repository.c
> @@ -4,10 +4,16 @@
>  #include "submodule-config.h"
>  
>  /* The main repository */
> -static struct repository the_repo = {
> - NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, _index, 
> _algos[GIT_HASH_SHA1], 0, 0
> -};
> -struct repository *the_repository = _repo;
> +static struct repository the_repo;
> +struct repository *the_repository;
> +
> +void initialize_the_repository(void)
> +{
> + the_repository = _repo;
> +
> + the_repo.index = _index;
> + repo_set_hash_algo(_repo, GIT_HASH_SHA1);
> +}

Nice, one place to do all the crazy initialization for the_repo.  I
wanted to do something like this when I introduced the repository but I
shied away from it because it didn't seem like the right thing to do at
the time.  Now that its grown a bit, its definitely the right move until
using the repository object is ubiquitous enough that the setup code can
produce a repository struct instead of relying on a global one.

>  
>  static char *git_path_from_env(const char *envvar, const char *git_dir,
>  const char *path, int fromenv)
> @@ -128,7 +134,9 @@ static int read_and_verify_repository_format(struct 
> repository_format *format,
>   * Initialize 'repo' based on the provided 'gitdir'.
>   * Return 0 upon success and a non-zero value upon failure.
>   */
> -int repo_init(struct repository *repo, const char *gitdir, const char 
> *worktree)
> +static int repo_init(struct repository *repo,
> +  const char *gitdir,
> +  const char *worktree)
>  {
>   struct repository_format format;
>   memset(repo, 0, sizeof(*repo));
> diff --git a/repository.h b/repository.h
> index 0329e40c7f..40c1c81bdc 100644
> --- a/repository.h
> +++ b/repository.h
> @@ -91,7 +91,7 @@ extern struct repository *the_repository;
>  extern void repo_set_gitdir(struct repository *repo, const char *path);
>  extern void repo_set_worktree(struct repository *repo, const char *path);
>  extern void repo_set_hash_algo(struct repository *repo, int algo);
> -extern int repo_init(struct repository *repo, const char *gitdir, const char 
> *worktree);

I had to double check that this was ok, it indeed is as repo_init is
only used in repository.c.

> +extern void initialize_the_repository(void);
>  extern int repo_submodule_init(struct repository *submodule,
>  struct repository *superproject,
>  const char *path);
> -- 
> 2.16.1.435.g8f24da2e1a
> 

-- 
Brandon Williams


Re: [PATCH v5 2/3] stash push: avoid printing errors

2018-03-21 Thread Thomas Gummerer
On 03/20, Junio C Hamano wrote:
> Thomas Gummerer  writes:
> 
> > ...
> > Fix this by avoiding the 'git clean' if a pathspec is given, and use the
> > pipeline that's used for pathspec mode to get rid of the untracked files
> > as well.
> 
> That made me wonder if we can get rid of 'git clean' altogether by
> pretending that we saw a pathspec '.' that match everything when no
> pathspec is given---that way we only have to worry about a single
> codepath.
> 
> But I guess doing it this way can minimize potential damage.  Those
> who do not use pathspec when running "git stash" won't be affected
> even if this change had some bugs ;-)

Heh yeah, we found enough bugs in this code so far, so it's probably
best to leave the part that's working alone at least for now.

> > diff --git a/git-stash.sh b/git-stash.sh
> > index 4c92ec931f..5e06f96da5 100755
> > --- a/git-stash.sh
> > +++ b/git-stash.sh
> > @@ -308,14 +308,16 @@ push_stash () {
> > if test -z "$patch_mode"
> > then
> > test "$untracked" = "all" && CLEAN_X_OPTION=-x || 
> > CLEAN_X_OPTION=
> > -   if test -n "$untracked"
> > +   if test -n "$untracked" && test $# = 0
> > then
> > git clean --force --quiet -d $CLEAN_X_OPTION -- "$@"

Argh I just noticed we could drop the "$@" here, as this is no longer
the pathspec case.  It doesn't hurt anything, except it may be a bit
confusing when reading the code.

Although if we end up implementing 'git checkout --index ',
we'd have to add it back, but we do have a test covering this case, so
there's no worries about forgetting to add it back.

> > fi
> >  
> > if test $# != 0
> > then
> > -   git add -u -- "$@"
> > +   test -z "$untracked" && UPDATE_OPTION="-u" || 
> > UPDATE_OPTION=
> > +   test "$untracked" = "all" && FORCE_OPTION="--force" || 
> > FORCE_OPTION=
> > +   git add $UPDATE_OPTION $FORCE_OPTION -- "$@"
> > git diff-index -p --cached --binary HEAD -- "$@" |
> > git apply --index -R
> > else
> 
> Thanks, I'll take the change as-is.
> 
> I however wonder if we restructure the code to
> 
>   if test $# = 0
>   then
>   # no pathspec
>   if test -n "$untracked"
>   then
>   git clean --force --quiet -d $CLEAN_OPTION -- "$@"
>   fi
>   git reset --hard -q
>   else
>   test -z "$untracked" && UPDATE=-u || UPDATE=
>   test "$untracked" = all && FORCE=--force || FORCE=
>   git add $UPDATE $FORCE-- "$@"
>   git diff-index -p --cached --binary HEAD -- "$@" |
>   git apply --index -R
>   fi
> 
> it becomes easier to understand what is going on.

I like that code structure more than what I have now.  I see you
already merged what I had to next, and I like keeping the change small
now that we're in the rc period (assuming you want to get this into
2.17?)  Maybe we can restructure the code as a separate cleanup once
2.17 is out, so this has more time to cook in master and hopefully
we'd notice regressions before the next release?

> That way, once we have a plumbing command to help the else clause of
> the above, i.e. "git checkout --index  -- "
> [*1*], then we can lose the if/then/else and rewrite the whole "we
> have created stash, so it's time to get rid of local modifications
> to the paths that match the pathspec" code to:
> 
>   if test "$untracked"
>   then
>   git clean --force --quiet -d $CLEAN_OPTION -- "$@"
>   fi
>   git checkout --index HEAD -- "$@"

Yeah, this would be nice to have.  I wanted to have a look at what it
would take to implement 'git checkout --{cached,index}', but I'm not
familiar with the checkout code at all, so it will probably be a while
until I can get around to do it. 

> [Footnote]
> cf. https://public-inbox.org/git/xmqq4loqplou@gitster.mtv.corp.google.com/
> 
> What we want in the case in the code in question when there is
> pathspec is "match the index entries and the working tree files to
> those that appear in a given tree for paths that match the given
> pathspec".  This is close to "git checkout  -- "
> but not quite.  Current "git checkout  -- " is
> an overlay operation in that paths that match  but do not
> exist in  are *NOT* removed from the working tree.  We
> obviously cannot change the behaviour of the command.
> 
> But we can add an option to ask for the new behaviour.  In general,
> for an operation that affects the index and the working tree, we can
> have "--cached" mode that affects only the index without touching
> the working tree, and "--index" mode that affects both.
> 
> "git reset  -- ", which is a UI mistake, is
> better spelled "git checkout --cached  -- ".  We
> take paths that match  from  and stuff into the
> index, and remove 

Re: [PATCH v2] json_writer: new routines to create data in JSON format

2018-03-21 Thread Junio C Hamano
g...@jeffhostetler.com writes:

> From: Jeff Hostetler 
>
> Add basic routines to generate data in JSON format.

And the point of having capability to write JSON data in our
codebase is...?

> diff --git a/json-writer.c b/json-writer.c
> new file mode 100644
> index 000..89a6abb
> --- /dev/null
> +++ b/json-writer.c
> @@ -0,0 +1,321 @@
> +#include "cache.h"
> +#include "json-writer.h"
> +
> +static char g_ch_open[2]  = { '{', '[' };
> +static char g_ch_close[2] = { '}', ']' };

What's "g_" prefix?

> +
> +/*
> + * Append JSON-quoted version of the given string to 'out'.
> + */
> +static void append_quoted_string(struct strbuf *out, const char *in)
> +{
> + strbuf_addch(out, '"');
> + for (/**/; *in; in++) {
> + unsigned char c = (unsigned char)*in;

It is clear enough to lose /**/, i.e.

for (; *in; in++) {

but for this one. I wonder if

unsigned char c;
strbuf_addch(out, '"');
while ((c = *in++) != '\0') {
...

is easier to follow, though.

> +static inline void begin(struct json_writer *jw, int is_array)
> +{
> + ALLOC_GROW(jw->levels, jw->nr + 1, jw->alloc);
> +
> + jw->levels[jw->nr].level_is_array = !!is_array;
> + jw->levels[jw->nr].level_is_empty = 1;

An element of this array is a struct that represents a level, and
everybody who accesses an element of that type knows it is talking
about a level by the field that has the array being named as
.levels[] (also [*1*]).  In such a context, it is a bit too loud to
name the fields with level_$blah.  IOW,

struct json_writer_level
{
unsigned is_array : 1;
unsigned is_empty : 1;
};

> +struct json_writer_level
> +{
> + unsigned level_is_array : 1;
> + unsigned level_is_empty : 1;
> +};
> +
> +struct json_writer
> +{
> + struct json_writer_level *levels;
> + int nr, alloc;
> + struct strbuf json;
> +};

[Footnote]

*1* I personally prefer to call an array of things "thing[]", not
"things[]", because then you can refer to an individual element
e.g. "thing[4]" and read it as "the fourth thing".

Unless the code often treats an array as a whole, that is, in
which case, things[] is OK as you'll be calling the whole thing
with the plural name (e.g. call that function and give all the
things by passing things[]).

In this case, one level instance is an element of a stack, and
the code would be accessing one level at a time most of the
time, so "writer.level[4].is_empty" would read more naturally
than "writer.levels[4].level_is_empty".




Re: [PATCH] completion: add option completion for most builtin commands

2018-03-21 Thread Junio C Hamano
Nguyễn Thái Ngọc Duy   writes:

> These commands can take options and use parse-options so it's quite
> easy to allow option completion. This does not pollute the command
> name completion though. "git " will show you the same set as
> before. This only kicks in when you type the correct command name.
>
> Some other builtin commands are not still added because either they
> don't use parse-options, or they are deprecated, or they are those
> -helper commands that are used to move some logic back in C for
> sh-based commands.
>
> Signed-off-by: Nguyễn Thái Ngọc Duy 
> ---
>  contrib/completion/git-completion.bash | 276 +
>  1 file changed, 276 insertions(+)

Wow, just wow.  It however looks a lot of boilerplates, e.g. looking
at these, we notice ...

> +_git_blame() {
> + case "$cur" in
> + --*)
> + __gitcomp_builtin blame
> + return
> + ;;
> + esac
> +}
> +
>  
> +_git_cat_file() {
> + case "$cur" in
> + --*)
> + __gitcomp_builtin cat-file
> + return
> + ;;
> + esac
> +}
> +
> +_git_check_attr() {
> + case "$cur" in
> + --*)
> + __gitcomp_builtin check-attr
> + return
> + ;;
> + esac
> +}

... the only thing we need for the above three is a table that says
"use blame for blame, cat-file for cat_file, and check-attr for
check_attr".

And that pattern repeats throughout the patch.  I wonder if we can
express the same a lot more concisely by updating the caller that
calls these command specific helpers?



Re: [PATCH v5 4/6] ref-filter: change parsing function error handling

2018-03-21 Thread Junio C Hamano
Olga Telezhnaya  writes:

> @@ -2144,13 +2151,15 @@ int format_ref_array_item(struct ref_array_item *info,
>  
>   for (cp = format->format; *cp && (sp = find_next(cp)); cp = ep + 1) {
>   struct atom_value *atomv;
> + int pos;
>  
>   ep = strchr(sp, ')');
>   if (cp < sp)
>   append_literal(cp, sp, );
> - get_ref_atom_value(info,
> -parse_ref_filter_atom(format, sp + 2, ep),
> -);
> + pos = parse_ref_filter_atom(format, sp + 2, ep, error_buf);
> + if (pos < 0)
> + return -1;
> + get_ref_atom_value(info, pos, );
>   if (atomv->handler(atomv, , error_buf))
>   return -1;
>   }

These error returns leave the formatting state "state" on the stack
holding onto its resources, no?

The only thing the caller of format_ref_array_item() that notices an
error return does is to die even after this series, so in that sense
it does not matter (yet), but it still feels somewhat wrong.




Re: [PATCH v5 1/6] strbuf: add shortcut to work with error messages

2018-03-21 Thread Junio C Hamano
Olga Telezhnaya  writes:

> Add function strbuf_error() that helps to save few lines of code.
> Function expands fmt with placeholders, append resulting error message
> to strbuf *err, and return error code ret.
>
> Signed-off-by: Olga Telezhnaia 
> ---
>  strbuf.h | 13 +
>  1 file changed, 13 insertions(+)
>
> diff --git a/strbuf.h b/strbuf.h
> index e6cae5f4398c8..fa66d4835f1a7 100644
> --- a/strbuf.h
> +++ b/strbuf.h
> @@ -620,4 +620,17 @@ char *xstrvfmt(const char *fmt, va_list ap);
>  __attribute__((format (printf, 1, 2)))
>  char *xstrfmt(const char *fmt, ...);
>  
> +/*
> + * Expand error message, append it to strbuf *err, then return error code 
> ret.
> + * Allow to save few lines of code.
> + */
> +static inline int strbuf_error(struct strbuf *err, int ret, const char *fmt, 
> ...)
> +{

With this function, err does not have to be an error message, and
ret does not have to be negative.  Hence strbuf_error() is a wrong
name for the wrapper.

It somewhat is bothersome to see that this is inlined; if it is
meant for error codepath, it probably shouldn't have to be.

> + va_list ap;
> + va_start(ap, fmt);
> + strbuf_vaddf(err, fmt, ap);
> + va_end(ap);
> + return ret;
> +}
> +
>  #endif /* STRBUF_H */

Quite honestly, I am not sure if it is worth to be in strbuf.h; it
feels a bit too specific to the immediate need for these five
patches and nowhere else.  Are there many existing calls to
strbuf_addf() immediately followed by an "return" of an integer,
that can be simplified by using this helper?  I see quite a many
instances of addf() soon followed by "return -1" in
refs/files-backend.c, but they are not immediately adjacent to each
other, and won't be helped.


Re: [PATCH] filter-branch: consider refs can refer to an object other than commit or tag

2018-03-21 Thread Junio C Hamano
Yuki Kokubun  writes:

>> Yuki Kokubun  writes:
>> 
>> > "git filter-branch -- --all" can be confused when refs that refer to 
>> > objects
>> > other than commits or tags exists.
>> > Because "git rev-parse --all" that is internally used can return refs that
>> > refer to an object other than commit or tag. But it is not considered in 
>> > the
>> > phase of updating refs.
>> 
>> Could you describe what the consequence of that is?  We have a ref
>> that points directly at a blob object, or a ref that points at a tag
>> object that points at a blob object.  The current code leaves both of
>> these refs in "$tempdir/heads".  Then...?
>
> Sorry, this is my wrong.
> I wrongly thought only refs/replace can point at a blob or tree object.

No need to be sorry.  You still need to describe what (bad things)
happen if we do not filter out refs that do not point at committish
in the proposed log message.  

IOW, can you elaborate and clarify your "can be confused" at the
beginning?


Re: [PATCH v4 5/5] ref-filter: get_ref_atom_value() error handling

2018-03-21 Thread Eric Sunshine
On Wed, Mar 21, 2018 at 3:30 PM, Junio C Hamano  wrote:
> Eric Sunshine  writes:
>> strbuf_error() was a possibility proposed in [1], and it does take a
>> strbuf. Failure to pass in a strbuf here is just a typo.
>
> I've seen it; I just thought it was a joke and not a serious
> suggestion.
> A macro or helper function that is local to the file might be OK,

My thought all along was that this convenience helper (in whatever
form) should start life local to ref-filter.c, and only be published
more widely if found to be generally useful. Unfortunately, I forgot
to state so explicitly when writing the review. Suggesting the name
strbuf_error() didn't help to convey that thought either. My bad.

> but I do not think "strbuf_error()" is a useful abstraction that is
> generic enough in the first place (the questions to ask yourself to
> think about it are: Why should it be limited to return -1?  Why
> should it be limited to always do the addf() to a strbuf?).

There is some precedent in the existing error() function. As with
error(), as a _convenience_ function, it does not necessarily have to
be universally general. That it simplifies a reasonably large body of
code may be justification enough, despite it shortcomings. I don't
feel strongly about it, though, and, as noted above, agree that it can
be local to ref-filter.c.


[ANNOUNCE] Git v2.17.0-rc1

2018-03-21 Thread Junio C Hamano
A release candidate Git v2.17.0-rc1 is now available for testing
at the usual places.  It is comprised of 493 non-merge commits
since v2.16.0, contributed by 62 people, 19 of which are new faces.

The tarballs are found at:

https://www.kernel.org/pub/software/scm/git/testing/

The following public repositories all have a copy of the
'v2.17.0-rc1' tag and the 'master' branch that the tag points at:

  url = https://kernel.googlesource.com/pub/scm/git/git
  url = git://repo.or.cz/alt-git.git
  url = https://github.com/gitster/git

New contributors whose contributions weren't in v2.16.0 are as follows.
Welcome to the Git development community!

  Adam Borowski, Alban Gruin, Andreas G. Schacker, Bernhard
  M. Wiedemann, Christian Ludwig, Gargi Sharma, Genki Sky,
  Gregory Herrero, Jon Simons, Juan F. Codagnone, Kim Gybels,
  Lucas Werkmeister, Mathias Rav, Michele Locati, Motoki Seki,
  Stefan Moch, Stephen R Guglielmo, Tatyana Krasnukha, and Thomas
  Levesque.

Returning contributors who helped this release are as follows.
Thanks for your continued support.

  Ævar Arnfjörð Bjarmason, Alexander Shopov, Alex Bennée, Ben
  Peart, Brandon Williams, brian m. carlson, Christian Couder,
  Daniel Knittl-Frank, David Pursehouse, Derrick Stolee, Elijah
  Newren, Eric Sunshine, Eric Wong, Jason Merrill, Jeff Hostetler,
  Jeff King, Johannes Schindelin, Jonathan Nieder, Jonathan Tan,
  Junio C Hamano, Kaartic Sivaraam, Mårten Kongstad, Martin
  Ågren, Matthieu Moy, Michael Haggerty, Nathan Payre, Nguyễn
  Thái Ngọc Duy, Nicolas Morey-Chaisemartin, Olga Telezhnaya,
  Patryk Obara, Phillip Wood, Prathamesh Chavan, Ramsay Jones,
  Randall S. Becker, Rasmus Villemoes, René Scharfe, Robert
  P. J. Day, Stefan Beller, SZEDER Gábor, Thomas Gummerer,
  Todd Zullinger, Torsten Bögershausen, and Yasushi SHOJI.



Git 2.17 Release Notes (draft)
==

Updates since v2.16
---

UI, Workflows & Features

 * "diff" family of commands learned "--find-object=" option
   to limit the findings to changes that involve the named object.

 * "git format-patch" learned to give 72-cols to diffstat, which is
   consistent with other line length limits the subcommand uses for
   its output meant for e-mails.

 * The log from "git daemon" can be redirected with a new option; one
   relevant use case is to send the log to standard error (instead of
   syslog) when running it from inetd.

 * "git rebase" learned to take "--allow-empty-message" option.

 * "git am" has learned the "--quit" option, in addition to the
   existing "--abort" option; having the pair mirrors a few other
   commands like "rebase" and "cherry-pick".

 * "git worktree add" learned to run the post-checkout hook, just like
   "git clone" runs it upon the initial checkout.

 * "git tag" learned an explicit "--edit" option that allows the
   message given via "-m" and "-F" to be further edited.

 * "git fetch --prune-tags" may be used as a handy short-hand for
   getting rid of stale tags that are locally held.

 * The new "--show-current-patch" option gives an end-user facing way
   to get the diff being applied when "git rebase" (and "git am")
   stops with a conflict.

 * "git add -p" used to offer "/" (look for a matching hunk) as a
   choice, even there was only one hunk, which has been corrected.
   Also the single-key help is now given only for keys that are
   enabled (e.g. help for '/' won't be shown when there is only one
   hunk).

 * Since Git 1.7.9, "git merge" defaulted to --no-ff (i.e. even when
   the side branch being merged is a descendant of the current commit,
   create a merge commit instead of fast-forwarding) when merging a
   tag object.  This was appropriate default for integrators who pull
   signed tags from their downstream contributors, but caused an
   unnecessary merges when used by downstream contributors who
   habitually "catch up" their topic branches with tagged releases
   from the upstream.  Update "git merge" to default to --no-ff only
   when merging a tag object that does *not* sit at its usual place in
   refs/tags/ hierarchy, and allow fast-forwarding otherwise, to
   mitigate the problem.

 * "git status" can spend a lot of cycles to compute the relation
   between the current branch and its upstream, which can now be
   disabled with "--no-ahead-behind" option.

 * "git diff" and friends learned funcname patterns for Go language
   source files.

 * "git send-email" learned "--reply-to=" option.

 * Funcname pattern used for C# now recognizes "async" keyword.

 * In a way similar to how "git tag" learned to honor the pager
   setting only in the list mode, "git config" learned to ignore the
   pager setting when it is used for setting values (i.e. when the
   purpose of the operation is not to "show").


Performance, Internal Implementation, Development Support etc.

 * More perf tests for threaded grep

 * "perf" test output 

Re: [PATCH] filter-branch: consider refs can refer to an object other than commit or tag

2018-03-21 Thread Yuki Kokubun
> Yuki Kokubun  writes:
> 
> > "git filter-branch -- --all" can be confused when refs that refer to objects
> > other than commits or tags exists.
> > Because "git rev-parse --all" that is internally used can return refs that
> > refer to an object other than commit or tag. But it is not considered in the
> > phase of updating refs.
> 
> Could you describe what the consequence of that is?  We have a ref
> that points directly at a blob object, or a ref that points at a tag
> object that points at a blob object.  The current code leaves both of
> these refs in "$tempdir/heads".  Then...?

Sorry, this is my wrong.
I wrongly thought only refs/replace can point at a blob or tree object.

> 
>   ... goes and looks ...
> 
> There is a loop that looks like this:
> 
>   while read ref
>   do
>   sha1=$(git rev-parse "$ref^0")
>   ...
>   done <"$tempdir/heads"
> 
> which would break on anything but a commit-ish.
> 
> >  # The refs should be updated if their heads were rewritten
> >  git rev-parse --no-flags --revs-only --symbolic-full-name \
> > -   --default HEAD "$@" > "$tempdir"/raw-heads || exit
> > +   --default HEAD "$@" > "$tempdir"/raw-objects || exit
> > +# refs/replace can refer to an object other than commit or tag
> 
> Mention of replace refs in the proposed log message gives an easy to
> understand example and is a good idea, but this in code comment does
> not have to single out the replace refs.  A tag can also point at an
> object with any type, e.g. "git tag v2.6.11-tree v2.6.11^{tree}"
> would make "refs/tags/v2.6.11-tree" point at the tree at the top
> level of the tree-ish "v2.6.11".  It probably is OK to drop this
> comment altogether.

OK, I'm gonna drop the incorrect comment.

> 
> > +while read ref
> > +do
> > +   type=$(git cat-file -t "$ref")
> > +   if test $type = commit || test $type = tag
> > +   then
> > +   echo "$ref"
> > +   fi
> > +done >"$tempdir"/raw-heads <"$tempdir"/raw-objects
> >  sed -e '/^^/d' "$tempdir"/raw-heads >"$tempdir"/heads
> 
> So... is the idea to limit the set of refs to be rewritten to those
> that point at commits and tags?  As I already alluded to, I do not
> think you want to accept a ref that points at any tag object---only
> the ones that point at a tag that points at a commit-ish, so that
> the code will not barf when doing "$ref^0".
> 
> So perhaps
> 
>   git rev-parse --no-flags ... >"$tempdir/raw-heads" || exit
> 
>   while read ref
>   do
>   case "$ref" in ^?*) continue ;; esac
>   if git rev-parse --verify "$ref^0" 2>/dev/null
> then
>   echo "$ref"
>   fi
>   done >"$tempdir/heads" <"$tempdir/raw-heads"
> 
> or something?  Note that you do not need the "sed" as the loop
> already excludes the negative revs.

I feel using "git rev-parse --verify" is a good way as you said.
I'm gonna modify the patch to use it.

> 
> >  test -s "$tempdir"/heads ||
> > diff --git a/t/t7003-filter-branch.sh b/t/t7003-filter-branch.sh
> > index 7cb60799b..efeaf5887 100755
> > --- a/t/t7003-filter-branch.sh
> > +++ b/t/t7003-filter-branch.sh
> > @@ -470,4 +470,17 @@ test_expect_success 'tree-filter deals with object 
> > name vs pathname ambiguity' '
> > git show HEAD:$ambiguous
> >  '
> >  
> > +test_expect_success 'rewrite repository including refs/replace that point 
> > to non commit object' '
> > +   test_when_finished "git reset --hard original" &&
> > +   tree=$(git rev-parse HEAD^{tree}) &&
> > +   test_when_finished "git replace -d $tree" &&
> > +   echo A >new &&
> > +   git add new &&
> > +   new_tree=$(git write-tree) &&
> > +   git replace $tree $new_tree &&
> 
> Perhaps something like this here:
> 
>   git tag -a "tag to a tree" treetag $new_tree &&
> 
> can tell su how well it works with a tag that points at a tree?

Sounds good. I'm gonna add such tags to the test case.

> 
> > +   git reset --hard HEAD &&
> > +   git filter-branch -f -- --all >filter-output 2>&1 &&
> > +   ! fgrep fatal filter-output
> > +'
> > +
> >  test_done


Re: [RFC PATCH v2 1/1] rebase-interactive: Add git_rebase__interactive__preserve_merges

2018-03-21 Thread Junio C Hamano
Wink Saville  writes:

> Refactor git_rebase__interactive__preserve_merges out of
> git_rebase__interactive resulting in fewer conditionals making
> both routines are simpler.
>
> Changed run_specific_rebase in git-rebase to dispatch to the appropriate
> function after sourcing git-rebase--interactive.
> ---

I think this will become (more) reviewable if it is split into two
patches (at least).  A preliminary patch that does the style changes
and line wrapping (left below) as step #1, and all the rest on top
as step #2.

> diff --git a/git-rebase--interactive.sh b/git-rebase--interactive.sh
> index 331c8dfea..65ff75654 100644
> --- a/git-rebase--interactive.sh
> +++ b/git-rebase--interactive.sh
> @@ -1,5 +1,7 @@
> -# This shell script fragment is sourced by git-rebase to implement
> -# its interactive mode.  "git rebase --interactive" makes it easy
> +#!/bin/sh

Addition of #! implies that this might be invoked as the top-level
script; is that the case now?  I did not get such an impression from
the proposed log message and it is still always dot-sourced (in
which case #! gives a wrong signal to the readers).

> +# This shell script fragment is sourced by git-rebase--interactive
> +# and git-rebase--interactive--preserve-merges in support of the
> +# interactive mode.  "git rebase --interactive" makes it easy
>  # to fix up commits in the middle of a series and rearrange commits.
>  #
>  # Copyright (c) 2006 Johannes E. Schindelin
> @@ -7,6 +9,7 @@
>  # The original idea comes from Eric W. Biederman, in
>  # https://public-inbox.org/git/m1odwkyuf5.fsf...@ebiederm.dsl.xmission.com/
>  #
> +
>  # The file containing rebase commands, comments, and empty lines.

Why?

> @@ -274,7 +290,8 @@ pick_one () {
>  
>   case "$1" in -n) sha1=$2; ff= ;; *) sha1=$1 ;; esac
>   case "$force_rebase" in '') ;; ?*) ff= ;; esac
> - output git rev-parse --verify $sha1 || die "$(eval_gettext "Invalid 
> commit name: \$sha1")"
> + output git rev-parse --verify $sha1 ||
> + die "$(eval_gettext "Invalid commit name: \$sha1")"

Just linewrapping.

> @@ -287,8 +304,8 @@ pick_one () {
>   ${gpg_sign_opt:+$(git rev-parse --sq-quote 
> "$gpg_sign_opt")} \
>   "$strategy_args" $empty_args $ff "$@"
>  
> - # If cherry-pick dies it leaves the to-be-picked commit unrecorded. 
> Reschedule
> - # previous task so this commit is not lost.
> + # If cherry-pick dies it leaves the to-be-picked commit unrecorded.
> + # Reschedule previous task so this commit is not lost.

Ditto.

> @@ -307,17 +324,15 @@ pick_one_preserving_merges () {
>   esac
>   sha1=$(git rev-parse $sha1)
>  
> - if test -f "$state_dir"/current-commit
> + if test -f "$state_dir"/current-commit && test "$fast_forward" = t
>   then
> - if test "$fast_forward" = t
> - then
> - while read current_commit
> - do
> - git rev-parse HEAD > 
> "$rewritten"/$current_commit
> - done <"$state_dir"/current-commit
> - rm "$state_dir"/current-commit ||
> - die "$(gettext "Cannot write current commit's 
> replacement sha1")"
> - fi
> + while read current_commit
> + do
> + git rev-parse HEAD > "$rewritten"/$current_commit
> + done <"$state_dir"/current-commit
> + rm "$state_dir"/current-commit ||
> + die "$(gettext \
> + "Cannot write current commit's replacement sha1")"
>   fi

Good code simplification that turns

if A
if B
do this thing
fi
fi

into

if A & B
do this thing
fi

> @@ -337,10 +352,10 @@ pick_one_preserving_merges () {
>   if test -f "$rewritten"/$p
>   then
>   new_p=$(cat "$rewritten"/$p)
> -
> - # If the todo reordered commits, and our parent is 
> marked for
> - # rewriting, but hasn't been gotten to yet, assume the 
> user meant to
> - # drop it on top of the current HEAD
> + # If the todo reordered commits, and our parent is
> + # marked for rewriting, but hasn't been gotten to yet,
> + # assume the user meant to drop it on top of the
> + # current HEAD

Just line wrapping.

> @@ -379,7 +394,7 @@ pick_one_preserving_merges () {
>   then
>   # detach HEAD to current parent
>   output git checkout $first_parent 2> /dev/null ||
> - die "$(eval_gettext "Cannot move HEAD to 
> \$first_parent")"
> +die "$(eval_gettext "Cannot move HEAD to 
> \$first_parent")"
>   fi

Ditto.

> @@ -553,7 

gitk takes a looong time for a subdir of the Gentoo repository

2018-03-21 Thread Toralf Förster
steps to reproduce:

$> git clone  git://git.gentoo.org/repo/gentoo.git
$> cd gentoo
$> gitk  sys-apps/portage-mgorny

For few minutes I do just read here : "Reading commits..."

-- 
Toralf
PGP C4EACDDE 0076E94E



signature.asc
Description: OpenPGP digital signature


[PATCH] completion: add option completion for most builtin commands

2018-03-21 Thread Nguyễn Thái Ngọc Duy
These commands can take options and use parse-options so it's quite
easy to allow option completion. This does not pollute the command
name completion though. "git " will show you the same set as
before. This only kicks in when you type the correct command name.

Some other builtin commands are not still added because either they
don't use parse-options, or they are deprecated, or they are those
-helper commands that are used to move some logic back in C for
sh-based commands.

Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 contrib/completion/git-completion.bash | 276 +
 1 file changed, 276 insertions(+)

diff --git a/contrib/completion/git-completion.bash 
b/contrib/completion/git-completion.bash
index 6da95b8095..0cd9180f48 100644
--- a/contrib/completion/git-completion.bash
+++ b/contrib/completion/git-completion.bash
@@ -1202,6 +1202,15 @@ _git_bisect ()
esac
 }
 
+_git_blame() {
+   case "$cur" in
+   --*)
+   __gitcomp_builtin blame
+   return
+   ;;
+   esac
+}
+
 _git_branch ()
 {
local i c=1 only_local_ref="n" has_r="n"
@@ -1254,6 +1263,42 @@ _git_bundle ()
esac
 }
 
+_git_cat_file() {
+   case "$cur" in
+   --*)
+   __gitcomp_builtin cat-file
+   return
+   ;;
+   esac
+}
+
+_git_check_attr() {
+   case "$cur" in
+   --*)
+   __gitcomp_builtin check-attr
+   return
+   ;;
+   esac
+}
+
+_git_check_ignore() {
+   case "$cur" in
+   --*)
+   __gitcomp_builtin check-ignore
+   return
+   ;;
+   esac
+}
+
+_git_check_mailmap() {
+   case "$cur" in
+   --*)
+   __gitcomp_builtin check-mailmap
+   return
+   ;;
+   esac
+}
+
 _git_checkout ()
 {
__git_has_doubledash && return
@@ -1278,6 +1323,15 @@ _git_checkout ()
esac
 }
 
+_git_checkout_index() {
+   case "$cur" in
+   --*)
+   __gitcomp_builtin checkout-index
+   return
+   ;;
+   esac
+}
+
 _git_cherry ()
 {
__git_complete_refs
@@ -1326,6 +1380,15 @@ _git_clone ()
esac
 }
 
+_git_column() {
+   case "$cur" in
+   --*)
+   __gitcomp_builtin column
+   return
+   ;;
+   esac
+}
+
 __git_untracked_file_modes="all no normal"
 
 _git_commit ()
@@ -1365,6 +1428,15 @@ _git_commit ()
fi
 }
 
+_git_count_objects() {
+   case "$cur" in
+   --*)
+   __gitcomp_builtin count-objects
+   return
+   ;;
+   esac
+}
+
 _git_describe ()
 {
case "$cur" in
@@ -1446,6 +1518,15 @@ _git_difftool ()
__git_complete_revlist_file
 }
 
+_git_fast_export() {
+   case "$cur" in
+   --*)
+   __gitcomp_builtin fast-export
+   return
+   ;;
+   esac
+}
+
 __git_fetch_recurse_submodules="yes on-demand no"
 
 _git_fetch ()
@@ -1573,6 +1654,15 @@ _git_grep ()
__git_complete_refs
 }
 
+_git_hash_object() {
+   case "$cur" in
+   --*)
+   __gitcomp_builtin hash-object
+   return
+   ;;
+   esac
+}
+
 _git_help ()
 {
case "$cur" in
@@ -1590,6 +1680,15 @@ _git_help ()
"
 }
 
+_git_index_pack() {
+   case "$cur" in
+   --*)
+   __gitcomp_builtin index-pack
+   return
+   ;;
+   esac
+}
+
 _git_init ()
 {
case "$cur" in
@@ -1606,6 +1705,15 @@ _git_init ()
esac
 }
 
+_git_interpret_trailers() {
+   case "$cur" in
+   --*)
+   __gitcomp_builtin interpret-trailers
+   return
+   ;;
+   esac
+}
+
 _git_ls_files ()
 {
case "$cur" in
@@ -1765,6 +1873,15 @@ _git_merge ()
__git_complete_refs
 }
 
+_git_merge_file() {
+   case "$cur" in
+   --*)
+   __gitcomp_builtin merge-file
+   return
+   ;;
+   esac
+}
+
 _git_mergetool ()
 {
case "$cur" in
@@ -1790,6 +1907,15 @@ _git_merge_base ()
__git_complete_refs
 }
 
+_git_mktree() {
+   case "$cur" in
+   --*)
+   __gitcomp_builtin mktree
+   return
+   ;;
+   esac
+}
+
 _git_mv ()
 {
case "$cur" in
@@ -1853,6 +1979,46 @@ _git_notes ()
esac
 }
 
+_git_pack_objects ()
+{
+   case "$cur" in
+   --*)
+   __gitcomp_builtin pack-objects
+   return
+   ;;
+   esac
+}
+
+_git_pack_refs ()
+{
+   case "$cur" in
+   --*)
+   __gitcomp_builtin pack-refs
+   return
+   ;;
+   esac
+}
+
+_git_prune ()
+{
+   case "$cur" in
+   --*)
+   __gitcomp_builtin prune
+   return
+   ;;
+   esac
+}
+
+_git_prune_packed ()
+{
+   case "$cur" in
+   --*)
+  

Re: [PATCH v4 5/5] ref-filter: get_ref_atom_value() error handling

2018-03-21 Thread Junio C Hamano
Eric Sunshine  writes:

>> I have no idea what strbuf_error() that does not take any strbuf is
>> doing,...
>
> strbuf_error() was a possibility proposed in [1], and it does take a
> strbuf. Failure to pass in a strbuf here is just a typo.

I've seen it; I just thought it was a joke and not a serious
suggestion.

A macro or helper function that is local to the file might be OK,
but I do not think "strbuf_error()" is a useful abstraction that is
generic enough in the first place (the questions to ask yourself to
think about it are: Why should it be limited to return -1?  Why
should it be limited to always do the addf() to a strbuf?).



[PATCH v2] json_writer: new routines to create data in JSON format

2018-03-21 Thread git
From: Jeff Hostetler 

Add basic routines to generate data in JSON format.

Signed-off-by: Jeff Hostetler 
---
 Makefile|   2 +
 json-writer.c   | 321 +
 json-writer.h   |  86 +
 t/helper/test-json-writer.c | 420 
 t/t0019-json-writer.sh  | 102 +++
 5 files changed, 931 insertions(+)
 create mode 100644 json-writer.c
 create mode 100644 json-writer.h
 create mode 100644 t/helper/test-json-writer.c
 create mode 100755 t/t0019-json-writer.sh

diff --git a/Makefile b/Makefile
index 1a9b23b..57f58e6 100644
--- a/Makefile
+++ b/Makefile
@@ -662,6 +662,7 @@ TEST_PROGRAMS_NEED_X += test-fake-ssh
 TEST_PROGRAMS_NEED_X += test-genrandom
 TEST_PROGRAMS_NEED_X += test-hashmap
 TEST_PROGRAMS_NEED_X += test-index-version
+TEST_PROGRAMS_NEED_X += test-json-writer
 TEST_PROGRAMS_NEED_X += test-lazy-init-name-hash
 TEST_PROGRAMS_NEED_X += test-line-buffer
 TEST_PROGRAMS_NEED_X += test-match-trees
@@ -815,6 +816,7 @@ LIB_OBJS += hashmap.o
 LIB_OBJS += help.o
 LIB_OBJS += hex.o
 LIB_OBJS += ident.o
+LIB_OBJS += json-writer.o
 LIB_OBJS += kwset.o
 LIB_OBJS += levenshtein.o
 LIB_OBJS += line-log.o
diff --git a/json-writer.c b/json-writer.c
new file mode 100644
index 000..89a6abb
--- /dev/null
+++ b/json-writer.c
@@ -0,0 +1,321 @@
+#include "cache.h"
+#include "json-writer.h"
+
+static char g_ch_open[2]  = { '{', '[' };
+static char g_ch_close[2] = { '}', ']' };
+
+/*
+ * Append JSON-quoted version of the given string to 'out'.
+ */
+static void append_quoted_string(struct strbuf *out, const char *in)
+{
+   strbuf_addch(out, '"');
+   for (/**/; *in; in++) {
+   unsigned char c = (unsigned char)*in;
+   if (c == '"')
+   strbuf_add(out, "\\\"", 2);
+   else if (c == '\\')
+   strbuf_add(out, "", 2);
+   else if (c == '\n')
+   strbuf_add(out, "\\n", 2);
+   else if (c == '\r')
+   strbuf_add(out, "\\r", 2);
+   else if (c == '\t')
+   strbuf_add(out, "\\t", 2);
+   else if (c == '\f')
+   strbuf_add(out, "\\f", 2);
+   else if (c == '\b')
+   strbuf_add(out, "\\b", 2);
+   else if (c < 0x20)
+   strbuf_addf(out, "\\u%04x", c);
+   else
+   strbuf_addch(out, c);
+   }
+   strbuf_addch(out, '"');
+}
+
+
+static inline void begin(struct json_writer *jw, int is_array)
+{
+   ALLOC_GROW(jw->levels, jw->nr + 1, jw->alloc);
+
+   jw->levels[jw->nr].level_is_array = !!is_array;
+   jw->levels[jw->nr].level_is_empty = 1;
+
+   strbuf_addch(>json, g_ch_open[!!is_array]);
+
+   jw->nr++;
+}
+
+/*
+ * Assert that we have an open object at this level.
+ */
+static void inline assert_in_object(const struct json_writer *jw, const char 
*key)
+{
+   if (!jw->nr)
+   BUG("object: missing jw_object_begin(): '%s'", key);
+   if (jw->levels[jw->nr - 1].level_is_array)
+   BUG("object: not in object: '%s'", key);
+}
+
+/*
+ * Assert that we have an open array at this level.
+ */
+static void inline assert_in_array(const struct json_writer *jw)
+{
+   if (!jw->nr)
+   BUG("array: missing jw_begin()");
+   if (!jw->levels[jw->nr - 1].level_is_array)
+   BUG("array: not in array");
+}
+
+/*
+ * Add comma if we have already seen a member at this level.
+ */
+static void inline maybe_add_comma(struct json_writer *jw)
+{
+   if (jw->levels[jw->nr - 1].level_is_empty)
+   jw->levels[jw->nr - 1].level_is_empty = 0;
+   else
+   strbuf_addch(>json, ',');
+}
+
+/*
+ * Assert that the given JSON object or JSON array has been properly
+ * terminated.  (Has closing bracket.)
+ */
+static void inline assert_is_terminated(const struct json_writer *jw)
+{
+   if (jw->nr)
+   BUG("object: missing jw_end(): '%s'", jw->json.buf);
+}
+
+void jw_object_begin(struct json_writer *jw)
+{
+   begin(jw, 0);
+}
+
+void jw_object_string(struct json_writer *jw, const char *key, const char 
*value)
+{
+   assert_in_object(jw, key);
+   maybe_add_comma(jw);
+
+   append_quoted_string(>json, key);
+   strbuf_addch(>json, ':');
+   append_quoted_string(>json, value);
+}
+
+void jw_object_int(struct json_writer *jw, const char *key, int value)
+{
+   assert_in_object(jw, key);
+   maybe_add_comma(jw);
+
+   append_quoted_string(>json, key);
+   strbuf_addf(>json, ":%d", value);
+}
+
+void jw_object_uint64(struct json_writer *jw, const char *key, uint64_t value)
+{
+   assert_in_object(jw, key);
+   maybe_add_comma(jw);
+
+   append_quoted_string(>json, key);
+   strbuf_addf(>json, 

[PATCH v2] routines to generate JSON data

2018-03-21 Thread git
From: Jeff Hostetler 

This is version 2 of my JSON data format routines.  This version addresses
the non-utf8 questions raised on V1.

It includes a new "struct json_writer" which is used to guide the
accumulation of JSON data -- knowing whether an object or array is
currently being composed.  This allows error checking during construction.

It also allows construction of nested structures using an inline model (in
addition to the original bottom-up composition).

The test helper has been updated to include both the original unit tests and
a new scripting API to allow individual tests to be written directly in our
t/t*.sh shell scripts.


TODO


I still don't know what to do about the Unicode/UTF-8 questions that
were raised WRT strings.  Pathnames on Linux can be any sequence of 8bit
characters -- this is likely to be UTF-8 on modern systems.  Pathnames on
Windows are UCS2/UTF-16 in the filesystem and we always convert to/from
UTF-8 when moving between git data structures and IO calls.

There are few other fields (like author name) that we may want to log which
may or may not be, but that is beyond our control.  Even localized error
messages may be problematic if they include other fields.

So, I'm not sure we have a route to get UTF-8-clean data out of Git, and if
we do it is beyond the scope of this patch series.

So I think for our uses here, defining this as "JSON-like" is probably the
best answer.  We write the strings as we received them (from the file system,
the index, or whatever).  These strings are properly escaped WRT double
quotes, backslashes, and control characters, so we shouldn't have an issue
with decoders getting out of sync -- only with them rejecting non-UTF-8
sequences.

We could blindly \u encode each of the hi-bit characters, if that would
help the parsers, but I don't want to do that right now.

WRT binary data, I had not intended using this for binary data.  And without
knowing what kinds or quantity of binary data we might use it for, I'd like
to ignore this for now.


Jeff Hostetler (1):
  json_writer: new routines to create data in JSON format

 Makefile|   2 +
 json-writer.c   | 321 +
 json-writer.h   |  86 +
 t/helper/test-json-writer.c | 420 
 t/t0019-json-writer.sh  | 102 +++
 5 files changed, 931 insertions(+)
 create mode 100644 json-writer.c
 create mode 100644 json-writer.h
 create mode 100644 t/helper/test-json-writer.c
 create mode 100755 t/t0019-json-writer.sh

-- 
2.9.3



Re: [PATCH 0/3] Extract memory pool logic into reusable component

2018-03-21 Thread Junio C Hamano
jameson.mille...@gmail.com writes:

> This patch series extracts the memory pool implementation, currently
> used by fast-import, into a generalized component. This memory pool
> can then be generally used by any component that needs a pool of
> memory.

The way this series is organized is quite reviewer hostile.  

Step #2 that adds bunch of new code without introducing any caller
nor adjusting existing code means the readers will have no idea how
the new code relates to the existing code in fast-import.c (e.g. "Is
it just a rename of some type, structure fields, function names
etc., and moving the lines without no other change?"  "Is the new
code more capable than fast-import's one and does step #3 just makes
the existing fast-import callers use just a subset of the feature?").

Either case, if I were doing this (by the way, I am not in
particular an expert in this area), I probably would make existing
memtool implementation more generic (e.g. renaming, feature
generalization, etc.) and adjust existing callers while the code is
still in its original place in one step, and then in a separate step
move the resulting reusable bits to a new file wilt a new header.





Re: [GSoC] Regarding "Convert scripts to builtins"

2018-03-21 Thread Wink Saville
On Wed, Mar 21, 2018 at 5:32 AM, Johannes Schindelin
 wrote:
> Hi Paul,
>

> Note! There is one exception, and it is not even a full script. As
> everybody knows who follows my patch series on this here mailing list, I
> consider --preserve-merges one of my stupidest design mistakes ever. To
> undo this (or at least to alleviate the damage I caused), I already
> submitted a patch series to introduce a superseding option:
> --recreate-merges. This patch series is on hold due to the -rc phase of
> v2.17 and will be kicked back into action after v2.17.0 final is out. As
> it is my hope that --preserve-merges can be deprecated in favor of
> --recreate-merges (and eventually even phased out of Git), I would be
> totally cool with git-rebase--preserve-merges.sh being factored out of
> git-rebase--interactive.sh before converting the latter to pure C, and
> leaving the --preserve-merges script alone until the time when it is put
> to rest.
>
> (While I was sleeping, leaving this mail unfinished, to be completed and
> sent today, a patch series was sent to the mailing list that seems to
> perform this refactoring of --preserve-merges into its own script.)

I plead guilty to being the preson refactoring --preserve-merges. But after
reading this and seeing that --recreate-merges is coming and possibly
git-rebase--* moving to C I'm worried I'd be messing things up.

Also, Eric Sunshine felt my v1 changes causes the blame information to
be obscured. So I created a v2 change which keeps everything in the
git-rebase--interactive.sh.

Please see "[RFC PATCH 0/3] rebase-interactive" and
"[RFC PATCH v2 0/1] rebase-interactive: ...". I'm looking for
advice on how to proceed.


Re: [PATCH v6 00/11] nd/pack-objects-pack-struct updates

2018-03-21 Thread Junio C Hamano
Duy Nguyen  writes:

> And we could even do something like this to make custom builds
> easier. Some more gluing is needed so you can set this from config.mak
> but you get the idea. This removes all limits set by this
> series.

Yes, we _could_, but it would mean we would have many variants of
the codepath that is pretty crucial to the integrity of the data we
keep in the repository, all of which must pretty much be bug-free.

> Readability in pack-objects.c and object_entry struct declaration
> is still a concern though.

Yup, a change like this does not change the readability; personally,
I do not think the original is _too_ bad, though.



Re: [GSoC][PATCH v3] test: avoid pipes in git related commands for test

2018-03-21 Thread Eric Sunshine
On Wed, Mar 21, 2018 at 2:11 PM, Junio C Hamano  wrote:
> Pratik Karki  writes:
>>  Avoid using pipes downstream of Git commands since the exit
>>  codes of commands upstream of pipes get swallowed, thus potentially hiding
>>  failure of those commands. Instead, capture Git command output to a file and
>>  apply the downstream command(s) to that file.
>
> Please do not indent the body of the log message by one space.

One other issue I forgot to mention is that the commit message in v3
started getting too wide again[1]; it was fine in v2. Pratik, try to
keep the commit message wrapped to about 70-72 characters or so.

[1]: 
https://public-inbox.org/git/CAPig+cRPzyw525ODC4=-E7w=zbpbhvn2eqxsydslij5wfw8...@mail.gmail.com/


Re: [GSoC][PATCH v3] test: avoid pipes in git related commands for test

2018-03-21 Thread Eric Sunshine
On Wed, Mar 21, 2018 at 2:11 PM, Junio C Hamano  wrote:
> Pratik Karki  writes:
>> The changes in patch increased from v1 to v2 because I
>> got excited to work in Git codebase and I tried to
>> fix the exisiting problems as much as possible.
>> Hence, the large number of changes.
>
> Eric understands why the scope was increased between the two; he
> explained why it is not a good idea to increase the scope in the
> middle, and I tend to agree with his reasoning.  The reason why the
> scope was increased does not matter.

Thanks, Junio. I had just started writing a review of v3 when your
review arrived, and you covered every point I was going to make, thus
saved me the effort. I agree with everything in your review.

One additional comment, Pratik, is that this patch seems to be based
upon a slightly old version of the Git source code, thus does not
apply cleanly to present-day 'master'. Before re-rolling, update to
the latest Git and rebase your patch atop it.

>> - PACK5=$( git rev-list --objects "$LIST" "$LI" "$ST" | \
>> - git pack-objects test-5 ) &&
>> - PACK6=$( (
>> + git rev-list --objects "$LIST" "$LI" "$ST" >actual &&
>> + PACK5=$( git pack-objects test-5 > + PACK6=$((
>
> I thought that Eric already pointed out and explained why this
> change to PACK6 is wrong in the previous round?

I probably should have been more explicit by naming PACK6 directly.
Comparing v3 against v2, I see that Pratik probably misunderstood my
comment, thinking that I was talking about losing the whitespace
inside PACK5=$(...); v2 dropped that whitespace and v3 restored it.

Pratik, dropping the unnecessary whitespace inside PACK5=$(...) is
fine (no complaint about that), but changing PACK6=$( (...) ) to
PACK6=$((...)) is outright incorrect as explained in [1].

[1]: 
https://public-inbox.org/git/capig+ctkkp6kpfcjfvv8w1ejcrcwqh33mhtgfun+mpmgw5i...@mail.gmail.com/


[PATCH v5 0/6] ref-filter: remove die() calls from formatting logic

2018-03-21 Thread Оля Тележная
Add strbuf_error() as a first commit and use it in all other commits.
Good line reduction, -67 lines compare to previous version.
Eric, thanks a lot, new code looks much better!

Thank all of you,
Olga


[PATCH v5 4/6] ref-filter: change parsing function error handling

2018-03-21 Thread Olga Telezhnaya
Continue removing die() calls from ref-filter formatting logic,
so that it could be used by other commands.

Change the signature of parse_ref_filter_atom() by adding
strbuf parameter for error message.
The function returns the position in the used_atom[] array
(as before) for the given atom, or -1 to signal an error.
Upon failure, error message is appended to the strbuf.

Signed-off-by: Olga Telezhnaia 
---
 ref-filter.c | 30 ++
 1 file changed, 22 insertions(+), 8 deletions(-)

diff --git a/ref-filter.c b/ref-filter.c
index f79c8c477f1dc..ca38728b4c998 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -397,7 +397,8 @@ struct atom_value {
  * Used to parse format string and sort specifiers
  */
 static int parse_ref_filter_atom(const struct ref_format *format,
-const char *atom, const char *ep)
+const char *atom, const char *ep,
+struct strbuf *err)
 {
const char *sp;
const char *arg;
@@ -407,7 +408,8 @@ static int parse_ref_filter_atom(const struct ref_format 
*format,
if (*sp == '*' && sp < ep)
sp++; /* deref */
if (ep <= sp)
-   die(_("malformed field name: %.*s"), (int)(ep-atom), atom);
+   return strbuf_error(err, -1, _("malformed field name: %.*s"),
+   (int)(ep-atom), atom);
 
/* Do we have the atom already used elsewhere? */
for (i = 0; i < used_atom_cnt; i++) {
@@ -433,7 +435,8 @@ static int parse_ref_filter_atom(const struct ref_format 
*format,
}
 
if (ARRAY_SIZE(valid_atom) <= i)
-   die(_("unknown field name: %.*s"), (int)(ep-atom), atom);
+   return strbuf_error(err, -1, _("unknown field name: %.*s"),
+   (int)(ep-atom), atom);
 
/* Add it in, including the deref prefix */
at = used_atom_cnt;
@@ -715,17 +718,21 @@ int verify_ref_format(struct ref_format *format)
 
format->need_color_reset_at_eol = 0;
for (cp = format->format; *cp && (sp = find_next(cp)); ) {
+   struct strbuf err = STRBUF_INIT;
const char *color, *ep = strchr(sp, ')');
int at;
 
if (!ep)
return error(_("malformed format string %s"), sp);
/* sp points at "%(" and ep points at the closing ")" */
-   at = parse_ref_filter_atom(format, sp + 2, ep);
+   at = parse_ref_filter_atom(format, sp + 2, ep, );
+   if (at < 0)
+   die("%s", err.buf);
cp = ep + 1;
 
if (skip_prefix(used_atom[at].name, "color:", ))
format->need_color_reset_at_eol = !!strcmp(color, 
"reset");
+   strbuf_release();
}
if (format->need_color_reset_at_eol && !want_color(format->use_color))
format->need_color_reset_at_eol = 0;
@@ -2144,13 +2151,15 @@ int format_ref_array_item(struct ref_array_item *info,
 
for (cp = format->format; *cp && (sp = find_next(cp)); cp = ep + 1) {
struct atom_value *atomv;
+   int pos;
 
ep = strchr(sp, ')');
if (cp < sp)
append_literal(cp, sp, );
-   get_ref_atom_value(info,
-  parse_ref_filter_atom(format, sp + 2, ep),
-  );
+   pos = parse_ref_filter_atom(format, sp + 2, ep, error_buf);
+   if (pos < 0)
+   return -1;
+   get_ref_atom_value(info, pos, );
if (atomv->handler(atomv, , error_buf))
return -1;
}
@@ -2203,7 +2212,12 @@ static int parse_sorting_atom(const char *atom)
 */
struct ref_format dummy = REF_FORMAT_INIT;
const char *end = atom + strlen(atom);
-   return parse_ref_filter_atom(, atom, end);
+   struct strbuf err = STRBUF_INIT;
+   int res = parse_ref_filter_atom(, atom, end, );
+   if (res < 0)
+   die("%s", err.buf);
+   strbuf_release();
+   return res;
 }
 
 /*  If no sorting option is given, use refname to sort as default */

--
https://github.com/git/git/pull/466


[PATCH v5 2/6] ref-filter: start adding strbufs with errors

2018-03-21 Thread Olga Telezhnaya
This is a first step in removing die() calls from ref-filter
formatting logic, so that it could be used by other commands
that do not want to die during formatting process.
die() calls related to bugs in code will not be touched in this patch.

Everything would be the same for show_ref_array_item() users.
But, if you want to deal with errors by your own, you could invoke
format_ref_array_item(). It means that you need to print everything
(the result and errors) on your side.

This commit changes signature of format_ref_array_item() by adding
return value and strbuf parameter for errors, and adjusts
its callers. While at it, reduce the scope of the out-variable.

Signed-off-by: Olga Telezhnaia 
---
 builtin/branch.c |  7 +--
 ref-filter.c | 13 +
 ref-filter.h |  7 ---
 3 files changed, 18 insertions(+), 9 deletions(-)

diff --git a/builtin/branch.c b/builtin/branch.c
index 6d0cea9d4bcc4..c21e5a04a0177 100644
--- a/builtin/branch.c
+++ b/builtin/branch.c
@@ -391,7 +391,6 @@ static void print_ref_list(struct ref_filter *filter, 
struct ref_sorting *sortin
struct ref_array array;
int maxwidth = 0;
const char *remote_prefix = "";
-   struct strbuf out = STRBUF_INIT;
char *to_free = NULL;
 
/*
@@ -419,7 +418,10 @@ static void print_ref_list(struct ref_filter *filter, 
struct ref_sorting *sortin
ref_array_sort(sorting, );
 
for (i = 0; i < array.nr; i++) {
-   format_ref_array_item(array.items[i], format, );
+   struct strbuf out = STRBUF_INIT;
+   struct strbuf err = STRBUF_INIT;
+   if (format_ref_array_item(array.items[i], format, , ))
+   die("%s", err.buf);
if (column_active(colopts)) {
assert(!filter->verbose && "--column and --verbose are 
incompatible");
 /* format to a string_list to let print_columns() do 
its job */
@@ -428,6 +430,7 @@ static void print_ref_list(struct ref_filter *filter, 
struct ref_sorting *sortin
fwrite(out.buf, 1, out.len, stdout);
putchar('\n');
}
+   strbuf_release();
strbuf_release();
}
 
diff --git a/ref-filter.c b/ref-filter.c
index 45fc56216aaa8..2c3fb2f003708 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -2118,9 +2118,10 @@ static void append_literal(const char *cp, const char 
*ep, struct ref_formatting
}
 }
 
-void format_ref_array_item(struct ref_array_item *info,
+int format_ref_array_item(struct ref_array_item *info,
   const struct ref_format *format,
-  struct strbuf *final_buf)
+  struct strbuf *final_buf,
+  struct strbuf *error_buf)
 {
const char *cp, *sp, *ep;
struct ref_formatting_state state = REF_FORMATTING_STATE_INIT;
@@ -2149,18 +2150,22 @@ void format_ref_array_item(struct ref_array_item *info,
append_atom(, );
}
if (state.stack->prev)
-   die(_("format: %%(end) atom missing"));
+   return strbuf_error(error_buf, -1, _("format: %%(end) atom 
missing"));
strbuf_addbuf(final_buf, >output);
pop_stack_element();
+   return 0;
 }
 
 void show_ref_array_item(struct ref_array_item *info,
 const struct ref_format *format)
 {
struct strbuf final_buf = STRBUF_INIT;
+   struct strbuf error_buf = STRBUF_INIT;
 
-   format_ref_array_item(info, format, _buf);
+   if (format_ref_array_item(info, format, _buf, _buf))
+   die("%s", error_buf.buf);
fwrite(final_buf.buf, 1, final_buf.len, stdout);
+   strbuf_release(_buf);
strbuf_release(_buf);
putchar('\n');
 }
diff --git a/ref-filter.h b/ref-filter.h
index 0d98342b34319..e13f8e6f8721a 100644
--- a/ref-filter.h
+++ b/ref-filter.h
@@ -110,9 +110,10 @@ int verify_ref_format(struct ref_format *format);
 /*  Sort the given ref_array as per the ref_sorting provided */
 void ref_array_sort(struct ref_sorting *sort, struct ref_array *array);
 /*  Based on the given format and quote_style, fill the strbuf */
-void format_ref_array_item(struct ref_array_item *info,
-  const struct ref_format *format,
-  struct strbuf *final_buf);
+int format_ref_array_item(struct ref_array_item *info,
+ const struct ref_format *format,
+ struct strbuf *final_buf,
+ struct strbuf *error_buf);
 /*  Print the ref using the given format and quote_style */
 void show_ref_array_item(struct ref_array_item *info, const struct ref_format 
*format);
 /*  Parse a single sort specifier and add it to the list */

--
https://github.com/git/git/pull/466


[PATCH v5 1/6] strbuf: add shortcut to work with error messages

2018-03-21 Thread Olga Telezhnaya
Add function strbuf_error() that helps to save few lines of code.
Function expands fmt with placeholders, append resulting error message
to strbuf *err, and return error code ret.

Signed-off-by: Olga Telezhnaia 
---
 strbuf.h | 13 +
 1 file changed, 13 insertions(+)

diff --git a/strbuf.h b/strbuf.h
index e6cae5f4398c8..fa66d4835f1a7 100644
--- a/strbuf.h
+++ b/strbuf.h
@@ -620,4 +620,17 @@ char *xstrvfmt(const char *fmt, va_list ap);
 __attribute__((format (printf, 1, 2)))
 char *xstrfmt(const char *fmt, ...);
 
+/*
+ * Expand error message, append it to strbuf *err, then return error code ret.
+ * Allow to save few lines of code.
+ */
+static inline int strbuf_error(struct strbuf *err, int ret, const char *fmt, 
...)
+{
+   va_list ap;
+   va_start(ap, fmt);
+   strbuf_vaddf(err, fmt, ap);
+   va_end(ap);
+   return ret;
+}
+
 #endif /* STRBUF_H */

--
https://github.com/git/git/pull/466


[PATCH v5 5/6] ref-filter: add return value to parsers

2018-03-21 Thread Olga Telezhnaya
Continue removing die() calls from ref-filter formatting logic,
so that it could be used by other commands.

Change the signature of parsers by adding return value and
strbuf parameter for error message.
Return value equals 0 upon success and -1 upon failure.
Upon failure, error message is appended to the strbuf.

Signed-off-by: Olga Telezhnaia 
---
 ref-filter.c | 112 ---
 1 file changed, 68 insertions(+), 44 deletions(-)

diff --git a/ref-filter.c b/ref-filter.c
index ca38728b4c998..0ef7386b5fd20 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -101,22 +101,25 @@ static struct used_atom {
 } *used_atom;
 static int used_atom_cnt, need_tagged, need_symref;
 
-static void color_atom_parser(const struct ref_format *format, struct 
used_atom *atom, const char *color_value)
+static int color_atom_parser(const struct ref_format *format, struct used_atom 
*atom,
+const char *color_value, struct strbuf *err)
 {
if (!color_value)
-   die(_("expected format: %%(color:)"));
+   return strbuf_error(err, -1, _("expected format: 
%%(color:)"));
if (color_parse(color_value, atom->u.color) < 0)
-   die(_("unrecognized color: %%(color:%s)"), color_value);
+   return strbuf_error(err, -1, _("unrecognized color: 
%%(color:%s)"),
+   color_value);
/*
 * We check this after we've parsed the color, which lets us complain
 * about syntactically bogus color names even if they won't be used.
 */
if (!want_color(format->use_color))
color_parse("", atom->u.color);
+   return 0;
 }
 
-static void refname_atom_parser_internal(struct refname_atom *atom,
-const char *arg, const char *name)
+static int refname_atom_parser_internal(struct refname_atom *atom, const char 
*arg,
+const char *name, struct strbuf *err)
 {
if (!arg)
atom->option = R_NORMAL;
@@ -126,16 +129,18 @@ static void refname_atom_parser_internal(struct 
refname_atom *atom,
 skip_prefix(arg, "strip=", )) {
atom->option = R_LSTRIP;
if (strtol_i(arg, 10, >lstrip))
-   die(_("Integer value expected refname:lstrip=%s"), arg);
+   return strbuf_error(err, -1, _("Integer value expected 
refname:lstrip=%s"), arg);
} else if (skip_prefix(arg, "rstrip=", )) {
atom->option = R_RSTRIP;
if (strtol_i(arg, 10, >rstrip))
-   die(_("Integer value expected refname:rstrip=%s"), arg);
+   return strbuf_error(err, -1, _("Integer value expected 
refname:rstrip=%s"), arg);
} else
-   die(_("unrecognized %%(%s) argument: %s"), name, arg);
+   return strbuf_error(err, -1, _("unrecognized %%(%s) argument: 
%s"), name, arg);
+   return 0;
 }
 
-static void remote_ref_atom_parser(const struct ref_format *format, struct 
used_atom *atom, const char *arg)
+static int remote_ref_atom_parser(const struct ref_format *format, struct 
used_atom *atom,
+ const char *arg, struct strbuf *err)
 {
struct string_list params = STRING_LIST_INIT_DUP;
int i;
@@ -145,9 +150,8 @@ static void remote_ref_atom_parser(const struct ref_format 
*format, struct used_
 
if (!arg) {
atom->u.remote_ref.option = RR_REF;
-   refname_atom_parser_internal(>u.remote_ref.refname,
-arg, atom->name);
-   return;
+   return refname_atom_parser_internal(>u.remote_ref.refname,
+   arg, atom->name, err);
}
 
atom->u.remote_ref.nobracket = 0;
@@ -170,29 +174,36 @@ static void remote_ref_atom_parser(const struct 
ref_format *format, struct used_
atom->u.remote_ref.push_remote = 1;
} else {
atom->u.remote_ref.option = RR_REF;
-   
refname_atom_parser_internal(>u.remote_ref.refname,
-arg, atom->name);
+   if 
(refname_atom_parser_internal(>u.remote_ref.refname,
+arg, atom->name, err))
+   return -1;
}
}
 
string_list_clear(, 0);
+   return 0;
 }
 
-static void body_atom_parser(const struct ref_format *format, struct used_atom 
*atom, const char *arg)
+static int body_atom_parser(const struct ref_format *format, struct used_atom 
*atom,
+   const char *arg, struct strbuf *err)
 {
if (arg)
-   die(_("%%(body) does not take arguments"));
+   

[PATCH v5 6/6] ref-filter: libify get_ref_atom_value()

2018-03-21 Thread Olga Telezhnaya
Finish removing die() calls from ref-filter formatting logic,
so that it could be used by other commands.

Change the signature of get_ref_atom_value() and underlying functions
by adding return value and strbuf parameter for error message.
Return value equals 0 upon success and -1 upon failure.
Upon failure, error message is appended to the strbuf.

Signed-off-by: Olga Telezhnaia 
---
 ref-filter.c | 49 ++---
 1 file changed, 30 insertions(+), 19 deletions(-)

diff --git a/ref-filter.c b/ref-filter.c
index 0ef7386b5fd20..52bb84398dc8d 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -1398,28 +1398,30 @@ static const char *get_refname(struct used_atom *atom, 
struct ref_array_item *re
return show_ref(>u.refname, ref->refname);
 }
 
-static void get_object(struct ref_array_item *ref, const struct object_id *oid,
-  int deref, struct object **obj)
+static int get_object(struct ref_array_item *ref, const struct object_id *oid,
+  int deref, struct object **obj, struct strbuf *err)
 {
int eaten;
+   int ret = 0;
unsigned long size;
void *buf = get_obj(oid, obj, , );
if (!buf)
-   die(_("missing object %s for %s"),
-   oid_to_hex(oid), ref->refname);
-   if (!*obj)
-   die(_("parse_object_buffer failed on %s for %s"),
-   oid_to_hex(oid), ref->refname);
-
-   grab_values(ref->value, deref, *obj, buf, size);
+   ret = strbuf_error(err, -1, _("missing object %s for %s"),
+  oid_to_hex(oid), ref->refname);
+   else if (!*obj)
+   ret = strbuf_error(err, -1, _("parse_object_buffer failed on %s 
for %s"),
+  oid_to_hex(oid), ref->refname);
+   else
+   grab_values(ref->value, deref, *obj, buf, size);
if (!eaten)
free(buf);
+   return ret;
 }
 
 /*
  * Parse the object referred by ref, and grab needed value.
  */
-static void populate_value(struct ref_array_item *ref)
+static int populate_value(struct ref_array_item *ref, struct strbuf *err)
 {
struct object *obj;
int i;
@@ -1541,16 +1543,17 @@ static void populate_value(struct ref_array_item *ref)
break;
}
if (used_atom_cnt <= i)
-   return;
+   return 0;
 
-   get_object(ref, >objectname, 0, );
+   if (get_object(ref, >objectname, 0, , err))
+   return -1;
 
/*
 * If there is no atom that wants to know about tagged
 * object, we are done.
 */
if (!need_tagged || (obj->type != OBJ_TAG))
-   return;
+   return 0;
 
/*
 * If it is a tag object, see if we use a value that derefs
@@ -1564,20 +1567,23 @@ static void populate_value(struct ref_array_item *ref)
 * is not consistent with what deref_tag() does
 * which peels the onion to the core.
 */
-   get_object(ref, tagged, 1, );
+   return get_object(ref, tagged, 1, , err);
 }
 
 /*
  * Given a ref, return the value for the atom.  This lazily gets value
  * out of the object by calling populate value.
  */
-static void get_ref_atom_value(struct ref_array_item *ref, int atom, struct 
atom_value **v)
+static int get_ref_atom_value(struct ref_array_item *ref, int atom,
+ struct atom_value **v, struct strbuf *err)
 {
if (!ref->value) {
-   populate_value(ref);
+   if (populate_value(ref, err))
+   return -1;
fill_missing_values(ref->value);
}
*v = >value[atom];
+   return 0;
 }
 
 /*
@@ -2101,9 +2107,13 @@ static int cmp_ref_sorting(struct ref_sorting *s, struct 
ref_array_item *a, stru
int cmp;
cmp_type cmp_type = used_atom[s->atom].type;
int (*cmp_fn)(const char *, const char *);
+   struct strbuf err = STRBUF_INIT;
 
-   get_ref_atom_value(a, s->atom, );
-   get_ref_atom_value(b, s->atom, );
+   if (get_ref_atom_value(a, s->atom, , ))
+   die("%s", err.buf);
+   if (get_ref_atom_value(b, s->atom, , ))
+   die("%s", err.buf);
+   strbuf_release();
cmp_fn = s->ignore_case ? strcasecmp : strcmp;
if (s->version)
cmp = versioncmp(va->s, vb->s);
@@ -2183,7 +2193,8 @@ int format_ref_array_item(struct ref_array_item *info,
pos = parse_ref_filter_atom(format, sp + 2, ep, error_buf);
if (pos < 0)
return -1;
-   get_ref_atom_value(info, pos, );
+   if (get_ref_atom_value(info, pos, , error_buf))
+   return -1;
if (atomv->handler(atomv, , error_buf))
return -1;
}

--
https://github.com/git/git/pull/466


[PATCH v5 3/6] ref-filter: add return value && strbuf to handlers

2018-03-21 Thread Olga Telezhnaya
Continue removing die() calls from ref-filter formatting logic,
so that it could be used by other commands.

Change the signature of handlers by adding return value
and strbuf parameter for errors.
Return value equals 0 upon success and -1 upon failure.
Upon failure, error message is appended to the strbuf.

Signed-off-by: Olga Telezhnaia 
---
 ref-filter.c | 47 +++
 1 file changed, 31 insertions(+), 16 deletions(-)

diff --git a/ref-filter.c b/ref-filter.c
index 2c3fb2f003708..f79c8c477f1dc 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -387,7 +387,8 @@ struct ref_formatting_state {
 
 struct atom_value {
const char *s;
-   void (*handler)(struct atom_value *atomv, struct ref_formatting_state 
*state);
+   int (*handler)(struct atom_value *atomv, struct ref_formatting_state 
*state,
+  struct strbuf *err);
uintmax_t value; /* used for sorting when not FIELD_STR */
struct used_atom *atom;
 };
@@ -481,7 +482,8 @@ static void quote_formatting(struct strbuf *s, const char 
*str, int quote_style)
}
 }
 
-static void append_atom(struct atom_value *v, struct ref_formatting_state 
*state)
+static int append_atom(struct atom_value *v, struct ref_formatting_state 
*state,
+  struct strbuf *unused_err)
 {
/*
 * Quote formatting is only done when the stack has a single
@@ -493,6 +495,7 @@ static void append_atom(struct atom_value *v, struct 
ref_formatting_state *state
quote_formatting(>stack->output, v->s, 
state->quote_style);
else
strbuf_addstr(>stack->output, v->s);
+   return 0;
 }
 
 static void push_stack_element(struct ref_formatting_stack **stack)
@@ -527,7 +530,8 @@ static void end_align_handler(struct ref_formatting_stack 
**stack)
strbuf_release();
 }
 
-static void align_atom_handler(struct atom_value *atomv, struct 
ref_formatting_state *state)
+static int align_atom_handler(struct atom_value *atomv, struct 
ref_formatting_state *state,
+ struct strbuf *unused_err)
 {
struct ref_formatting_stack *new_stack;
 
@@ -535,6 +539,7 @@ static void align_atom_handler(struct atom_value *atomv, 
struct ref_formatting_s
new_stack = state->stack;
new_stack->at_end = end_align_handler;
new_stack->at_end_data = >atom->u.align;
+   return 0;
 }
 
 static void if_then_else_handler(struct ref_formatting_stack **stack)
@@ -572,7 +577,8 @@ static void if_then_else_handler(struct 
ref_formatting_stack **stack)
free(if_then_else);
 }
 
-static void if_atom_handler(struct atom_value *atomv, struct 
ref_formatting_state *state)
+static int if_atom_handler(struct atom_value *atomv, struct 
ref_formatting_state *state,
+  struct strbuf *unused_err)
 {
struct ref_formatting_stack *new_stack;
struct if_then_else *if_then_else = xcalloc(sizeof(struct 
if_then_else), 1);
@@ -584,6 +590,7 @@ static void if_atom_handler(struct atom_value *atomv, 
struct ref_formatting_stat
new_stack = state->stack;
new_stack->at_end = if_then_else_handler;
new_stack->at_end_data = if_then_else;
+   return 0;
 }
 
 static int is_empty(const char *s)
@@ -596,7 +603,8 @@ static int is_empty(const char *s)
return 1;
 }
 
-static void then_atom_handler(struct atom_value *atomv, struct 
ref_formatting_state *state)
+static int then_atom_handler(struct atom_value *atomv, struct 
ref_formatting_state *state,
+struct strbuf *err)
 {
struct ref_formatting_stack *cur = state->stack;
struct if_then_else *if_then_else = NULL;
@@ -604,11 +612,11 @@ static void then_atom_handler(struct atom_value *atomv, 
struct ref_formatting_st
if (cur->at_end == if_then_else_handler)
if_then_else = (struct if_then_else *)cur->at_end_data;
if (!if_then_else)
-   die(_("format: %%(then) atom used without an %%(if) atom"));
+   return strbuf_error(err, -1, _("format: %%(then) atom used 
without an %%(if) atom"));
if (if_then_else->then_atom_seen)
-   die(_("format: %%(then) atom used more than once"));
+   return strbuf_error(err, -1, _("format: %%(then) atom used more 
than once"));
if (if_then_else->else_atom_seen)
-   die(_("format: %%(then) atom used after %%(else)"));
+   return strbuf_error(err, -1, _("format: %%(then) atom used 
after %%(else)"));
if_then_else->then_atom_seen = 1;
/*
 * If the 'equals' or 'notequals' attribute is used then
@@ -624,9 +632,11 @@ static void then_atom_handler(struct atom_value *atomv, 
struct ref_formatting_st
} else if (cur->output.len && !is_empty(cur->output.buf))
if_then_else->condition_satisfied = 1;
strbuf_reset(>output);
+   return 0;
 }
 
-static void 

Re: [GSoC][PATCH v6] parse-options: do not show usage upon invalid option value

2018-03-21 Thread Junio C Hamano
Paul-Sebastian Ungureanu  writes:

> diff --git a/t/t0041-usage.sh b/t/t0041-usage.sh
> new file mode 100755
> index 0..ac96bc3b9
> --- /dev/null
> +++ b/t/t0041-usage.sh
> @@ -0,0 +1,107 @@
> +#!/bin/sh
> +
> +test_description='Test commands behavior when given invalid argument value'
> +
> +. ./test-lib.sh
> +
> +test_expect_success 'setup ' '
> + test_commit "v1.0"
> +'
> +
> +test_expect_success 'tag --contains ' '
> + git tag --contains "v1.0" 1>actual 2>actual.err &&

It is not wrong per-se, but >redirection without a number redirects
fd#1, so "1>actual" is an unusual way to spell ">actual".

> + grep "v1.0" actual &&
> + test_line_count = 0 actual.err
> +'
> +
> +test_expect_success 'tag --contains ' '
> + test_must_fail git tag --contains "notag" 1>actual 2>actual.err &&
> + test_line_count = 0 actual &&
> + test_i18ngrep "error" actual.err &&
> + test_must_fail test_i18ngrep "usage" actual.err

test_must_fail mustn't be used like that for two reasons

 - It is to be used for "git" stuff, because it knows failing with
   segfault and other failure modes are _wrong_ and should not be
   happy even if the command "fail"ed.  We however do not expect
   commands that are not git (e.g. test_i18ngrep) to require special
   casing of the failure modes.

 - test_i18ngrep pretends to always "have found match" when running
   under GETTEXT_POISON build, so it will pretend that usage exists
   in actual.error.  test_must_fail will then say "oops, the string
   'usage' shouldn't appear in the output but it did", which is not
   what you want.

Perhaps

test_i18ngrep ! "usage" actual.err

is what you want to say here instead.

In any case, the tests got a lot cleaner compared to the previous
round, and I feel that this patch is "getting there" ;-)

Thanks for working on it.


Re: [GSoC][PATCH v3] test: avoid pipes in git related commands for test

2018-03-21 Thread Junio C Hamano
Pratik Karki  writes:

> Thank you Eric, for the review. This is follow on patch[1].
>
> The changes in patch increased from v1 to v2 because I
> got excited to work in Git codebase and I tried to
> fix the exisiting problems as much as possible.
> Hence, the large number of changes.

Eric understands why the scope was increased between the two; he
explained why it is not a good idea to increase the scope in the
middle, and I tend to agree with his reasoning.  The reason why the
scope was increased does not matter.

>>> @@ -120,18 +122,20 @@ test_expect_success 'test another branch' '
>>> git config --add svn-remote.four.url "$svnrepo" &&
>>> git config --add svn-remote.four.fetch 
>>> trunk:refs/remotes/four/trunk &&
>>> git config --add svn-remote.four.branches \
>>> -"branches/*/*:refs/remotes/four/branches/*/*" &&
>>> +"branches/*/*:refs/remotes/four/branches/*/*" &&
>>> git config --add svn-remote.four.tags \
>>> -"tags/*:refs/remotes/four/tags/*" &&
>>> +"tags/*:refs/remotes/four/tags/*" &&
>
>>I guess you sneaked in a whitespace change here which is unrelated to
>>the stated purpose of this patch, thus acted as a speed bump during
>>review
>>Therefore, you should omit this change.
>
> I used tabify in Emacs and it must have messed up the whitespace
> change.

Then don't.  Make sure the lines _you_ change are indented and
formatted correctly.  Do not touch near-by (or far-away for that
matter) lines that you do not have to touch only to change the
formatting.

> I read SubmittingPatches guideline[2] for git where it
> is said that whitespace check must be done and git community is
> picky about it and 'git diff --check' must be done before commit.

Yes.

> If I change this change back to original the 'git diff --check'
> reports whitespace identation with space.

I do not think 'diff --check' would.  Save the patch you sent to a
file, edit it so that these two lines Eric pointed out are not
changed, and then apply it with "git apply --whitespace=nowarn".
Then ask "git diff --check"---it should not complain about an
existing whitespace glitch that you did not introduce.

> So, isn't this
> supposedly a fix?

Unless it is a "here is a patch to reindent and fix whitespaces"
patch that does nothing else, it is an unwelcome noise that
distracts reviewers from the real changes.

> - 
> >8

This is not wrong per se, but just a

-- >8 --

is sufficient ;-)

>
>  Avoid using pipes downstream of Git commands since the exit
>  codes of commands upstream of pipes get swallowed, thus potentially hiding
>  failure of those commands. Instead, capture Git command output to a file and
>  apply the downstream command(s) to that file.

Please do not indent the body of the log message by one space.

> diff --git a/t/t5300-pack-object.sh b/t/t5300-pack-object.sh
> index 9c68b9925..707208284 100755
> --- a/t/t5300-pack-object.sh
> +++ b/t/t5300-pack-object.sh
> @@ -311,9 +311,9 @@ test_expect_success 'unpacking with --strict' '
>   rm -f .git/index &&
>   tail -n 10 LIST | git update-index --index-info &&
>   ST=$(git write-tree) &&
> - PACK5=$( git rev-list --objects "$LIST" "$LI" "$ST" | \
> - git pack-objects test-5 ) &&
> - PACK6=$( (
> + git rev-list --objects "$LIST" "$LI" "$ST" >actual &&
> + PACK5=$( git pack-objects test-5  + PACK6=$((

I thought that Eric already pointed out and explained why this
change to PACK6 is wrong in the previous round?

> diff --git a/t/t9111-git-svn-use-svnsync-props.sh 
> b/t/t9111-git-svn-use-svnsync-props.sh
> index 22b6e5ee7..a4225c9f6 100755
> --- a/t/t9111-git-svn-use-svnsync-props.sh
> +++ b/t/t9111-git-svn-use-svnsync-props.sh
> @@ -20,32 +20,32 @@ uuid=161ce429-a9dd-4828-af4a-52023f968c89
>  
>  bar_url=http://mayonaise/svnrepo/bar
>  test_expect_success 'verify metadata for /bar' "
> - git cat-file commit refs/remotes/bar | \
> -grep '^git-svn-id: $bar_url@12 $uuid$' &&
> - git cat-file commit refs/remotes/bar~1 | \
> -grep '^git-svn-id: $bar_url@11 $uuid$' &&
> - git cat-file commit refs/remotes/bar~2 | \
> -grep '^git-svn-id: $bar_url@10 $uuid$' &&
> - git cat-file commit refs/remotes/bar~3 | \
> -grep '^git-svn-id: $bar_url@9 $uuid$' &&
> - git cat-file commit refs/remotes/bar~4 | \
> -grep '^git-svn-id: $bar_url@6 $uuid$' &&
> - git cat-file commit refs/remotes/bar~5 | \
> -grep '^git-svn-id: $bar_url@1 $uuid$'
> + git cat-file commit refs/remotes/bar >actual &&
> + grep '^git-svn-id: $bar_url@12 $uuid$' actual &&
> + git cat-file commit refs/remotes/bar~1 >actual1 &&
> + grep '^git-svn-id: $bar_url@11 $uuid$' actual1 &&
> + git cat-file commit refs/remotes/bar~2 >actual2 &&
> + grep '^git-svn-id: 

[RFC PATCH v2 0/1] rebase-interactive: Add git_rebase__interactive__preserve_merges

2018-03-21 Thread Wink Saville
Version 2 keeps all changes in git-rebase--interactive.sh this should
make it easier to use blame. But there is quite a bit of refactoring
and reformatting so using "git blame -w" or "git blame -w -C -C" is
needed to improve the results of blame.

I can break this into several patches to have the history show the code
movement more directly if that is desired.

Wink Saville (1):
  rebase-interactive: Add git_rebase__interactive__preserve_merges

 git-rebase--interactive.sh | 469 +
 git-rebase.sh  |  16 +-
 2 files changed, 273 insertions(+), 212 deletions(-)

-- 
2.16.2



[RFC PATCH v2 1/1] rebase-interactive: Add git_rebase__interactive__preserve_merges

2018-03-21 Thread Wink Saville
Refactor git_rebase__interactive__preserve_merges out of
git_rebase__interactive resulting in fewer conditionals making
both routines are simpler.

Changed run_specific_rebase in git-rebase to dispatch to the appropriate
function after sourcing git-rebase--interactive.
---
 git-rebase--interactive.sh | 469 +
 git-rebase.sh  |  16 +-
 2 files changed, 273 insertions(+), 212 deletions(-)

diff --git a/git-rebase--interactive.sh b/git-rebase--interactive.sh
index 331c8dfea..65ff75654 100644
--- a/git-rebase--interactive.sh
+++ b/git-rebase--interactive.sh
@@ -1,5 +1,7 @@
-# This shell script fragment is sourced by git-rebase to implement
-# its interactive mode.  "git rebase --interactive" makes it easy
+#!/bin/sh
+# This shell script fragment is sourced by git-rebase--interactive
+# and git-rebase--interactive--preserve-merges in support of the
+# interactive mode.  "git rebase --interactive" makes it easy
 # to fix up commits in the middle of a series and rearrange commits.
 #
 # Copyright (c) 2006 Johannes E. Schindelin
@@ -7,6 +9,7 @@
 # The original idea comes from Eric W. Biederman, in
 # https://public-inbox.org/git/m1odwkyuf5.fsf...@ebiederm.dsl.xmission.com/
 #
+
 # The file containing rebase commands, comments, and empty lines.
 # This file is created by "git rebase -i" then edited by the user.  As
 # the lines are processed, they are removed from the front of this
@@ -125,6 +128,19 @@ comment_for_reflog () {
esac
 }
 
+setup_reflog_action () {
+   comment_for_reflog start
+
+   if test ! -z "$switch_to"
+   then
+   GIT_REFLOG_ACTION="$GIT_REFLOG_ACTION: checkout $switch_to"
+   output git checkout "$switch_to" -- ||
+   die "$(eval_gettext "Could not checkout \$switch_to")"
+
+   comment_for_reflog start
+   fi
+}
+
 last_count=
 mark_action_done () {
sed -e 1q < "$todo" >> "$done"
@@ -274,7 +290,8 @@ pick_one () {
 
case "$1" in -n) sha1=$2; ff= ;; *) sha1=$1 ;; esac
case "$force_rebase" in '') ;; ?*) ff= ;; esac
-   output git rev-parse --verify $sha1 || die "$(eval_gettext "Invalid 
commit name: \$sha1")"
+   output git rev-parse --verify $sha1 ||
+   die "$(eval_gettext "Invalid commit name: \$sha1")"
 
if is_empty_commit "$sha1"
then
@@ -287,8 +304,8 @@ pick_one () {
${gpg_sign_opt:+$(git rev-parse --sq-quote 
"$gpg_sign_opt")} \
"$strategy_args" $empty_args $ff "$@"
 
-   # If cherry-pick dies it leaves the to-be-picked commit unrecorded. 
Reschedule
-   # previous task so this commit is not lost.
+   # If cherry-pick dies it leaves the to-be-picked commit unrecorded.
+   # Reschedule previous task so this commit is not lost.
ret=$?
case "$ret" in [01]) ;; *) reschedule_last_action ;; esac
return $ret
@@ -307,17 +324,15 @@ pick_one_preserving_merges () {
esac
sha1=$(git rev-parse $sha1)
 
-   if test -f "$state_dir"/current-commit
+   if test -f "$state_dir"/current-commit && test "$fast_forward" = t
then
-   if test "$fast_forward" = t
-   then
-   while read current_commit
-   do
-   git rev-parse HEAD > 
"$rewritten"/$current_commit
-   done <"$state_dir"/current-commit
-   rm "$state_dir"/current-commit ||
-   die "$(gettext "Cannot write current commit's 
replacement sha1")"
-   fi
+   while read current_commit
+   do
+   git rev-parse HEAD > "$rewritten"/$current_commit
+   done <"$state_dir"/current-commit
+   rm "$state_dir"/current-commit ||
+   die "$(gettext \
+   "Cannot write current commit's replacement sha1")"
fi
 
echo $sha1 >> "$state_dir"/current-commit
@@ -337,10 +352,10 @@ pick_one_preserving_merges () {
if test -f "$rewritten"/$p
then
new_p=$(cat "$rewritten"/$p)
-
-   # If the todo reordered commits, and our parent is 
marked for
-   # rewriting, but hasn't been gotten to yet, assume the 
user meant to
-   # drop it on top of the current HEAD
+   # If the todo reordered commits, and our parent is
+   # marked for rewriting, but hasn't been gotten to yet,
+   # assume the user meant to drop it on top of the
+   # current HEAD
if test -z "$new_p"
then
new_p=$(git rev-parse HEAD)
@@ -379,7 +394,7 @@ pick_one_preserving_merges () {
then
# detach HEAD to current 

Re: [BUG] log --graph corrupts patch

2018-03-21 Thread Junio C Hamano
Jeff King  writes:

>> To make it bullet-proof, I think we'd have to actually parse the graph
>> structure, finding a "*" line and then accepting only an indent that
>> matched it.
>
> Wow. Nerd snipe successful. This turned out to be quite tricky, but also
> kind of interesting.

The last patch looks quite "interesting" ;-).  The idea is sound and
probably the execution, too, but it indeed is tricky.

>
> Here's a series which fixes it. The meaty bits are in the final commit;
> the rest is just preparatory cleanup, and adding some tests (all are
> cases which I managed to break while fixing this).
>
>   [1/7]: diff-highlight: correct test graph diagram
>   [2/7]: diff-highlight: use test_tick in graph test
>   [3/7]: diff-highlight: prefer "echo" to "cat" in tests
>   [4/7]: diff-highlight: test interleaved parallel lines of history
>   [5/7]: diff-highlight: test graphs with --color
>   [6/7]: diff-highlight: factor out flush_hunk() helper
>   [7/7]: diff-highlight: detect --graph by indent
>
>  contrib/diff-highlight/DiffHighlight.pm   | 89 +++
>  .../diff-highlight/t/t9400-diff-highlight.sh  | 81 +
>  2 files changed, 133 insertions(+), 37 deletions(-)
>
> -Peff


Re: [PATCH] filter-branch: consider refs can refer to an object other than commit or tag

2018-03-21 Thread Junio C Hamano
Yuki Kokubun  writes:

> "git filter-branch -- --all" can be confused when refs that refer to objects
> other than commits or tags exists.
> Because "git rev-parse --all" that is internally used can return refs that
> refer to an object other than commit or tag. But it is not considered in the
> phase of updating refs.

Could you describe what the consequence of that is?  We have a ref
that points directly at a blob object, or a ref that points at a tag
object that points at a blob object.  The current code leaves both of
these refs in "$tempdir/heads".  Then...?

... goes and looks ...

There is a loop that looks like this:

while read ref
do
sha1=$(git rev-parse "$ref^0")
...
done <"$tempdir/heads"

which would break on anything but a commit-ish.

>  # The refs should be updated if their heads were rewritten
>  git rev-parse --no-flags --revs-only --symbolic-full-name \
> - --default HEAD "$@" > "$tempdir"/raw-heads || exit
> + --default HEAD "$@" > "$tempdir"/raw-objects || exit
> +# refs/replace can refer to an object other than commit or tag

Mention of replace refs in the proposed log message gives an easy to
understand example and is a good idea, but this in code comment does
not have to single out the replace refs.  A tag can also point at an
object with any type, e.g. "git tag v2.6.11-tree v2.6.11^{tree}"
would make "refs/tags/v2.6.11-tree" point at the tree at the top
level of the tree-ish "v2.6.11".  It probably is OK to drop this
comment altogether.

> +while read ref
> +do
> + type=$(git cat-file -t "$ref")
> + if test $type = commit || test $type = tag
> + then
> + echo "$ref"
> + fi
> +done >"$tempdir"/raw-heads <"$tempdir"/raw-objects
>  sed -e '/^^/d' "$tempdir"/raw-heads >"$tempdir"/heads

So... is the idea to limit the set of refs to be rewritten to those
that point at commits and tags?  As I already alluded to, I do not
think you want to accept a ref that points at any tag object---only
the ones that point at a tag that points at a commit-ish, so that
the code will not barf when doing "$ref^0".

So perhaps

git rev-parse --no-flags ... >"$tempdir/raw-heads" || exit

while read ref
do
case "$ref" in ^?*) continue ;; esac
if git rev-parse --verify "$ref^0" 2>/dev/null
then
echo "$ref"
fi
done >"$tempdir/heads" <"$tempdir/raw-heads"

or something?  Note that you do not need the "sed" as the loop
already excludes the negative revs.

>  test -s "$tempdir"/heads ||
> diff --git a/t/t7003-filter-branch.sh b/t/t7003-filter-branch.sh
> index 7cb60799b..efeaf5887 100755
> --- a/t/t7003-filter-branch.sh
> +++ b/t/t7003-filter-branch.sh
> @@ -470,4 +470,17 @@ test_expect_success 'tree-filter deals with object name 
> vs pathname ambiguity' '
>   git show HEAD:$ambiguous
>  '
>  
> +test_expect_success 'rewrite repository including refs/replace that point to 
> non commit object' '
> + test_when_finished "git reset --hard original" &&
> + tree=$(git rev-parse HEAD^{tree}) &&
> + test_when_finished "git replace -d $tree" &&
> + echo A >new &&
> + git add new &&
> + new_tree=$(git write-tree) &&
> + git replace $tree $new_tree &&

Perhaps something like this here:

git tag -a "tag to a tree" treetag $new_tree &&

can tell su how well it works with a tag that points at a tree?

> + git reset --hard HEAD &&
> + git filter-branch -f -- --all >filter-output 2>&1 &&
> + ! fgrep fatal filter-output
> +'
> +
>  test_done


Re: [PATCH v6 00/11] nd/pack-objects-pack-struct updates

2018-03-21 Thread Duy Nguyen
On Wed, Mar 21, 2018 at 5:53 PM, Junio C Hamano  wrote:
> Ævar Arnfjörð Bjarmason  writes:
>
>> That's going to be super rare (and probably nonexisting) edge case, but
>> (untested) I wonder if something like this on top would alleviate your
>> concerns, i.e. instead of dying we just take the first N packs up to our
>> limit:
>>
>> diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
>> index 4406af640f..49d467ab2a 100644
>> --- a/builtin/pack-objects.c
>> +++ b/builtin/pack-objects.c
>> @@ -1065,8 +1065,9 @@ static int want_object_in_pack(const struct 
>> object_id *oid,
>>
>> want = 1;
>>  done:
>> -   if (want && *found_pack && !(*found_pack)->index)
>> -   oe_add_pack(_pack, *found_pack);
>> +   if (want && *found_pack && !(*found_pack)->index) {
>> +   if (oe_add_pack(_pack, *found_pack) == -1)
>> +   return 0;
>>
>> return want;
>>  }
>
> It is probably a small first step in the right direction, but we'd
> need to communicate which packs we ignored with this logic to the
> calling program.  I offhand do not know how we would handle the "-d"
> part of "repack -a -d" without it.

repack will delete all the packs except ones with .keep files and ones
created by pack-objects. So this change alone is not enough. I think I
did mention that we could make this work by making repack run
pack-objects multiple times. But I did not do it because I did not
think it could really happen.
-- 
Duy


Re: [PATCH v6 00/11] nd/pack-objects-pack-struct updates

2018-03-21 Thread Junio C Hamano
Ævar Arnfjörð Bjarmason  writes:

> That's going to be super rare (and probably nonexisting) edge case, but
> (untested) I wonder if something like this on top would alleviate your
> concerns, i.e. instead of dying we just take the first N packs up to our
> limit:
>
> diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
> index 4406af640f..49d467ab2a 100644
> --- a/builtin/pack-objects.c
> +++ b/builtin/pack-objects.c
> @@ -1065,8 +1065,9 @@ static int want_object_in_pack(const struct 
> object_id *oid,
>
> want = 1;
>  done:
> -   if (want && *found_pack && !(*found_pack)->index)
> -   oe_add_pack(_pack, *found_pack);
> +   if (want && *found_pack && !(*found_pack)->index) {
> +   if (oe_add_pack(_pack, *found_pack) == -1)
> +   return 0;
>
> return want;
>  }

It is probably a small first step in the right direction, but we'd
need to communicate which packs we ignored with this logic to the
calling program.  I offhand do not know how we would handle the "-d"
part of "repack -a -d" without it.



Re: [PATCH v6 00/11] nd/pack-objects-pack-struct updates

2018-03-21 Thread Duy Nguyen
On Wed, Mar 21, 2018 at 04:59:19PM +0100, Duy Nguyen wrote:
> About the 16k limit (and some other limits as well), I'm making these
> patches with the assumption that large scale deployment probably will
> go with custom builds anyway. Adjusting the limits back should be
> quite easy while we can still provide reasonable defaults for most
> people.

And we could even do something like this to make custom builds
easier. Some more gluing is needed so you can set this from config.mak
but you get the idea. This removes all limits set by this
series. Readability in pack-objects.c and object_entry struct
declaration is still a concern though.

-- 8< --
diff --git a/pack-objects.h b/pack-objects.h
index af40211105..b6e84c9b48 100644
--- a/pack-objects.h
+++ b/pack-objects.h
@@ -2,10 +2,17 @@
 #define PACK_OBJECTS_H
 
 #define OE_DFS_STATE_BITS  2
+#ifdef PACK_OBJECTS_BIG_MEMORY
+#define OE_DEPTH_BITS  31
+/* OE_IN_PACK_BITS is not defined */
+#define OE_Z_DELTA_BITS32
+#define OE_DELTA_SIZE_BITS 32
+#else
 #define OE_DEPTH_BITS  12
 #define OE_IN_PACK_BITS14
 #define OE_Z_DELTA_BITS16
 #define OE_DELTA_SIZE_BITS 31
+#endif
 
 /*
  * State flags for depth-first search used for analyzing delta cycles.
@@ -82,7 +89,11 @@ struct object_entry {
 */
uint32_t delta_size_:OE_DELTA_SIZE_BITS; /* delta data size 
(uncompressed) */
uint32_t delta_size_valid:1;
+#ifdef PACK_OBJECTS_BIG_MEMORY
+   struct packed_git *in_pack; /* already in pack */
+#else
unsigned in_pack_idx:OE_IN_PACK_BITS;   /* already in pack */
+#endif
unsigned size_valid:1;
unsigned z_delta_size:OE_Z_DELTA_BITS;
unsigned type_valid:1;
@@ -112,7 +123,9 @@ struct packing_data {
 
unsigned int *in_pack_pos;
int in_pack_count;
+#ifndef PACK_OBJECTS_BIG_MEMORY
struct packed_git *in_pack[1 << OE_IN_PACK_BITS];
+#endif
 };
 
 struct object_entry *packlist_alloc(struct packing_data *pdata,
@@ -174,6 +187,9 @@ static inline void oe_set_in_pack_pos(const struct 
packing_data *pack,
 static inline unsigned int oe_add_pack(struct packing_data *pack,
   struct packed_git *p)
 {
+#ifdef PACK_OBJECTS_BIG_MEMORY
+   return 0;
+#else
if (pack->in_pack_count >= (1 << OE_IN_PACK_BITS))
die(_("too many packs to handle in one go. "
  "Please add .keep files to exclude\n"
@@ -187,22 +203,31 @@ static inline unsigned int oe_add_pack(struct 
packing_data *pack,
}
pack->in_pack[pack->in_pack_count] = p;
return pack->in_pack_count++;
+#endif
 }
 
 static inline struct packed_git *oe_in_pack(const struct packing_data *pack,
const struct object_entry *e)
 {
+#ifdef PACK_OBJECTS_BIG_MEMORY
+   return e->in_pack;
+#else
return pack->in_pack[e->in_pack_idx];
+#endif
 
 }
 
 static inline void oe_set_in_pack(struct object_entry *e,
  struct packed_git *p)
 {
+#ifdef PACK_OBJECTS_BIG_MEMORY
+   e->in_pack = p;
+#else
if (p->index <= 0)
die("BUG: found_pack should be NULL "
"instead of having non-positive index");
e->in_pack_idx = p->index;
+#endif
 
 }
 
-- 8< --


[PATCH 2/3] Introduce a reusable memory pool type

2018-03-21 Thread jameson . miller81
From: Jameson Miller 

Extract the existing memory pool logic used by fast-import into a
generalized component. This memory pool component can then be used by
other components that need this functionality.

Signed-off-by: Jameson Miller 
---
 Makefile   |   1 +
 mem-pool.c | 130 +
 mem-pool.h |  48 +++
 3 files changed, 179 insertions(+)
 create mode 100644 mem-pool.c
 create mode 100644 mem-pool.h

diff --git a/Makefile b/Makefile
index a1d8775adb..1e142b1dd9 100644
--- a/Makefile
+++ b/Makefile
@@ -832,6 +832,7 @@ LIB_OBJS += lockfile.o
 LIB_OBJS += log-tree.o
 LIB_OBJS += mailinfo.o
 LIB_OBJS += mailmap.o
+LIB_OBJS += mem-pool.o
 LIB_OBJS += match-trees.o
 LIB_OBJS += merge.o
 LIB_OBJS += merge-blobs.o
diff --git a/mem-pool.c b/mem-pool.c
new file mode 100644
index 00..3028bc3c67
--- /dev/null
+++ b/mem-pool.c
@@ -0,0 +1,130 @@
+/*
+ * Memory Pool implementation logic.
+ */
+
+#include "cache.h"
+#include "mem-pool.h"
+
+#define MIN_ALLOC_GROWTH_SIZE 1024 * 1024
+
+struct mp_block {
+   struct mp_block *next_block;
+   char *next_free;
+   char *end;
+   uintmax_t space[FLEX_ARRAY];
+};
+
+static struct mp_block *mem_pool_alloc_block(struct mem_pool *mem_pool, size_t 
block_alloc)
+{
+   struct mp_block *p;
+
+   /* Round up to a 'uintmax_t' alignment */
+   if (block_alloc & (sizeof(uintmax_t) - 1))
+   block_alloc += sizeof(uintmax_t) - (block_alloc & 
(sizeof(uintmax_t) - 1));
+
+   mem_pool->pool_alloc += block_alloc;
+
+   p = xmalloc(st_add(sizeof(struct mp_block), block_alloc));
+
+   p->next_block = mem_pool->mp_block;
+   p->next_free = (char *)p->space;
+   p->end = p->next_free + block_alloc;
+   mem_pool->mp_block = p;
+
+   return p;
+}
+
+void mem_pool_init(struct mem_pool **mem_pool, size_t block_alloc, size_t 
initial_size)
+{
+   if (!(*mem_pool))
+   {
+   if (block_alloc < MIN_ALLOC_GROWTH_SIZE)
+   block_alloc = MIN_ALLOC_GROWTH_SIZE;
+
+   *mem_pool = xmalloc(sizeof(struct mem_pool));
+   (*mem_pool)->pool_alloc = 0;
+   (*mem_pool)->mp_block = 0;
+   (*mem_pool)->block_alloc = block_alloc;
+
+   if (initial_size > 0)
+   mem_pool_alloc_block((*mem_pool), initial_size);
+   }
+}
+
+void mem_pool_discard(struct mem_pool *mem_pool)
+{
+   struct mp_block *block, *block_to_free;
+   for (block = mem_pool->mp_block; block;)
+   {
+   block_to_free = block;
+   block = block->next_block;
+   free(block_to_free);
+   }
+
+   free(mem_pool);
+}
+
+void *mem_pool_alloc(struct mem_pool *mem_pool, size_t len)
+{
+   struct mp_block *p;
+   void *r;
+
+   /* Round up to a 'uintmax_t' alignment */
+   if (len & (sizeof(uintmax_t) - 1))
+   len += sizeof(uintmax_t) - (len & (sizeof(uintmax_t) - 1));
+
+   p = mem_pool->mp_block;
+
+   if (p &&
+  (p->end - p->next_free < len)) {
+   for (p = p->next_block; p; p = p->next_block)
+   if (p->end - p->next_free >= len)
+   break;
+   }
+
+   if (!p) {
+   if (len >= ((mem_pool->block_alloc - sizeof(struct mp_block)) / 
2)) {
+   p = mem_pool_alloc_block(mem_pool, len);
+   }
+   else
+   p = mem_pool_alloc_block(mem_pool, 
mem_pool->block_alloc);
+   }
+
+   r = p->next_free;
+   p->next_free += len;
+   return r;
+}
+
+void *mem_pool_calloc(struct mem_pool *mem_pool, size_t count, size_t size)
+{
+   size_t len = st_mult(count, size);
+   void *r = mem_pool_alloc(mem_pool, len);
+   memset(r, 0, len);
+   return r;
+}
+
+int mem_pool_contains(struct mem_pool *mem_pool, void *mem)
+{
+   struct mp_block *p;
+   for (p = mem_pool->mp_block; p; p = p->next_block)
+   if ((mem >= ((void *)p->space)) &&
+   (mem < ((void *)p->end)))
+   return 1;
+
+   return 0;
+}
+
+void mem_pool_combine(struct mem_pool *dst, struct mem_pool *src)
+{
+   struct mp_block **tail = >mp_block;
+   /* find pointer of dst's last block (if any) */
+   while (*tail)
+   tail = &(*tail)->next_block;
+
+   /* append the blocks from src to dst */
+   *tail = src->mp_block;
+
+   dst->pool_alloc += src->pool_alloc;
+   src->pool_alloc = 0;
+   src->mp_block = NULL;
+}
diff --git a/mem-pool.h b/mem-pool.h
new file mode 100644
index 00..902ef8caf2
--- /dev/null
+++ b/mem-pool.h
@@ -0,0 +1,48 @@
+#ifndef MEM_POOL_H
+#define MEM_POOL_H
+
+struct mem_pool {
+   struct mp_block *mp_block;
+
+   /* The size of new blocks to grow the pool by. */
+   size_t block_alloc;
+
+ 

[PATCH 3/3] fast-import: use built-in mem pool

2018-03-21 Thread jameson . miller81
From: Jameson Miller 

Signed-off-by: Jameson Miller 
---
 fast-import.c | 50 +++---
 1 file changed, 7 insertions(+), 43 deletions(-)

diff --git a/fast-import.c b/fast-import.c
index 4e68acc156..126f2da118 100644
--- a/fast-import.c
+++ b/fast-import.c
@@ -168,6 +168,7 @@ Format of STDIN stream:
 #include "dir.h"
 #include "run-command.h"
 #include "packfile.h"
+#include "mem-pool.h"
 
 #define PACK_ID_BITS 16
 #define MAX_PACK_ID ((1<next_pool)
-   if ((p->end - p->next_free >= len))
-   break;
-
-   if (!p) {
-   if (len >= (fi_mem_pool_alloc/2)) {
-   total_allocd += len;
-   return xmalloc(len);
-   }
-   total_allocd += sizeof(struct fi_mem_pool) + fi_mem_pool_alloc;
-   p = xmalloc(st_add(sizeof(struct fi_mem_pool), 
fi_mem_pool_alloc));
-   p->next_pool = mem_pool;
-   p->next_free = (char *) p->space;
-   p->end = p->next_free + fi_mem_pool_alloc;
-   mem_pool = p;
-   }
-
-   r = p->next_free;
-   p->next_free += len;
-   return r;
+   return mem_pool_alloc(_pool, len);
 }
 
 static void *pool_calloc(size_t count, size_t size)
 {
-   size_t len = count * size;
-   void *r = pool_alloc(len);
-   memset(r, 0, len);
-   return r;
+   return mem_pool_calloc(_pool, count, size);
 }
 
 static char *pool_strdup(const char *s)
@@ -3541,8 +3505,8 @@ int cmd_main(int argc, const char **argv)
fprintf(stderr, "Total branches:  %10lu (%10lu loads )\n", 
branch_count, branch_load_count);
fprintf(stderr, "  marks: %10" PRIuMAX " (%10" PRIuMAX 
" unique)\n", (((uintmax_t)1) << marks->shift) * 1024, marks_set_count);
fprintf(stderr, "  atoms: %10u\n", atom_cnt);
-   fprintf(stderr, "Memory total:%10" PRIuMAX " KiB\n", 
(total_allocd + alloc_count*sizeof(struct object_entry))/1024);
-   fprintf(stderr, "   pools:%10lu KiB\n", (unsigned 
long)(total_allocd/1024));
+   fprintf(stderr, "Memory total:%10" PRIuMAX " KiB\n", 
(total_allocd + mem_pool.pool_alloc + alloc_count*sizeof(struct 
object_entry))/1024);
+   fprintf(stderr, "   pools:%10lu KiB\n", (unsigned 
long)((total_allocd + mem_pool.pool_alloc) /1024));
fprintf(stderr, " objects:%10" PRIuMAX " KiB\n", 
(alloc_count*sizeof(struct object_entry))/1024);
fprintf(stderr, 
"-\n");
pack_report();
-- 
2.14.3



[PATCH 1/3] fast-import: rename mem_pool to fi_mem_pool

2018-03-21 Thread jameson . miller81
From: Jameson Miller 

Rename the mem_pool variables and structs in fast-import.c that will
conflict with an upcoming global mem_pool type.

Signed-off-by: Jameson Miller 
---
 fast-import.c | 18 +-
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/fast-import.c b/fast-import.c
index 58ef360da4..4e68acc156 100644
--- a/fast-import.c
+++ b/fast-import.c
@@ -209,8 +209,8 @@ struct last_object {
unsigned no_swap : 1;
 };
 
-struct mem_pool {
-   struct mem_pool *next_pool;
+struct fi_mem_pool {
+   struct fi_mem_pool *next_pool;
char *next_free;
char *end;
uintmax_t space[FLEX_ARRAY]; /* more */
@@ -304,9 +304,9 @@ static int global_argc;
 static const char **global_argv;
 
 /* Memory pools */
-static size_t mem_pool_alloc = 2*1024*1024 - sizeof(struct mem_pool);
+static size_t fi_mem_pool_alloc = 2*1024*1024 - sizeof(struct fi_mem_pool);
 static size_t total_allocd;
-static struct mem_pool *mem_pool;
+static struct fi_mem_pool *mem_pool;
 
 /* Atom management */
 static unsigned int atom_table_sz = 4451;
@@ -636,7 +636,7 @@ static unsigned int hc_str(const char *s, size_t len)
 
 static void *pool_alloc(size_t len)
 {
-   struct mem_pool *p;
+   struct fi_mem_pool *p;
void *r;
 
/* round up to a 'uintmax_t' alignment */
@@ -648,15 +648,15 @@ static void *pool_alloc(size_t len)
break;
 
if (!p) {
-   if (len >= (mem_pool_alloc/2)) {
+   if (len >= (fi_mem_pool_alloc/2)) {
total_allocd += len;
return xmalloc(len);
}
-   total_allocd += sizeof(struct mem_pool) + mem_pool_alloc;
-   p = xmalloc(st_add(sizeof(struct mem_pool), mem_pool_alloc));
+   total_allocd += sizeof(struct fi_mem_pool) + fi_mem_pool_alloc;
+   p = xmalloc(st_add(sizeof(struct fi_mem_pool), 
fi_mem_pool_alloc));
p->next_pool = mem_pool;
p->next_free = (char *) p->space;
-   p->end = p->next_free + mem_pool_alloc;
+   p->end = p->next_free + fi_mem_pool_alloc;
mem_pool = p;
}
 
-- 
2.14.3



[PATCH 0/3] Extract memory pool logic into reusable component

2018-03-21 Thread jameson . miller81
From: Jameson Miller 

This patch series extracts the memory pool implementation, currently
used by fast-import, into a generalized component. This memory pool
can then be generally used by any component that needs a pool of
memory.

This patch is in preparation for a change to speed up the loading of
indexes with a large number of cache entries by reducing the number of
malloc() calls. For a rough example of how this component will be used
in this capacity, please see [1].

[1] 
https://github.com/jamill/git/compare/block_allocation_base...jamill:block_allocation

Jameson Miller (3):
  fast-import: rename mem_pool to fi_mem_pool
  Introduce a reusable memory pool type
  fast-import: use built-in mem pool

 Makefile  |   1 +
 fast-import.c |  50 --
 mem-pool.c| 130 ++
 mem-pool.h|  48 ++
 4 files changed, 186 insertions(+), 43 deletions(-)
 create mode 100644 mem-pool.c
 create mode 100644 mem-pool.h

-- 
2.14.3



Re: [PATCH v6 00/11] nd/pack-objects-pack-struct updates

2018-03-21 Thread Ævar Arnfjörð Bjarmason

On Wed, Mar 21 2018, Jeff King wrote:

> On Sun, Mar 18, 2018 at 03:25:15PM +0100, Nguyễn Thái Ngọc Duy wrote:
>
>> v6 fixes the one optimization that I just couldn't get right, fixes
>> two off-by-one error messages and a couple commit message update
>> (biggest change is in 11/11 to record some numbers from AEvar)
>
> [...]Yes, having that many packs is insane, but that's going to be
> small consolation to somebody whose automated maintenance program now
> craps out at 16k packs, when it previously would have just worked to
> fix the situation[...]

That's going to be super rare (and probably nonexisting) edge case, but
(untested) I wonder if something like this on top would alleviate your
concerns, i.e. instead of dying we just take the first N packs up to our
limit:

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 4406af640f..49d467ab2a 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -1065,8 +1065,9 @@ static int want_object_in_pack(const struct object_id 
*oid,

want = 1;
 done:
-   if (want && *found_pack && !(*found_pack)->index)
-   oe_add_pack(_pack, *found_pack);
+   if (want && *found_pack && !(*found_pack)->index) {
+   if (oe_add_pack(_pack, *found_pack) == -1)
+   return 0;

return want;
 }
diff --git a/pack-objects.h b/pack-objects.h
index 9f8e450e19..50ed2028fb 100644
--- a/pack-objects.h
+++ b/pack-objects.h
@@ -171,15 +171,17 @@ static inline void oe_set_in_pack_pos(const struct 
packing_data *pack,
pack->in_pack_pos[e - pack->objects] = pos;
 }

-static inline unsigned int oe_add_pack(struct packing_data *pack,
+static inline int oe_add_pack(struct packing_data *pack,
   struct packed_git *p)
 {
-   if (pack->in_pack_count >= (1 << OE_IN_PACK_BITS))
-   die(_("too many packs to handle in one go. "
- "Please add .keep files to exclude\n"
- "some pack files and keep the number "
- "of non-kept files below %d."),
+   if (pack->in_pack_count >= (1 << OE_IN_PACK_BITS)) {
+   warning(_("Too many packs to handle in one go. "
+ "Ran into the limit of %d.\n"
+ "Limping along by pretending packs beyond that"
+ "number have *.keep!"),
1 << OE_IN_PACK_BITS);
+   return -1;
+   }
if (p) {
if (p->index > 0)
die("BUG: this packed is already indexed");


Re: [PATCH v6 00/11] nd/pack-objects-pack-struct updates

2018-03-21 Thread Duy Nguyen
On Wed, Mar 21, 2018 at 5:17 PM, Ævar Arnfjörð Bjarmason
 wrote:
>>> I think ultimately to work on low-memory machines we'll need a
>>> fundamentally different approach that scales with the objects since the
>>> last pack, and not with the complete history.
>>
>> Absolutely. Which is covered in a separate "gc --auto" series. Some
>> memory reduction here may be still nice to have though. Even on beefy
>> machine, memory can still be reused somewhere other than wasted in
>> unused bits.
>
> FWIW I've been running a combination of these two at work (also keeping
> the big pack), and they've had a sizable impact on packing our monorepo,
> on one of our dev boxes on a real-world checkout with a combo of the
> "base" pack and other packs + loose objects, as measured by
> /usr/bin/time
>
>  * Reduction in user time by 37%
>  * Reduction in system time by 84%
>  * Reduction in RSS by 61%
>  * Reduction in page faults by 58% & 94% (time(1) reports two different 
> numbers)
>  * Reduction in the I of I/O by 58%
>  * Reduction in the O of I/O by 94%

The keeping big pack changes very likely contributes to most of this
reduction, so just to be clear these numbers can't be be used as an
argument in favor of this pack-objects series (but otherwise, wow! I
guess I need to finish up the gc series soon, then start the external
rev-list work to reduce even more ;-)
-- 
Duy


Re: .gitattributes override behavior (possible bug, or documentation bug)

2018-03-21 Thread Junio C Hamano
Dakota Hawkins  writes:

> At any rate, would it at least be a good idea to make the "trailing
> slash halts recursion, won't consider nested .gitignore files"
> explicit in the `.gitignore` doc? Unless I'm missing it, I don't think
> that behavior is called out (or at least not called out concisely/in
> one place). It looks like this is all there is:
>
> "If the pattern ends with a slash, it is removed for the purpose
> of the following description, but it would only find a match with a
> directory. In other words, foo/ will match a directory foo and paths
> underneath it, but will not match a regular file or a symbolic link
> foo (this is consistent with the way how pathspec works in general in
> Git)."

This is not saying "trailing slash halts" (and there is no need to
say "trailing slash halts"---the rule is "directory is not recursed
into").  The mention of '/' in the above paragraph is merely saying
that we chose to use a trailing slash as a syntax to say "I want the
path to match this pattern, but only when the path is a directory".
In other words, if "D" (nothing else on the line) is on a line, it
would match a file whose path is "D", but if you write "D/", it
would not.  It only would match a directory whose path is "D".

What "removed for the purpose of the following description" wants to
say is about where in the path "D" in the above example can appear.
Another rule after the paragraph would say that a pattern with a
slash in it will match only at the current level, and a pattern with
no slash would match in any level down below.  Imagine an entry "D"
and another entry "A/B" in .gitignore at the top-level of the
project.  The former has slash in it, the latter does not.  These
patterns match paths "D", "X/D", "A/B" but not "X/A/B".  The first
two match because the pattern "D" is not anchored with any slash,
the last one does not match because the pattern "A/B" has a slash in
it.

If the top-level .gitignore had "D/" and "A/B" in the above example,
and paths "D" and "X/D" were both directories, the pattern "D/"
still matches the directory "X/D", even though it is spelled with a
slash in it.  That slash is there merely for the purpose of marking
the pattern to only match directories, and does not trigger "slash
in a pattern anchors the pattern" rule, because it is "removed for
the purpose of the following description".

> Also, I'm not sure what the "following description" is in "it is
> removed for the purpose of the following description". Is that trying
> to imply "excluded from the rest of the doc"?

See above.


Re: [PATCH] doc/gitattributes: mention non-recursive behavior

2018-03-21 Thread Duy Nguyen
On Wed, Mar 21, 2018 at 7:50 AM, Jeff King  wrote:
> On Tue, Mar 20, 2018 at 05:41:52PM +0100, Duy Nguyen wrote:
>
>> > +The rules by which the pattern matches paths are the same as in
>> > +`.gitignore` files (see linkgit:gitignore[5]), with a few exceptions:
>> > +
>> > +  - negative patterns are forbidden
>>
>> After 8b1bd02415 (Make !pattern in .gitattributes non-fatal -
>> 2013-03-01) maybe we could use the verb "ignored" too instead of
>> "forbidden"
>
> Makes sense. The original is already in 'next', so do you want to send
> an incremental patch?

It's up to you. After all it's you who's doing all the work :)

>> > +pointless in an attributes file; use `path/**` instead)
>>
>> We probably could do this internally too (converting "path/" to
>> "path/**") but we need to deal with corner cases (e.g. "path" without
>> the trailing slash, but is a directory). So yes, suggesting the user
>> to do it instead may be easier.
>
> Yeah, I almost suggested that, but I worried about those corner cases.
> It seems like documenting the current behavior is the right first step
> in any case.

Agreed.

> One other maybe-difference I came across coincidentally today: you have
> to quote the pattern in .gitattributes if it contains spaces, but not in
> .gitignore. But that's more an artifact of the rest of the file syntax
> than the pattern syntax (.gitignore has no other fields to confuse it
> with).

Yeah I forgot about that (and I was the one who started it). The
document was updated in 860a74d9d9 (attr: support quoting pathname
patterns in C style - 2017-01-27) though.
-- 
Duy


Re: [PATCH v6 00/11] nd/pack-objects-pack-struct updates

2018-03-21 Thread Ævar Arnfjörð Bjarmason

On Wed, Mar 21 2018, Duy Nguyen wrote:

> On Wed, Mar 21, 2018 at 9:24 AM, Jeff King  wrote:
>> On Sun, Mar 18, 2018 at 03:25:15PM +0100, Nguyễn Thái Ngọc Duy wrote:
>>
>>> v6 fixes the one optimization that I just couldn't get right, fixes
>>> two off-by-one error messages and a couple commit message update
>>> (biggest change is in 11/11 to record some numbers from AEvar)
>>
>> I was traveling during some of the earlier rounds, so I finally got a
>> chance to take a look at this.
>>
>> I hate to be a wet blanket, but am I the only one who is wondering
>> whether the tradeoffs is worth it? 8% memory reduction doesn't seem
>> mind-bogglingly good,
>
> AEvar measured RSS. If we count objects[] array alone, the saving is
> 40% (136 bytes per entry down to 80). Some is probably eaten up by
> mmap in rss.

Yeah, sorry about spreading that confusion.

>> and I'm concerned about two things:
>>
>>   1. The resulting code is harder to read and reason about (things like
>>  the DELTA() macros), and seems a lot more brittle (things like the
>>  new size_valid checks).
>>
>>   2. There are lots of new limits. Some of these are probably fine
>>  (e.g., the cacheable delta size), but things like the
>>  number-of-packs limit don't have very good user-facing behavior.
>>  Yes, having that many packs is insane, but that's going to be small
>>  consolation to somebody whose automated maintenance program now
>>  craps out at 16k packs, when it previously would have just worked
>>  to fix the situation.
>>
>> Saving 8% is nice, but the number of objects in linux.git grew over 12%
>> in the last year. So you've bought yourself 8 months before the problem
>> is back. Is it worth making these changes that we'll have to deal with
>> for many years to buy 8 months of memory savings?
>
> Well, with 40% it buys us a couple more months. The object growth
> affects rev-list --all too so the actual "good months" is probably not
> super far from 8 months.
>
> Is it worth saving? I don't know. I raised the readability point from
> the very first patch and if people believe it makes it much harder to
> read, then no it's not worth it.
>
> While pack-objects is simple from the functionality point of view, it
> has received lots of optimizations and to me is quite fragile.
> Readability does count in this code. Fortunately it still looks quite
> ok to me with this series applied (but then it's subjective)
>
> About the 16k limit (and some other limits as well), I'm making these
> patches with the assumption that large scale deployment probably will
> go with custom builds anyway. Adjusting the limits back should be
> quite easy while we can still provide reasonable defaults for most
> people.
>
>> I think ultimately to work on low-memory machines we'll need a
>> fundamentally different approach that scales with the objects since the
>> last pack, and not with the complete history.
>
> Absolutely. Which is covered in a separate "gc --auto" series. Some
> memory reduction here may be still nice to have though. Even on beefy
> machine, memory can still be reused somewhere other than wasted in
> unused bits.

FWIW I've been running a combination of these two at work (also keeping
the big pack), and they've had a sizable impact on packing our monorepo,
on one of our dev boxes on a real-world checkout with a combo of the
"base" pack and other packs + loose objects, as measured by
/usr/bin/time

 * Reduction in user time by 37%
 * Reduction in system time by 84%
 * Reduction in RSS by 61%
 * Reduction in page faults by 58% & 94% (time(1) reports two different numbers)
 * Reduction in the I of I/O by 58%
 * Reduction in the O of I/O by 94%


Re: [PATCH v6 09/11] pack-objects: shrink size field in struct object_entry

2018-03-21 Thread Duy Nguyen
On Wed, Mar 21, 2018 at 9:03 AM, Jeff King  wrote:
> On Tue, Mar 20, 2018 at 07:08:07PM +0100, Duy Nguyen wrote:
>
>> BTW can you apply this patch? This broken && chain made me think the
>> problem was in the next test. It would have saved me lots of time if I
>> saw this "BUG" line coming from the previous test.
>>
>> -- 8< --
>> Subject: [PATCH] t9300: fix broken && chain
>>
>> Signed-off-by: Nguyễn Thái Ngọc Duy 
>> ---
>>  t/t9300-fast-import.sh | 4 ++--
>>  1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/t/t9300-fast-import.sh b/t/t9300-fast-import.sh
>> index e4d06accc4..e2a0ae4075 100755
>> --- a/t/t9300-fast-import.sh
>> +++ b/t/t9300-fast-import.sh
>> @@ -348,7 +348,7 @@ test_expect_success 'B: accept branch name "TEMP_TAG"' '
>>   INPUT_END
>>
>>   test_when_finished "rm -f .git/TEMP_TAG
>> - git gc
>> + git gc &&
>>   git prune" &&
>
> The &&-chain is broken from the first command, too. It's "rm -f", which
> is not that big a deal, but...
>
>> @@ -365,7 +365,7 @@ test_expect_success 'B: accept empty committer' '
>>   INPUT_END
>>
>>   test_when_finished "git update-ref -d refs/heads/empty-committer-1
>> - git gc
>> + git gc &&
>>   git prune" &&
>
> Same here, but we probably care more about noticing update-ref failure.

Yes. I wasn't sure if that update-ref could fail but did not check
since this was a side issue for me.
-- 
Duy


Re: .gitattributes override behavior (possible bug, or documentation bug)

2018-03-21 Thread Duy Nguyen
On Wed, Mar 21, 2018 at 4:22 AM, Dakota Hawkins
 wrote:
> Thinking about this a little more, I'm now attracted to the idea that
> its .gitignore that's weird.
>
> As I understand it, .gitignore stops recursion when there's a
> directory match (`somedir/`) but also explicitly allows nested
> .gitnore file _as well as_ exclusion (`!*.txt`).
>
> So, in the following (contrived) example, the user doesn't get what they want:
>
> repo/
> |- .git/
> |- .gitignore   # /ignore-most/
> |- ignore-most/
> |  |- .gitignore# !*.txt
> |  |- please_ignore.png
> |  |- dont_ignore_me.txt
>
> `repo/ignore-most/dont_ignore_me.txt` is still ignored, despite what
> seems like the obvious intention of the user.

Don't get me started on this. I voiced this problem a couple times.
Attempted to fix it once which made it to rc cycles and caused lots of
regressions. I haven't taken another stab since.

> Maybe a unified "best-practices" would first-and-foremost recommend
> against matching directories at all (makes sense, git doesn't manage
> directories). In the above example, changing `/ignore-most/` to
> `/ignore-most/*` has the "desired" effect.

I think it's actually more intuitive to think "ignore recursively"
(with the trailing slash) These things make sense to you once you know
exactly _how_ the matching machinery works. But normal users don't
know that. It's probably better that we add recursive matching support
in gitattributes. Then both gitignore/gitattr are more or less
consistent again and also easy to use (most of the times)
-- 
Duy


Re: [ANNOUNCE] git-sizer: compute various size-related metrics for your Git repository

2018-03-21 Thread Johannes Schindelin
Hi Michael,

On Fri, 16 Mar 2018, Michael Haggerty wrote:

> What makes a Git repository unwieldy to work with and host? It turns
> out that the respository's on-disk size in gigabytes is only part of
> the story. From our experience at GitHub, repositories cause problems
> because of poor internal layout at least as often as because of their
> overall size. For example,
> 
> * blobs or trees that are too large
> * large blobs that are modified frequently (e.g., database dumps)
> * large trees that are modified frequently
> * trees that expand to unreasonable size when checked out (e.g., "Git
> bombs" [2])
> * too many tiny Git objects
> * too many references
> * other oddities, such as giant octopus merges, super long reference
> names or file paths, huge commit messages, etc.
> 
> `git-sizer` [1] is a new open-source tool that computes various
> size-related statistics for a Git repository and points out those that
> are likely to cause problems or inconvenience to its users.

Thank you very much for sharing this tool.

I packaged this as a MSYS2 package for use in Git for Windows' SDKs. You
can install it via

pacman -Sy mingw-w64-x86_64-git-sizer

(obviously, if you are in a 32-bit SDK you want to replace x86_64 by i686)

Note: I am simply re-bundling the binaries you post to the GitHub
releases; The main purpose is to make it easier for users to include this
in their custom installers.

Second note: I briefly considered including this tool in Git for Windows,
but it does increase the size of the installer by a full megabyte, and
therefore I decided to keep it as SDK-only, optional package.

Thanks!
Dscho


Re: [PATCH v6 00/11] nd/pack-objects-pack-struct updates

2018-03-21 Thread Duy Nguyen
On Wed, Mar 21, 2018 at 9:24 AM, Jeff King  wrote:
> On Sun, Mar 18, 2018 at 03:25:15PM +0100, Nguyễn Thái Ngọc Duy wrote:
>
>> v6 fixes the one optimization that I just couldn't get right, fixes
>> two off-by-one error messages and a couple commit message update
>> (biggest change is in 11/11 to record some numbers from AEvar)
>
> I was traveling during some of the earlier rounds, so I finally got a
> chance to take a look at this.
>
> I hate to be a wet blanket, but am I the only one who is wondering
> whether the tradeoffs is worth it? 8% memory reduction doesn't seem
> mind-bogglingly good,

AEvar measured RSS. If we count objects[] array alone, the saving is
40% (136 bytes per entry down to 80). Some is probably eaten up by
mmap in rss.

> and I'm concerned about two things:
>
>   1. The resulting code is harder to read and reason about (things like
>  the DELTA() macros), and seems a lot more brittle (things like the
>  new size_valid checks).
>
>   2. There are lots of new limits. Some of these are probably fine
>  (e.g., the cacheable delta size), but things like the
>  number-of-packs limit don't have very good user-facing behavior.
>  Yes, having that many packs is insane, but that's going to be small
>  consolation to somebody whose automated maintenance program now
>  craps out at 16k packs, when it previously would have just worked
>  to fix the situation.
>
> Saving 8% is nice, but the number of objects in linux.git grew over 12%
> in the last year. So you've bought yourself 8 months before the problem
> is back. Is it worth making these changes that we'll have to deal with
> for many years to buy 8 months of memory savings?

Well, with 40% it buys us a couple more months. The object growth
affects rev-list --all too so the actual "good months" is probably not
super far from 8 months.

Is it worth saving? I don't know. I raised the readability point from
the very first patch and if people believe it makes it much harder to
read, then no it's not worth it.

While pack-objects is simple from the functionality point of view, it
has received lots of optimizations and to me is quite fragile.
Readability does count in this code. Fortunately it still looks quite
ok to me with this series applied (but then it's subjective)

About the 16k limit (and some other limits as well), I'm making these
patches with the assumption that large scale deployment probably will
go with custom builds anyway. Adjusting the limits back should be
quite easy while we can still provide reasonable defaults for most
people.

> I think ultimately to work on low-memory machines we'll need a
> fundamentally different approach that scales with the objects since the
> last pack, and not with the complete history.

Absolutely. Which is covered in a separate "gc --auto" series. Some
memory reduction here may be still nice to have though. Even on beefy
machine, memory can still be reused somewhere other than wasted in
unused bits.
-- 
Duy


[PATCH] filter-branch: consider refs can refer to an object other than commit or tag

2018-03-21 Thread Yuki Kokubun
"git filter-branch -- --all" can be confused when refs that refer to objects
other than commits or tags exists.
Because "git rev-parse --all" that is internally used can return refs that
refer to an object other than commit or tag. But it is not considered in the
phase of updating refs. Such refs can be created by "git replace" with
objects other than commits or trees.

Signed-off-by: Yuki Kokubun 
---
This patch fix the bug of the first patch.
The first patch did not consider tags.

 git-filter-branch.sh | 11 ++-
 t/t7003-filter-branch.sh | 13 +
 2 files changed, 23 insertions(+), 1 deletion(-)

diff --git a/git-filter-branch.sh b/git-filter-branch.sh
index 1b7e4b2cd..f7cd97b86 100755
--- a/git-filter-branch.sh
+++ b/git-filter-branch.sh
@@ -251,7 +251,16 @@ done < "$tempdir"/backup-refs
 
 # The refs should be updated if their heads were rewritten
 git rev-parse --no-flags --revs-only --symbolic-full-name \
-   --default HEAD "$@" > "$tempdir"/raw-heads || exit
+   --default HEAD "$@" > "$tempdir"/raw-objects || exit
+# refs/replace can refer to an object other than commit or tag
+while read ref
+do
+   type=$(git cat-file -t "$ref")
+   if test $type = commit || test $type = tag
+   then
+   echo "$ref"
+   fi
+done >"$tempdir"/raw-heads <"$tempdir"/raw-objects
 sed -e '/^^/d' "$tempdir"/raw-heads >"$tempdir"/heads
 
 test -s "$tempdir"/heads ||
diff --git a/t/t7003-filter-branch.sh b/t/t7003-filter-branch.sh
index 7cb60799b..efeaf5887 100755
--- a/t/t7003-filter-branch.sh
+++ b/t/t7003-filter-branch.sh
@@ -470,4 +470,17 @@ test_expect_success 'tree-filter deals with object name vs 
pathname ambiguity' '
git show HEAD:$ambiguous
 '
 
+test_expect_success 'rewrite repository including refs/replace that point to 
non commit object' '
+   test_when_finished "git reset --hard original" &&
+   tree=$(git rev-parse HEAD^{tree}) &&
+   test_when_finished "git replace -d $tree" &&
+   echo A >new &&
+   git add new &&
+   new_tree=$(git write-tree) &&
+   git replace $tree $new_tree &&
+   git reset --hard HEAD &&
+   git filter-branch -f -- --all >filter-output 2>&1 &&
+   ! fgrep fatal filter-output
+'
+
 test_done
-- 
2.16.2.18.g09cb46d



[GSoC][PATCH v3] test: avoid pipes in git related commands for test

2018-03-21 Thread Pratik Karki
Thank you Eric, for the review. This is follow on patch[1].

The changes in patch increased from v1 to v2 because I
got excited to work in Git codebase and I tried to
fix the exisiting problems as much as possible.
Hence, the large number of changes.


>> test_cmp expect.two output.two
>> @@ -120,18 +122,20 @@ test_expect_success 'test another branch' '
>> git config --add svn-remote.four.url "$svnrepo" &&
>> git config --add svn-remote.four.fetch trunk:refs/remotes/four/trunk 
>> &&
>> git config --add svn-remote.four.branches \
>> -"branches/*/*:refs/remotes/four/branches/*/*" &&
>> +"branches/*/*:refs/remotes/four/branches/*/*" &&
>> git config --add svn-remote.four.tags \
>> -"tags/*:refs/remotes/four/tags/*" &&
>> +"tags/*:refs/remotes/four/tags/*" &&

>I guess you sneaked in a whitespace change here which is unrelated to
>the stated purpose of this patch, thus acted as a speed bump during
>review. If this was the only instance in this test script of
>whitespace needing correction, then it _might_ be okay to include it
>in this patch, however, that's not the case. There are many other such
>instance in this test script which could be corrected, so it doesn't
>make sense to single out these two and ignore all the others.
>Therefore, you should omit this change.

I used tabify in Emacs and it must have messed up the whitespace
change. I read SubmittingPatches guideline[2] for git where it
is said that whitespace check must be done and git community is
picky about it and 'git diff --check' must be done before commit.
If I change this change back to original the 'git diff --check'
reports whitespace identation with space. So, isn't this
supposedly a fix?

[1]: 
https://public-inbox.org/git/20180319173204.31952-1-predatoram...@gmail.com/
[2]: https://github.com/git/git/blob/master/Documentation/SubmittingPatches

- >8

 Avoid using pipes downstream of Git commands since the exit
 codes of commands upstream of pipes get swallowed, thus potentially hiding
 failure of those commands. Instead, capture Git command output to a file and
 apply the downstream command(s) to that file.

Signed-off-by: Pratik Karki 
---
 t/t5300-pack-object.sh | 10 +++---
 t/t5510-fetch.sh   |  8 ++---
 t/t7001-mv.sh  | 22 ++--
 t/t7003-filter-branch.sh   |  9 +++--
 t/t9104-git-svn-follow-parent.sh   | 16 +
 t/t9108-git-svn-glob.sh| 14 
 t/t9109-git-svn-multi-glob.sh  | 38 +++--
 t/t9110-git-svn-use-svm-props.sh   | 42 +++
 t/t9111-git-svn-use-svnsync-props.sh   | 36 ++--
 t/t9114-git-svn-dcommit-merge.sh   | 10 +++---
 t/t9130-git-svn-authors-file.sh| 28 +---
 t/t9138-git-svn-authors-prog.sh| 31 -
 t/t9153-git-svn-rewrite-uuid.sh|  8 ++---
 t/t9168-git-svn-partially-globbed-names.sh | 34 +++
 t/t9350-fast-export.sh | 54 --
 15 files changed, 193 insertions(+), 167 deletions(-)

diff --git a/t/t5300-pack-object.sh b/t/t5300-pack-object.sh
index 9c68b9925..707208284 100755
--- a/t/t5300-pack-object.sh
+++ b/t/t5300-pack-object.sh
@@ -311,9 +311,9 @@ test_expect_success 'unpacking with --strict' '
rm -f .git/index &&
tail -n 10 LIST | git update-index --index-info &&
ST=$(git write-tree) &&
-   PACK5=$( git rev-list --objects "$LIST" "$LI" "$ST" | \
-   git pack-objects test-5 ) &&
-   PACK6=$( (
+   git rev-list --objects "$LIST" "$LI" "$ST" >actual &&
+   PACK5=$( git pack-objects test-5 actual &&
+   PACK5=$( git pack-objects test-5 &1 | \
-   grep -e "->" | cut -c 22- >../actual
+   git -c fetch.output=full fetch origin >actual2 2>&1 &&
+   grep -e "->" actual2 | cut -c 22- >../actual
) &&
cat >expect <<-\EOF &&
master   -> origin/master
@@ -708,8 +708,8 @@ test_expect_success C_LOCALE_OUTPUT 'fetch compact output' '
test_commit extraaa &&
(
cd compact &&
-   git -c fetch.output=compact fetch origin 2>&1 | \
-   grep -e "->" | cut -c 22- >../actual
+   git -c fetch.output=compact fetch origin >actual2 2>&1 &&
+   grep -e "->" actual2 | cut -c 22- >../actual
) &&
cat >expect <<-\EOF &&
master -> origin/*
diff --git a/t/t7001-mv.sh b/t/t7001-mv.sh
index 6e5031f56..00aa9e45b 100755
--- a/t/t7001-mv.sh
+++ b/t/t7001-mv.sh
@@ -21,8 +21,8 @@ test_expect_success \
 
 test_expect_success \
 'checking the 

Re: [PATCH] sha1_name: use bsearch_hash() for abbreviations

2018-03-21 Thread Derrick Stolee

On 3/20/2018 6:25 PM, Jonathan Tan wrote:

On Tue, 20 Mar 2018 16:03:25 -0400
Derrick Stolee  wrote:


This patch updates the abbreviation code to use bsearch_hash() as defined
in [1]. It gets a nice speedup since the old implementation did not use
the fanout table at all.

You can refer to the patch as:

   b4e00f7306a1 ("packfile: refactor hash search with fanout table",
   2018-02-15)

Also, might be worth noting that this patch builds on
jt/binsearch-with-fanout.


Thanks. That commit is in master and v2.17.0-rc0, so hopefully it isn't 
difficult to apply the patch.





One caveat about the patch: there is a place where I cast a sha1 hash
into a struct object_id pointer. This is because the abbreviation code
still uses 'const unsigned char *' instead of structs. I wanted to avoid
a hashcpy() in these calls, but perhaps that is not too heavy a cost.

I recall a discussion that there were alignment issues with doing this,
but I might have be remembering wrongly - in my limited knowledge of C
alignment, both "unsigned char *" and "struct object_id *" have the same
constraints, but I'm not sure.


Adding Brian M. Carlson in the CC line for advice on how to do this 
translation between old sha1's and new object_ids. If it isn't safe, 
then we could do a hashcpy() until the translation makes it unnecessary.


I should have compared the two methods before sending the patch, but 
running the "git log --oneline --parents" test with a hashcpy() versus a 
cast has no measurable difference in performance for Linux. Probably 
best to do the safest thing here if there is no cost to perf.





+   const unsigned char *index_fanout = p->index_data;

[snip]

+   return bsearch_hash(oid->hash, (const uint32_t*)index_fanout,
+   index_lookup, index_lookup_width, result);

This cast to "const uint32_t *" is safe, because p->index_data points to
a mmap-ed region (which has very good alignment, as far as I know). I
wonder if we should document alignment guarantees on p->index_data, and
if yes, what guarantees to declare.


The existing application in find_pack_entry_one() [1] does similar 
pack-index arithmetic before calling. A similar amount of arithmetic and 
alignment concerns appear in the commit-graph patch. Where is the right 
place to declare these alignment requirements?


In v2, I'll have find_pack_entry_one() call bsearch_pack() so this logic 
is not duplicated.


Thanks,
-Stolee

[1] 
https://github.com/git/git/blob/c6284da4ff4afbde8211efe5d03f3604b1c6b9d6/packfile.c#L1721-L1749

    find_pack_entry_one() in packfile.c


[ANNOUNCE] Git Rev News edition 37

2018-03-21 Thread Christian Couder
Hi everyone,

The 37th edition of Git Rev News is now published:

  https://git.github.io/rev_news/2018/03/21/edition-37/

Thanks a lot to all the contributors!

Enjoy,
Christian, Jakub, Markus and Gabriel.


Re: [PATCH] doc: clarify non-recursion for ignore paths like `foo/`

2018-03-21 Thread Dakota Hawkins
> I think it makes more sense to document it where it's documented now,
> i.e. under how "!" works in general, since this is an edge case with
> negative matching.

My reasoning is/was that while yes, that's true, I think it's likely
that the positive matches would be added before (in terms of git
history) the exclusions. In other words it may not occur to the user
writing an early version of `.gitignore` to think about exclusions if
they don't need to exclude anything yet -- the eventual symptoms of
this problem may be "far away" from the cause, making it harder to
diagnose (I'm living proof of that!)

> I think it can certainly be made clearer, but maybe with a practical
> example (per above) where we also show the dir structure that
> would/wouldn't be matched.
>
> I just chimed in on this one because your patch says the docs are
> "unclear" then "Explain this behavior (and its implications) more
> explicitly" without a reference to the existing adjacent docs. I think
> whatever we do we should be unifying these docs if they're amended, not
> stating this twice.

I think my inability to see it despite knowing exactly what I was
looking for (this time) makes it "unclear", at least to dummies me, so
I feel like that statement is at least somewhat defensible :)

If the explanation were unified, would the idiomatic way to do that be
to add some section way down the document with a couple of `(see SOME
DETAILS ABOUT THIS below)` and some examples in the examples section?


Re: [GSoC] Regarding "Convert scripts to builtins"

2018-03-21 Thread Johannes Schindelin
Hi Paul,

On Mon, 19 Mar 2018, Paul Sebastian Ungureanu wrote:

> I am interested in the "Convert scripts to builtins" project. I have
> recently started to analyze it better and see exactly what it entails
> and a few questions came to my mind.

Great!

> First of all, I find it difficult to pick which scripts would benefit
> the most by being rewritten.

Which ones do you use, personally? I'd go by that measure if I were you.

Oh, and if you are not really familiar with Perl, you may want to stay
away from those scripts. Perl was jokingly labeled a "write-only"
programming language in my presence, and when I see some of my own Perl
code, I would agree at least partially.

> I am thinking of git bisect, git stash and git rebase since these are
> maybe some of the most popular commands of Git. However, on the other
> side, scripts like git-rebase--interactive.sh and git-bisect.sh are also
> subject of other GSoC projects. Should I steer away from these projects
> or can I consider them?

I second Christian's suggestion: just apply for a couple projects you
would like to work on. If there really should be a conflict with another
strong proposal, we can always work something out.

> Secondly, what is too little or too much?

Judging by past GSoC's, even a moderately-sized script like git-bisect.sh
is too much in one go. Break it down into smaller pieces, start by adding
a --helper builtin (or continue by using one like git-bisect--helper), add
things incrementally.

If you get the proposed part done faster, I am sure you'll find tons of
other things to do ;-)

Now to your list of scripts. I reordered them so that I could group the
list into logical chunks, and then ordered them by what would be my
personal priority (most important scripts first).

>  * git/git-rebase--am.sh
>  * git/git-rebase--merge.sh

These are pretty interesting, and also pretty small. They would be *prime*
candidates for turning into builtins, methinks.

As we already have a builtin rebase--helper, that would be a natural way
to go: if you can analyze those scripts and figure out a natural
progression of incremental steps to convert them (see e.g.
https://github.com/git/git/compare/4e7524e012f...c44a4c650c which is how I
moved some parts of git-rebase--interactive.sh into the rebase--helper:
starting with the code generating the todo list, continuing with
transform_todo_ids, then check_commit_sha, skip_unnecessary_picks and
finally rearrange_squash), a very smooth timeline should fall right out of
that analysis.

One slight complication might be the fact that rebase--helper is really
only catering to rebase -i so far. So `rebase--helper --continue` assumes
that it should continue an interactive rebase. This assumption would
have to be shunned, by adding a function that determines which type of
rebase (if any) is currently in progress, and using that to figure out
what function to call to handle the --continue.

>  * git/git-rebase.sh

This script is a bit larger than rebase--am and rebase--merge combined,
but the biggest reason *not* to start with it is: git-rebase.sh is just an
orchestrator of the git-rebase--* scripts, doing little more than
command-line parsing and then handing off to the respective "backends".

Consequently, I would consider it to make most sense to convert the
rebase--* scripts *first*, and only when that is done, convert git-rebase
(renaming rebase--helper to rebase).

Note! There is one exception, and it is not even a full script. As
everybody knows who follows my patch series on this here mailing list, I
consider --preserve-merges one of my stupidest design mistakes ever. To
undo this (or at least to alleviate the damage I caused), I already
submitted a patch series to introduce a superseding option:
--recreate-merges. This patch series is on hold due to the -rc phase of
v2.17 and will be kicked back into action after v2.17.0 final is out. As
it is my hope that --preserve-merges can be deprecated in favor of
--recreate-merges (and eventually even phased out of Git), I would be
totally cool with git-rebase--preserve-merges.sh being factored out of
git-rebase--interactive.sh before converting the latter to pure C, and
leaving the --preserve-merges script alone until the time when it is put
to rest.

(While I was sleeping, leaving this mail unfinished, to be completed and
sent today, a patch series was sent to the mailing list that seems to
perform this refactoring of --preserve-merges into its own script.)

>  * git/git-add--interactive.perl

I personally would *love* to see this converted. But it is a huge task,
likely too big for a single GSoC project. But if you look at the code and
figure out a natural way to break this project down into smaller,
incremental conversions, that would be a way to go.

>  * git/git-bisect.sh -- there is a project about this
>  * git/git-rebase--interactive.sh -- there is a project about this
>  * git/git-submodule.sh -- there was a project about this

Indeed, there are projects 

Re: [PATCH] doc: clarify non-recursion for ignore paths like `foo/`

2018-03-21 Thread Ævar Arnfjörð Bjarmason

On Wed, Mar 21 2018, Dakota Hawkins wrote:

> Re-reading the section you quoted again a couple of times you're
> correct, but somehow that wasn't clear to me despite reading/searching
> for what I wanted to see several times.

FWIW I just knew this because I'd run into this the other day, so it was
fresh in my mind.

> While what I wrote may need improvement, I think there are a couple of
> valid concerns with the way this behavior is documented currently:
>
>   - Generally: Reading about pattern matching for .gitignore is
> awkward on its face, since positive matches have negative consequences
> (in other words `include = !match`).
>   - Specifically: This behavior that is specific to `foo/` matches is
> documented in the section for `!foo` matches. If you're trying to find
> the implications of `foo/` you may not have read about `!foo` as
> carefully.
>
> Since this behavior is practically applicable to both pattern formats,
> and since patterns in the sum of a repo's .gitignore files can get
> somewhat complicated, I think it would be a good idea to either:

I think it makes more sense to document it where it's documented now,
i.e. under how "!" works in general, since this is an edge case with
negative matching.

Whether you specify "foo" or "foo/" is then just an unrelated edge case
in how we match directories v.s. files, but doesn't per-se have anything
to do with how the inversion rules work, so I think it makes sense to
document it where we document the inversion rules.

I.e. you'd get the same for all of (assuming a directory "foo"):

f*
!*.txt

foo
!*.txt

foo/
!*.txt

So what we subsequently exclude just because it's deeper in the tree has
nothing to do with how the disambiguation syntax for matching the
directory looks like.

>   - Do this and basically explain the same behavior twice in two
> pattern format sections, or
>   - Pull the documentation for this behavior out into another section
> where users would be likely to find and understand it if they're
> looking into either pattern format

I think it can certainly be made clearer, but maybe with a practical
example (per above) where we also show the dir structure that
would/wouldn't be matched.

I just chimed in on this one because your patch says the docs are
"unclear" then "Explain this behavior (and its implications) more
explicitly" without a reference to the existing adjacent docs. I think
whatever we do we should be unifying these docs if they're amended, not
stating this twice.


Re: [PATCH] doc: clarify non-recursion for ignore paths like `foo/`

2018-03-21 Thread Dakota Hawkins
One additional note, I think I was was wrong about this:

  "In order to match `foo/` directories while allowing for
possible later exclusions, consider using a trailing wildcard (`foo/*`).
Note that matching directories with a trailing wildcard incurs some
additional performance cost, since it requires recursion into
subdirectories."

`foo/*` will let `!foo/file` work but still seems to prevent `!foo/bar/file`.

On Wed, Mar 21, 2018 at 7:53 AM, Dakota Hawkins
 wrote:
> First, apologies since I didn't get the in-reply-to correct in my
> email header. I'm not sure how to do that (using gmail/gsuite).
>
> Meant to reply to:
> https://public-inbox.org/git/20180321075041.ga24...@sigill.intra.peff.net/
>
>> Just before the context your patch quotes, we have this in gitignore(5)
>> already:
>>
>> [...]It is not possible to re-include a file if a parent directory
>> of that file is excluded. Git doesn’t list excluded directories for
>> performance reasons, so any patterns on contained files have no
>> effect, no matter where they are defined.[...]
>>
>> I can't see how your change to the documentation doesn't just re-state
>> what we already have documented, which is not to say the docs can't be
>> improved, but then we should clearly state this in one place, not
>> re-state it within the span of a few paragraphs.
>
> Context:
>
> This came up originally because of confusion with .gitattributes
> patterns, since gitattributes doesn't have the same `foo/` matching
> behavior. Jeff King was kind enough to prepare a patch for that
> documentation here:
> https://public-inbox.org/git/20180320041454.ga15...@sigill.intra.peff.net/
>
> Re-reading the section you quoted again a couple of times you're
> correct, but somehow that wasn't clear to me despite reading/searching
> for what I wanted to see several times.
>
> While what I wrote may need improvement, I think there are a couple of
> valid concerns with the way this behavior is documented currently:
>
>   - Generally: Reading about pattern matching for .gitignore is
> awkward on its face, since positive matches have negative consequences
> (in other words `include = !match`).
>   - Specifically: This behavior that is specific to `foo/` matches is
> documented in the section for `!foo` matches. If you're trying to find
> the implications of `foo/` you may not have read about `!foo` as
> carefully.
>
> Since this behavior is practically applicable to both pattern formats,
> and since patterns in the sum of a repo's .gitignore files can get
> somewhat complicated, I think it would be a good idea to either:
>
>   - Do this and basically explain the same behavior twice in two
> pattern format sections, or
>   - Pull the documentation for this behavior out into another section
> where users would be likely to find and understand it if they're
> looking into either pattern format
>
> Does that make sense?
>
> What do you think?
>
> Thanks for the feedback,
>
> - Dakota


Re: [PATCH] doc: clarify non-recursion for ignore paths like `foo/`

2018-03-21 Thread Dakota Hawkins
First, apologies since I didn't get the in-reply-to correct in my
email header. I'm not sure how to do that (using gmail/gsuite).

Meant to reply to:
https://public-inbox.org/git/20180321075041.ga24...@sigill.intra.peff.net/

> Just before the context your patch quotes, we have this in gitignore(5)
> already:
>
> [...]It is not possible to re-include a file if a parent directory
> of that file is excluded. Git doesn’t list excluded directories for
> performance reasons, so any patterns on contained files have no
> effect, no matter where they are defined.[...]
>
> I can't see how your change to the documentation doesn't just re-state
> what we already have documented, which is not to say the docs can't be
> improved, but then we should clearly state this in one place, not
> re-state it within the span of a few paragraphs.

Context:

This came up originally because of confusion with .gitattributes
patterns, since gitattributes doesn't have the same `foo/` matching
behavior. Jeff King was kind enough to prepare a patch for that
documentation here:
https://public-inbox.org/git/20180320041454.ga15...@sigill.intra.peff.net/

Re-reading the section you quoted again a couple of times you're
correct, but somehow that wasn't clear to me despite reading/searching
for what I wanted to see several times.

While what I wrote may need improvement, I think there are a couple of
valid concerns with the way this behavior is documented currently:

  - Generally: Reading about pattern matching for .gitignore is
awkward on its face, since positive matches have negative consequences
(in other words `include = !match`).
  - Specifically: This behavior that is specific to `foo/` matches is
documented in the section for `!foo` matches. If you're trying to find
the implications of `foo/` you may not have read about `!foo` as
carefully.

Since this behavior is practically applicable to both pattern formats,
and since patterns in the sum of a repo's .gitignore files can get
somewhat complicated, I think it would be a good idea to either:

  - Do this and basically explain the same behavior twice in two
pattern format sections, or
  - Pull the documentation for this behavior out into another section
where users would be likely to find and understand it if they're
looking into either pattern format

Does that make sense?

What do you think?

Thanks for the feedback,

- Dakota


Re: [PATCH] doc: clarify non-recursion for ignore paths like `foo/`

2018-03-21 Thread Ævar Arnfjörð Bjarmason

On Wed, Mar 21 2018, Dakota Hawkins jotted:

>>> At any rate, would it at least be a good idea to make the "trailing
>>> slash halts recursion, won't consider nested .gitignore files"
>>> explicit in the `.gitignore` doc? Unless I'm missing it, I don't think
>>> that behavior is called out (or at least not called out concisely/in
>>> one place). It looks like this is all there is:
>>
>> Yeah, it's definitely come up multiple times over the years. I don't
>> know what all is in gitignore(5), but if it's not mentioned it probably
>> should be.
>>
>>> "If the pattern ends with a slash, it is removed for the purpose
>>> of the following description, but it would only find a match with a
>>> directory. In other words, foo/ will match a directory foo and paths
>>> underneath it, but will not match a regular file or a symbolic link
>>> foo (this is consistent with the way how pathspec works in general in
>>> Git)."
>>>
>>> Also, I'm not sure what the "following description" is in "it is
>>> removed for the purpose of the following description". Is that trying
>>> to imply "excluded from the rest of the doc"?
>>
>> I think it means "for the rest of the description of how the patterns
>> work". I.e., "foo/" matches as "foo" when the rest of the matching rules
>> are applied. I agree it's a bit awkward. Patches welcome. :)
>
> I hope this is correct. I tried to follow the instructions. Please let
> me know if I need to fix something with how I'm sending this!
>
> -- >8 --
> Subject: [PATCH] doc: clarify non-recursion for ignore paths like `foo/`
>
> The behavior of .gitignore patterns ending with trailing slashes is
> unclear. The user may expect subsequent matching patterns to matter, while
> they do not. For example:
>
>   foo/   # Ignores `foo` directories and everything inside of them
>   !foo/*.txt # Does nothing
>
> Explain this behavior (and its implications) more explicitly.
>
> Signed-off-by: Dakota Hawkins 
> ---
>  Documentation/gitignore.txt | 17 +++--
>  1 file changed, 11 insertions(+), 6 deletions(-)
>
> diff --git a/Documentation/gitignore.txt b/Documentation/gitignore.txt
> index ff5d7f9ed..e9c34c1d5 100644
> --- a/Documentation/gitignore.txt
> +++ b/Documentation/gitignore.txt
> @@ -89,12 +89,17 @@ PATTERN FORMAT
> Put a backslash ("`\`") in front of the first "`!`" for patterns
> that begin with a literal "`!`", for example, "`\!important!.txt`".
>
> - - If the pattern ends with a slash, it is removed for the
> -   purpose of the following description, but it would only find
> -   a match with a directory.  In other words, `foo/` will match a
> -   directory `foo` and paths underneath it, but will not match a
> -   regular file or a symbolic link `foo` (this is consistent
> -   with the way how pathspec works in general in Git).
> + - If the pattern ends with a slash it will match directories but prevent
> +   further recursion into subdirectories. In other words, `foo/` will match a
> +   directory `foo`, excluding files and paths underneath it, but will not 
> match
> +   a regular file or a symbolic link `foo` (this is consistent with the way
> +   that pathspec works in general in Git). Consequently, `foo/` will prevent
> +   consideration of subsequent matches, including exclusions (for example,
> +   `!foo/*.noignore`). In order to match `foo/` directories while allowing 
> for
> +   possible later exclusions, consider using a trailing wildcard (`foo/*`).
> +   Note that matching directories with a trailing wildcard incurs some
> +   additional performance cost, since it requires recursion into
> +   subdirectories.
>
>   - If the pattern does not contain a slash '/', Git treats it as
> a shell glob pattern and checks for a match against the

Just before the context your patch quotes, we have this in gitignore(5)
already:

[...]It is not possible to re-include a file if a parent directory
of that file is excluded. Git doesn’t list excluded directories for
performance reasons, so any patterns on contained files have no
effect, no matter where they are defined.[...]

I can't see how your change to the documentation doesn't just re-state
what we already have documented, which is not to say the docs can't be
improved, but then we should clearly state this in one place, not
re-state it within the span of a few paragraphs.


[PATCH] doc: clarify non-recursion for ignore paths like `foo/`

2018-03-21 Thread Dakota Hawkins
>> At any rate, would it at least be a good idea to make the "trailing
>> slash halts recursion, won't consider nested .gitignore files"
>> explicit in the `.gitignore` doc? Unless I'm missing it, I don't think
>> that behavior is called out (or at least not called out concisely/in
>> one place). It looks like this is all there is:
>
> Yeah, it's definitely come up multiple times over the years. I don't
> know what all is in gitignore(5), but if it's not mentioned it probably
> should be.
>
>> "If the pattern ends with a slash, it is removed for the purpose
>> of the following description, but it would only find a match with a
>> directory. In other words, foo/ will match a directory foo and paths
>> underneath it, but will not match a regular file or a symbolic link
>> foo (this is consistent with the way how pathspec works in general in
>> Git)."
>>
>> Also, I'm not sure what the "following description" is in "it is
>> removed for the purpose of the following description". Is that trying
>> to imply "excluded from the rest of the doc"?
>
> I think it means "for the rest of the description of how the patterns
> work". I.e., "foo/" matches as "foo" when the rest of the matching rules
> are applied. I agree it's a bit awkward. Patches welcome. :)

I hope this is correct. I tried to follow the instructions. Please let
me know if I need to fix something with how I'm sending this!

-- >8 --
Subject: [PATCH] doc: clarify non-recursion for ignore paths like `foo/`

The behavior of .gitignore patterns ending with trailing slashes is
unclear. The user may expect subsequent matching patterns to matter, while
they do not. For example:

  foo/   # Ignores `foo` directories and everything inside of them
  !foo/*.txt # Does nothing

Explain this behavior (and its implications) more explicitly.

Signed-off-by: Dakota Hawkins 
---
 Documentation/gitignore.txt | 17 +++--
 1 file changed, 11 insertions(+), 6 deletions(-)

diff --git a/Documentation/gitignore.txt b/Documentation/gitignore.txt
index ff5d7f9ed..e9c34c1d5 100644
--- a/Documentation/gitignore.txt
+++ b/Documentation/gitignore.txt
@@ -89,12 +89,17 @@ PATTERN FORMAT
Put a backslash ("`\`") in front of the first "`!`" for patterns
that begin with a literal "`!`", for example, "`\!important!.txt`".

- - If the pattern ends with a slash, it is removed for the
-   purpose of the following description, but it would only find
-   a match with a directory.  In other words, `foo/` will match a
-   directory `foo` and paths underneath it, but will not match a
-   regular file or a symbolic link `foo` (this is consistent
-   with the way how pathspec works in general in Git).
+ - If the pattern ends with a slash it will match directories but prevent
+   further recursion into subdirectories. In other words, `foo/` will match a
+   directory `foo`, excluding files and paths underneath it, but will not match
+   a regular file or a symbolic link `foo` (this is consistent with the way
+   that pathspec works in general in Git). Consequently, `foo/` will prevent
+   consideration of subsequent matches, including exclusions (for example,
+   `!foo/*.noignore`). In order to match `foo/` directories while allowing for
+   possible later exclusions, consider using a trailing wildcard (`foo/*`).
+   Note that matching directories with a trailing wildcard incurs some
+   additional performance cost, since it requires recursion into
+   subdirectories.

  - If the pattern does not contain a slash '/', Git treats it as
a shell glob pattern and checks for a match against the
-- 
2.16.2.windows.1


Re: [RFC][GSoC] Project proposal: convert interactive rebase to C

2018-03-21 Thread Alban Gruin
Hi Johannes,

Le mardi 20 mars 2018 17:29:28 CET, vous avez écrit :
> > Weeks 1 & 2 — May 14, 2018 – May 28, 2018
> > First, I would refactor --preserve-merges in its own shell script, as
> > described in Dscho’s email.
> 
> Could you go into detail a bit here? Like, describe what parts of the
> git-rebase--interactive.sh script would need to be duplicated, which ones
> would be truly moved, etc
> 

It would lead to duplicate a good chunk of git_rebase__interactive(), 
apparently. The moved parts would be everything in `if test t = 
"$preserve_merges"; then …; fi` statements. That is, about 50 lines of shell 
code.

Judging by that, beginning by that is probably not the right thing to do. 
Also, somebody is already working on that[1]. 

> > Weeks 3 & 4 — May 18, 2018 – June 11, 2018
> > Then, I would start to rewrite git-rebase--interactive, and get rid of
> > git-
> > rebase--helper.
> 
> I think this is a bit premature, as the rebase--helper would not only
> serve the --interactive part, but in later conversions also --am and
> --merge, and only in the very end, when git-rebase.sh gets converted,
> would we be able to simply rename the rebase--helper to rebase.
> 

Yes, Christian Couder told me that it would not be a good methodology too.

> Also, I have a hunch that there is actually almost nothing left to rewrite
> after my sequencer improvements that made it into Git v2.13.0, together
> with the upcoming changes (which are on top of the --recreate-merges patch
> series, hence I did not send them to the mailing list yet) in
> https://github.com/dscho/git/commit/c261f17a4a3e

One year ago, you said[2] that converting this script "will fill up 3 month, 
very easily". Is this not accurate anymore?

> 
> So I would like to see more details here... ;-)

Yep, I’m working on that. 

> > Weeks 5 to 9 — June 11, 2018 – July 15, 2018
> > During this period, I would continue to rewrite git-rebase--interactive.
> 
> It would be good if the proposal said what parts of the conversion are
> tricky, to merit spending a month on them.
> 
> > Weeks 10 & 11 — July 16, 2018 – July 29, 2018
> > In the second half of July, I would look for bugs in the new code, test
> > it,
> > and improve its coverage.
> 
> As I mentioned in a related mail, the test suite coverage would be most
> sensibly extended *before* starting to rewrite code in C, as it helps
> catching bugs early and avoids having to merge buggy code that needs to be
> fixed immediately.

Makes sense.

> 
> > Weeks 12 — July 30, 2018 – August 5, 2018
> > In the last week, I would polish the code where needed, in order to
> > improve for performance or to make the code more readable.
> 
> Thank you for sharing this draft with us!
> Johannes

I’ll send a new draft as soon as possible (hopefully this afternoon).

Thank you for your enthousiasm :)
Alban

[1] https://public-inbox.org/git/20180320204507.12623-1-w...@saville.com/
[2] https://public-inbox.org/git/alpine.DEB.
2.20.1703231827060.3767@virtualbox/




Re: [GSoC][PATCH] test: avoid pipes in git related commands for test suite

2018-03-21 Thread Eric Sunshine
On Mon, Mar 19, 2018 at 1:32 PM, Pratik Karki  wrote:
> Thank you Eric Sunshine,
> I have done as you had instructed me. I look forward to more
> understanding of the codebase and would love to fix
> "git rev-parse" problems in my follow-on patches.
> Thank you for the professional review comment.
>
> Sorry for late follow-on patch, I got tied up with my university stuffs.

No need to apologize; this is not a race. It's better to take time
preparing submissions carefully, than trying to rush them out.

> Please do review this patch as before. I will correct it if needed.

Below comments are meant to be instructive and constructive.

> Cheers,
> Pratik Karki

Place a "-- >8--" scissor line right here so that git-am knows where
the commit message begins; otherwise, all of the above commentary will
undesirably be included in the commit message.

> [PATCH] test: avoid pipes in git related commands for test suite

As this is the second attempt at this patch, the subject should be
"[PATCH v2]". Also, as an aid to reviewers -- who see a lot of patches
each day and are likely to forget details of each submission -- please
include a link in the commentary (not in the actual commit message)
pointing at the previous iteration, like this[1].

[1]: https://public-inbox.org/git/20180313201945.8409-1-predatoram...@gmail.com/

> Avoid using pipes downstream of Git commands since the exit codes
> of commands upstream of pipes get swallowed, thus potentially
> hiding failure of those commands. Instead, capture Git command
> output to a file apply the downstream command(s) to that file.

This rewrite of the commit message which I suggested in [2] has a
grammatical error (which I noticed immediately after hitting "Send").
Unfortunately, you copied it verbatim, so the error is reproduced
here. Specifically, you want to insert "and" between "file" and
"apply":

... capture Git command output to a file _and_ apply...

[2]: 
https://public-inbox.org/git/CAPig+cRPzyw525ODC4=-E7w=zbpbhvn2eqxsydslij5wfw8...@mail.gmail.com/

> Signed-off-by: Pratik Karki 
> ---
>  t/t5300-pack-object.sh | 10 +++---
>  t/t5510-fetch.sh   |  8 ++---
>  t/t7001-mv.sh  | 22 ++---
>  t/t7003-filter-branch.sh   |  9 --
>  t/t9104-git-svn-follow-parent.sh   | 16 +
>  t/t9108-git-svn-glob.sh| 14 
>  t/t9109-git-svn-multi-glob.sh  | 28 +---
>  t/t9110-git-svn-use-svm-props.sh   | 42 
>  t/t9111-git-svn-use-svnsync-props.sh   | 36 ++---
>  t/t9114-git-svn-dcommit-merge.sh   | 10 +++---
>  t/t9130-git-svn-authors-file.sh| 28 +---
>  t/t9138-git-svn-authors-prog.sh| 31 +-
>  t/t9153-git-svn-rewrite-uuid.sh|  8 ++---
>  t/t9168-git-svn-partially-globbed-names.sh | 34 +++
>  t/t9350-fast-export.sh | 52 
> --
>  15 files changed, 187 insertions(+), 161 deletions(-)

The goal of iterating a patch or patch series is to converge to a
point at which the submission is in good enough shape to be accepted.
Ideally, each iteration should involve fewer changes than the previous
attempt.

Version 1 of this patch touched only 8 files, however, this version
touches 15, and is now uncomfortably large and difficult to review in
a single sitting (it took over 1.5 hours). Rather than converging, it
has instead diverged, and is thus potentially further from being in an
acceptable state than it would have been if v2 had merely addressed
the problems identified by the v1 review.

While the desire to address these additional cases is admirable, it is
better to focus on "landing" the current patch (getting it accepted)
before expanding your efforts; it's also more reviewer-friendly to
stay focused, especially with patches, such as this, which involve
primarily mechanical changes (which tend to be mind-numbing to
review).

> diff --git a/t/t5300-pack-object.sh b/t/t5300-pack-object.sh
> @@ -311,9 +311,9 @@ test_expect_success 'unpacking with --strict' '
> rm -f .git/index &&
> tail -n 10 LIST | git update-index --index-info &&
> ST=$(git write-tree) &&
> -   PACK5=$( git rev-list --objects "$LIST" "$LI" "$ST" | \
> -   git pack-objects test-5 ) &&
> -   PACK6=$( (
> +   git rev-list --objects "$LIST" "$LI" "$ST" >actual &&
> +   PACK5=$(git pack-objects test-5 < actual) &&
> +   PACK6=$((

Losing the space between the two left parentheses is wrong. $( ( foo )
), which captures the output of subshell running 'foo', has very
different meaning than $((foo)), which performs arithmetic. This
change turns it into $(( foo) ), which, at best, is undefined.
Although bash seems to tolerate this change, other more strict shells
barf on it.

[PATCH] filter-branch: consider refs/replace can refer to an object other than commit

2018-03-21 Thread Yuki Kokubun
"git filter-branch -- --all" can be confused when refs that refer to objects
other than commits exists.
Because "git rev-parse --all" that is internally used can return refs that
refer to an object other than commit. But it is not considered in the phase of
updating refs. Such refs can be created by "git replace" with objects other than
commits.

Signed-off-by: Yuki Kokubun 
---
 git-filter-branch.sh |  3 +++
 t/t7003-filter-branch.sh | 13 +
 2 files changed, 16 insertions(+)

diff --git a/git-filter-branch.sh b/git-filter-branch.sh
index 1b7e4b2cd..a9fafe64f 100755
--- a/git-filter-branch.sh
+++ b/git-filter-branch.sh
@@ -479,6 +479,9 @@ do
# avoid rewriting a ref twice
test -f "$orig_namespace$ref" && continue
 
+   # refs/replace can point to a non commit object
+   test $(git cat-file -t "$ref") = commit || continue
+
sha1=$(git rev-parse "$ref"^0)
rewritten=$(map $sha1)
 
diff --git a/t/t7003-filter-branch.sh b/t/t7003-filter-branch.sh
index 7cb60799b..efeaf5887 100755
--- a/t/t7003-filter-branch.sh
+++ b/t/t7003-filter-branch.sh
@@ -470,4 +470,17 @@ test_expect_success 'tree-filter deals with object name vs 
pathname ambiguity' '
git show HEAD:$ambiguous
 '
 
+test_expect_success 'rewrite repository including refs/replace that point to 
non commit object' '
+   test_when_finished "git reset --hard original" &&
+   tree=$(git rev-parse HEAD^{tree}) &&
+   test_when_finished "git replace -d $tree" &&
+   echo A >new &&
+   git add new &&
+   new_tree=$(git write-tree) &&
+   git replace $tree $new_tree &&
+   git reset --hard HEAD &&
+   git filter-branch -f -- --all >filter-output 2>&1 &&
+   ! fgrep fatal filter-output
+'
+
 test_done
-- 
2.16.2.18.g09cb46d



Re: [RFC PATCH 0/3] rebase-interactive

2018-03-21 Thread Eric Sunshine
{cc:+junio}

On Tue, Mar 20, 2018 at 11:31 PM, Wink Saville  wrote:
> Anyway, I've played around and my current thoughts are to not create
> any new files and keep git_rebase__interactive and the new
> git_rebase__interactive__preserve_merges functions in
> git-rebase--interactive.sh. Doing that will preserve the blame for
> the existing functions.

"blame -C" detects code movement across files, so you needn't feel
constrained in that sense. (In fact, "blame -C -C" should detect the
copied -- not moved -- code in your patch 1/2, so my objection is not
entirely valid, although "-C -C" is potentially much slower.)

Nevertheless, leaving all the code in git-rebase--interactive.sh
sounds sensible if there is not a compelling reason to split it out,
especially since each refactoring (especially a big one) has the
potential to collide with in-flight or planned topics.

> But if I do indentation reformating as I extract functions that will
> be shared between git_rebase__interactive and
> git_rebase__interactive__preserve_merges then we still lose the
> blame information unless the "-w" parameter is passed to blame. I
> could choose to not do the indentation, but that doesn't seem like a
> good idea.

Don't worry about indentation changes during refactoring and code
movement; it's frequently necessary, and "blame" still works (with
"-w", as you point out).

> Thoughts or other ideas?

A few...

Although you may have the entire refactoring in your head -- you know
what you moved where and why and what changes were needed to make the
move possible -- reviewers don't have that luxury. A nearly 1000-line
patch, such as 1/3, which copies code (possibly verbatim or possibly
with edits -- reviewers don't know which) and which contains
newly-written code is a significant burden on reviewers. Crafting a
patch series such that it holds the hands of reviewers by laying out
each change in easily digested chunks helps significantly, both with
attracting more and better reviews, and with potential buy-in that the
changes are worthwhile.

However, I'm just one (potential) reviewer, and I don't want you
wasting your time embarking on large-scale changes based upon my
comments before hearing from Dscho (Johannes) and Junio if such
refactoring is even welcome.


Re: .gitattributes override behavior (possible bug, or documentation bug)

2018-03-21 Thread Jeff King
On Wed, Mar 21, 2018 at 04:35:26AM -0400, Dakota Hawkins wrote:

> > I think it means "for the rest of the description of how the patterns
> > work". I.e., "foo/" matches as "foo" when the rest of the matching rules
> > are applied. I agree it's a bit awkward. Patches welcome. :)
> 
> I'd be more than happy to do that!
> 
> It will take me a while, this (email and text-patches) is foreign to
> me and will take me some extra time, but I think I can figure it out.
> 
> Is "consistent with the way how pathspec works in general in Git"
> absolutely still true (and relevant?)

I think so. "git log Makefile/" will not match anything, for example.

-Peff


Re: .gitattributes override behavior (possible bug, or documentation bug)

2018-03-21 Thread Dakota Hawkins
> I think it means "for the rest of the description of how the patterns
> work". I.e., "foo/" matches as "foo" when the rest of the matching rules
> are applied. I agree it's a bit awkward. Patches welcome. :)

I'd be more than happy to do that!

It will take me a while, this (email and text-patches) is foreign to
me and will take me some extra time, but I think I can figure it out.

Is "consistent with the way how pathspec works in general in Git"
absolutely still true (and relevant?)


Re: [PATCH v6 00/11] nd/pack-objects-pack-struct updates

2018-03-21 Thread Jeff King
On Sun, Mar 18, 2018 at 03:25:15PM +0100, Nguyễn Thái Ngọc Duy wrote:

> v6 fixes the one optimization that I just couldn't get right, fixes
> two off-by-one error messages and a couple commit message update
> (biggest change is in 11/11 to record some numbers from AEvar)

I was traveling during some of the earlier rounds, so I finally got a
chance to take a look at this.

I hate to be a wet blanket, but am I the only one who is wondering
whether the tradeoffs is worth it? 8% memory reduction doesn't seem
mind-bogglingly good, and I'm concerned about two things:

  1. The resulting code is harder to read and reason about (things like
 the DELTA() macros), and seems a lot more brittle (things like the
 new size_valid checks).

  2. There are lots of new limits. Some of these are probably fine
 (e.g., the cacheable delta size), but things like the
 number-of-packs limit don't have very good user-facing behavior.
 Yes, having that many packs is insane, but that's going to be small
 consolation to somebody whose automated maintenance program now
 craps out at 16k packs, when it previously would have just worked
 to fix the situation.

Saving 8% is nice, but the number of objects in linux.git grew over 12%
in the last year. So you've bought yourself 8 months before the problem
is back. Is it worth making these changes that we'll have to deal with
for many years to buy 8 months of memory savings?

I think ultimately to work on low-memory machines we'll need a
fundamentally different approach that scales with the objects since the
last pack, and not with the complete history.

-Peff


Re: [PATCH v6 09/11] pack-objects: shrink size field in struct object_entry

2018-03-21 Thread Jeff King
On Tue, Mar 20, 2018 at 07:08:07PM +0100, Duy Nguyen wrote:

> BTW can you apply this patch? This broken && chain made me think the
> problem was in the next test. It would have saved me lots of time if I
> saw this "BUG" line coming from the previous test.
> 
> -- 8< --
> Subject: [PATCH] t9300: fix broken && chain
> 
> Signed-off-by: Nguyễn Thái Ngọc Duy 
> ---
>  t/t9300-fast-import.sh | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/t/t9300-fast-import.sh b/t/t9300-fast-import.sh
> index e4d06accc4..e2a0ae4075 100755
> --- a/t/t9300-fast-import.sh
> +++ b/t/t9300-fast-import.sh
> @@ -348,7 +348,7 @@ test_expect_success 'B: accept branch name "TEMP_TAG"' '
>   INPUT_END
>  
>   test_when_finished "rm -f .git/TEMP_TAG
> - git gc
> + git gc &&
>   git prune" &&

The &&-chain is broken from the first command, too. It's "rm -f", which
is not that big a deal, but...

> @@ -365,7 +365,7 @@ test_expect_success 'B: accept empty committer' '
>   INPUT_END
>  
>   test_when_finished "git update-ref -d refs/heads/empty-committer-1
> - git gc
> + git gc &&
>   git prune" &&

Same here, but we probably care more about noticing update-ref failure.

-Peff


Re: .gitattributes override behavior (possible bug, or documentation bug)

2018-03-21 Thread Jeff King
On Wed, Mar 21, 2018 at 03:36:09AM -0400, Dakota Hawkins wrote:

> > I think that ignoring all of /ignore-most/ is much more efficient, since
> > we don't have to enumerate the paths inside it at all (which is why the
> > current behavior works as it does).
> 
> That's definitely true, but I wonder what the impact would be for most
> cases (even for most cases with large repos and larges sets of ignore
> files).

Probably not that much CPU, but it used to be quite annoying to fault in
the dcache for an infrequently used hierarchy. These days I generally
have an SSD and lots of memory for disk caching, so I don't know if it
would still be.

On systems where readdir/stat are slower than Linux it might be
noticeable for a big hierarchy.

> At any rate, would it at least be a good idea to make the "trailing
> slash halts recursion, won't consider nested .gitignore files"
> explicit in the `.gitignore` doc? Unless I'm missing it, I don't think
> that behavior is called out (or at least not called out concisely/in
> one place). It looks like this is all there is:

Yeah, it's definitely come up multiple times over the years. I don't
know what all is in gitignore(5), but if it's not mentioned it probably
should be.

> "If the pattern ends with a slash, it is removed for the purpose
> of the following description, but it would only find a match with a
> directory. In other words, foo/ will match a directory foo and paths
> underneath it, but will not match a regular file or a symbolic link
> foo (this is consistent with the way how pathspec works in general in
> Git)."
> 
> Also, I'm not sure what the "following description" is in "it is
> removed for the purpose of the following description". Is that trying
> to imply "excluded from the rest of the doc"?

I think it means "for the rest of the description of how the patterns
work". I.e., "foo/" matches as "foo" when the rest of the matching rules
are applied. I agree it's a bit awkward. Patches welcome. :)

-Peff


  1   2   >