Re: Migration to Git LFS inflates repository multiple times

2018-11-12 Thread Mateusz Loskot
On Mon, 12 Nov 2018 at 13:31, Jeff King  wrote:
> On Mon, Nov 12, 2018 at 12:47:42AM +0100, Mateusz Loskot wrote:
> >
> > TL;TR: Is this normal a repository migrated to Git LFS inflates multiple 
> > times
> > and how to deal with it?
>
> That does sound odd to me. People with more LFS experience can probably
> give you a better answers

FYI, I forwarded my question to https://github.com/git-lfs/git-lfs/issues/3374

> but one thought occurred to me: does LFS
> store backup copies of the original refs that it rewrites (similar to
> the way filter-branch stores refs/original)?

I don't think I see any backup refs (see below for full list).
But, I may be misunderstanding what they are, how to look for them.

> history. Which might mean storing those large blobs both as Git objects
> (for the old history) and in an LFS cache directory (for the new
> history).

Yes, it makes sense.

> And the right next step is probably to delete those backup refs, and
> then "git gc --prune=now". Hmm, actually thinking about it, reflogs
> could be making the old history reachable, too.
>
> Try looking at the output of "git for-each-ref" and seeing if there are
> any backup refs.

I see. Here is the list (long!) of all I found:

proj.git (BARE:master) $ git for-each-ref
c718eadcf8d09d68c385f0a9355a2c871474fb43 commit refs/heads/1.0
daa75889053b70515179e334cbe3fe6fc7873ff3 commit refs/heads/1.1
cb70db292c1f0c62170d05ffa8dad3c87a6f8ebd commit refs/heads/2.0
f1597e80fcea16bec96dc43f7ab706616126305b commit refs/heads/3.0
1d9e4813ae2fdc5c2b52f7115facda9059b009dc commit refs/heads/master
f41edf37e9a4120bc5d5d66b29d110d403b8db9b commit
refs/svn/attic/tags/1.0.1006/6674
850166b21f27447c6b503bb753c454ccedcea8ef commit refs/svn/attic/tags/1.0.216/1291
8a24407f2df0ea7a401fbc08b387387538912642 commit refs/svn/attic/tags/1.0.252/1543
771d81b0756d6ff7d73779ed79f49a607bffb80e commit refs/svn/attic/tags/1.0.299/1883
10925d0fe0de090d4de109fd6403b86d014d6a21 commit refs/svn/attic/tags/1.0.342/2288
ca8dc7b243d002ac6b27f219a6172d36e7885ac1 commit refs/svn/attic/tags/1.0.391/2470
79ebabed25d31dfa34ad68e58fa9327f71928df1 commit refs/svn/attic/tags/1.0.433/2657
d3ed45804843c5aa153810b7494e2c7c0b842c82 commit refs/svn/attic/tags/1.0.450/2724
088af5dbb225cb8dfbeafb5b63158234f4e4017d commit refs/svn/attic/tags/1.0.502/2967
5da8598ed98a5a6610108d67849955da14f9d5b8 commit refs/svn/attic/tags/1.0.546/3212
c3463397337f9f6e5f9d8e64cd79c013fd798bc8 commit refs/svn/attic/tags/1.0.615/3470
673b2f93fda830cc8f28d436abead5fd54baa361 commit refs/svn/attic/tags/1.0.657/3704
247cf24b90afd39f4f5dd7a27cf6b74483215285 commit refs/svn/attic/tags/1.0.662/3725
e2a1609bb6b15ee767565ca0ff152eae3f72a76b commit refs/svn/attic/tags/1.0.673/3820
48033fb9046b1ee60ad9b73073e9185c31ee4568 commit refs/svn/attic/tags/1.0.742/4325
d7b566a275209d0aebba39c7a4028c9dcfb8a468 commit
refs/svn/attic/tags/1.1.1141/13525
80f922becfc406420d2f14543e6da684f7377504 commit
refs/svn/attic/tags/1.1.1535/16534
252481191cacdef0e77eb6ec02c98b07fca7bc77 commit
refs/svn/attic/tags/1.1.1582/16435
601f072d559b664c101d89f3445ed3d00f4ef5dd commit
refs/svn/attic/tags/1.1.939/12077
417fb23d71cab30e2c6218faed6f86021b67ca25 commit
refs/svn/attic/tags/2.0.1156/21143
5539fbbe0078b782af02d46e0c1abc86ce3d5902 commit refs/svn/map
c718eadcf8d09d68c385f0a9355a2c871474fb43 commit refs/svn/root/branches/1.0
daa75889053b70515179e334cbe3fe6fc7873ff3 commit refs/svn/root/branches/1.1
cb70db292c1f0c62170d05ffa8dad3c87a6f8ebd commit refs/svn/root/branches/2.0
f1597e80fcea16bec96dc43f7ab706616126305b commit refs/svn/root/branches/3.0
e946e96ce2b37a771769196027ae87b8f24181e0 commit refs/svn/root/tags/1.0.1058
06928f42664384bd5e24c115f9c23acc2fd949da commit refs/svn/root/tags/1.0.1240
9f059337974aa195386c5f3ee21957551624aa27 commit refs/svn/root/tags/1.0.1653
db98d68f93d2e9127ab766d3bbe6c933ec169d29 commit refs/svn/root/tags/1.0.764
20ad69c6b94bc8b73b613b5c21d367f22f423501 commit refs/svn/root/tags/1.1.1163
458be535d7f0bc512e800759759e86a211a418b6 commit refs/svn/root/tags/1.1.1290
df045ab97bee94e8cfe72b70802b837719899587 commit refs/svn/root/tags/1.1.1556
cd8ce83868016a0854c0f3cf1b23ea68a32674a2 commit refs/svn/root/tags/1.1.1706
fcd0801b93f48bd46b276ccb82678a70f11fc3ca commit refs/svn/root/tags/1.1.1809
5df0902cfe973b0a041409ec2e8d2314f2b8031e commit refs/svn/root/tags/1.1.2368
c9a06b4f43bab77ed283fe2736ab5c865e03026e commit refs/svn/root/tags/1.1.2417
32e505f8c4deadb73c63bd20a598481cd164541d commit refs/svn/root/tags/1.1.947
ef9c3667ec419bcb6d5eb5b9dbacb1cff0b1051e commit refs/svn/root/tags/2.0.1187
a1a6a5bedb8949eb91f3509929edb9efa9ad2875 commit refs/svn/root/tags/2.0.1198
33a8f49da311caecdb5521759251bbcb78e3bff2 commit refs/svn/root/tags/2.0.1338
63e59278131281858296b56f4ef5dd91c332941a commit refs/svn/root/tags/2.0.1481
d23a1c662f772a7fc0d23a07794b57cfd9eff064 commit refs/svn/root/tags/2.0.1835
c53e2cc4660a9e3121dff33c28c1383766fda39b commit refs/svn/root/tags/2.0.2148
c7e0293ec09fee809fba707054cd1fd8fe492664 commit 

[PATCH 2/3] ieot: default to not writing IEOT section

2018-11-12 Thread Jonathan Nieder
As with EOIE, popular versions of Git do not support the new IEOT
extension yet.  When accessing a Git repository written by a more
modern version of Git, they correctly ignore the unrecognized section,
but in the process they loudly warn

ignoring IEOT extension

resulting in confusion for users.  Introduce the index extension more
gently by not writing it yet in this first version with support for
it.  Soon, once sufficiently many users are running a modern version
of Git, we can flip the default so users benefit from this index
extension by default.

Introduce a '[index] recordOffsetTable' configuration variable to
control whether the new index extension is written.

Signed-off-by: Jonathan Nieder 
---
 Documentation/config.txt |  7 +++
 read-cache.c | 11 ++-
 2 files changed, 17 insertions(+), 1 deletion(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index d702379db4..cc66fb7de3 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -2195,6 +2195,13 @@ index.recordEndOfIndexEntries::
reading the index using Git versions before 2.20. Defaults to
'false'.
 
+index.recordOffsetTable::
+   Specifies whether the index file should include an "Index Entry
+   Offset Table" section. This reduces index load time on
+   multiprocessor machines but produces a message "ignoring IEOT
+   extension" when reading the index using Git versions before 2.20.
+   Defaults to 'false'.
+
 index.threads::
Specifies the number of threads to spawn when loading the index.
This is meant to reduce index load time on multiprocessor machines.
diff --git a/read-cache.c b/read-cache.c
index 4bfe93c4c2..290bd54708 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -2707,6 +2707,15 @@ static int record_eoie(void)
return 0;
 }
 
+static int record_ieot(void)
+{
+   int val;
+
+   if (!git_config_get_bool("index.recordoffsettable", ))
+   return val;
+   return 0;
+}
+
 /*
  * On success, `tempfile` is closed. If it is the temporary file
  * of a `struct lock_file`, we will therefore effectively perform
@@ -2767,7 +2776,7 @@ static int do_write_index(struct index_state *istate, 
struct tempfile *tempfile,
 
 #ifndef NO_PTHREADS
nr_threads = git_config_get_index_threads();
-   if (nr_threads != 1) {
+   if (nr_threads != 1 && record_ieot()) {
int ieot_blocks, cpus;
 
/*
-- 
2.19.1.930.g4563a0d9d0



[PATCH 3/3] index: do not warn about unrecognized extensions

2018-11-12 Thread Jonathan Nieder
Documentation/technical/index-format explains:

 4-byte extension signature. If the first byte is 'A'..'Z' the
 extension is optional and can be ignored.

This allows gracefully introducing a new index extension without
having to rely on all readers having support for it.  Mandatory
extensions start with a lowercase letter and optional ones start with
a capital.  Thus the versions of Git acting on a shared local
repository do not have to upgrade in lockstep.

We almost obey that convention, but there is a problem: when
encountering an unrecognized optional extension, we write

ignoring FNCY extension

to stderr, which alarms users.  This means that in practice we have
had to introduce index extensions in two steps: first add read
support, and then a while later, start writing by default.  This
delays when users can benefit from improvements to the index format.

We cannot change the past, but for index extensions of the future,
there is a straightforward improvement: silence that message except
when tracing.  This way, the message is still available when
debugging, but in everyday use it does not show up so (once most Git
users have this patch) we can turn on new optional extensions right
away without alarming people.

Signed-off-by: Jonathan Nieder 
---
Thanks for reading.  Thoughts?

Sincerely,
Jonathan

 read-cache.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/read-cache.c b/read-cache.c
index 290bd54708..65530a68c2 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -1720,7 +1720,7 @@ static int read_index_extension(struct index_state 
*istate,
if (*ext < 'A' || 'Z' < *ext)
return error("index uses %.4s extension, which we do 
not understand",
 ext);
-   fprintf(stderr, "ignoring %.4s extension\n", ext);
+   trace_printf("ignoring %.4s extension\n", ext);
break;
}
return 0;
-- 
2.19.1.930.g4563a0d9d0



Re: [PATCH 2/3] ieot: default to not writing IEOT section

2018-11-12 Thread Jonathan Tan
> +index.recordOffsetTable::
> + Specifies whether the index file should include an "Index Entry
> + Offset Table" section. This reduces index load time on
> + multiprocessor machines but produces a message "ignoring IEOT
> + extension" when reading the index using Git versions before 2.20.
> + Defaults to 'false'.

Probably worth adding a test that exercises this new config option -
somehow create an index with index.recordOffsetTable=1, check that the
index contains the appropriate string (a few ways to do this, but I'm
not sure which are portable), and then run a Git command that reads the
index to make sure it is valid; then do the same except
index.recordOffsetTable=0.

The code itself looks good to me.

Same comment for patch 1.


Re: [PATCH 2/3] ieot: default to not writing IEOT section

2018-11-12 Thread Junio C Hamano
Jonathan Nieder  writes:

> As with EOIE, popular versions of Git do not support the new IEOT
> extension yet.  When accessing a Git repository written by a more
> modern version of Git, they correctly ignore the unrecognized section,
> but in the process they loudly warn
>
>   ignoring IEOT extension
>
> resulting in confusion for users.

Then removing the message is throwing it with bathwater.  First
think about which part of the message is confusiong and then make it
less confusing.

How about

hint: ignoring an optional IEOT extension

to make it clear that it is totally harmless?

With that, we can add advise.unknownIndexExtension=false to turn all
of them off with a single switch.




Re: Reply for Auto parts

2018-11-12 Thread Abiba
Dear
Good day. This is Kevin from Future Mould and Plastic Technology Co.,Ltd.
We supply plastic products for SUZUKI with high quality and competitive price. 
For instance, Dashboard panel, Steering wheel cover, Control Panel, Air 
conditioner outlet, Seat fitting, Cup holder, Gloves box, Rear mirror cover, 
Door Handle, etc.
We also have a professional design and production team, We're also satisfied 
with small batch of plastic parts customization and O.E.M services.
Tks & br
Kevin
Trade Representative
Shanghai Future Mould & Plastic Technology Co.,Ltd.
Mb: +86 15801877587 (WhatsApp/WeChat)


Re* [PATCH v3 1/1] protocol: advertise multiple supported versions

2018-11-12 Thread Junio C Hamano
Stefan Beller  writes:

>> +   if (have_advertised_versions_already)
>> +   BUG(_("attempting to register an allowed protocol version 
>> after advertisement"));
>
> If this is a real BUG (due to wrong program flow) instead of bad user input,
> we would not want to burden the translators with this message.
>
> If it is a message that the user is intended to see upon bad input
> we'd rather go with die(_("translatable text here"));

Yeah, good suggestion.  

Perhaps we should spell it out in docstrings for BUG/die with the
above rationale.  A weatherbaloon patch follows.

 git-compat-util.h | 22 +-
 1 file changed, 21 insertions(+), 1 deletion(-)

diff --git a/git-compat-util.h b/git-compat-util.h
index 96a3f86d8e..a1cf68cbbc 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -420,7 +420,16 @@ static inline char *git_find_last_dir_sep(const char *path)
 
 struct strbuf;
 
-/* General helper functions */
+/*
+ * General helper functions
+ *
+ * Note that these are to produce end-user facing messages, and most
+ * of them should be marked for translation (the exception is
+ * messages generated to be sent over the wire---as we currently do not
+ * have a mechanism to notice and honor locale settings used on the
+ * other end, leave them untranslated.  We should *not* send messages
+ * that are localized for our end).
+ */
 extern void vreportf(const char *prefix, const char *err, va_list params);
 extern NORETURN void usage(const char *err);
 extern NORETURN void usagef(const char *err, ...) __attribute__((format 
(printf, 1, 2)));
@@ -1142,6 +1151,17 @@ static inline int regexec_buf(const regex_t *preg, const 
char *buf, size_t size,
 /* usage.c: only to be used for testing BUG() implementation (see test-tool) */
 extern int BUG_exit_code;
 
+/*
+ * BUG(fmt, ...) is to be used when the problem of reaching that point
+ * in code can only be caused by a program error.  To abort with a
+ * message to report impossible-to-continue condition due to external
+ * factors like end-user input, environment settings, data it was fed
+ * over the wire, use die(_(fmt),...).
+ * 
+ * Note that the message from BUG() should not be marked for
+ * translation while the message from die() is in general end-user
+ * facing and should be marked for translation.
+ */
 #ifdef HAVE_VARIADIC_MACROS
 __attribute__((format (printf, 3, 4))) NORETURN
 void BUG_fl(const char *file, int line, const char *fmt, ...);


Re: [PATCH 2/2] built-in rebase: reinstate `checkout -q` behavior where appropriate

2018-11-12 Thread Junio C Hamano
Junio C Hamano  writes:

> Johannes Schindelin  writes:
>
>>> On the other hand, for a file-scope static that is likely to stay as a
>>> non-API function but a local helper, it may not be worth it.
>>
>> Oh, do you think that `reset_head()` is likely to stay as non-API
>> function? I found myself in the need of repeating this tedious
>> unpack_trees() dance quite a few times over the past two years, and it is
>> *always* daunting because the API is *that* unintuitive.
>>
>> So I *do* hope that `reset_head()` will actually be moved to reset.[ch]
>> eventually, and callsites e.g. in `sequencer.c` will be converted from
>> calling `unpack_trees()` to calling `reset_head()` instead.
>
> As long as the public API function is well thought out, of course it
> is OK.  Ideally, builtin/reset.c can detect when it is making a hard
> reset to HEAD and call that same function.  Which may or may not
> mean you would also want to tell it to record ORIG_HEAD and remove
> MERGE_HEAD and others), perhaps with another bit in that flag word.

Nah, forget about that one.  It sometimes does branch switching and
sometimes does hard reset, which looks like a strange chimera.  We
shouldn't burden it further by adding "while at it remove cruft, as
'reset --hard' needs to do that" to it.


Re: [PATCH 3/3] index: do not warn about unrecognized extensions

2018-11-12 Thread Junio C Hamano
Jonathan Nieder  writes:

> We almost obey that convention, but there is a problem: when
> encountering an unrecognized optional extension, we write
>
>   ignoring FNCY extension
>
> to stderr, which alarms users.

Then the same comment as 2/3 applies to this step.


Re: [PATCH 2/3] ieot: default to not writing IEOT section

2018-11-12 Thread Jonathan Nieder
Junio C Hamano wrote:

> How about
>
>   hint: ignoring an optional IEOT extension
>
> to make it clear that it is totally harmless?
>
> With that, we can add advise.unknownIndexExtension=false to turn all
> of them off with a single switch.

I like it.  Expect a patch soon (tonight or tomorrow) that does that.

We'll have to find some appropriate place in the documentation to
explain what the message is about, still.

Thanks,
Jonathan


Re: [PATCH v2 1/2] ref-filter: add worktree atom

2018-11-12 Thread Junio C Hamano
Jeff King  writes:

>> I wonder if "The worktree at /local/src/wt1 has this branch checked
>> out" is something the user of %(worktree) atom, or a variant thereof
>> e.g. "%(worktree:detailed)", may want to learn, but because that
>> information is lost when this function returns, such an enhancement
>> cannot be done without fixing this funciton.
>
> Hmm. I think for the purposes of this series we could jump straight to
> converting %(worktree) to mean "the path of the worktree for which this
> branch is HEAD, or the empty string otherwise".
>
> Then the caller from git-branch (or anybody wanting to emulate it) could
> still do:
>
>   %(if)%(worktree)%(then)+ %(refname)%(end)
>
> As a bonus, the decision to use "+" becomes a lot easier. It is no
> longer a part of the format language that we must promise forever, but
> simply a porcelain decision by git-branch.

Yeah, thanks for following through the thought process to the
logical conclusion.  If a branch is multply checked out, which is a
condition "git worktree" and "git checkout" ought to prevent from
happening, we could leave the result unspecified but a non-empty
string, or something like that.

> file-global data-structure storing the worktree info once (in an ideal
> world, it would be part of a "struct ref_format" that uses no global
> variables, but that is not how the code is structured today).

Yes, I agree that would be the ideal longer-term direction to move
this code in.

>> > +  } else if (!strcmp(name, "worktree")) {
>> > +  if (string_list_has_string(>u.worktree_heads, 
>> > ref->refname))
>> 
>> I thought we were moving towards killing the use of string_list as a
>> look-up table, as we do not want to see thoughtless copy such
>> a code from parts of the code that are not performance critical to a
>> part.  Not very satisfying.
>> 
>>  I think we can let this pass, and later add a wrapper around
>>  hashmap that is meant to only be used to replace string-list
>>  used for this exact purpose, i.e. key is a string, and there
>>  is no need to iterate over the existing elements in any
>>  sorted order.  Optionally, we can limit the look up to only
>>  checking for existence, if it makes the code for the wrapper
>>  simpler.
>
> This came up over in another thread yesterday, too. So yeah, perhaps we
> should move on that (I am OK punting on it for this series and
> converting it later, though).

FWIW, I am OK punting and leaving, too.


Re: [PATCH v2 2/2] branch: Mark and colorize a branch differently if it is checked out in a linked worktree

2018-11-12 Thread Junio C Hamano
Rafael Ascensão  writes:

> But I can see where personal preference starts to play a role here, as
> the logical solution isn't good enough. Which makes the case for being
> able to configure a bit stronger.

Yeah, our preference over time has always been "do not add to our
default color palette to make the default output too colourful;
instead allow the user to specify their choice".  If this feature
can be added like that, that would be preferrable, and if cyan
(which usuallly is used to present "less interesting" piece of
information and in our default palette) works well enough, maybe we
should use that?


Re: [PATCH 1/5] rebase -r: demonstrate bug with conflicting merges

2018-11-12 Thread Junio C Hamano
"Johannes Schindelin via GitGitGadget" 
writes:

> From: Johannes Schindelin 
>
> When calling `merge` on a branch that has already been merged, that
> `merge` is skipped quietly, but currently a MERGE_HEAD file is being
> left behind and will then be grabbed by the next `pick` (that did
> not want to create a *merge* commit).
>
> Demonstrate this.
>
> Signed-off-by: Johannes Schindelin 
> ---
>  t/t3430-rebase-merges.sh | 16 
>  1 file changed, 16 insertions(+)

For a trivially small change/fix like this, it is OK and even
preferrable to make 1+2 a single step, as applying t/ part only to
try to see the breakage (or "am"ing everything and then "diff |
apply -R" the part outside t/ for the same purpose) is easy enough.

Because the patch 2 with your method ends up showing only the test
set-up part in the context by changing _failure to _success, without
showing what end-user visible breakage the step fixed, which usually
comes near the end of the added test piece.  A single patch that
gives tests that ought to succeed would not force the readers to
switch between patches 1 and 2 while reading the fix.

Of course, the above would not apply for a more involved case where
the actual fix to the code needs to span multiple patches.

Thanks.

> diff --git a/t/t3430-rebase-merges.sh b/t/t3430-rebase-merges.sh
> index aa7bfc88ec..1f08a33687 100755
> --- a/t/t3430-rebase-merges.sh
> +++ b/t/t3430-rebase-merges.sh
> @@ -396,4 +396,20 @@ test_expect_success 'with --autosquash and --exec' '
>   grep "G: +G" actual
>  '
>  
> +test_expect_failure '--continue after resolving conflicts after a merge' '
> + git checkout -b already-has-g E &&
> + git cherry-pick E..G &&
> + test_commit H2 &&
> +
> + git checkout -b conflicts-in-merge H &&
> + test_commit H2 H2.t conflicts H2-conflict &&
> + test_must_fail git rebase -r already-has-g &&
> + grep conflicts H2.t &&

Is this making sure that the above test_must_fail succeeded because
of a conflict and not due to any other failure?  I would have used
"ls-files -u H2.t" to see if the index is unmerged, which probably
is a more direct way to test what this is trying to test, but if we
are in the conflicted state, the one side of << == >> has this
string (the other has "H2" in it, presumably?), so in practice this
should be good enough.

> + echo resolved >H2.t &&
> + git add -u &&

and we resolve to continue.

> + git rebase --continue &&
> + test_must_fail git rev-parse --verify HEAD^2 &&

Even if we made an octopus by mistake, the above will catch it,
which is good.

> + test_path_is_missing .git/MERGE_HEAD
> +'
> +
>  test_done

And from the proposed log message, I am reading that the last two
things (i.e. resulting tip is a child with a single parent and there
is no leftover MERGE_HEAD file) fail without the fix.  

This is enough material to convince me or anybody that the bug is
worth fixing.  Thanks for being careful noticing a glitch during
your real (and otherwise unrelated to the bug) work and following
through.


Re: [PATCH] remote-curl: die on server-side errors

2018-11-12 Thread Junio C Hamano
stead...@google.com writes:

> When a smart HTTP server sends an error message via pkt-line,
> remote-curl will fail to detect the error (which usually results in
> incorrectly falling back to dumb-HTTP mode).
>
> This patch adds a check in discover_refs() for server-side error
> messages, as well as a test case for this issue.
>
> Signed-off-by: Josh Steadmon 
> ---
>  remote-curl.c   | 4 +++-
>  t/lib-httpd.sh  | 1 +
>  t/lib-httpd/apache.conf | 4 
>  t/lib-httpd/error-smart-http.sh | 3 +++
>  t/t5551-http-fetch-smart.sh | 5 +
>  5 files changed, 16 insertions(+), 1 deletion(-)
>  create mode 100644 t/lib-httpd/error-smart-http.sh
>
> diff --git a/remote-curl.c b/remote-curl.c
> index 762a55a75f..bb3a86505e 100644
> --- a/remote-curl.c
> +++ b/remote-curl.c
> @@ -436,7 +436,9 @@ static struct discovery *discover_refs(const char 
> *service, int for_push)
>   } else if (maybe_smart &&
>  last->len > 5 && starts_with(last->buf + 4, "version 2")) {
>   last->proto_git = 1;
> - }
> + } else if (maybe_smart && last->len > 5 &&
> +starts_with(last->buf + 4, "ERR "))
> + die(_("remote error: %s"), last->buf + 8);

Two observations and a half.

 - All of these if/else if/ blocks (currently 2, now you added the
   third one) are to special case the "maybe_smart" case.  Perhaps
   the whole thing should be restrucutred to

if (maybe_smart) {
if ...
else if ...
else if ...
}

 - All conditions skip the first four bytes in last->buf and does
   starts_with (the first branch is "starts_with("#") in disguise).
   Can we do something to the hardcoded magic number 4, which looks
   somewhat irritating?

 - This is less of the problem with the suggested change in the
   first bullet item above, but the current code structure makes
   readers wonder if maybe_start that starts as 1 is turned off
   somewhere in the if/else if/ cascade when we find out that we are
   not dealing with smart HTTP after all.  That however is not what
   is happening.  The "maybe" variable is primarily there to avoid
   repeating the logic that determined its initial value in the
   early part of the function.  The last->proto_git field is used to
   record if we are using smart HTTP or not.

Otherwise the change looks sensible.


> diff --git a/t/lib-httpd.sh b/t/lib-httpd.sh
> index a8729f8232..4e946623bb 100644
> --- a/t/lib-httpd.sh
> +++ b/t/lib-httpd.sh
> @@ -131,6 +131,7 @@ prepare_httpd() {
>   mkdir -p "$HTTPD_DOCUMENT_ROOT_PATH"
>   cp "$TEST_PATH"/passwd "$HTTPD_ROOT_PATH"
>   install_script broken-smart-http.sh
> + install_script error-smart-http.sh
>   install_script error.sh
>   install_script apply-one-time-sed.sh
>  
> diff --git a/t/lib-httpd/apache.conf b/t/lib-httpd/apache.conf
> index 581c010d8f..6de2bc775c 100644
> --- a/t/lib-httpd/apache.conf
> +++ b/t/lib-httpd/apache.conf
> @@ -117,6 +117,7 @@ Alias /auth/dumb/ www/auth/dumb/
>  
>  ScriptAliasMatch /smart_*[^/]*/(.*) ${GIT_EXEC_PATH}/git-http-backend/$1
>  ScriptAlias /broken_smart/ broken-smart-http.sh/
> +ScriptAlias /error_smart/ error-smart-http.sh/
>  ScriptAlias /error/ error.sh/
>  ScriptAliasMatch /one_time_sed/(.*) apply-one-time-sed.sh/$1
>  
> @@ -125,6 +126,9 @@ ScriptAliasMatch /one_time_sed/(.*) 
> apply-one-time-sed.sh/$1
>  
>   Options ExecCGI
>  
> +
> + Options ExecCGI
> +
>  
>Options ExecCGI
>  
> diff --git a/t/lib-httpd/error-smart-http.sh b/t/lib-httpd/error-smart-http.sh
> new file mode 100644
> index 00..0a1af318f7
> --- /dev/null
> +++ b/t/lib-httpd/error-smart-http.sh
> @@ -0,0 +1,3 @@
> +printf "Content-Type: text/%s\n" "html"
> +echo
> +printf "%s" "0019ERR server-side error"
> diff --git a/t/t5551-http-fetch-smart.sh b/t/t5551-http-fetch-smart.sh
> index 8630b0cc39..ba83e567e5 100755
> --- a/t/t5551-http-fetch-smart.sh
> +++ b/t/t5551-http-fetch-smart.sh
> @@ -429,5 +429,10 @@ test_expect_success 'GIT_TRACE_CURL_NO_DATA prevents 
> data from being traced' '
>   ! grep "=> Send data" err
>  '
>  
> +test_expect_success 'server-side error detected' '
> + test_must_fail git clone $HTTPD_URL/error_smart/repo.git 2>actual &&
> + grep "server-side error" actual
> +'
> +
>  stop_httpd
>  test_done


Re: [PATCH] remote-curl: die on server-side errors

2018-11-12 Thread Junio C Hamano
stead...@google.com writes:

> When a smart HTTP server sends an error message via pkt-line,
> remote-curl will fail to detect the error (which usually results in
> incorrectly falling back to dumb-HTTP mode).
>
> This patch adds a check in discover_refs() for server-side error
> messages, as well as a test case for this issue.
>
> Signed-off-by: Josh Steadmon 
> ---

Forgot to mention one procedural comment.

As you can see in the To: line of this reply, your MUA is placing
only the e-mail address without name on your From: line.

Preferrably I'd like to see the same string as your sign-off on the
"From:" line in your messages for a bit of human touch ;-) Can you
tweak your MUA to make that happen?

The second preference is to add an in-body header (i.e. as the first
line of the body of the message) so that the body of the message
starts like this:

From: Josh Steadmon 

When a smart HTTP server sends an error message via pkt-line,
remote-curl will fail to detect the error (which usually results in
incorrectly falling back to dumb-HTTP mode).

This patch adds a check in discover_refs() for server-side error
messages, as well as a test case for this issue.

Signed-off-by: Josh Steadmon 
---
 remote-curl.c   | 4 +++-
 t/lib-httpd.sh  | 1 +
 t/lib-httpd/apache.conf | 4 

Either way would make sure that the resulting patch's author line
will be attributed correctly to the same identity as who is signing
it off first as the author.

Thanks.


Proposal

2018-11-12 Thread SGM John Dailey
Hello ,


My name is Sgt Major John Dailey. I am here in Afghanistan , I came
upon a project I think we can work together on. I and my partner (1st
Lt. Daniel Farkas ) have the sum of $15 Million United State Dollars
which we got from a Crude Oil Deal in Iraq before he was killed by an
explosion while on a Vehicle Patrol. Due to this incident, I want you
to receive these funds on my behalf as far as I can be assured that my
share will be safe in your care until I complete my service here in
Afghanistan and come over to meet with you. Since we are working here
for an Official capacity, I cannot keep these funds hence by
contacting you. I Guarantee and Assure you that this is risk free.


I just need your acceptance to help me receive these funds and all is
done. Since the death of my partner, my life is not guaranteed here
anymore, so I have decided to share these funds with you. I am also
offering you 40% of this money for the assistance you will give to me.
One passionate appeal I will make to you, is for you not to discuss
this matter with anybody, should you have reasons to reject this
offer, please and please destroy this message as any leakage of this
information will be too bad for us as soldiers here in Afghanistan. I
do not know how long we will remain here, and I have been shot,
wounded and survived so many suicide bomb attacks, this and other
reasons have prompted me to reach out to you for help. I honestly want
this matter to be resolved immediately, please contact me as soon as
possible on my e-mail address which is my only way of communication.


Yours In Service,
SGM John Dailey


Re: [PATCH 1/3] eoie: default to not writing EOIE section

2018-11-12 Thread Junio C Hamano
Jonathan Nieder  writes:

> Since 3b1d9e04 (eoie: add End of Index Entry (EOIE) extension,
> 2018-10-10) Git defaults to writing the new EOIE section when writing
> out an index file.  Usually that is a good thing because it improves
> threaded performance, but when a Git repository is shared with older
> versions of Git, it produces a confusing warning:
>
>   $ git status
>   ignoring EOIE extension
>   HEAD detached at 371ed0defa
>   nothing to commit, working tree clean
>
> Let's introduce the new index extension more gently.  First we'll roll
> out the new version of Git that understands it, and then once
> sufficiently many users are using such a version, we can flip the
> default to writing it by default.
>
> Introduce a '[index] recordEndOfIndexEntries' configuration variable
> to allow interested users to benefit from this index extension early.

Thanks.  I am in principle OK with this approach.  In fact, I
suspect that the default may want to be dynamically determined, and
we give this knob to let the users further force their preference.
When no extension that benefits from multi-threading is written, the
default can stay "no" in future versions of Git, for example.

> diff --git a/Documentation/config.txt b/Documentation/config.txt
> index 41a9ff2b6a..d702379db4 100644

The timing is a bit unfortunate for any topic to touch this file,
and contrib/cocci would not help us in this case X-<.

> diff --git a/read-cache.c b/read-cache.c
> index f3a848d61c..4bfe93c4c2 100644
> --- a/read-cache.c
> +++ b/read-cache.c
> @@ -2698,6 +2698,15 @@ void update_index_if_able(struct index_state *istate, 
> struct lock_file *lockfile
>   rollback_lock_file(lockfile);
>  }
>  
> +static int record_eoie(void)
> +{
> + int val;
> +
> + if (!git_config_get_bool("index.recordendofindexentries", ))
> + return val;
> + return 0;
> +}

Unconditionally defaulting to no in this round is perfectly fine.
Let's make a mental note that this is the place to decide dynamic
default in the future when we want to.  It would probably have to
ask around various "extension writing" helpers if they want to have
a say in the outcome (e.g. if there are very many cache entries in
the istate, the entry offset table may want to be written and
otherwise not).

> @@ -2945,7 +2954,7 @@ static int do_write_index(struct index_state *istate, 
> struct tempfile *tempfile,
>* read.  Write it out regardless of the strip_extensions parameter as 
> we need it
>* when loading the shared index.
>*/
> - if (offset) {
> + if (offset && record_eoie()) {
>   struct strbuf sb = STRBUF_INIT;
>  
>   write_eoie_extension(, _c, offset);
> diff --git a/t/t1700-split-index.sh b/t/t1700-split-index.sh
> index 2ac47aa0e4..0cbac64e28 100755
> --- a/t/t1700-split-index.sh
> +++ b/t/t1700-split-index.sh
> @@ -25,14 +25,17 @@ test_expect_success 'enable split index' '
>   git update-index --split-index &&
>   test-tool dump-split-index .git/index >actual &&
>   indexversion=$(test-tool index-version <.git/index) &&
> +
> + # NEEDSWORK: Stop hard-coding checksums.

Also let's stop hard-coding the assumption that the new knob is off
by default.  Ideally, you'd want to test both cases, right?

Perhaps you'd call "git update-index --split-index" we see in the
precontext twice, with "-c VAR=false" and "-c VAR=true", to prepare
"actual.without-eoie" and "actual.with-eoie", or something like
that?

Thanks.

>   if test "$indexversion" = "4"
>   then
> - own=3527df833c6c100d3d1d921a9a782d62a8be4b58
> - base=746f7ab2ed44fb839efdfbffcf399d0b113fb4cb
> + own=432ef4b63f32193984f339431fd50ca796493569
> + base=508851a7f0dfa8691e9f69c7f055865389012491
>   else
> - own=5e9b60117ece18da410ddecc8b8d43766a0e4204
> - base=4370042739b31cd17a5c5cd6043a77c9a00df113
> + own=8299b0bcd1ac364e5f1d7768efb62fa2da79a339
> + base=39d890139ee5356c7ef572216cebcd27aa41f9df
>   fi &&
> +
>   cat >expect <<-EOF &&
>   own $own
>   base $base


Re: [PATCH 2/2] built-in rebase: reinstate `checkout -q` behavior where appropriate

2018-11-12 Thread Junio C Hamano
Johannes Schindelin  writes:

>> On the other hand, for a file-scope static that is likely to stay as a
>> non-API function but a local helper, it may not be worth it.
>
> Oh, do you think that `reset_head()` is likely to stay as non-API
> function? I found myself in the need of repeating this tedious
> unpack_trees() dance quite a few times over the past two years, and it is
> *always* daunting because the API is *that* unintuitive.
>
> So I *do* hope that `reset_head()` will actually be moved to reset.[ch]
> eventually, and callsites e.g. in `sequencer.c` will be converted from
> calling `unpack_trees()` to calling `reset_head()` instead.

As long as the public API function is well thought out, of course it
is OK.  Ideally, builtin/reset.c can detect when it is making a hard
reset to HEAD and call that same function.  Which may or may not
mean you would also want to tell it to record ORIG_HEAD and remove
MERGE_HEAD and others), perhaps with another bit in that flag word.


Re: [PATCH 1/2] refs: show --exclude failure with --branches/tags/remotes=glob

2018-11-12 Thread Junio C Hamano
Rafael Ascensão  writes:

> The documentation of `--exclude=` option from rev-list and rev-parse
> explicitly states that exclude patterns *should not* start with 'refs/'
> when used with `--branches`, `--tags` or `--remotes`.
>
> However, following this advice results in refereces not being excluded
> if the next `--branches`, `--tags`, `--remotes` use the optional
> inclusive glob.
>
> Demonstrate this failure.
>
> Signed-off-by: Rafael Ascensão 
> ---
>  t/t6018-rev-list-glob.sh | 60 ++--
>  1 file changed, 57 insertions(+), 3 deletions(-)

For a trivially small change/fix like this (i.e. the real fix in 2/2
is just 4 lines), it is OK and even preferrable to make 1+2 a single
step, as applying t/ part only to try to see the breakage (or
"am"ing everything and then "diff | apply -R" the part outside t/
for the same purpose) is easy enough.

Often the patch 2 with your method ends up showing only the test
set-up part in the context by changing _failure to _success, without
showing what end-user visible breakage the step fixed, which usually
comes near the end of the added test piece.  For this particular
test, s/_failure/_success/ shows everything in the verification
phase, but the entire set-up for these tests cannot be seen while
reviewing 2/2.  Unlike that, a single patch that gives tests that
ought to succeed would not force the readers to switch between
patches 1 and 2 while reading the fix.

Of course, the above would not apply for a more involved case where
the actual fix to the code needs to span multiple patches.

> diff --git a/t/t6018-rev-list-glob.sh b/t/t6018-rev-list-glob.sh
> index 0bf10d0686..8e2b136356 100755
> --- a/t/t6018-rev-list-glob.sh
> +++ b/t/t6018-rev-list-glob.sh
> @@ -36,7 +36,13 @@ test_expect_success 'setup' '
>   git tag foo/bar master &&
>   commit master3 &&
>   git update-ref refs/remotes/foo/baz master &&
> - commit master4
> + commit master4 &&
> + git update-ref refs/remotes/upstream/one subspace/one &&
> + git update-ref refs/remotes/upstream/two subspace/two &&
> + git update-ref refs/remotes/upstream/x subspace-x &&
> + git tag qux/one subspace/one &&
> + git tag qux/two subspace/two &&
> + git tag qux/x subspace-x
>  '

Let me follow along.

We add three remote-tracking looking branches for 'upstream', and
three tags under refs/tags/qux/.


> +test_expect_failure 'rev-parse --exclude=glob with --branches=glob' '
> + compare rev-parse "--exclude=subspace-* --branches=sub*" "subspace/one 
> subspace/two"
> +'

We want to list all branches that begin with "sub", but we do not
want ones that begin with "subspace-".  subspace/{one,two} should
pass that criteria, while subspace-x, other/three, someref, and
master should not.  Makes sense.

> +
> +test_expect_failure 'rev-parse --exclude=glob with --tags=glob' '
> + compare rev-parse "--exclude=qux/? --tags=qux/*" "qux/one qux/two"
> +'

We want all tags that begin with "qux/" but we do not want qux/
followed by just a single letter.  qux/{one,two} are in, qux/x is
out.  Makes sense.

> +test_expect_failure 'rev-parse --exclude=glob with --remotes=glob' '
> + compare rev-parse "--exclude=upstream/? --remotes=upstream/*" 
> "upstream/one upstream/two"
> +'

Similarly for refs/remotes/upstream/ hierarchy.

> +test_expect_failure 'rev-parse --exclude=ref with --branches=glob' '
> + compare rev-parse "--exclude=subspace-x --branches=sub*" "subspace/one 
> subspace/two
> +'

This is almost a repeat of the first new one.  As subspace-* in
branches only match subspace-x, this should give the same result as
that one.

> +test_expect_failure 'rev-parse --exclude=ref with --tags=glob' '
> + compare rev-parse "--exclude=qux/x --tags=qux/*" "qux/one qux/two"
> +'

Likewise.

> +test_expect_failure 'rev-parse --exclude=ref with --remotes=glob' '
> + compare rev-parse "--exclude=upstream/x --remotes=upstream/*" 
> "upstream/one upstream/two"
> +'

Likewise.

> +test_expect_failure 'rev-list --exclude=glob with --branches=glob' '
> + compare rev-list "--exclude=subspace-* --branches=sub*" "subspace/one 
> subspace/two"
> +'

And then the same pattern continues with rev-list.

> +test_expect_failure 'rev-list --exclude=glob with --tags=glob' '
> + compare rev-list "--exclude=qux/? --tags=qux/*" "qux/one qux/two"
> +'
> +
> +test_expect_failure 'rev-list --exclude=glob with --remotes=glob' '
> + compare rev-list "--exclude=upstream/? --remotes=upstream/*" 
> "upstream/one upstream/two"
> +'
> +
> +test_expect_failure 'rev-list --exclude=ref with --branches=glob' '
> + compare rev-list "--exclude=subspace-x --branches=sub*" "subspace/one 
> subspace/two"
> +'
> +
> +test_expect_failure 'rev-list --exclude=ref with --tags=glob' '
> + compare rev-list "--exclude=qux/x --tags=qux/*" "qux/one qux/two"
> +'
> +
> +test_expect_failure 'rev-list --exclude=ref with --remotes=glob' '
> + compare rev-list 

[PATCH 2/5] rebase -r: do not write MERGE_HEAD unless needed

2018-11-12 Thread Johannes Schindelin via GitGitGadget
From: Johannes Schindelin 

When we detect that a `merge` can be skipped because the merged commit
is already an ancestor of HEAD, we do not need to commit, therefore
writing the MERGE_HEAD file is useless.

It is actually worse than useless: a subsequent `git commit` will pick
it up and think that we want to merge that commit, still.

To avoid that, move the code that writes the MERGE_HEAD file to a
location where we already know that the `merge` cannot be skipped.

Signed-off-by: Johannes Schindelin 
---
 sequencer.c  | 8 
 t/t3430-rebase-merges.sh | 2 +-
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/sequencer.c b/sequencer.c
index 9e1ab3a2a7..7a9cd81afb 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -3191,10 +3191,6 @@ static int do_merge(struct commit *commit, const char 
*arg, int arg_len,
}
 
merge_commit = to_merge->item;
-   write_message(oid_to_hex(_commit->object.oid), GIT_SHA1_HEXSZ,
- git_path_merge_head(the_repository), 0);
-   write_message("no-ff", 5, git_path_merge_mode(the_repository), 0);
-
bases = get_merge_bases(head_commit, merge_commit);
if (bases && oideq(_commit->object.oid,
   >item->object.oid)) {
@@ -3203,6 +3199,10 @@ static int do_merge(struct commit *commit, const char 
*arg, int arg_len,
goto leave_merge;
}
 
+   write_message(oid_to_hex(_commit->object.oid), GIT_SHA1_HEXSZ,
+ git_path_merge_head(the_repository), 0);
+   write_message("no-ff", 5, git_path_merge_mode(the_repository), 0);
+
for (j = bases; j; j = j->next)
commit_list_insert(j->item, );
free_commit_list(bases);
diff --git a/t/t3430-rebase-merges.sh b/t/t3430-rebase-merges.sh
index 1f08a33687..cc5646836f 100755
--- a/t/t3430-rebase-merges.sh
+++ b/t/t3430-rebase-merges.sh
@@ -396,7 +396,7 @@ test_expect_success 'with --autosquash and --exec' '
grep "G: +G" actual
 '
 
-test_expect_failure '--continue after resolving conflicts after a merge' '
+test_expect_success '--continue after resolving conflicts after a merge' '
git checkout -b already-has-g E &&
git cherry-pick E..G &&
test_commit H2 &&
-- 
gitgitgadget



[PATCH 3/5] rebase -i: include MERGE_HEAD into files to clean up

2018-11-12 Thread Johannes Schindelin via GitGitGadget
From: Johannes Schindelin 

Every once in a while, the interactive rebase makes sure that no stale
files are lying around. These days, we need to include MERGE_HEAD into
that set of files, as the `merge` command will generate them.

Signed-off-by: Johannes Schindelin 
---
 sequencer.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/sequencer.c b/sequencer.c
index 7a9cd81afb..2f526390ac 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -3459,6 +3459,7 @@ static int pick_commits(struct todo_list *todo_list, 
struct replay_opts *opts)
unlink(rebase_path_author_script());
unlink(rebase_path_stopped_sha());
unlink(rebase_path_amend());
+   unlink(git_path_merge_head(the_repository));
delete_ref(NULL, "REBASE_HEAD", NULL, REF_NO_DEREF);
 
if (item->command == TODO_BREAK)
@@ -3829,6 +3830,7 @@ static int commit_staged_changes(struct replay_opts *opts,
   opts, flags))
return error(_("could not commit staged changes."));
unlink(rebase_path_amend());
+   unlink(git_path_merge_head(the_repository));
if (final_fixup) {
unlink(rebase_path_fixup_msg());
unlink(rebase_path_squash_msg());
-- 
gitgitgadget



[PATCH 0/5] Assorted fixes revolving around rebase and merges

2018-11-12 Thread Johannes Schindelin via GitGitGadget
I noticed a couple of weeks ago that I had bogus merge commits after my
rebases, where the original commits had been regular commits.

This set me out on the adventure that is reflected in this patch series.

Of course, the thing I wanted to fix is demonstrated by 1/5 and fixed in
2/5. But while at it, I ran into other issues and fixed them since I was at
it anyway.

Johannes Schindelin (5):
  rebase -r: demonstrate bug with conflicting merges
  rebase -r: do not write MERGE_HEAD unless needed
  rebase -i: include MERGE_HEAD into files to clean up
  built-in rebase --skip/--abort: clean up stale .git/ files
  status: rebase and merge can be in progress at the same time

 builtin/rebase.c |  3 +++
 sequencer.c  | 10 ++
 t/t3430-rebase-merges.sh | 16 
 wt-status.c  |  9 +++--
 4 files changed, 32 insertions(+), 6 deletions(-)


base-commit: 8858448bb49332d353febc078ce4a3abcc962efe
Published-As: 
https://github.com/gitgitgadget/git/releases/tags/pr-75%2Fdscho%2Frebase-r-and-merge-head-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git 
pr-75/dscho/rebase-r-and-merge-head-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/75
-- 
gitgitgadget


[PATCH 4/5] built-in rebase --skip/--abort: clean up stale .git/ files

2018-11-12 Thread Johannes Schindelin via GitGitGadget
From: Johannes Schindelin 

The scripted version of the rebase used to execute `git reset --hard`
when skipping or aborting. When we ported this to C, we did update the
worktree and some reflogs, but we failed to imitate `git reset --hard`'s
behavior regarding files in .git/ such as MERGE_HEAD.

Let's address this oversight.

Signed-off-by: Johannes Schindelin 
---
 builtin/rebase.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/builtin/rebase.c b/builtin/rebase.c
index 0ee06aa363..017dce1b50 100644
--- a/builtin/rebase.c
+++ b/builtin/rebase.c
@@ -23,6 +23,7 @@
 #include "revision.h"
 #include "commit-reach.h"
 #include "rerere.h"
+#include "branch.h"
 
 static char const * const builtin_rebase_usage[] = {
N_("git rebase [-i] [options] [--exec ] [--onto ] "
@@ -1002,6 +1003,7 @@ int cmd_rebase(int argc, const char **argv, const char 
*prefix)
 
if (reset_head(NULL, "reset", NULL, 0, NULL, NULL) < 0)
die(_("could not discard worktree changes"));
+   remove_branch_state();
if (read_basic_state())
exit(1);
goto run_rebase;
@@ -1019,6 +1021,7 @@ int cmd_rebase(int argc, const char **argv, const char 
*prefix)
   options.head_name, 0, NULL, NULL) < 0)
die(_("could not move back to %s"),
oid_to_hex(_head));
+   remove_branch_state();
ret = finish_rebase();
goto cleanup;
}
-- 
gitgitgadget



[PATCH 5/5] status: rebase and merge can be in progress at the same time

2018-11-12 Thread Johannes Schindelin via GitGitGadget
From: Johannes Schindelin 

Since `git rebase -r` was introduced, that is possible. But our
machinery did not think that possible, and failed to say anything about
the rebase in progress when in the middle of a merge.

Let's work around that in the minimal fashion.

Signed-off-by: Johannes Schindelin 
---
 wt-status.c | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/wt-status.c b/wt-status.c
index 187568a112..a24711374c 100644
--- a/wt-status.c
+++ b/wt-status.c
@@ -1559,6 +1559,7 @@ void wt_status_get_state(struct wt_status_state *state,
struct object_id oid;
 
if (!stat(git_path_merge_head(the_repository), )) {
+   wt_status_check_rebase(NULL, state);
state->merge_in_progress = 1;
} else if (wt_status_check_rebase(NULL, state)) {
;   /* all set */
@@ -1583,9 +1584,13 @@ static void wt_longstatus_print_state(struct wt_status 
*s)
const char *state_color = color(WT_STATUS_HEADER, s);
struct wt_status_state *state = >state;
 
-   if (state->merge_in_progress)
+   if (state->merge_in_progress) {
+   if (state->rebase_interactive_in_progress) {
+   show_rebase_information(s, state_color);
+   fputs("\n", s->fp);
+   }
show_merge_in_progress(s, state_color);
-   else if (state->am_in_progress)
+   } else if (state->am_in_progress)
show_am_in_progress(s, state_color);
else if (state->rebase_in_progress || 
state->rebase_interactive_in_progress)
show_rebase_in_progress(s, state_color);
-- 
gitgitgadget


[PATCH 1/5] rebase -r: demonstrate bug with conflicting merges

2018-11-12 Thread Johannes Schindelin via GitGitGadget
From: Johannes Schindelin 

When calling `merge` on a branch that has already been merged, that
`merge` is skipped quietly, but currently a MERGE_HEAD file is being
left behind and will then be grabbed by the next `pick` (that did
not want to create a *merge* commit).

Demonstrate this.

Signed-off-by: Johannes Schindelin 
---
 t/t3430-rebase-merges.sh | 16 
 1 file changed, 16 insertions(+)

diff --git a/t/t3430-rebase-merges.sh b/t/t3430-rebase-merges.sh
index aa7bfc88ec..1f08a33687 100755
--- a/t/t3430-rebase-merges.sh
+++ b/t/t3430-rebase-merges.sh
@@ -396,4 +396,20 @@ test_expect_success 'with --autosquash and --exec' '
grep "G: +G" actual
 '
 
+test_expect_failure '--continue after resolving conflicts after a merge' '
+   git checkout -b already-has-g E &&
+   git cherry-pick E..G &&
+   test_commit H2 &&
+
+   git checkout -b conflicts-in-merge H &&
+   test_commit H2 H2.t conflicts H2-conflict &&
+   test_must_fail git rebase -r already-has-g &&
+   grep conflicts H2.t &&
+   echo resolved >H2.t &&
+   git add -u &&
+   git rebase --continue &&
+   test_must_fail git rev-parse --verify HEAD^2 &&
+   test_path_is_missing .git/MERGE_HEAD
+'
+
 test_done
-- 
gitgitgadget



[PATCH 0/3] Avoid confusing messages from new index extensions (Re: [PATCH v8 0/7] speed up index load through parallelization)

2018-11-12 Thread Jonathan Nieder
Hi,

Ben Peart wrote:

> Ben Peart (6):
>   read-cache: clean up casting and byte decoding
>   eoie: add End of Index Entry (EOIE) extension
>   config: add new index.threads config setting
>   read-cache: load cache extensions on a worker thread
>   ieot: add Index Entry Offset Table (IEOT) extension
>   read-cache: load cache entries on worker threads

I love this, but when deploying it I ran into a problem.

How about these patches?

Thanks,
Jonathan Nieder (3):
  eoie: default to not writing EOIE section
  ieot: default to not writing IEOT section
  index: do not warn about unrecognized extensions

 Documentation/config.txt | 14 ++
 read-cache.c | 24 +---
 t/t1700-split-index.sh   | 11 +++
 3 files changed, 42 insertions(+), 7 deletions(-)


[PATCH 1/3] eoie: default to not writing EOIE section

2018-11-12 Thread Jonathan Nieder
Since 3b1d9e04 (eoie: add End of Index Entry (EOIE) extension,
2018-10-10) Git defaults to writing the new EOIE section when writing
out an index file.  Usually that is a good thing because it improves
threaded performance, but when a Git repository is shared with older
versions of Git, it produces a confusing warning:

  $ git status
  ignoring EOIE extension
  HEAD detached at 371ed0defa
  nothing to commit, working tree clean

Let's introduce the new index extension more gently.  First we'll roll
out the new version of Git that understands it, and then once
sufficiently many users are using such a version, we can flip the
default to writing it by default.

Introduce a '[index] recordEndOfIndexEntries' configuration variable
to allow interested users to benefit from this index extension early.

Signed-off-by: Jonathan Nieder 
---
 Documentation/config.txt |  7 +++
 read-cache.c | 11 ++-
 t/t1700-split-index.sh   | 11 +++
 3 files changed, 24 insertions(+), 5 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 41a9ff2b6a..d702379db4 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -2188,6 +2188,13 @@ imap::
The configuration variables in the 'imap' section are described
in linkgit:git-imap-send[1].
 
+index.recordEndOfIndexEntries::
+   Specifies whether the index file should include an "End Of Index
+   Entry" section. This reduces index load time on multiprocessor
+   machines but produces a message "ignoring EOIE extension" when
+   reading the index using Git versions before 2.20. Defaults to
+   'false'.
+
 index.threads::
Specifies the number of threads to spawn when loading the index.
This is meant to reduce index load time on multiprocessor machines.
diff --git a/read-cache.c b/read-cache.c
index f3a848d61c..4bfe93c4c2 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -2698,6 +2698,15 @@ void update_index_if_able(struct index_state *istate, 
struct lock_file *lockfile
rollback_lock_file(lockfile);
 }
 
+static int record_eoie(void)
+{
+   int val;
+
+   if (!git_config_get_bool("index.recordendofindexentries", ))
+   return val;
+   return 0;
+}
+
 /*
  * On success, `tempfile` is closed. If it is the temporary file
  * of a `struct lock_file`, we will therefore effectively perform
@@ -2945,7 +2954,7 @@ static int do_write_index(struct index_state *istate, 
struct tempfile *tempfile,
 * read.  Write it out regardless of the strip_extensions parameter as 
we need it
 * when loading the shared index.
 */
-   if (offset) {
+   if (offset && record_eoie()) {
struct strbuf sb = STRBUF_INIT;
 
write_eoie_extension(, _c, offset);
diff --git a/t/t1700-split-index.sh b/t/t1700-split-index.sh
index 2ac47aa0e4..0cbac64e28 100755
--- a/t/t1700-split-index.sh
+++ b/t/t1700-split-index.sh
@@ -25,14 +25,17 @@ test_expect_success 'enable split index' '
git update-index --split-index &&
test-tool dump-split-index .git/index >actual &&
indexversion=$(test-tool index-version <.git/index) &&
+
+   # NEEDSWORK: Stop hard-coding checksums.
if test "$indexversion" = "4"
then
-   own=3527df833c6c100d3d1d921a9a782d62a8be4b58
-   base=746f7ab2ed44fb839efdfbffcf399d0b113fb4cb
+   own=432ef4b63f32193984f339431fd50ca796493569
+   base=508851a7f0dfa8691e9f69c7f055865389012491
else
-   own=5e9b60117ece18da410ddecc8b8d43766a0e4204
-   base=4370042739b31cd17a5c5cd6043a77c9a00df113
+   own=8299b0bcd1ac364e5f1d7768efb62fa2da79a339
+   base=39d890139ee5356c7ef572216cebcd27aa41f9df
fi &&
+
cat >expect <<-EOF &&
own $own
base $base
-- 
2.19.1.930.g4563a0d9d0



Re: [PATCH 3/5] rebase -i: include MERGE_HEAD into files to clean up

2018-11-12 Thread Junio C Hamano
"Johannes Schindelin via GitGitGadget" 
writes:

> From: Johannes Schindelin 
>
> Every once in a while, the interactive rebase makes sure that no stale
> files are lying around. These days, we need to include MERGE_HEAD into
> that set of files, as the `merge` command will generate them.
>
> Signed-off-by: Johannes Schindelin 
> ---
>  sequencer.c | 2 ++
>  1 file changed, 2 insertions(+)

Makes sense.

>
> diff --git a/sequencer.c b/sequencer.c
> index 7a9cd81afb..2f526390ac 100644
> --- a/sequencer.c
> +++ b/sequencer.c
> @@ -3459,6 +3459,7 @@ static int pick_commits(struct todo_list *todo_list, 
> struct replay_opts *opts)
>   unlink(rebase_path_author_script());
>   unlink(rebase_path_stopped_sha());
>   unlink(rebase_path_amend());
> + unlink(git_path_merge_head(the_repository));
>   delete_ref(NULL, "REBASE_HEAD", NULL, REF_NO_DEREF);
>  
>   if (item->command == TODO_BREAK)
> @@ -3829,6 +3830,7 @@ static int commit_staged_changes(struct replay_opts 
> *opts,
>  opts, flags))
>   return error(_("could not commit staged changes."));
>   unlink(rebase_path_amend());
> + unlink(git_path_merge_head(the_repository));
>   if (final_fixup) {
>   unlink(rebase_path_fixup_msg());
>   unlink(rebase_path_squash_msg());


Re: [PATCH] Makefile: use CXXFLAGS for linking fuzzers

2018-11-12 Thread Junio C Hamano
stead...@google.com writes:

> OSS-Fuzz requires C++-specific flags to link fuzzers. Passing these in
> CFLAGS causes lots of build warnings. Using separate CXXFLAGS avoids
> this.

We are not a C++ shop, so allow me to show ignorance about how
projects that are OSS-Fuzz-enabled work.  Do they use one set of
CXXFLAGS when compiling the "real thing" and a separate set (perhaps
one is subset of the other, or perhaps these two just have overlap)
of CXXFLAGS when building to link with the fuzzer?

What I am trying to get at is if this should be CXXFLAGS or
CXX_FUZZER_FLAGS.  If the OSS-Fuzz-enabled C++ projects use one set
of flags for the "main" part of the project (to produce binaries to
be run by the end users) and then link with extra flags to work with
fuzzers, I would imagine that they won't call the latter CXXFLAGS
but call it something else, and we probably should follow suit if
that is the case.

Not that we plan to (re)write the maint part of Git in C++ ever, so
I am personally OK with sacrificing the most generic CXXFLAGS macro
for the sole use of OSS-Fuzz linkage, but I'd prefer to leave the
door open so that other things like OSS-Fuzz that require C++ can be
added like your work does to the project.

Thanks.


> Signed-off-by: Josh Steadmon 
> ---
>  Makefile | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/Makefile b/Makefile
> index bbfbb4292d..5462bc4b6b 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -497,6 +497,7 @@ GIT-VERSION-FILE: FORCE
>  # CFLAGS and LDFLAGS are for the users to override from the command line.
>  
>  CFLAGS = -g -O2 -Wall
> +CXXFLAGS ?= $(CFLAGS)
>  LDFLAGS =
>  ALL_CFLAGS = $(CPPFLAGS) $(CFLAGS)
>  ALL_LDFLAGS = $(LDFLAGS)
> @@ -3098,14 +3099,14 @@ cover_db_html: cover_db
>  # An example command to build against libFuzzer from LLVM 4.0.0:
>  #
>  # make CC=clang CXX=clang++ \
> -#  CFLAGS="-fsanitize-coverage=trace-pc-guard -fsanitize=address" \
> +#  CXXFLAGS="-fsanitize-coverage=trace-pc-guard -fsanitize=address" \
>  #  LIB_FUZZING_ENGINE=/usr/lib/llvm-4.0/lib/libFuzzer.a \
>  #  fuzz-all
>  #
>  .PHONY: fuzz-all
>  
>  $(FUZZ_PROGRAMS): all
> - $(QUIET_LINK)$(CXX) $(CFLAGS) $(LIB_OBJS) $(BUILTIN_OBJS) \
> + $(QUIET_LINK)$(CXX) $(CXXFLAGS) $(LIB_OBJS) $(BUILTIN_OBJS) \
>   $(XDIFF_OBJS) $(EXTLIBS) git.o $@.o $(LIB_FUZZING_ENGINE) -o $@
>  
>  fuzz-all: $(FUZZ_PROGRAMS)


Re: [RFC PATCH] Introduce "precious" file concept

2018-11-12 Thread brian m. carlson
On Sun, Nov 11, 2018 at 01:33:44PM +0100, Ævar Arnfjörð Bjarmason wrote:
> The users who need protection against git deleting their files the most
> are exactly the sort of users who aren't expert-level enough to
> understand the nuances of how the semantics of .gitignore and "precious"
> are going to interact before git eats their data.
> 
> This is pretty apparent from the bug reports we're getting about
> this. None of them are:
> 
> "Hey, I 100% understood .gitignore semantics including this one part
> of the docs where you say you'll do this, but just forgot one day
> and deleted my work. Can we get some more safety?"
> 
> But rather (with some hyperbole for effect):
> 
> "ZOMG git deleted my file! Is this a bug??"
> 
> So I think we should have the inverse of this "precious"
> attribute". Just a change to the docs to say that .gitignore doesn't
> imply these eager deletion semantics on tree unpacking anymore, and if
> users want it back they can define a "garbage" attribute
> (s/precious/garbage/).
> 
> That will lose no data, and in the very rare cases where a checkout of
> tracked files would overwrite an ignored pattern, we can just error out
> (as we do with the "Ok to overwrite" branch removed) and tell the user
> to delete the files to proceed.

This is going to totally hose automation.  My last job had files which
might move from tracked to untracked (a file that had become generated),
and long-running CI and build systems would need to be able to check out
one status and switch to the other.  Your proposed change will prevent
those systems from working, whereas they previously did.

I agree that your proposal would have been a better design originally,
but breaking the way automated systems currently work is probably going
to be a dealbreaker.
-- 
brian m. carlson: Houston, Texas, US
OpenPGP: https://keybase.io/bk2204


signature.asc
Description: PGP signature


Re: [PATCH] builtin/notes: remove unnecessary free

2018-11-12 Thread Junio C Hamano
Johan Herland  writes:

> On Sun, Nov 11, 2018 at 10:49 AM Carlo Marcelo Arenas Belón
>  wrote:
>>
>> 511726e4b1 ("builtin/notes: fix premature failure when trying to add
>> the empty blob", 2014-11-09) removed the check for !len but left a
>> call to free the buffer that will be otherwise NULL

Wow, that's a old one.

>>
>> Signed-off-by: Carlo Marcelo Arenas Belón 
>
> Signed-off-by: Johan Herland 

Thanks, both.  Will apply.



Re: [PATCH 4/5] built-in rebase --skip/--abort: clean up stale .git/ files

2018-11-12 Thread Junio C Hamano
"Johannes Schindelin via GitGitGadget" 
writes:

> From: Johannes Schindelin 
>
> The scripted version of the rebase used to execute `git reset --hard`
> when skipping or aborting. When we ported this to C, we did update the
> worktree and some reflogs, but we failed to imitate `git reset --hard`'s
> behavior regarding files in .git/ such as MERGE_HEAD.
>
> Let's address this oversight.
>
> Signed-off-by: Johannes Schindelin 
> ---
>  builtin/rebase.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/builtin/rebase.c b/builtin/rebase.c
> index 0ee06aa363..017dce1b50 100644
> --- a/builtin/rebase.c
> +++ b/builtin/rebase.c
> @@ -23,6 +23,7 @@
>  #include "revision.h"
>  #include "commit-reach.h"
>  #include "rerere.h"
> +#include "branch.h"
>  
>  static char const * const builtin_rebase_usage[] = {
>   N_("git rebase [-i] [options] [--exec ] [--onto ] "
> @@ -1002,6 +1003,7 @@ int cmd_rebase(int argc, const char **argv, const char 
> *prefix)
>  
>   if (reset_head(NULL, "reset", NULL, 0, NULL, NULL) < 0)
>   die(_("could not discard worktree changes"));
> + remove_branch_state();
>   if (read_basic_state())
>   exit(1);
>   goto run_rebase;
> @@ -1019,6 +1021,7 @@ int cmd_rebase(int argc, const char **argv, const char 
> *prefix)
>  options.head_name, 0, NULL, NULL) < 0)
>   die(_("could not move back to %s"),
>   oid_to_hex(_head));
> + remove_branch_state();

Hmph.  Among 5 or so callsites of reset_head(), only these two
places need it, so reset_head() is clearly not a substitute for
"reset --hard HEAD", and it sometimes used to switch branches or
something?  Perhaps we may need to rename it to avoid confusion but
it can wait until we actually decide to make it externally
available.  Until then, it's OK as-is, I would think.

>   ret = finish_rebase();
>   goto cleanup;
>   }


Re: [RFC PATCH 1/5] ref-filter: add objectsize:disk option

2018-11-12 Thread Junio C Hamano
Jeff King  writes:

>> You mean something like
>> 
>>  v->s = xstrfmt("%"PRIdMAX, (intmax_t)oi->disk_size);
>
> I think elsewhere we simply use PRIuMAX for printing large sizes via
> off_t; we know this value isn't going to be negative.
>
> I'm not opposed to PRIdMAX, which _is_ more accurate, but...
>
>> P.S.: I wondered whether we have precedent for PRIdMAX, as we used to use
>> only PRIuMAX, but yes: JeffH's json-writer uses PRIdMAX.
>
> That's pretty recent. I won't be surprised if we have to do some
> preprocessor trickery to handle that at some point. We have a PRIuMAX
> fallback already. That comes from c4001d92be (Use off_t when we really
> mean a file offset., 2007-03-06), but it's not clear to me if that was
> motivated by a real platform or an over-abundance of caution.
>
> I'm OK with just using PRIdMAX as appropriate for now. It will serve as
> a weather-balloon, and we can #define our way out of it later if need
> be.

I am OK if we avoid PRIdMAX and use PRIuMAX instead with a cast to
the corresponding size in this codepath, as long as we properly
handle negative oi.disk_size field, which may be telling some
"unusual" condition to us.




Re: [PATCH v3 1/1] protocol: advertise multiple supported versions

2018-11-12 Thread Junio C Hamano
stead...@google.com writes:

> Currently the client advertises that it supports the wire protocol
> version set in the protocol.version config. However, not all services
> support the same set of protocol versions. When connecting to
> git-receive-pack, the client automatically downgrades to v0 if
> config.protocol is set to v2, but this check is not performed for other
> services.

"downgrades to v0 even if ... is set to v2" you mean?  Otherwise it
is unclear why asking for v2 leads to using v0.

> This patch creates a protocol version registry. Individual operations
> register all the protocol versions they support prior to communicating
> with a server. Versions should be listed in preference order; the
> version specified in protocol.version will automatically be moved to the
> front of the registry.
>
> The protocol version registry is passed to remote helpers via the
> GIT_PROTOCOL environment variable.
>
> Clients now advertise the full list of registered versions. Servers
> select the first recognized version from this advertisement.

Makes sense.

> +void get_client_protocol_version_advertisement(struct strbuf *advert)
> +{
> + int tmp_nr = nr_allowed_versions;
> + enum protocol_version *tmp_allowed_versions, config_version;
> + strbuf_reset(advert);
> +
> + have_advertised_versions_already = 1;
> +
> + config_version = get_protocol_version_config();
> + if (config_version == protocol_v0) {
> + strbuf_addstr(advert, "version=0");
> + return;
> + }
> +
> + if (tmp_nr > 0) {
> + ALLOC_ARRAY(tmp_allowed_versions, tmp_nr);
> + copy_array(tmp_allowed_versions, allowed_versions, tmp_nr,
> +sizeof(enum protocol_version));
> + } else {
> + ALLOC_ARRAY(tmp_allowed_versions, 1);
> + tmp_nr = 1;
> + tmp_allowed_versions[0] = config_version;
> + }
> +
> + if (tmp_allowed_versions[0] != config_version)
> + for (int i = 1; i < nr_allowed_versions; i++)
> + if (tmp_allowed_versions[i] == config_version) {
> + enum protocol_version swap =
> + tmp_allowed_versions[0];
> + tmp_allowed_versions[0] =
> + tmp_allowed_versions[i];
> + tmp_allowed_versions[i] = swap;
> + }
> +
> + strbuf_addf(advert, "version=%s",
> + format_protocol_version(tmp_allowed_versions[0]));
> + for (int i = 1; i < tmp_nr; i++)
> + strbuf_addf(advert, ":version=%s",
> + format_protocol_version(tmp_allowed_versions[i]));
> +}
> +

So the idea is that the protocols the other end can talk come in
advert in their preferred order, and we take an intersection of them
and our "allowed-versions", but the preference is further skewed
with the "swap" thing if we have our own preference specified via
config?

I am wondering if the code added by this patch outside this
function, with if (strcmp(client_ad.buf, "version=0") sprinkled all
over the place, works sensibly when the other side says "I prefer
version=0 but I do not mind talking version=1".

Isn't tmp_allowed_versions[] leaking when we return from this
function?


Re: [RFC PATCH] Using no- for options instead of duplication

2018-11-12 Thread Junio C Hamano
Fredi Fowler  writes:

Here is a space for you to justify the change and sign off your
patch (see Documentation/SubmittingPatches).

> ---

> Subject: Re: [RFC PATCH] Using no- for options instead of duplication

Try to see if you can format the title in ": "
form first, please.  I'll come back to it later.


>  Documentation/merge-options.txt | 21 +++--
>  1 file changed, 7 insertions(+), 14 deletions(-)

A quick counting (which may count false positives) tells me that

$ git grep -e '^--no-' Documentation | wc -l
124 
$ git grep -e '^--\[no-' Documentation | wc -l
44

you are standardizing to the minority way.

A tangent that somebody might want to tackle.  It would be
nice if we had a tool that takes a grep expression (like
'^--no' and '^\[no-' above) and shows histograms of the ages
of lines that match.  It might tell us that all 44 combined
ones are more recent (some of them may even have been
updated from the separate form) than the 124 separate ones,
in which case we can say "we started the process of
migrating to list options singly, like '--[no-]option', in
commit X; let's continue doing so" in the log message.  Or
it may turn out that we have been going in the other
direction and most of these 44 are stale ones yet to be
split.  Without such a tool, the above numbers are the best
measure to go by, which is not quite ideal.

As there are tons of split ones, not just in merge-options.txt, I
suspect that the  of the change can just be "doc", so a good
title may be

Subject: [PATCH] doc: list negative options as --[no-]option

or something like that.

If going in the direction of this patch were a good idea, that is.

I am actually not sure if it is a good idea, especially given that
the only change is the enumeration headers and without adjustment to
the text, though.

> diff --git a/Documentation/merge-options.txt b/Documentation/merge-options.txt
> index 63a3fc09548ab..fc1122ded51a0 100644
> --- a/Documentation/merge-options.txt
> +++ b/Documentation/merge-options.txt
> @@ -1,5 +1,4 @@
> ---commit::
> ---no-commit::
> +--[no-]commit::
>   Perform the merge and commit the result. This option can
>   be used to override --no-commit.
>  +
>  ...
>  With --no-commit perform the merge but pretend the merge
>  failed and do not autocommit, to give the user a chance to
>  inspect and further tweak the merge result before committing.

For example, the original for this one gives the behaviour for --commit
and --no-commit separately, and it visually makes it easier to see two
distinct header items.

Description of some other options read OK either way, which would
justify not touching the description when combining two headings
into one.  But that still does not justify the combining in the
first place.

FWIW, "git help -m merge" begins its OPTIONS section like this:

OPTIONS
   --commit, --no-commit
   Perform the merge and commit the result. This option can be used to
   override --no-commit.

   With --no-commit perform the merge but pretend the merge failed and
   do not autocommit, to give the user a chance to inspect and further
   tweak the merge result before committing.

which is different from heading with a single "--[no-]commit", but I
do not quite see why a single squished form is preferrable.  It does
not save lines, and it forces readers to split and reassemble two
options in their head while reading.


Re: [PATCH 2/2] read-cache: use time_t instead of unsigned long

2018-11-12 Thread Eric Sunshine
On Mon, Nov 12, 2018 at 3:41 AM Carlo Marcelo Arenas Belón
 wrote:
> b968372279 ("read-cache: unlink old sharedindex files", 2017-03-06)
> introduced get_shared_index_expire_date using unsigned long to track
> the modification times of a shared index.
>
> dddbad728c ("timestamp_t: a new data type for timestamps", 2017-04-26)
> shows why that might problematic so move to time_t instead.

s/might/& be/

> Signed-off-by: Carlo Marcelo Arenas Belón 


Re: Add issue management within git

2018-11-12 Thread Konstantin Khomoutov
On Mon, Nov 12, 2018 at 09:35:31AM +0800, yan ke wrote:

> > This would be awesome to handle issue directly with git:
> > Having an offline version of the issues synced to the gitlab/github issues.
> > A lot of work is done on the issues and it is lost when migrating
> > from one service to the other.
> > Beside we don’t always have a good internet connection.
> > There is already a kind of integration between commit message fixing
> > issue automatically when merged in the master branch (with “fix
> > #143’).
>Very very agree, now it is very difficult to find a solution when
> has some problem such build problem an so on! The mail-list is good to
> send patch es, but is it not suitable for problem track or problem
> solution search!
>Now the Github or Gitlab is good to track issues, suggest to open
> the git issue track!

Please don't hijack the discussion: the original poster did not question
the workflow adopted by the Git project itself but rather asked about
what is colloquially called "distributed bug tracker", and wanted to
have one integrated with (or into) Git. That is completely orthogonal
story.

As to searching for Git issues / problem solutions - I'd recommend using
the search on the main Git mailing list archive [1] and the issue
tracker of the Git for Windows project [2].

The communities around Git also include the "Git Users" low-volume
mailing list [3] (also perfectly searcheable), and the "git" tag at
StackOverflow [4].

1. https://public-inbox.org/git/
2. https://github.com/git-for-windows/git/issues
3. https://groups.google.com/forum/#!forum/git-users
4. https://stackoverflow.com/questions/tagged/git



Re: [RFC PATCH] Introduce "precious" file concept

2018-11-12 Thread Matthieu Moy
"Per Lundberg"  wrote:

> On 11/11/18 5:41 PM, Duy Nguyen wrote:
> > On Sun, Nov 11, 2018 at 1:33 PM Ævar Arnfjörð Bjarmason
> >  wrote:
>
> >> That will lose no data, and in the very rare cases where a checkout of
> >> tracked files would overwrite an ignored pattern, we can just error out
> >> (as we do with the "Ok to overwrite" branch removed) and tell the user
> >> to delete the files to proceed.
> > There's also the other side of the coin. If this refuse to overwrite
> > triggers too often, it can become an annoyance.

I may have missed some cases, but to me the cases when checkout may try
to overwrite an ignored file are essentially:

* Someone "git add"ed a file meant to be ignored by mistake (e.g.
  "git add -f *.o").

* A file that was meant to be kept private (e.g. config.mak.dev) ends
  up being tracked. This may happen when we find a way to make per-developer
  settings the same for everyone.

I both cases I'd want at least to be notified that something is going on,
and in the second I'd probably want to keep my local file around.

> If we feel thrashable is stretching it too far (which I don't think it
> is), we could add a "core.ignore_files_are_trashable" setting that
> brings back the old semantics, for those who have a strong feeling about it.

May I remind an idea I sugested in an old thread: add an intermediate level
where ignored files to be overwritten are renamed (eg. foo -> foo~ like Emacs'
backup files):

https://public-inbox.org/git/vpqd3t9656k@bauges.imag.fr/

One advantage of the "rename" behavior is that it's safer that the current,
but still not very disturbing for people who like the current behavior. This
makes it a good candidate for a default behavior.

This could come in complement with this thread's "precious" concept:

* If you know what you're doing and know that such or such file is precious,
  mark it as such and Git will never overwrite it.

* If you don't know about precious files, just keep the default setting and
  the worse that can happen is to get your file overwritten with a bakup
  of the old version kept around.

This would probably play better with a notion of "precious" files than with
a notion of "trashable" files.

-- 
Matthieu Moy
https://matthieu-moy.fr/


Re: [RFC PATCH] Introduce "precious" file concept

2018-11-12 Thread Junio C Hamano
Ævar Arnfjörð Bjarmason  writes:

> What I'd add to your list is:
>
> * Some projects (I've seen this in the wild) add e.g. *.mp3 or whatever
>   else usually doesn't belong in the repo as a "soft ignore". This is
>   something we've never recommended, but have implicitly supported since
>   the only caveats are a) you need a one-off "git add -f" and then
>   they're tracked b) them being missing from "status" before being
>   tracked c) the issue under discussion here.

Or only selected "*.o" (vendor supplied binary blob) kept tracked
while everything else is built from the source.

I do not know who you are referring to "we" in your sentence, but as
far as I am concerned, it has been and still is a BCP recommendation
on this list to deal with a case like that.


Re: [PATCH 03/10] fast-export: use value from correct enum

2018-11-12 Thread Ævar Arnfjörð Bjarmason
On Sun, Nov 11, 2018 at 9:10 PM Ævar Arnfjörð Bjarmason
 wrote:
>
>
> On Sun, Nov 11 2018, Jeff King wrote:
>
> > On Sat, Nov 10, 2018 at 10:23:05PM -0800, Elijah Newren wrote:
> >
> >> ABORT and ERROR happen to have the same value, but come from differnt
> >> enums.  Use the one from the correct enum.
> >
> > Yikes. :)
> >
> > This is a good argument for naming these SIGNED_TAG_ABORT, etc. But this
> > is obviously an improvement in the meantime.
>
> In C enum values aren't the types of the enum, but I'd thought someone
> would have added a warning for this:
>
> #include 
>
> enum { A, B } foo = A;
> enum { C, D } bar = C;
>
> int main(void)
> {
> switch (foo) {
>   case C:
> puts("A");
> break;
>   case B:
> puts("B");
> break;
> }
> }
>
> But none of the 4 C compilers (gcc, clang, suncc & xlc) I have warn
> about it. Good to know.

Asked GCC to implement it: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87983


Re: Add issue management within git

2018-11-12 Thread Konstantin Khomoutov
On Sun, Nov 11, 2018 at 11:50:00PM +0100, Martin Delille wrote:

> This would be awesome to handle issue directly with git: Having an
> offline version of the issues synced to the gitlab/github issues.  A
> lot of work is done on the issues and it is lost when migrating from
> one service to the other.  Beside we don’t always have a good internet
> connection.  There is already a kind of integration between commit
> message fixing issue automatically when merged in the master branch
> (with “fix #143’).

[1] is the last discussion (of many) happened on this list on this
topic. Please start there. It also contains pointers to past discussions
and past work done on that front.

1. 
https://public-inbox.org/git/CACSZ0Pwzs2e7E5RUEPDcEUsa=inzCyBAptU7YaCUw+5=mut...@mail.gmail.com/



Re: [PATCH 2/2] read-cache: use time_t instead of unsigned long

2018-11-12 Thread Junio C Hamano
Carlo Marcelo Arenas Belón   writes:

> b968372279 ("read-cache: unlink old sharedindex files", 2017-03-06)
> introduced get_shared_index_expire_date using unsigned long to track
> the modification times of a shared index.
>
> dddbad728c ("timestamp_t: a new data type for timestamps", 2017-04-26)
> shows why that might problematic so move to time_t instead.
>
> Signed-off-by: Carlo Marcelo Arenas Belón 
> ---
>  read-cache.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/read-cache.c b/read-cache.c
> index 7b1354d759..5525d8e679 100644
> --- a/read-cache.c
> +++ b/read-cache.c
> @@ -2625,9 +2625,9 @@ static int write_split_index(struct index_state *istate,
>  
>  static const char *shared_index_expire = "2.weeks.ago";
>  
> -static unsigned long get_shared_index_expire_date(void)
> +static time_t get_shared_index_expire_date(void)
>  {
> - static unsigned long shared_index_expire_date;
> + static time_t shared_index_expire_date;
>   static int shared_index_expire_date_prepared;
>  
>   if (!shared_index_expire_date_prepared) {

After this line, the post-context reads like this:

git_config_get_expiry("splitindex.sharedindexexpire",
  _index_expire);
shared_index_expire_date = approxidate(shared_index_expire);
shared_index_expire_date_prepared = 1;
}

return shared_index_expire_date;

Given that the function returns the value obtained from
approxidate(), which is approxidate_careful() in disguise, time_t is
not as appropriate as timestamp_t, no?

IOW, what if time_t were narrower than timestamp_t?


> @@ -2643,7 +2643,7 @@ static unsigned long get_shared_index_expire_date(void)
>  static int should_delete_shared_index(const char *shared_index_path)
>  {
>   struct stat st;
> - unsigned long expiration;
> + time_t expiration;
>  
>   /* Check timestamp */
>   expiration = get_shared_index_expire_date();


[PATCH 1/2] builtin/commit: use timestamp_t in parse_force_date

2018-11-12 Thread Carlo Marcelo Arenas Belón
when dddbad728c ("timestamp_t: a new data type for timestamps", 2017-04-26)
was introduced, the fallback to use approxidate that was introduced in
14ac2864dc ("commit: accept more date formats for "--date"", 2014-05-01)
was not updated to use the new type instead of unsigned long.

Signed-off-by: Carlo Marcelo Arenas Belón 
---
 builtin/commit.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/builtin/commit.c b/builtin/commit.c
index 0d9828e29e..a447e08f62 100644
--- a/builtin/commit.c
+++ b/builtin/commit.c
@@ -537,10 +537,10 @@ static int parse_force_date(const char *in, struct strbuf 
*out)
 
if (parse_date(in, out) < 0) {
int errors = 0;
-   unsigned long t = approxidate_careful(in, );
+   timestamp_t t = approxidate_careful(in, );
if (errors)
return -1;
-   strbuf_addf(out, "%lu", t);
+   strbuf_addf(out, "%"PRItime, t);
}
 
return 0;
-- 
2.19.1.856.g8858448bb



[PATCH 2/2] read-cache: use time_t instead of unsigned long

2018-11-12 Thread Carlo Marcelo Arenas Belón
b968372279 ("read-cache: unlink old sharedindex files", 2017-03-06)
introduced get_shared_index_expire_date using unsigned long to track
the modification times of a shared index.

dddbad728c ("timestamp_t: a new data type for timestamps", 2017-04-26)
shows why that might problematic so move to time_t instead.

Signed-off-by: Carlo Marcelo Arenas Belón 
---
 read-cache.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/read-cache.c b/read-cache.c
index 7b1354d759..5525d8e679 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -2625,9 +2625,9 @@ static int write_split_index(struct index_state *istate,
 
 static const char *shared_index_expire = "2.weeks.ago";
 
-static unsigned long get_shared_index_expire_date(void)
+static time_t get_shared_index_expire_date(void)
 {
-   static unsigned long shared_index_expire_date;
+   static time_t shared_index_expire_date;
static int shared_index_expire_date_prepared;
 
if (!shared_index_expire_date_prepared) {
@@ -2643,7 +2643,7 @@ static unsigned long get_shared_index_expire_date(void)
 static int should_delete_shared_index(const char *shared_index_path)
 {
struct stat st;
-   unsigned long expiration;
+   time_t expiration;
 
/* Check timestamp */
expiration = get_shared_index_expire_date();
-- 
2.19.1.856.g8858448bb



[PATCH 0/2] avoid unsigned long for timestamps

2018-11-12 Thread Carlo Marcelo Arenas Belón
specially problematic in Windows where unsigned long is only 32bit wide
and therefore the assumption that a time_t would fit will lead to loss
of precision in a 64bit OS.

 builtin/commit.c | 4 ++--
 read-cache.c | 6 +++---
 2 files changed, 5 insertions(+), 5 deletions(-)




Re: [PATCH v2 2/2] branch: Mark and colorize a branch differently if it is checked out in a linked worktree

2018-11-12 Thread Junio C Hamano
nbelakov...@gmail.com writes:

> diff --git a/color.h b/color.h
> index 98894d6a17..857653df73 100644
> --- a/color.h
> +++ b/color.h
> @@ -42,6 +42,24 @@ struct strbuf;
>  #define GIT_COLOR_FAINT_BLUE "\033[2;34m"
>  #define GIT_COLOR_FAINT_MAGENTA  "\033[2;35m"
>  #define GIT_COLOR_FAINT_CYAN "\033[2;36m"
> +#define GIT_COLOR_LIGHT_RED  "\033[91m"
> +#define GIT_COLOR_LIGHT_GREEN"\033[92m"
> +#define GIT_COLOR_LIGHT_YELLOW   "\033[93m"
> +#define GIT_COLOR_LIGHT_BLUE "\033[94m"
> +#define GIT_COLOR_LIGHT_MAGENTA  "\033[95m"
> +#define GIT_COLOR_LIGHT_CYAN "\033[96m"
> +#define GIT_COLOR_BOLD_LIGHT_RED "\033[1;91m"
> +#define GIT_COLOR_BOLD_LIGHT_GREEN   "\033[1;92m"
> +#define GIT_COLOR_BOLD_LIGHT_YELLOW  "\033[1;93m"
> +#define GIT_COLOR_BOLD_LIGHT_BLUE"\033[1;94m"
> +#define GIT_COLOR_BOLD_LIGHT_MAGENTA "\033[1;95m"
> +#define GIT_COLOR_BOLD_LIGHT_CYAN"\033[1;96m"
> +#define GIT_COLOR_FAINT_LIGHT_RED"\033[2;91m"
> +#define GIT_COLOR_FAINT_LIGHT_GREEN  "\033[2;92m"
> +#define GIT_COLOR_FAINT_LIGHT_YELLOW "\033[2;93m"
> +#define GIT_COLOR_FAINT_LIGHT_BLUE   "\033[2;94m"
> +#define GIT_COLOR_FAINT_LIGHT_MAGENTA"\033[2;95m"
> +#define GIT_COLOR_FAINT_LIGHT_CYAN   "\033[2;96m"

Hopefully you made sure that there is no other topic in-flight that
touch this area before doing this change?  Otherwise you'd be
creating pointless merge conflict by futzing with spaces.

Ditto for an earlier hunk of this patch.

Thanks.


Re: [PATCH 2/2] built-in rebase: reinstate `checkout -q` behavior where appropriate

2018-11-12 Thread Johannes Schindelin
Hi Peff,

On Fri, 9 Nov 2018, Jeff King wrote:

> On Fri, Nov 09, 2018 at 06:21:41PM +0100, Johannes Schindelin wrote:
> 
> > Actually, you got me thinking about the desc.buffer. And I think there is
> > one corner case where it could cause a problem: `struct tree_desc desc[2]`
> > does not initialize the buffers to NULL. And what if
> > fill_tree_descriptor() function returns NULL? Then the buffer is still
> > uninitialized.
> > 
> > In practice, our *current* version of fill_tree_descriptor() never returns
> > NULL if the oid parameter is non-NULL. It would die() in the error case
> > instead (bad!). So to prepare for a less bad version, I'd rather
> > initialize the `desc` array and be on the safe (and easier to reason
> > about) side.
> 
> Yeah, I agree with all of that.
> 
> One solution would just be to increment only after success:
> 
>   if (fill_tree_descriptor([nr], ..) < 0)
>   goto error;
>   nr++; /* now we know it's valid! */
> 
> But there are lots of alternatives.  :)

True. I simply prefer to initialize it and be done with it. ;-)

Ciao,
Dscho


Re: [PATCH v2 1/1] remote-curl.c: xcurl_off_t is not portable (on 32 bit platfoms)

2018-11-12 Thread Junio C Hamano
Torsten Bögershausen  writes:

> Thanks, is there a chance to kill a typo here ?
> s/comopared/compared/
> - A temporally variable "size" is used, promoted int uintmax_t and the 
> comopared

Done.  Thanks.


Re: Add issue management within git

2018-11-12 Thread Ævar Arnfjörð Bjarmason


On Mon, Nov 12 2018, Konstantin Khomoutov wrote:

> On Mon, Nov 12, 2018 at 09:35:31AM +0800, yan ke wrote:
>
>> > This would be awesome to handle issue directly with git:
>> > Having an offline version of the issues synced to the gitlab/github issues.
>> > A lot of work is done on the issues and it is lost when migrating
>> > from one service to the other.
>> > Beside we don’t always have a good internet connection.
>> > There is already a kind of integration between commit message fixing
>> > issue automatically when merged in the master branch (with “fix
>> > #143’).
>>Very very agree, now it is very difficult to find a solution when
>> has some problem such build problem an so on! The mail-list is good to
>> send patch es, but is it not suitable for problem track or problem
>> solution search!
>>Now the Github or Gitlab is good to track issues, suggest to open
>> the git issue track!
>
> Please don't hijack the discussion: the original poster did not question
> the workflow adopted by the Git project itself but rather asked about
> what is colloquially called "distributed bug tracker", and wanted to
> have one integrated with (or into) Git. That is completely orthogonal
> story.

Correct, but let's assume good faith here and presume yan ke just
misread the original E-mail. Many of us (and perhaps yourself) are
participating in our second, third, fourth etc. language on this list :)

> As to searching for Git issues / problem solutions - I'd recommend using
> the search on the main Git mailing list archive [1] and the issue
> tracker of the Git for Windows project [2].
>
> The communities around Git also include the "Git Users" low-volume
> mailing list [3] (also perfectly searcheable), and the "git" tag at
> StackOverflow [4].
>
> 1. https://public-inbox.org/git/
> 2. https://github.com/git-for-windows/git/issues
> 3. https://groups.google.com/forum/#!forum/git-users
> 4. https://stackoverflow.com/questions/tagged/git

Yeah. I'll add to that that this specific thing has been discussed here
really recently:

https://public-inbox.org/git/CACSZ0Pwzs2e7E5RUEPDcEUsa=inzCyBAptU7YaCUw+5=mut...@mail.gmail.com/
https://github.com/MichaelMure/git-bug/

So Martin, there's already a nascent tool that does this. It looks like
the main thing it needs now is users & testers.


Re: [PATCH v2 1/2] ref-filter: add worktree atom

2018-11-12 Thread Junio C Hamano
nbelakov...@gmail.com writes:

>  
> +static int worktree_head_atom_parser(const struct ref_format *format,
> +  struct 
> used_atom *atom,
> +  const 
> char *arg,
> +  struct 
> strbuf *unused_err)

This and ...

> +{
> + struct worktree **worktrees = get_worktrees(0);
> + int i;
> +
> + string_list_init(>u.worktree_heads, 1);
> +
> + for (i = 0; worktrees[i]; i++) {
> + if (worktrees[i]->head_ref)
> + string_list_append(>u.worktree_heads,
> +
> worktrees[i]->head_ref);

... this makes me suspect that you are using tabstop != 8 and that
is causing you to indent these lines overly deeply.

Please don't, while working on this codebase.


> + }
> +
> + string_list_sort(>u.worktree_heads);
> +
> + free_worktrees(worktrees);
> + return 0;
> +}

So..., this function collects any and all branches that are checked
out in some worktree, and sort them _without_ dedup.  The user of
the resulting information (i.e. atom->u.worktree_heads) cannot tell
where each of the listed branches is checked out.

I wonder if "The worktree at /local/src/wt1 has this branch checked
out" is something the user of %(worktree) atom, or a variant thereof
e.g. "%(worktree:detailed)", may want to learn, but because that
information is lost when this function returns, such an enhancement
cannot be done without fixing this funciton.

Also, I am not sure if this "list of some info on worktrees" really
belongs to an individual atom.  For one thing, if a format includes
more than one instance of %(worktree) atoms, you'd iterate over the
worktrees as many times as the number of these atoms you have.  Is
there another existing atom that "caches" expensive piece of
information per used_atom[] element like this one?  Essentially I am
trying to convince myself that the approach taken by the patch is a
sane one by finding a precedent.

> + } else if (!strcmp(name, "worktree")) {
> + if (string_list_has_string(>u.worktree_heads, 
> ref->refname))

I thought we were moving towards killing the use of string_list as a
look-up table, as we do not want to see thoughtless copy such
a code from parts of the code that are not performance critical to a
part.  Not very satisfying.

I think we can let this pass, and later add a wrapper around
hashmap that is meant to only be used to replace string-list
used for this exact purpose, i.e. key is a string, and there
is no need to iterate over the existing elements in any
sorted order.  Optionally, we can limit the look up to only
checking for existence, if it makes the code for the wrapper
simpler.

> + v->s = xstrdup("+");
> + else
> + v->s = xstrdup(" ");
> + continue;
>   } else if (starts_with(name, "align")) {
>   v->handler = align_atom_handler;
>   v->s = xstrdup("");
> diff --git a/t/t6302-for-each-ref-filter.sh b/t/t6302-for-each-ref-filter.sh
> index fc067ed672..5e6d249d4c 100755
> --- a/t/t6302-for-each-ref-filter.sh
> +++ b/t/t6302-for-each-ref-filter.sh
> @@ -441,4 +441,19 @@ test_expect_success '--merged is incompatible with 
> --no-merged' '
>   test_must_fail git for-each-ref --merged HEAD --no-merged HEAD
>  '
>  
> +test_expect_success '"add" a worktree' '
> + mkdir worktree_dir &&
> + git worktree add -b master_worktree worktree_dir master
> +'
> +
> +test_expect_success 'validate worktree atom' '
> + cat >expect <<-\EOF &&
> + master: checked out in a worktree
> + master_worktree: checked out in a worktree
> + side: not checked out in a worktree

As you started the here-doc with <<-, the next line EOF does not
have to be flushed to the left.  Indent it just the same way with a
tab.

> +EOF

The following line begins with a broken indentation, it seems.

> +git for-each-ref --format="%(refname:short): 
> %(if)%(worktree)%(then)checked out in a worktree%(else)not checked out in a 
> worktree%(end)" refs/heads/ >actual &&
> + test_cmp expect actual
> +'
> +
>  test_done


Re: [PATCH v2] format-patch: respect --stat in cover letter's diffstat

2018-11-12 Thread Laszlo Ersek
On 11/10/18 06:46, Nguyễn Thái Ngọc Duy wrote:
> Commit 43662b23ab (format-patch: keep cover-letter diffstat wrapped in
> 72 columns - 2018-01-24) uncondtionally sets stat width to 72 when
> generating diffstat for the cover letter, ignoring --stat from command
> line. But it should only do so when stat width is still default
> (i.e. stat_width == 0).
> 
> In order to fix this, we should only set stat_width if stat_width is
> zero. But it will never be. Commit 071dd0ba43 (format-patch: reduce
> patch diffstat width to 72 - 2018-02-01) makes sure that default stat
> width will be 72 (ignoring $COLUMNS, but could still be overriden by
> --stat). So all we need to do here is drop the assignment.
> 
> Reported-by: Laszlo Ersek 
> Helped-by: Leif Lindholm 
> Signed-off-by: Nguyễn Thái Ngọc Duy 
> ---
>  builtin/log.c  |  2 --
>  t/t4052-stat-output.sh | 48 +-
>  2 files changed, 33 insertions(+), 17 deletions(-)

* This submission should have been posted as v3, not v2. V2 was posted at

https://public-inbox.org/git/20181107164953.24965-1-pclo...@gmail.com/

* Comparing the patch emails, the only difference is that this version
renames "expect.40" to "expect.60". This should have been mentioned in a
cover letter, or in the Notes section of the current submission.

* In my response to the (original) v2 posting, at

https://public-inbox.org/git/f0f95dd0-1a9e-01d0-70f4-3c6d5450d...@redhat.com/

I stated that I didn't try to run the test suite, and gave my T-b (under
the circumstances described there). Given that the change in v3 (= this
submission) is limited to the test case, I think my T-b should have been
carried forward.

Thanks
Laszlo



> diff --git a/builtin/log.c b/builtin/log.c
> index 061d4fd864..1a39c6e52a 100644
> --- a/builtin/log.c
> +++ b/builtin/log.c
> @@ -1009,8 +1009,6 @@ static void show_diffstat(struct rev_info *rev,
>  
>   memcpy(, >diffopt, sizeof(opts));
>   opts.output_format = DIFF_FORMAT_SUMMARY | DIFF_FORMAT_DIFFSTAT;
> - opts.stat_width = MAIL_DEFAULT_WRAP;
> -
>   diff_setup_done();
>  
>   diff_tree_oid(get_commit_tree_oid(origin),
> diff --git a/t/t4052-stat-output.sh b/t/t4052-stat-output.sh
> index 6e2cf933f7..28c053849a 100755
> --- a/t/t4052-stat-output.sh
> +++ b/t/t4052-stat-output.sh
> @@ -44,42 +44,50 @@ show --stat
>  log -1 --stat
>  EOF
>  
> -while read cmd args
> +cat >expect.60 <<-'EOF'
> + ...a | 1 +
> +EOF
> +cat >expect.6030 <<-'EOF'
> + ...aaa | 1 +
> +EOF
> +cat >expect2.60 <<-'EOF'
> + ...a | 1 +
> + ...a | 1 +
> +EOF
> +cat >expect2.6030 <<-'EOF'
> + ...aaa | 1 +
> + ...aaa | 1 +
> +EOF
> +while read expect cmd args
>  do
> - cat >expect <<-'EOF'
> -  ...a | 1 +
> - EOF
>   test_expect_success "$cmd --stat=width: a long name is given more room 
> when the bar is short" '
>   git $cmd $args --stat=40 >output &&
>   grep " | " output >actual &&
> - test_cmp expect actual
> + test_cmp $expect.60 actual
>   '
>  
>   test_expect_success "$cmd --stat-width=width with long name" '
>   git $cmd $args --stat-width=40 >output &&
>   grep " | " output >actual &&
> - test_cmp expect actual
> + test_cmp $expect.60 actual
>   '
>  
> - cat >expect <<-'EOF'
> -  ...aaa | 1 +
> - EOF
>   test_expect_success "$cmd --stat=...,name-width with long name" '
>   git $cmd $args --stat=60,30 >output &&
>   grep " | " output >actual &&
> - test_cmp expect actual
> + test_cmp $expect.6030 actual
>   '
>  
>   test_expect_success "$cmd --stat-name-width with long name" '
>   git $cmd $args --stat-name-width=30 >output &&
>   grep " | " output >actual &&
> - test_cmp expect actual
> + test_cmp $expect.6030 actual
>   '
>  done <<\EOF
> -format-patch -1 --stdout
> -diff HEAD^ HEAD --stat
> -show --stat
> -log -1 --stat
> +expect2 format-patch --cover-letter -1 --stdout
> +expect diff HEAD^ HEAD --stat
> +expect show --stat
> +expect log -1 --stat
>  EOF
>  
>  
> @@ -95,6 +103,16 @@ test_expect_success 'preparation for big change tests' '
>   git commit -m message abcd
>  '
>  
> +cat >expect72 <<'EOF'
> + abcd | 1000 ++
> + abcd | 1000 ++
> +EOF
> +test_expect_success "format-patch --cover-letter ignores COLUMNS (big 
> change)" '
> + COLUMNS=200 git format-patch -1 --stdout --cover-letter >output &&
> + grep " | " output >actual &&
> + test_cmp expect72 actual
> +'
> +
>  cat >expect72 <<'EOF'
>   abcd | 1000 

[PATCH 2/2] read-cache: use time_t instead of unsigned long

2018-11-12 Thread Carlo Marcelo Arenas Belón
There are still some more possible improvements around this code but
they are orthogonal to this change :

* migrate to approxidate_careful or parse_expiry_date
* maybe make sure only approxidate are used for expiration

Changes in v2:
* improved commit message as suggested by Eric
* failsafe against time_t truncation as suggested by Junio

-- >8 --
Subject: [PATCH v2 2/2] read-cache: use time specific types instead of
 unsigned long

b968372279 ("read-cache: unlink old sharedindex files", 2017-03-06)
introduced get_shared_index_expire_date using unsigned long to track
the modification times of a shared index.

dddbad728c ("timestamp_t: a new data type for timestamps", 2017-04-26)
shows why that might be problematic so move to timestamp_t/time_t.

if time_t can't represent a valid time keep the indexes for failsafe

Signed-off-by: Carlo Marcelo Arenas Belón 
---
 read-cache.c | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/read-cache.c b/read-cache.c
index 7b1354d759..7d322f11c8 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -2625,9 +2625,9 @@ static int write_split_index(struct index_state *istate,
 
 static const char *shared_index_expire = "2.weeks.ago";
 
-static unsigned long get_shared_index_expire_date(void)
+static timestamp_t get_shared_index_expire_date(void)
 {
-   static unsigned long shared_index_expire_date;
+   static timestamp_t shared_index_expire_date;
static int shared_index_expire_date_prepared;
 
if (!shared_index_expire_date_prepared) {
@@ -2643,12 +2643,12 @@ static unsigned long get_shared_index_expire_date(void)
 static int should_delete_shared_index(const char *shared_index_path)
 {
struct stat st;
-   unsigned long expiration;
+   time_t expiration;
+   timestamp_t t = get_shared_index_expire_date();
 
-   /* Check timestamp */
-   expiration = get_shared_index_expire_date();
-   if (!expiration)
+   if (!t || date_overflows(t))
return 0;
+   expiration = t;
if (stat(shared_index_path, ))
return error_errno(_("could not stat '%s'"), shared_index_path);
if (st.st_mtime > expiration)
-- 
2.19.1.856.g8858448bb



Re: [PATCH 03/10] fast-export: use value from correct enum

2018-11-12 Thread Jeff King
On Sun, Nov 11, 2018 at 09:10:17PM +0100, Ævar Arnfjörð Bjarmason wrote:

> > This is a good argument for naming these SIGNED_TAG_ABORT, etc. But this
> > is obviously an improvement in the meantime.
> 
> In C enum values aren't the types of the enum, but I'd thought someone
> would have added a warning for this:
> 
> #include 
> 
> enum { A, B } foo = A;
> enum { C, D } bar = C;
> 
> int main(void)
> {
> switch (foo) {
>   case C:
> puts("A");
> break;
>   case B:
> puts("B");
> break;
> }
> }
> 
> But none of the 4 C compilers (gcc, clang, suncc & xlc) I have warn
> about it. Good to know.

There is -Wenum-compare, but it does not seem to catch this (and is
enabled by -Wall). It (gcc, at least) does catch:

enum foo { A, B };
enum bar { C, D };

int f(enum foo x)
{
return x == C;
}

but converting that equality check to:

switch (x) {
case C:
return 1;
default:
return 0;
}

is not (which is essentially the same as your snippet). So I think the
bug / feature request is to have -Wenum-compare apply to switch
statements.

Clang has -Wenum-compare-switch, but I cannot seem to get it to complain
about even the "==" version using -Wenum-compare. Not sure if it's
buggy, or if I'm holding it wrong. This patch seems to be what we want:

  https://reviews.llvm.org/D36407

-Peff


[PATCH v2 2/3] rebase: prepare reset_head() for more flags

2018-11-12 Thread Johannes Schindelin via GitGitGadget
From: Johannes Schindelin 

Currently, we only accept the flag indicating whether the HEAD should be
detached not. In the next commit, we want to introduce another flag: to
toggle between emulating `reset --hard` vs `checkout -q`.

Signed-off-by: Johannes Schindelin 
---
 builtin/rebase.c | 9 ++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/builtin/rebase.c b/builtin/rebase.c
index e173654d56..074594cf10 100644
--- a/builtin/rebase.c
+++ b/builtin/rebase.c
@@ -522,10 +522,13 @@ finished_rebase:
 
 #define GIT_REFLOG_ACTION_ENVIRONMENT "GIT_REFLOG_ACTION"
 
+#define RESET_HEAD_DETACH (1<<0)
+
 static int reset_head(struct object_id *oid, const char *action,
- const char *switch_to_branch, int detach_head,
+ const char *switch_to_branch, unsigned flags,
  const char *reflog_orig_head, const char *reflog_head)
 {
+   unsigned detach_head = flags & RESET_HEAD_DETACH;
struct object_id head_oid;
struct tree_desc desc;
struct lock_file lock = LOCK_INIT;
@@ -1500,8 +1503,8 @@ int cmd_rebase(int argc, const char **argv, const char 
*prefix)
 "it...\n"));
 
strbuf_addf(, "rebase: checkout %s", options.onto_name);
-   if (reset_head(>object.oid, "checkout", NULL, 1,
-   NULL, msg.buf))
+   if (reset_head(>object.oid, "checkout", NULL,
+  RESET_HEAD_DETACH, NULL, msg.buf))
die(_("Could not detach HEAD"));
strbuf_release();
 
-- 
gitgitgadget



[PATCH v2 0/3] Fix built-in rebase perf regression

2018-11-12 Thread Johannes Schindelin via GitGitGadget
In our tests with large repositories, we noticed a serious regression of the
performance of git rebase when using the built-in vs the shell script
version. It boils down to an incorrect conversion of a git checkout -q:
instead of using a twoway_merge as git checkout does, we used a oneway_merge 
as git reset does. The latter, however, calls lstat() on all files listed in
the index, while the former essentially looks only at the files that are
different between the given two revisions.

Let's reinstate the original behavior by introducing a flag to the 
reset_head() function to indicate whether we want to emulate reset --hard 
(in which case we use the oneway_merge, otherwise we use twoway_merge).

Johannes Schindelin (3):
  rebase: consolidate clean-up code before leaving reset_head()
  rebase: prepare reset_head() for more flags
  built-in rebase: reinstate `checkout -q` behavior where appropriate

 builtin/rebase.c | 79 
 1 file changed, 46 insertions(+), 33 deletions(-)


base-commit: 8858448bb49332d353febc078ce4a3abcc962efe
Published-As: 
https://github.com/gitgitgadget/git/releases/tags/pr-72%2Fdscho%2Fbuiltin-rebase-perf-regression-v2
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git 
pr-72/dscho/builtin-rebase-perf-regression-v2
Pull-Request: https://github.com/gitgitgadget/git/pull/72

Range-diff vs v1:

 1:  64597fe827 ! 1:  28e24d98ab rebase: consolidate clean-up code before 
leaving reset_head()
 @@ -11,6 +11,33 @@
  --- a/builtin/rebase.c
  +++ b/builtin/rebase.c
  @@
 +  if (switch_to_branch && !starts_with(switch_to_branch, "refs/"))
 +  BUG("Not a fully qualified branch: '%s'", switch_to_branch);
 + 
 +- if (hold_locked_index(, LOCK_REPORT_ON_ERROR) < 0)
 +- return -1;
 ++ if (hold_locked_index(, LOCK_REPORT_ON_ERROR) < 0) {
 ++ ret = -1;
 ++ goto leave_reset_head;
 ++ }
 + 
 +  if (!oid) {
 +  if (get_oid("HEAD", _oid)) {
 +- rollback_lock_file();
 +- return error(_("could not determine HEAD revision"));
 ++ ret = error(_("could not determine HEAD revision"));
 ++ goto leave_reset_head;
 +  }
 +  oid = _oid;
 +  }
 +@@
 +  unpack_tree_opts.reset = 1;
 + 
 +  if (read_index_unmerged(the_repository->index) < 0) {
 +- rollback_lock_file();
 +- return error(_("could not read index"));
 ++ ret = error(_("could not read index"));
 ++ goto leave_reset_head;
}
   
if (!fill_tree_descriptor(, oid)) {
 @@ -31,15 +58,17 @@
}
   
tree = parse_tree_indirect(oid);
 -@@
 +  prime_cache_tree(the_repository->index, tree);
   
 -  if (write_locked_index(the_repository->index, , COMMIT_LOCK) < 0)
 +- if (write_locked_index(the_repository->index, , COMMIT_LOCK) < 0)
 ++ if (write_locked_index(the_repository->index, , COMMIT_LOCK) < 0) {
ret = error(_("could not write index"));
  - free((void *)desc.buffer);
 - 
 -  if (ret)
 +-
 +- if (ret)
  - return ret;
  + goto leave_reset_head;
 ++ }
   
reflog_action = getenv(GIT_REFLOG_ACTION_ENVIRONMENT);
strbuf_addf(, "%s: ", reflog_action ? reflog_action : "rebase");
 -:  -- > 2:  db963b2094 rebase: prepare reset_head() for more flags
 2:  070092b430 ! 3:  a7360b856f built-in rebase: reinstate `checkout -q` 
behavior where appropriate
 @@ -20,15 +20,18 @@
  @@
   #define GIT_REFLOG_ACTION_ENVIRONMENT "GIT_REFLOG_ACTION"
   
 + #define RESET_HEAD_DETACH (1<<0)
 ++#define RESET_HEAD_HARD (1<<1)
 + 
   static int reset_head(struct object_id *oid, const char *action,
 --   const char *switch_to_branch, int detach_head,
 -+   const char *switch_to_branch,
 -+   int detach_head, int reset_hard,
 +const char *switch_to_branch, unsigned flags,
  const char *reflog_orig_head, const char *reflog_head)
   {
 +  unsigned detach_head = flags & RESET_HEAD_DETACH;
 ++ unsigned reset_hard = flags & RESET_HEAD_HARD;
struct object_id head_oid;
  - struct tree_desc desc;
 -+ struct tree_desc desc[2];
 ++ struct tree_desc desc[2] = { { NULL }, { NULL } };
struct lock_file lock = LOCK_INIT;
struct unpack_trees_options unpack_tree_opts;
struct tree *tree;
 @@ -42,18 +45,18 @@
if (switch_to_branch && !starts_with(switch_to_branch, "refs/"))
BUG("Not a fully qualified branch: '%s'", switch_to_branch);
  @@
 -  if (hold_locked_index(, LOCK_REPORT_ON_ERROR) < 0)
 -  return -1;
 +  goto leave_reset_head;
 +  }
   
  - if (!oid) {
  - if 

[PATCH v2 1/3] rebase: consolidate clean-up code before leaving reset_head()

2018-11-12 Thread Johannes Schindelin via GitGitGadget
From: Johannes Schindelin 

The same clean-up code is repeated quite a few times; Let's DRY up the
code some.

Signed-off-by: Johannes Schindelin 
---
 builtin/rebase.c | 36 ++--
 1 file changed, 18 insertions(+), 18 deletions(-)

diff --git a/builtin/rebase.c b/builtin/rebase.c
index 0ee06aa363..e173654d56 100644
--- a/builtin/rebase.c
+++ b/builtin/rebase.c
@@ -541,13 +541,15 @@ static int reset_head(struct object_id *oid, const char 
*action,
if (switch_to_branch && !starts_with(switch_to_branch, "refs/"))
BUG("Not a fully qualified branch: '%s'", switch_to_branch);
 
-   if (hold_locked_index(, LOCK_REPORT_ON_ERROR) < 0)
-   return -1;
+   if (hold_locked_index(, LOCK_REPORT_ON_ERROR) < 0) {
+   ret = -1;
+   goto leave_reset_head;
+   }
 
if (!oid) {
if (get_oid("HEAD", _oid)) {
-   rollback_lock_file();
-   return error(_("could not determine HEAD revision"));
+   ret = error(_("could not determine HEAD revision"));
+   goto leave_reset_head;
}
oid = _oid;
}
@@ -564,32 +566,27 @@ static int reset_head(struct object_id *oid, const char 
*action,
unpack_tree_opts.reset = 1;
 
if (read_index_unmerged(the_repository->index) < 0) {
-   rollback_lock_file();
-   return error(_("could not read index"));
+   ret = error(_("could not read index"));
+   goto leave_reset_head;
}
 
if (!fill_tree_descriptor(, oid)) {
-   error(_("failed to find tree of %s"), oid_to_hex(oid));
-   rollback_lock_file();
-   free((void *)desc.buffer);
-   return -1;
+   ret = error(_("failed to find tree of %s"), oid_to_hex(oid));
+   goto leave_reset_head;
}
 
if (unpack_trees(1, , _tree_opts)) {
-   rollback_lock_file();
-   free((void *)desc.buffer);
-   return -1;
+   ret = -1;
+   goto leave_reset_head;
}
 
tree = parse_tree_indirect(oid);
prime_cache_tree(the_repository->index, tree);
 
-   if (write_locked_index(the_repository->index, , COMMIT_LOCK) < 0)
+   if (write_locked_index(the_repository->index, , COMMIT_LOCK) < 0) {
ret = error(_("could not write index"));
-   free((void *)desc.buffer);
-
-   if (ret)
-   return ret;
+   goto leave_reset_head;
+   }
 
reflog_action = getenv(GIT_REFLOG_ACTION_ENVIRONMENT);
strbuf_addf(, "%s: ", reflog_action ? reflog_action : "rebase");
@@ -622,7 +619,10 @@ static int reset_head(struct object_id *oid, const char 
*action,
 UPDATE_REFS_MSG_ON_ERR);
}
 
+leave_reset_head:
strbuf_release();
+   rollback_lock_file();
+   free((void *)desc.buffer);
return ret;
 }
 
-- 
gitgitgadget



[PATCH v2 3/3] built-in rebase: reinstate `checkout -q` behavior where appropriate

2018-11-12 Thread Johannes Schindelin via GitGitGadget
From: Johannes Schindelin 

When we converted a `git checkout -q $onto^0` call to use
`reset_head()`, we inadvertently incurred a change from a twoway_merge
to a oneway_merge, as if we wanted a `git reset --hard` instead.

This has performance ramifications under certain, though, as the
oneway_merge needs to lstat() every single index entry whereas
twoway_merge does not.

So let's go back to the old behavior.

Signed-off-by: Johannes Schindelin 
---
 builtin/rebase.c | 40 +---
 1 file changed, 25 insertions(+), 15 deletions(-)

diff --git a/builtin/rebase.c b/builtin/rebase.c
index 074594cf10..dc78c1497d 100644
--- a/builtin/rebase.c
+++ b/builtin/rebase.c
@@ -523,14 +523,16 @@ finished_rebase:
 #define GIT_REFLOG_ACTION_ENVIRONMENT "GIT_REFLOG_ACTION"
 
 #define RESET_HEAD_DETACH (1<<0)
+#define RESET_HEAD_HARD (1<<1)
 
 static int reset_head(struct object_id *oid, const char *action,
  const char *switch_to_branch, unsigned flags,
  const char *reflog_orig_head, const char *reflog_head)
 {
unsigned detach_head = flags & RESET_HEAD_DETACH;
+   unsigned reset_hard = flags & RESET_HEAD_HARD;
struct object_id head_oid;
-   struct tree_desc desc;
+   struct tree_desc desc[2] = { { NULL }, { NULL } };
struct lock_file lock = LOCK_INIT;
struct unpack_trees_options unpack_tree_opts;
struct tree *tree;
@@ -539,7 +541,7 @@ static int reset_head(struct object_id *oid, const char 
*action,
size_t prefix_len;
struct object_id *orig = NULL, oid_orig,
*old_orig = NULL, oid_old_orig;
-   int ret = 0;
+   int ret = 0, nr = 0;
 
if (switch_to_branch && !starts_with(switch_to_branch, "refs/"))
BUG("Not a fully qualified branch: '%s'", switch_to_branch);
@@ -549,20 +551,20 @@ static int reset_head(struct object_id *oid, const char 
*action,
goto leave_reset_head;
}
 
-   if (!oid) {
-   if (get_oid("HEAD", _oid)) {
-   ret = error(_("could not determine HEAD revision"));
-   goto leave_reset_head;
-   }
-   oid = _oid;
+   if ((!oid || !reset_hard) && get_oid("HEAD", _oid)) {
+   ret = error(_("could not determine HEAD revision"));
+   goto leave_reset_head;
}
 
+   if (!oid)
+   oid = _oid;
+
memset(_tree_opts, 0, sizeof(unpack_tree_opts));
setup_unpack_trees_porcelain(_tree_opts, action);
unpack_tree_opts.head_idx = 1;
unpack_tree_opts.src_index = the_repository->index;
unpack_tree_opts.dst_index = the_repository->index;
-   unpack_tree_opts.fn = oneway_merge;
+   unpack_tree_opts.fn = reset_hard ? oneway_merge : twoway_merge;
unpack_tree_opts.update = 1;
unpack_tree_opts.merge = 1;
if (!detach_head)
@@ -573,12 +575,17 @@ static int reset_head(struct object_id *oid, const char 
*action,
goto leave_reset_head;
}
 
-   if (!fill_tree_descriptor(, oid)) {
+   if (!reset_hard && !fill_tree_descriptor([nr++], _oid)) {
+   ret = error(_("failed to find tree of %s"), oid_to_hex(oid));
+   goto leave_reset_head;
+   }
+
+   if (!fill_tree_descriptor([nr++], oid)) {
ret = error(_("failed to find tree of %s"), oid_to_hex(oid));
goto leave_reset_head;
}
 
-   if (unpack_trees(1, , _tree_opts)) {
+   if (unpack_trees(nr, desc, _tree_opts)) {
ret = -1;
goto leave_reset_head;
}
@@ -625,7 +632,8 @@ static int reset_head(struct object_id *oid, const char 
*action,
 leave_reset_head:
strbuf_release();
rollback_lock_file();
-   free((void *)desc.buffer);
+   while (nr)
+   free((void *)desc[--nr].buffer);
return ret;
 }
 
@@ -1003,7 +1011,8 @@ int cmd_rebase(int argc, const char **argv, const char 
*prefix)
rerere_clear(_rr);
string_list_clear(_rr, 1);
 
-   if (reset_head(NULL, "reset", NULL, 0, NULL, NULL) < 0)
+   if (reset_head(NULL, "reset", NULL, RESET_HEAD_HARD,
+  NULL, NULL) < 0)
die(_("could not discard worktree changes"));
if (read_basic_state())
exit(1);
@@ -1019,7 +1028,8 @@ int cmd_rebase(int argc, const char **argv, const char 
*prefix)
if (read_basic_state())
exit(1);
if (reset_head(_head, "reset",
-  options.head_name, 0, NULL, NULL) < 0)
+  options.head_name, RESET_HEAD_HARD,
+  NULL, NULL) < 0)
die(_("could not move back to %s"),
oid_to_hex(_head));
ret = 

Re: [PATCH v2 1/1] Upcast size_t variables to uintmax_t when printing

2018-11-12 Thread Jeff King
On Sun, Nov 11, 2018 at 11:16:50AM +0100, Torsten Bögershausen wrote:

> > I like the overall direction. I feel a little funny doing this step now,
> > and not as part of a series to convert individual variables. But I
> > cannot offhand think of any reason that it would behave badly even if
> > the other part does not materialize
> > 
> 
> Hej all,
> There may be some background information missing:
> - I did a 2-patch series based on this commit in pu:
> commit 37c59c3e8fac8bae7ccc5baa148b0e9bae0c8d65
> Author: Junio C Hamano 
> Date:   Sat Oct 27 16:42:25 2018 +0900
> 
> treewide: apply cocci patch
> 
> (that patch was never send out, see below)
> 
> The week later, I tried to apply it on pu, but that was nearly hopeless,
> as too much things had changed on pu.
> I had the chance to compile & test it, but decided to take "part2" before
> "part1", so to say:
> Fix all the printing, and wait for the master branch to settle,
> and then do the "unsigned long" -> size_t conversion.
> That will probably happen after 2.20.

Ah, OK. I am fine with that approach. My thinking was that we'd see
individual functions and their callers converted, which is another way
to do it incrementally. But sometimes that ends up cascading and you end
up having to change quite a bit of the callstack anyway.

-Peff


Re: Migration to Git LFS inflates repository multiple times

2018-11-12 Thread Ævar Arnfjörð Bjarmason


On Mon, Nov 12 2018, Jeff King wrote:

> On Mon, Nov 12, 2018 at 12:47:42AM +0100, Mateusz Loskot wrote:
>
>> Hi,
>>
>> I'm posting here for the first time and I hope it's the right place to ask
>> questions about Git LFS.
>>
>> TL;TR: Is this normal a repository migrated to Git LFS inflates multiple 
>> times
>> and how to deal with it?
>
> That does sound odd to me. People with more LFS experience can probably
> give you a better answers, but one thought occurred to me: does LFS
> store backup copies of the original refs that it rewrites (similar to
> the way filter-branch stores refs/original)?
>
> If so, then the resulting repo has the new history _and_ the old
> history. Which might mean storing those large blobs both as Git objects
> (for the old history) and in an LFS cache directory (for the new
> history).
>
> And the right next step is probably to delete those backup refs, and
> then "git gc --prune=now". Hmm, actually thinking about it, reflogs
> could be making the old history reachable, too.
>
> Try looking at the output of "git for-each-ref" and seeing if there are
> any backup refs. After deleting them (or confirming that there aren't),
> prune the reflogs with:
>
>   git reflog expire --expire-unreachable=now --all
>
> and then "git gc --prune=now".

Even if it's only the most recent version of each file this could also
be explained by LFS storing each file inflated as-is on disk, whereas
git will store them delta-compressed.

According to the initial E-Mail "*.exe,*.dll,*.lib,*.pdb,*.zip" was
added to LFS. Depending on the content of those they might be delta
compressing somewhat better than random data.


Re: [PATCH 00/10] fast export and import fixes and features

2018-11-12 Thread Jeff King
On Sun, Nov 11, 2018 at 12:44:47AM -0800, Elijah Newren wrote:

> > > These patches were driven by the needs of git-repo-filter[1], but most
> > > if not all of them should be independently useful.
> >
> > I left lots of comments. Some of the earlier ones may just be showing my
> > confusion about fast-export works (some of which was cleared up by your
> > later patches). But I like the overall direction for sure.
> 
> Thanks for taking the time to read over the series and providing lots
> of feedback!  And, whoops, looks like it's gotten kinda late, so I'll
> check any further feedback on Monday.

Thank you for your patience with my sometimes-confused responses. :)

Overall it makes more sense to me now (and everything seems like a good
direction), with the exception that I'm still a bit confused about patch
10.

-Peff


Re: [RFC PATCH 1/5] ref-filter: add objectsize:disk option

2018-11-12 Thread Jeff King
On Mon, Nov 12, 2018 at 01:03:25PM +0100, Johannes Schindelin wrote:

> > oi.disk_size is off_t; do we know "long long" 
> > 
> >(1) is available widely enough (I think it is from c99)?
> > 
> >(2) is sufficiently wide so that we can safely cast off_t to?
> > 
> >(3) will stay to be sufficiently wide as machines get larger
> >together with standard types like off_t in the future?
> > 
> > I'd rather use intmax_t (as off_t is a signed integral type) so that
> > we do not have to worry about these questions in the first place.
> 
> You mean something like
> 
>   v->s = xstrfmt("%"PRIdMAX, (intmax_t)oi->disk_size);

I think elsewhere we simply use PRIuMAX for printing large sizes via
off_t; we know this value isn't going to be negative.

I'm not opposed to PRIdMAX, which _is_ more accurate, but...

> P.S.: I wondered whether we have precedent for PRIdMAX, as we used to use
> only PRIuMAX, but yes: JeffH's json-writer uses PRIdMAX.

That's pretty recent. I won't be surprised if we have to do some
preprocessor trickery to handle that at some point. We have a PRIuMAX
fallback already. That comes from c4001d92be (Use off_t when we really
mean a file offset., 2007-03-06), but it's not clear to me if that was
motivated by a real platform or an over-abundance of caution.

I'm OK with just using PRIdMAX as appropriate for now. It will serve as
a weather-balloon, and we can #define our way out of it later if need
be.

-Peff


Re: [RFC/PATCH 1/5] Makefile: move long inline shell loops in "install" into helper

2018-11-12 Thread Johannes Schindelin
Hi,

On Fri, 2 Nov 2018, Ævar Arnfjörð Bjarmason wrote:

> Move a 37 line for-loop mess out of "install" and into a helper
> script. This started out fairly innocent but over the years has grown
> into a hard-to-maintain monster, and my recent ad874608d8 ("Makefile:
> optionally symlink libexec/git-core binaries to bin/git", 2018-03-13)
> certainly didn't help.
> 
> The shell code is ported pretty much as-is (with getopts added), it'll
> be fixed & prettified in subsequent commits.
> 
> Signed-off-by: Ævar Arnfjörð Bjarmason 
> ---
>  Makefile | 52 
>  install_programs | 89 
>  2 files changed, 103 insertions(+), 38 deletions(-)
>  create mode 100755 install_programs
> 
> diff --git a/Makefile b/Makefile
> index bbfbb4292d..aa6ca1fa68 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -2808,44 +2808,20 @@ endif
>   bindir=$$(cd '$(DESTDIR_SQ)$(bindir_SQ)' && pwd) && \
>   execdir=$$(cd '$(DESTDIR_SQ)$(gitexec_instdir_SQ)' && pwd) && \
>   destdir_from_execdir_SQ=$$(echo '$(gitexecdir_relative_SQ)' | sed -e 
> 's|[^/][^/]*|..|g') && \
> - { test "$$bindir/" = "$$execdir/" || \
> -   for p in git$X $(filter $(install_bindir_programs),$(ALL_PROGRAMS)); 
> do \
> - $(RM) "$$execdir/$$p" && \
> - test -n "$(INSTALL_SYMLINKS)" && \
> - ln -s "$$destdir_from_execdir_SQ/$(bindir_relative_SQ)/$$p" 
> "$$execdir/$$p" || \
> - { test -z 
> "$(NO_INSTALL_HARDLINKS)$(NO_CROSS_DIRECTORY_HARDLINKS)" && \
> -   ln "$$bindir/$$p" "$$execdir/$$p" 2>/dev/null || \
> -   cp "$$bindir/$$p" "$$execdir/$$p" || exit; } \
> -   done; \
> - } && \
> - for p in $(filter $(install_bindir_programs),$(BUILT_INS)); do \
> - $(RM) "$$bindir/$$p" && \
> - test -n "$(INSTALL_SYMLINKS)" && \
> - ln -s "git$X" "$$bindir/$$p" || \
> - { test -z "$(NO_INSTALL_HARDLINKS)" && \
> -   ln "$$bindir/git$X" "$$bindir/$$p" 2>/dev/null || \
> -   ln -s "git$X" "$$bindir/$$p" 2>/dev/null || \
> -   cp "$$bindir/git$X" "$$bindir/$$p" || exit; } \
> - done && \
> - for p in $(BUILT_INS); do \
> - $(RM) "$$execdir/$$p" && \
> - test -n "$(INSTALL_SYMLINKS)" && \
> - ln -s "$$destdir_from_execdir_SQ/$(bindir_relative_SQ)/git$X" 
> "$$execdir/$$p" || \
> - { test -z "$(NO_INSTALL_HARDLINKS)" && \
> -   ln "$$execdir/git$X" "$$execdir/$$p" 2>/dev/null || \
> -   ln -s "git$X" "$$execdir/$$p" 2>/dev/null || \
> -   cp "$$execdir/git$X" "$$execdir/$$p" || exit; } \
> - done && \
> - remote_curl_aliases="$(REMOTE_CURL_ALIASES)" && \
> - for p in $$remote_curl_aliases; do \
> - $(RM) "$$execdir/$$p" && \
> - test -n "$(INSTALL_SYMLINKS)" && \
> - ln -s "git-remote-http$X" "$$execdir/$$p" || \
> - { test -z "$(NO_INSTALL_HARDLINKS)" && \
> -   ln "$$execdir/git-remote-http$X" "$$execdir/$$p" 2>/dev/null 
> || \
> -   ln -s "git-remote-http$X" "$$execdir/$$p" 2>/dev/null || \
> -   cp "$$execdir/git-remote-http$X" "$$execdir/$$p" || exit; } \
> - done && \

This indeed looks like a mess...

> + ./install_programs \
> + --X="$$X" \
> + --RM="$(RM)" \
> + --bindir="$$bindir" \
> + --bindir-relative="$(bindir_relative_SQ)" \
> + --execdir="$$execdir" \
> + --destdir-from-execdir="$$destdir_from_execdir_SQ" \
> + --flag-install-symlinks="$(INSTALL_SYMLINKS)" \
> + --flag-no-install-hardlinks="$(NO_INSTALL_HARDLINKS)" \
> + 
> --flag-no-cross-directory-hardlinks="$(NO_CROSS_DIRECTORY_HARDLINKS)" \
> + --list-bindir-standalone="git$X $(filter 
> $(install_bindir_programs),$(ALL_PROGRAMS))" \
> + --list-bindir-git-dashed="$(filter 
> $(install_bindir_programs),$(BUILT_INS))" \
> + --list-execdir-git-dashed="$(BUILT_INS)" \
> + --list-execdir-curl-aliases="$(REMOTE_CURL_ALIASES)" && \
>   ./check_bindir "z$$bindir" "z$$execdir" "$$bindir/git-add$X"
>  
>  .PHONY: install-gitweb install-doc install-man install-man-perl install-html 
> install-info install-pdf
> diff --git a/install_programs b/install_programs
> new file mode 100755
> index 00..e287108112
> --- /dev/null
> +++ b/install_programs
> @@ -0,0 +1,89 @@
> +#!/bin/sh
> +
> +while test $# != 0
> +do
> + case "$1" in
> + --X=*)
> + X="${1#--X=}"
> + ;;
> + --RM=*)
> + RM="${1#--RM=}"
> + ;;
> + --bindir=*)
> + bindir="${1#--bindir=}"
> + ;;
> + --bindir-relative=*)
> + bindir_relative="${1#--bindir-relative=}"
> + ;;
> + --execdir=*)
> + execdir="${1#--execdir=}"
> + 

Re: [PATCH v2 1/2] ref-filter: add worktree atom

2018-11-12 Thread Jeff King
On Sun, Nov 11, 2018 at 03:58:30PM -0800, nbelakov...@gmail.com wrote:

> From: Nickolai Belakovski 
> 
> Add an atom expressing whether the particular ref is checked out in a
> linked worktree.
> 
> Signed-off-by: Nickolai Belakovski 
> ---
>  ref-filter.c   | 31 +++
>  t/t6302-for-each-ref-filter.sh | 15 +++
>  2 files changed, 46 insertions(+)

I left some more comments elsewhere in the thread, but one more thing to
note: this probably needs to touch Documentation/git-for-each-ref.txt to
describe the new placeholder.

-Peff


Re: [PATCH 10/10] fast-export: add --always-show-modify-after-rename

2018-11-12 Thread Jeff King
On Sun, Nov 11, 2018 at 12:42:58AM -0800, Elijah Newren wrote:

> > > fast-export output is traditionally used as an input to a fast-import
> > > program, but it is also useful to help gather statistics about the
> > > history of a repository (particularly when --no-data is also passed).
> > > For example, two of the types of information we may want to collect
> > > could include:
> > >   1) general information about renames that have occurred
> > >   2) what the biggest objects in a repository are and what names
> > >  they appear under.
> > >
> > > The first bit of information can be gathered by just passing -M to
> > > fast-export.  The second piece of information can partially be gotten
> > > from running
> > > git cat-file --batch-check --batch-all-objects
> > > However, that only shows what the biggest objects in the repository are
> > > and their sizes, not what names those objects appear as or what commits
> > > they were introduced in.  We can get that information from fast-export,
> > > but when we only see
> > > R oldname newname
> > > instead of
> > > R oldname newname
> > > M 100644 $SHA1 newname
> > > then it makes the job more difficult.  Add an option which allows us to
> > > force the latter output even when commits have exact renames of files.
> >
> > fast-export seems like a funny tool to look up paths. What about "git
> > log --find-object=$SHA1" ?
> 
> Eek, and give me O(N*M) behavior, where N is the number of commits in
> the repository and M is the number of renames that occur in its
> history?  Also, that's the inverse of the lookup I need anyway (I have
> the commit and filename, but am missing the SHA).

Maybe I don't understand what you're trying to accomplish. I was
thinking specifically of your "cat-file can tell you the large objects,
but you don't know their names/commits" from above.

I would do:

   git log --raw $(
 git cat-file --batch-check='%(objectsize:disk) %(objectname)' 
--batch-all-objects |
 sort -rn | head -3 |
 awk '{print "--find-object=" $2 }'
   )

I'm not sure how renames enter into it at all.

> One of the problems with filter-branch that people often run into is
> they know what they want at a high-level (e.g. extract the history of
> this directory for a new repository, or rewrite the history of this
> repo to appear at a subdirectory so it can be merged into a bigger
> repo and people passing filenames to log will still get the history of
> those files, or I want to remove some of the big stuff in my history),
> but often times that's not quite enough.  They need help finding big
> objects, or may be unaware that the subset of files they want used to
> be known by alternative names.
> 
> I want a simple --analyze mode that can report on all files that have
> been renamed (so users don't just say "all I care about is these N
> files, give me a rewritten history just including those" -- we can
> point out to them whether those N files used to be known by other
> names), as well as reporting on all big files and if they've been
> deleted, and aggregations of the "big files" information across
> directories and file extensions.

So this seems like a separate problem than what the commit message talks
about.

There I think you'd want to assemble the list with something like "git
log --follow --name-only paths-of-interest" except that --follow sucks
too much to handle more than one path at a time.

But if you wanted to do it manually, then:

  git log --diff-filter=R --name-only

would be enough to let you track it down, wouldn't it?

-Peff


Re: Add issue management within git

2018-11-12 Thread Martin Delille
Hi,

Thank you very much!
The git-bug project is what I'm looking for even if it is not very interesting 
without gitlab connection.
There is an issue about it on Gitlab: 
https://gitlab.com/gitlab-org/gitlab-ce/issues/50435
Maybe some encouragment from git core developer would help!
I also proposed to change the project name here: 
https://github.com/MichaelMure/git-bug/issues/73

Regards,

Martin
martin.deli...@gmail.com

> On 12 Nov 2018, at 10:22, Ævar Arnfjörð Bjarmason  wrote:
> 
> 
> On Mon, Nov 12 2018, Konstantin Khomoutov wrote:
> 
>> On Mon, Nov 12, 2018 at 09:35:31AM +0800, yan ke wrote:
>> 
 This would be awesome to handle issue directly with git:
 Having an offline version of the issues synced to the gitlab/github issues.
 A lot of work is done on the issues and it is lost when migrating
 from one service to the other.
 Beside we don’t always have a good internet connection.
 There is already a kind of integration between commit message fixing
 issue automatically when merged in the master branch (with “fix
 #143’).
>>>   Very very agree, now it is very difficult to find a solution when
>>> has some problem such build problem an so on! The mail-list is good to
>>> send patch es, but is it not suitable for problem track or problem
>>> solution search!
>>>   Now the Github or Gitlab is good to track issues, suggest to open
>>> the git issue track!
>> 
>> Please don't hijack the discussion: the original poster did not question
>> the workflow adopted by the Git project itself but rather asked about
>> what is colloquially called "distributed bug tracker", and wanted to
>> have one integrated with (or into) Git. That is completely orthogonal
>> story.
> 
> Correct, but let's assume good faith here and presume yan ke just
> misread the original E-mail. Many of us (and perhaps yourself) are
> participating in our second, third, fourth etc. language on this list :)
> 
>> As to searching for Git issues / problem solutions - I'd recommend using
>> the search on the main Git mailing list archive [1] and the issue
>> tracker of the Git for Windows project [2].
>> 
>> The communities around Git also include the "Git Users" low-volume
>> mailing list [3] (also perfectly searcheable), and the "git" tag at
>> StackOverflow [4].
>> 
>> 1. https://public-inbox.org/git/
>> 2. https://github.com/git-for-windows/git/issues
>> 3. https://groups.google.com/forum/#!forum/git-users
>> 4. https://stackoverflow.com/questions/tagged/git
> 
> Yeah. I'll add to that that this specific thing has been discussed here
> really recently:
> 
> https://public-inbox.org/git/CACSZ0Pwzs2e7E5RUEPDcEUsa=inzCyBAptU7YaCUw+5=mut...@mail.gmail.com/
> https://github.com/MichaelMure/git-bug/
> 
> So Martin, there's already a nascent tool that does this. It looks like
> the main thing it needs now is users & testers.



[PATCH 0/5] tests: various improvements to the GIT_TEST_INSTALLED feature

2018-11-12 Thread Johannes Schindelin via GitGitGadget
By setting the GIT_TEST_INSTALLED variable to the path of an installed Git
executable, it is possible to run the test suite also on a specific
installed version (as opposed to a version built from scratch).

The only thing this needs that is unlikely to be installed is the test
helper(s).

However, there have been a few rough edges around that, identified in my
(still ongoing) work to support building Git in Visual Studio (where we do
not want to run GNU Make, and where we have no canonical way to create, say,
hard-linked copies of the built-in commands, and other work to let Git for
Windows play better with BusyBox.

Triggered by a comment of AEvar
[https://public-inbox.org/git/20181102223743.4331-1-ava...@gmail.com/], I
hereby contribute these assorted fixes for the GIT_TEST_INSTALLED feature.

Johannes Schindelin (5):
  tests: fix GIT_TEST_INSTALLED's PATH to include t/helper/
  tests: respect GIT_TEST_INSTALLED when initializing repositories
  t/lib-gettext: test installed git-sh-i18n if GIT_TEST_INSTALLED is set
  tests: do not require Git to be built when testing an installed Git
  tests: explicitly use `git.exe` on Windows

 Makefile|  1 +
 t/lib-gettext.sh|  7 ++-
 t/test-lib-functions.sh |  3 ++-
 t/test-lib.sh   | 15 ++-
 4 files changed, 19 insertions(+), 7 deletions(-)


base-commit: 8858448bb49332d353febc078ce4a3abcc962efe
Published-As: 
https://github.com/gitgitgadget/git/releases/tags/pr-73%2Fdscho%2Ftest-git-installed-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git 
pr-73/dscho/test-git-installed-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/73
-- 
gitgitgadget


[PATCH 3/5] t/lib-gettext: test installed git-sh-i18n if GIT_TEST_INSTALLED is set

2018-11-12 Thread Johannes Schindelin via GitGitGadget
From: Johannes Schindelin 

It makes very, very little sense to test the built git-sh-i18n when the
user asked specifically to test another one.

Signed-off-by: Johannes Schindelin 
---
 t/lib-gettext.sh | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/t/lib-gettext.sh b/t/lib-gettext.sh
index eec757f104..9eb160c997 100644
--- a/t/lib-gettext.sh
+++ b/t/lib-gettext.sh
@@ -10,7 +10,12 @@ GIT_TEXTDOMAINDIR="$GIT_BUILD_DIR/po/build/locale"
 GIT_PO_PATH="$GIT_BUILD_DIR/po"
 export GIT_TEXTDOMAINDIR GIT_PO_PATH
 
-. "$GIT_BUILD_DIR"/git-sh-i18n
+if test -n "$GIT_TEST_INSTALLED"
+then
+   . "$(git --exec-path)"/git-sh-i18n
+else
+   . "$GIT_BUILD_DIR"/git-sh-i18n
+fi
 
 if test_have_prereq GETTEXT && ! test_have_prereq GETTEXT_POISON
 then
-- 
gitgitgadget



[PATCH 2/5] tests: respect GIT_TEST_INSTALLED when initializing repositories

2018-11-12 Thread Johannes Schindelin via GitGitGadget
From: Johannes Schindelin 

It really makes very, very little sense to use a different git
executable than the one the caller indicated via setting the environment
variable GIT_TEST_INSTALLED.

Signed-off-by: Johannes Schindelin 
---
 t/test-lib-functions.sh | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh
index 78d8c3783b..801cc9b2ef 100644
--- a/t/test-lib-functions.sh
+++ b/t/test-lib-functions.sh
@@ -900,7 +900,8 @@ test_create_repo () {
mkdir -p "$repo"
(
cd "$repo" || error "Cannot setup test environment"
-   "$GIT_EXEC_PATH/git-init" 
"--template=$GIT_BUILD_DIR/templates/blt/" >&3 2>&4 ||
+   "${GIT_TEST_INSTALLED:-$GIT_EXEC_PATH}/git" init \
+   "--template=$GIT_BUILD_DIR/templates/blt/" >&3 2>&4 ||
error "cannot run git init -- have you built things yet?"
mv .git/hooks .git/hooks-disabled
) || exit
-- 
gitgitgadget



[PATCH 5/5] tests: explicitly use `git.exe` on Windows

2018-11-12 Thread Johannes Schindelin via GitGitGadget
From: Johannes Schindelin 

In the bin-wrappers/* scripts, we already take pains to use `git.exe`
rather than `git`, as this could pick up the wrong thing on Windows
(i.e. if there exists a `git` file or directory in the build directory).

Now we do the same in the tests' start-up code.

This also helps when testing an installed Git, as there might be even
more likely some stray file or directory in the way.

Note: the only way we can record whether the `.exe` suffix is by writing
it to the `GIT-BUILD-OPTIONS` file and sourcing it at the beginning of
`t/test-lib.sh`. This is not a requirement introduced by this patch, but
we move the call to be able to use the `$X` variable that holds the file
extension, if any.

Note also: the many, many calls to `git this` and `git that` are
unaffected, as the regular PATH search will find the `.exe` files on
Windows (and not be confused by a directory of the name `git` that is
in one of the directories listed in the `PATH` variable), while
`/path/to/git` would not, per se, know that it is looking for an
executable and happily prefer such a directory.

Signed-off-by: Johannes Schindelin 
---
 Makefile|  1 +
 t/test-lib-functions.sh |  2 +-
 t/test-lib.sh   | 13 +
 3 files changed, 11 insertions(+), 5 deletions(-)

diff --git a/Makefile b/Makefile
index bbfbb4292d..5df0118ce9 100644
--- a/Makefile
+++ b/Makefile
@@ -2590,6 +2590,7 @@ GIT-BUILD-OPTIONS: FORCE
@echo NO_UNIX_SOCKETS=\''$(subst ','\'',$(subst 
','\'',$(NO_UNIX_SOCKETS)))'\' >>$@+
@echo PAGER_ENV=\''$(subst ','\'',$(subst ','\'',$(PAGER_ENV)))'\' >>$@+
@echo DC_SHA1=\''$(subst ','\'',$(subst ','\'',$(DC_SHA1)))'\' >>$@+
+   @echo X=\'$(X)\' >>$@+
 ifdef TEST_OUTPUT_DIRECTORY
@echo TEST_OUTPUT_DIRECTORY=\''$(subst ','\'',$(subst 
','\'',$(TEST_OUTPUT_DIRECTORY)))'\' >>$@+
 endif
diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh
index 801cc9b2ef..c167b2e1af 100644
--- a/t/test-lib-functions.sh
+++ b/t/test-lib-functions.sh
@@ -900,7 +900,7 @@ test_create_repo () {
mkdir -p "$repo"
(
cd "$repo" || error "Cannot setup test environment"
-   "${GIT_TEST_INSTALLED:-$GIT_EXEC_PATH}/git" init \
+   "${GIT_TEST_INSTALLED:-$GIT_EXEC_PATH}/git$X" init \
"--template=$GIT_BUILD_DIR/templates/blt/" >&3 2>&4 ||
error "cannot run git init -- have you built things yet?"
mv .git/hooks .git/hooks-disabled
diff --git a/t/test-lib.sh b/t/test-lib.sh
index 1ea20dc2dc..3e2a9ce76d 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -49,18 +49,23 @@ export ASAN_OPTIONS
 : ${LSAN_OPTIONS=abort_on_error=1}
 export LSAN_OPTIONS
 
+if test ! -f "$GIT_BUILD_DIR"/GIT-BUILD-OPTIONS
+then
+   echo >&2 'error: GIT-BUILD-OPTIONS missing (has Git been built?).'
+   exit 1
+fi
+. "$GIT_BUILD_DIR"/GIT-BUILD-OPTIONS
+export PERL_PATH SHELL_PATH
+
 
 # It appears that people try to run tests without building...
-test -n "$GIT_TEST_INSTALLED" || "$GIT_BUILD_DIR/git" >/dev/null ||
+test -n "$GIT_TEST_INSTALLED" || "$GIT_BUILD_DIR/git$X" >/dev/null ||
 if test $? != 1
 then
echo >&2 'error: you do not seem to have built git yet.'
exit 1
 fi
 
-. "$GIT_BUILD_DIR"/GIT-BUILD-OPTIONS
-export PERL_PATH SHELL_PATH
-
 # if --tee was passed, write the output not only to the terminal, but
 # additionally to the file test-results/$BASENAME.out, too.
 case "$GIT_TEST_TEE_STARTED, $* " in
-- 
gitgitgadget


[PATCH 1/5] tests: fix GIT_TEST_INSTALLED's PATH to include t/helper/

2018-11-12 Thread Johannes Schindelin via GitGitGadget
From: Johannes Schindelin 

We really need to be able to find the test helpers... Really. This
change was forgotten when we moved the test helpers into t/helper/

Signed-off-by: Johannes Schindelin 
---
 t/test-lib.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/t/test-lib.sh b/t/test-lib.sh
index 47a99aa0ed..832ede5099 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -957,7 +957,7 @@ elif test -n "$GIT_TEST_INSTALLED"
 then
GIT_EXEC_PATH=$($GIT_TEST_INSTALLED/git --exec-path)  ||
error "Cannot run git from $GIT_TEST_INSTALLED."
-   PATH=$GIT_TEST_INSTALLED:$GIT_BUILD_DIR:$PATH
+   PATH=$GIT_TEST_INSTALLED:$GIT_BUILD_DIR/t/helper:$PATH
GIT_EXEC_PATH=${GIT_TEST_EXEC_PATH:-$GIT_EXEC_PATH}
 else # normal case, use ../bin-wrappers only unless $with_dashes:
git_bin_dir="$GIT_BUILD_DIR/bin-wrappers"
-- 
gitgitgadget



Re: [PATCH v2 2/2] branch: Mark and colorize a branch differently if it is checked out in a linked worktree

2018-11-12 Thread Jeff King
On Mon, Nov 12, 2018 at 07:20:28PM +0900, Junio C Hamano wrote:

> nbelakov...@gmail.com writes:
> 
> > diff --git a/color.h b/color.h
> > index 98894d6a17..857653df73 100644
> > --- a/color.h
> > +++ b/color.h
> > @@ -42,6 +42,24 @@ struct strbuf;
> >  #define GIT_COLOR_FAINT_BLUE   "\033[2;34m"
> >  #define GIT_COLOR_FAINT_MAGENTA"\033[2;35m"
> >  #define GIT_COLOR_FAINT_CYAN   "\033[2;36m"
> > +#define GIT_COLOR_LIGHT_RED"\033[91m"
> > +#define GIT_COLOR_LIGHT_GREEN  "\033[92m"
> > +#define GIT_COLOR_LIGHT_YELLOW "\033[93m"
> > +#define GIT_COLOR_LIGHT_BLUE   "\033[94m"
> > +#define GIT_COLOR_LIGHT_MAGENTA"\033[95m"
> > +#define GIT_COLOR_LIGHT_CYAN   "\033[96m"
> > +#define GIT_COLOR_BOLD_LIGHT_RED   "\033[1;91m"
> > +#define GIT_COLOR_BOLD_LIGHT_GREEN "\033[1;92m"
> > +#define GIT_COLOR_BOLD_LIGHT_YELLOW"\033[1;93m"
> > +#define GIT_COLOR_BOLD_LIGHT_BLUE  "\033[1;94m"
> > +#define GIT_COLOR_BOLD_LIGHT_MAGENTA   "\033[1;95m"
> > +#define GIT_COLOR_BOLD_LIGHT_CYAN  "\033[1;96m"
> > +#define GIT_COLOR_FAINT_LIGHT_RED  "\033[2;91m"
> > +#define GIT_COLOR_FAINT_LIGHT_GREEN"\033[2;92m"
> > +#define GIT_COLOR_FAINT_LIGHT_YELLOW   "\033[2;93m"
> > +#define GIT_COLOR_FAINT_LIGHT_BLUE "\033[2;94m"
> > +#define GIT_COLOR_FAINT_LIGHT_MAGENTA  "\033[2;95m"
> > +#define GIT_COLOR_FAINT_LIGHT_CYAN "\033[2;96m"
> 
> Hopefully you made sure that there is no other topic in-flight that
> touch this area before doing this change?  Otherwise you'd be
> creating pointless merge conflict by futzing with spaces.

This hunk confused me for a minute, too. It's not changing spaces, but
just adding a bunch of color variants. It would be nice if we could just
do this with a run-time parse_color("bold red") or whatever, but we use
these as static initializers.

We don't strictly need anything more than FAINT_LIGHT_GREEN here. I
don't have a strong opinion on adding just what we need versus
being more complete.

> Ditto for an earlier hunk of this patch.

Yeah, I think this does apply to the earlier hunk that defines
branch_colors[].

-Peff


Re: [PATCH 09/10] fast-export: add a --show-original-ids option to show original names

2018-11-12 Thread Jeff King
On Sun, Nov 11, 2018 at 12:32:22AM -0800, Elijah Newren wrote:

> > >  Documentation/git-fast-export.txt |  7 +++
> > >  builtin/fast-export.c | 20 +++-
> > >  fast-import.c | 17 +
> > >  t/t9350-fast-export.sh| 17 +
> > >  4 files changed, 56 insertions(+), 5 deletions(-)
> >
> > The fast-import format is documented in Documentation/git-fast-import.txt.
> > It might need an update to cover the new format.
> 
> We document the format in both fast-import.c and
> Documentation/git-fast-import.txt?  Maybe we should delete the long
> comments in fast-import.c so this isn't duplicated?

Yes, that is probably worth doing (see the comment at the top of
fast-import.c). Some information might need to be migrated.

If we're going to have just one spot, I think it needs to be the
user-facing documentation. This is a public interface that other people
are building compatible implementations for (including your new tool).

> > > +--show-original-ids::
> > > + Add an extra directive to the output for commits and blobs,
> > > + `originally `.  While such directives will likely be
> > > + ignored by importers such as git-fast-import, it may be useful
> > > + for intermediary filters (e.g. for rewriting commit messages
> > > + which refer to older commits, or for stripping blobs by id).
> >
> > I'm not quite sure how a blob ends up being rewritten by fast-export (I
> > get that commits may change due to dropping parents).
> 
> It doesn't get rewritten by fast-export; it gets rewritten by other
> intermediary filters, e.g. in something like this:
> 
>git fast-export --show-original-ids --all | intermediary_filter |
> git fast-import
> 
> The intermediary_filter program may want to strip out blobs by id, or
> remove filemodify and filedelete directives unless they touch certain
> paths, etc.

OK, that matches my understanding. So why does fast-export need to print
the blob ids? If the intermediary is rewriting blobs, it can then
produce the "originally" line itself, can't it?

The more interesting case I guess is your "strip out blobs by id"
example. There the intermediary _could_ do so itself, but it would
require recomputing the object id of each blob.

If you use "--no-data", then this just works (we specify tree entries by
object id, rather than by mark). But I can see how it would be useful to
have the information even without "--no-data" (i.e., if you are doing
multiple kinds of rewrites on a single stream).

I think the thing that confused me is that this "originally" is doing
two things:

  - mentioning blob ids as an optimization / convenience for the reader

  - mentioning rewritten commit (and presumably tag?) ids that were
rewritten as part of a partial history export. I suppose even trees
could be rewritten that way, too, but fast-import doesn't generally
consider trees to be a first-class item.

So I'm OK with it, but I wonder if there is an easier way to explain it.

-Peff


Re: [PATCH 2/2] built-in rebase: reinstate `checkout -q` behavior where appropriate

2018-11-12 Thread Johannes Schindelin
Hi Junio,

On Mon, 12 Nov 2018, Junio C Hamano wrote:

> Johannes Schindelin  writes:
> 
> >> >  static int reset_head(struct object_id *oid, const char *action,
> >> > -  const char *switch_to_branch, int detach_head,
> >> > +  const char *switch_to_branch,
> >> > +  int detach_head, int reset_hard,
> >> 
> >> It might be worth switching to a single flag variable here. It would
> >> make calls like this:
> >> 
> >> > -if (reset_head(>object.oid, "checkout", NULL, 1,
> >> > +if (reset_head(>object.oid, "checkout", NULL, 1, 
> >> > 0,
> >> >  NULL, msg.buf))
> >> 
> >> a little more self-documenting (if a little more verbose).
> >
> > I thought about that, but for just two flags? Well, let me try it and see
> > whether I like the result ;-)
> 
> My rule of thumb is that repeating three times is definitely when we
> should consolidate separate ones into a single flag word, and twice
> is a borderline.
> 
> For two-time repetition, it is often worth fixing when somebody
> notices it during the review, as that is a sign that repetition,
> even a minor one, disturbed a reader enough to point out.

That's my thought exactly, hence I looked into it. The end result is
actually easier to read, in particular the commit that re-introduces the
`reset --hard` behavior: it no longer touches *all* callsites of
`reset_head()` but only the relevant ones.

> On the other hand, for a file-scope static that is likely to stay as a
> non-API function but a local helper, it may not be worth it.

Oh, do you think that `reset_head()` is likely to stay as non-API
function? I found myself in the need of repeating this tedious
unpack_trees() dance quite a few times over the past two years, and it is
*always* daunting because the API is *that* unintuitive.

So I *do* hope that `reset_head()` will actually be moved to reset.[ch]
eventually, and callsites e.g. in `sequencer.c` will be converted from
calling `unpack_trees()` to calling `reset_head()` instead.

v2 on the way,
Dscho

> So I am OK either way, slightly in favor of fixing it while we
> remember.
> 
> 
> >> This one could actually turn into:
> >> 
> >>   ret = error(...);
> >>   goto leave_reset_head;
> >> 
> >> now. We don't have to worry about an uninitialized desc.buffer anymore
> >> (as I mentioned in the previous email), because "nr" would be 0.
> >> 
> >> It doesn't save any lines, though (but maybe having a single
> >> cleanup/exit point would make things easier to read; I dunno).
> >
> > But you're right, of course. Consistency makes for easier-to-read code.
> 
> Yup, that does sound good.
> 
> Thanks.
> 


Re: [PATCH 0/2] avoid unsigned long for timestamps

2018-11-12 Thread Johannes Schindelin
Hi Carlo,

On Mon, 12 Nov 2018, Carlo Marcelo Arenas Belón wrote:

> specially problematic in Windows where unsigned long is only 32bit wide
> and therefore the assumption that a time_t would fit will lead to loss
> of precision in a 64bit OS.

Both patches look correct to me.

Thanks!
Dscho

> 
>  builtin/commit.c | 4 ++--
>  read-cache.c | 6 +++---
>  2 files changed, 5 insertions(+), 5 deletions(-)
> 
> 
> 

Re: [PATCH v2 1/2] ref-filter: add worktree atom

2018-11-12 Thread Jeff King
On Mon, Nov 12, 2018 at 07:11:23PM +0900, Junio C Hamano wrote:

> > +   }
> > +
> > +   string_list_sort(>u.worktree_heads);
> > +
> > +   free_worktrees(worktrees);
> > +   return 0;
> > +}
> 
> So..., this function collects any and all branches that are checked
> out in some worktree, and sort them _without_ dedup.  The user of
> the resulting information (i.e. atom->u.worktree_heads) cannot tell
> where each of the listed branches is checked out.
> 
> I wonder if "The worktree at /local/src/wt1 has this branch checked
> out" is something the user of %(worktree) atom, or a variant thereof
> e.g. "%(worktree:detailed)", may want to learn, but because that
> information is lost when this function returns, such an enhancement
> cannot be done without fixing this funciton.

Hmm. I think for the purposes of this series we could jump straight to
converting %(worktree) to mean "the path of the worktree for which this
branch is HEAD, or the empty string otherwise".

Then the caller from git-branch (or anybody wanting to emulate it) could
still do:

  %(if)%(worktree)%(then)+ %(refname)%(end)

As a bonus, the decision to use "+" becomes a lot easier. It is no
longer a part of the format language that we must promise forever, but
simply a porcelain decision by git-branch.

> Also, I am not sure if this "list of some info on worktrees" really
> belongs to an individual atom.  For one thing, if a format includes
> more than one instance of %(worktree) atoms, you'd iterate over the
> worktrees as many times as the number of these atoms you have.  Is
> there another existing atom that "caches" expensive piece of
> information per used_atom[] element like this one?  Essentially I am
> trying to convince myself that the approach taken by the patch is a
> sane one by finding a precedent.

Yes, we faced this a bit with Olga's cat-file conversion patches (where
we had a shared struct object_info). There probably should just be a
file-global data-structure storing the worktree info once (in an ideal
world, it would be part of a "struct ref_format" that uses no global
variables, but that is not how the code is structured today).

> > +   } else if (!strcmp(name, "worktree")) {
> > +   if (string_list_has_string(>u.worktree_heads, 
> > ref->refname))
> 
> I thought we were moving towards killing the use of string_list as a
> look-up table, as we do not want to see thoughtless copy such
> a code from parts of the code that are not performance critical to a
> part.  Not very satisfying.
> 
>   I think we can let this pass, and later add a wrapper around
>   hashmap that is meant to only be used to replace string-list
>   used for this exact purpose, i.e. key is a string, and there
>   is no need to iterate over the existing elements in any
>   sorted order.  Optionally, we can limit the look up to only
>   checking for existence, if it makes the code for the wrapper
>   simpler.

This came up over in another thread yesterday, too. So yeah, perhaps we
should move on that (I am OK punting on it for this series and
converting it later, though).

-Peff


Re: [PATCH 04/10] fast-export: avoid dying when filtering by paths and old tags exist

2018-11-12 Thread Jeff King
On Sat, Nov 10, 2018 at 11:38:45PM -0800, Elijah Newren wrote:

> > Hmm. That's the right thing to do if we're considering the export to be
> > an independent unit. But what if I'm just rewriting a portion of history
> > like:
> >
> >   git fast-export HEAD~5..HEAD | some_filter | git fast-import
> >
> > ? If I have a tag pointing to HEAD~10, will this delete that? Ideally I
> > think it would be left alone.
> 
> A couple things:
>   * This code path only triggers in a very specific case: If a tag is
> requested for export but points to a commit which is filtered out by
> something else (e.g. path limiters and the commit in question didn't
> modify any of the relevant paths), AND the user explicitly specified
> --tag-of-filtered-object=rewrite (so that the tag in question can be
> rewritten to the nearest non-filtered ancestor).

Right, I think this is the bit I was missing: somebody has to have
explicitly asked to export the tag. At which point the only sensible
thing to do is drop it.

-Peff


Re: [RFC PATCH] Introduce "precious" file concept

2018-11-12 Thread Ævar Arnfjörð Bjarmason


On Mon, Nov 12 2018, Junio C Hamano wrote:

> Ævar Arnfjörð Bjarmason  writes:
>
>> What I'd add to your list is:
>>
>> * Some projects (I've seen this in the wild) add e.g. *.mp3 or whatever
>>   else usually doesn't belong in the repo as a "soft ignore". This is
>>   something we've never recommended, but have implicitly supported since
>>   the only caveats are a) you need a one-off "git add -f" and then
>>   they're tracked b) them being missing from "status" before being
>>   tracked c) the issue under discussion here.
>
> Or only selected "*.o" (vendor supplied binary blob) kept tracked
> while everything else is built from the source.
>
> I do not know who you are referring to "we" in your sentence, but as
> far as I am concerned, it has been and still is a BCP recommendation
> on this list to deal with a case like that.

I mean that this use-case of having a "soft" ignore by carrying it
across the "git add" barrier with a one-off "-f" isn't something
explicitly documented, and apparently not something many
expect. I.e. you / Matthieu have mentioned .gitignore in the past for
only-generated *.o use-case.

But it also does get used for "mostly we don't want this file, but
sometimes we do" use-case, so that's something we need to deal with in
practice. Like many workflows in git it's not something that was forseen
or intended, but does happen in the wild.


[PATCH 0/2] fix some exclude patterns being ignored

2018-11-12 Thread Rafael Ascensão
While trying to set up some aliases for my own use, I found out that
--exclude with --branches behave differently depending if the latter
uses globs.

I tried to fix it making my 2nd contribution. :)

Cheers,

Rafael Ascensão (2):
  refs: show --exclude failure with --branches/tags/remotes=glob
  refs: fix some exclude patterns being ignored

 refs.c   |  4 +++
 t/t6018-rev-list-glob.sh | 60 ++--
 2 files changed, 61 insertions(+), 3 deletions(-)

-- 
2.19.1



[PATCH 2/2] refs: fix some exclude patterns being ignored

2018-11-12 Thread Rafael Ascensão
`--exclude` from rev-list and rev-parse fails to exclude references if
the next `--branches`, `--tags` or `--remotes` use the optional
inclusive glob because those options are implemented as particular cases
of `--glob=`, which itself requires that exclude patterns begin with
'refs/'.

But it makes sense for `--branches=glob` and friends to be aware that
exclusions patterns for them shouldn't be 'refs//' prefixed, the
same way exclude patterns for `--branches` and friends (without the
optional glob) already are.

Let's record in 'refs.c:struct ref_filter' which context the exclude
pattern is tied to, so refs.c:filter_refs() can decide if it should
ignore the prefix when trying to match.

Signed-off-by: Rafael Ascensão 
---
 refs.c   |  4 
 t/t6018-rev-list-glob.sh | 24 
 2 files changed, 16 insertions(+), 12 deletions(-)

diff --git a/refs.c b/refs.c
index de81c7be7c..539f385f61 100644
--- a/refs.c
+++ b/refs.c
@@ -217,6 +217,7 @@ char *resolve_refdup(const char *refname, int resolve_flags,
 /* The argument to filter_refs */
 struct ref_filter {
const char *pattern;
+   const char *prefix;
each_ref_fn *fn;
void *cb_data;
 };
@@ -296,6 +297,8 @@ static int filter_refs(const char *refname, const struct 
object_id *oid,
 
if (wildmatch(filter->pattern, refname, 0))
return 0;
+   if (filter->prefix)
+   skip_prefix(refname, filter->prefix, );
return filter->fn(refname, oid, flags, filter->cb_data);
 }
 
@@ -458,6 +461,7 @@ int for_each_glob_ref_in(each_ref_fn fn, const char 
*pattern,
}
 
filter.pattern = real_pattern.buf;
+   filter.prefix = prefix;
filter.fn = fn;
filter.cb_data = cb_data;
ret = for_each_ref(filter_refs, );
diff --git a/t/t6018-rev-list-glob.sh b/t/t6018-rev-list-glob.sh
index 8e2b136356..7dc6cbdc42 100755
--- a/t/t6018-rev-list-glob.sh
+++ b/t/t6018-rev-list-glob.sh
@@ -147,51 +147,51 @@ test_expect_success 'rev-parse accumulates multiple 
--exclude' '
compare rev-parse "--exclude=refs/remotes/* --exclude=refs/tags/* 
--all" --branches
 '
 
-test_expect_failure 'rev-parse --exclude=glob with --branches=glob' '
+test_expect_success 'rev-parse --exclude=glob with --branches=glob' '
compare rev-parse "--exclude=subspace-* --branches=sub*" "subspace/one 
subspace/two"
 '
 
-test_expect_failure 'rev-parse --exclude=glob with --tags=glob' '
+test_expect_success 'rev-parse --exclude=glob with --tags=glob' '
compare rev-parse "--exclude=qux/? --tags=qux/*" "qux/one qux/two"
 '
 
-test_expect_failure 'rev-parse --exclude=glob with --remotes=glob' '
+test_expect_success 'rev-parse --exclude=glob with --remotes=glob' '
compare rev-parse "--exclude=upstream/? --remotes=upstream/*" 
"upstream/one upstream/two"
 '
 
-test_expect_failure 'rev-parse --exclude=ref with --branches=glob' '
+test_expect_success 'rev-parse --exclude=ref with --branches=glob' '
compare rev-parse "--exclude=subspace-x --branches=sub*" "subspace/one 
subspace/two"
 '
 
-test_expect_failure 'rev-parse --exclude=ref with --tags=glob' '
+test_expect_success 'rev-parse --exclude=ref with --tags=glob' '
compare rev-parse "--exclude=qux/x --tags=qux/*" "qux/one qux/two"
 '
 
-test_expect_failure 'rev-parse --exclude=ref with --remotes=glob' '
+test_expect_success 'rev-parse --exclude=ref with --remotes=glob' '
compare rev-parse "--exclude=upstream/x --remotes=upstream/*" 
"upstream/one upstream/two"
 '
 
-test_expect_failure 'rev-list --exclude=glob with --branches=glob' '
+test_expect_success 'rev-list --exclude=glob with --branches=glob' '
compare rev-list "--exclude=subspace-* --branches=sub*" "subspace/one 
subspace/two"
 '
 
-test_expect_failure 'rev-list --exclude=glob with --tags=glob' '
+test_expect_success 'rev-list --exclude=glob with --tags=glob' '
compare rev-list "--exclude=qux/? --tags=qux/*" "qux/one qux/two"
 '
 
-test_expect_failure 'rev-list --exclude=glob with --remotes=glob' '
+test_expect_success 'rev-list --exclude=glob with --remotes=glob' '
compare rev-list "--exclude=upstream/? --remotes=upstream/*" 
"upstream/one upstream/two"
 '
 
-test_expect_failure 'rev-list --exclude=ref with --branches=glob' '
+test_expect_success 'rev-list --exclude=ref with --branches=glob' '
compare rev-list "--exclude=subspace-x --branches=sub*" "subspace/one 
subspace/two"
 '
 
-test_expect_failure 'rev-list --exclude=ref with --tags=glob' '
+test_expect_success 'rev-list --exclude=ref with --tags=glob' '
compare rev-list "--exclude=qux/x --tags=qux/*" "qux/one qux/two"
 '
 
-test_expect_failure 'rev-list --exclude=ref with --remotes=glob' '
+test_expect_success 'rev-list --exclude=ref with --remotes=glob' '
compare rev-list "--exclude=upstream/x --remotes=upstream/*" 
"upstream/one upstream/two"
 '
 
-- 
2.19.1



[PATCH 1/2] refs: show --exclude failure with --branches/tags/remotes=glob

2018-11-12 Thread Rafael Ascensão
The documentation of `--exclude=` option from rev-list and rev-parse
explicitly states that exclude patterns *should not* start with 'refs/'
when used with `--branches`, `--tags` or `--remotes`.

However, following this advice results in refereces not being excluded
if the next `--branches`, `--tags`, `--remotes` use the optional
inclusive glob.

Demonstrate this failure.

Signed-off-by: Rafael Ascensão 
---
 t/t6018-rev-list-glob.sh | 60 ++--
 1 file changed, 57 insertions(+), 3 deletions(-)

diff --git a/t/t6018-rev-list-glob.sh b/t/t6018-rev-list-glob.sh
index 0bf10d0686..8e2b136356 100755
--- a/t/t6018-rev-list-glob.sh
+++ b/t/t6018-rev-list-glob.sh
@@ -36,7 +36,13 @@ test_expect_success 'setup' '
git tag foo/bar master &&
commit master3 &&
git update-ref refs/remotes/foo/baz master &&
-   commit master4
+   commit master4 &&
+   git update-ref refs/remotes/upstream/one subspace/one &&
+   git update-ref refs/remotes/upstream/two subspace/two &&
+   git update-ref refs/remotes/upstream/x subspace-x &&
+   git tag qux/one subspace/one &&
+   git tag qux/two subspace/two &&
+   git tag qux/x subspace-x
 '
 
 test_expect_success 'rev-parse --glob=refs/heads/subspace/*' '
@@ -141,6 +147,54 @@ test_expect_success 'rev-parse accumulates multiple 
--exclude' '
compare rev-parse "--exclude=refs/remotes/* --exclude=refs/tags/* 
--all" --branches
 '
 
+test_expect_failure 'rev-parse --exclude=glob with --branches=glob' '
+   compare rev-parse "--exclude=subspace-* --branches=sub*" "subspace/one 
subspace/two"
+'
+
+test_expect_failure 'rev-parse --exclude=glob with --tags=glob' '
+   compare rev-parse "--exclude=qux/? --tags=qux/*" "qux/one qux/two"
+'
+
+test_expect_failure 'rev-parse --exclude=glob with --remotes=glob' '
+   compare rev-parse "--exclude=upstream/? --remotes=upstream/*" 
"upstream/one upstream/two"
+'
+
+test_expect_failure 'rev-parse --exclude=ref with --branches=glob' '
+   compare rev-parse "--exclude=subspace-x --branches=sub*" "subspace/one 
subspace/two"
+'
+
+test_expect_failure 'rev-parse --exclude=ref with --tags=glob' '
+   compare rev-parse "--exclude=qux/x --tags=qux/*" "qux/one qux/two"
+'
+
+test_expect_failure 'rev-parse --exclude=ref with --remotes=glob' '
+   compare rev-parse "--exclude=upstream/x --remotes=upstream/*" 
"upstream/one upstream/two"
+'
+
+test_expect_failure 'rev-list --exclude=glob with --branches=glob' '
+   compare rev-list "--exclude=subspace-* --branches=sub*" "subspace/one 
subspace/two"
+'
+
+test_expect_failure 'rev-list --exclude=glob with --tags=glob' '
+   compare rev-list "--exclude=qux/? --tags=qux/*" "qux/one qux/two"
+'
+
+test_expect_failure 'rev-list --exclude=glob with --remotes=glob' '
+   compare rev-list "--exclude=upstream/? --remotes=upstream/*" 
"upstream/one upstream/two"
+'
+
+test_expect_failure 'rev-list --exclude=ref with --branches=glob' '
+   compare rev-list "--exclude=subspace-x --branches=sub*" "subspace/one 
subspace/two"
+'
+
+test_expect_failure 'rev-list --exclude=ref with --tags=glob' '
+   compare rev-list "--exclude=qux/x --tags=qux/*" "qux/one qux/two"
+'
+
+test_expect_failure 'rev-list --exclude=ref with --remotes=glob' '
+   compare rev-list "--exclude=upstream/x --remotes=upstream/*" 
"upstream/one upstream/two"
+'
+
 test_expect_success 'rev-list --glob=refs/heads/subspace/*' '
 
compare rev-list "subspace/one subspace/two" 
"--glob=refs/heads/subspace/*"
@@ -233,7 +287,7 @@ test_expect_success 'rev-list --tags=foo' '
 
 test_expect_success 'rev-list --tags' '
 
-   compare rev-list "foo/bar" "--tags"
+   compare rev-list "foo/bar qux/x qux/two qux/one" "--tags"
 
 '
 
@@ -292,7 +346,7 @@ test_expect_success 'shortlog accepts 
--glob/--tags/--remotes' '
  "master other/three someref subspace-x subspace/one subspace/two" \
  "--glob=heads/*" &&
compare shortlog foo/bar --tags=foo &&
-   compare shortlog foo/bar --tags &&
+   compare shortlog "foo/bar qux/one qux/two qux/x" --tags &&
compare shortlog foo/baz --remotes=foo
 
 '
-- 
2.19.1



Re: [RFC/PATCH 5/5] Makefile: Add a NO_INSTALL_BUILTIN_EXECDIR_ALIASES flag

2018-11-12 Thread Johannes Schindelin
Hi Ævar,

On Fri, 2 Nov 2018, Ævar Arnfjörð Bjarmason wrote:

> Back when git was initially written the likes of "git-add", "git-init"
> etc. were installed in the user's $PATH. A few years later everything,
> with a few exceptions like git-upload-pack and git-receive-pack, was
> expected to be invoked as "git $cmd".
> 
> Now something like a decade later we're still installing these old
> commands in gitexecdir. This is so someone with a shellscript that
> still targets e.g. "git-init" can add $(git --exec-path) to their
> $PATH and not have to change their script.
> 
> Let's add an option to break this backwards compatibility. Now with
> NO_INSTALL_BUILTIN_EXECDIR_ALIASES=YesPlease there's only 3 programs
> in the bindir that are hardlinked to "git" (receive-pack,
> upload-archive & upload-pack), and 3 in the
> gitexecdir (git-remote-{ftp,ftps,https} linked to git-remote-http).
> 
> There's no cross-directory links anymore, so the
> "NO_CROSS_DIRECTORY_HARDLINKS" flag becomes redundant under this new
> option.
> 
> 1. https://public-inbox.org/git/87woyfdkoi@evledraar.gmail.com/
> 
> Signed-off-by: Ævar Arnfjörð Bjarmason 
> ---

I like the idea.

With my suggested refactoring that avoids the non-DRY code, this patch
would also become much simpler (as would 2/5 -- 4/5).

However, I would not call these "aliases". That's just confusing. Maybe
NO_INSTALL_DASHED_BUILTINS would be better? It certainly would not have
confused me.

Ciao,
Dscho

>  Makefile |  8 
>  install_programs | 36 +---
>  2 files changed, 29 insertions(+), 15 deletions(-)
> 
> diff --git a/Makefile b/Makefile
> index 07c8b74353..a849a7b6d1 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -346,6 +346,13 @@ all::
>  # INSTALL_SYMLINKS if you'd prefer not to have the install procedure
>  # fallack on hardlinking or copying if "ln -s" fails.
>  #
> +# Define NO_INSTALL_BUILTIN_EXECDIR_ALIASES if you'd like to skip
> +# installing legacy such as "git-init" and "git-add" in the
> +# gitexecdir. Unless you're on a system where "which git-init" is
> +# expected to returns something set this. Users have been expected to
> +# use the likes of "git init" for ages now, these programs were only
> +# provided for legacy compatibility.
> +#
>  # Define NO_CROSS_DIRECTORY_HARDLINKS if you plan to distribute the installed
>  # programs as a tar, where bin/ and libexec/ might be on different file 
> systems.
>  #
> @@ -2823,6 +2830,7 @@ endif
>   --flag-no-install-hardlinks="$(NO_INSTALL_HARDLINKS)" \
>   
> --flag-no-cross-directory-hardlinks="$(NO_CROSS_DIRECTORY_HARDLINKS)" \
>   
> --flag-no-install-symlinks-fallback="$(NO_INSTALL_SYMLINKS_FALLBACK)" \
> + 
> --flag-no-install-builtin-execdir-aliases="$(NO_INSTALL_BUILTIN_EXECDIR_ALIASES)"
>  \
>   --list-bindir-standalone="git$X $(filter 
> $(install_bindir_programs),$(ALL_PROGRAMS))" \
>   --list-bindir-git-dashed="$(filter 
> $(install_bindir_programs),$(BUILT_INS))" \
>   --list-execdir-git-dashed="$(BUILT_INS)" \
> diff --git a/install_programs b/install_programs
> index 51e08019dd..8d89cd9984 100755
> --- a/install_programs
> +++ b/install_programs
> @@ -33,6 +33,9 @@ do
>   --flag-no-install-symlinks-fallback=*)
>   
> NO_INSTALL_SYMLINKS_FALLBACK="${1#--flag-no-install-symlinks-fallback=}"
>   ;;
> + --flag-no-install-builtin-execdir-aliases=*)
> + 
> NO_INSTALL_BUILTIN_EXECDIR_ALIASES="${1#--flag-no-install-builtin-execdir-aliases=}"
> + ;;
>   --list-bindir-standalone=*)
>   list_bindir_standalone="${1#--list-bindir-standalone=}"
>   ;;
> @@ -54,7 +57,7 @@ do
>   shift
>  done &&
>  
> -if test "$bindir/" != "$execdir/"
> +if test "$bindir/" != "$execdir/" -a -z "$NO_INSTALL_BUILTIN_EXECDIR_ALIASES"
>  then
>   for p in $list_bindir_standalone; do
>   $RM "$execdir/$p" &&
> @@ -87,20 +90,23 @@ do
>   fi
>  done &&
>  
> -for p in $list_execdir_git_dashed; do
> - $RM "$execdir/$p" &&
> - if test -n "$INSTALL_SYMLINKS" -a -n "$NO_INSTALL_SYMLINKS_FALLBACK"
> - then
> - ln -s "$destdir_from_execdir/$bindir_relative/git$X" 
> "$execdir/$p"
> - else
> - test -n "$INSTALL_SYMLINKS" &&
> - ln -s "$destdir_from_execdir/$bindir_relative/git$X" 
> "$execdir/$p" ||
> - { test -z "$NO_INSTALL_HARDLINKS" &&
> -   ln "$execdir/git$X" "$execdir/$p" ||
> -   ln -s "git$X" "$execdir/$p" ||
> -   cp "$execdir/git$X" "$execdir/$p" || exit; }
> - fi
> -done &&
> +if test -z "$NO_INSTALL_BUILTIN_EXECDIR_ALIASES"
> +then
> + for p in $list_execdir_git_dashed; do
> + $RM "$execdir/$p" &&
> + if test -n "$INSTALL_SYMLINKS" -a -n 
> "$NO_INSTALL_SYMLINKS_FALLBACK"
> + then
> + ln -s 

Re: [PATCH 06/10] fast-export: when using paths, avoid corrupt stream with non-existent mark

2018-11-12 Thread Jeff King
On Sun, Nov 11, 2018 at 12:01:43AM -0800, Elijah Newren wrote:

> > It does seem funny that the behavior for the earlier case (bounded
> > commits) and this case (skipping some commits) are different. Would you
> > ever want to keep walking backwards to find an ancestor in the earlier
> > case? Or vice versa, would you ever want to simply delete a tag in a
> > case like this one?
> >
> > I'm not sure sure, but I suspect you may have thought about it a lot
> > harder than I have. :)
> 
> I'm not sure why you thought the behavior for the two cases was
> different?  For both patches, my testcases used path limiting; it was
> you who suggested employing a negative revision to bound the commits.

Sorry, I think I just got confused. I was thinking about the
documentation fixup you started with, which did regard bounded commits.
But that's not relevant here.

> Anyway, for both patches assuming you haven't bounded the commits, you
> can attempt to keep walking backwards to find an earlier ancestor, but
> the fundamental fact is you aren't guaranteed that you can find one
> (i.e. some tag or branch points to a commit that didn't modify any of
> the specified paths, and nor did any of its ancestors back to any root
> commits).  I hit that case lots of times.  If the user explicitly
> requested a tag or branch for export (and requested tag rewriting),
> and limited to certain paths that had never existed in the repository
> as of the time of the tag or branch, then you hit the cases these
> patches worry about.  Patch 4 was about (annotated and signed) tags,
> this patch is about unannotated tags and branches and other refs.

OK, that makes more sense.

So I guess my question is: in patch 4, why do we not walk back to find
an appropriate ancestor pointed to by the signed tag object, as we do
here for the unannotated case?

And I think the answer is: we already do that. It's just that the
unannotated case never learned the same trick. So basically it's:

  1. rewriting annotated tags to ancestors is already known on "master"

  2. patch 4 further teaches it to drop a tag when that fails

  3. patch 6 teaches both (1) and (2) to the unannotated code path,
 which knew neither

Is that right?

> > This hunk makes sense.
> 
> Cool, this was the entirety of the code...so does this mean that the
> code makes more sense than my commit message summary did?  ...and
> perhaps that my attempts to answer your questions in this email
> weren't necessary anymore?

No, it only made sense that the hunk implemented what you claimed in the
commit message. ;)

I think your responses did help me understand that what the commit
message is claiming is a good thing.

-Peff


Re: [RFC PATCH 1/5] ref-filter: add objectsize:disk option

2018-11-12 Thread Jeff King
On Mon, Nov 12, 2018 at 02:03:20PM +0900, Junio C Hamano wrote:

> > +   } else if (!strcmp(name, "objectsize")) {
> > v->value = oi->size;
> > v->s = xstrfmt("%lu", oi->size);
> 
> This is not a suggestion but is a genuine question, but I wonder if
> two years down the road somebody who meets this API for the first
> time find the asymmetry between "objectsize" and "objectsize:disk" a
> tad strange and suggest the former to have "objectsize:real" or some
> synonym.  Or we can consider "objectsize" the primary thing (hence
> needing no colon-plus-modifier to clarify what kind of size we are
> asking) and leave these two deliberatly asymmetric.  I am leaning
> towards the latter myself.

I think to some degree that ship has already sailed (and is my fault!).

The ulterior motive here is to eventually unify the cat-file formatter
with the ref-filter formatter.  So for that we'll have to support
%(objectsize) anyway.

That still leaves the option of having %(objectsize:real) later and
marking a bare %(objectsize) as a deprecated synonym. But I don't think
there's any advantage to trying to deal with it at this stage.

-Peff


Re: [RFC PATCH 1/5] ref-filter: add objectsize:disk option

2018-11-12 Thread Johannes Schindelin
Hi,

On Mon, 12 Nov 2018, Junio C Hamano wrote:

> Olga Telezhnaya  writes:
> 
> > @@ -876,11 +882,13 @@ static void grab_common_values(struct atom_value 
> > *val, int deref, struct expand_
> > name++;
> > if (!strcmp(name, "objecttype"))
> > v->s = xstrdup(type_name(oi->type));
> > -   else if (!strcmp(name, "objectsize")) {
> > +   else if (!strcmp(name, "objectsize:disk")) {
> > +   v->value = oi->disk_size;
> > +   v->s = xstrfmt("%lld", (long long)oi->disk_size);
> 
> oi.disk_size is off_t; do we know "long long" 
> 
>(1) is available widely enough (I think it is from c99)?
> 
>(2) is sufficiently wide so that we can safely cast off_t to?
> 
>(3) will stay to be sufficiently wide as machines get larger
>together with standard types like off_t in the future?
> 
> I'd rather use intmax_t (as off_t is a signed integral type) so that
> we do not have to worry about these questions in the first place.

You mean something like

v->s = xstrfmt("%"PRIdMAX, (intmax_t)oi->disk_size);


If so, I agree.

Ciao,
Dscho

P.S.: I wondered whether we have precedent for PRIdMAX, as we used to use
only PRIuMAX, but yes: JeffH's json-writer uses PRIdMAX.

> 
> > +   } else if (!strcmp(name, "objectsize")) {
> > v->value = oi->size;
> > v->s = xstrfmt("%lu", oi->size);
> 
> This is not a suggestion but is a genuine question, but I wonder if
> two years down the road somebody who meets this API for the first
> time find the asymmetry between "objectsize" and "objectsize:disk" a
> tad strange and suggest the former to have "objectsize:real" or some
> synonym.  Or we can consider "objectsize" the primary thing (hence
> needing no colon-plus-modifier to clarify what kind of size we are
> asking) and leave these two deliberatly asymmetric.  I am leaning
> towards the latter myself.
> 
> > -   }
> > -   else if (deref)
> > +   } else if (deref)
> > grab_objectname(name, >oid, v, _atom[i]);
> > }
> >  }
> >
> > --
> > https://github.com/git/git/pull/552
> 


Re: Migration to Git LFS inflates repository multiple times

2018-11-12 Thread Jeff King
On Mon, Nov 12, 2018 at 12:47:42AM +0100, Mateusz Loskot wrote:

> Hi,
> 
> I'm posting here for the first time and I hope it's the right place to ask
> questions about Git LFS.
> 
> TL;TR: Is this normal a repository migrated to Git LFS inflates multiple times
> and how to deal with it?

That does sound odd to me. People with more LFS experience can probably
give you a better answers, but one thought occurred to me: does LFS
store backup copies of the original refs that it rewrites (similar to
the way filter-branch stores refs/original)?

If so, then the resulting repo has the new history _and_ the old
history. Which might mean storing those large blobs both as Git objects
(for the old history) and in an LFS cache directory (for the new
history).

And the right next step is probably to delete those backup refs, and
then "git gc --prune=now". Hmm, actually thinking about it, reflogs
could be making the old history reachable, too.

Try looking at the output of "git for-each-ref" and seeing if there are
any backup refs. After deleting them (or confirming that there aren't),
prune the reflogs with:

  git reflog expire --expire-unreachable=now --all

and then "git gc --prune=now".

-Peff


Re: [RFC PATCH] Introduce "precious" file concept

2018-11-12 Thread Junio C Hamano
Ævar Arnfjörð Bjarmason  writes:

>> Or only selected "*.o" (vendor supplied binary blob) kept tracked
>> while everything else is built from the source.
>> ...
> But it also does get used for "mostly we don't want this file, but
> sometimes we do" use-case, so that's something we need to deal with in
> practice.

Exactly.  "Mostly we don't want *.o as we prefer to build from the
source, but we have only object files for some selected ones" is an
often cited use case where it is the BCP to have *.o in .gitignore
and use "add -f" to add the "selected" ones initially.



Re: [RFC/PATCH 0/5] stop installing old libexec aliases like "git-init"

2018-11-12 Thread Johannes Schindelin
Hi Ævar,

On Fri, 2 Nov 2018, Ævar Arnfjörð Bjarmason wrote:

>  * GIT_TEST_INSTALLED breaks entirely under this, as early as the
>heuristic for "are we built?" being "do we have git-init in
>libexecdir?". I tried a bit to make this work, but there's a lot of
>dependencies there.

I have a couple of patches in the pipeline to improve
`GIT_TEST_INSTALLED`, as I needed it to work without hardlinked copies of
the built-ins. These patches might help this here isue.

>  * We still (and this is also true of my ad874608d8) hardlink
>everything in the build dir via a different part of the Makefile,
>ideally we should do exactly the same thing there so also normal
>tests and not just GIT_TEST_INSTALLED (if that worked) would test
>in the same mode.
> 
>I gave making that work a bit of a try and gave up in the Makefile
>jungle.

Yep.

Ciao,
Dscho

[PATCH 4/5] tests: do not require Git to be built when testing an installed Git

2018-11-12 Thread Johannes Schindelin via GitGitGadget
From: Johannes Schindelin 

We really only need the test helpers in that case, but that is not what
we test for. So let's skip the test for now when we know that we want to
test an installed Git.

Signed-off-by: Johannes Schindelin 
---
 t/test-lib.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/t/test-lib.sh b/t/test-lib.sh
index 832ede5099..1ea20dc2dc 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -51,7 +51,7 @@ export LSAN_OPTIONS
 
 
 # It appears that people try to run tests without building...
-"$GIT_BUILD_DIR/git" >/dev/null
+test -n "$GIT_TEST_INSTALLED" || "$GIT_BUILD_DIR/git" >/dev/null ||
 if test $? != 1
 then
echo >&2 'error: you do not seem to have built git yet.'
-- 
gitgitgadget



Re: [PATCH v2 0/3] Fix built-in rebase perf regression

2018-11-12 Thread Johannes Schindelin
Hi,

On Mon, 12 Nov 2018, Johannes Schindelin via GitGitGadget wrote:

> In our tests with large repositories, we noticed a serious regression of the
> performance of git rebase when using the built-in vs the shell script
> version. It boils down to an incorrect conversion of a git checkout -q:
> instead of using a twoway_merge as git checkout does, we used a oneway_merge 
> as git reset does. The latter, however, calls lstat() on all files listed in
> the index, while the former essentially looks only at the files that are
> different between the given two revisions.
> 
> Let's reinstate the original behavior by introducing a flag to the 
> reset_head() function to indicate whether we want to emulate reset --hard 
> (in which case we use the oneway_merge, otherwise we use twoway_merge).
> 
> Johannes Schindelin (3):
>   rebase: consolidate clean-up code before leaving reset_head()
>   rebase: prepare reset_head() for more flags
>   built-in rebase: reinstate `checkout -q` behavior where appropriate
> 
>  builtin/rebase.c | 79 
>  1 file changed, 46 insertions(+), 33 deletions(-)

I forgot to specify the changes vs v1:

- More error paths are not consolidated via `goto leave_reset_head`.
- The `desc` array is not initialized to all-zero, to avoid bogus
  addresses being passed to `free()`.
- The `detach_head` and `reset_hard` parameters have been consolidated
  into a `flags` parameter.
- The `reset_head()` function once again only initializes `head_oid`
  when needed.

Sorry for the omission,
Johannes

> 
> 
> base-commit: 8858448bb49332d353febc078ce4a3abcc962efe
> Published-As: 
> https://github.com/gitgitgadget/git/releases/tags/pr-72%2Fdscho%2Fbuiltin-rebase-perf-regression-v2
> Fetch-It-Via: git fetch https://github.com/gitgitgadget/git 
> pr-72/dscho/builtin-rebase-perf-regression-v2
> Pull-Request: https://github.com/gitgitgadget/git/pull/72
> 
> Range-diff vs v1:
> 
>  1:  64597fe827 ! 1:  28e24d98ab rebase: consolidate clean-up code before 
> leaving reset_head()
>  @@ -11,6 +11,33 @@
>   --- a/builtin/rebase.c
>   +++ b/builtin/rebase.c
>   @@
>  +if (switch_to_branch && !starts_with(switch_to_branch, "refs/"))
>  +BUG("Not a fully qualified branch: '%s'", 
> switch_to_branch);
>  + 
>  +-   if (hold_locked_index(, LOCK_REPORT_ON_ERROR) < 0)
>  +-   return -1;
>  ++   if (hold_locked_index(, LOCK_REPORT_ON_ERROR) < 0) {
>  ++   ret = -1;
>  ++   goto leave_reset_head;
>  ++   }
>  + 
>  +if (!oid) {
>  +if (get_oid("HEAD", _oid)) {
>  +-   rollback_lock_file();
>  +-   return error(_("could not determine HEAD 
> revision"));
>  ++   ret = error(_("could not determine HEAD 
> revision"));
>  ++   goto leave_reset_head;
>  +}
>  +oid = _oid;
>  +}
>  +@@
>  +unpack_tree_opts.reset = 1;
>  + 
>  +if (read_index_unmerged(the_repository->index) < 0) {
>  +-   rollback_lock_file();
>  +-   return error(_("could not read index"));
>  ++   ret = error(_("could not read index"));
>  ++   goto leave_reset_head;
>   }
>
>   if (!fill_tree_descriptor(, oid)) {
>  @@ -31,15 +58,17 @@
>   }
>
>   tree = parse_tree_indirect(oid);
>  -@@
>  +prime_cache_tree(the_repository->index, tree);
>
>  -if (write_locked_index(the_repository->index, , 
> COMMIT_LOCK) < 0)
>  +-   if (write_locked_index(the_repository->index, , 
> COMMIT_LOCK) < 0)
>  ++   if (write_locked_index(the_repository->index, , 
> COMMIT_LOCK) < 0) {
>   ret = error(_("could not write index"));
>   -   free((void *)desc.buffer);
>  - 
>  -if (ret)
>  +-
>  +-   if (ret)
>   -   return ret;
>   +   goto leave_reset_head;
>  ++   }
>
>   reflog_action = getenv(GIT_REFLOG_ACTION_ENVIRONMENT);
>   strbuf_addf(, "%s: ", reflog_action ? reflog_action : 
> "rebase");
>  -:  -- > 2:  db963b2094 rebase: prepare reset_head() for more flags
>  2:  070092b430 ! 3:  a7360b856f built-in rebase: reinstate `checkout -q` 
> behavior where appropriate
>  @@ -20,15 +20,18 @@
>   @@
>#define GIT_REFLOG_ACTION_ENVIRONMENT "GIT_REFLOG_ACTION"
>
>  + #define RESET_HEAD_DETACH (1<<0)
>  ++#define RESET_HEAD_HARD (1<<1)
>  + 
>static int reset_head(struct object_id *oid, const char *action,
>  -- const char *switch_to_branch, int detach_head,
>  -+ const char *switch_to_branch,
> 

Re: [RFC PATCH] index-pack: improve performance on NFS

2018-11-12 Thread Jeff King
On Sat, Nov 10, 2018 at 03:04:35PM +0100, Ævar Arnfjörð Bjarmason wrote:

> d) As shown in the linked E-Mails of mine you sometimes pay a 2-3 second
>*fixed* cost even for a very small (think ~100-200 objects) push/fetch
>that would otherwise take milliseconds with Jeff's version of this
>optimization (and not with mine). This can be a hundred/thousands of
>percent slowdown.
> 
>Is that a big deal in itself in terms of absolute time spent? No. But
>I'm also thinking about this from the perspective of getting noise
>out of performance metrics. Some of this slowdown is also "user
>waiting for the terminal to be usable again" not just some machine
>somewhere wasting its own time.

IMHO the ultimate end-game in this direction is still "don't have a
bunch of loose objects".

Right now this can legitimately happen due to unreachable-but-recent
objects being exploded out (or never packed in the first place). But I
hope in the long run that we'll actually put these into packs. That will
make this case faster _and_ avoid extra work during gc _and_ fix the
"whoops, we just ran gc but you still have a lot of objects" problem.

Which doesn't invalidate your other four points, of course. ;)

-Peff


[PATCH 0/9] caching loose objects

2018-11-12 Thread Jeff King
Here's the series I mentioned earlier in the thread to cache loose
objects when answering has_object_file(..., OBJECT_INFO_QUICK). For
those just joining us, this makes operations that look up a lot of
missing objects (like "index-pack" looking for collisions) faster. This
is mostly targeted at systems where stat() is slow, like over NFS, but
it seems to give a 2% speedup indexing a full git.git packfile into an
empty repository (i.e., what you'd see on a clone).

I'm adding René Scharfe and Takuto Ikuta to the cc for their previous
work in loose-object caching.

The interesting bit is patch 8. The rest of it is cleanup to let us
treat alternates and the main object directory similarly.

  [1/9]: fsck: do not reuse child_process structs
  [2/9]: submodule--helper: prefer strip_suffix() to ends_with()
  [3/9]: rename "alternate_object_database" to "object_directory"
  [4/9]: sha1_file_name(): overwrite buffer instead of appending
  [5/9]: handle alternates paths the same as the main object dir
  [6/9]: sha1-file: use an object_directory for the main object dir
  [7/9]: object-store: provide helpers for loose_objects_cache
  [8/9]: sha1-file: use loose object cache for quick existence check
  [9/9]: fetch-pack: drop custom loose object cache

 builtin/count-objects.c |   4 +-
 builtin/fsck.c  |  35 +++---
 builtin/grep.c  |   2 +-
 builtin/submodule--helper.c |   9 +-
 commit-graph.c  |  13 +--
 environment.c   |   4 +-
 fetch-pack.c|  39 +--
 http-walker.c   |   2 +-
 http.c  |   4 +-
 object-store.h  |  60 +--
 object.c|  26 ++---
 packfile.c  |  20 ++--
 path.c  |   2 +-
 repository.c|   8 +-
 sha1-file.c | 210 ++--
 sha1-name.c |  42 ++--
 transport.c |   2 +-
 17 files changed, 209 insertions(+), 273 deletions(-)

-Peff


Re: [PATCH 5/9] handle alternates paths the same as the main object dir

2018-11-12 Thread Derrick Stolee

On 11/12/2018 9:49 AM, Jeff King wrote:

When we generate loose file paths for the main object directory, the
caller provides a buffer to loose_object_path (formerly sha1_file_name).
The callers generally keep their own static buffer to avoid excessive
reallocations.

But for alternate directories, each struct carries its own scratch
buffer. This is needlessly different; let's unify them.

We could go either direction here, but this patch moves the alternates
struct over to the main directory style (rather than vice-versa).
Technically the alternates style is more efficient, as it avoids
rewriting the object directory name on each call. But this is unlikely
to matter in practice, as we avoid reallocations either way (and nobody
has ever noticed or complained that the main object directory is copying
a few extra bytes before making a much more expensive system call).


Hm. I've complained in the past [1] about a simple method like 
strbuf_addf() over loose objects, but that was during abbreviation 
checks so we were adding the string for every loose object but not 
actually reading the objects.


[1] 
https://public-inbox.org/git/20171201174956.143245-1-dsto...@microsoft.com/


The other concern I have is for alternates that may have long-ish paths 
to their object directories.


So, this is worth keeping an eye on, but is likely to be fine.


And this has the advantage that the reusable buffers are tied to
particular calls, which makes the invalidation rules simpler (for
example, the return value from stat_sha1_file() used to be invalidated
by basically any other object call, but now it is affected only by other
calls to stat_sha1_file()).

We do steal the trick from alt_sha1_path() of returning a pointer to the
filled buffer, which makes a few conversions more convenient.

Signed-off-by: Jeff King 
---
  object-store.h | 14 +-
  object.c   |  1 -
  sha1-file.c| 44 
  sha1-name.c|  8 ++--
  4 files changed, 23 insertions(+), 44 deletions(-)

diff --git a/object-store.h b/object-store.h
index fefa17e380..b2fa0d0df0 100644
--- a/object-store.h
+++ b/object-store.h
@@ -10,10 +10,6 @@
  struct object_directory {
struct object_directory *next;
  
-	/* see alt_scratch_buf() */

-   struct strbuf scratch;
-   size_t base_len;
-
/*
 * Used to store the results of readdir(3) calls when searching
 * for unique abbreviated hashes.  This cache is never
@@ -54,14 +50,6 @@ void add_to_alternates_file(const char *dir);
   */
  void add_to_alternates_memory(const char *dir);
  
-/*

- * Returns a scratch strbuf pre-filled with the alternate object directory,
- * including a trailing slash, which can be used to access paths in the
- * alternate. Always use this over direct access to alt->scratch, as it
- * cleans up any previous use of the scratch buffer.
- */
-struct strbuf *alt_scratch_buf(struct object_directory *odb);
-
  struct packed_git {
struct packed_git *next;
struct list_head mru;
@@ -157,7 +145,7 @@ void raw_object_store_clear(struct raw_object_store *o);
   * Put in `buf` the name of the file in the local object database that
   * would be used to store a loose object with the specified sha1.
   */
-void loose_object_path(struct repository *r, struct strbuf *buf, const 
unsigned char *sha1);
+const char *loose_object_path(struct repository *r, struct strbuf *buf, const 
unsigned char *sha1);
  
  void *map_sha1_file(struct repository *r, const unsigned char *sha1, unsigned long *size);
  
diff --git a/object.c b/object.c

index 6af8e908bb..dd485ac629 100644
--- a/object.c
+++ b/object.c
@@ -484,7 +484,6 @@ struct raw_object_store *raw_object_store_new(void)
  
  static void free_alt_odb(struct object_directory *odb)

  {
-   strbuf_release(>scratch);
oid_array_clear(>loose_objects_cache);
free(odb);
  }
diff --git a/sha1-file.c b/sha1-file.c
index 478eac326b..15db6b61a9 100644
--- a/sha1-file.c
+++ b/sha1-file.c
@@ -346,27 +346,20 @@ static void fill_sha1_path(struct strbuf *buf, const 
unsigned char *sha1)
}
  }
  
-void loose_object_path(struct repository *r, struct strbuf *buf,

-  const unsigned char *sha1)
+static const char *odb_loose_path(const char *path, struct strbuf *buf,
+ const unsigned char *sha1)
  {
strbuf_reset(buf);
-   strbuf_addstr(buf, r->objects->objectdir);
+   strbuf_addstr(buf, path);
strbuf_addch(buf, '/');
fill_sha1_path(buf, sha1);
+   return buf->buf;
  }
  
-struct strbuf *alt_scratch_buf(struct object_directory *odb)

+const char *loose_object_path(struct repository *r, struct strbuf *buf,
+ const unsigned char *sha1)
  {
-   strbuf_setlen(>scratch, odb->base_len);
-   return >scratch;
-}
-
-static const char *alt_sha1_path(struct object_directory *odb,
-const unsigned 

Re: [PATCH 8/9] sha1-file: use loose object cache for quick existence check

2018-11-12 Thread Ævar Arnfjörð Bjarmason


On Mon, Nov 12 2018, Jeff King wrote:

> In cases where we expect to ask has_sha1_file() about a lot of objects
> that we are not likely to have (e.g., during fetch negotiation), we
> already use OBJECT_INFO_QUICK to sacrifice accuracy (due to racing with
> a simultaneous write or repack) for speed (we avoid re-scanning the pack
> directory).
>
> However, even checking for loose objects can be expensive, as we will
> stat() each one. On many systems this cost isn't too noticeable, but
> stat() can be particularly slow on some operating systems, or due to
> network filesystems.
>
> Since the QUICK flag already tells us that we're OK with a slightly
> stale answer, we can use that as a cue to look in our in-memory cache of
> each object directory. That basically trades an in-memory binary search
> for a stat() call.
>
> Note that it is possible for this to actually be _slower_. We'll do a
> full readdir() to fill the cache, so if you have a very large number of
> loose objects and a very small number of lookups, that readdir() may end
> up more expensive.
>
> This shouldn't be a big deal in practice. If you have a large number of
> reachable loose objects, you'll already run into performance problems
> (which you should remedy by repacking). You may have unreachable objects
> which wouldn't otherwise impact performance. Usually these would go away
> with the prune step of "git gc", but they may be held for up to 2 weeks
> in the default configuration.
>
> So it comes down to how many such objects you might reasonably expect to
> have, how much slower is readdir() on N entries versus M stat() calls
> (and here we really care about the syscall backing readdir(), like
> getdents() on Linux, but I'll just call this readdir() below).
>
> If N is much smaller than M (a typical packed repo), we know this is a
> big win (few readdirs() followed by many uses of the resulting cache).
> When N and M are similar in size, it's also a win. We care about the
> latency of making a syscall, and readdir() should be giving us many
> values in a single call. How many?
>
> On Linux, running "strace -e getdents ls" shows a 32k buffer getting 512
> entries per call (which is 64 bytes per entry; the name itself is 38
> bytes, plus there are some other fields). So we can imagine that this is
> always a win as long as the number of loose objects in the repository is
> a factor of 500 less than the number of lookups you make. It's hard to
> auto-tune this because we don't generally know up front how many lookups
> we're going to do. But it's unlikely for this to perform significantly
> worse.
>
> Signed-off-by: Jeff King 
> ---
> There's some obvious hand-waving in the paragraphs above. I would love
> it if somebody with an NFS system could do some before/after timings
> with various numbers of loose objects, to get a sense of where the
> breakeven point is.
>
> My gut is that we do not need the complexity of a cache-size limit, nor
> of a config option to disable this. But it would be nice to have a real
> number where "reasonable" ends and "pathological" begins. :)

I'm happy to test this on some of the NFS we have locally, and started
out with a plan to write some for-loop using the low-level API (so it
would look up all 256), fake populate .git/objects/?? with N number of
objects etc, but ran out of time.

Do you have something ready that you think would be representative and I
could just run? If not I'll try to pick this up again...


Re: [PATCH 8/9] sha1-file: use loose object cache for quick existence check

2018-11-12 Thread Derrick Stolee

On 11/12/2018 9:54 AM, Jeff King wrote:

In cases where we expect to ask has_sha1_file() about a lot of objects
that we are not likely to have (e.g., during fetch negotiation), we
already use OBJECT_INFO_QUICK to sacrifice accuracy (due to racing with
a simultaneous write or repack) for speed (we avoid re-scanning the pack
directory).

However, even checking for loose objects can be expensive, as we will
stat() each one. On many systems this cost isn't too noticeable, but
stat() can be particularly slow on some operating systems, or due to
network filesystems.

Since the QUICK flag already tells us that we're OK with a slightly
stale answer, we can use that as a cue to look in our in-memory cache of
each object directory. That basically trades an in-memory binary search
for a stat() call.

Note that it is possible for this to actually be _slower_. We'll do a
full readdir() to fill the cache, so if you have a very large number of
loose objects and a very small number of lookups, that readdir() may end
up more expensive.

This shouldn't be a big deal in practice. If you have a large number of
reachable loose objects, you'll already run into performance problems
(which you should remedy by repacking). You may have unreachable objects
which wouldn't otherwise impact performance. Usually these would go away
with the prune step of "git gc", but they may be held for up to 2 weeks
in the default configuration.

So it comes down to how many such objects you might reasonably expect to
have, how much slower is readdir() on N entries versus M stat() calls
(and here we really care about the syscall backing readdir(), like
getdents() on Linux, but I'll just call this readdir() below).

If N is much smaller than M (a typical packed repo), we know this is a
big win (few readdirs() followed by many uses of the resulting cache).
When N and M are similar in size, it's also a win. We care about the
latency of making a syscall, and readdir() should be giving us many
values in a single call. How many?

On Linux, running "strace -e getdents ls" shows a 32k buffer getting 512
entries per call (which is 64 bytes per entry; the name itself is 38
bytes, plus there are some other fields). So we can imagine that this is
always a win as long as the number of loose objects in the repository is
a factor of 500 less than the number of lookups you make. It's hard to
auto-tune this because we don't generally know up front how many lookups
we're going to do. But it's unlikely for this to perform significantly
worse.

Signed-off-by: Jeff King 
---
There's some obvious hand-waving in the paragraphs above. I would love
it if somebody with an NFS system could do some before/after timings
with various numbers of loose objects, to get a sense of where the
breakeven point is.

My gut is that we do not need the complexity of a cache-size limit, nor
of a config option to disable this. But it would be nice to have a real
number where "reasonable" ends and "pathological" begins. :)


I'm interested in such numbers, but do not have the appropriate setup to 
test.


I think the tradeoffs you mention above are reasonable. There's also 
some chance that this isn't "extra" work but is just "earlier" work, as 
the abbreviation code would load these loose object directories.




  object-store.h |  1 +
  sha1-file.c| 20 
  2 files changed, 21 insertions(+)

diff --git a/object-store.h b/object-store.h
index bf1e0cb761..60758efad8 100644
--- a/object-store.h
+++ b/object-store.h
@@ -13,6 +13,7 @@ struct object_directory {
/*
 * Used to store the results of readdir(3) calls when we are OK
 * sacrificing accuracy due to races for speed. That includes
+* object existence with OBJECT_INFO_QUICK, as well as
 * our search for unique abbreviated hashes. Don't use it for tasks
 * requiring greater accuracy!
 *
diff --git a/sha1-file.c b/sha1-file.c
index 4aae716a37..e53da0b701 100644
--- a/sha1-file.c
+++ b/sha1-file.c
@@ -921,6 +921,24 @@ static int open_sha1_file(struct repository *r,
return -1;
  }
  
+static int quick_has_loose(struct repository *r,

+  const unsigned char *sha1)
+{
+   int subdir_nr = sha1[0];
+   struct object_id oid;
+   struct object_directory *odb;
+
+   hashcpy(oid.hash, sha1);
+
+   prepare_alt_odb(r);
+   for (odb = r->objects->odb; odb; odb = odb->next) {
+   odb_load_loose_cache(odb, subdir_nr);
+   if (oid_array_lookup(>loose_objects_cache, ) >= 0)
+   return 1;
+   }
+   return 0;
+}
+
  /*
   * Map the loose object at "path" if it is not NULL, or the path found by
   * searching for a loose object named "sha1".
@@ -1171,6 +1189,8 @@ static int sha1_loose_object_info(struct repository *r,
if (!oi->typep && !oi->type_name && !oi->sizep && !oi->contentp) {
const char *path;
struct stat st;
+   if 

Re: [PATCH 8/9] sha1-file: use loose object cache for quick existence check

2018-11-12 Thread Jeff King
On Mon, Nov 12, 2018 at 05:01:02PM +0100, Ævar Arnfjörð Bjarmason wrote:

> > There's some obvious hand-waving in the paragraphs above. I would love
> > it if somebody with an NFS system could do some before/after timings
> > with various numbers of loose objects, to get a sense of where the
> > breakeven point is.
> >
> > My gut is that we do not need the complexity of a cache-size limit, nor
> > of a config option to disable this. But it would be nice to have a real
> > number where "reasonable" ends and "pathological" begins. :)
> 
> I'm happy to test this on some of the NFS we have locally, and started
> out with a plan to write some for-loop using the low-level API (so it
> would look up all 256), fake populate .git/objects/?? with N number of
> objects etc, but ran out of time.
> 
> Do you have something ready that you think would be representative and I
> could just run? If not I'll try to pick this up again...

No, but they don't even really need to be actual objects. So I suspect
something like:

  git init
  for i in $(seq 256); do
i=$(printf %02x $i)
mkdir -p .git/objects/$i
for j in $(seq --format=%038g 1000); do
  echo foo >.git/objects/$i/$j
done
  done
  git index-pack -v --stdin 

Re: [RFC PATCH] index-pack: improve performance on NFS

2018-11-12 Thread Jeff King
On Thu, Nov 08, 2018 at 11:20:47PM +0100, Ævar Arnfjörð Bjarmason wrote:

> Just a comment on this from the series:
> 
> Note that it is possible for this to actually be _slower_. We'll do a
> full readdir() to fill the cache, so if you have a very large number of
> loose objects and a very small number of lookups, that readdir() may end
> up more expensive.
> 
> In practice, though, having a large number of loose objects is already a
> performance problem, which should be fixed by repacking or pruning via
> git-gc. So on balance, this should be a good tradeoff.
> 
> Our biggest repo has a very large number of loose objects at any given
> time, but the vast majority of these are because gc *is* happening very
> frequently and the default expiry policy of 2wks is in effect.
> 
> Having a large number of loose objects is not per-se a performance
> problem.

Yes, you're right. I was trying not to get into the rabbit hole of
discussing theoretical tradeoffs, but it is worth addressing. I've
updated that commit message in the patches I'll send out momentarily.

-Peff


[PATCH 3/9] rename "alternate_object_database" to "object_directory"

2018-11-12 Thread Jeff King
In preparation for unifying the handling of alt odb's and the normal
repo object directory, let's use a more neutral name. This patch is
purely mechanical, swapping the type name, and converting any variables
named "alt" to "odb". There should be no functional change, but it will
reduce the noise in subsequent diffs.

Signed-off-by: Jeff King 
---
I waffled on calling this object_database instead of object_directory.
But really, it is very specifically about the directory (packed
storage, including packs from alternates, is handled elsewhere).

 builtin/count-objects.c |  4 ++--
 builtin/fsck.c  | 16 ++---
 builtin/submodule--helper.c |  6 ++---
 commit-graph.c  | 10 
 object-store.h  | 14 +--
 object.c| 10 
 packfile.c  |  8 +++
 sha1-file.c | 48 ++---
 sha1-name.c | 20 
 transport.c |  2 +-
 10 files changed, 69 insertions(+), 69 deletions(-)

diff --git a/builtin/count-objects.c b/builtin/count-objects.c
index a7cad052c6..3fae474f6f 100644
--- a/builtin/count-objects.c
+++ b/builtin/count-objects.c
@@ -78,10 +78,10 @@ static int count_cruft(const char *basename, const char 
*path, void *data)
return 0;
 }
 
-static int print_alternate(struct alternate_object_database *alt, void *data)
+static int print_alternate(struct object_directory *odb, void *data)
 {
printf("alternate: ");
-   quote_c_style(alt->path, NULL, stdout, 0);
+   quote_c_style(odb->path, NULL, stdout, 0);
putchar('\n');
return 0;
 }
diff --git a/builtin/fsck.c b/builtin/fsck.c
index b10f2b154c..55153cf92a 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -688,7 +688,7 @@ static struct option fsck_opts[] = {
 int cmd_fsck(int argc, const char **argv, const char *prefix)
 {
int i;
-   struct alternate_object_database *alt;
+   struct object_directory *odb;
 
/* fsck knows how to handle missing promisor objects */
fetch_if_missing = 0;
@@ -725,14 +725,14 @@ int cmd_fsck(int argc, const char **argv, const char 
*prefix)
for_each_loose_object(mark_loose_for_connectivity, NULL, 0);
for_each_packed_object(mark_packed_for_connectivity, NULL, 0);
} else {
-   struct alternate_object_database *alt_odb_list;
+   struct object_directory *alt_odb_list;
 
fsck_object_dir(get_object_directory());
 
prepare_alt_odb(the_repository);
alt_odb_list = the_repository->objects->alt_odb_list;
-   for (alt = alt_odb_list; alt; alt = alt->next)
-   fsck_object_dir(alt->path);
+   for (odb = alt_odb_list; odb; odb = odb->next)
+   fsck_object_dir(odb->path);
 
if (check_full) {
struct packed_git *p;
@@ -840,12 +840,12 @@ int cmd_fsck(int argc, const char **argv, const char 
*prefix)
errors_found |= ERROR_COMMIT_GRAPH;
 
prepare_alt_odb(the_repository);
-   for (alt =  the_repository->objects->alt_odb_list; alt; alt = 
alt->next) {
+   for (odb = the_repository->objects->alt_odb_list; odb; odb = 
odb->next) {
child_process_init(_graph_verify);
commit_graph_verify.argv = verify_argv;
commit_graph_verify.git_cmd = 1;
verify_argv[2] = "--object-dir";
-   verify_argv[3] = alt->path;
+   verify_argv[3] = odb->path;
if (run_command(_graph_verify))
errors_found |= ERROR_COMMIT_GRAPH;
}
@@ -861,12 +861,12 @@ int cmd_fsck(int argc, const char **argv, const char 
*prefix)
errors_found |= ERROR_COMMIT_GRAPH;
 
prepare_alt_odb(the_repository);
-   for (alt =  the_repository->objects->alt_odb_list; alt; alt = 
alt->next) {
+   for (odb = the_repository->objects->alt_odb_list; odb; odb = 
odb->next) {
child_process_init(_verify);
midx_verify.argv = midx_argv;
midx_verify.git_cmd = 1;
midx_argv[2] = "--object-dir";
-   midx_argv[3] = alt->path;
+   midx_argv[3] = odb->path;
if (run_command(_verify))
errors_found |= ERROR_COMMIT_GRAPH;
}
diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
index 28b9449e82..3ae451bc46 100644
--- a/builtin/submodule--helper.c
+++ b/builtin/submodule--helper.c
@@ -1265,7 +1265,7 @@ struct submodule_alternate_setup {
SUBMODULE_ALTERNATE_ERROR_IGNORE, NULL }
 
 static int 

[PATCH 4/9] sha1_file_name(): overwrite buffer instead of appending

2018-11-12 Thread Jeff King
The sha1_file_name() function is used to generate the path to a loose
object in the object directory. It doesn't make much sense for it to
append, since the the path we write may be absolute (i.e., you cannot
reliably build up a path with it). Because many callers use it with a
static buffer, they have to strbuf_reset() manually before each call
(and the other callers always use an empty buffer, so they don't care
either way). Let's handle this automatically.

Since we're changing the semantics, let's take the opportunity to give
it a more hash-neutral name (which will also catch any callers from
topics in flight).

Signed-off-by: Jeff King 
---
 http-walker.c  |  2 +-
 http.c |  4 ++--
 object-store.h |  2 +-
 sha1-file.c| 18 --
 4 files changed, 12 insertions(+), 14 deletions(-)

diff --git a/http-walker.c b/http-walker.c
index b3334bf657..0a392c85b6 100644
--- a/http-walker.c
+++ b/http-walker.c
@@ -547,7 +547,7 @@ static int fetch_object(struct walker *walker, unsigned 
char *sha1)
ret = error("File %s has bad hash", hex);
} else if (req->rename < 0) {
struct strbuf buf = STRBUF_INIT;
-   sha1_file_name(the_repository, , req->sha1);
+   loose_object_path(the_repository, , req->sha1);
ret = error("unable to write sha1 filename %s", buf.buf);
strbuf_release();
}
diff --git a/http.c b/http.c
index 3dc8c560d6..46c2e7a275 100644
--- a/http.c
+++ b/http.c
@@ -2314,7 +2314,7 @@ struct http_object_request *new_http_object_request(const 
char *base_url,
hashcpy(freq->sha1, sha1);
freq->localfile = -1;
 
-   sha1_file_name(the_repository, , sha1);
+   loose_object_path(the_repository, , sha1);
strbuf_addf(>tmpfile, "%s.temp", filename.buf);
 
strbuf_addf(, "%s.prev", filename.buf);
@@ -2465,7 +2465,7 @@ int finish_http_object_request(struct http_object_request 
*freq)
unlink_or_warn(freq->tmpfile.buf);
return -1;
}
-   sha1_file_name(the_repository, , freq->sha1);
+   loose_object_path(the_repository, , freq->sha1);
freq->rename = finalize_object_file(freq->tmpfile.buf, filename.buf);
strbuf_release();
 
diff --git a/object-store.h b/object-store.h
index 122d5f75e2..fefa17e380 100644
--- a/object-store.h
+++ b/object-store.h
@@ -157,7 +157,7 @@ void raw_object_store_clear(struct raw_object_store *o);
  * Put in `buf` the name of the file in the local object database that
  * would be used to store a loose object with the specified sha1.
  */
-void sha1_file_name(struct repository *r, struct strbuf *buf, const unsigned 
char *sha1);
+void loose_object_path(struct repository *r, struct strbuf *buf, const 
unsigned char *sha1);
 
 void *map_sha1_file(struct repository *r, const unsigned char *sha1, unsigned 
long *size);
 
diff --git a/sha1-file.c b/sha1-file.c
index a3cc650a0a..478eac326b 100644
--- a/sha1-file.c
+++ b/sha1-file.c
@@ -346,8 +346,10 @@ static void fill_sha1_path(struct strbuf *buf, const 
unsigned char *sha1)
}
 }
 
-void sha1_file_name(struct repository *r, struct strbuf *buf, const unsigned 
char *sha1)
+void loose_object_path(struct repository *r, struct strbuf *buf,
+  const unsigned char *sha1)
 {
+   strbuf_reset(buf);
strbuf_addstr(buf, r->objects->objectdir);
strbuf_addch(buf, '/');
fill_sha1_path(buf, sha1);
@@ -735,8 +737,7 @@ static int check_and_freshen_local(const struct object_id 
*oid, int freshen)
 {
static struct strbuf buf = STRBUF_INIT;
 
-   strbuf_reset();
-   sha1_file_name(the_repository, , oid->hash);
+   loose_object_path(the_repository, , oid->hash);
 
return check_and_freshen_file(buf.buf, freshen);
 }
@@ -888,7 +889,7 @@ int git_open_cloexec(const char *name, int flags)
  *
  * The "path" out-parameter will give the path of the object we found (if any).
  * Note that it may point to static storage and is only valid until another
- * call to sha1_file_name(), etc.
+ * call to loose_object_path(), etc.
  */
 static int stat_sha1_file(struct repository *r, const unsigned char *sha1,
  struct stat *st, const char **path)
@@ -896,8 +897,7 @@ static int stat_sha1_file(struct repository *r, const 
unsigned char *sha1,
struct object_directory *odb;
static struct strbuf buf = STRBUF_INIT;
 
-   strbuf_reset();
-   sha1_file_name(r, , sha1);
+   loose_object_path(r, , sha1);
*path = buf.buf;
 
if (!lstat(*path, st))
@@ -926,8 +926,7 @@ static int open_sha1_file(struct repository *r,
int most_interesting_errno;
static struct strbuf buf = STRBUF_INIT;
 
-   strbuf_reset();
-   sha1_file_name(r, , sha1);
+   loose_object_path(r, , sha1);
*path = buf.buf;
 
fd = git_open(*path);
@@ -1626,8 +1625,7 @@ static int write_loose_object(const struct object_id 
*oid, 

[PATCH 9/9] fetch-pack: drop custom loose object cache

2018-11-12 Thread Jeff King
Commit 024aa4696c (fetch-pack.c: use oidset to check existence of loose
object, 2018-03-14) added a cache to avoid calling stat() for a bunch of
loose objects we don't have.

Now that OBJECT_INFO_QUICK handles this caching itself, we can drop the
custom solution.

Note that this might perform slightly differently, as the original code
stopped calling readdir() when we saw more loose objects than there were
refs. So:

  1. The old code might have spent work on readdir() to fill the cache,
 but then decided there were too many loose objects, wasting that
 effort.

  2. The new code might spend a lot of time on readdir() if you have a
 lot of loose objects, even though there are very few objects to
 ask about.

In practice it probably won't matter either way; see the previous commit
for some discussion of the tradeoff.

Signed-off-by: Jeff King 
---
 fetch-pack.c | 39 ++-
 1 file changed, 2 insertions(+), 37 deletions(-)

diff --git a/fetch-pack.c b/fetch-pack.c
index b3ed7121bc..25a88f4eb2 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -636,23 +636,6 @@ struct loose_object_iter {
struct ref *refs;
 };
 
-/*
- *  If the number of refs is not larger than the number of loose objects,
- *  this function stops inserting.
- */
-static int add_loose_objects_to_set(const struct object_id *oid,
-   const char *path,
-   void *data)
-{
-   struct loose_object_iter *iter = data;
-   oidset_insert(iter->loose_object_set, oid);
-   if (iter->refs == NULL)
-   return 1;
-
-   iter->refs = iter->refs->next;
-   return 0;
-}
-
 /*
  * Mark recent commits available locally and reachable from a local ref as
  * COMPLETE. If args->no_dependents is false, also mark COMPLETE remote refs as
@@ -670,30 +653,14 @@ static void mark_complete_and_common_ref(struct 
fetch_negotiator *negotiator,
struct ref *ref;
int old_save_commit_buffer = save_commit_buffer;
timestamp_t cutoff = 0;
-   struct oidset loose_oid_set = OIDSET_INIT;
-   int use_oidset = 0;
-   struct loose_object_iter iter = {_oid_set, *refs};
-
-   /* Enumerate all loose objects or know refs are not so many. */
-   use_oidset = !for_each_loose_object(add_loose_objects_to_set,
-   , 0);
 
save_commit_buffer = 0;
 
for (ref = *refs; ref; ref = ref->next) {
struct object *o;
-   unsigned int flags = OBJECT_INFO_QUICK;
 
-   if (use_oidset &&
-   !oidset_contains(_oid_set, >old_oid)) {
-   /*
-* I know this does not exist in the loose form,
-* so check if it exists in a non-loose form.
-*/
-   flags |= OBJECT_INFO_IGNORE_LOOSE;
-   }
-
-   if (!has_object_file_with_flags(>old_oid, flags))
+   if (!has_object_file_with_flags(>old_oid,
+   OBJECT_INFO_QUICK))
continue;
o = parse_object(the_repository, >old_oid);
if (!o)
@@ -710,8 +677,6 @@ static void mark_complete_and_common_ref(struct 
fetch_negotiator *negotiator,
}
}
 
-   oidset_clear(_oid_set);
-
if (!args->deepen) {
for_each_ref(mark_complete_oid, NULL);
for_each_cached_alternate(NULL, mark_alternate_complete);
-- 
2.19.1.1577.g2c5b293d4f


Re: [PATCH v2 1/2] git-rebase, sequencer: extend --quiet option for the interactive machinery

2018-11-12 Thread Johannes Schindelin
Hi Elijah,

On Wed, 7 Nov 2018, Elijah Newren wrote:

> While 'quiet' and 'interactive' may sound like antonyms, the interactive
> machinery actually has logic that implements several
> interactive_rebase=implied cases (--exec, --keep-empty, --rebase-merges)
> which won't pop up an editor.  The rewrite of interactive rebase in C
> added a quiet option, though it only turns stats off.  Since we want to
> make the interactive machinery also take over for git-rebase--merge, it
> should fully implement the --quiet option.
> 
> git-rebase--interactive was already somewhat quieter than
> git-rebase--merge and git-rebase--am, possibly because cherry-pick has
> just traditionally been quieter.  As such, we only drop a few
> informational messages -- "Rebasing (n/m)" and "Succesfully rebased..."

Makes sense.

> Also, for simplicity, remove the differences in how quiet and verbose
> options were recorded.  Having one be signalled by the presence of a
> "verbose" file in the state_dir, while the other was signalled by the
> contents of a "quiet" file was just weirdly inconsistent.  (This
> inconsistency pre-dated the rewrite into C.)  Make them consistent by
> having them both key off the presence of the file.

I am slightly concerned that some creative power user could have written
scripts that rely on this behavior.

But only *slightly* concerned.

The patch looks correct.

Ciao,
Dscho

> 
> Signed-off-by: Elijah Newren 
> ---
>  builtin/rebase.c  |  5 +
>  git-legacy-rebase.sh  |  2 +-
>  git-rebase--common.sh |  2 +-
>  sequencer.c   | 23 +--
>  sequencer.h   |  1 +
>  5 files changed, 17 insertions(+), 16 deletions(-)
> 
> diff --git a/builtin/rebase.c b/builtin/rebase.c
> index 0ee06aa363..be004406a6 100644
> --- a/builtin/rebase.c
> +++ b/builtin/rebase.c
> @@ -181,10 +181,7 @@ static int read_basic_state(struct rebase_options *opts)
>   if (get_oid(buf.buf, >orig_head))
>   return error(_("invalid orig-head: '%s'"), buf.buf);
>  
> - strbuf_reset();
> - if (read_one(state_dir_path("quiet", opts), ))
> - return -1;
> - if (buf.len)
> + if (file_exists(state_dir_path("quiet", opts)))
>   opts->flags &= ~REBASE_NO_QUIET;
>   else
>   opts->flags |= REBASE_NO_QUIET;
> diff --git a/git-legacy-rebase.sh b/git-legacy-rebase.sh
> index 75a08b2683..da27bfca5a 100755
> --- a/git-legacy-rebase.sh
> +++ b/git-legacy-rebase.sh
> @@ -113,7 +113,7 @@ read_basic_state () {
>   else
>   orig_head=$(cat "$state_dir"/head)
>   fi &&
> - GIT_QUIET=$(cat "$state_dir"/quiet) &&
> + test -f "$state_dir"/quiet && GIT_QUIET=t
>   test -f "$state_dir"/verbose && verbose=t
>   test -f "$state_dir"/strategy && strategy="$(cat "$state_dir"/strategy)"
>   test -f "$state_dir"/strategy_opts &&
> diff --git a/git-rebase--common.sh b/git-rebase--common.sh
> index 7e39d22871..dc18c682fa 100644
> --- a/git-rebase--common.sh
> +++ b/git-rebase--common.sh
> @@ -10,7 +10,7 @@ write_basic_state () {
>   echo "$head_name" > "$state_dir"/head-name &&
>   echo "$onto" > "$state_dir"/onto &&
>   echo "$orig_head" > "$state_dir"/orig-head &&
> - echo "$GIT_QUIET" > "$state_dir"/quiet &&
> + test t = "$GIT_QUIET" && : > "$state_dir"/quiet
>   test t = "$verbose" && : > "$state_dir"/verbose
>   test -n "$strategy" && echo "$strategy" > "$state_dir"/strategy
>   test -n "$strategy_opts" && echo "$strategy_opts" > \
> diff --git a/sequencer.c b/sequencer.c
> index 9e1ab3a2a7..bd8337dbf1 100644
> --- a/sequencer.c
> +++ b/sequencer.c
> @@ -150,6 +150,7 @@ static GIT_PATH_FUNC(rebase_path_refs_to_delete, 
> "rebase-merge/refs-to-delete")
>  static GIT_PATH_FUNC(rebase_path_gpg_sign_opt, "rebase-merge/gpg_sign_opt")
>  static GIT_PATH_FUNC(rebase_path_orig_head, "rebase-merge/orig-head")
>  static GIT_PATH_FUNC(rebase_path_verbose, "rebase-merge/verbose")
> +static GIT_PATH_FUNC(rebase_path_quiet, "rebase-merge/quiet")
>  static GIT_PATH_FUNC(rebase_path_signoff, "rebase-merge/signoff")
>  static GIT_PATH_FUNC(rebase_path_head_name, "rebase-merge/head-name")
>  static GIT_PATH_FUNC(rebase_path_onto, "rebase-merge/onto")
> @@ -157,7 +158,6 @@ static GIT_PATH_FUNC(rebase_path_autostash, 
> "rebase-merge/autostash")
>  static GIT_PATH_FUNC(rebase_path_strategy, "rebase-merge/strategy")
>  static GIT_PATH_FUNC(rebase_path_strategy_opts, "rebase-merge/strategy_opts")
>  static GIT_PATH_FUNC(rebase_path_allow_rerere_autoupdate, 
> "rebase-merge/allow_rerere_autoupdate")
> -static GIT_PATH_FUNC(rebase_path_quiet, "rebase-merge/quiet")
>  
>  static int git_sequencer_config(const char *k, const char *v, void *cb)
>  {
> @@ -2307,6 +2307,9 @@ static int read_populate_opts(struct replay_opts *opts)
>   if (file_exists(rebase_path_verbose()))
>   opts->verbose = 1;
>  
> + if (file_exists(rebase_path_quiet()))
> + 

  1   2   >