date:20160630

Re: preview: What's cooking in git.git (Jun 2016, #10; Tue, 28)

2016-06-30 Thread Jeff King

On Wed, Jun 29, 2016 at 07:41:36AM +0200, Johannes Sixt wrote:

> Am 29.06.2016 um 03:43 schrieb Jeff King:
> > Another is to just put the posix/ksh schemes into the helper function,
> > and let Windows people sort it out later if they want to.
> 
> Let's do this.

OK, here's a replacement for the first patch in test-match-signal (the
others do not need touched at all). It punts on Windows entirely.

It does retain the existing check for "3" in t0005 (though I note in
04422c7 you skipped the next test entirely in MINGW, and a similar
argument could perhaps apply here).

I suspect that supporting Windows in test_match_sigpipe would require
at least:

  - checking for code 3 to cover cases where we raise (either because of
check_pipe(), or because we are doing the pop-and-raise thing in our
cleanup handlers)

  - checking for 128 when we want sigpipe, in case we get EPIPE and it
caused a die().

  - figuring out what the heck happens when you "kill -15" a git process
and mapping that

But those are all just guesses (and we'd probably want to hide them
behind "if test_have_prerequisite MINGW" to avoid making the tests less
robust elsewhere).

This patch should change nothing at all for Windows, so at least we are
not making things worse.

-- >8 --
Subject: [PATCH] tests: factor portable signal check out of t0005

In POSIX shells, a program which exits due to a signal
generally has an exit code of 128 plus the signal number.
However, ksh uses 256 plus the signal number.  We've
accounted for that in t0005, but not in other tests.  Let's
pull out the logic so we can use it elsewhere.

It would be nice for debugging if this additionally printed
errors to stderr, like our other test_* helpers. But we're
going to need to use it in other places besides the innards
of a test_expect block. So let's leave it as generic as
possible.

Note that we also leave the magic "3" for Windows out of the
generic helper. This is an artifact of the way we use
raise() to kill ourselves in test-sigchain.c, and will not
necessarily apply to all programs. So it's better to keep it
out of the helper, to reduce the chance of confusing it with
a real call to exit(3).

Signed-off-by: Jeff King 
---
 t/t0005-signals.sh  | 13 +++--
 t/test-lib-functions.sh | 15 +++
 2 files changed, 22 insertions(+), 6 deletions(-)

diff --git a/t/t0005-signals.sh b/t/t0005-signals.sh
index e7f27eb..95f8c05 100755
--- a/t/t0005-signals.sh
+++ b/t/t0005-signals.sh
@@ -11,12 +11,13 @@ EOF

 test_expect_success 'sigchain works' '
{ test-sigchain >actual; ret=$?; } &&
-   case "$ret" in
-   143) true ;; # POSIX w/ SIGTERM=15
-   271) true ;; # ksh w/ SIGTERM=15
- 3) true ;; # Windows
- *) false ;;
-   esac &&
+   {
+   # Signal death by raise() on Windows acts like exit(3),
+   # regardless of the signal number. So we must allow that
+   # as well as the normal signal check.
+   test_match_signal 15 "$ret" ||
+   test "$ret" = 3
+   } &&
test_cmp expect actual
 '

diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh
index 48884d5..15ef3f8 100644
--- a/t/test-lib-functions.sh
+++ b/t/test-lib-functions.sh
@@ -961,3 +961,18 @@ test_env () {
done
)
 }
+
+# Returns true if the numeric exit code in "$2" represents the expected signal
+# in "$1". Signals should be given numerically.
+test_match_signal () {
+   if test "$2" = "$((128 + $1))"
+   then
+   # POSIX
+   return 0
+   elif test "$2" = "$((256 + $1))"
+   then
+   # ksh
+   return 0
+   fi
+   return 1
+}
-- 
2.9.0.317.g65b4e7c

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 1/2] t3404: fix another typo

2016-06-30 Thread Johannes Schindelin

Hi Junio,

On Wed, 29 Jun 2016, Junio C Hamano wrote:

> Johannes Schindelin  writes:
> 
> > On Wed, 29 Jun 2016, Junio C Hamano wrote:
> >
> >> Johannes Schindelin  writes:
> >> 
> >> > The past tense of "to run" is "run", not "ran".
> >> 
> >> Actually, past tense of the verb "to run" is "ran" ;-) The reason
> >> why this patch is still correct is because this is writing in
> >> passive voice, where you want "be + past participle", and the past
> >> participle of the verb "to run" is "run".
> >
> > Embarrassing! ;-)
> >
> > Well, I shall fix up the commit message thusly:
> >
> > t3404: fix another typo
> >
> > The passive form of "to run" is "run", not "ran".
> 
> If your convention of referring to a verb is to show its infinitive
> form, i.e. "to run", then you would probably want to say
> 
>   The passive from of "to run" is "to be run", not "to be
>   ran".
> 
> But I think we do not need any of this if we just retitled it to
> 
>   Subject: t3404: fix a grammo (commands are ran -> commands are run)
> 
> without any body.

Done. Will be part of v2.

Ciao,
Dscho
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 2/2] t3404: add a test for the --gpg-sign option

2016-06-30 Thread Johannes Schindelin

Hi Junio,

On Wed, 29 Jun 2016, Junio C Hamano wrote:

> Johannes Schindelin  writes:
> 
> > To keep the time t3404 requires short (in this developer's Windows
> > setup, this single test already takes a painful 8 minutes to pass),
> > we avoid a full-blown GPG test and cop out by verifying the message
> > displayed to the user upon an 'edit' command.
> 
> This is a tangent, but I wonder if we should be solving it by
> parallelizing the tests even more.  If the script for example
> can be split into part1 and part2 that share the same set-up
> that is prepared by the very first test, we could split this
> into three files (common one that is dot-sourced by two actual
> test scripts that have part1 and part2).

Sure, that would work around the problem in the short run.

In the long run, the only real solution would be to stop relying on shell
scripting so much, because the biggest performance hit on Windows stems
from the fact that our test suite is a big honking shell script.

> > diff --git a/t/t3404-rebase-interactive.sh b/t/t3404-rebase-interactive.sh
> > index c7ea8ba..4c96075 100755
> > --- a/t/t3404-rebase-interactive.sh
> > +++ b/t/t3404-rebase-interactive.sh
> > @@ -1281,4 +1281,11 @@ test_expect_success 'editor saves as CR/LF' '
> > )
> >  '
> >  
> > +EPIPHANY="'"
> 
> Why?  If you really need a variable, SQ is probably the way this
> codebase typically spell it.

Have you ever watched the movie "Hook"?

> > +test_expect_success 'rebase -i --gpg-sign=' '
> > +   set_fake_editor &&
> > +   FAKE_LINES="edit 1" git rebase -i --gpg-sign=\" HEAD^ >out 2>err &&
> > +   grep "$EPIPHANY-S\"$EPIPHANY" err
> 
> I am not sure what is going on here.  You are asking to sign using
> the keyid of a single double-quote, and expecting the message that
> says
> 
>   You can amend the commit now, with
> 
>   git commit --amend '-S"'

Precisely. That way, I not only verified that the key ID was correctly
passed around, but also that it is quoted properly.

>   ...
> 
> that has a substring '-S"' in it to ensure that the codepath to
> parse --gpg-sign= on the command line of "rebase", and to the
> message we see here are working correctly, without actually checking
> if GPG is invoked at all, or if it is invoked the key given by the
> option is correctly passed to the invocation?

Exactly. I want to test --gpg-sign even when there is no gpg executable
available.

> If so, can't you do that without confusing users by using keyid "?
> IOW, wouldn't using --gpg-sign=me work equally well?

Not really, because it is much easier to get quoting of "me" right than of
"\"".

I guess I could change the test to pass --gpg-sign="\"S I Gner\"".

Ciao,
Dscho
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 1/9] Report bugs consistently

2016-06-30 Thread Johannes Schindelin

Hi Eric,

On Wed, 29 Jun 2016, Eric Sunshine wrote:

> On Wed, Jun 29, 2016 at 7:36 AM, Johannes Schindelin
>  wrote:
> > The vast majority of error messages in Git's source code which report a
> > bug use the convention to prefix the message with "BUG:".
> > [...]
> > Signed-off-by: Johannes Schindelin 
> > ---
> > diff --git a/merge-recursive.c b/merge-recursive.c
> > @@ -1853,7 +1852,7 @@ int merge_trees(struct merge_options *o,
> > -   die(_("Unprocessed path??? %s"),
> > +   die(_("BUG: unprocessed path??? %s"),
> 
> This and others downcase the first word (which is consistent with
> modern practice)...
> 
> > diff --git a/sha1_file.c b/sha1_file.c
> > @@ -795,7 +795,7 @@ void close_all_packs(void)
> > -   die("BUG! Want to close pack marked 
> > 'do-not-close'");
> > +   die("BUG: Want to close pack marked 
> > 'do-not-close'");
> 
> ...but this one neglects to.

Thanks! Will be fixed as part of v2.

Ciao,
Dscho
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 1/9] Report bugs consistently

2016-06-30 Thread Johannes Schindelin

Hi Junio,

On Wed, 29 Jun 2016, Junio C Hamano wrote:

> Johannes Schindelin  writes:
> 
> > The vast majority of error messages in Git's source code which report a
> > bug use the convention to prefix the message with "BUG:".
> 
> Good thing to do.
> 
> But if we were to review and apply a 200+ line patch, I wonder if we
> want to go one step further to allow us to write
> 
> BUG("killed-file %s not found", name);
> 
> instead.

If the idea is to make it easier to find, I would wager a guess that
'die("BUG:' would be just as good a search term. Even better, I think,
because 'BUG' would also match comments.

Ciao,
Dscho
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 9/9] am: make a direct call to merge_recursive

2016-06-30 Thread Johannes Schindelin

Hi Junio,

On Wed, 29 Jun 2016, Junio C Hamano wrote:

> Johannes Schindelin  writes:
> 
> > From: Junio C Hamano 
> 
> Did I write this thing?

Yes, you did. It was db05d6194d3f9ea9e64163944961d5f6e85302be as part of
pu@{2016-06-15}.

> Having two sets of numbers that illustrated that this is not really
> a useful optimization in the bigger picture looks vaguely familiar
> (e.g. $gmane/279417), but the numbers are different.

I beg to differ. While the performance improvement is not huge, even your
toy experiment shows it is significant, i.e. noticeable.

> > It feels *slightly* wrong to submit your own patch to review,
> > however, please keep in mind that
> >
> > 1) I changed the patch (o.gently does not exist anymore, so I do
> >not set it), and
> >
> > 2) I added my own timings performed on Windows.
> 
> It probably is much less confusing if you take the authorship,
> possibly with a passing reference to whatever I wrote as the source
> of inspiration in the log message, or something.  I do not think I
> deserve a credit in this 9-patch series.

The patch is an almost verbatim copy of db05d6194, I only removed the
now-obsolete "gently" setting, is all.

That means that I feel really, really uneasy about claiming authorship
because I did not write it.

If you really want me to, I will take custody of this patch and rewrite
the commit message as well using --reset--author, of course.

Ciao,
Dscho
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 1/9] Report bugs consistently

2016-06-30 Thread Johannes Schindelin

Hi Hannes,

On Thu, 30 Jun 2016, Johannes Sixt wrote:

> Am 29.06.2016 um 13:36 schrieb Johannes Schindelin:
> > @@ -955,9 +955,8 @@ static struct merge_file_info merge_file_1(struct
> > merge_options *o,
> >
> >  if (!sha_eq(a->sha1, b->sha1))
> > result.clean = 0;
> > -   } else {
> > -   die(_("unsupported object type in the tree"));
> > -   }
> > +   } else
> > +   die(_("BUG: unsupported object type in the tree"));
> 
> Would it perhaps make sense to remove the _() markup (here and a few more
> instances in this patch)? It's simpler for developers to find the code
> location when a "BUG:" is reported untranslated.

I would agree, but the purpose of this patch was not to fix that, but to
fix the inconsistency of the message.

Maybe as an add-on patch, with *all* 'die(_("BUG:' instances converted?
That would be even more outside the purview of my patch series than
touching the bug reports outside merge-recursive.c, though.

Ciao,
Dscho
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH v4 0/5] friendlier handling of overflows in archive-tar

2016-06-30 Thread Jeff King

This is a re-roll of the jk/big-and-future-archive-tar topic. It
addresses all but one of the review comments, and I hope should be
pretty polished.

The changes are:

  - the dependency on bunzip2 is dropped; instead, we just provide a
partial object for the 64GB blob. See the first commit message for
details.

  - the portable "head -c" replacement from t9300 has been factored out,
and we use it in the new tests

  - symbolic constants for the giant octal numbers (with a comment
warning that the values are set by posix)

  - the comments for tar_info() and the lazy-prereq were split so the
two aren't mashed together

  - uses awk in tar_info() instead of "sed | cut"

  - extra simplification in the final commit, as suggested by review

  - typo and awkwardness fixes in the commit messages

The one thing that isn't fixed is the use of "141" to test for sigpipe
death. That should use test_match_signal, but that topic just got
re-rolled, too.

  [1/5]: t9300: factor out portable "head -c" replacement
  [2/5]: t5000: test tar files that overflow ustar headers
  [3/5]: archive-tar: write extended headers for file sizes >= 8GB
  [4/5]: archive-tar: write extended headers for far-future mtime
  [5/5]: archive-tar: drop return value

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH v4 5/5] archive-tar: drop return value

2016-06-30 Thread Jeff King

We never do any error checks, and so never return anything
but "0". Let's just drop this to simplify the code.

Signed-off-by: Jeff King 
---
 archive-tar.c | 11 ---
 1 file changed, 4 insertions(+), 7 deletions(-)

diff --git a/archive-tar.c b/archive-tar.c
index d671bc3..7ea4e90 100644
--- a/archive-tar.c
+++ b/archive-tar.c
@@ -319,13 +319,12 @@ static int write_tar_entry(struct archiver_args *args,
return err;
 }
 
-static int write_global_extended_header(struct archiver_args *args)
+static void write_global_extended_header(struct archiver_args *args)
 {
const unsigned char *sha1 = args->commit_sha1;
struct strbuf ext_header = STRBUF_INIT;
struct ustar_header header;
unsigned int mode;
-   int err = 0;
 
if (sha1)
strbuf_append_ext_header(&ext_header, "comment",
@@ -337,7 +336,7 @@ static int write_global_extended_header(struct 
archiver_args *args)
}
 
if (!ext_header.len)
-   return 0;
+   return;
 
memset(&header, 0, sizeof(header));
*header.typeflag = TYPEFLAG_GLOBAL_HEADER;
@@ -347,7 +346,6 @@ static int write_global_extended_header(struct 
archiver_args *args)
write_blocked(&header, sizeof(header));
write_blocked(ext_header.buf, ext_header.len);
strbuf_release(&ext_header);
-   return err;
 }
 
 static struct archiver **tar_filters;
@@ -423,9 +421,8 @@ static int write_tar_archive(const struct archiver *ar,
 {
int err = 0;
 
-   err = write_global_extended_header(args);
-   if (!err)
-   err = write_archive_entries(args, write_tar_entry);
+   write_global_extended_header(args);
+   err = write_archive_entries(args, write_tar_entry);
if (!err)
write_trailer();
return err;
-- 
2.9.0.317.g65b4e7c
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH v4 3/5] archive-tar: write extended headers for file sizes >= 8GB

2016-06-30 Thread Jeff King

The ustar format has a fixed-length field for the size of
each file entry which is supposed to contain up to 11 bytes
of octal-formatted data plus a NUL or space terminator.

These means that the largest size we can represent is
0777, or 1 byte short of 8GB. The correct solution
for a larger file, according to POSIX.1-2001, is to add an
extended pax header, similar to how we handle long
filenames. This patch does that, and writes zero for the
size field in the ustar header (the last bit is not
mentioned by POSIX, but it matches how GNU tar behaves with
--format=pax).

This should be a strict improvement over the current
behavior, which is to die in xsnprintf with a "BUG".
However, there's some interesting history here.

Prior to f2f0267 (archive-tar: use xsnprintf for trivial
formatting, 2015-09-24), we silently overflowed the "size"
field. The extra bytes ended up in the "mtime" field of the
header, which was then immediately written itself,
overwriting our extra bytes. What that means depends on how
many bytes we wrote.

If the size was 64GB or greater, then we actually overflowed
digits into the mtime field, meaning our value was
effectively right-shifted by those lost octal digits. And
this patch is again a strict improvement over that.

But if the size was between 8GB and 64GB, then our 12-byte
field held all of the actual digits, and only our NUL
terminator overflowed. According to POSIX, there should be a
NUL or space at the end of the field. However, GNU tar seems
to be lenient here, and will correctly parse a size up 64GB
(minus one) from the field. So sizes in this range might
have just worked, depending on the implementation reading
the tarfile.

This patch is mostly still an improvement there, as the 8GB
limit is specifically mentioned in POSIX as the correct
limit. But it's possible that it could be a regression
(versus the pre-f2f0267 state) if all of the following are
true:

  1. You have a file between 8GB and 64GB.

  2. Your tar implementation _doesn't_ know about pax
 extended headers.

  3. Your tar implementation _does_ parse 12-byte sizes from
 the ustar header without a delimiter.

It's probably not worth worrying about such an obscure set
of conditions, but I'm documenting it here just in case.

Helped-by: René Scharfe 
Signed-off-by: Jeff King 
---
 archive-tar.c   | 31 +--
 t/t5000-tar-tree.sh |  4 ++--
 2 files changed, 31 insertions(+), 4 deletions(-)

diff --git a/archive-tar.c b/archive-tar.c
index cb99df2..57a1540 100644
--- a/archive-tar.c
+++ b/archive-tar.c
@@ -18,6 +18,13 @@ static int tar_umask = 002;
 static int write_tar_filter_archive(const struct archiver *ar,
struct archiver_args *args);
 
+/*
+ * This is the max value that a ustar size header can specify, as it is fixed
+ * at 11 octal digits. POSIX specifies that we switch to extended headers at
+ * this size.
+ */
+#define USTAR_MAX_SIZE 0777UL
+
 /* writes out the whole block, but only if it is full */
 static void write_if_needed(void)
 {
@@ -137,6 +144,20 @@ static void strbuf_append_ext_header(struct strbuf *sb, 
const char *keyword,
strbuf_addch(sb, '\n');
 }
 
+/*
+ * Like strbuf_append_ext_header, but for numeric values.
+ */
+static void strbuf_append_ext_header_uint(struct strbuf *sb,
+ const char *keyword,
+ uintmax_t value)
+{
+   char buf[40]; /* big enough for 2^128 in decimal, plus NUL */
+   int len;
+
+   len = xsnprintf(buf, sizeof(buf), "%"PRIuMAX, value);
+   strbuf_append_ext_header(sb, keyword, buf, len);
+}
+
 static unsigned int ustar_header_chksum(const struct ustar_header *header)
 {
const unsigned char *p = (const unsigned char *)header;
@@ -208,7 +229,7 @@ static int write_tar_entry(struct archiver_args *args,
struct ustar_header header;
struct strbuf ext_header = STRBUF_INIT;
unsigned int old_mode = mode;
-   unsigned long size;
+   unsigned long size, size_in_header;
void *buffer;
int err = 0;
 
@@ -267,7 +288,13 @@ static int write_tar_entry(struct archiver_args *args,
memcpy(header.linkname, buffer, size);
}
 
-   prepare_header(args, &header, mode, size);
+   size_in_header = size;
+   if (S_ISREG(mode) && size > USTAR_MAX_SIZE) {
+   size_in_header = 0;
+   strbuf_append_ext_header_uint(&ext_header, "size", size);
+   }
+
+   prepare_header(args, &header, mode, size_in_header);
 
if (ext_header.len > 0) {
err = write_extended_header(args, sha1, ext_header.buf,
diff --git a/t/t5000-tar-tree.sh b/t/t5000-tar-tree.sh
index 950bdd3..93c2d34 100755
--- a/t/t5000-tar-tree.sh
+++ b/t/t5000-tar-tree.sh
@@ -360,7 +360,7 @@ test_expect_success 'set up repository with huge blob' '
 
 # We expect git to die with SIGPIPE here (otherwise we
 # would

[PATCH v4 4/5] archive-tar: write extended headers for far-future mtime

2016-06-30 Thread Jeff King

The ustar format represents timestamps as seconds since the
epoch, but only has room to store 11 octal digits.  To
express anything larger, we need to use an extended header.
This is exactly the same case we fixed for the size field in
the previous commit, and the solution here follows the same
pattern.

This is even mentioned as an issue in f2f0267 (archive-tar:
use xsnprintf for trivial formatting, 2015-09-24), but since
it only affected things far in the future, it wasn't deemed
worth dealing with. But note that my calculations claiming
thousands of years were off there; because our xsnprintf
produces a NUL byte, we only have until the year 2242 to fix
this.

Given that this is just around the corner (geologically
speaking, anyway), and because it's easy to fix, let's just
make it work. Unlike the previous fix for "size", where we
had to write an individual extended header for each file, we
can write one global header (since we have only one mtime
for the whole archive).

There's a slight bit of trickiness there. We may already be
writing a global header with a "comment" field for the
commit sha1. So we need to write our new field into the same
header. To do this, we push the decision of whether to write
such a header down into write_global_extended_header(),
which will now assemble the header as it sees fit, and will
return early if we have nothing to write (in practice, we'll
only have a large mtime if it comes from a commit, but this
makes it also work if you set your system clock ahead such
that time() returns a huge value).

Note that we don't (and never did) handle negative
timestamps (i.e., before 1970). This would probably not be
too hard to support in the same way, but since git does not
support negative timestamps at all, I didn't bother here.

After writing the extended header, we munge the timestamp in
the ustar headers to the maximum-allowable size. This is
wrong, but it's the least-wrong thing we can provide to a
tar implementation that doesn't understand pax headers (it's
also what GNU tar does).

Helped-by: René Scharfe 
Signed-off-by: Jeff King 
---
 archive-tar.c   | 19 ---
 t/t5000-tar-tree.sh |  4 ++--
 2 files changed, 18 insertions(+), 5 deletions(-)

diff --git a/archive-tar.c b/archive-tar.c
index 57a1540..d671bc3 100644
--- a/archive-tar.c
+++ b/archive-tar.c
@@ -22,8 +22,11 @@ static int write_tar_filter_archive(const struct archiver 
*ar,
  * This is the max value that a ustar size header can specify, as it is fixed
  * at 11 octal digits. POSIX specifies that we switch to extended headers at
  * this size.
+ *
+ * Likewise for the mtime (which happens to use a buffer of the same size).
  */
 #define USTAR_MAX_SIZE 0777UL
+#define USTAR_MAX_MTIME 0777UL
 
 /* writes out the whole block, but only if it is full */
 static void write_if_needed(void)
@@ -324,7 +327,18 @@ static int write_global_extended_header(struct 
archiver_args *args)
unsigned int mode;
int err = 0;
 
-   strbuf_append_ext_header(&ext_header, "comment", sha1_to_hex(sha1), 40);
+   if (sha1)
+   strbuf_append_ext_header(&ext_header, "comment",
+sha1_to_hex(sha1), 40);
+   if (args->time > USTAR_MAX_MTIME) {
+   strbuf_append_ext_header_uint(&ext_header, "mtime",
+ args->time);
+   args->time = USTAR_MAX_MTIME;
+   }
+
+   if (!ext_header.len)
+   return 0;
+
memset(&header, 0, sizeof(header));
*header.typeflag = TYPEFLAG_GLOBAL_HEADER;
mode = 0100666;
@@ -409,8 +423,7 @@ static int write_tar_archive(const struct archiver *ar,
 {
int err = 0;
 
-   if (args->commit_sha1)
-   err = write_global_extended_header(args);
+   err = write_global_extended_header(args);
if (!err)
err = write_archive_entries(args, write_tar_entry);
if (!err)
diff --git a/t/t5000-tar-tree.sh b/t/t5000-tar-tree.sh
index 93c2d34..96d208d 100755
--- a/t/t5000-tar-tree.sh
+++ b/t/t5000-tar-tree.sh
@@ -383,11 +383,11 @@ test_expect_success 'set up repository with far-future 
commit' '
git commit -m "tempori parendum"
 '
 
-test_expect_failure 'generate tar with future mtime' '
+test_expect_success 'generate tar with future mtime' '
git archive HEAD >future.tar
 '
 
-test_expect_failure TAR_HUGE 'system tar can read our future mtime' '
+test_expect_success TAR_HUGE 'system tar can read our future mtime' '
echo 4147 >expect &&
tar_info future.tar | cut -d" " -f2 >actual &&
test_cmp expect actual
-- 
2.9.0.317.g65b4e7c

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH v4 1/5] t9300: factor out portable "head -c" replacement

2016-06-30 Thread Jeff King

In shell scripts it is sometimes useful to be able to read
exactly N bytes from a pipe. Doing this portably turns out
to be surprisingly difficult.

We want a solution that:

  - is portable

  - never reads more than N bytes due to buffering (which
would mean those bytes are not available to the next
program to read from the same pipe)

  - handles partial reads by looping until N bytes or read
(or we see EOF)

  - is resilient to stray signals giving us EINTR while
trying to read (even though we don't send them, things
like SIGWINCH could cause apparently-random failures)

Some possible solutions are:

  - "head -c" is not portable, and implementations may
buffer (though GNU head does not)

  - "read -N" is a bash-ism, and thus not portable

  - "dd bs=$n count=1" does not handle partial reads. GNU dd
has iflags=fullblock, but that is not portable

  - "dd bs=1 count=$n" fixes the partial read problem (all
reads are 1-byte, so there can be no partial response).
It does make a lot of write() calls, but for our tests
that's unlikely to matter.  It's fairly portable. We
already use it in our tests, and it's unlikely that
implementations would screw up any of our criteria. The
most unknown one would be signal handling.

  - perl can do a sysread() loop pretty easily. On my Linux
system, at least, it seems to restart the read() call
automatically. If that turns out not to be portable,
though, it would be easy for us to handle it.

That makes the perl solution the least bad (because we
conveniently omitted "length of code" as a criterion).
It's also what t9300 is currently using, so we can just pull
the implementation from there.

Signed-off-by: Jeff King 
---
 t/t9300-fast-import.sh  | 23 +++
 t/test-lib-functions.sh | 14 ++
 2 files changed, 17 insertions(+), 20 deletions(-)

diff --git a/t/t9300-fast-import.sh b/t/t9300-fast-import.sh
index 74d740d..2e0ba3e 100755
--- a/t/t9300-fast-import.sh
+++ b/t/t9300-fast-import.sh
@@ -7,23 +7,6 @@ test_description='test git fast-import utility'
 . ./test-lib.sh
 . "$TEST_DIRECTORY"/diff-lib.sh ;# test-lib chdir's into trash
 
-# Print $1 bytes from stdin to stdout.
-#
-# This could be written as "head -c $1", but IRIX "head" does not
-# support the -c option.
-head_c () {
-   perl -e '
-   my $len = $ARGV[1];
-   while ($len > 0) {
-   my $s;
-   my $nread = sysread(STDIN, $s, $len);
-   die "cannot read: $!" unless defined($nread);
-   print $s;
-   $len -= $nread;
-   }
-   ' - "$1"
-}
-
 verify_packs () {
for p in .git/objects/pack/*.pack
do
@@ -2481,7 +2464,7 @@ test_expect_success PIPE 'R: copy using cat-file' '
 
read blob_id type size <&3 &&
echo "$blob_id $type $size" >response &&
-   head_c $size >blob <&3 &&
+   test_copy_bytes $size >blob <&3 &&
read newline <&3 &&
 
cat <<-EOF &&
@@ -2524,7 +2507,7 @@ test_expect_success PIPE 'R: print blob mid-commit' '
EOF
 
read blob_id type size <&3 &&
-   head_c $size >actual <&3 &&
+   test_copy_bytes $size >actual <&3 &&
read newline <&3 &&
 
echo
@@ -2559,7 +2542,7 @@ test_expect_success PIPE 'R: print staged blob within 
commit' '
echo "cat-blob $to_get" &&
 
read blob_id type size <&3 &&
-   head_c $size >actual <&3 &&
+   test_copy_bytes $size >actual <&3 &&
read newline <&3 &&
 
echo deleteall
diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh
index 48884d5..90856d6 100644
--- a/t/test-lib-functions.sh
+++ b/t/test-lib-functions.sh
@@ -961,3 +961,17 @@ test_env () {
done
)
 }
+
+# Read up to "$1" bytes (or to EOF) from stdin and write them to stdout.
+test_copy_bytes () {
+   perl -e '
+   my $len = $ARGV[1];
+   while ($len > 0) {
+   my $s;
+   my $nread = sysread(STDIN, $s, $len);
+   die "cannot read: $!" unless defined($nread);
+   print $s;
+   $len -= $nread;
+   }
+   ' - "$1"
+}
-- 
2.9.0.317.g65b4e7c

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH v4 6/5] t5000: use test_match_signal

2016-06-30 Thread Jeff King

On Thu, Jun 30, 2016 at 05:06:14AM -0400, Jeff King wrote:

> The one thing that isn't fixed is the use of "141" to test for sigpipe
> death. That should use test_match_signal, but that topic just got
> re-rolled, too.

And here's what the patch for that looks like (which can be applied if
this topic and jk/test-match-signal are merged).

-- >8 --
Subject: t5000: use test_match_signal

We are testing for sigpipe death from git, and doing so
portably requires using our test helper.

Signed-off-by: Jeff King 
---

Of course this does not help Windows at all, because we removed the "3"
check from jk/test-match-signal. So this new test will probably need to
be dealt with by Windows folks, one way or another (either extending
test_match_signal, or just skipping the exit code check under the MINGW
prereq).

 t/t5000-tar-tree.sh | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/t/t5000-tar-tree.sh b/t/t5000-tar-tree.sh
index 96d208d..6950d7d 100755
--- a/t/t5000-tar-tree.sh
+++ b/t/t5000-tar-tree.sh
@@ -365,8 +365,7 @@ test_expect_success 'generate tar with huge size' '
git archive HEAD
echo $? >exit-code
} | test_copy_bytes 4096 >huge.tar &&
-   echo 141 >expect &&
-   test_cmp expect exit-code
+   test_match_signal 13 "$(cat exit-code)"
 '

 test_expect_success TAR_HUGE 'system tar can read our huge size' '
-- 
2.9.0.317.g65b4e7c

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH v4 2/5] t5000: test tar files that overflow ustar headers

2016-06-30 Thread Jeff King

The ustar format only has room for 11 (or 12, depending on
some implementations) octal digits for the size and mtime of
each file. For values larger than this, we have to add pax
extended headers to specify the real data, and git does not
yet know how to do so.

Before fixing that, let's start off with some test
infrastructure, as designing portable and efficient tests
for this is non-trivial.

We want to use the system tar to check our output (because
what we really care about is interoperability), but we can't
rely on it:

  1. being able to read pax headers

  2. being able to handle huge sizes or mtimes

  3. supporting a "t" format we can parse

So as a prerequisite, we can feed the system tar a reference
tarball to make sure it can handle these features. The
reference tar here was created with:

  dd if=/dev/zero seek=64G bs=1 count=1 of=huge
  touch -d @68719476737 huge
  tar cf - --format=pax |
  head -c 2048

using GNU tar. Note that this is not a complete tarfile, but
it's enough to contain the headers we want to examine.

Likewise, we need to convince git that it has a 64GB blob to
output. Running "git add" on that 64GB file takes many
minutes of CPU, and even compressed, the result is 64MB. So
again, I pre-generated that loose object, and then took only
the first 2k of it. That should be enough to generate 2MB of
data before hitting an inflate error, which is plenty for us
to generate the tar header (and then die of SIGPIPE while
streaming the rest out).

The tests are split so that we test as much as we can even
with an uncooperative system tar. This actually catches the
current breakage (which is that we die("BUG") trying to
write the ustar header) on every system, and then on systems
where we can, we go farther and actually verify the result.

Helped-by: Robin H. Johnson 
Signed-off-by: Jeff King 
---
 t/t5000-tar-tree.sh  |  74 +++
 t/t5000/19f9c8273ec45a8938e6999cb59b3ff66739902a | Bin 0 -> 2048 bytes
 t/t5000/huge-and-future.tar  | Bin 0 -> 2048 bytes
 3 files changed, 74 insertions(+)
 create mode 100644 t/t5000/19f9c8273ec45a8938e6999cb59b3ff66739902a
 create mode 100644 t/t5000/huge-and-future.tar

diff --git a/t/t5000-tar-tree.sh b/t/t5000-tar-tree.sh
index 4b68bba..950bdd3 100755
--- a/t/t5000-tar-tree.sh
+++ b/t/t5000-tar-tree.sh
@@ -319,4 +319,78 @@ test_expect_success 'catch non-matching pathspec' '
test_must_fail git archive -v HEAD -- "*.abc" >/dev/null
 '
 
+# Pull the size and date of each entry in a tarfile using the system tar.
+#
+# We'll pull out only the year from the date; that avoids any question of
+# timezones impacting the result (as long as we keep our test times away from a
+# year boundary; our reference times are all in August).
+#
+# The output of tar_info is expected to be " ", both in decimal. It
+# ignores the return value of tar. We have to do this, because some of our test
+# input is only partial (the real data is 64GB in some cases).
+tar_info () {
+   "$TAR" tvf "$1" |
+   awk '{
+   split($4, date, "-")
+   print $3 " " date[1]
+   }'
+}
+
+# See if our system tar can handle a tar file with huge sizes and dates far in
+# the future, and that we can actually parse its output.
+#
+# The reference file was generated by GNU tar, and the magic time and size are
+# both octal 010001, which overflows normal ustar fields.
+test_lazy_prereq TAR_HUGE '
+   echo "68719476737 4147" >expect &&
+   tar_info "$TEST_DIRECTORY"/t5000/huge-and-future.tar >actual &&
+   test_cmp expect actual
+'
+
+test_expect_success 'set up repository with huge blob' '
+   obj_d=19 &&
+   obj_f=f9c8273ec45a8938e6999cb59b3ff66739902a &&
+   obj=${obj_d}${obj_f} &&
+   mkdir -p .git/objects/$obj_d &&
+   cp "$TEST_DIRECTORY"/t5000/$obj .git/objects/$obj_d/$obj_f &&
+   rm -f .git/index &&
+   git update-index --add --cacheinfo 100644,$obj,huge &&
+   git commit -m huge
+'
+
+# We expect git to die with SIGPIPE here (otherwise we
+# would generate the whole 64GB).
+test_expect_failure 'generate tar with huge size' '
+   {
+   git archive HEAD
+   echo $? >exit-code
+   } | test_copy_bytes 4096 >huge.tar &&
+   echo 141 >expect &&
+   test_cmp expect exit-code
+'
+
+test_expect_failure TAR_HUGE 'system tar can read our huge size' '
+   echo 68719476737 >expect &&
+   tar_info huge.tar | cut -d" " -f1 >actual &&
+   test_cmp expect actual
+'
+
+test_expect_success 'set up repository with far-future commit' '
+   rm -f .git/index &&
+   echo content >file &&
+   git add file &&
+   GIT_COMMITTER_DATE="@68719476737 +" \
+   git commit -m "tempori parendum"
+'
+
+test_expect_failure 'generate tar with future mtime' '
+   git archive HEAD >future.tar
+'
+
+test_expect_failure TAR_HUGE 'system tar can read our future mtime' '
+   echo 4147 >expe

Re: [PATCH 1/9] Report bugs consistently

2016-06-30 Thread Jeff King

On Thu, Jun 30, 2016 at 10:42:37AM +0200, Johannes Schindelin wrote:

> > > The vast majority of error messages in Git's source code which report a
> > > bug use the convention to prefix the message with "BUG:".
> > 
> > Good thing to do.
> > 
> > But if we were to review and apply a 200+ line patch, I wonder if we
> > want to go one step further to allow us to write
> > 
> > BUG("killed-file %s not found", name);
> > 
> > instead.
> 
> If the idea is to make it easier to find, I would wager a guess that
> 'die("BUG:' would be just as good a search term. Even better, I think,
> because 'BUG' would also match comments.

I have been tempted to switch to BUG(), because it would make it easy to
call abort() and get a coredump (and therefore a stack trace). On the
other hand:

  - we could always trigger such behavior in die() by looking for "BUG:" in
the output string. :)

  - it's also sometimes useful to get a stack trace from a regular
non-bug die(). So maybe something optional like:

  if (git_env_bool("GIT_ABORT_ON_DIE", 0))
  abort();

would be helpful (since you have to turn it on ahead of time, you
could also just run the program under gdb, of course; however, I
sometimes find that it's hard to get gdb where you want it because
git spawns so many sub-programs. Or maybe I just need to get better
at using gdb's child-following options).

The other thing BUG() would get us is that we could turn it into a macro
(on systems with vararg macros) and report things like __FILE__ and
__LINE__.  In practice, though our BUG messages are unique enough that
there is no problem finding the source.

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [RFD] Place to document magic pathspecs like ":/" and pathspec handling

2016-06-30 Thread Jakub Narębski

W dniu 2016-06-29 o 23:28, Junio C Hamano pisze:
> Jakub Narębski  writes:
> 
>> But I think it is not the best place to keep this documentation.
> 
> All true.  In case it was not obvious, I didn't mean to say "Here
> you find the information, shut up."  It was "here is a pointer if
> you didn't find it, so that you can use it as a starting point to
> make a better documentation."
> 
> An analogous entity in the world model of Git that appears
> everywhere and is not limited to a single command is the revision
> set calculus.  Where do we describe it and how do we make it
> discoverable?  Can the new way to describe pathspec and pathspec
> magic mimic the way it is done for the revisions?

Is that a trick question? :-P

The revision set calculus is described as a standalone documentation
in the gitrevisions(7) manpage (titled "specifying revisions and
[revision] ranges for Git"... well, the "[revision]" isn't there).
This documentation is also accessible via `git help revisions`,
which is cool.  It is referenced in the "Symbolic Identifiers"
section of the git(1) manual page.

Long time ago the description of revision set calculus was hidden
in the manpage for the plumbing command git-rev-parse, and is still
included in it (the common part is, via "include::revisions.txt[]").

As I wrote:

 Nowadays we have gitcli(7) manual page, but perhaps
 it would be better to create a separate manpage for issues related
 to pathspec handling (of which ":/" is only one part)... but then
 what should it be named?

So we could describe how Git handles pathspecs and pathspec magic
in the new manual page named gitpathspec(7), or gitpaths(7). The
former has the advantage of the name being identical to the entry
in gitglossary(7). The latter has the probable advantage of being
easier for the Git novice to find, and that it could be used to
gather various ways to generate list of files in Git (files in
the working area, files in the staging area aka the index, files
in the revision / tree object, changed files, etc.); the pathspec
in strict sense is about the input. Well, we could have 'manpage
alias' of gitpaths to gitpathspec, or vice versa.

 Sidenote: I wonder if people (especially novices) have problem
 finding relevant documentation, and if adding something like 
 "git apropos " command ("apropos", or "man -k"), or
 add the '--apropos' option to "git help" would be useful...
 and if it would be easy to create.

The description from gitglossary would be a good start, as would
parts of gitcli and relevant RelNotes. It would need to be linked
from git(1) manpage, and probably also from gitcli.

Now only to start it...
-- 
Jakub Narębski

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] builtin/apply: include declaration of cmd_apply()

2016-06-30 Thread Christian Couder

Hi Ramsay,

On Thu, Jun 30, 2016 at 3:47 AM, Ramsay Jones
 wrote:
>
> Hi Christian,
>
> If you need to re-roll your 'cc/apply-am' branch, could you please
> squash this into the relevant patch. Commit 95a3b0ba ("apply: move
> libified code from builtin/apply.c to apply.{c,h}", 22-04-2016)
> removed this '#include "builtin.h"' line, which causes sparse to
> issue a warning.

Ok, I will squash this.

Thanks,
Christian.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v8 33/41] write_or_die: use warning() instead of fprintf(stderr, ...)

2016-06-30 Thread Christian Couder

On Tue, Jun 28, 2016 at 11:39 PM, Junio C Hamano  wrote:
> Christian Couder  writes:
>
>> @@ -98,8 +97,7 @@ int write_or_whine_pipe(int fd, const void *buf, size_t 
>> count, const char *msg)
>>  int write_or_whine(int fd, const void *buf, size_t count, const char *msg)
>>  {
>>   if (write_in_full(fd, buf, count) < 0) {
>> - fprintf(stderr, "%s: write error (%s)\n",
>> - msg, strerror(errno));
>> + warning("%s: write error (%s)\n", msg, strerror(errno));
>>   return 0;
>>   }
>
> I do not think you call write_or_whine() at all.  As another topic
> in flight removes the last caller of this function, this hunk is
> very much unwelcome.  The only effect of it is to force me resolve
> unnecessary merge conflicts.

Ok, sorry about that. I will remove the hunk.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [RFC/PATCH v2 00/10] Add initial experimental external ODB support

2016-06-30 Thread Christian Couder

On Wed, Jun 29, 2016 at 10:01 PM, Eric Wong  wrote:
> Christian Couder  wrote:
>> Design discussion about performance
>> ~~~
>>
>> Yeah, it is not efficient to fork/exec a command to just read or write
>> one object to or from the external ODB. Batch calls and/or using a
>> daemon and/or RPC should be used instead to be able to store regular
>> objects in an external ODB. But for now the external ODB would be all
>> about really big files, where the cost of a fork+exec should not
>> matter much. If we later want to extend usage of external ODBs, yeah
>> we will probably need to design other mechanisms.
>
> I would also investigate switching run_command to use vfork+exec
> or posix_spawn for performance (keeping in mind vfork
> caveats documented at https://ewontfix.com/7/ )

Thanks Eric for this idea and the test!
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: topological index field for commit objects

2016-06-30 Thread Jakub Narębski

W dniu 2016-06-30 o 00:00, Jeff King pisze:
> On Wed, Jun 29, 2016 at 11:49:35PM +0200, Jakub Narębski wrote:
> 
>>> So this is the ideal case for generation numbers (the worst cases are
>>> when the things you are looking for are in branchy, close history where
>>> the generation numbers don't tell you much; but in such cases the
>>> walking is usually not too bad).
>>
>> There are other approaches (special indices) that help reachability
>> queries beside "generation number".
> 
> Yes, though generation numbers can help with more questions (e.g., "what
> is the merge base").

Well, those special indices usually need generation number anyway. For
example FELINE indices^1 (from "Reachability Queries in Very Large Graphs",
http://openproceedings.org/EDBT/2014/paper_166.pdf, that I found when
trying to find more about compressed bitmap indices used by Git) also
utilize generation numbers to speed up reachability analysis. The idea
behind FELINE is to associate every vertex (commit) with a unique ordered
pair of natural integers (i.e. two more numbers, in addition to the level
aka generation number), so that partial order over N^2 is superset of
partial order of graph (DAG of revisions). It can answer in O(1) that
nodes are not connected, and cut search space if they are.

BTW. some references in this paper ("Related work" section) use PWAH
compression scheme - perhaps it would work with EWAH that Git uses?

1. FELINE = Fast rEfined onLINE search ... or fun with acronyms

>> By the way, what should happen if you add a replacement (in the git-replace
>> meaning) that creates a shortcut, therefore invalidating generation numbers,
>> at least in strict sense - committerdate as generation number would be still
>> good, I think?
> 
> This is one of the open questions. My older patches turned them off when
> replacements and grafts are in effect.

Well, if you store the cache of generation numbers in the packfile, or in
the index of the packfile, or in the bitmap file, or in separate bitmap-like
file, generating them on repack, then of course any grafts or replacements
invalidate them... though for low level commands (like object counting)
replacements are transparent -- or rather they are (and can be) treated as
any other ref for reachability analysis.

Well, if there are no grafts, you could still use them for doing
"git --no-replace-objects log ...", isn't it?

SIDENOTE: should grafts be deprecated and later removed, now that Git has
superior alternative in the form of git-replace, and a simple way to convert
grafts to replacements?

Anyway, if file with mapping from revisions to its generation numbers
were stored outside of packfiles, it could simply be updated / rewritten
when adding new replacement. You could use one cache for no-replace,
and one for replace. Though I do wonder if it would be possible to do
such rewrite fast... well, fast enough assuming that adding replacements
is rare.

Answering my own question:
>> committerdate as generation number would be still good, I think?
No, it wouldn't:

  pre replace:

1 <-- 2 <-- 3 <-- 5 <--
   \
\- 4 <-- 6 <-- 7 <-- 8

  post replace

   1 <-- 2  \-- 3' <-- 5
  \  \
   \-- 4 <-- 6 <-- 7 <-- 8 <--\

Either the replacement commit would have generation number correct, but
its child wouldn't, or its child would have correct generation number but
the replacement wouldn't.

>>> I have patches that generate and store the numbers at pack time, similar
>>> to the way we do the reachability bitmaps.

Ah, so those cached generation numbers are generated and stored at pack
time. Where you store them: is it a separate file? Bitmap file? Packfile?

>>>They're not production ready,
>>> but they could probably be made so without too much effort. You wouldn't
>>> have ready-made generation numbers for commits since the last full
>>> repack, but you can compute them incrementally based on what you do have
>>> at a cost linear to the unpacked commits (this is the same for bitmaps).
>>
>> Do Git use EWAH / EWOK bitmaps for reachability analysis, or is it still
>> limited to object counting?
> 
> At GitHub we are using them for --contains analysis, along with mass
> ahead/behind (e.g., as in https://github.com/gitster/git/branches). My
> plan is to send patches upstream, but they need some cleanup first.

That would be nice to have, please.

Er, is mass ahead/behind something that can be plugged into Git
(e.g. for "git branch -v -v"), or is it something GitHub-specific?

P.S. Having Git ensure that committerdate (as an epoch) is greater
than committerdates of its parents at the commit creation time (with
providing warning about time skew, perhaps not doing it if skew is
too large) would be not costly. No need to add anything, and it would
help future queries to be faster, I think.

-- 
Jakub Narębski

--
To unsubscribe from this list

Re: git tag --contains for cherry-picks

2016-06-30 Thread Jakub Narębski

W dniu 2016-06-30 o 08:22, Jeff King pisze:
> On Wed, Jun 29, 2016 at 12:48:33PM +0100, Laszlo Papp wrote:
> 
>> Old releases are maintained with important bug fixes or even new features
>> in our case. It sometimes means that we need to cherry-pick commits across
>> branches, like from master to a specific release branch.
>>
>> Cherry-picking changes the hash of the commit, therefore, this may no
>> longer work for cherry-picks:
>>
>> git tag --contains
>>
>> I am thinking of having something like:
>>
>> git tag --contains-follow
>>
>> which would follow cherry-picks. I am not sure how easily and/or
>> efficiently this can be implemented, but my gut feeling is that in the vast
>> majority of the cases, the content check would bail out already at the
>> "subject line".
> 
> Git generally considers commits "equivalent" based on the patch-id, whic
> his a sha1 of the diff (modulo some canonicalization).

The problem with patch based equivalence is that for cherry-picking
on old release branches you might need to modify a patch for it to
apply - and then patch-id might not detect it.

[...]
> Of course there are other ways of determining commit equivalence. You
> could find ones with duplicate commit messages, or duplicate subjects,
> or whatever. But if you have a cherry-picking workflow, I suspect the
> easiest thing may be to simply use "git cherry-pick -x", which will
> write the sha1 of the original commit into the cherry-picked commit
> message. You can then use that to correlate directly.

"git log --grep=" would help there (or even match the specific
message that "git cherry-pick -x" adds).

-- 
Jakub Narębski

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH v3 3/3] grep: fix grepping for "intent to add" files

2016-06-30 Thread Charles Bailey

From: Charles Bailey 

This reverts commit 4d5520053 (grep: make it clear i-t-a entries are
ignored, 2015-12-27) and adds an alternative fix to maintain the -L
--cached behavior.

4d5520053 caused 'git grep' to no longer find matches in new files in
the working tree where the corresponding index entry had the "intent to
add" bit set, despite the fact that these files are tracked.

The content in the index of a file for which the "intent to add" bit is
set is considered indeterminate and not empty. For most grep queries we
want these to behave the same, however for -L --cached (files without a
match) we don't want to respond positively for "intent to add" files as
their contents are indeterminate. This is in contrast to files with
empty contents in the index (no lines implies no matches for any grep
query expression) which should be reported in the output of a grep -L
--cached invocation.

Add tests to cover this case and a few related cases which previously
lacked coverage.

Helped-by: Nguyễn Thái Ngọc Duy 
Signed-off-by: Charles Bailey 
---
 builtin/grep.c  |  4 ++--
 t/t7810-grep.sh | 58 +
 2 files changed, 60 insertions(+), 2 deletions(-)

diff --git a/builtin/grep.c b/builtin/grep.c
index 462e607..ae73831 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -386,7 +386,7 @@ static int grep_cache(struct grep_opt *opt, const struct 
pathspec *pathspec, int
 
for (nr = 0; nr < active_nr; nr++) {
const struct cache_entry *ce = active_cache[nr];
-   if (!S_ISREG(ce->ce_mode) || ce_intent_to_add(ce))
+   if (!S_ISREG(ce->ce_mode))
continue;
if (!ce_path_match(ce, pathspec, NULL))
continue;
@@ -396,7 +396,7 @@ static int grep_cache(struct grep_opt *opt, const struct 
pathspec *pathspec, int
 * cache version instead
 */
if (cached || (ce->ce_flags & CE_VALID) || 
ce_skip_worktree(ce)) {
-   if (ce_stage(ce))
+   if (ce_stage(ce) || ce_intent_to_add(ce))
continue;
hit |= grep_sha1(opt, ce->sha1, ce->name, 0, ce->name);
}
diff --git a/t/t7810-grep.sh b/t/t7810-grep.sh
index 6e6eaa4..c18e954 100755
--- a/t/t7810-grep.sh
+++ b/t/t7810-grep.sh
@@ -1364,4 +1364,62 @@ test_expect_success 'grep --color -e A --and -e B -p 
with context' '
test_cmp expected actual
 '
 
+test_expect_success 'grep can find things only in the work tree' '
+   : >work-tree-only &&
+   git add work-tree-only &&
+   test_when_finished "git rm -f work-tree-only" &&
+   echo "find in work tree" >work-tree-only &&
+   git grep --quiet "find in work tree" &&
+   test_must_fail git grep --quiet --cached "find in work tree" &&
+   test_must_fail git grep --quiet "find in work tree" HEAD
+'
+
+test_expect_success 'grep can find things only in the work tree (i-t-a)' '
+   echo "intend to add this" >intend-to-add &&
+   git add -N intend-to-add &&
+   test_when_finished "git rm -f intend-to-add" &&
+   git grep --quiet "intend to add this" &&
+   test_must_fail git grep --quiet --cached "intend to add this" &&
+   test_must_fail git grep --quiet "intend to add this" HEAD
+'
+
+test_expect_success 'grep does not search work tree with assume unchanged' '
+   echo "intend to add this" >intend-to-add &&
+   git add -N intend-to-add &&
+   git update-index --assume-unchanged intend-to-add &&
+   test_when_finished "git rm -f intend-to-add" &&
+   test_must_fail git grep --quiet "intend to add this" &&
+   test_must_fail git grep --quiet --cached "intend to add this" &&
+   test_must_fail git grep --quiet "intend to add this" HEAD
+'
+
+test_expect_success 'grep can find things only in the index' '
+   echo "only in the index" >cache-this &&
+   git add cache-this &&
+   rm cache-this &&
+   test_when_finished "git rm --cached cache-this" &&
+   test_must_fail git grep --quiet "only in the index" &&
+   git grep --quiet --cached "only in the index" &&
+   test_must_fail git grep --quiet "only in the index" HEAD
+'
+
+test_expect_success 'grep does not report i-t-a with -L --cached' '
+   echo "intend to add this" >intend-to-add &&
+   git add -N intend-to-add &&
+   test_when_finished "git rm -f intend-to-add" &&
+   git ls-files | grep -v "^intend-to-add\$" >expected &&
+   git grep -L --cached "nonexistent_string" >actual &&
+   test_cmp expected actual
+'
+
+test_expect_success 'grep does not report i-t-a and assume unchanged with -L' '
+   echo "intend to add this" >intend-to-add-assume-unchanged &&
+   git add -N intend-to-add-assume-unchanged &&
+   test_when_finished "git rm -f intend-to-add-assume-unchanged" &&
+   git update-index --assume-unchanged intend-to-add-assume-unch

[PATCH v3 0/3] Grepping with intent to add

2016-06-30 Thread Charles Bailey

So I've got back around to this topic again.

I've applied fixes to the tests as suggested by Eric and Junio.

I came up with a test case that demonstrates a difference between the
additional fix that Duy suggested and the alternative that Junio
suggested.

I've kept Duy's fix because I think it makes more sense, although it's
a sufficiently obscure case that I don't feel strongly that it's
definitely the best behavior.

The fix ensures that if you have a file which is both "intend to add"
and "assume unchanged" that it is not listed if you "grep -L" for for
something. In effect, we are applying the "contents indeterminate" state
of the index to the working tree file.

Charles Bailey (3):
  t7810-grep.sh: fix duplicated test name
  t7810-grep.sh: fix a whitespace inconsistency
  grep: fix grepping for "intent to add" files

 builtin/grep.c  |  4 ++--
 t/t7810-grep.sh | 62 +++--
 2 files changed, 62 insertions(+), 4 deletions(-)

-- 
2.8.2.311.gee88674

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH v3 2/3] t7810-grep.sh: fix a whitespace inconsistency

2016-06-30 Thread Charles Bailey

Signed-off-by: Charles Bailey 
---
 t/t7810-grep.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/t/t7810-grep.sh b/t/t7810-grep.sh
index c4302ed..6e6eaa4 100755
--- a/t/t7810-grep.sh
+++ b/t/t7810-grep.sh
@@ -175,7 +175,7 @@ do
 
test_expect_success "grep -c $L (no /dev/null)" '
! git grep -c test $H | grep /dev/null
-'
+   '
 
test_expect_success "grep --max-depth -1 $L" '
{
-- 
2.8.2.311.gee88674

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH v3 1/3] t7810-grep.sh: fix duplicated test name

2016-06-30 Thread Charles Bailey

Signed-off-by: Charles Bailey 
---
 t/t7810-grep.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/t/t7810-grep.sh b/t/t7810-grep.sh
index 1e72971..c4302ed 100755
--- a/t/t7810-grep.sh
+++ b/t/t7810-grep.sh
@@ -353,7 +353,7 @@ test_expect_success 'grep -l -C' '
 cat >expected

Re: [PATCH v13 04/20] index-helper: new daemon for caching index and related stuff

2016-06-30 Thread Johannes Schindelin

Hi Dave,

On Sun, 26 Jun 2016, David Turner wrote:

> diff --git a/t/t7900-index-helper.sh b/t/t7900-index-helper.sh
> new file mode 100755
> index 000..114c112
> --- /dev/null
> +++ b/t/t7900-index-helper.sh
> @@ -0,0 +1,23 @@
> +#!/bin/sh
> +#
> +# Copyright (c) 2016, Twitter, Inc
> +#
> +
> +test_description='git-index-helper
> +
> +Testing git index-helper
> +'
> +
> +. ./test-lib.sh
> +
> +test -n "$NO_MMAP" && {
> + skip_all='skipping index-helper tests: no mmap'
> + test_done
> +}

Even when NO_MMAP is empty, there might be no Unix sockets available (such
as is the case on Windows). In any case, you really only want to skip
these tests when index-helper is not available, so would you mind
squashing this patch in?

-- snipsnap --
From: Johannes Schindelin 
Subject: [PATCH] fixup! index-helper: new daemon for caching index and related
 stuff

---
 t/t7900-index-helper.sh | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/t/t7900-index-helper.sh b/t/t7900-index-helper.sh
index 6c9b4dd..12b5bf7 100755
--- a/t/t7900-index-helper.sh
+++ b/t/t7900-index-helper.sh
@@ -10,8 +10,10 @@ Testing git index-helper
 
 . ./test-lib.sh
 
-test -n "$NO_MMAP" && {
-   skip_all='skipping index-helper tests: no mmap'
+git index-helper -h 2>/dev/null
+test $? = 129 ||
+{
+   skip_all='skipping index-helper tests: no index-helper executable'
test_done
 }
 
-- 
2.9.0.270.g810e421

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: Problem with --shallow-submodules option

2016-06-30 Thread Istvan Zakar

Hello,

Thanks for your answers. I tested it after the changes were made on
the git server, and it seems to be working. But some other issue came
up.

We have quite many submodules in our project so I did some comaprision:

If I do a clone with these parameters:
--jobs 20 --recurse-submodules

The clone lasts ~53 seconds, and the total size of the folder is around 2 GB.

If I add the shallow-submodules option, the size of the folder will be
a bit below 1GB, so the size decreased as I expected, but the time of
the clone itself increased to 90 seconds. It seems the last step of
the command, checking out the submodules is executed one-by-one, and
not in parallel, so it seems at this step the jobs parameter does not
have effect.

Is it intentional, or there is some option I missed?

I'm using git 2.9.0 on client side.

Thanks,
   Istvan

ps: if I update the submodules with --depth 1 parameter in parallel
using xargs it lasts about 18 seconds, so it's a workaround for this
issue, but it would be nice to do it with a single command.

On 22 June 2016 at 17:31, Fredrik Gustafsson  wrote:
> On Mon, Jun 20, 2016 at 01:06:39PM +, Istvan Zakar wrote:
>> I'm working on a relatively big project with many submodules. During
>> cloning for testing I tried to decrease the amount of data need to be
>> fetched from the server by using --shallow-submodules option in the clone
>> command. It seems to check out the tip of the remote repo, and if it's not
>> the commit registered in the superproject the submodule update fails
>> (obviously). Can I somehow tell to fetch that exact commit I need for my
>> superproject?
>
> Maybe. http://stackoverflow.com/questions/2144406/git-shallow-submodules
> gives a good overview of this problem.
>
> git fetches a branch and is shallow from that branch, which might be an
> other sha1 than the one the submodule points to, (as you say). This
> is/was one of the drawbacks with this method. However the since git 2.8,
> git will try to fetch the sha1 direct (and not the branch). So then it
> will work, if(!), the server supports direct access to sha1. This was
> previously not allowed due to security concerns (if I recall correctly).
>
> So the answer is, yes this will work if you've a recent version of git
> and support on the server side for doing this. Unfortunately I'm not
> sure which git version is needed on the server side for this to work.
>
> --
> Fredrik Gustafsson
>
> phone: +46 733-608274
> e-mail: iv...@iveqy.com
> website: http://www.iveqy.com
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] diff compaction heuristic: favor shortest neighboring blank lines

2016-06-30 Thread Michael Haggerty

On 06/23/2016 07:10 PM, Michael Haggerty wrote:
> On 06/17/2016 06:09 PM, Stefan Beller wrote:
>> I think before spending more time on discussing and implementing new
>> (hopefully better) heuristics, I'd want to step back and try to be a bit more
>> systematic, i.e. I'll want to collect lots of test cases in different 
>> languages
>> and use cases. A mini test suite, maybe?
> 
> Stefan,
> 
> I've also been playing around with diff heuristics and also trying to
> find some good test data. Have you put together some test cases yet?

I put quite of work into tooling to evaluate diff heuristics, and just
published it on GitHub:

https://github.com/mhagger/diff-slider-tools

The README there is hopefully enough to let people get started using it
to test their own favorite heuristics.

!!!
Please be careful! The scripts are not hardened against strangely-named
files!
!!!

I've run a bunch of tests of the heuristic that I described in the
previous email against a bunch of repositories (to try to get a sampling
of popular languages): ant, bugzilla, couchdb, docker, git, ipython,
junit, nodejs, phpmyadmin, test-more, test-unit, xmonad.

I haven't done a lot of systematic analysis yet, nor tried to optimize
the heuristic systematically, but by eye the results look very good.
Some constructs (C preprocessor directives, Shell here documents)
sometimes confuse it because of their weird indentation, and of course
code with inconsistent indentation or strangely-placed blank lines
doesn't get the best results. But in a large majority of cases where it
disagrees with the standard Git or with `git --compaction-heuristic`, it
does a better job.

Below are a few examples to show off some typical results. I promise I
tried not to cherry pick only flattering examples.

In the output below, `|` mark the highest and lowest possible
positioning of the "slider" (a group of added or deleted lines whose
exact identity is ambiguous). Current Git, by default, always chooses
the lowest positioning. "c" is where Git master with
`--compaction-heuristic` chooses to position the slider, and `i` is the
choice of my indentation-based algorithm. The numbers are just the shift
relative to the standard shift.

Michael

The algorithm does pretty well with XML and HTML:

> fef3ea39f8bd474add292bb6437df6cbd22e1ba7:contributors.xml 
> a394a0bdf8e6240dc09023a8260059c57c158a00:contributors.xml + 1624
> 
>>Toni
>>  
>>  
>>Vincent
>>Legoll
>   -2 | >  
>   -1 |   i >  
>0 || ci >Vincent
>  || ci >Privat
>   | ci >  
>   | c  >  
>>Vimil
>>Saju
>>  
>>  
>>Vitold
>>Sedyshev
> 

> 789422e131b6c2c003d94f394169a64297e986c6:manual/Tasks/signjar.html 
> 7f51882300a8e40ff675867e055061867ba6c8bd:manual/Tasks/signjar.html + 151
> 
>>  
>>  
>>tsacert
>>alias in the keystore for a timestamp 
> authority for
>>timestamped JAR files in Java1.5+
>   -3 | >No
>   -2 | >  
>   -1 |   i >  
>0 || ci >tsaproxyhost
>  || ci >proxy host to be used when connecting to 
> TSA server
>  || ci >No
>  || ci >  
>  || ci >  
>  || ci >getTsaproxyport
>  || ci >proxy port to be used when connecting to 
> TSA server
>   | ci >No
>   | ci >  
>   | c  >  
>>executable
>>Specify a particular 
> jarsigner executable
>>  to use in place of the default binary (found in the 
> same JDK as
>>  Apache Ant is running in).
>>  Must support the same command line options as the Sun 
> JDK
>>  jarsigner command.
> 

Java:

> de5b4058b9bb039cdb17082fc543098de598ece2:src/main/org/apache/tools/ant/taskdefs/optional/ssh/AbstractSshMessage.java
>  
> 3fe363e2120342f6de2b219c3bd74efe9aea2803:src/main/org/apache/tools/ant/taskdefs/optional/ssh/AbstractSshMessage.java
>  + 208
> 
>> * @return true if the verbose attribute is set
>> * @since Ant 1.6.2
>> */
>>protected final boolean getVerbose() {
>>return verbose;
>   -3 | >}
>   -2 | >
>   -1 |  ci >/**
>0 || ci > * Is the compressed attribute set.
>  || ci > * @return true if the compressed attribute is set
>

Warm greatings

2016-06-30 Thread NASOLY STORE

Greetings from NASOLY STORE Supply , We are keen to placing an order for so
me items . Kindly let me know the kinds of credit card you take for payment
 .

Will be waiting to read back from you via : allenlescot...@gmail.com
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v13 04/20] index-helper: new daemon for caching index and related stuff

2016-06-30 Thread Duy Nguyen

On Thu, Jun 30, 2016 at 3:06 PM, Johannes Schindelin
 wrote:
> Even when NO_MMAP is empty, there might be no Unix sockets available (such
> as is the case on Windows). In any case, you really only want to skip
> these tests when index-helper is not available, so would you mind
> squashing this patch in?
>
> -- snipsnap --
> From: Johannes Schindelin 
> Subject: [PATCH] fixup! index-helper: new daemon for caching index and related
>  stuff
>
> ---
>  t/t7900-index-helper.sh | 6 --
>  1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/t/t7900-index-helper.sh b/t/t7900-index-helper.sh
> index 6c9b4dd..12b5bf7 100755
> --- a/t/t7900-index-helper.sh
> +++ b/t/t7900-index-helper.sh
> @@ -10,8 +10,10 @@ Testing git index-helper
>
>  . ./test-lib.sh
>
> -test -n "$NO_MMAP" && {
> -   skip_all='skipping index-helper tests: no mmap'
> +git index-helper -h 2>/dev/null
> +test $? = 129 ||

So when NO_MMAP is set, "git index-helper -h" will set $? to 1. And we
correctly skip the tests as well. It's a bit subtle though. How about
"git help -a|grep index-helper"?

> +{
> +   skip_all='skipping index-helper tests: no index-helper executable'
> test_done
>  }
>
> --
> 2.9.0.270.g810e421
>



-- 
Duy
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [RFD] Place to document magic pathspecs like ":/" and pathspec handling

2016-06-30 Thread Duy Nguyen

Hi Jakub,

Where have you been all these years? :D

On Thu, Jun 30, 2016 at 11:42 AM, Jakub Narębski  wrote:
> Nowadays we have gitcli(7) manual page, but perhaps
> it would be better to create a separate manpage for issues related
> to pathspec handling (of which ":/" is only one part)... but then
> what should it be named?
>
> So we could describe how Git handles pathspecs and pathspec magic
> in the new manual page named gitpathspec(7), or gitpaths(7). The
> former has the advantage of the name being identical to the entry
> in gitglossary(7). The latter has the probable advantage of being
> easier for the Git novice to find,

git-pathspec(7) is a great idea. It bugs me that all the pathspec
details are hidden in that glossary file (though I didn't know it was
also available via "git help glossary", which probably just reinforces
its invisibility). The closet place I could think of was git(1) but
that page is already very long.

> and that it could be used to
> gather various ways to generate list of files in Git (files in
> the working area, files in the staging area aka the index, files
> in the revision / tree object, changed files, etc.);

You mean the list of commands in this man page? OK. Another thing we
could do is the reverse direction, add gitpathspec(7) as a reference
to all commands that may need it.

> the pathspec
> in strict sense is about the input. Well, we could have 'manpage
> alias' of gitpaths to gitpathspec, or vice versa.

>  Sidenote: I wonder if people (especially novices) have problem
>  finding relevant documentation, and if adding something like
>  "git apropos " command ("apropos", or "man -k"), or
>  add the '--apropos' option to "git help" would be useful...
>  and if it would be easy to create.

I have that problem sometimes and I don't think I can call myself a
noob anymore. I usually need to do some grepping. So yeah "git
apropos" would be great.
-- 
Duy
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v2 00/12] nd/icase updates

2016-06-30 Thread Duy Nguyen

On Mon, Jun 27, 2016 at 4:53 PM, Junio C Hamano  wrote:
>> diff --git a/grep.c b/grep.c
>> index cb058a5..92587a8 100644
>> --- a/grep.c
>> +++ b/grep.c
>> @@ -432,15 +432,8 @@ static void compile_regexp(struct grep_pat *p, struct 
>> grep_opt *opt)
>>   icase  = opt->regflags & REG_ICASE || p->ignore_case;
>>   ascii_only = !has_non_ascii(p->pattern);
>>
>> + if (opt->fixed || is_fixed(p->pattern, p->patternlen))
>>   p->fixed = !icase || ascii_only;
>>   else
>>   p->fixed = 0;
>>
>> @@ -449,6 +442,9 @@ static void compile_regexp(struct grep_pat *p, struct 
>> grep_opt *opt)
>>   kwsincr(p->kws, p->pattern, p->patternlen);
>>   kwsprep(p->kws);
>>   return;
>> + } else if (opt->fixed) {
>> + compile_fixed_regexp(p, opt);
>> + return;
>>   }
>
> This if/elseif/else cascade made a lot simpler and while the
> discussion is fresh in my brain it makes sense, but it may deserve a
> bit of commenting.
>
> And while attempting to do so, I found one possible issue there.
>
> Can't p->ignore_case be true even when opt->regflags does not have
> REG_ICASE?  The user never asked us to do a regexp match in such a
> case, and the logical place to compensate for that case would be
> inside compile_fixed_regexp(), where we use regexp engine behind
> user's back for our convenience, I would think.
>
> In the current code, compile_fixed_regexp() is only called when we
> want ICASE, but hardcoding that assumption to it unnecessarily robs
> flexibility (and the function name does not tell us it is only for
> icase in the first place), so I taught it to do the REG_ICASE thing
> only when opt->ignore_case is set.
>
> How does this look?
>
>
>  grep.c | 23 ++-
>  1 file changed, 22 insertions(+), 1 deletion(-)
>
> diff --git a/grep.c b/grep.c
> index 92587a8..3a3a9f4 100644
> --- a/grep.c
> +++ b/grep.c
> @@ -407,17 +407,21 @@ static int is_fixed(const char *s, size_t len)
>  static void compile_fixed_regexp(struct grep_pat *p, struct grep_opt *opt)
>  {
> struct strbuf sb = STRBUF_INIT;
> int err;
> +   int regflags;
>
> basic_regex_quote_buf(&sb, p->pattern);
> -   err = regcomp(&p->regexp, sb.buf, opt->regflags & ~REG_EXTENDED);
> +   regflags = opt->regflags & ~REG_EXTENDED;
> +   if (opt->ignore_case)
> +   regflags |= REG_ICASE;
> +   err = regcomp(&p->regexp, sb.buf, regflags);

Makes sense. But then if opt->ignore_case is false and regflags
happens to have REG_ICASE set, should we clear it as well?

The rest looks good (after your comment fixup). I see you already have
all the changes in your SQUASH??? commit. Do you want me to resend or
you will just squash this in locally?
-- 
Duy
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH] fixup! worktree: add "lock" command

2016-06-30 Thread Nguyễn Thái Ngọc Duy

Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 Torsten, this seems to fix the symlink problem for me. How many times
 have I got similar reports from you and still managed to forget ...

 t/t2028-worktree-move.sh | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/t/t2028-worktree-move.sh b/t/t2028-worktree-move.sh
index 68d3fe8..8298aaf 100755
--- a/t/t2028-worktree-move.sh
+++ b/t/t2028-worktree-move.sh
@@ -9,8 +9,8 @@ test_expect_success 'setup' '
git worktree add source &&
git worktree list --porcelain | grep "^worktree" >actual &&
cat <<-EOF >expected &&
-   worktree $TRASH_DIRECTORY
-   worktree $TRASH_DIRECTORY/source
+   worktree $(pwd)
+   worktree $(pwd)/source
EOF
test_cmp expected actual
 '
-- 
2.8.2.531.gd073806

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v4 0/5] Better ref summary alignment in "git fetch"

2016-06-30 Thread Duy Nguyen

On Mon, Jun 27, 2016 at 8:43 PM, Jeff King  wrote:
> I tried it on my most-horrible example case, and the results were...just
> OK. Because the variable-length part of each line comes first, the
> alignment code means that the "origin/$" bit of every line gets bumped
> out. And if you have a single large branch name, then everybody gets
> bumped out very far, even to the point of wrapping. E.g., I get
> something like (with fetch.output=compact, obviously):
>
>   From ...
>* [new branch]  branch1  -> origin/$
>* [new branch]  branch2  -> origin/$
>* [new branch]  some-really-long-branch-name -> origin/$
>+ 1234abc..5678def  branch3  -> origin/$ (forced
> update)
>* [new branch]  branch4  -> origin/$
>
> I've shrunk it a bit to fit in the email; my actual "long" name was much
> larger. And the average length for the shorter ones is, too, but the
> overall effect is the same; almost every line has a huge run of
> whitespace. And some lines wrap that would not have even under the
> normal, duplicated scheme.

I was about to resend with s/\$/*/g and ignored this issue (with a
note) then it occurred to me we can simply ignore these long lines
from column width calculation. Yeah such a line may still be wrapped
around, but it will not push every other line to the far right. We
already have code for that in adjust_refcol_width()

max= term_columns();
...
/*
* rough estimation to see if the output line is too long and
* should not be counted (we can't do precise calculation
* anyway because we don't know if the error explanation part
* will be printed in update_local_ref)
*/
if (21 /* flag and summary */ + rlen + 4 /* -> */ + llen >= max)
return;
...

we can limit max to, like, term_columns() / 2 (or 2/3 of
term_columns). There's no perfect number, some people will still find
the output ugly _often_. But hopefully the majority won't. What do you
think?
-- 
Duy
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH] t5541: become resilient to GETTEXT_POISON

2016-06-30 Thread Vasco Almeida

Use test_i18n* functions for testing text already marked for
translation.

Signed-off-by: Vasco Almeida 
---
Incremental update for va/i18n-even-more (merged to 'next' on 2016-06-28 at
5919dfa).

I don't know how I didn't catch this one.

 t/t5541-http-push-smart.sh | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/t/t5541-http-push-smart.sh b/t/t5541-http-push-smart.sh
index fd7d06b..97748af 100755
--- a/t/t5541-http-push-smart.sh
+++ b/t/t5541-http-push-smart.sh
@@ -119,7 +119,7 @@ test_expect_success 'rejected update prints status' '
git commit -m dev2 &&
test_must_fail git push origin dev2 2>act &&
sed -e "/^remote: /s/ *$//" cmp &&
-   test_cmp exp cmp
+   test_i18ncmp exp cmp
 '
 rm -f "$HTTPD_DOCUMENT_ROOT_PATH/test_repo.git/hooks/update"
 
@@ -219,7 +219,7 @@ test_expect_success TTY 'push shows progress when stderr is 
a tty' '
cd "$ROOT_PATH"/test_repo_clone &&
test_commit noisy &&
test_terminal git push >output 2>&1 &&
-   grep "^Writing objects" output
+   test_i18ngrep "^Writing objects" output
 '
 
 test_expect_success TTY 'push --quiet silences status and progress' '
@@ -233,16 +233,16 @@ test_expect_success TTY 'push --no-progress silences 
progress but not status' '
cd "$ROOT_PATH"/test_repo_clone &&
test_commit no-progress &&
test_terminal git push --no-progress >output 2>&1 &&
-   grep "^To http" output &&
-   ! grep "^Writing objects"
+   test_i18ngrep ! "^To http" output &&
+   test_i18ngrep ! "^Writing objects"
 '
 
 test_expect_success 'push --progress shows progress to non-tty' '
cd "$ROOT_PATH"/test_repo_clone &&
test_commit progress &&
git push --progress >output 2>&1 &&
-   grep "^To http" output &&
-   grep "^Writing objects" output
+   test_i18ngrep "^To http" output &&
+   test_i18ngrep "^Writing objects" output
 '
 
 test_expect_success 'http push gives sane defaults to reflog' '
-- 
2.7.4

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH] log: decorate HEAD -> branch with the same color for branch and HEAD

2016-06-30 Thread Nguyễn Thái Ngọc Duy

Commit 76c61fb (log: decorate HEAD with branch name under
--decorate=full, too - 2015-05-13) adds "HEAD -> branch" decoration to
show current branch vs detached HEAD. The sign of whether HEAD is
detached or not is "->" (vs ", "). It's too subtle for my poor
eyes. If color is used, we can make the branch name's color the same
as HEAD to visually emphasize that it's the current branch.

Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 For the original discussion of 76c61fb see
 http://thread.gmane.org/gmane.comp.version-control.git/263922

 log-tree.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/log-tree.c b/log-tree.c
index 78a5381..440e7cc 100644
--- a/log-tree.c
+++ b/log-tree.c
@@ -267,7 +267,7 @@ void format_decorations_extended(struct strbuf *sb,
strbuf_addstr(sb, color_commit);
strbuf_addstr(sb, " -> ");
strbuf_addstr(sb, color_reset);
-   strbuf_addstr(sb, decorate_get_color(use_color, 
current_and_HEAD->type));
+   strbuf_addstr(sb, decorate_get_color(use_color, 
DECORATION_REF_HEAD));
show_name(sb, current_and_HEAD);
}
strbuf_addstr(sb, color_reset);
-- 
2.8.2.531.gd073806

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] log: decorate HEAD -> branch with the same color for branch and HEAD

2016-06-30 Thread Duy Nguyen

On Thu, Jun 30, 2016 at 6:39 PM, Nguyễn Thái Ngọc Duy  wrote:
> Commit 76c61fb (log: decorate HEAD with branch name under
> --decorate=full, too - 2015-05-13)

.. and I got the commit wrong. It should be 51ff0f2 (log: decorate
HEAD with branch name - 2015-03-10)

adds "HEAD -> branch" decoration to
> show current branch vs detached HEAD. The sign of whether HEAD is
> detached or not is "->" (vs ", "). It's too subtle for my poor
> eyes. If color is used, we can make the branch name's color the same
> as HEAD to visually emphasize that it's the current branch.
-- 
Duy
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH v2] t5541: become resilient to GETTEXT_POISON

2016-06-30 Thread Vasco Almeida

Use test_i18n* functions for testing text already marked for
translation.

Signed-off-by: Vasco Almeida 
---
Fix typo on v1.
Forgot to mention that tests with TTY prerequisite were skipped, I don't
know how to run them.

Notes:
Incremental update for va/i18n-even-more (merged to 'next' on 2016-06-28 at
5919dfa).

I don't know how I didn't catch this one.

 t/t5541-http-push-smart.sh | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/t/t5541-http-push-smart.sh b/t/t5541-http-push-smart.sh
index fd7d06b..ca6becf 100755
--- a/t/t5541-http-push-smart.sh
+++ b/t/t5541-http-push-smart.sh
@@ -119,7 +119,7 @@ test_expect_success 'rejected update prints status' '
git commit -m dev2 &&
test_must_fail git push origin dev2 2>act &&
sed -e "/^remote: /s/ *$//" cmp &&
-   test_cmp exp cmp
+   test_i18ncmp exp cmp
 '
 rm -f "$HTTPD_DOCUMENT_ROOT_PATH/test_repo.git/hooks/update"
 
@@ -219,7 +219,7 @@ test_expect_success TTY 'push shows progress when stderr is 
a tty' '
cd "$ROOT_PATH"/test_repo_clone &&
test_commit noisy &&
test_terminal git push >output 2>&1 &&
-   grep "^Writing objects" output
+   test_i18ngrep "^Writing objects" output
 '
 
 test_expect_success TTY 'push --quiet silences status and progress' '
@@ -233,16 +233,16 @@ test_expect_success TTY 'push --no-progress silences 
progress but not status' '
cd "$ROOT_PATH"/test_repo_clone &&
test_commit no-progress &&
test_terminal git push --no-progress >output 2>&1 &&
-   grep "^To http" output &&
-   ! grep "^Writing objects"
+   test_i18ngrep "^To http" output &&
+   test_i18ngrep ! "^Writing objects"
 '
 
 test_expect_success 'push --progress shows progress to non-tty' '
cd "$ROOT_PATH"/test_repo_clone &&
test_commit progress &&
git push --progress >output 2>&1 &&
-   grep "^To http" output &&
-   grep "^Writing objects" output
+   test_i18ngrep "^To http" output &&
+   test_i18ngrep "^Writing objects" output
 '
 
 test_expect_success 'http push gives sane defaults to reflog' '
-- 
2.7.4

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v3 3/3] correct ce_compare_data() in a middle of a merge

2016-06-30 Thread Torsten Bögershausen

On 29.06.16 18:14, Junio C Hamano wrote:
> tbo...@web.de writes:
> 
>> From: Torsten Bögershausen 
>>
>> The following didn't work as expected:
> 
> Sorry for being slow (not in response but in understanding), but
> let's examine the expectation first.

Thanks for the patience.
There is one detail missing in my description:
The check if a file in the working tree is clean:

static int ce_compare_data(const struct cache_entry *ce, struct stat *st)
{
int match = -1;
int fd = open(ce->name, O_RDONLY);

if (fd >= 0) {
unsigned char sha1[20];
unsigned flags = HASH_USE_SHA_NOT_PATH;
--
Remove the "new feature":

git diff  read-cache.c
diff --git a/read-cache.c b/read-cache.c
index dd98b36..1f951ea 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -170,7 +170,7 @@ static int ce_compare_data(const struct cache_entry *ce, 
struct stat *st)
 
if (fd >= 0) {
unsigned char sha1[20];
-   unsigned flags = HASH_USE_SHA_NOT_PATH;
+   unsigned flags = 0;//HASH_USE_SHA_NOT_PATH;
memcpy(sha1, ce->sha1, sizeof(sha1));
if (!index_fd(sha1, fd, st, OBJ_BLOB, ce->name, flags))
match = hashcmp(sha1, ce->sha1);

The the problem can be re-produced, and debugged with help of t6038:

not ok 5 - Merge addition of text=auto eol=CRLF


> 
>>  - In a middle of a merge
>>  - merge.renormalize is true,
> 
> gitattributes(5) tells us that this would make a "virtual check-out
> and check-in of all stages", i.e. each of the three blob objects
> involved is run through convert_to_working_tree(), and the result is
> run through convert_to_git().  Now, what are these two operations
> told to do?
> 
>>  - .gitattributes = "* text=auto"
>>  - core.eol = crlf
> 
> git-config(1) tells us that a text file will be checked out with
> CRLF line endings, so convert_to_working_tree() would work that
> way.  Even though core.eol entry in that manual page does not tell
> us anything, presumably convert_to_git() is expected to turn it back
> to LF line endings.
> 
Yes, the git-config(1) may need improvements, but that can go as a later path.

>> Merge a blob with CRLF "first line\r\nsame line\r\n" and a blob
>> with LF "first line\nsame line\n".
I may have written:
Merge a blob with CRLF "first line\r\nsame line\r\n" 
(which is checked out as
"first line\r\nsame line\r\n" 
in the working tree, and therefore clean)
and a blob with LF "first line\nsame line\n".

> 
> The former, when renormalized, should (we are discussing the
> expectation, not necessarily what is done by today's code) be
> processed like this:
> 
>  * Pretend as if blob "first line\r\nsame line\r\n" is in the stage
>#0 of the index at the path;
>  * Pretend as if the user said "git checkout" that path;
>  * Pretend as if the user then said "git add" that path.
> 
> The checkout process hopefully does not blindly turn LF into CRLF,
> making it "first line \r\r\nsame line\r\rn".  Instead the (virtual)
> working tree file will have "first line\r\nsame line\r\n".
Yes
> 
> Then "git add" should turn that CRLF into LF, which would give us
> "first line\nsame line\n", but because the "safer autocrlf" rule
> prevents it from happening.  The (real) index already has CR in it
> in the blob in stage #2, so the check-in process would (this is not
> describing the ideal, but what is done by today's code) disable
> CRLF->LF conversion.
> 
> Is that the problem you are trying to solve?
Not sure, if that problem isn't already solved.

> 
> If that is the case, I do not see how "don't use stage #2 of the
> real index; use the blob being renormalized instead" would help.
> The blob being renormalized may have CR in it, triggering that
> "safer autocrlf" rule and cause you the same trouble, no?
> 
> To me, it appears that if you consider that the "safer autocrlf" is
> a sensible thing, you _must_ expect that the renormalization would
> not work at all, in the scenario you presented.  Also,
> 
>> The expected result of the merge is "first line\nsame line\n".
> 
> if you expect this, to me, it appears that you need to reject the
> "safer autocrlf", at least during renormalization, as a sensible
> thing.  And if you _are_ to disable the "safer autocrlf" thing, then
> it does not matter what is currently in the index--the conversion
> can happen solely based on the data being converted and the
> configuration and attribute settings.
> 
> So I still do not see why you want to pass "no do not use stage #0
> or stage #2; use this blob instead".  Shouldn't you be just passing
> a bit "don't do the safer autocrlf thing in this convert_to_git()
> call" from renormalization codepath without doing anything else?
> 

This is my understanding:
- git checkout and git add are working as expected:
  LF in blob gives CRLF in the working tree at checkout (if attr says "auto")
  CRLF in blob gives CRLF in the working tree at checkout.
  CRLF in the working

Re: [PATCH v13 11/20] index-helper: use watchman to avoid refreshing index with lstat()

2016-06-30 Thread Ben Peart

David Turner  novalis.org> writes:

> 
> Hiding watchman behind index-helper means you need both daemons. You
> can't run watchman alone. Not so good. But on the other hand, 'git'
> binary is not linked to watchman/json libraries, which is good for
> packaging. Core git package will run fine without watchman-related
> packages. If they need watchman, they can install git-index-helper and
> dependencies.
> 

Have you considered splitting index-helper and watchman apart?  Using 
Watchman to not lstat unchanged entries is a huge perf win with very 
large repos.

It would also be interesting to make the Watchman backend replaceable by
using an extensible API.  This has the benefit of not having to link the
'git' binary to the watchman/json libraries.  Is there any pattern 
already in git for accomplishing this?

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: topological index field for commit objects

2016-06-30 Thread Linus Torvalds

On Thu, Jun 30, 2016 at 3:30 AM, Jakub Narębski  wrote:
>
> P.S. Having Git ensure that committerdate (as an epoch) is greater
> than committerdates of its parents at the commit creation time (with
> providing warning about time skew, perhaps not doing it if skew is
> too large) would be not costly. No need to add anything, and it would
> help future queries to be faster, I think.

So I think git should check the committer date (ie verify that the
commitetr date of a new commit is always equal to or more recent than
all the parents).

But we should probably allow some slop for when machines are out of
sync (think build farms etc automation that may do automated commits,
but the clocks on different machines may be off by a few seconds). We
already do things like that in some of the use of committer dates -
see for example CUTOFF_DATE_SLOP)

And we should probably allow the user to override the refusal to
create new commits with bogus dates - think importing repositories etc
where you may have reasons to just say "that's just how it was".

And it will take years for that to percolate out to all users, so it's
not like it will fix things immediately, but then some time from now,
we can consider commit dates to be more reliably generation numbers.

I do think that it's ok to cache generation numbers somewhere if there
is an algorithm that can make use of them, but every time this comes
up, it's just not been important enough to make a big deal and a new
incompatible object format for it. The committer date is preexisting
and has existing pseudo-generation-number usage, so..improving on the
quality of it sounds like a good idea.

The first step should probably be to make fsck warn about the existing
cases of "commit has older date than parents". Something like the
attached patch, perhaps?

  Linus
 fsck.c | 13 +
 1 file changed, 13 insertions(+)

diff --git a/fsck.c b/fsck.c
index 05315451c56f..722b65371d38 100644
--- a/fsck.c
+++ b/fsck.c
@@ -60,6 +60,7 @@
FUNC(NULL_SHA1, WARN) \
FUNC(ZERO_PADDED_FILEMODE, WARN) \
FUNC(NUL_IN_COMMIT, WARN) \
+   FUNC(DATE_ORDERING, WARN) \
/* infos (reported as warnings, but ignored by default) */ \
FUNC(BAD_TAG_NAME, INFO) \
FUNC(MISSING_TAGGER_ENTRY, INFO)
@@ -604,6 +605,17 @@ static int fsck_ident(const char **ident, struct object 
*obj, struct fsck_option
return 0;
 }

+static void fsck_commit_date(struct fsck_options *options, struct commit 
*commit)
+{
+   struct commit_list *p;
+
+   for (p = commit->parents; p; p = p->next) {
+   struct commit *parent = p->item;
+   if (commit->date < parent->date)
+   report(options, &commit->object, 
FSCK_MSG_DATE_ORDERING, "Bad commit date ordering with parent");
+   }
+}
+
 static int fsck_commit_buffer(struct commit *commit, const char *buffer,
unsigned long size, struct fsck_options *options)
 {
@@ -650,6 +662,7 @@ static int fsck_commit_buffer(struct commit *commit, const 
char *buffer,
return err;
}
}
+   fsck_commit_date(options, commit);
author_count = 0;
while (skip_prefix(buffer, "author ", &buffer)) {
author_count++;

Re: [PATCH v13 11/20] index-helper: use watchman to avoid refreshing index with lstat()

2016-06-30 Thread Duy Nguyen

On Thu, Jun 30, 2016 at 7:55 PM, Ben Peart  wrote:
> David Turner  novalis.org> writes:
>
>>
>> Hiding watchman behind index-helper means you need both daemons. You
>> can't run watchman alone. Not so good. But on the other hand, 'git'
>> binary is not linked to watchman/json libraries, which is good for
>> packaging. Core git package will run fine without watchman-related
>> packages. If they need watchman, they can install git-index-helper and
>> dependencies.
>>
>
> Have you considered splitting index-helper and watchman apart?  Using
> Watchman to not lstat unchanged entries is a huge perf win with very
> large repos.

On large repos (i.e. lots of files/dirs on worktree), the cost of
reading index will increase proportionally. Yes lstat costs, but I
suspect index reading (integrity verification actually) may cost more,
especially on platforms with cheap lstat like linux. On these repos
you really want to enable all four: index-helper (with watchman),
split-index (I still need to work out pruning on split-index) and
untracked cache. There's still a lot more to make git run fast on
large repos though.

> It would also be interesting to make the Watchman backend replaceable by
> using an extensible API.  This has the benefit of not having to link the
> 'git' binary to the watchman/json libraries.

'git' binary is not linked to watchman libraries. git-index-helper is
a separate binary, by design. In theory you can create a
'git-index-helper' replacement binary with something other than
watchman. I think David documented the protocol well (it may change in
the future though and we are not prepared for capability progression)

> Is there any pattern already in git for accomplishing this?
-- 
Duy
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

--abbrev-commit gives longer hash than necessary

2016-06-30 Thread Steffen Nurpmeso

Hello, for your possible interest.

For some time (currently with 2.9.0) know see that a single commit
gives a longer hash than necessary, even though there is no
ambiguity:

  ?0[steffen@wales ]$ git longca|
  awk 'BEGIN{l7=0;l8=0}\
/^[[:alnum:]]{7} /{++l7;next}\
/^[[:alnum:]]{8} /{++l8;print}\
  END{print "L7 " l7 " L8 " l8}'
  786d0c9c [mimepipe.2] send.c:sendpart(): force iconv(3)+ for TEXT part 
handlers..
  L7 3364 L8 1

So it is only this single commit.. but why?

  ?0[steffen@wales ]$ git long1 786d0c9
  786d0c9c [mimepipe.2] send.c:sendpart(): force iconv(3)+ for TEXT part 
handlers..
  ?0[steffen@wales ]$ git long1 786d0c
  786d0c9c [mimepipe.2] send.c:sendpart(): force iconv(3)+ for TEXT part 
handlers..
  ?0[steffen@wales ]$ git long1 786d0
  786d0c9c [mimepipe.2] send.c:sendpart(): force iconv(3)+ for TEXT part 
handlers..
  ?0[steffen@wales ]$ git long1 786d
  786d0c9c [mimepipe.2] send.c:sendpart(): force iconv(3)+ for TEXT part 
handlers..

Not really ambiguous:

  ?0[steffen@wales ]$ git long|cut -f1 -d' '|grep ^786
  786d0c9c
  786f219

Ciao!

--steffen
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: --abbrev-commit gives longer hash than necessary

2016-06-30 Thread Bryan Turner

Steffen,

Git commands generally have a 7 character minimum by default when
abbreviating hashes, even if fewer characters are still (currently)
unique. Per the documentation:

   core.abbrev
   Set the length object names are abbreviated to. If unspecified,
   many commands abbreviate to 7 hexdigits, which may not be enough
   for abbreviated object names to stay unique for sufficiently long
   time.

You may be able to set core.abbrev to a smaller value to get even
shorter hashes, but the shorter you go the more likely you are to find
once-unique hashes no longer are, as your repository grows over time.
The default isn't just about what's likely to be unique now; it's
about what's likely to stay unique for a period of time.

Hope this helps!
Bryan Turner

On Thu, Jun 30, 2016 at 12:38 PM, Steffen Nurpmeso  wrote:
> Hello, for your possible interest.
>
> For some time (currently with 2.9.0) know see that a single commit
> gives a longer hash than necessary, even though there is no
> ambiguity:
>
>   ?0[steffen@wales ]$ git longca|
>   awk 'BEGIN{l7=0;l8=0}\
> /^[[:alnum:]]{7} /{++l7;next}\
> /^[[:alnum:]]{8} /{++l8;print}\
>   END{print "L7 " l7 " L8 " l8}'
>   786d0c9c [mimepipe.2] send.c:sendpart(): force iconv(3)+ for TEXT part 
> handlers..
>   L7 3364 L8 1
>
> So it is only this single commit.. but why?
>
>   ?0[steffen@wales ]$ git long1 786d0c9
>   786d0c9c [mimepipe.2] send.c:sendpart(): force iconv(3)+ for TEXT part 
> handlers..
>   ?0[steffen@wales ]$ git long1 786d0c
>   786d0c9c [mimepipe.2] send.c:sendpart(): force iconv(3)+ for TEXT part 
> handlers..
>   ?0[steffen@wales ]$ git long1 786d0
>   786d0c9c [mimepipe.2] send.c:sendpart(): force iconv(3)+ for TEXT part 
> handlers..
>   ?0[steffen@wales ]$ git long1 786d
>   786d0c9c [mimepipe.2] send.c:sendpart(): force iconv(3)+ for TEXT part 
> handlers..
>
> Not really ambiguous:
>
>   ?0[steffen@wales ]$ git long|cut -f1 -d' '|grep ^786
>   786d0c9c
>   786f219
>
> Ciao!
>
> --steffen
> --
> To unsubscribe from this list: send the line "unsubscribe git" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH v2 0/4] Name for A..B ranges?

2016-06-30 Thread Philip Oakley

This is the re-roll of the po/range-doc (2016-06-27) 3 commits

The order is slightly re-arranged, and an additional patch clarifying
that ^r1 excludes r1 itself being added.

The heading have been tweaked.

Discussion: $gmane/297908
previous patch series $gmane/298223

Philip Oakley (4):
  doc: use 'symmetric difference' consistently
  doc: show the actual left, right, and boundary marks
  doc: give headings for the two and three dot notations
  doc: clarify that `^r1` will exclude `r1` itself

 Documentation/gitk.txt |  2 +-
 Documentation/pretty-formats.txt   |  2 +-
 Documentation/rev-list-options.txt |  4 ++--
 Documentation/revisions.txt| 25 ++---
 4 files changed, 22 insertions(+), 11 deletions(-)

-- 
2.8.4.windows.1.3.ge328a54

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH v2 2/4] doc: show the actual left, right, and boundary marks

2016-06-30 Thread Philip Oakley

Signed-off-by: Philip Oakley 
---
Found while checking the 'symmetric difference' documentation
---
 Documentation/pretty-formats.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/pretty-formats.txt b/Documentation/pretty-formats.txt
index 29b19b9..10719e1 100644
--- a/Documentation/pretty-formats.txt
+++ b/Documentation/pretty-formats.txt
@@ -166,7 +166,7 @@ endif::git-rev-list[]
   respecting the `auto` settings of the former if we are going to a
   terminal). `auto` alone (i.e. `%C(auto)`) will turn on auto coloring
   on the next placeholders until the color is switched again.
-- '%m': left, right or boundary mark
+- '%m': left (`<`), right (`>`) or boundary (`-`) mark
 - '%n': newline
 - '%%': a raw '%'
 - '%x00': print a byte from a hex code
-- 
2.8.4.windows.1.3.ge328a54

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH v2 4/4] doc: clarify that `^r1` will exclude `r1` itself

2016-06-30 Thread Philip Oakley

Signed-off-by: Philip Oakley 
---
 Documentation/revisions.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/revisions.txt b/Documentation/revisions.txt
index 131060c..87be9c4 100644
--- a/Documentation/revisions.txt
+++ b/Documentation/revisions.txt
@@ -246,7 +246,7 @@ The '{caret}' (caret) notation
 ~~~
 To exclude commits reachable from a commit, a prefix '{caret}'
 notation is used.  E.g. '{caret}r1 r2' means commits reachable
-from 'r2' but exclude the ones reachable from 'r1'.
+from 'r2' but exclude 'r1' and those reachable from 'r1'.
 
 The '..' (two-dot) range notation
 ~
-- 
2.8.4.windows.1.3.ge328a54

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH v2 3/4] doc: give headings for the two and three dot notations

2016-06-30 Thread Philip Oakley

While there, also break out the other shorthand notations and
add a title for the revision range summary (which also appears
in git-rev-parse, so keep it mixed case).

Signed-off-by: Philip Oakley 
---
 Documentation/revisions.txt | 23 +--
 1 file changed, 17 insertions(+), 6 deletions(-)

diff --git a/Documentation/revisions.txt b/Documentation/revisions.txt
index 19314e3..131060c 100644
--- a/Documentation/revisions.txt
+++ b/Documentation/revisions.txt
@@ -242,35 +242,46 @@ specifying a single revision with the notation described 
in the
 previous section means the set of commits reachable from that
 commit, following the commit ancestry chain.
 
+The '{caret}' (caret) notation
+~~~
 To exclude commits reachable from a commit, a prefix '{caret}'
 notation is used.  E.g. '{caret}r1 r2' means commits reachable
 from 'r2' but exclude the ones reachable from 'r1'.
 
-This set operation appears so often that there is a shorthand
+The '..' (two-dot) range notation
+~
+The '{caret}r1 r2' set operation appears so often that there is a shorthand
 for it.  When you have two commits 'r1' and 'r2' (named according
 to the syntax explained in SPECIFYING REVISIONS above), you can ask
 for commits that are reachable from r2 excluding those that are reachable
 from r1 by '{caret}r1 r2' and it can be written as 'r1..r2'.
 
+The '...' (three dot) Symmetric Difference notation
+~~~
 A similar notation 'r1\...r2' is called symmetric difference
 of 'r1' and 'r2' and is defined as
 'r1 r2 --not $(git merge-base --all r1 r2)'.
 It is the set of commits that are reachable from either one of
 'r1' or 'r2' but not from both.
 
-In these two shorthands, you can omit one end and let it default to HEAD.
+In these two shorthand notations, you can omit one end and let it default to 
HEAD.
 For example, 'origin..' is a shorthand for 'origin..HEAD' and asks "What
 did I do since I forked from the origin branch?"  Similarly, '..origin'
 is a shorthand for 'HEAD..origin' and asks "What did the origin do since
 I forked from them?"  Note that '..' would mean 'HEAD..HEAD' which is an
 empty range that is both reachable and unreachable from HEAD.
 
+Additional '{caret}' Shorthand notations
+
 Two other shorthands for naming a set that is formed by a commit
-and its parent commits exist.  The 'r1{caret}@' notation means all
-parents of 'r1'.  'r1{caret}!' includes commit 'r1' but excludes
-all of its parents.
+and its parent commits exist.
 
-To summarize:
+The 'r1{caret}@' notation means all parents of 'r1'.
+
+'r1{caret}!' includes commit 'r1' but excludes all of its parents.
+
+Revision Range Summary
+--
 
 ''::
Include commits that are reachable from (i.e. ancestors of)
-- 
2.8.4.windows.1.3.ge328a54

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH v2 1/4] doc: use 'symmetric difference' consistently

2016-06-30 Thread Philip Oakley

Signed-off-by: Philip Oakley 
---
 Documentation/gitk.txt | 2 +-
 Documentation/rev-list-options.txt | 4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/Documentation/gitk.txt b/Documentation/gitk.txt
index 6ade002..6c3eb15 100644
--- a/Documentation/gitk.txt
+++ b/Documentation/gitk.txt
@@ -70,7 +70,7 @@ linkgit:git-rev-list[1] for a complete list.
 
 --left-right::
 
-   Mark which side of a symmetric diff a commit is reachable
+   Mark which side of a symmetric difference a commit is reachable
from.  Commits from the left side are prefixed with a `<`
symbol and those from the right with a `>` symbol.
 
diff --git a/Documentation/rev-list-options.txt 
b/Documentation/rev-list-options.txt
index 4f009d4..6dc0bb0 100644
--- a/Documentation/rev-list-options.txt
+++ b/Documentation/rev-list-options.txt
@@ -225,7 +225,7 @@ excluded from the output.
 
 --left-only::
 --right-only::
-   List only commits on the respective side of a symmetric range,
+   List only commits on the respective side of a symmetric difference,
i.e. only those which would be marked `<` resp. `>` by
`--left-right`.
 +
@@ -766,7 +766,7 @@ ifdef::git-rev-list[]
 endif::git-rev-list[]
 
 --left-right::
-   Mark which side of a symmetric diff a commit is reachable from.
+   Mark which side of a symmetric difference a commit is reachable from.
Commits from the left side are prefixed with `<` and those from
the right with `>`.  If combined with `--boundary`, those
commits are prefixed with `-`.
-- 
2.8.4.windows.1.3.ge328a54

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: Problem with --shallow-submodules option

2016-06-30 Thread Stefan Beller

On Thu, Jun 30, 2016 at 6:27 AM, Istvan Zakar  wrote:
> Hello,
>
> Thanks for your answers. I tested it after the changes were made on
> the git server, and it seems to be working. But some other issue came
> up.
>
> We have quite many submodules in our project so I did some comaprision:
>
> If I do a clone with these parameters:
> --jobs 20 --recurse-submodules
>
> The clone lasts ~53 seconds, and the total size of the folder is around 2 GB.
>
> If I add the shallow-submodules option, the size of the folder will be
> a bit below 1GB, so the size decreased as I expected, but the time of
> the clone itself increased to 90 seconds. It seems the last step of
> the command, checking out the submodules is executed one-by-one, and
> not in parallel, so it seems at this step the jobs parameter does not
> have effect.
>
> Is it intentional, or there is some option I missed?

It was intentional at the time of submitting the patches.
The checkout phase is a bit complicated as it combines the
newly cloned submodules as well as the submodules to incrementally
fetch into one bucket and treats them the same.

And for submodules that were fetched incrementally you may run into problems
when combining that with the local state (e.g. rebase or merge configured in
`submodule..update` or passed on the command line), which requires
human interaction (resolving the merge conflict), which we want to present one
at a time to the user.

The handling for the user is not quite clear, when to stop, see:
15ffb7cde48b73b3d5ce259443db7d2e0ba13750 (submodule update: continue
when a checkout fails)
877449c136539cf8b9b4ed9cfe33a796b7b93f93 (git-submodule.sh: clarify
the "should we die now" logic)

So we want to die as soon as we see a merge conflict or other
error that is likely to require some human interaction.
To do that properly we need to have complicated logic or just update
one submodule at a time.

For initial checkouts we know that there will be no merge conflicts, i.e.
it will be a "checkout -f" (with an implicit must_die_on_failure=no)
So we could run all checkouts of submodules in parallel, too. We'd
just need to write the patch for that.

As the cloning is already done in parallel, we can hook into the initial
checkout there easily. I'd build that on top of [1], creating a similar commit.
In the successful case of `update_clone_task_finished` (the case with
`!result`  -> return 0;) we would need to add the checkout command to
the queue instead of just finishing.

[1] 
https://github.com/gitster/git/commit/665b35eccd39fefd714cb5c332277a6b94fd9386


>
> I'm using git 2.9.0 on client side.
>
> Thanks,
>Istvan
>
> ps: if I update the submodules with --depth 1 parameter in parallel
> using xargs it lasts about 18 seconds, so it's a workaround for this
> issue, but it would be nice to do it with a single command.
>
>
>
>
> On 22 June 2016 at 17:31, Fredrik Gustafsson  wrote:
>> On Mon, Jun 20, 2016 at 01:06:39PM +, Istvan Zakar wrote:
>>> I'm working on a relatively big project with many submodules. During
>>> cloning for testing I tried to decrease the amount of data need to be
>>> fetched from the server by using --shallow-submodules option in the clone
>>> command. It seems to check out the tip of the remote repo, and if it's not
>>> the commit registered in the superproject the submodule update fails
>>> (obviously). Can I somehow tell to fetch that exact commit I need for my
>>> superproject?
>>
>> Maybe. http://stackoverflow.com/questions/2144406/git-shallow-submodules
>> gives a good overview of this problem.
>>
>> git fetches a branch and is shallow from that branch, which might be an
>> other sha1 than the one the submodule points to, (as you say). This
>> is/was one of the drawbacks with this method. However the since git 2.8,
>> git will try to fetch the sha1 direct (and not the branch). So then it
>> will work, if(!), the server supports direct access to sha1. This was
>> previously not allowed due to security concerns (if I recall correctly).
>>
>> So the answer is, yes this will work if you've a recent version of git
>> and support on the server side for doing this. Unfortunately I'm not
>> sure which git version is needed on the server side for this to work.
>>
>> --
>> Fredrik Gustafsson
>>
>> phone: +46 733-608274
>> e-mail: iv...@iveqy.com
>> website: http://www.iveqy.com
> --
> To unsubscribe from this list: send the line "unsubscribe git" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: Problem with --shallow-submodules option

2016-06-30 Thread Istvan Zakar

Hi,

Thanks for the clarification, it makes sense now.

Thanks,
Istvan


On 30 June 2016 at 22:57, Stefan Beller  wrote:
> On Thu, Jun 30, 2016 at 6:27 AM, Istvan Zakar  wrote:
>> Hello,
>>
>> Thanks for your answers. I tested it after the changes were made on
>> the git server, and it seems to be working. But some other issue came
>> up.
>>
>> We have quite many submodules in our project so I did some comaprision:
>>
>> If I do a clone with these parameters:
>> --jobs 20 --recurse-submodules
>>
>> The clone lasts ~53 seconds, and the total size of the folder is around 2 GB.
>>
>> If I add the shallow-submodules option, the size of the folder will be
>> a bit below 1GB, so the size decreased as I expected, but the time of
>> the clone itself increased to 90 seconds. It seems the last step of
>> the command, checking out the submodules is executed one-by-one, and
>> not in parallel, so it seems at this step the jobs parameter does not
>> have effect.
>>
>> Is it intentional, or there is some option I missed?
>
> It was intentional at the time of submitting the patches.
> The checkout phase is a bit complicated as it combines the
> newly cloned submodules as well as the submodules to incrementally
> fetch into one bucket and treats them the same.
>
> And for submodules that were fetched incrementally you may run into problems
> when combining that with the local state (e.g. rebase or merge configured in
> `submodule..update` or passed on the command line), which requires
> human interaction (resolving the merge conflict), which we want to present one
> at a time to the user.
>
> The handling for the user is not quite clear, when to stop, see:
> 15ffb7cde48b73b3d5ce259443db7d2e0ba13750 (submodule update: continue
> when a checkout fails)
> 877449c136539cf8b9b4ed9cfe33a796b7b93f93 (git-submodule.sh: clarify
> the "should we die now" logic)
>
> So we want to die as soon as we see a merge conflict or other
> error that is likely to require some human interaction.
> To do that properly we need to have complicated logic or just update
> one submodule at a time.
>
> For initial checkouts we know that there will be no merge conflicts, i.e.
> it will be a "checkout -f" (with an implicit must_die_on_failure=no)
> So we could run all checkouts of submodules in parallel, too. We'd
> just need to write the patch for that.
>
> As the cloning is already done in parallel, we can hook into the initial
> checkout there easily. I'd build that on top of [1], creating a similar 
> commit.
> In the successful case of `update_clone_task_finished` (the case with
> `!result`  -> return 0;) we would need to add the checkout command to
> the queue instead of just finishing.
>
> [1] 
> https://github.com/gitster/git/commit/665b35eccd39fefd714cb5c332277a6b94fd9386
>
>
>>
>> I'm using git 2.9.0 on client side.
>>
>> Thanks,
>>Istvan
>>
>> ps: if I update the submodules with --depth 1 parameter in parallel
>> using xargs it lasts about 18 seconds, so it's a workaround for this
>> issue, but it would be nice to do it with a single command.
>>
>>
>>
>>
>> On 22 June 2016 at 17:31, Fredrik Gustafsson  wrote:
>>> On Mon, Jun 20, 2016 at 01:06:39PM +, Istvan Zakar wrote:
 I'm working on a relatively big project with many submodules. During
 cloning for testing I tried to decrease the amount of data need to be
 fetched from the server by using --shallow-submodules option in the clone
 command. It seems to check out the tip of the remote repo, and if it's not
 the commit registered in the superproject the submodule update fails
 (obviously). Can I somehow tell to fetch that exact commit I need for my
 superproject?
>>>
>>> Maybe. http://stackoverflow.com/questions/2144406/git-shallow-submodules
>>> gives a good overview of this problem.
>>>
>>> git fetches a branch and is shallow from that branch, which might be an
>>> other sha1 than the one the submodule points to, (as you say). This
>>> is/was one of the drawbacks with this method. However the since git 2.8,
>>> git will try to fetch the sha1 direct (and not the branch). So then it
>>> will work, if(!), the server supports direct access to sha1. This was
>>> previously not allowed due to security concerns (if I recall correctly).
>>>
>>> So the answer is, yes this will work if you've a recent version of git
>>> and support on the server side for doing this. Unfortunately I'm not
>>> sure which git version is needed on the server side for this to work.
>>>
>>> --
>>> Fredrik Gustafsson
>>>
>>> phone: +46 733-608274
>>> e-mail: iv...@iveqy.com
>>> website: http://www.iveqy.com
>> --
>> To unsubscribe from this list: send the line "unsubscribe git" in
>> the body of a message to majord...@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

git-credentials-store.exe crash

2016-06-30 Thread d...@ucar.edu


Since I cannot submit a github issue, I will try here.



 - Which version of Git for Windows are you using? 32-bit or 64-bit? 
Include the

   output of `git version` as well.
git version 2.9.0.windows.1
32-bit

 - Which version of Windows are you running? 32-bit or 64-bit?
64-bit windows 7

 - What options did you set as part of the installation? Or did you 
choose the

   defaults?
no git credentials manager

 - Any other interesting things about your environment that might be 
related

   to the issue you're seeing?
Unknown

### Details

 - Which terminal/shell are you running Git from? e.g 
Bash/CMD/PowerShell/other


 cygwin bash

 - What commands did you run to trigger this issue? If you can provide a
   [Minimal, Complete, and Verifiable 
example](http://stackoverflow.com/help/mcve)

   this will help us understand the issue.

I kept getting complaints by git about credentials lock existing.
deleted it and then git-credentials-store crashed as follows.
```
Carson:part2: git push
fatal: unable to get credential storage lock: File exists
Everything up-to-date
Carson:part2: ls ~/.git*
/cygdrive/c/Users/dmh/.gitconfig* 
/cygdrive/c/Users/dmh/.gitignore*

/cygdrive/c/Users/dmh/.git-credentials*   /cygdrive/c/Users/dmh/.gitk*
/cygdrive/c/Users/dmh/.git-credentials.lock*
Carson:part2: rm ~/.git-credentials.lock
Carson:part2: git push
This application has requested the Runtime to terminate it in an unusual 
way.

Please contact the application's support team for more information.
A s s e r t i o n   f a i l e d !

 P r o g r a m :   C : \ t o o l s \ G i t \ m i n g w 3 2 \ l i b e x 
e c \ g i

 t - c o r e \ g i t - c r e d e n t i a l - s t o r e . e x e
 F i l e :   e x e c _ c m d . c ,   L i n e   2 3

 E x p r e s s i o n :   a r g v 0 _ p a t h
 Everything up-to-date
Carson:part2:

```
 - What did you expect to occur after running these commands?
completion with no errors

 - What actually happened instead?

 git-credentials-store failed.

 - If the problem was occurring with a specific repository, can you 
provide the

   URL to that repository to help us with testing?
url = https://github.com/Unidata/thredds.git

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: topological index field for commit objects

2016-06-30 Thread Jakub Narębski

W dniu 2016-06-30 o 20:12, Linus Torvalds pisze:
> On Thu, Jun 30, 2016 at 3:30 AM, Jakub Narębski  wrote:
>>
>> P.S. Having Git ensure that committerdate (as an epoch) is greater
>> than committerdates of its parents at the commit creation time (with
>> providing warning about time skew, perhaps not doing it if skew is
>> too large) would be not costly. No need to add anything, and it would
>> help future queries to be faster, I think.
> 
> So I think git should check the committer date (ie verify that the
> committer date of a new commit is always equal to or more recent than
> all the parents).
> 
> But we should probably allow some slop for when machines are out of
> sync (think build farms etc automation that may do automated commits,
> but the clocks on different machines may be off by a few seconds). We
> already do things like that in some of the use of committer dates -
> see for example CUTOFF_DATE_SLOP).

The problem with this idea is that the clock skew might be in two
directions, but it does fix/help only one.  If committer's clock
lags behind true time, the automatic bump to have committerdate of
a new commit be greater than committerdates of all its parent is quite
helpful.

However, if some developer has his or her clock misconfigured so that
it gives time in the future, this feature would not help.  Commits from
such developer (from such machine) would screw it up for others.

That's why I think some limit is needed.  For example, if for some
reason commit was created with committer date 3 days in the future,
we would probably do not want for other developers to have to lie
about when they created their commit.

Or did you mean that if the date for new commit is in the past
of committerdates of its parents, its all right when it is within
slop?

> 
> And we should probably allow the user to override the refusal to
> create new commits with bogus dates - think importing repositories etc
> where you may have reasons to just say "that's just how it was".

Right. I was thinking about (ab)using --no-verify flag, or perhaps
prompting user if the terminal is interactive. But perhaps a separate
flag / environment variable would be better...

> And it will take years for that to percolate out to all users, so it's
> not like it will fix things immediately, but then some time from now,
> we can consider commit dates to be more reliably generation numbers.
[...]

That's the idea of the proposal.

-- 
Jakub Narębski

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v13 11/20] index-helper: use watchman to avoid refreshing index with lstat()

2016-06-30 Thread Ben Peart

Duy Nguyen  gmail.com> writes:

> 
> On Thu, Jun 30, 2016 at 7:55 PM, Ben Peart  gmail.com> wrote:
> > David Turner  novalis.org> writes:
> >
> >>
> >> Hiding watchman behind index-helper means you need both daemons. You
> >> can't run watchman alone. Not so good. But on the other hand, 'git'
> >> binary is not linked to watchman/json libraries, which is good for
> >> packaging. Core git package will run fine without watchman-related
> >> packages. If they need watchman, they can install git-index-helper and
> >> dependencies.
> >>
> >
> > Have you considered splitting index-helper and watchman apart?  Using
> > Watchman to not lstat unchanged entries is a huge perf win with very
> > large repos.
> 
> On large repos (i.e. lots of files/dirs on worktree), the cost of
> reading index will increase proportionally. Yes lstat costs, but I
> suspect index reading (integrity verification actually) may cost more,
> especially on platforms with cheap lstat like linux. On these repos
> you really want to enable all four: index-helper (with watchman),
> split-index (I still need to work out pruning on split-index) and
> untracked cache. There's still a lot more to make git run fast on
> large repos though.
> 

I've found (at least on Windows) that as the repo size gets larger, the
time to read the index becomes a much smaller percentage of the overall
time.  I just captured some perf traces of git status on a large repo we
have.  Of that, 92.5% was spent in git!read_directory and only 4.0% was 
spent in git!read_index.  Of that 4%, 2.6% was git!glk_SHA1_Update.

Given there were no dirty files, Watchman would have made a huge 
improvement in the overall time but index helper would have had
relatively little impact.  We've noticed this same pattern in all our
repos which is what is driving our interest in the Watchman model and
also shows why index-helper is of less interest.

> > It would also be interesting to make the Watchman backend replaceable by
> > using an extensible API.  This has the benefit of not having to link the
> > 'git' binary to the watchman/json libraries.
> 
> 'git' binary is not linked to watchman libraries. git-index-helper is
> a separate binary, by design. In theory you can create a
> 'git-index-helper' replacement binary with something other than
> watchman. I think David documented the protocol well (it may change in
> the future though and we are not prepared for capability progression)
> 
> > Is there any pattern already in git for accomplishing this?

While the current design hides watchman behind index-helper, if we were
to change that model so they were independent, we would be interested
in doing it in such a way that provided some abstraction so that it 
could be replaced with another file watching daemon.

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: topological index field for commit objects

2016-06-30 Thread Mike Hommey

On Thu, Jun 30, 2016 at 11:12:52AM -0700, Linus Torvalds wrote:
> On Thu, Jun 30, 2016 at 3:30 AM, Jakub Narębski  wrote:
> >
> > P.S. Having Git ensure that committerdate (as an epoch) is greater
> > than committerdates of its parents at the commit creation time (with
> > providing warning about time skew, perhaps not doing it if skew is
> > too large) would be not costly. No need to add anything, and it would
> > help future queries to be faster, I think.
> 
> So I think git should check the committer date (ie verify that the
> commitetr date of a new commit is always equal to or more recent than
> all the parents).
> 
> But we should probably allow some slop for when machines are out of
> sync (think build farms etc automation that may do automated commits,
> but the clocks on different machines may be off by a few seconds). We
> already do things like that in some of the use of committer dates -
> see for example CUTOFF_DATE_SLOP)
> 
> And we should probably allow the user to override the refusal to
> create new commits with bogus dates - think importing repositories etc
> where you may have reasons to just say "that's just how it was".
> 
> And it will take years for that to percolate out to all users, so it's
> not like it will fix things immediately, but then some time from now,
> we can consider commit dates to be more reliably generation numbers.

This is an assumption that might work for git repositories, but not for
foreign repositories imported in git (think e.g. mercurial repositories
accessed with a remote helper)

Mike
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: topological index field for commit objects

2016-06-30 Thread Jeff King

On Thu, Jun 30, 2016 at 11:12:52AM -0700, Linus Torvalds wrote:

> I do think that it's ok to cache generation numbers somewhere if there
> is an algorithm that can make use of them, but every time this comes
> up, it's just not been important enough to make a big deal and a new
> incompatible object format for it. The committer date is preexisting
> and has existing pseudo-generation-number usage, so..improving on the
> quality of it sounds like a good idea.

If you are OK with a cache, I don't think one needs to change the object
format at all. It can be computed on the fly, and is purely a local
optimization.

> The first step should probably be to make fsck warn about the existing
> cases of "commit has older date than parents". Something like the
> attached patch, perhaps?

I have mixed feelings on this, because it forces the user to confront
the issue at a time that's potentially very far from when it actually
happened (and often when it is not their fault).

I expect most people don't run fsck at all under normal circumstances,
so it is only when they have a problem of some sort that they would see
this warning at all. It would kick in and prevent objects being
transferred when things like receive.fsckObjects are configured, but
it's not on by default. GitHub does enable it, so pushing there is often
the first notification people get about any kind of problem.

This is a general problem with all fsck-driven warnings (people may
fetch without realizing they're getting breakage, and such objects may
get years of history built on top). But I think it can be even worse
with timestamps, because the broken state may not even be recognizable
when you first fetch the troublesome object.

E.g., imagine somebody else has their clock set forward, and you fetch
from them. Their object by itself is not broken. It is only when you
want to commit on top of it, with the correct clock, that the broken
state is created (and then, we cannot know whether it is your clock or
the original committer's clock that is bad).

So I think it would be more productive to put a check like this in "git
commit" rather than (or perhaps in addition to) fsck. That prevents
us creating the broken relationship, but it does still mean the user may
have to to go back and tell the original committer that their clock was
broken.

You could also have the fsck check look not only for out-of-order
commits, but also commits in the future (from the perspective of the
receiver). That would reject such broken commits before they even hit
your repository (though again, it is unclear in such a case if the
commit is broken or the clock of the checker).

> +static void fsck_commit_date(struct fsck_options *options, struct commit 
> *commit)
> +{
> + struct commit_list *p;
> +
> + for (p = commit->parents; p; p = p->next) {
> + struct commit *parent = p->item;
> + if (commit->date < parent->date)
> + report(options, &commit->object, 
> FSCK_MSG_DATE_ORDERING, "Bad commit date ordering with parent");
> + }
> +}

I didn't test it, but I suspect this won't work reliably, as we do not
always have the parents parsed during an fsck check (e.g., during
"index-pack --strict").

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: git-credentials-store.exe crash

2016-06-30 Thread Jeff King

On Thu, Jun 30, 2016 at 03:24:41PM -0600, d...@ucar.edu wrote:

> Carson:part2: git push
> This application has requested the Runtime to terminate it in an unusual
> way.
> Please contact the application's support team for more information.
> A s s e r t i o n   f a i l e d !
> 
>  P r o g r a m :   C : \ t o o l s \ G i t \ m i n g w 3 2 \ l i b e x e c \
> g i
>  t - c o r e \ g i t - c r e d e n t i a l - s t o r e . e x e
>  F i l e :   e x e c _ c m d . c ,   L i n e   2 3
> 
>  E x p r e s s i o n :   a r g v 0 _ p a t h
>  Everything up-to-date

Interesting. It's failing on the assert(argv0_path) in system_path().

That's part of the RUNTIME_PREFIX code which is built only on Windows,
so this is a Windows-specific issue.

I can guess the reason that argv0_path is not set is that
credential-store has its own main() function and does not call
git_extract_argv0_path(). It never mattered before, but as of v2.8.0,
one of the library functions it calls wants to use system_path(), which
assumes that the argv0 stuff has been set up.

I'm preparing some patches to fix this case (and some other related
ones).

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v4 1/5] t9300: factor out portable "head -c" replacement

2016-06-30 Thread Eric Sunshine

On Thursday, June 30, 2016, Jeff King  wrote:
> In shell scripts it is sometimes useful to be able to read
> exactly N bytes from a pipe. Doing this portably turns out
> to be surprisingly difficult.
>
> We want a solution that:
>
>   - is portable
>
>   - never reads more than N bytes due to buffering (which
> would mean those bytes are not available to the next
> program to read from the same pipe)
>
>   - handles partial reads by looping until N bytes or read

s/or/are/

> (or we see EOF)
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH 0/5] consistent setup code for external commands

2016-06-30 Thread Jeff King

On Fri, Jul 01, 2016 at 12:07:15AM -0400, Jeff King wrote:

> Interesting. It's failing on the assert(argv0_path) in system_path().
> 
> That's part of the RUNTIME_PREFIX code which is built only on Windows,
> so this is a Windows-specific issue.
> 
> I can guess the reason that argv0_path is not set is that
> credential-store has its own main() function and does not call
> git_extract_argv0_path(). It never mattered before, but as of v2.8.0,
> one of the library functions it calls wants to use system_path(), which
> assumes that the argv0 stuff has been set up.
> 
> I'm preparing some patches to fix this case (and some other related
> ones).

Here they are:

  [1/5]: add an extra level of indirection to main()
  [2/5]: common-main: call git_extract_argv0_path()
  [3/5]: common-main: call sanitize_stdfds()
  [4/5]: common-main: call restore_sigpipe_to_default()
  [5/5]: common-main: call git_setup_gettext()

The first patch is refactoring so we can fix this problem once, rather
than in the dozens of programs that need it.

The second should fix the problem you saw. It's unfortunate that the
tests didn't detect it; there's some discussion of that in the commit
message.

Patches 3-5 are other places where we can use the refactoring to be more
consistent and remove boilerplate code.

I almost did a patch 6 moving trace_command_performance(), but I'm not
sure if people would care or not. Running "git foo" already covers that,
even for an external command, because the git wrapper waits until the
sub-process finishes. Setting it up in each sub-program would mean:

  - you get it for dashed externals you run directly, too

  - for external programs run as "git foo", you get _two_ times. One for
the time the actual sub-program ran, and one with the overhead of
the git wrapper process.

I'm not sure if people actually care about that distinction, or
whether the extra number would simply be annoying.

So I punted. It's easy to move it over later if anybody cares.

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH 1/5] add an extra level of indirection to main()

2016-06-30 Thread Jeff King

There are certain startup tasks that we expect every git
process to do. In some cases this is just to improve the
quality of the program (e.g., setting up gettext()). In
others it is a requirement for using certain functions in
libgit.a (e.g., system_path() expects that you have called
git_extract_argv0_path()).

Most commands are builtins and are covered by the git.c
version of main(). However, there are still a few external
commands that use their own main(). Each of these has to
remember to include the correct startup sequence, and we are
not always consistent.

Rather than just fix the inconsistencies, let's make this
harder to get wrong by providing a common main() that can
run this standard startup.

We basically have two options to do this:

 - the compat/mingw.h file already does something like this by
   adding a #define that replaces the definition of main with a
   wrapper that calls mingw_startup().

   The upside is that the code in each program doesn't need
   to be changed at all; it's rewritten on the fly by the
   preprocessor.

   The downside is that it may make debugging of the startup
   sequence a bit more confusing, as the preprocessor is
   quietly inserting new code.

 - the builtin functions are all of the form cmd_foo(),
   and git.c's main() calls them.

   This is much more explicit, which may make things more
   obvious to somebody reading the code. It's also more
   flexible (because of course we have to figure out _which_
   cmd_foo() to call).

   The downside is that each of the builtins must define
   cmd_foo(), instead of just main().

This patch chooses the latter option, preferring the more
explicit approach, even though it is more invasive. We
introduce a new file common-main.c, with the "real" main. It
expects to call cmd_main() from whatever other objects it is
linked against.

We link common-main.o against anything that links against
libgit.a, since we know that such programs will need to do
this setup. Note that common-main.o can't actually go inside
libgit.a, as the linker would not pick up its main()
function automatically (it has no callers).

The rest of the patch is just adjusting all of the various
external programs (mostly in t/helper) to use cmd_main().
I've provided a global declaration for cmd_main(), which
means that all of the programs also need to match its
signature. In particular, many functions need to switch to
"const char **" instead of "char **" for argv. This effect
ripples out to a few other variables and functions, as well.

This makes the patch even more invasive, but the end result
is much better. We should be treating argv strings as const
anyway, and now all programs conform to the same signature
(which also matches the way builtins are defined).

Signed-off-by: Jeff King 
---
I waffled between the two mechanisms. Opinions welcome.

 Makefile   | 17 +
 common-main.c  | 12 
 credential-cache--daemon.c |  2 +-
 credential-cache.c |  2 +-
 credential-store.c |  2 +-
 daemon.c   |  8 
 fast-import.c  |  4 ++--
 git-compat-util.h  |  2 ++
 git.c  |  3 +--
 http-backend.c |  2 +-
 http-fetch.c   |  2 +-
 http-push.c|  6 +++---
 imap-send.c|  2 +-
 remote-curl.c  |  2 +-
 remote-testsvn.c   |  2 +-
 sh-i18n--envsubst.c|  2 +-
 shell.c|  2 +-
 show-index.c   |  2 +-
 t/helper/test-chmtime.c|  2 +-
 t/helper/test-config.c |  2 +-
 t/helper/test-ctype.c  |  2 +-
 t/helper/test-date.c   |  8 
 t/helper/test-delta.c  |  2 +-
 t/helper/test-dump-cache-tree.c|  2 +-
 t/helper/test-dump-split-index.c   |  2 +-
 t/helper/test-dump-untracked-cache.c   |  2 +-
 t/helper/test-fake-ssh.c   |  2 +-
 t/helper/test-genrandom.c  |  2 +-
 t/helper/test-hashmap.c|  2 +-
 t/helper/test-index-version.c  |  2 +-
 t/helper/test-line-buffer.c|  2 +-
 t/helper/test-match-trees.c|  2 +-
 t/helper/test-mergesort.c  |  2 +-
 t/helper/test-mktemp.c |  2 +-
 t/helper/test-parse-options.c  |  2 +-
 t/helper/test-path-utils.c |  4 ++--
 t/helper/test-prio-queue.c |  2 +-
 t/helper/test-read-cache.c |  2 +-
 t/helper/test-regex.c  |  2 +-
 t/helper/test-revision-walking.c   |  2 +-
 t/helper/test-run-command.c|  2 +-
 t/helper/test-scrap-cache-tree.c   |  2 +-
 t/helper/test-sha1-array.c |  2 +-
 t/helper/test-sha1.c

[PATCH 2/5] common-main: call git_extract_argv0_path()

2016-06-30 Thread Jeff King

Every program which links against libgit.a must call this
function, or risk hitting an assert() in system_path() that
checks whether we have configured argv0_path (though only
when RUNTIME_PREFIX is defined, so essentially only on
Windows).

Looking at the diff, you can see that putting it into the
common main() saves us having to do it individually in each
of the external commands. But what you can't see are the
cases where we _should_ have been doing so, but weren't
(e.g., git-credential-store, and all of the t/helper test
programs).

This has been an accident-waiting-to-happen for a long time,
but wasn't triggered until recently because it involves one
of those programs actually calling system_path(). That
happened with git-credential-store in v2.8.0 with ae5f677
(lazily load core.sharedrepository, 2016-03-11). The
program:

  - takes a lock file, which...

  - opens a tempfile, which...

  - calls adjust_shared_perm to fix permissions, which...

  - lazy-loads the config (as of ae5f677), which...

  - calls system_path() to find the location of
/etc/gitconfig

On systems with RUNTIME_PREFIX, this means credential-store
reliably hits that assert() and cannot be used.

We never noticed in the test suite, because we set
GIT_CONFIG_NOSYSTEM there, which skips the system_path()
lookup entirely.  But if we were to tweak git_config() to
find /etc/gitconfig even when we aren't going to open it,
then the test suite shows multiple failures (for
credential-store, and for some other test helpers). I didn't
include that tweak here because it's way too specific to
this particular call to be worth carrying around what is
essentially dead code.

The implementation is fairly straightforward, with one
exception: there is exactly one caller (git.c) that actually
cares about the result of the function, and not the
side-effect of setting up argv0_path. We can accommodate
that by simply replacing the value of argv[0] in the array
we hand down to cmd_main().

Signed-off-by: Jeff King 
---
 common-main.c| 3 +++
 daemon.c | 3 ---
 fast-import.c| 3 ---
 git.c| 2 +-
 http-backend.c   | 1 -
 http-fetch.c | 2 --
 http-push.c  | 2 --
 imap-send.c  | 2 --
 remote-curl.c| 1 -
 remote-testsvn.c | 1 -
 shell.c  | 2 --
 upload-pack.c| 1 -
 12 files changed, 4 insertions(+), 19 deletions(-)

diff --git a/common-main.c b/common-main.c
index 2b96bbf..57c912a 100644
--- a/common-main.c
+++ b/common-main.c
@@ -1,4 +1,5 @@
 #include "git-compat-util.h"
+#include "exec_cmd.h"
 
 int main(int argc, char **av)
 {
@@ -8,5 +9,7 @@ int main(int argc, char **av)
 */
const char **argv = (const char **)av;
 
+   argv[0] = git_extract_argv0_path(argv[0]);
+
return cmd_main(argc, argv);
 }
diff --git a/daemon.c b/daemon.c
index 4326cd0..d33ee5a 100644
--- a/daemon.c
+++ b/daemon.c
@@ -1,6 +1,5 @@
 #include "cache.h"
 #include "pkt-line.h"
-#include "exec_cmd.h"
 #include "run-command.h"
 #include "strbuf.h"
 #include "string-list.h"
@@ -1204,8 +1203,6 @@ int cmd_main(int argc, const char **argv)
 
git_setup_gettext();
 
-   git_extract_argv0_path(argv[0]);
-
for (i = 1; i < argc; i++) {
const char *arg = argv[i];
const char *v;
diff --git a/fast-import.c b/fast-import.c
index 7cb7434..f12cd00 100644
--- a/fast-import.c
+++ b/fast-import.c
@@ -164,7 +164,6 @@ Format of STDIN stream:
 #include "refs.h"
 #include "csum-file.h"
 #include "quote.h"
-#include "exec_cmd.h"
 #include "dir.h"
 #include "run-command.h"
 
@@ -3449,8 +3448,6 @@ int cmd_main(int argc, const char **argv)
 {
unsigned int i;
 
-   git_extract_argv0_path(argv[0]);
-
git_setup_gettext();
 
if (argc == 2 && !strcmp(argv[1], "-h"))
diff --git a/git.c b/git.c
index e244404..3b4e12d 100644
--- a/git.c
+++ b/git.c
@@ -635,7 +635,7 @@ int cmd_main(int argc, const char **argv)
const char *cmd;
int done_help = 0;
 
-   cmd = git_extract_argv0_path(argv[0]);
+   cmd = argv[0];
if (!cmd)
cmd = "git-help";
 
diff --git a/http-backend.c b/http-backend.c
index 411c835..5375cbc 100644
--- a/http-backend.c
+++ b/http-backend.c
@@ -642,7 +642,6 @@ int cmd_main(int argc, const char **argv)
 
git_setup_gettext();
 
-   git_extract_argv0_path(argv[0]);
set_die_routine(die_webcgi);
set_die_is_recursing_routine(die_webcgi_recursing);
 
diff --git a/http-fetch.c b/http-fetch.c
index eb559eb..244cd0d 100644
--- a/http-fetch.c
+++ b/http-fetch.c
@@ -24,8 +24,6 @@ int cmd_main(int argc, const char **argv)
 
git_setup_gettext();
 
-   git_extract_argv0_path(argv[0]);
-
while (arg < argc && argv[arg][0] == '-') {
if (argv[arg][1] == 't') {
get_tree = 1;
diff --git a/http-push.c b/http-push.c
index df20e44..3a5fecf 100644
--- a/http-push.c
+++ b/http-push.c
@@ -1711,8 +1711,6 @@ int cmd_main(int argc, const char **ar

[PATCH 3/5] common-main: call sanitize_stdfds()

2016-06-30 Thread Jeff King

This is setup that should be done in every program for
safety, but we never got around to adding it everywhere (so
builtins benefited from the call in git.c, but any external
commands did not). Putting it in the common main() gives us
this safety everywhere.

Note that the case in daemon.c is a little funny. We wait
until we know whether we want to daemonize, and then either:

 - call daemonize(), which will close stdio and reopen it to
   /dev/null under the hood

 - sanitize_stdfds(), to fix up any odd cases

But that is way too late; the point of sanitizing is to give
us reliable descriptors on 0/1/2, and we will already have
executed code, possibly called die(), etc. The sanitizing
should be the very first thing that happens.

With this patch, git-daemon will sanitize first, and can
remove the call in the non-daemonize case. It does mean that
daemonize() may just end up closing the descriptors we
opened, but that's not a big deal (it's not wrong to do so,
nor is it really less optimal than the case where our parent
process redirected us from /dev/null ahead of time).

Signed-off-by: Jeff King 
---
 common-main.c | 9 -
 daemon.c  | 3 +--
 git.c | 7 ---
 shell.c   | 7 ---
 4 files changed, 9 insertions(+), 17 deletions(-)

diff --git a/common-main.c b/common-main.c
index 57c912a..353c6ea 100644
--- a/common-main.c
+++ b/common-main.c
@@ -1,4 +1,4 @@
-#include "git-compat-util.h"
+#include "cache.h"
 #include "exec_cmd.h"
 
 int main(int argc, char **av)
@@ -9,6 +9,13 @@ int main(int argc, char **av)
 */
const char **argv = (const char **)av;
 
+   /*
+* Always open file descriptors 0/1/2 to avoid clobbering files
+* in die().  It also avoids messing up when the pipes are dup'ed
+* onto stdin/stdout/stderr in the child processes we spawn.
+*/
+   sanitize_stdfds();
+
argv[0] = git_extract_argv0_path(argv[0]);
 
return cmd_main(argc, argv);
diff --git a/daemon.c b/daemon.c
index d33ee5a..8646f33 100644
--- a/daemon.c
+++ b/daemon.c
@@ -1378,8 +1378,7 @@ int cmd_main(int argc, const char **argv)
if (detach) {
if (daemonize())
die("--detach not supported on this platform");
-   } else
-   sanitize_stdfds();
+   }
 
if (pid_file)
write_file(pid_file, "%"PRIuMAX, (uintmax_t) getpid());
diff --git a/git.c b/git.c
index 3b4e12d..b65083c 100644
--- a/git.c
+++ b/git.c
@@ -639,13 +639,6 @@ int cmd_main(int argc, const char **argv)
if (!cmd)
cmd = "git-help";
 
-   /*
-* Always open file descriptors 0/1/2 to avoid clobbering files
-* in die().  It also avoids messing up when the pipes are dup'ed
-* onto stdin/stdout/stderr in the child processes we spawn.
-*/
-   sanitize_stdfds();
-
restore_sigpipe_to_default();
 
git_setup_gettext();
diff --git a/shell.c b/shell.c
index ca00807..5e70acb 100644
--- a/shell.c
+++ b/shell.c
@@ -147,13 +147,6 @@ int cmd_main(int argc, const char **argv)
 
git_setup_gettext();
 
-   /*
-* Always open file descriptors 0/1/2 to avoid clobbering files
-* in die().  It also avoids messing up when the pipes are dup'ed
-* onto stdin/stdout/stderr in the child processes we spawn.
-*/
-   sanitize_stdfds();
-
/*
 * Special hack to pretend to be a CVS server
 */
-- 
2.9.0.317.g65b4e7c

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH 4/5] common-main: call restore_sigpipe_to_default()

2016-06-30 Thread Jeff King

This is another safety/sanity setup that should be in force
everywhere, but which we only applied in git.c. This did
catch most cases, since even external commands are typically
run via "git ..." (and the restoration applies to
sub-processes, too). But there were cases we missed, such as
somebody calling git-upload-pack directly via ssh, or
scripts which use dashed external commands directly.

Signed-off-by: Jeff King 
---
 common-main.c | 23 +++
 git.c | 23 ---
 2 files changed, 23 insertions(+), 23 deletions(-)

diff --git a/common-main.c b/common-main.c
index 353c6ea..20e55ef 100644
--- a/common-main.c
+++ b/common-main.c
@@ -1,6 +1,27 @@
 #include "cache.h"
 #include "exec_cmd.h"
 
+/*
+ * Many parts of Git have subprograms communicate via pipe, expect the
+ * upstream of a pipe to die with SIGPIPE when the downstream of a
+ * pipe does not need to read all that is written.  Some third-party
+ * programs that ignore or block SIGPIPE for their own reason forget
+ * to restore SIGPIPE handling to the default before spawning Git and
+ * break this carefully orchestrated machinery.
+ *
+ * Restore the way SIGPIPE is handled to default, which is what we
+ * expect.
+ */
+static void restore_sigpipe_to_default(void)
+{
+   sigset_t unblock;
+
+   sigemptyset(&unblock);
+   sigaddset(&unblock, SIGPIPE);
+   sigprocmask(SIG_UNBLOCK, &unblock, NULL);
+   signal(SIGPIPE, SIG_DFL);
+}
+
 int main(int argc, char **av)
 {
/*
@@ -18,5 +39,7 @@ int main(int argc, char **av)
 
argv[0] = git_extract_argv0_path(argv[0]);
 
+   restore_sigpipe_to_default();
+
return cmd_main(argc, argv);
 }
diff --git a/git.c b/git.c
index b65083c..ccb24fd 100644
--- a/git.c
+++ b/git.c
@@ -609,27 +609,6 @@ static int run_argv(int *argcp, const char ***argv)
return done_alias;
 }
 
-/*
- * Many parts of Git have subprograms communicate via pipe, expect the
- * upstream of a pipe to die with SIGPIPE when the downstream of a
- * pipe does not need to read all that is written.  Some third-party
- * programs that ignore or block SIGPIPE for their own reason forget
- * to restore SIGPIPE handling to the default before spawning Git and
- * break this carefully orchestrated machinery.
- *
- * Restore the way SIGPIPE is handled to default, which is what we
- * expect.
- */
-static void restore_sigpipe_to_default(void)
-{
-   sigset_t unblock;
-
-   sigemptyset(&unblock);
-   sigaddset(&unblock, SIGPIPE);
-   sigprocmask(SIG_UNBLOCK, &unblock, NULL);
-   signal(SIGPIPE, SIG_DFL);
-}
-
 int cmd_main(int argc, const char **argv)
 {
const char *cmd;
@@ -639,8 +618,6 @@ int cmd_main(int argc, const char **argv)
if (!cmd)
cmd = "git-help";
 
-   restore_sigpipe_to_default();
-
git_setup_gettext();
 
trace_command_performance(argv);
-- 
2.9.0.317.g65b4e7c

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH 5/5] common-main: call git_setup_gettext()

2016-06-30 Thread Jeff King

This should be part of every program, as otherwise users do
not get translated error messages. However, some external
commands forgot to do so (e.g., git-credential-store). This
fixes them, and eliminates the repeated code in programs
that did remember to use it.

Signed-off-by: Jeff King 
---
 common-main.c  | 2 ++
 daemon.c   | 2 --
 fast-import.c  | 2 --
 git.c  | 2 --
 http-backend.c | 2 --
 http-fetch.c   | 2 --
 http-push.c| 2 --
 imap-send.c| 2 --
 remote-curl.c  | 2 --
 shell.c| 2 --
 show-index.c   | 2 --
 upload-pack.c  | 2 --
 12 files changed, 2 insertions(+), 22 deletions(-)

diff --git a/common-main.c b/common-main.c
index 20e55ef..3be5ad1 100644
--- a/common-main.c
+++ b/common-main.c
@@ -37,6 +37,8 @@ int main(int argc, char **av)
 */
sanitize_stdfds();
 
+   git_setup_gettext();
+
argv[0] = git_extract_argv0_path(argv[0]);
 
restore_sigpipe_to_default();
diff --git a/daemon.c b/daemon.c
index 8646f33..e647254 100644
--- a/daemon.c
+++ b/daemon.c
@@ -1201,8 +1201,6 @@ int cmd_main(int argc, const char **argv)
struct credentials *cred = NULL;
int i;
 
-   git_setup_gettext();
-
for (i = 1; i < argc; i++) {
const char *arg = argv[i];
const char *v;
diff --git a/fast-import.c b/fast-import.c
index f12cd00..bf53ac9 100644
--- a/fast-import.c
+++ b/fast-import.c
@@ -3448,8 +3448,6 @@ int cmd_main(int argc, const char **argv)
 {
unsigned int i;
 
-   git_setup_gettext();
-
if (argc == 2 && !strcmp(argv[1], "-h"))
usage(fast_import_usage);
 
diff --git a/git.c b/git.c
index ccb24fd..0f1937f 100644
--- a/git.c
+++ b/git.c
@@ -618,8 +618,6 @@ int cmd_main(int argc, const char **argv)
if (!cmd)
cmd = "git-help";
 
-   git_setup_gettext();
-
trace_command_performance(argv);
 
/*
diff --git a/http-backend.c b/http-backend.c
index 5375cbc..0d59499 100644
--- a/http-backend.c
+++ b/http-backend.c
@@ -640,8 +640,6 @@ int cmd_main(int argc, const char **argv)
char *cmd_arg = NULL;
int i;
 
-   git_setup_gettext();
-
set_die_routine(die_webcgi);
set_die_is_recursing_routine(die_webcgi_recursing);
 
diff --git a/http-fetch.c b/http-fetch.c
index 244cd0d..3b556d6 100644
--- a/http-fetch.c
+++ b/http-fetch.c
@@ -22,8 +22,6 @@ int cmd_main(int argc, const char **argv)
int get_verbosely = 0;
int get_recover = 0;
 
-   git_setup_gettext();
-
while (arg < argc && argv[arg][0] == '-') {
if (argv[arg][1] == 't') {
get_tree = 1;
diff --git a/http-push.c b/http-push.c
index 3a5fecf..dacada9 100644
--- a/http-push.c
+++ b/http-push.c
@@ -1709,8 +1709,6 @@ int cmd_main(int argc, const char **argv)
int new_refs;
struct ref *ref, *local_refs;
 
-   git_setup_gettext();
-
repo = xcalloc(1, sizeof(*repo));
 
argv++;
diff --git a/imap-send.c b/imap-send.c
index 125b218..9cbe27f 100644
--- a/imap-send.c
+++ b/imap-send.c
@@ -1500,8 +1500,6 @@ int cmd_main(int argc, const char **argv)
int total;
int nongit_ok;
 
-   git_setup_gettext();
-
setup_git_directory_gently(&nongit_ok);
git_imap_config();
 
diff --git a/remote-curl.c b/remote-curl.c
index d39f4cf..6b83b77 100644
--- a/remote-curl.c
+++ b/remote-curl.c
@@ -989,8 +989,6 @@ int cmd_main(int argc, const char **argv)
struct strbuf buf = STRBUF_INIT;
int nongit;
 
-   git_setup_gettext();
-
setup_git_directory_gently(&nongit);
if (argc < 2) {
error("remote-curl: usage: git remote-curl  []");
diff --git a/shell.c b/shell.c
index 5e70acb..464ee1a 100644
--- a/shell.c
+++ b/shell.c
@@ -145,8 +145,6 @@ int cmd_main(int argc, const char **argv)
struct commands *cmd;
int count;
 
-   git_setup_gettext();
-
/*
 * Special hack to pretend to be a CVS server
 */
diff --git a/show-index.c b/show-index.c
index 575f9c5..1ead41e 100644
--- a/show-index.c
+++ b/show-index.c
@@ -11,8 +11,6 @@ int cmd_main(int argc, const char **argv)
unsigned int version;
static unsigned int top_index[256];
 
-   git_setup_gettext();
-
if (argc != 1)
usage(show_index_usage);
if (fread(top_index, 2 * 4, 1, stdin) != 1)
diff --git a/upload-pack.c b/upload-pack.c
index 681fd2f..8769b1b 100644
--- a/upload-pack.c
+++ b/upload-pack.c
@@ -836,8 +836,6 @@ int cmd_main(int argc, const char **argv)
OPT_END()
};
 
-   git_setup_gettext();
-
packet_trace_identity("upload-pack");
check_replace_refs = 0;
 
-- 
2.9.0.317.g65b4e7c
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v4 0/5] Better ref summary alignment in "git fetch"

2016-06-30 Thread Jeff King

On Thu, Jun 30, 2016 at 06:16:09PM +0200, Duy Nguyen wrote:

> > I've shrunk it a bit to fit in the email; my actual "long" name was much
> > larger. And the average length for the shorter ones is, too, but the
> > overall effect is the same; almost every line has a huge run of
> > whitespace. And some lines wrap that would not have even under the
> > normal, duplicated scheme.
> 
> I was about to resend with s/\$/*/g and ignored this issue (with a
> note) then it occurred to me we can simply ignore these long lines
> from column width calculation. Yeah such a line may still be wrapped
> around, but it will not push every other line to the far right. We
> already have code for that in adjust_refcol_width()
> 
> max= term_columns();
> ...
> /*
> * rough estimation to see if the output line is too long and
> * should not be counted (we can't do precise calculation
> * anyway because we don't know if the error explanation part
> * will be printed in update_local_ref)
> */
> if (21 /* flag and summary */ + rlen + 4 /* -> */ + llen >= max)
> return;
> ...
> 
> we can limit max to, like, term_columns() / 2 (or 2/3 of
> term_columns). There's no perfect number, some people will still find
> the output ugly _often_. But hopefully the majority won't. What do you
> think?

Without having seen the output, my gut feeling is "meh". It will
hopefully avoid the wrapping, and it will reduce the size of the huge
whitespace gap, but not eliminate it. So I think it will still be
somewhat ugly.

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: --abbrev-commit gives longer hash than necessary

2016-06-30 Thread Jeff King

On Thu, Jun 30, 2016 at 09:38:45PM +0200, Steffen Nurpmeso wrote:

> For some time (currently with 2.9.0) know see that a single commit
> gives a longer hash than necessary, even though there is no
> ambiguity:
> 
>   ?0[steffen@wales ]$ git longca|
>   awk 'BEGIN{l7=0;l8=0}\
> /^[[:alnum:]]{7} /{++l7;next}\
> /^[[:alnum:]]{8} /{++l8;print}\
>   END{print "L7 " l7 " L8 " l8}'
>   786d0c9c [mimepipe.2] send.c:sendpart(): force iconv(3)+ for TEXT part 
> handlers..
>   L7 3364 L8 1

I don't know what your "git longca" alias is, but presumably it's
running "git log". That will show just _commit_ hashes, but the
abbreviation code will consider all objects.

There are some places you can use an abbreviated sha1 that know they are
looking for a commit, but most places will generally complain if there
is ambiguity between a blob and commit (e.g., "git show $foo").

It's also presumably just traversing HEAD, so it would miss commits on
other branches.

>   ?0[steffen@wales ]$ git long|cut -f1 -d' '|grep ^786
>   786d0c9c
>   786f219

Try "git rev-list --objects --all | grep ^786d0c9".

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] log: decorate HEAD -> branch with the same color for branch and HEAD

2016-06-30 Thread Jeff King

On Thu, Jun 30, 2016 at 06:39:42PM +0200, Nguyễn Thái Ngọc Duy wrote:

> Commit 76c61fb (log: decorate HEAD with branch name under
> --decorate=full, too - 2015-05-13) adds "HEAD -> branch" decoration to
> show current branch vs detached HEAD. The sign of whether HEAD is
> detached or not is "->" (vs ", "). It's too subtle for my poor
> eyes. If color is used, we can make the branch name's color the same
> as HEAD to visually emphasize that it's the current branch.

Hmm. I think I like this, as it uses color to make the grouping between
HEAD and its referent more clear.

We do already use colors to indicate "type", though. Which means:

  1. The branch now uses the "symref" color. Probably OK, as that is the
 point of the grouping (I wonder if the "->" in the middle should
 match in color, too).

  2. We used to color based on current_and_HEAD->type, but now that
 information isn't conveyed. However, can this really ever have been
 anything _except_ a branch?

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [RFD] Place to document magic pathspecs like ":/" and pathspec handling

2016-06-30 Thread Jeff King

On Thu, Jun 30, 2016 at 05:14:15PM +0200, Duy Nguyen wrote:

> > So we could describe how Git handles pathspecs and pathspec magic
> > in the new manual page named gitpathspec(7), or gitpaths(7). The
> > former has the advantage of the name being identical to the entry
> > in gitglossary(7). The latter has the probable advantage of being
> > easier for the Git novice to find,
> 
> git-pathspec(7) is a great idea. It bugs me that all the pathspec
> details are hidden in that glossary file (though I didn't know it was
> also available via "git help glossary", which probably just reinforces
> its invisibility). The closet place I could think of was git(1) but
> that page is already very long.
> 
> > and that it could be used to
> > gather various ways to generate list of files in Git (files in
> > the working area, files in the staging area aka the index, files
> > in the revision / tree object, changed files, etc.);
> 
> You mean the list of commands in this man page? OK. Another thing we
> could do is the reverse direction, add gitpathspec(7) as a reference
> to all commands that may need it.

I'll second (third?) this. Some of our manpages are quite unwieldy, and
I don't think we are doing any favors to users when we auto-include the
same content in multiple places. I think we should instead be splitting
concepts into their own pages and referring to them. That helps users
build the mental model ("aha, pathspecs; I read that concept page and
know how those work!" instead of "here's 3 pages of description on how
git-diff handles paths. Is it identical to the 3 pages in git-log?").

I suspect some things like diff options could benefit from a similar
split.

> >  Sidenote: I wonder if people (especially novices) have problem
> >  finding relevant documentation, and if adding something like
> >  "git apropos " command ("apropos", or "man -k"), or
> >  add the '--apropos' option to "git help" would be useful...
> >  and if it would be easy to create.
> 
> I have that problem sometimes and I don't think I can call myself a
> noob anymore. I usually need to do some grepping. So yeah "git
> apropos" would be great.

I think better searching would help a lot if we split things into more
pages (right now, a legitimate technique is "man git-foo", using search
in the pager, and wading through all of the false positives. That sucks,
but at least you can find what you want _eventually_).

I'm not sure if "man -k" is all that great, though, for two reasons:

  1. AFAIK there isn't an easy way to restrict it only to git manpages
 (try "man -k git"; oh, hello "isxdigit(3posix)").

  2. It only searches the title lines. So it's great for finding the
 pathspec page, but you could probably already do that by guessing
 it's called "git help pathspecs". But if you're looking for
 discussion of a particular diff option, say, it would be nice to be
 able to search for all mentions of "--word-diff-regex", or
 something.

 There is "man -K", but it kind of sucks (it seems to just dump you
 in any manpage that matches, in a list).

I don't think those are problems that _git_ should necessarily be
solving, though. It's a general problem for manpages. And there may be
better "man" implementations than I'm used to (or even options or
configuration I don't know about).

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: topological index field for commit objects

2016-06-30 Thread Marc Strapetz


On 01.07.2016 05:17, Jeff King wrote:

On Thu, Jun 30, 2016 at 11:12:52AM -0700, Linus Torvalds wrote:


I do think that it's ok to cache generation numbers somewhere if there
is an algorithm that can make use of them, but every time this comes
up, it's just not been important enough to make a big deal and a new
incompatible object format for it. The committer date is preexisting
and has existing pseudo-generation-number usage, so..improving on the
quality of it sounds like a good idea.


If you are OK with a cache, I don't think one needs to change the object
format at all. It can be computed on the fly, and is purely a local
optimization.


What I like about the local cache is that it can be fixed easily: if we 
had generation numbers already, there would certainly be a few 
repositories now which would violate the parent - child condition 
(however that has happened) and then you have to deal with this scenario 
anyway (or tell the user to rewrite his entire history). It's a lot 
easier to throw away and rebuild the cache.


Also, the local cache can be improved over time, starting with 
generation numbers now and one day supporting a FELINE index or whatever 
will come.


-Marc
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

69 matches

Mail list logo