date:20160718

Hi Brian,

On Sun, 17 Jul 2016, brian m. carlson wrote:

> On Sun, Jul 17, 2016 at 10:01:38AM +0200, Johannes Schindelin wrote:
> > Out of curiosity: have you considered something like padding the SHA-1s
> > with, say 0xa1, to the size of the new hash and using that padding to
> > distinguish between old vs new hash?
> 
> I'm going to end up having to do something similar because of the issue
> of submodules.  Submodules may still be SHA-1, while the main repo may
> be a newer hash.  I was going to zero-pad, however.

I thought about zero-padding, but there are plenty of
is_null_sha1()/is_null_oid() calls around. Of course, I assumed
left-padding. But you may have thought of right-padding instead? That
would make short name handling much easier, too.

FWIW it never crossed my mind to allow different same-sized hash
algorithms. So I never thought we'd need a way to distinguish, say,
BLAKE2b-256 from SHA-256.

Is there a good reason to add the maintenance burden of several 256-bit
hash algorithms, apart from speed (which in my mind should decide which
one to use, always, rather than letting the user choose)? It would also
complicate transport even further, let alone subtree merges from
differently-hashed repositories.

Ciao,
Dscho
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: Git and SHA-1 security (again)

Hi Zsolt,

On Mon, 18 Jul 2016, Herczeg Zsolt wrote:

> I think converting is a much better option. Use a single-hash storage, and
> convert everything to that on import/clone/pull.

That ignores two very important issues that I already had mentioned:

- existing references, both in-repository, e.g. in commit messages
  referring to earlier commits, as well as out-of-repository, e.g.
  referring to commits in mails, blog posts, etc

- GPG-signed commits

Those issues cannot just be hand-waved away.

The "convert everything" strategy also ignores the problem of interacting
with servers and collaborators. Think of hosting repositories,
rediscovering forgotten work trees, and of the "D" in DSCM.

Ciao,
Johannes
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: Question: Getting 'git diff' to generate /usr/bin/diff output

Hi Norm,

On Sun, 17 Jul 2016, n...@dad.org wrote:

>  writes:
> >
> >The other replies covered how to use the system's own diff instead.
> >Just curious: What makes using git diff difficult and its output hard to
> >deal with for you?
> 
> In decreasing importance order:
> 
> I am 84 years old.

Wow. Chapeau! I am impressed.

> I have been using /usr/bin/diff for more than four decades.  And having
> to learn how to read the output of 'git diff' makes learning how to use
> git a more difficult trick for this old dog to learn. True, the diff of
> today is very different from the diff of 1972, but the changes happened
> gradually.

Curious: do you use context diff (GNU diff's default) or unified diffs?

> I have scripts which process the output of /usr/bin/diff.

Even more curious: what do those scripts do? Maybe they do things that we
either can already do with Git's diff, or that we can teach Git.

> 'git diff' outputs escape characters which clutter my terminal. Yes, I
> can sed them out, but then why are they there?

Those are most likely the ANSI sequences to add color. Can you call Git
with the --no-color option and see whether the escape characters go away?

Ciao,
Johannes
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH] pager: disable color when pager is "more"

Johannes Schindelin  wrote:
> On Sun, 17 Jul 2016, n...@dad.org wrote:
> > 'git diff' outputs escape characters which clutter my terminal. Yes, I
> > can sed them out, but then why are they there?
> 
> Those are most likely the ANSI sequences to add color. Can you call Git
> with the --no-color option and see whether the escape characters go away?

Norm: do you have PAGER=more set by any chance?
Perhaps changing it to "less" will allow you to preserve colors.

I saw a similar or identical problem during my vacation in
FreeBSD-land.  Perhaps the out-of-the-box experience can be
improved:

-8<-
Subject: [PATCH] pager: disable color when pager is "more"

more(1) traditionally cannot handle colors.

On FreeBSD 10.3, a new user ~/.profile explicitly sets
PAGER=more, but does not configure it to display colors, leading
to a bad out-of-the-box experience with escape sequences being
seen by the user.

In the FreeBSD 10.3 case, /usr/bin/more is actually a hardlink
to /usr/bin/less and capable of handling colors.  While we could
set MORE=FRX, this breaks other more(1) implementations,
including the one provided by util-linux on common GNU/Linux
systems.

So take the safe route and assume anybody still using more(1)
today can live with monochrome output, but acknowledge 'R'
in the MORE environment variable if it was set by the user.

Signed-off-by: Eric Wong 
---
 environment.c |  2 +-
 pager.c   | 11 +++
 2 files changed, 12 insertions(+), 1 deletion(-)

diff --git a/environment.c b/environment.c
index ca72464..cfb56fd 100644
--- a/environment.c
+++ b/environment.c
@@ -41,7 +41,7 @@ size_t packed_git_limit = DEFAULT_PACKED_GIT_LIMIT;
 size_t delta_base_cache_limit = 96 * 1024 * 1024;
 unsigned long big_file_threshold = 512 * 1024 * 1024;
 const char *pager_program;
-int pager_use_color = 1;
+int pager_use_color = -1;
 const char *editor_program;
 const char *askpass_program;
 const char *excludes_file;
diff --git a/pager.c b/pager.c
index 4bc0481..3110df4 100644
--- a/pager.c
+++ b/pager.c
@@ -80,6 +80,17 @@ void setup_pager(void)
if (!pager)
return;

+   if (pager_use_color < 0 && !strcmp(pager, "more")) {
+   const char *more = getenv("MORE");
+
+   /*
+* MORE=R does not work everywhere, so we cannot set it,
+* but we can respect it if set.
+*/
+   if (!more || !strchr(more, 'R'))
+   pager_use_color = 0;
+   }
+
/*
 * force computing the width of the terminal before we redirect
 * the standard output to the pager.
-- 
EW
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH] configure.ac: stronger test for pthread linkage

We need to test linkage of pthread_create and pthread_join,
as pthread_mutex_* and pthread_key_* functions do not need
extra linkage under FreeBSD 10.3, leading to a false-positive
of the empty case.

Signed-off-by: Eric Wong 
---
 configure.ac | 5 +
 1 file changed, 5 insertions(+)

diff --git a/configure.ac b/configure.ac
index c279025..aa9c91d 100644
--- a/configure.ac
+++ b/configure.ac
@@ -1108,14 +1108,19 @@ GIT_CONF_SUBST([HAVE_BSD_SYSCTL])
 AC_DEFUN([PTHREADTEST_SRC], [
 AC_LANG_PROGRAM([[
 #include 
+static void *noop(void *ignore) { return ignore; }
 ]], [[
pthread_mutex_t test_mutex;
pthread_key_t test_key;
+   pthread_t th;
int retcode = 0;
+   void *ret = (void *)0;
retcode |= pthread_key_create(&test_key, (void *)0);
retcode |= pthread_mutex_init(&test_mutex,(void *)0);
retcode |= pthread_mutex_lock(&test_mutex);
retcode |= pthread_mutex_unlock(&test_mutex);
+   retcode |= pthread_create(&th, ret, noop, ret);
+   retcode |= pthread_join(th, &ret);
return retcode;
 ]])])
 
-- 
EW
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] test-lib: stricter unzip(1) check

On Mon, Jul 18, 2016 at 06:44:31AM +, Eric Wong wrote:

> On FreeBSD 10.3 (but presumably any FreeBSD 8+), /usr/bin/unzip
> exists, but is insufficient for t5003 due to its non-standard
> handling of the -a option[1].  This version of unzip exits
> with "1" when given the "-v" flag.
> 
> However, the common Info-ZIP version may be installed at
> /usr/local/bin/unzip (via "pkg install unzip") to pass t5003.
> This Info-ZIP version exits with "0" when given "-v",
> so limit the prereq to only versions which return 0 on "-v".

I guess I am cc'd because of f838ce5 (test-lib: factor out $GIT_UNZIP
setup, 2013-03-10). But really this check for 127 dates all the way back
to Johannes's 3a86f36 (t5000: skip ZIP tests if unzip was not found,
2007-06-06), and was just pulled along as it got refactored into a
various incarnations of prerequisite by me and René.

It's possible that there is some version of unzip that does not like
"-v" but otherwise is OK with our tests, but we would skip tests using
this patch. Even with the FreeBSD version you mention, I'd expect you
could run all of the tests except for the single "-a" test.

So I wonder if we could more directly test the two levels we care about:

  - do you appear to have a working "unzip" at all?

  - does your unzip support "-a"?

My Debian version of unzip (which is derived from Info-zip) seems to
give return code 0 for just "unzip". So for the first check, we could
possibly drop "-v"; we don't care about "-v", but just wanted some way
to say "does unzip exist on the path?". Another option would just be
checking whether "unzip" returns something besides 127 (so what we have
now, minus "-v").

To test for "-a", I think we'd have to actually feed it a sample zip
file, though. My unzip returns "10", which its manpage explains as
"invalid command line options" (presumably because of the missing
zipfile argument). But that seems like it probably isn't portable.  And
it's also what I might expect another unzip to return if it doesn't
support "-a".

So while this patch does solve the immediate problem, I think it does so
by overly skipping tests that we _could_ run.

If we do go with this patch, though:

> diff --git a/t/test-lib.sh b/t/test-lib.sh
> index 11201e9..938f788 100644
> --- a/t/test-lib.sh
> +++ b/t/test-lib.sh
> @@ -1103,7 +1103,7 @@ test_lazy_prereq SANITY '
>  GIT_UNZIP=${GIT_UNZIP:-unzip}
>  test_lazy_prereq UNZIP '
>   "$GIT_UNZIP" -v
> - test $? -ne 127
> + test $? -eq 0
>  '

...you can simply drop the "test" line, as testing $? against 0 is
essentially a noop. If it is 0, then test will return 0; if it is not,
then test will return non-zero. You can just return the value directly
instead.

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] pager: disable color when pager is "more"

On Mon, Jul 18, 2016 at 09:19:07AM +, Eric Wong wrote:

> Johannes Schindelin  wrote:
> > On Sun, 17 Jul 2016, n...@dad.org wrote:
> > > 'git diff' outputs escape characters which clutter my terminal. Yes, I
> > > can sed them out, but then why are they there?
> > 
> > Those are most likely the ANSI sequences to add color. Can you call Git
> > with the --no-color option and see whether the escape characters go away?
> 
> Norm: do you have PAGER=more set by any chance?
> Perhaps changing it to "less" will allow you to preserve colors.
> 
> I saw a similar or identical problem during my vacation in
> FreeBSD-land.  Perhaps the out-of-the-box experience can be
> improved:
> 
> -8<-
> Subject: [PATCH] pager: disable color when pager is "more"

This is the tip of a smaller iceberg. See

  http://public-inbox.org/git/52D87A79.6060600%40rawbw.com/t/#u

for more discussion, and some patches that fix more cases (like "LESS"
without "R", or "more" that _does_ understand "R"). I think it was
discarded as being a little too intimate with the details of pagers, but
it does suck that the out-of-the-box experience on FreeBSD is not good.
Maybe we should revisit it.

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] test-lib: stricter unzip(1) check

Hi Peff & Eric,

On Mon, 18 Jul 2016, Jeff King wrote:

> On Mon, Jul 18, 2016 at 06:44:31AM +, Eric Wong wrote:
> 
> > On FreeBSD 10.3 (but presumably any FreeBSD 8+), /usr/bin/unzip
> > exists, but is insufficient for t5003 due to its non-standard
> > handling of the -a option[1].  This version of unzip exits
> > with "1" when given the "-v" flag.
> > 
> > However, the common Info-ZIP version may be installed at
> > /usr/local/bin/unzip (via "pkg install unzip") to pass t5003.
> > This Info-ZIP version exits with "0" when given "-v",
> > so limit the prereq to only versions which return 0 on "-v".

Hrm. That sounds a little magical, and fragile, to me. What if the next
person's unzip returns 0 and *still* cannot handle -a?

I'd rather do something like

-- snipsnap --
diff --git a/t/test-lib.sh b/t/test-lib.sh
index 0055ebb..5b9521e 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -929,7 +929,8 @@ yes () {
 }
 
 # Fix some commands on Windows
-case $(uname -s) in
+uname_s=$(uname -s)
+case $uname_s in
 *MINGW*)
# Windows has its own (incompatible) sort and find
sort () {
@@ -1100,6 +1101,7 @@ test_lazy_prereq SANITY '
return $status
 '
 
+test FreeBSD != $uname_s || GIT_UNZIP=${GIT_UNZIP:-/usr/local/bin/unzip}
 GIT_UNZIP=${GIT_UNZIP:-unzip}
 test_lazy_prereq UNZIP '
"$GIT_UNZIP" -v
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: Git and SHA-1 security (again)

>> I think converting is a much better option. Use a single-hash storage, and
>> convert everything to that on import/clone/pull.
>
> That ignores two very important issues that I already had mentioned:

That's not true. If you double-check the next part of my message, you
I just showed that an automatic two-way mapping could solve these
problems! (I even give briefs explanation how to handle referencing
and signature verification in those cases.)

My point is not to throw out old hashes and break signatures. My point
is to convert the data storage, and use mapping to resolve problems
with those old hashes and signatures. A single-hash data storage is
obviously way easier to handle, than a multi-hash mass. (See Linus's
old e-mail: multiple hashes [=meaning database keys] for the same
content is a complete nonsense in git-speak)

> The "convert everything" strategy also ignores the problem of interacting
> with servers and collaborators. Think of hosting repositories,
> rediscovering forgotten work trees, and of the "D" in DSCM.

That's not an issue when we're working with a single repository. It's
reasonable to ask for all git clients of the same repository, to
support the same hash. Yes, you have the need to configure the hash
algo on a per-repository basis but that's all. For importing and
co-working between different repositories, it's a bit harder, problem,
but it's possible to handle the conversions correctly.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: Git and SHA-1 security (again)

Hi Zsolt,

On Mon, 18 Jul 2016, Herczeg Zsolt wrote:

> >> I think converting is a much better option. Use a single-hash
> >> storage, and convert everything to that on import/clone/pull.
> >
> > That ignores two very important issues that I already had mentioned:
> 
> That's not true. If you double-check the next part of my message, you I
> just showed that an automatic two-way mapping could solve these
> problems! (I even give briefs explanation how to handle referencing and
> signature verification in those cases.)
> 
> My point is not to throw out old hashes and break signatures. My point
> is to convert the data storage, and use mapping to resolve problems
> with those old hashes and signatures.

If you convert the data storage, then the SHA-1s listed in the commit
objects will have to be rewritten, and then the GPG signature will not
match anymore.

Call e.g. `git cat-file commit 44cc742a8ca17b9c279be4cc195a93a6ef7a320e`
to see the anatomy of a gpg-signed commit object.

Ciao,
Johannes
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: Git and SHA-1 security (again)

2016-07-18 Thread Duy Nguyen

On Mon, Jul 18, 2016 at 5:57 PM, Johannes Schindelin
 wrote:
> Hi Zsolt,
>
> On Mon, 18 Jul 2016, Herczeg Zsolt wrote:
>
>> >> I think converting is a much better option. Use a single-hash
>> >> storage, and convert everything to that on import/clone/pull.
>> >
>> > That ignores two very important issues that I already had mentioned:
>>
>> That's not true. If you double-check the next part of my message, you I
>> just showed that an automatic two-way mapping could solve these
>> problems! (I even give briefs explanation how to handle referencing and
>> signature verification in those cases.)
>>
>> My point is not to throw out old hashes and break signatures. My point
>> is to convert the data storage, and use mapping to resolve problems
>> with those old hashes and signatures.
>
> If you convert the data storage, then the SHA-1s listed in the commit
> objects will have to be rewritten, and then the GPG signature will not
> match anymore.

But we can recreate SHA-1 from the same content and verify GPG, right?
I know it's super expensive, but it feels safer to not carry SHA-1
around when it's not secure anymore (I recall something about
exploiting the weakest link when you have both sha1 and sha256 in the
object content). Rehashing would be done locally and is better
controlled.
-- 
Duy
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] list: avoid incompatibility with *BSD sys/queue.h

2016-07-18 Thread Lars Schneider


> On 17 Jul 2016, at 02:25, Eric Wong  wrote:
> 
> Eric Wong  wrote:
>> Lars Schneider  wrote:
>>> It looks like as if this topic breaks the OS X build because 
>>> it defines LIST_HEAD. LIST_HEAD is already defined in 
>>> /usr/include/sys/queue.h. 
>> 
>> Oops, I suppose GIT_LIST_HEAD is an acceptable name?
>> (looks a bit like a refname, though...).
>> 
>> Or maybe CDS_LIST_HEAD (since I originally took it from the cds
>> namespace under urcu)
> 
> Naming things is hard; I think it's better to just undef an
> existing LIST_HEAD, instead, since it's unlikely we'd ever use
> sys/queue.h from *BSD.  (sys/queue.h is branchier, and IMHO
> sys/queue.h macros are uglier than list_entry (container_of))
That sounds like the best solution to me, too.


> I also wonder where we use sys/queue.h, since I use
>> LIST_HEAD from ccan/list/list.h in a different project
>> without conflicts...
> 
> Still wondering... Checking sys/mman.h in an old FreeBSD source
> tree I had lying around reveals "#include " is
> guarded by "#if defined(_KERNEL)", so it mman.h wouldn't pull
> it in for userspace builds...
> 
> -8<--
> Subject: [PATCH] list: avoid incompatibility with *BSD sys/queue.h
> 
> Somehow, the OS X build pulls in sys/queue.h and causes
> conflicts with the LIST_HEAD macro, here.
> 
> ref: http://mid.gmane.org/fb76544f-16f7-45ca-9649-fd62ee44b...@gmail.com
> 
> Reported-by: Lars Schneider 
> Signed-off-by: Eric Wong 
> ---
> list.h | 2 ++
> 1 file changed, 2 insertions(+)
> 
> diff --git a/list.h b/list.h
> index f65edce..a226a87 100644
> --- a/list.h
> +++ b/list.h
> @@ -36,6 +36,8 @@ struct list_head {
>   struct list_head *next, *prev;
> };
> 
> +/* avoid conflicts with BSD-only sys/queue.h */
> +#undef LIST_HEAD
> /* Define a variable with the head and tail of the list. */
> #define LIST_HEAD(name) \
>   struct list_head name = { &(name), &(name) }
> -- 
> EW

This patch compiles without trouble on my local OS X.

Thanks,
Lars

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: Git and SHA-1 security (again)

Hi Johannes,

>> My point is not to throw out old hashes and break signatures. My point
>> is to convert the data storage, and use mapping to resolve problems
>> with those old hashes and signatures.
>
> If you convert the data storage, then the SHA-1s listed in the commit
> objects will have to be rewritten, and then the GPG signature will not
> match anymore.
>
> Call e.g. `git cat-file commit 44cc742a8ca17b9c279be4cc195a93a6ef7a320e`
> to see the anatomy of a gpg-signed commit object.
>

Yes and no. That's the reason you need the two-way lookup table. If
you need to verify a commit which was signed as SHA-1, you must use
the lookup table in reverse. This way you can reconstruct the original
commit structure, which than can be verified. Of course it's work to
do so but you only need to develop the new signature verification
algorithm. You save much more on the other side where you don't have
to rework all the other algorithms to multi-hash.

Another interesting point is that multi-hash storage, actively hurts
signature security! (Duy just mentoined that while I'm writing.) A
signed commit (or tag) is just as secure as the least secure hash it
refers (directly or indirectly). Let's imagine that you make a new a
commit, and there is on old file in the tree somewhere. That's a weak
point: cause it has SHA-1 hash, someone can replace it (and thus
change your commits content.

I would clearly mark any signature wether it's SHA-1 or SHA2 (or
anything else) based, and strictly allow that hash in all the trees
and objects while verifying that commit. If it's not the same
hash-type as the storage-key, than use the lookup table for conversion
before check. (This has some interesting side-effects, but it's all
about good implementation).
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: Git and SHA-1 security (again)

2016-07-18 Thread Ævar Arnfjörð Bjarmason

On Sat, Jul 16, 2016 at 3:48 PM, Herczeg Zsolt  wrote:
> I would like to discuss an old topic from 2006. I understand it was
> already discussed. The only reason i'm sending this e-mail is to talk
> about a possible solution which didn't show up on this list before.

You mention the 2006 discussion, but I wonder if you've read the more
recent discussion from April on the subject.

> I think we all understand that SHA-1 is broken. It still works perfect
> as a storage key, but it's not cryptographically secure anymore. Git
> is not moving away from SHA-1 because it would break too many
> projects, and cryptographic security is not needed but git if you have
> your own repository.
>
> However I would like to show some big problems caused by SHA-1:
>  - Git signed tags and signed commits are cryptographically insecure,
> they're useless at the moment.
>  - Git Torrent (https://github.com/cjb/GitTorrent) is also
> cryptographically broken, however it would be an awesome experiment.
>  - Linus said: "You only need to know the SHA-1 of the top of your
> tree, and if you know that, you can trust your tree." That's not true
> anymore. You have to trust your computer, you servers, your git
> provider in a way that no-one can maliciously modify your data.

In particular, as far as I know and as Theodore Ts'o's post describes
better than I could[1], you seem to be confusing preimage attacks with
collision attacks, and then concluding that because SHA1 is vulnerable
to collision attacks that use-cases that would need a preimage attack
to be compromised (which as far is I can tell, includes all your
examples) are also "broken".

1. http://thread.gmane.org/gmane.comp.version-control.git/291305/focus=291511
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: Git and SHA-1 security (again)

2016-07-18 Thread Duy Nguyen

On Sun, Jul 17, 2016 at 4:21 PM, brian m. carlson
 wrote:
> I'm going to end up having to do something similar because of the issue
> of submodules.  Submodules may still be SHA-1, while the main repo may
> be a newer hash.

Or even the other way around, main repo is one with sha1 while
submodule is on sha256. I wonder if we should address this separately
(and even in parallel with sha256 support), making submodules work
with an any external VCS system (that supports some basic operations
we define).
-- 
Duy
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] pager: disable color when pager is "more"

Jeff King  writes:

> On Mon, Jul 18, 2016 at 09:19:07AM +, Eric Wong wrote:
>
>> Johannes Schindelin  wrote:
>> > On Sun, 17 Jul 2016, n...@dad.org wrote:
>> > > 'git diff' outputs escape characters which clutter my terminal. Yes, I
>> > > can sed them out, but then why are they there?
>> > 
>> > Those are most likely the ANSI sequences to add color. Can you call Git
>> > with the --no-color option and see whether the escape characters go away?
>> 
>> Norm: do you have PAGER=more set by any chance?
>> Perhaps changing it to "less" will allow you to preserve colors.
>> 
>> I saw a similar or identical problem during my vacation in
>> FreeBSD-land.  Perhaps the out-of-the-box experience can be
>> improved:
>> 
>> -8<-
>> Subject: [PATCH] pager: disable color when pager is "more"
>
> This is the tip of a smaller iceberg. See
>
>   http://public-inbox.org/git/52D87A79.6060600%40rawbw.com/t/#u
>
> for more discussion, and some patches that fix more cases (like "LESS"
> without "R", or "more" that _does_ understand "R"). I think it was
> discarded as being a little too intimate with the details of pagers, but
> it does suck that the out-of-the-box experience on FreeBSD is not good.
> Maybe we should revisit it.

Yup, the three-patch series at

http://public-inbox.org/git/20140117041430.GB19551%40sigill.intra.peff.net/

would be a safe starting point that is low-impact.  I think what
ended up being discarded was a more elaborate side topic that
started from exploring the possibility of checking if LESS has 'R'
in it to see if it is possible to help people with LESS that does
not allow coloring explicitly exported.

I do not think the approach in the same thread suggested by Kyle

  http://public-inbox.org/git/62DB6DEF-8B39-4481-BA06-245BF45233E5%40gmail.com/

is too bad, either.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Feature request for automatic worktree checkout

2016-07-18 Thread Ehsan Azarnasab

Currently if I have a branch checked out in a work-tree, git-checkout
will show this error message when checking out that branch:

$ git checkout master
fatal: 'master' is already checked out at '/home/dashesy/development/feature'

It would be very useful to instead of this error just change the
current working directory, so git checkout would become a `cd` command

Ehsan
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: Git and SHA-1 security (again)

> In particular, as far as I know and as Theodore Ts'o's post describes
> better than I could[1], you seem to be confusing preimage attacks with
> collision attacks, and then concluding that because SHA1 is vulnerable
> to collision attacks that use-cases that would need a preimage attack
> to be compromised (which as far is I can tell, includes all your
> examples) are also "broken".

I understand the differences between the collision and preimage
attacks. A collision attack is not that bad for git in a typical
use-case. But I think that it's important to note that there are many
use-cases which do need a hash safe from collision attack. Some
examples:

You maintain a repository with gittorrent with signed commits Others
can use these signatures to verify it's original. Let's say you
include some safe file (potentially binary) from a third-party
contributor. That would be fine if the hash algo is safe. Currently
there is the possibility that you received a (safe) file which was
made to collide with another malicious one. Once you committed (and
signed) that file, the attacker joins the gittorrent network and
starts to distribute the malicious file. Suddenly most of your clients
pulling are infected however your signature is correct.

Or, you would like to make a continuous delivery system, where you use
signed tags. The delivery happens only when signature is right, and
the signer is responsible for it. Your colleague makes a collision,
pushes the good-file. You make all the tests, everything is fine, sign
and push and wait for the delivery to happen. Your colleague changes
the file on the server. The delivery makes a huge mass, and you're
fired.

Or, let's say you use a service like github, which is nice enough to
make a repository for you, with .gitignore, licenses and everything.
Likely, you'll never change dose files. Let's say that service made
one of those initial files to collide something bad. That means, they
can "infect" anyone, who is pulling your repo.

Do you need more hypothetical stories? There are a lot. Of course they
need a lot of work, and they're unlikely to happen. But it's possible.
If you need trust, and gpg signatures that means you need ultimate
trust. What's the point in making GPG signatures anyway if you cannot
ultimately trust them? You could just as well say: well that's
repository is only reachable by trustworthy persons, everything here
is just fine and really made by the person named in the "author
field".
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: Git and SHA-1 security (again)

"brian m. carlson"  writes:

> I will say that the pack format will likely require some changes,
> because it assumes ...
> The reason is that we can't have an unambiguous parse of the current
> objects if two hash algorithms are in use
> So when we look at a new hash, we need to provide an unambiguous way to
> know what hash is in use.  The two choices are to either require all
> object use the new hash, or to extend the objects to include the hash.
> Until a couple days ago, I had planned to do the former.  I had not even
> considered using a multihash approach due to the complexity.

Objects in Git identify themselves, but once you introduce the
second hash function (as opposed to replacing the hash function to a
new one), you would allow people to call the same object by two
names.  That has interesting implications.

Let's say you have a blob at path F in a top-level tree object and
create a commit.  You have three objects in total, the tree knows
the blob as one name based on SHA-1 and the commit knows the tree as
one name based on SHA-1.  The same contents of the blob and the tree
could have different names based on SHA-256 in the future Git.

Let's further say you have a future Git and clone from the above
repository with three objects.  You get a pack stream, containing
the data for one commit, tree and blob each.  These objects do not
carry their own name as extra pieces of information.  You only get
their contents, and it is up to you to name them by hashing.  .idx
files are created by running index-pack while receiving the pack
data stream.  You _somehow_ need to know that these three objects
need to be hashed with SHA-1, even though you are SHA-256 capable,
because otherwise the object name recorded in the tree object for
the blob would not match what your .idx file would call the blob
data.  Also the object name recorded in the ref to point at the
commit would not match the commit object's object name, unless you
hash with SHA-1.  It is a possibility to always hash these objects
twice and record _both_ hashes in the updated .idx file; after all,
.idx files are strictly local matter.

Now let's further say that you update the file F in the working
tree, and do "git commit -a" with updated version of Git.  What
should happen?  Assuming that we are trying to migrate to a
different hashing algorithm over time, we would want to create a new
blob under object name based on SHA-256, add that to the index and
write a new tree out, named by hashing with SHA-256.  We then record
that longer-named tree in a commit whose parent commit is still
named with SHA-1 based hash, and the new commit in turn is named by
hashing with SHA-256.

Then you push the result back.  Let's assume by now the place you
cloned from is also SHA-256 capable.  You look at the tips of refs
at your clone-source and discover that you would need to only send
the new commit, its tree and the updated blob.  You send data in
these three objects.  The receiving end would now need to do the
same "magically choose hash to make sure the new blob gets the name
that is recorded in the new tree (and the new tree the new commit)"
thing.  The same discussion applies if somebody else clones from you
at this point.  The objects introduced by the second commit all need
to be hashed with the new hash to be named, while the other objects
need to be hashed with the old hash.

Continuing this thought process, I do not see a good way to allow us
to wean ourselves off of the old hash, unless we _break_ the pack
stream format so that each object in the pack carries not just the
data but also the hash algorithm to be used to _name_ it, so that
new objects will never be referred to using the old hash.

It matters performance-wise that the weaning process go as quickly
as possible, once the system becomes capable of new hash algorighm,
because during the transition period, we'd have to suffer the full
tree-diff becoming inefficient (Note: don't limit your thinking to
just "git diff" and "git log"; the same inefficiency hits "git
checkout" to switch branches and "git merge" to walk three trees in
parallel), because we cannot skip descending into subdirectories
based on the tree object name being equal, which guarantees that
everything under the hierarchy is equal.

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] test-lib: stricter unzip(1) check

Johannes Schindelin  writes:

> Hrm. That sounds a little magical, and fragile, to me. What if the next
> person's unzip returns 0 and *still* cannot handle -a?

That is a very sensible line of thought.

> I'd rather do something like

... but the patch presented as an alternative does not seem to
follow that line of thought.  After reading that sensible line of
thought I would have expected to see an auto-detection of the path
and support for features we care about.

Stepping back a bit, why do we even care if "unzip -a" works on
"../$zipfile" and converts things correctly in that check_zip() test
in t5003 in the first place?  It looks more like a test on "unzip"
than making sure we correctly generate a zip archive to me...

> -- snipsnap --
> diff --git a/t/test-lib.sh b/t/test-lib.sh
> index 0055ebb..5b9521e 100644
> --- a/t/test-lib.sh
> +++ b/t/test-lib.sh
> @@ -929,7 +929,8 @@ yes () {
>  }
>  
>  # Fix some commands on Windows
> -case $(uname -s) in
> +uname_s=$(uname -s)
> +case $uname_s in
>  *MINGW*)
>   # Windows has its own (incompatible) sort and find
>   sort () {
> @@ -1100,6 +1101,7 @@ test_lazy_prereq SANITY '
>   return $status
>  '
>  
> +test FreeBSD != $uname_s || GIT_UNZIP=${GIT_UNZIP:-/usr/local/bin/unzip}
>  GIT_UNZIP=${GIT_UNZIP:-unzip}
>  test_lazy_prereq UNZIP '
>   "$GIT_UNZIP" -v
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: Feature request for automatic worktree checkout

Ehsan Azarnasab  writes:

> Currently if I have a branch checked out in a work-tree, git-checkout
> will show this error message when checking out that branch:
>
> $ git checkout master
> fatal: 'master' is already checked out at '/home/dashesy/development/feature'
>
> It would be very useful to instead of this error just change the
> current working directory, so git checkout would become a `cd` command

That is an understandable thing to want from 10,000ft view, but it
would not be something Git, or any external command that is spawned
by the shell for that matter, can address.  You'd need to teach the
shell to cooperate.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH] fetch-pack: grow stateless RPC windows exponentially

When updating large repositories, the LARGE_FLUSH limit (that is, the
limit at which the window growth strategy switches from exponential to
linear) is reached quite quickly. Use a conservative exponential growth
strategy when that limit is reached instead.

This optimization is only applied during stateless RPCs to avoid the
issue raised and fixed in commit
44d8dc54e73e8010c4bdf57a422fc8d5ce709029.

Signed-off-by: Jonathan Tan 
---
 fetch-pack.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/fetch-pack.c b/fetch-pack.c
index b501d5c..3fcbda2 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -251,6 +251,8 @@ static int next_flush(struct fetch_pack_args *args, int 
count)
 
if (count < flush_limit)
count <<= 1;
+   else if (args->stateless_rpc && count >= flush_limit * 10)
+   count = count * 11 / 10;
else
count += flush_limit;
return count;
-- 
2.8.0.rc3.226.g39d4020

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v2 0/4] Teach `git fsck` a new option: `--name-objects`

Johannes Schindelin  writes:

>  -static int fsck_error_func(struct object *obj, int type, const char 
> *message)
>  +static int fsck_error_func(struct fsck_options *o,
>  +struct object *obj, int type, const char *message)
>   {
>   objreport(obj, (type == FSCK_WARN) ? "warning" : "error", message);
>   return (type == FSCK_WARN) ? 0 : 1;

That's a good change.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] blame: Allow to blame paths freshly added to the index

Mike Hommey  writes:

>> I suspect that this would be useful without copy detection.  If you
>> "git mv fileA fileB" (optionally followed by "edit fileB"), fileB
>> would not be in HEAD but you should be able to trace the lineage of
>> the lines in it back through the renaming event, and this change
>> also allows that use case, no?
>
> It should, but that'd be copy/move detection, wouldn't it? :)

Actually, in the context of "git blame", there is no extra
"detection" needed for following a whole file rename.

>> But the user can be in the same conflicted rename situation with
>> "git am -3" or cherry-pick, and in these cases there won't be extra
>> parent commits for the fake work tree commit, hence the conclusion
>> does not change.
>
> Indeed, with cherry-pick, the "no such path in HEAD" error is happening
> with the patch.

OK.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v3 2/2] t/t8003-blame-corner-cases.sh: Use here documents

Mike Hommey  writes:

> Somehow, this test was using:
>
> {
>   echo A
>   echo B
> } > file
>
> block to feed file contents. This changes those to the form most common
> in git test scripts:
>
> cat >file <<-\EOF
> A
> B
> EOF
>
> Signed-off-by: Mike Hommey 
> ---

It is not so strong a preference to ask you to re-roll, but for
future reference, I'd prefer to see this preparatory clean-up early
in the series, followed by the primary thing, IOW, I would have
liked more if the two patches were swapped.

Thanks.  Will queue.

>  t/t8003-blame-corner-cases.sh | 34 +-
>  1 file changed, 17 insertions(+), 17 deletions(-)
>
> diff --git a/t/t8003-blame-corner-cases.sh b/t/t8003-blame-corner-cases.sh
> index eab2e28..e48370d 100755
> --- a/t/t8003-blame-corner-cases.sh
> +++ b/t/t8003-blame-corner-cases.sh
> @@ -41,12 +41,12 @@ test_expect_success setup '
>   test_tick &&
>   GIT_AUTHOR_NAME=Fourth git commit -m Fourth &&
>  
> - {
> - echo ABC
> - echo DEF
> - echo 
> - echo GHIJK
> - } >cow &&
> + cat >cow <<-\EOF &&
> + ABC
> + DEF
> + 
> + GHIJK
> + EOF
>   git add cow &&
>   test_tick &&
>   GIT_AUTHOR_NAME=Fifth git commit -m Fifth
> @@ -115,11 +115,11 @@ test_expect_success 'append with -C -C -C' '
>  test_expect_success 'blame wholesale copy' '
>  
>   git blame -f -C -C1 HEAD^ -- cow | sed -e "$pick_fc" >current &&
> - {
> - echo mouse-Initial
> - echo mouse-Second
> - echo mouse-Third
> - } >expected &&
> + cat >expected <<-\EOF &&
> + mouse-Initial
> + mouse-Second
> + mouse-Third
> + EOF
>   test_cmp expected current
>  
>  '
> @@ -127,12 +127,12 @@ test_expect_success 'blame wholesale copy' '
>  test_expect_success 'blame wholesale copy and more' '
>  
>   git blame -f -C -C1 HEAD -- cow | sed -e "$pick_fc" >current &&
> - {
> - echo mouse-Initial
> - echo mouse-Second
> - echo cow-Fifth
> - echo mouse-Third
> - } >expected &&
> + cat >expected <<-\EOF &&
> + mouse-Initial
> + mouse-Second
> + cow-Fifth
> + mouse-Third
> + EOF
>   test_cmp expected current
>  
>  '
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] test-lib: stricter unzip(1) check

On Mon, Jul 18, 2016 at 11:20:19AM -0700, Junio C Hamano wrote:

> Stepping back a bit, why do we even care if "unzip -a" works on
> "../$zipfile" and converts things correctly in that check_zip() test
> in t5003 in the first place?  It looks more like a test on "unzip"
> than making sure we correctly generate a zip archive to me...

I think it is testing that we generated an archive with the correct "I
am text" flags so that an unzip implementation can do the
auto-conversion.

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] fetch-pack: grow stateless RPC windows exponentially

Hi,

Jonathan Tan wrote:

> When updating large repositories, the LARGE_FLUSH limit (that is, the
> limit at which the window growth strategy switches from exponential to
> linear) is reached quite quickly. Use a conservative exponential growth
> strategy when that limit is reached instead.
>
> This optimization is only applied during stateless RPCs to avoid the
> issue raised and fixed in commit
> 44d8dc54e73e8010c4bdf57a422fc8d5ce709029.
>
> Signed-off-by: Jonathan Tan 
> ---
>  fetch-pack.c | 2 ++
>  1 file changed, 2 insertions(+)

Yay, thanks for this.

When this condition triggers (count >= 10240), we have already
experienced 10 rounds of negotiation.  Negotiation ought to have
finished by then.  So this is a pretty conservative change to try to
salvage an already bad situation.

The condition ensures that the exponential growth will go faster
than the previous heuristic of linear growth.

Memory usage grows with the number of 'have's to be sent.  Linear
growth didn't bound memory usage. This exponential growth makes memory
usage increase faster, but not aggressively so and the unbounded
memory usage is already something we'd want to address separately to
handle hostile servers.

All in all, this looks likely to allow negotiation to finish in fewer
rounds, speeding up fetch, without much downside, so for what it's
worth,

Reviewed-by: Jonathan Nieder 

I'd expect us to need more aggressive improvements to negotiation in the
end (e.g. finding a way to order SHA-1s sent as 'have's to finish in
fewer rounds).  But this is a good start.  Thanks for writing it.

> diff --git a/fetch-pack.c b/fetch-pack.c
> index b501d5c..3fcbda2 100644
> --- a/fetch-pack.c
> +++ b/fetch-pack.c
> @@ -251,6 +251,8 @@ static int next_flush(struct fetch_pack_args *args, int 
> count)
>  
>   if (count < flush_limit)
>   count <<= 1;
> + else if (args->stateless_rpc && count >= flush_limit * 10)
> + count = count * 11 / 10;
>   else
>   count += flush_limit;
>   return count;
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] pager: disable color when pager is "more"

On Mon, Jul 18, 2016 at 10:19:09AM -0700, Junio C Hamano wrote:

> > This is the tip of a smaller iceberg. See
> >
> >   http://public-inbox.org/git/52D87A79.6060600%40rawbw.com/t/#u
> >
> > for more discussion, and some patches that fix more cases (like "LESS"
> > without "R", or "more" that _does_ understand "R"). I think it was
> > discarded as being a little too intimate with the details of pagers, but
> > it does suck that the out-of-the-box experience on FreeBSD is not good.
> > Maybe we should revisit it.
> 
> Yup, the three-patch series at
> 
> 
> http://public-inbox.org/git/20140117041430.GB19551%40sigill.intra.peff.net/
> 
> would be a safe starting point that is low-impact.  I think what
> ended up being discarded was a more elaborate side topic that
> started from exploring the possibility of checking if LESS has 'R'
> in it to see if it is possible to help people with LESS that does
> not allow coloring explicitly exported.

Yeah, I only re-skimmed the thread, but I had the same impression. I am
traveling the next few days, so I will probably not get to it soon.  If
anybody wants to pick up and rebase/polish those patches as necessary,
be my guest.

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] fetch-pack: grow stateless RPC windows exponentially

Jonathan Nieder  writes:

> Yay, thanks for this.
>
> When this condition triggers (count >= 10240), we have already
> experienced 10 rounds of negotiation.  Negotiation ought to have
> finished by then.  So this is a pretty conservative change to try to
> salvage an already bad situation.
>
> The condition ensures that the exponential growth will go faster
> than the previous heuristic of linear growth.
>
> Memory usage grows with the number of 'have's to be sent.  Linear
> growth didn't bound memory usage. This exponential growth makes memory
> usage increase faster, but not aggressively so and the unbounded
> memory usage is already something we'd want to address separately to
> handle hostile servers.
>
> All in all, this looks likely to allow negotiation to finish in fewer
> rounds, speeding up fetch, without much downside, so for what it's
> worth,
>
> Reviewed-by: Jonathan Nieder 
>
> I'd expect us to need more aggressive improvements to negotiation in the
> end (e.g. finding a way to order SHA-1s sent as 'have's to finish in
> fewer rounds).  But this is a good start.  Thanks for writing it.

Sorry, while I agree with the general sentiment that the windowing
heuristics can be improved, from your description, I would have
expected an updated curve goes like "aggressive exponential ->
conservative exponential -> slow linear", but the new comparison
reads the other way around, i.e. "aggressive exponential -> slow
linear -> conservative exponential".

I'd understand if it were more like "aggressive exponential ->
conservative exponential" without linear phase when stateless_rpc is
in use, though.  I just do not quite understand the justification
behind the order of three phases introduced by this change.


>> diff --git a/fetch-pack.c b/fetch-pack.c
>> index b501d5c..3fcbda2 100644
>> --- a/fetch-pack.c
>> +++ b/fetch-pack.c
>> @@ -251,6 +251,8 @@ static int next_flush(struct fetch_pack_args *args, int 
>> count)
>>  
>>  if (count < flush_limit)
>>  count <<= 1;
>> +else if (args->stateless_rpc && count >= flush_limit * 10)
>> +count = count * 11 / 10;
>>  else
>>  count += flush_limit;
>>  return count;
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] test-lib: stricter unzip(1) check

Jeff King  writes:

> On Mon, Jul 18, 2016 at 11:20:19AM -0700, Junio C Hamano wrote:
>
>> Stepping back a bit, why do we even care if "unzip -a" works on
>> "../$zipfile" and converts things correctly in that check_zip() test
>> in t5003 in the first place?  It looks more like a test on "unzip"
>> than making sure we correctly generate a zip archive to me...
>
> I think it is testing that we generated an archive with the correct "I
> am text" flags so that an unzip implementation can do the
> auto-conversion.

Yes, I understand that.  I was hoping for a response along the lines
of "we want to make sure we mark text as text, and 'zip -l' has an
option to let us check the attributes without having to actually
checking things out" ;-)


--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] fetch-pack: grow stateless RPC windows exponentially

On Mon, Jul 18, 2016 at 12:10 PM, Junio C Hamano  wrote:
> I'd understand if it were more like "aggressive exponential ->
> conservative exponential" without linear phase when stateless_rpc is
> in use, though.  I just do not quite understand the justification
> behind the order of three phases introduced by this change.

Adding conservative exponential phase after the aggressive exponential
phase was the original intention, but the conservative exponential
approach (e.g. n' = n * 11 / 10) is slower than the existing linear
(n' = n + 1024) approach when n < 10240, so I added that intermediate
phase to avoid a regression in the packet size growth.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] fetch-pack: grow stateless RPC windows exponentially

Jonathan Tan wrote:
> On Mon, Jul 18, 2016 at 12:10 PM, Junio C Hamano  wrote:

>> I'd understand if it were more like "aggressive exponential ->
>> conservative exponential" without linear phase when stateless_rpc is
>> in use, though.  I just do not quite understand the justification
>> behind the order of three phases introduced by this change.
>
> Adding conservative exponential phase after the aggressive exponential
> phase was the original intention, but the conservative exponential
> approach (e.g. n' = n * 11 / 10) is slower than the existing linear
> (n' = n + 1024) approach when n < 10240, so I added that intermediate
> phase to avoid a regression in the packet size growth.

You have full control of the growth function.  So how about aggressive
growth until 1024*10?

That is:

Current git:
  n < 1024: aggressive exponential
16, 32, 64, 128, 256, 512, 1024
  1024 <= n: linear
2048, 3072, 4096, 5120, ...

Initial proposal:
  n < 1024: aggressive exponential
16, 32, 64, 128, 256, 512, 1024
  1024 <= n < 10240: linear
2048, 307, 4096, 5120, ...
  10240 <= n: conservative exponential
11264, 12390, ...

New proposal:
  n < 10240: aggressive exponential
16, 32, 64, 128, 256, 512, 1024, 2048, 4096, 8192, 16384
  10240 <= n: conservative exponential
18022, 19824, ...

That way, on one hand it would still never use a smaller window than
today and on the other hand the heuristic would be easier to
understand (only decelarating, instead of decelarating and then
accelerating again).

Thanks,
Jonathan
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] test-lib: stricter unzip(1) check

Jeff King  writes:

> My Debian version of unzip (which is derived from Info-zip) seems to
> give return code 0 for just "unzip". So for the first check, we could
> possibly drop "-v"; we don't care about "-v", but just wanted some way
> to say "does unzip exist on the path?". Another option would just be
> checking whether "unzip" returns something besides 127 (so what we have
> now, minus "-v").
>
> To test for "-a", I think we'd have to actually feed it a sample zip
> file, though. My unzip returns "10", which its manpage explains as
> "invalid command line options" (presumably because of the missing
> zipfile argument). But that seems like it probably isn't portable.  And
> it's also what I might expect another unzip to return if it doesn't
> support "-a".
>
> So while this patch does solve the immediate problem, I think it does so
> by overly skipping tests that we _could_ run.

Hmm, how about taking Dscho's "default GIT_UNZIP to /usr/local/bin/unzip
on FreeBSD" thing, together with something like this, then?

I suspect that 4 checks that look at $extracted/* after running
unzip -a should probably be inside a single test that runs unzip -a,
simply because they do not make any sense if the extraction failed,
but I did not fix that with this.

-- >8 --
test: check "unzip" and "unzip -a"

Different platforms have implementations "unzip" that behave
differently.  Most of the tests we use GIT_UNZIP we only care about
the command to be able to extract from *.zip archive, but one test
in t5003 wants it to also be able to grok the "-a" option.

Prepare a sample zip file that has a single text file in it, and try
extracting its contents to see GIT_UNZIP is usable. when setting
UNZIP prerequisite.  Similarly, set UNZIP_AUTOTEXT prerequisite by
running GIT_UNZIP with the "-a" option.

---
 t/t5003-archive-zip.sh   |  19 ++-
 t/t5003/infozip-text.zip | Bin 0 -> 163 bytes
 t/test-lib.sh|   4 ++--
 3 files changed, 16 insertions(+), 7 deletions(-)

diff --git a/t/t5003-archive-zip.sh b/t/t5003-archive-zip.sh
index 14744b2..43c0cfd 100755
--- a/t/t5003-archive-zip.sh
+++ b/t/t5003-archive-zip.sh
@@ -15,6 +15,15 @@ test_lazy_prereq UNZIP_SYMLINKS '
)
 '
 
+test_lazy_prereq UNZIP_AUTOTEXT '
+   (
+   mkdir unzip-autotext &&
+   cd unzip-autotext
+   "$GIT_UNZIP" -a "$TEST_DIRECTORY"/t5003/infozip-text.zip &&
+   test -f text
+   )
+'
+
 check_zip() {
zipfile=$1.zip
listfile=$1.lst
@@ -39,27 +48,27 @@ check_zip() {
extracted=${dir_with_prefix}a
original=a
 
-   test_expect_success UNZIP " extract ZIP archive with EOL conversion" '
+   test_expect_success UNZIP_AUTOTEXT " extract ZIP archive with EOL 
conversion" '
(mkdir $dir && cd $dir && "$GIT_UNZIP" -a ../$zipfile)
'
 
-   test_expect_success UNZIP " validate that text files are converted" "
+   test_expect_success UNZIP_AUTOTEXT " validate that text files are 
converted" "
test_cmp_bin $extracted/text.cr $extracted/text.crlf &&
test_cmp_bin $extracted/text.cr $extracted/text.lf
"
 
-   test_expect_success UNZIP " validate that binary files are unchanged" "
+   test_expect_success UNZIP_AUTOTEXT " validate that binary files are 
unchanged" "
test_cmp_bin $original/binary.cr   $extracted/binary.cr &&
test_cmp_bin $original/binary.crlf $extracted/binary.crlf &&
test_cmp_bin $original/binary.lf   $extracted/binary.lf
"
 
-   test_expect_success UNZIP " validate that diff files are converted" "
+   test_expect_success UNZIP_AUTOTEXT " validate that diff files are 
converted" "
test_cmp_bin $extracted/diff.cr $extracted/diff.crlf &&
test_cmp_bin $extracted/diff.cr $extracted/diff.lf
"
 
-   test_expect_success UNZIP " validate that -diff files are unchanged" "
+   test_expect_success UNZIP_AUTOTEXT " validate that -diff files are 
unchanged" "
test_cmp_bin $original/nodiff.cr   $extracted/nodiff.cr &&
test_cmp_bin $original/nodiff.crlf $extracted/nodiff.crlf &&
test_cmp_bin $original/nodiff.lf   $extracted/nodiff.lf
diff --git a/t/t5003/infozip-text.zip b/t/t5003/infozip-text.zip
new file mode 100644
index 000..a019acb
Binary files /dev/null and b/t/t5003/infozip-text.zip differ
diff --git a/t/test-lib.sh b/t/test-lib.sh
index 11201e9..9907b3f 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -1102,8 +1102,8 @@ test_lazy_prereq SANITY '
 
 GIT_UNZIP=${GIT_UNZIP:-unzip}
 test_lazy_prereq UNZIP '
-   "$GIT_UNZIP" -v
-   test $? -ne 127
+   "$GIT_UNZIP" "$TEST_DIRECTORY/t5003/infozip-text.zip" &&
+   test -f text
 '
 
 run_with_limited_cmdline () {
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at

Re: [PATCH] fetch-pack: grow stateless RPC windows exponentially

Jonathan Nieder  writes:

> You have full control of the growth function.  So how about aggressive
> growth until 1024*10?
>
> That is:
>
> Current git:
>   n < 1024: aggressive exponential
>   16, 32, 64, 128, 256, 512, 1024
>   1024 <= n: linear
>   2048, 3072, 4096, 5120, ...
>
> Initial proposal:
>   n < 1024: aggressive exponential
>   16, 32, 64, 128, 256, 512, 1024
>   1024 <= n < 10240: linear
>   2048, 307, 4096, 5120, ...
>   10240 <= n: conservative exponential
>   11264, 12390, ...
>
> New proposal:
>   n < 10240: aggressive exponential
>   16, 32, 64, 128, 256, 512, 1024, 2048, 4096, 8192, 16384
>   10240 <= n: conservative exponential
>   18022, 19824, ...
>
> That way, on one hand it would still never use a smaller window than
> today and on the other hand the heuristic would be easier to
> understand (only decelarating, instead of decelarating and then
> accelerating again).

That sounds more explainable (I do not know if that is a growth
curve that gives us better results, though).

So, the result would look something like this, perhaps?

 fetch-pack.c | 17 +++--
 1 file changed, 11 insertions(+), 6 deletions(-)

diff --git a/fetch-pack.c b/fetch-pack.c
index 3c5dfc4..97fe5f7 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -264,12 +264,17 @@ static void insert_one_alternate_ref(const struct ref 
*ref, void *unused)
 
 static int next_flush(struct fetch_pack_args *args, int count)
 {
-   int flush_limit = args->stateless_rpc ? LARGE_FLUSH : PIPESAFE_FLUSH;
-
-   if (count < flush_limit)
-   count <<= 1;
-   else
-   count += flush_limit;
+   if (args->stateless_rpc) {
+   if (count < LARGE_FLUSH * 10)
+   count <<= 1;
+   else
+   count = count * 11 / 10;
+   } else {
+   if (count < PIPESAFE_FLUSH)
+   count <<= 1;
+   else
+   count += PIPESAFE_FLUSH;
+   }
return count;
 }
 
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: Git and SHA-1 security (again)

2016-07-18 Thread Ævar Arnfjörð Bjarmason

On Mon, Jul 18, 2016 at 7:48 PM, Herczeg Zsolt  wrote:
>> In particular, as far as I know and as Theodore Ts'o's post describes
>> better than I could[1], you seem to be confusing preimage attacks with
>> collision attacks, and then concluding that because SHA1 is vulnerable
>> to collision attacks that use-cases that would need a preimage attack
>> to be compromised (which as far is I can tell, includes all your
>> examples) are also "broken".
>
> I understand the differences between the collision and preimage
> attacks.

Fair enough. The rest of your E-Mail certainly shows that you do, and
I didn't know enough anything about GitTorrent and this case where
it's vulnerable to collission attacks.

But I didn't get that impression from your initial E-Mail which
outright said said:

Git signed tags and signed commits are cryptographically
insecure, they're useless at the moment.

It's important that those of us who *do* understand the difference
between collision and preimage attacks carefully phrase things, least
they turn into FUD.

Your initial E-Mail does *not* make it sound like you're just talking
about the cases where someone's provided you with a crafted blob that
you've been tricked into signing, but rather makes it sound like
signed tags & commits are just categorically broken, even for preimage
attacks, which is not the case.

The reality of the current situation is that it's largely mitigated in
practice because:

a) it's hard to hand someone a crafted blob to begin with for reasons
that have nothing to do with SHA-1 (they'll go "wtf is this garbage?")

b) even in that case it's *very* hard to come up with two colliding
blobs that are *useful* for some nefarious purpose, e.g. a program A
that looks normal being replaced by an evil program B with the same
SHA-1.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] test-lib: stricter unzip(1) check

Junio C Hamano  wrote:
> +test_lazy_prereq UNZIP_AUTOTEXT '
> + (
> + mkdir unzip-autotext &&
> + cd unzip-autotext
> + "$GIT_UNZIP" -a "$TEST_DIRECTORY"/t5003/infozip-text.zip &&
> + test -f text
> + )

/usr/bin/unzip actually takes -a on FreeBSD, just not in the
same way the Info-ZIP version does, so I suspect "test -f"
here is not enough.

I would test this, but I can't apply it:

> diff --git a/t/t5003/infozip-text.zip b/t/t5003/infozip-text.zip
> new file mode 100644
> index 000..a019acb
> Binary files /dev/null and b/t/t5003/infozip-text.zip differ
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: Git and SHA-1 security (again)

2016-07-18 Thread David Lang


On Mon, 18 Jul 2016, Herczeg Zsolt wrote:


In particular, as far as I know and as Theodore Ts'o's post describes
better than I could[1], you seem to be confusing preimage attacks with
collision attacks, and then concluding that because SHA1 is vulnerable
to collision attacks that use-cases that would need a preimage attack
to be compromised (which as far is I can tell, includes all your
examples) are also "broken".


I understand the differences between the collision and preimage
attacks. A collision attack is not that bad for git in a typical
use-case. But I think that it's important to note that there are many
use-cases which do need a hash safe from collision attack. Some
examples:

You maintain a repository with gittorrent with signed commits Others
can use these signatures to verify it's original. Let's say you
include some safe file (potentially binary) from a third-party
contributor. That would be fine if the hash algo is safe. Currently
there is the possibility that you received a (safe) file which was
made to collide with another malicious one. Once you committed (and
signed) that file, the attacker joins the gittorrent network and
starts to distribute the malicious file. Suddenly most of your clients
pulling are infected however your signature is correct.

Or, you would like to make a continuous delivery system, where you use
signed tags. The delivery happens only when signature is right, and
the signer is responsible for it. Your colleague makes a collision,
pushes the good-file. You make all the tests, everything is fine, sign
and push and wait for the delivery to happen. Your colleague changes
the file on the server. The delivery makes a huge mass, and you're
fired.

Or, let's say you use a service like github, which is nice enough to
make a repository for you, with .gitignore, licenses and everything.
Likely, you'll never change dose files. Let's say that service made
one of those initial files to collide something bad. That means, they
can "infect" anyone, who is pulling your repo.

Do you need more hypothetical stories? There are a lot. Of course they
need a lot of work, and they're unlikely to happen. But it's possible.
If you need trust, and gpg signatures that means you need ultimate
trust. What's the point in making GPG signatures anyway if you cannot
ultimately trust them? You could just as well say: well that's
repository is only reachable by trustworthy persons, everything here
is just fine and really made by the person named in the "author
field".


All of your examples are actually preimage attacks. If the bad guy can tamper 
with the both the 'safe' and 'malicious' versions of the file, they don't 
actually need the malicious version, they can attack you through the one you 
think is 'safe'


The 'collision' attack isn't that there is some increased chance of a random 
file colliding with your safe file, it's that if you are manipulating the 
contents of both files, you can create two that collide. This won't hurt a Git 
repository unless one of these manipulated files is able to be introduced as a 
legitimate part of the repo you are dealing with.


David Lang
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] test-lib: stricter unzip(1) check

Eric Wong  writes:

> Junio C Hamano  wrote:
>> +test_lazy_prereq UNZIP_AUTOTEXT '
>> +(
>> +mkdir unzip-autotext &&
>> +cd unzip-autotext
>> +"$GIT_UNZIP" -a "$TEST_DIRECTORY"/t5003/infozip-text.zip &&
>> +test -f text
>> +)
>
> /usr/bin/unzip actually takes -a on FreeBSD, just not in the
> same way the Info-ZIP version does, so I suspect "test -f"
> here is not enough.

Hmph.  So it only and always does "CRLF -> LF", while Info-ZIP
version does something like autocrlf?

> I would test this, but I can't apply it:
>
>> diff --git a/t/t5003/infozip-text.zip b/t/t5003/infozip-text.zip
>> new file mode 100644
>> index 000..a019acb
>> Binary files /dev/null and b/t/t5003/infozip-text.zip differ

Heh.  It was created like so:

$ printf 'text\r\n' >text && zip -ll infozip-text.zip text
$ zipinfo infozip-text.zip text
-rw-r-  3.0 unx5 tx stor 16-Jul-18 13:12 text

 t/t5003/infozip-text.zip | Bin 0 -> 163 bytes
 1 file changed, 0 insertions(+), 0 deletions(-)

diff --git a/t/t5003/infozip-text.zip b/t/t5003/infozip-text.zip
new file mode 100644
index 
..7119bbb62699f4cb613f3675f57aa9c9dc021ea0
GIT binary patch
literal 163
zcmWIWW@h1H0D&2qpFGrWy)kD6vO$=IL586uwW1_6gp+~U-l8`ggi9;985mjSu4iOm
z=@4cB%X0;IGcw6B<1$17WHtjM5HDy1u^>jWLX1Q+F2I|W4Wxz<2)%%`Gl;_g0G&b|
AeEhttp://vger.kernel.org/majordomo-info.html

Re: [PATCH] contrib/persistent-https: update ldflags syntax for Go 1.7+

Jeff King  writes:

>> This `name=value` syntax for the -X flag was introduced in Go v1.5
>> (released Aug 19, 2015):
>> 
>> - release notes: https://golang.org/doc/go1.5#link
>> - commit: 
>> https://github.com/golang/go/commit/12795c02f3d6fc54ece09a86e70aaa40a94d5131
>> 
>> In Go v1.7, support for the old syntax was removed:
>> 
>> - release notes: https://tip.golang.org/doc/go1.7#compiler
>> - commit: 
>> https://github.com/golang/go/commit/51b624e6a29b135ce0fadb22b678acf4998ff16f
>> 
>> This patch includes the `=` to fix builds with Go v1.7+.
>
> With the disclaimer that I have very little experience with Go, this
> seems like a good, well-explained change. My only question would be
> whether people still use pre-v1.5 versions of Go, since it sounds like
> this would adversely affect them if they do. (If it does, it seems the

Yeah, you get something like this:

$ ./git-remote-persistent-https --print_label
2016/07/18 13:34:33 unlabeled build; build with "make" to label

which is probably not the end of the world.  The label does not even
identify the version of the source in any way, so I am not sure how
people are depending on that feature anyway ;-)



--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v4 0/4] cache-tree building fix in the presence of ita entries

Nguyễn Thái Ngọc Duy   writes:

> v4 removes the leading underscore from _EMPTY_BLOB and _EMPTY_TREE and
> updates 4/4 slightly like this.
>
> diff --git a/cache-tree.c b/cache-tree.c
> index 2d50640..f28b1f4 100644
> --- a/cache-tree.c
> +++ b/cache-tree.c
> @@ -325,6 +325,7 @@ static int update_one(struct cache_tree *it,
>   const unsigned char *sha1;
>   unsigned mode;
>   int expected_missing = 0;
> + int contains_ita = 0;
>  
>   path = ce->name;
>   pathlen = ce_namelen(ce);
> @@ -341,7 +342,8 @@ static int update_one(struct cache_tree *it,
>   i += sub->count;
>   sha1 = sub->cache_tree->sha1;
>   mode = S_IFDIR;
> - if (sub->cache_tree->entry_count < 0) {
> + contains_ita = sub->cache_tree->entry_count < 0;
> + if (contains_ita) {
>   to_invalidate = 1;
>   expected_missing = 1;
>   }
> @@ -381,10 +383,9 @@ static int update_one(struct cache_tree *it,
>   }
>  
>   /*
> -  * "sub" can be an empty tree if subentries are i-t-a.
> +  * "sub" can be an empty tree if all subentries are i-t-a.
>*/
> - if (sub && sub->cache_tree->entry_count < 0 &&
> - !hashcmp(sha1, EMPTY_TREE_SHA1_BIN))
> + if (contains_ita && !hashcmp(sha1, EMPTY_TREE_SHA1_BIN))
>   continue;
>  
>   strbuf_grow(&buffer, entlen + 100);

This makes quite a lot of sense; even though I do not think it would
change the behaviour within the context of current code, it
definitely is easier to understand and prevents future mistakes.

Will queue.

Thanks.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: Git and SHA-1 security (again)

Ævar Arnfjörð Bjarmason  writes:

> The reality of the current situation is that it's largely mitigated in
> practice because:
>
> a) it's hard to hand someone a crafted blob to begin with for reasons
> that have nothing to do with SHA-1 (they'll go "wtf is this garbage?")
>
> b) even in that case it's *very* hard to come up with two colliding
> blobs that are *useful* for some nefarious purpose, e.g. a program A
> that looks normal being replaced by an evil program B with the same
> SHA-1.

Thanks.  That's a nice rephrasing of

  
http://public-inbox.org/git/Pine.LNX.4.58.0504291221250.18901%40ppc970.osdl.org/

where Linus explains SHA-1 is not the security, and the real
security is in distribution.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] fetch-pack: grow stateless RPC windows exponentially

On Mon, Jul 18, 2016 at 1:00 PM, Junio C Hamano  wrote:
> Jonathan Nieder  writes:
>
>> You have full control of the growth function.  So how about aggressive
>> growth until 1024*10?
>>
>> That is:
>>
>> Current git:
>>   n < 1024: aggressive exponential
>>   16, 32, 64, 128, 256, 512, 1024
>>   1024 <= n: linear
>>   2048, 3072, 4096, 5120, ...
>>
>> Initial proposal:
>>   n < 1024: aggressive exponential
>>   16, 32, 64, 128, 256, 512, 1024
>>   1024 <= n < 10240: linear
>>   2048, 307, 4096, 5120, ...
>>   10240 <= n: conservative exponential
>>   11264, 12390, ...
>>
>> New proposal:
>>   n < 10240: aggressive exponential
>>   16, 32, 64, 128, 256, 512, 1024, 2048, 4096, 8192, 16384
>>   10240 <= n: conservative exponential
>>   18022, 19824, ...
>>
>> That way, on one hand it would still never use a smaller window than
>> today and on the other hand the heuristic would be easier to
>> understand (only decelarating, instead of decelarating and then
>> accelerating again).
>
> That sounds more explainable (I do not know if that is a growth
> curve that gives us better results, though).
>
> So, the result would look something like this, perhaps?
>
>  fetch-pack.c | 17 +++--
>  1 file changed, 11 insertions(+), 6 deletions(-)
>
> diff --git a/fetch-pack.c b/fetch-pack.c
> index 3c5dfc4..97fe5f7 100644
> --- a/fetch-pack.c
> +++ b/fetch-pack.c
> @@ -264,12 +264,17 @@ static void insert_one_alternate_ref(const struct ref 
> *ref, void *unused)
>
>  static int next_flush(struct fetch_pack_args *args, int count)
>  {
> -   int flush_limit = args->stateless_rpc ? LARGE_FLUSH : PIPESAFE_FLUSH;
> -
> -   if (count < flush_limit)
> -   count <<= 1;
> -   else
> -   count += flush_limit;
> +   if (args->stateless_rpc) {
> +   if (count < LARGE_FLUSH * 10)
> +   count <<= 1;
> +   else
> +   count = count * 11 / 10;
> +   } else {
> +   if (count < PIPESAFE_FLUSH)
> +   count <<= 1;
> +   else
> +   count += PIPESAFE_FLUSH;
> +   }
> return count;
>  }
>

Using aggressive growth until 1024*10 seems like a good idea to me,
and it would look like that patch. (I would probably redefine
LARGE_FLUSH to be 10 times its current value instead of multiplying it
by 10, since it is not used anywhere else.)
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] test-lib: stricter unzip(1) check

Junio C Hamano  wrote:
> Eric Wong  writes:
> > Junio C Hamano  wrote:
> >> +test_lazy_prereq UNZIP_AUTOTEXT '
> >> +  (
> >> +  mkdir unzip-autotext &&
> >> +  cd unzip-autotext
> >> +  "$GIT_UNZIP" -a "$TEST_DIRECTORY"/t5003/infozip-text.zip &&
> >> +  test -f text
> >> +  )
> >
> > /usr/bin/unzip actually takes -a on FreeBSD, just not in the
> > same way the Info-ZIP version does, so I suspect "test -f"
> > here is not enough.
> 
> Hmph.  So it only and always does "CRLF -> LF", while Info-ZIP
> version does something like autocrlf?

No, it does CRLF -> LF based on the absence of non-ASCII
characters and ignoring metadata set in the zipfile.
The unzip manpage states:

 Normally, the -a option should only affect files which are
 marked as text files in the zipfile's central directory.  Since
 the archive(3) library reads zipfiles sequentially, and does
 not use the central directory, that information is not available
 to the unzip utility.  Instead, the unzip utility will assume
 that a file is a text file if no non-ASCII characters are
 present within the first block of data decompressed for that
 file.  If non-ASCII characters appear in subsequent blocks of
 data, a warning will be issued.

https://www.freebsd.org/cgi/man.cgi?query=unzip&sektion=1&manpath=FreeBSD+10.3-RELEASE

> Heh.  It was created like so:
> 
>   $ printf 'text\r\n' >text && zip -ll infozip-text.zip text
>   $ zipinfo infozip-text.zip text
> -rw-r-  3.0 unx5 tx stor 16-Jul-18 13:12 text
> 

Thanks, but I think the test file is too small.  I tried
setting up a test to store the text file as binary in the
zip to check for inadvertant CRLF conversions:

  printf 'text\r\n' >binary && zip -B infozip-binary.zip binary

But zip -B/--binary only works on VM/CMS and MVS...

So I'm inclined to go with Dscho's patch.

(apologies for messing up René's name in my previous email;
 on my new FreeBSD setup: mutt displays it fine, as does vim when
 taking it from .mailmap, but something gets lost when mutt
 populates the file for vim.  Perhaps some Debian patch didn't
 make it upstream...)
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: Git and SHA-1 security (again)

Junio C Hamano wrote:

> Continuing this thought process, I do not see a good way to allow us
> to wean ourselves off of the old hash, unless we _break_ the pack
> stream format so that each object in the pack carries not just the
> data but also the hash algorithm to be used to _name_ it, so that
> new objects will never be referred to using the old hash.

Taking a step further: I don't think that any backward-compatible
format change would address the security concerns with sufficiently
old hashing algorithms.

As long as my favorite repository is allowed to contain objects
identified by SHA-1, my adversary can exploit a SHA-1 collision using
signed tags referring (possibly indirectly) to backdated objects. The
Git object format does not include a proof of commit date, so I cannot
guarantee "Only old objects are named by SHA-1".

There is a way to get a backward-compatible *user experience* without
the format change being backward-compatible, though. Name all objects
in the repository using FuturisticHash. Also store enough information
to recover the old hashes, either in objects as a new field or in a
side table.

If the old hash is broken, signatures using the old hash cannot be
trusted. An adversary could generate a collision to retroactively
change the meaning of an existing signature. To maintain the meaning
of old signatures, someone has to record the new names of all involved
objects, either before the state of the art in breaking the old hash
advances far enough or using a copy of the repository from before the
state of the art had advanced --- in effect you need new signatures to
maintain the meaning of old signatures. This could happen as part of
the process of updating a repository to use a new hash.

E.g.

object
a787a87b98a7s98798a798b7a98b798a7b98a7b987a9b87a9b87a98b79a87b98a7b98a7b987a987987a878a78a
sha1tag object 04b871796dc0420f8e7561a895b52484b701d51a
type commit
tag signedtag
tagger C O Mitter 1465981006 +

signed tag

signed tag message body
-BEGIN PGP SIGNATURE-
Version: GnuPG v1

iQEcBAABAgAGBQJXYRhOAAoJEGEJLoW3InGJklkIAIcnhL7RwEb/+QeX9enkXhxn
rxfdqrvWd1K80sl2TOt8Bg/NYwrUBw/RWJ+sg/hhHp4WtvE1HDGHlkEz3y11Lkuh
8tSxS3qKTxXUGozyPGuE90sJfExhZlW4knIQ1wt/yWqM+33E9pN4hzPqLwyrdods
q8FWEqPPUbSJXoMbRPw04S5jrLtZSsUWbRYjmJCHzlhSfFWW4eFd37uquIaLUBS0
rkC3Jrx7420jkIpgFcTI2s60uhSQLzgcCwdA2ukSYIRnjg/zDkj8+3h/GaROJ72x
lZyI6HWixKJkWw8lE9aAOD9TmTW9sFJwcVAzmAuFX2kUreDUKMZduGcoRYGpD7E=
=jpXa
-END PGP SIGNATURE-
-BEGIN PGP SIGNATURE
...
-END PGP SIGNATURE

This example uses a signature to attest that mapping
04b871796dc0420f8e7561a895b52484b701d51a->a787a87b98a7s98798a798b7a98b798a7b98a7b987a9b87a9b87a98b79a87b98a7b98a7b987a987987a878a78a
is correct. A more straightforward approach would be for the
conversion process to produce an out-of-band signed mapping list to
make the sha1tag usable without such a signature.

Summary:
* Git's properties depend on using a single hash function throughout
a repository. I don't think we should change that.

* A safe and mostly painless migration to a stronger hash function is
possible using a signed assertion (for example generated by the
conversion process) of the mapping from old object names to new
object names.

* Dealing with multiple such signed mappings (for example due to
separate conversion of repositories based on linux.git) is left as
an exercise to the reader.

Hope that helps,
Jonathan
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html

Re: Git and SHA-1 security (again)

>> The reality of the current situation is that it's largely mitigated in
>> practice because:
>>
>> a) it's hard to hand someone a crafted blob to begin with for reasons
>> that have nothing to do with SHA-1 (they'll go "wtf is this garbage?")
>>
>> b) even in that case it's *very* hard to come up with two colliding
>> blobs that are *useful* for some nefarious purpose, e.g. a program A
>> that looks normal being replaced by an evil program B with the same
>> SHA-1.
>
> Thanks.  That's a nice rephrasing of
>
>   
> http://public-inbox.org/git/Pine.LNX.4.58.0504291221250.18901%40ppc970.osdl.org/
>
> where Linus explains SHA-1 is not the security, and the real
> security is in distribution.

If the real security is in the distribution, than why git supports
signed commits and objects?

The security of the signatures do depend on the hash. Saying the hash
is not a security feature and offering GPG signing based on that hash
is a damn big lie. You can change the hash algorithm to a secure one,
or change the signing method to be independent of the hash algorithm,
or you can stop offering signatures at all, but something has to be
done here.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] test-lib: stricter unzip(1) check

On Mon, Jul 18, 2016 at 2:19 PM, Eric Wong  wrote:

> Thanks, but I think the test file is too small.  I tried
> setting up a test to store the text file as binary in the
> zip to check for inadvertant CRLF conversions:
>
>   printf 'text\r\n' >binary && zip -B infozip-binary.zip binary
>
> But zip -B/--binary only works on VM/CMS and MVS...
>
> So I'm inclined to go with Dscho's patch.

OK, I'll wait for the final reroll and queue it near 'next' when it happens.

Thanks.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] fetch-pack: grow stateless RPC windows exponentially

Jonathan Tan  writes:

> and it would look like that patch. (I would probably redefine
> LARGE_FLUSH to be 10 times its current value instead of multiplying it
> by 10, since it is not used anywhere else.)

Sounds good.  Care to do the final version of the patch to be
applied?

Thanks.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH v2] fetch-pack: grow stateless RPC windows exponentially

When updating large repositories, the LARGE_FLUSH limit (that is, the
limit at which the window growth strategy switches from exponential to
linear) is reached quite quickly. Use a conservative exponential growth
strategy when that limit is reached instead (and increase LARGE_FLUSH so
that there is no regression in window size).

This optimization is only applied during stateless RPCs to avoid the
issue raised and fixed in commit
44d8dc54e73e8010c4bdf57a422fc8d5ce709029.

Signed-off-by: Jonathan Tan 
---
 fetch-pack.c | 19 ---
 1 file changed, 12 insertions(+), 7 deletions(-)

diff --git a/fetch-pack.c b/fetch-pack.c
index b501d5c..85e77af 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -243,16 +243,21 @@ static void insert_one_alternate_ref(const struct ref 
*ref, void *unused)
 
 #define INITIAL_FLUSH 16
 #define PIPESAFE_FLUSH 32
-#define LARGE_FLUSH 1024
+#define LARGE_FLUSH 16384
 
 static int next_flush(struct fetch_pack_args *args, int count)
 {
-   int flush_limit = args->stateless_rpc ? LARGE_FLUSH : PIPESAFE_FLUSH;
-
-   if (count < flush_limit)
-   count <<= 1;
-   else
-   count += flush_limit;
+   if (args->stateless_rpc) {
+   if (count < LARGE_FLUSH)
+   count <<= 1;
+   else
+   count = count * 11 / 10;
+   } else {
+   if (count < PIPESAFE_FLUSH)
+   count <<= 1;
+   else
+   count += PIPESAFE_FLUSH;
+   }
return count;
 }
 
-- 
2.8.0.rc3.226.g39d4020

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

t7063 failure on FreeBSD 10.3 i386/amd64

Not sure what's going on, below is the relevant output when
run with -i -v --tee (the rest succeeds without -i):

ok 26 - untracked cache correct after status

expecting success: 
avoid_racy &&
: >../trace &&
GIT_TRACE_UNTRACKED_STATS="$TRASH_DIRECTORY/trace" \
git status --porcelain >../status.actual &&
cat >../status.expect <../trace.expect

Re: [PATCH v2] fetch-pack: grow stateless RPC windows exponentially

Jonathan Tan wrote:

> Signed-off-by: Jonathan Tan 
> ---
>  fetch-pack.c | 19 ---
>  1 file changed, 12 insertions(+), 7 deletions(-)

Reviewed-by: Jonathan Nieder 

Thanks again.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: Git and SHA-1 security (again)

2016-07-18 Thread brian m. carlson

On Mon, Jul 18, 2016 at 09:00:06AM +0200, Johannes Schindelin wrote:
> Hi Brian,
> 
> On Sun, 17 Jul 2016, brian m. carlson wrote:
> 
> > On Sun, Jul 17, 2016 at 10:01:38AM +0200, Johannes Schindelin wrote:
> > > Out of curiosity: have you considered something like padding the SHA-1s
> > > with, say 0xa1, to the size of the new hash and using that padding to
> > > distinguish between old vs new hash?
> > 
> > I'm going to end up having to do something similar because of the issue
> > of submodules.  Submodules may still be SHA-1, while the main repo may
> > be a newer hash.  I was going to zero-pad, however.
> 
> I thought about zero-padding, but there are plenty of
> is_null_sha1()/is_null_oid() calls around. Of course, I assumed
> left-padding. But you may have thought of right-padding instead? That
> would make short name handling much easier, too.

I was going to right-pad.

> FWIW it never crossed my mind to allow different same-sized hash
> algorithms. So I never thought we'd need a way to distinguish, say,
> BLAKE2b-256 from SHA-256.
> 
> Is there a good reason to add the maintenance burden of several 256-bit
> hash algorithms, apart from speed (which in my mind should decide which
> one to use, always, rather than letting the user choose)? It would also
> complicate transport even further, let alone subtree merges from
> differently-hashed repositories.

There are really three candidates:

* SHA-256 (the SHA-2 algorithm): While this looks good right now,
  cryptanalysis is advancing.  This is not a good choice for a long-term
  solution.
* SHA3-256 (the SHA-3 algorithm): This is the conservative choice.  It's
  also faster than SHA-256 on 64-bit systems.  It has a very
  conservative security margin and is a good long-term choice.
* BLAKE2b-256: This is the blisteringly fast choice.  It outperforms
  SHA-1 and even MD5 on 64-bit systems.  This algorithm was designed so
  that nobody would have a reason to use an insecure algorithm.  It will
  probably be secure for some time, but maybe not as long as SHA3-256.

I'm only considering 256-bit hashes, because anything longer won't fit
on an 80-column terminal in hex form.

The reason I had considered implementing both SHA3-256 and BLAKE2b-256
is that I want there to be no reason not to upgrade.  People who need a
FIPS-approved algorithm or want a long-term, conservative choice should
use SHA3-256.  People who want even better performance than current Git
would use BLAKE2b-256.

Performance comparison (my implementations):
SHA-1: 437 MiB/s
SHA-256:   196 MiB/s
SHA3-256:  273 MiB/s
BLAKE2b:   649 MiB/s

I hadn't thought about subtree merges, though.
-- 
brian m. carlson / brian with sandals: Houston, Texas, US
+1 832 623 2791 | https://www.crustytoothpaste.net/~bmc | My opinion only
OpenPGP: https://keybase.io/bk2204

signature.asc
Description: PGP signature

Re: t7063 failure on FreeBSD 10.3 i386/amd64

Oops, forgot to Cc some folks who worked on this :x

Filesystem is ufs and it fails regardless of whether
soft-updates is enabled or not.

Eric Wong  wrote:
> Not sure what's going on, below is the relevant output when
> run with -i -v --tee (the rest succeeds without -i):
> 
> ok 26 - untracked cache correct after status
> 
> expecting success: 
>   avoid_racy &&
>   : >../trace &&
>   GIT_TRACE_UNTRACKED_STATS="$TRASH_DIRECTORY/trace" \
>   git status --porcelain >../status.actual &&
>   cat >../status.expect <  M done/two
> ?? .gitignore
> ?? done/five
> ?? dtwo/
> EOF
>   test_cmp ../status.expect ../status.actual &&
>   cat >../trace.expect < node creation: 0
> gitignore invalidation: 0
> directory invalidation: 0
> opendir: 0
> EOF
>   test_cmp ../trace.expect ../trace
> 
> --- ../trace.expect   2016-07-18 22:23:28.679886000 +
> +++ ../trace  2016-07-18 22:23:28.677135000 +
> @@ -1,4 +1,4 @@
>  node creation: 0
>  gitignore invalidation: 0
> -directory invalidation: 0
> -opendir: 0
> +directory invalidation: 1
> +opendir: 1
> not ok 27 - test sparse status again with untracked cache
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

git-prompt.sh incompatible with non-basic global grep.patternType

2016-07-18 Thread Richard Soderberg

Hi, I wanted to report something interesting that I found while
tracing a severe slowdown in git-prompt.sh.

https://github.com/git/git/commit/6d158cba282f22fa1548af1188f78042fed30aed#diff-f37c4f4a898819f0ca4b5ff69e81d4d9R141

Way back in this commit, someone added a useful chunk of code that
works perfectly with svn+ssh:// URLs under basic regexes:

+ local svn_upstream=($(git log --first-parent -1 \
+ --grep="^git-svn-id: \(${svn_url_pattern:2}\)" 2>/dev/null))

However, if I switch over to Perl regexes (or Extended):

git config --global grep.patternType perl

Then the command runs for one wall clock second and shows incorrect
results on my repository. I eventually traced this to an issue with
the regular expression provided, assuming the svn repository url is
"svn+ssh://...":

git log ... --grep="^git-svn-id: \(svn+ssh://...\)" 2>/dev/null

The + sign isn't escaped in git-prompt.sh, which under non-basic
regexes causes the match to fail entirely.

 - R.

ps. git log --basic-regexp does not fix the issue, as for unknown
reasons (I'll start another thread) the command-line option doesn't
override grep.patternType.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: Git and SHA-1 security (again)

2016-07-18 Thread brian m. carlson

On Mon, Jul 18, 2016 at 11:00:35AM -0700, Junio C Hamano wrote:
> Continuing this thought process, I do not see a good way to allow us
> to wean ourselves off of the old hash, unless we _break_ the pack
> stream format so that each object in the pack carries not just the
> data but also the hash algorithm to be used to _name_ it, so that
> new objects will never be referred to using the old hash.

I think for this reason, I'm going to propose the following approach
when we get there:

* We serialize the hash in the object formats, using multihash or
  something similar.  This means that it is minimally painful if we ever
  need to change in the future[0].
* Each repository carries exactly one hash algorithm, except for
  submodule data.  If we don't do this, then some people will never
  switch because the submodules they depend on haven't.
* If people on the new format need to refer to submodule commits using
  SHA-1, then they have to use a prefix on the hash form; otherwise,
  they can use the raw hash value (without any multihash prefix).
* git fsck verifies one consistent algorithm (excepting submodule
  references).

This preserves the security benefits, avoids future-proofing problems,
and minimizes performance impacts due to naming like you mentioned.

[0] We are practically limited to 256-bit hashes because anything longer
will wrap on an 80-column terminal when in hex form.
-- 
brian m. carlson / brian with sandals: Houston, Texas, US
+1 832 623 2791 | https://www.crustytoothpaste.net/~bmc | My opinion only
OpenPGP: https://keybase.io/bk2204

signature.asc
Description: PGP signature

Re: [PATCH] test-lib: stricter unzip(1) check

On Mon, Jul 18, 2016 at 12:43:27PM -0700, Junio C Hamano wrote:

> -- >8 --
> test: check "unzip" and "unzip -a"
> 
> Different platforms have implementations "unzip" that behave
> differently.  Most of the tests we use GIT_UNZIP we only care about
> the command to be able to extract from *.zip archive, but one test
> in t5003 wants it to also be able to grok the "-a" option.
> 
> Prepare a sample zip file that has a single text file in it, and try
> extracting its contents to see GIT_UNZIP is usable. when setting
> UNZIP prerequisite.  Similarly, set UNZIP_AUTOTEXT prerequisite by
> running GIT_UNZIP with the "-a" option.

I like the direction here, modulo the problems with "-a" that Eric
pointed out. Maybe "zip -l" would be a better approach.

One nit:

> +test_lazy_prereq UNZIP_AUTOTEXT '
> + (
> + mkdir unzip-autotext &&
> + cd unzip-autotext
> + "$GIT_UNZIP" -a "$TEST_DIRECTORY"/t5003/infozip-text.zip &&
> + test -f text
> + )
> +'

I don't think we need the extra directory or the subshell here.
test_lazy_prereq takes care of that for us.

> diff --git a/t/t5003/infozip-text.zip b/t/t5003/infozip-text.zip
> new file mode 100644
> index 000..a019acb
> Binary files /dev/null and b/t/t5003/infozip-text.zip differ

Couldn't apply this locally without --binary, of course. :)

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v3 00/16] Use merge_recursive() directly in the builtin am

Junio C Hamano  writes:

> Johannes Schindelin  writes:
>
>> The two topics that are in 'pu' and conflict with this series are
>> 'jh/clean-smudge-annex' and 'bc/cocci'.
>>
>> It also conflicted with 'va/i18n-even-more', but that one was merged to
>> 'master'.
>>
>> Now, I think it would be okay to wait for 'bc/cocci' to go to 'master'
>> before integrating the 'am-3-merge-recursive-direct' branch, but I would
>> want to avoid waiting for 'jh/clean-smudge-annex'.

Nothing seems to be happening on jh/clean-smudge-annex front, so
unless we hear otherwise from Joey in the coming couple of days,
let's get js/am-3-merge-recursive-direct untangled from its
dependencies and get it into a shape to hit 'next'.  You can assume
that I'll merge bc/cocci and js/am-call-theirs-theirs-in-fallback-3way
to 'master' during that time, so an appropriate base to use would be
the result of

git checkout master^0
git merge bc/cocci
git merge js/am-call-theirs-theirs-in-fallback-3way
git merge cc/apply-am

if you want a semi-solid base to rebuild am-3-merge-recursive-direct
on.  I am not 100% sure about the doneness of cc/apply-am yet,
though but it could be that am-3-merge-recursive-direct does not
have to depend on it at all.  You'd know better than I do ;-)

After that, I'll see if the conflict resolution is manageable when
remerging jh/clean-smudge-annex to 'pu', and if it is not, I'll
worry about it at that point.

Thanks.

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] pager: disable color when pager is "more"