Re: Why does 'submodule add' stage the relevant portions?

2013-03-26 Thread Ramkumar Ramachandra
Jens Lehmann wrote:
 Am 25.03.2013 20:57, schrieb Ramkumar Ramachandra:
 Doesn't that sound horribly crippled to you?  Is there any advantage
 to leaving the .git directory inside the submodule?  Isn't it always
 better to relocate it?

 It's not crippled at all, that is just the way it was from submodule
 day one. And no, it isn't always better to relocate it. E.g. when
 you want to be able to just tar away work tree and history someplace
 else because you don't have (or don't want) an upstream to push to,
 you'd be very surprised a submodule add moved your .git directory
 someplace else effectively nuking the backup of your history and
 refs (guess under what circumstances you'll notice that). While I
 believe most submodule users would benefit from such a relocation, I
 consider the other use cases as valid and we would introduce silent
 breakage on them. On the other hand I made all relevant commands
 complain loudly about the .git directory in the submodule's work
 tree when it matters, so users can do something about it when they
 need it and are told so.

I see.  Thanks for the explanation.

 Why a new subcommand?  Is there a problem if we do the relocation at
 the time of 'add'?  Will some user expectation break?

 For me relocation at the time of 'add' would be ok with a new option
 (and it might also make sense to have a config option changing the
 default for users who want that), but not as the default.

Makes sense.  This seems trivial to implement: I'll get to work on it soon.

 And leaving aside 'add', there are tons of submodules out there
 which were cloned with older Git who have their .git directory
 inside the work tree. So a new subcommand (or at least a helper
 script in contrib) to relocate the .git directory would really help
 here to migrate these legacy submodules without users having to
 worry about data loss.

The question is: after using a non-relocated submodule for some
time, will the user suddenly decide to make it a relocated submodule
one day?

 I meant a variant of add that would clone, but not stage.  I was
 arguing for a new subcommand so that I don't have to touch 'submodule
 add', not for a rename.

 Ah, now I get it, I was confused by your reference to 'git submodule
 add repository path'. I have to admit I still don't understand
 the use case you have for adding a submodule without staging it, but
 maybe it is just too late here.

I usually reset after running 'git submodule add', because I rarely
commit the added submodule immediately after adding it.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: git ate my home directory :-(

2013-03-26 Thread Philip Oakley

From: Junio C Hamano gits...@pobox.com
Sent: Monday, March 25, 2013 10:06 PM

Jonathan Nieder jrnie...@gmail.com writes:


Richard Weinberger wrote:

In my scripts I'm setting GIT_DIR to use git-fetch and git-reset 
without changing the

current working directory all the time.


Yeah, for historical reasons GIT_WORK_TREE defaults to $(pwd) when
GIT_DIR is explicitly set.


And it *WILL* be that way til the end of time.  Unless you are at
the top level of your working tree, you are supposed to tell where
the top level is with GIT_WORK_TREE when you use GIT_DIR.  Always.


Should this important warning be part of the git(1) documentation on the 
environment variables (and possibly other places) given the consequences 
of this case? It wasn't something I'd appreciated from a simple reading.




And that is the answer you should be giving here, not implicit
stuff, which is an implementation detail to help aliases.  I do not
know how things will break when the end user sets and exports it to
the environment, and I do not think we would want to make any
promise on how it works.
--
Philip 


--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2] difftool: don't overwrite modified files

2013-03-26 Thread Johannes Sixt
Am 3/25/2013 22:44, schrieb John Keeping:
 After running the user's diff tool, git-difftool will copy any files
 that differ between the working tree and the temporary tree.  This is
 useful when the user edits the file in their diff tool but is wrong if
 they edit the working tree file while examining the diff.
 
 Instead of copying unconditionally when the files differ, create and
 index from the working tree files and only copy the temporary file back
 if it was modified and the working tree file was not.  If both files
 have been modified, print a warning and exit with an error.
 
 Note that we cannot use an existing index in git-difftool since those
 contain the modified files that need to be checked out but here we are
 looking at those files which are copied from the working tree and not
 checked out.  These are precisely the files which are not in the
 existing indices.
 
 Signed-off-by: John Keeping j...@keeping.me.uk
 ---
 On Mon, Mar 25, 2013 at 10:42:19AM +, John Keeping wrote:
 On Mon, Mar 25, 2013 at 08:41:59AM +0100, Johannes Sixt wrote:
 This is gross. Can't we do much better here? Difftool already keeps a
 GIT_INDEX of the files in the temporary tree ($tmpdir/rindex). Running
 git-diff-files should be sufficient to tell which ones where edited via
 the users's diff-tool. Then you can restrict calling hash-object to only
 those worktree files where an edit collision needs to be checked for.

 That's only the case for files that are not copied from the working
 tree, so the temporary index doesn't contain the files that are of
 interest here.

 You could also keep a parallel index that keeps the state of the same set
 of files in the worktree. Then another git-diff-files call could replace
 the other half of hash-object calls.

 I like the idea of creating an index from the working tree files and
 using it here.  If we create a starting state index for these files,
 we should be able to run git-diff-files against both the working tree
 and the temporary tree at this point and compare the output.
 
 Here's an attempt at taking this approach, built on
 jk/difftool-dir-diff-edit-fix.
 
  git-difftool.perl   | 73 
 +++--
  t/t7800-difftool.sh | 26 +++
  2 files changed, 85 insertions(+), 14 deletions(-)
 
 diff --git a/git-difftool.perl b/git-difftool.perl
 index c433e86..d10f7d2 100755
 --- a/git-difftool.perl
 +++ b/git-difftool.perl
 @@ -13,9 +13,9 @@
  use 5.008;
  use strict;
  use warnings;
 +use Error qw(:try);
  use File::Basename qw(dirname);
  use File::Copy;
 -use File::Compare;
  use File::Find;
  use File::stat;
  use File::Path qw(mkpath rmtree);
 @@ -88,14 +88,45 @@ sub use_wt_file
   my ($repo, $workdir, $file, $sha1, $symlinks) = @_;
   my $null_sha1 = '0' x 40;
  
 - if ($sha1 eq $null_sha1) {
 - return 1;
 - } elsif (not $symlinks) {
 + if ($sha1 ne $null_sha1 and not $symlinks) {
   return 0;
   }
  
   my $wt_sha1 = $repo-command_oneline('hash-object', $workdir/$file);
 - return $sha1 eq $wt_sha1;
 + my $use = ($sha1 eq $null_sha1) || ($sha1 eq $wt_sha1);
 + return ($use, $wt_sha1);
 +}
 +
 +sub changed_files
 +{
 + my ($repo_path, $index, $worktree) = @_;
 + $ENV{GIT_INDEX_FILE} = $index;
 + $ENV{GIT_WORK_TREE} = $worktree;
 + my $must_unset_git_dir = 0;
 + if (not defined($ENV{GIT_DIR})) {
 + $must_unset_git_dir = 1;
 + $ENV{GIT_DIR} = $repo_path;
 + }
 +
 + my @refreshargs = qw/update-index --really-refresh -q --unmerged/;
 + my @gitargs = qw/diff-files --name-only -z/;
 + try {
 + Git::command_oneline(@refreshargs);
 + } catch Git::Error::Command with {};
 +
 + my $line = Git::command_oneline(@gitargs);
 + my @files;
 + if (defined $line) {
 + @files = split('\0', $line);
 + } else {
 + @files = ();
 + }
 +
 + delete($ENV{GIT_INDEX_FILE});
 + delete($ENV{GIT_WORK_TREE});
 + delete($ENV{GIT_DIR}) if ($must_unset_git_dir);
 +
 + return map { $_ = 1 } @files;
  }
  
  sub setup_dir_diff
 @@ -121,6 +152,7 @@ sub setup_dir_diff
   my $null_sha1 = '0' x 40;
   my $lindex = '';
   my $rindex = '';
 + my $wtindex = '';
   my %submodule;
   my %symlink;
   my @working_tree = ();
 @@ -174,8 +206,12 @@ EOF
   }
  
   if ($rmode ne $null_mode) {
 - if (use_wt_file($repo, $workdir, $dst_path, $rsha1, 
 $symlinks)) {
 - push(@working_tree, $dst_path);
 + my ($use, $wt_sha1) = use_wt_file($repo, $workdir,
 +   $dst_path, $rsha1,
 +   $symlinks);
 + if ($use) {
 + push @working_tree, $dst_path;
 + $wtindex .= $rmode 

Re: [PATCH v2] difftool: don't overwrite modified files

2013-03-26 Thread Johannes Sixt
Forgot to mention: The patch passes t7800 on Windows.

Thanks,
-- Hannes
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 3/3] t7800: run --dir-diff tests with and without symlinks

2013-03-26 Thread John Keeping
On Mon, Mar 25, 2013 at 02:50:38PM -0700, Junio C Hamano wrote:
 Johannes Sixt j.s...@viscovery.net writes:
 
  Am 3/25/2013 11:35, schrieb John Keeping:
  On Mon, Mar 25, 2013 at 08:26:52AM +0100, Johannes Sixt wrote:
  The series looks good, but I can't test it because it does not apply
  anywhere here.
  
  It's built on top of da/difftool-fixes, is there some problem that stops
  it applying cleanly on top of that?
 
  Thanks. I had only tried trees that were contaminated by
  jk/difftool-dir-diff-edit-fix, which is in conflict with da/difftool-fixes.
 
 Yes, the conflict was unpleasant to deal with.  John, I think what
 is queued on 'pu' is OK, but please double check after I push it out
 in a few hours.

The result looks good to me.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2] difftool: don't overwrite modified files

2013-03-26 Thread John Keeping
On Tue, Mar 26, 2013 at 09:38:42AM +0100, Johannes Sixt wrote:
 Am 3/25/2013 22:44, schrieb John Keeping:
  After running the user's diff tool, git-difftool will copy any files
  that differ between the working tree and the temporary tree.  This is
  useful when the user edits the file in their diff tool but is wrong if
  they edit the working tree file while examining the diff.
  
  Instead of copying unconditionally when the files differ, create and
  index from the working tree files and only copy the temporary file back
  if it was modified and the working tree file was not.  If both files
  have been modified, print a warning and exit with an error.
  
  Note that we cannot use an existing index in git-difftool since those
  contain the modified files that need to be checked out but here we are
  looking at those files which are copied from the working tree and not
  checked out.  These are precisely the files which are not in the
  existing indices.
  
  Signed-off-by: John Keeping j...@keeping.me.uk
  ---
  On Mon, Mar 25, 2013 at 10:42:19AM +, John Keeping wrote:
  On Mon, Mar 25, 2013 at 08:41:59AM +0100, Johannes Sixt wrote:
  This is gross. Can't we do much better here? Difftool already keeps a
  GIT_INDEX of the files in the temporary tree ($tmpdir/rindex). Running
  git-diff-files should be sufficient to tell which ones where edited via
  the users's diff-tool. Then you can restrict calling hash-object to only
  those worktree files where an edit collision needs to be checked for.
 
  That's only the case for files that are not copied from the working
  tree, so the temporary index doesn't contain the files that are of
  interest here.
 
  You could also keep a parallel index that keeps the state of the same set
  of files in the worktree. Then another git-diff-files call could replace
  the other half of hash-object calls.
 
  I like the idea of creating an index from the working tree files and
  using it here.  If we create a starting state index for these files,
  we should be able to run git-diff-files against both the working tree
  and the temporary tree at this point and compare the output.
  
  Here's an attempt at taking this approach, built on
  jk/difftool-dir-diff-edit-fix.
  
   git-difftool.perl   | 73 
  +++--
   t/t7800-difftool.sh | 26 +++
   2 files changed, 85 insertions(+), 14 deletions(-)
  
  diff --git a/git-difftool.perl b/git-difftool.perl
  index c433e86..d10f7d2 100755
  --- a/git-difftool.perl
  +++ b/git-difftool.perl
  @@ -13,9 +13,9 @@
   use 5.008;
   use strict;
   use warnings;
  +use Error qw(:try);
   use File::Basename qw(dirname);
   use File::Copy;
  -use File::Compare;
   use File::Find;
   use File::stat;
   use File::Path qw(mkpath rmtree);
  @@ -88,14 +88,45 @@ sub use_wt_file
  my ($repo, $workdir, $file, $sha1, $symlinks) = @_;
  my $null_sha1 = '0' x 40;
   
  -   if ($sha1 eq $null_sha1) {
  -   return 1;
  -   } elsif (not $symlinks) {
  +   if ($sha1 ne $null_sha1 and not $symlinks) {
  return 0;
  }
   
  my $wt_sha1 = $repo-command_oneline('hash-object', $workdir/$file);
  -   return $sha1 eq $wt_sha1;
  +   my $use = ($sha1 eq $null_sha1) || ($sha1 eq $wt_sha1);
  +   return ($use, $wt_sha1);
  +}
  +
  +sub changed_files
  +{
  +   my ($repo_path, $index, $worktree) = @_;
  +   $ENV{GIT_INDEX_FILE} = $index;
  +   $ENV{GIT_WORK_TREE} = $worktree;
  +   my $must_unset_git_dir = 0;
  +   if (not defined($ENV{GIT_DIR})) {
  +   $must_unset_git_dir = 1;
  +   $ENV{GIT_DIR} = $repo_path;
  +   }
  +
  +   my @refreshargs = qw/update-index --really-refresh -q --unmerged/;
  +   my @gitargs = qw/diff-files --name-only -z/;
  +   try {
  +   Git::command_oneline(@refreshargs);
  +   } catch Git::Error::Command with {};
  +
  +   my $line = Git::command_oneline(@gitargs);
  +   my @files;
  +   if (defined $line) {
  +   @files = split('\0', $line);
  +   } else {
  +   @files = ();
  +   }
  +
  +   delete($ENV{GIT_INDEX_FILE});
  +   delete($ENV{GIT_WORK_TREE});
  +   delete($ENV{GIT_DIR}) if ($must_unset_git_dir);
  +
  +   return map { $_ = 1 } @files;
   }
   
   sub setup_dir_diff
  @@ -121,6 +152,7 @@ sub setup_dir_diff
  my $null_sha1 = '0' x 40;
  my $lindex = '';
  my $rindex = '';
  +   my $wtindex = '';
  my %submodule;
  my %symlink;
  my @working_tree = ();
  @@ -174,8 +206,12 @@ EOF
  }
   
  if ($rmode ne $null_mode) {
  -   if (use_wt_file($repo, $workdir, $dst_path, $rsha1, 
  $symlinks)) {
  -   push(@working_tree, $dst_path);
  +   my ($use, $wt_sha1) = use_wt_file($repo, $workdir,
  + $dst_path, $rsha1,
  + $symlinks);
  +   if ($use) {
  +   

Re: git ate my home directory :-(

2013-03-26 Thread Duy Nguyen
On Tue, Mar 26, 2013 at 08:02:30AM -, Philip Oakley wrote:
  Yeah, for historical reasons GIT_WORK_TREE defaults to $(pwd) when
  GIT_DIR is explicitly set.
 
  And it *WILL* be that way til the end of time.  Unless you are at
  the top level of your working tree, you are supposed to tell where
  the top level is with GIT_WORK_TREE when you use GIT_DIR.  Always.
 
 Should this important warning be part of the git(1) documentation on the 
 environment variables (and possibly other places) given the consequences 
 of this case? It wasn't something I'd appreciated from a simple reading.

Something like this, maybe?

-- 8 --
Subject: [PATCH] git.txt: document the implicit working tree setting with 
GIT_DIR

Signed-off-by: Nguyễn Thái Ngọc Duy pclo...@gmail.com
---
 Documentation/git.txt | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/Documentation/git.txt b/Documentation/git.txt
index 7efaa59..ce55abf 100644
--- a/Documentation/git.txt
+++ b/Documentation/git.txt
@@ -671,6 +671,8 @@ Git so take care if using Cogito etc.
specifies a path to use instead of the default `.git`
for the base of the repository.
The '--git-dir' command-line option also sets this value.
+   If neither GIT_WORK_TREE nor '--work-tree' is set, the
+   current directory will become the working tree.
 
 'GIT_WORK_TREE'::
Set the path to the working tree.  The value will not be
-- 
1.8.2.82.gc24b958
-- 8 --
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2] difftool: don't overwrite modified files

2013-03-26 Thread Johannes Sixt
Am 3/26/2013 10:31, schrieb John Keeping:
 On Tue, Mar 26, 2013 at 09:38:42AM +0100, Johannes Sixt wrote:
 One question though: Do I understand correctly that the temporary
 directories are leaked in the case of an edit conflict? If so, is it
 worth a warning for the user to clean up the garbage?
 
 Do you mean for normal users or for those running the tests?  In normal
 usage we do print a warning - it's in the existing code, triggered by
 setting $error = 1 - you can see that if you run the tests with -v.

I meant for normal users. I see the error now. Thanks.

 The last test does result in /tmp filling up with temporary directories
 though, it would be good if the test could clean up after itself.  The
 best I can come up with is adding something like this immediately after
 running difftool but I'm not entirely happy with the .. in the
 argument to rm:
 
   test_when_finished rm -rf $(cat tmpdir)/..

Wrap the test in

(
TMPDIR=$TRASH_DIRECTORY 
export TMPDIR 
...
)

It works for me.

-- Hannes
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v4 3/5] merge/pull: verify GPG signatures of commits being merged

2013-03-26 Thread Sebastian Götte
When --verify-signatures is specified on the command-line of git-merge
or git-pull, check whether the commits being merged have good gpg
signatures and abort the merge in case they do not. This allows e.g.
auto-deployment from untrusted repo hosts.

Signed-off-by: Sebastian Götte ja...@physik-pool.tu-berlin.de
---
 Documentation/merge-options.txt|  4 +++
 builtin/merge.c| 33 +++-
 git-pull.sh| 10 ++--
 t/t7612-merge-verify-signatures.sh | 52 ++
 4 files changed, 96 insertions(+), 3 deletions(-)

diff --git a/Documentation/merge-options.txt b/Documentation/merge-options.txt
index 0bcbe0a..2f76ab5 100644
--- a/Documentation/merge-options.txt
+++ b/Documentation/merge-options.txt
@@ -83,6 +83,10 @@ option can be used to override --squash.
Pass merge strategy specific option through to the merge
strategy.
 
+--verify-signatures::
+--no-verify-signatures::
+   Verify that the commits being merged have good trusted GPG signatures
+
 --summary::
 --no-summary::
Synonyms to --stat and --no-stat; these are deprecated and will be
diff --git a/builtin/merge.c b/builtin/merge.c
index 7c8922c..227040b 100644
--- a/builtin/merge.c
+++ b/builtin/merge.c
@@ -49,7 +49,7 @@ static const char * const builtin_merge_usage[] = {
 static int show_diffstat = 1, shortlog_len = -1, squash;
 static int option_commit = 1, allow_fast_forward = 1;
 static int fast_forward_only, option_edit = -1;
-static int allow_trivial = 1, have_message;
+static int allow_trivial = 1, have_message, verify_signatures;
 static int overwrite_ignore = 1;
 static struct strbuf merge_msg = STRBUF_INIT;
 static struct strategy **use_strategies;
@@ -199,6 +199,8 @@ static struct option builtin_merge_options[] = {
OPT_BOOLEAN(0, ff-only, fast_forward_only,
N_(abort if fast-forward is not possible)),
OPT_RERERE_AUTOUPDATE(allow_rerere_auto),
+   OPT_BOOLEAN(0, verify-signatures, verify_signatures,
+   N_(Verify that the named commit has a valid GPG signature)),
OPT_CALLBACK('s', strategy, use_strategies, N_(strategy),
N_(merge strategy to use), option_parse_strategy),
OPT_CALLBACK('X', strategy-option, xopts, N_(option=value),
@@ -1233,6 +1235,35 @@ int cmd_merge(int argc, const char **argv, const char 
*prefix)
usage_with_options(builtin_merge_usage,
builtin_merge_options);
 
+   if (verify_signatures) {
+   /* Verify the commit signatures */
+   for (p = remoteheads; p; p = p-next) {
+   struct commit *commit = p-item;
+   struct signature signature;
+   signature.check_result = 0;
+
+   check_commit_signature(commit, signature);
+
+   char hex[41];
+   strcpy(hex, find_unique_abbrev(commit-object.sha1, 
DEFAULT_ABBREV));
+   switch(signature.check_result){
+   case 'G':
+   if (verbosity = 0)
+   printf(_(Commit %s has a good 
GPG signature by %s (key fingerprint %s)\n), hex, signature.signer, 
signature.key);
+   break;
+   case 'B':
+   die(_(Commit %s has a bad GPG 
signature allegedly by %s (key fingerprint %s).), hex, signature.signer, 
signature.key);
+   default: /* 'N' */
+   die(_(Commit %s does not have a good 
GPG signature. In fact, commit %s does not have a GPG signature at all.), hex, 
hex);
+   }
+
+   free(signature.gpg_output);
+   free(signature.gpg_status);
+   free(signature.signer);
+   free(signature.key);
+   }
+   }
+
strbuf_addstr(buf, merge);
for (p = remoteheads; p; p = p-next)
strbuf_addf(buf,  %s, merge_remote_util(p-item)-name);
diff --git a/git-pull.sh b/git-pull.sh
index 266e682..705940d 100755
--- a/git-pull.sh
+++ b/git-pull.sh
@@ -39,7 +39,7 @@ test -z $(git ls-files -u) || die_conflict
 test -f $GIT_DIR/MERGE_HEAD  die_merge
 
 strategy_args= diffstat= no_commit= squash= no_ff= ff_only=
-log_arg= verbosity= progress= recurse_submodules=
+log_arg= verbosity= progress= recurse_submodules= verify_signatures=
 merge_args= edit=
 curr_branch=$(git symbolic-ref -q HEAD)
 curr_branch_short=${curr_branch#refs/heads/}
@@ -125,6 +125,12 @@ do
--no-recurse-submodules)
recurse_submodules=--no-recurse-submodules
;;
+   --verify-signatures)
+   verify_signatures=--verify-signatures
+   ;;
+   --no-verify-signatures)
+   

[PATCH v4 4/5] merge/pull Check for untrusted good GPG signatures

2013-03-26 Thread Sebastian Götte
When --verify-signatures is specified, abort the merge in case a good
GPG signature from an untrusted key is encountered.

Signed-off-by: Sebastian Götte ja...@physik-pool.tu-berlin.de
---
 builtin/merge.c|   2 ++
 commit.c   |   2 ++
 commit.h   |   9 +
 gpg-interface.h|   2 +-
 t/lib-gpg/pubring.gpg  | Bin 1164 - 2359 bytes
 t/lib-gpg/random_seed  | Bin 600 - 600 bytes
 t/lib-gpg/secring.gpg  | Bin 1237 - 3734 bytes
 t/lib-gpg/trustdb.gpg  | Bin 1280 - 1360 bytes
 t/t7612-merge-verify-signatures.sh |   9 +
 9 files changed, 19 insertions(+), 5 deletions(-)

diff --git a/builtin/merge.c b/builtin/merge.c
index 227040b..e3df36f 100644
--- a/builtin/merge.c
+++ b/builtin/merge.c
@@ -1251,6 +1251,8 @@ int cmd_merge(int argc, const char **argv, const char 
*prefix)
if (verbosity = 0)
printf(_(Commit %s has a good 
GPG signature by %s (key fingerprint %s)\n), hex, signature.signer, 
signature.key);
break;
+   case 'U':
+   die(_(Commit %s has a good GPG 
signature allegedly by %s, albeit from an untrusted key (fingerprint %s).), 
hex, signature.signer, signature.key);
case 'B':
die(_(Commit %s has a bad GPG 
signature allegedly by %s (key fingerprint %s).), hex, signature.signer, 
signature.key);
default: /* 'N' */
diff --git a/commit.c b/commit.c
index 533727c..785eb51 100644
--- a/commit.c
+++ b/commit.c
@@ -1027,6 +1027,8 @@ static struct {
char result;
const char *check;
 } signature_check[] = {
+   { 'U', [GNUPG:] TRUST_UNDEFINED },
+   { 'U', [GNUPG:] TRUST_NEVER },
{ 'G', [GNUPG:] GOODSIG  },
{ 'B', [GNUPG:] BADSIG  },
 };
diff --git a/commit.h b/commit.h
index cf35472..c946135 100644
--- a/commit.h
+++ b/commit.h
@@ -232,10 +232,11 @@ extern void print_commit_list(struct commit_list *list,
  const char *format_last);
 
 /*
- * Check the signature of the given commit. The result of the check is stored 
in
- * sig-check_result, 'G' for a good signature, 'B' for a bad signature and 'N'
- * for no signature at all.
- * This may allocate memory for sig-gpg_output, sig-gpg_status, sig-signer 
and sig-key.
+ * Check the signature of the given commit. The result of the check is stored
+ * in sig-check_result, 'G' for a good signature, 'U' for a good signature
+ * from an untrusted signer, 'B' for a bad signature and 'N' for no signature
+ * at all.  This may allocate memory for sig-gpg_output, sig-gpg_status,
+ * sig-signer and sig-key.
  */
 extern void check_commit_signature(const struct commit* commit, struct 
signature *sig);
 
diff --git a/gpg-interface.h b/gpg-interface.h
index 2a536d9..7fd3021 100644
--- a/gpg-interface.h
+++ b/gpg-interface.h
@@ -4,7 +4,7 @@
 struct signature {
char *gpg_output;
char *gpg_status;
-   char check_result; /* 0 (not checked), N (checked but no further 
result), G (good) or B (bad) */
+   char check_result; /* 0 (not checked), N (checked but no further 
result), U (untrusted, good), G (good) or B (bad) */
char *signer;
char *key;
 };
diff --git a/t/lib-gpg/pubring.gpg b/t/lib-gpg/pubring.gpg
index 
83855fa4e1c6c37afe550c17afa1e7971042ded5..1a3c2d487c2fda9169751a3068fa51e853a1e519
 100644
GIT binary patch
delta 1212
zcmV;t1Vj6b3AYlkj0As~0SyFEOqNFh2mr~8{AU1PG9!Ku9w|}@vpZJPg*#s86v-*O
zhafj(DLlFA0)`tvC$E@WHJ2r~0{0ZCh1kHo$b9ih^aD*~)oVvKyC1(yi)6x_y
zF8V3JpbIY^ZYQUk#j*ja0`;jw^J+5~!h3qc)ej%g1;Wb0U8cXSuLKdXkx2PUtB4
zzqQO;r+6QaVFd?ZkA#8Ch(2p+QaoKR;K0{BU;l_DqGU2YCt6Fr4psgB*tPou
z8C@tnj=CqkffSLs=a-5G+h;Y8bkws-iIgT{$f^j!)l`hmkz}$nZPxBoPmwU;zIH^
zLN~K=bOhGBes9{Fcqzh%Wj2h^{ZjI01*KI0kkAVa%poQL}_zlZ*pX5VIVwYX((5
za%4bdcwudDY-KKPWpqA?0XPH`0RjLb1p-k_mPY~`0|pBT2nPcK1{DYb2?`4Y76JnS
z0v-VZ7k~f?2@qikE`_%uafw)?2m1%-{uDP0Rs1|(F{OVhOSKoH$k!o)x3CO5Z^@El
zxGq=+xQGRr@3$S0@SUu@@UuxUH%pD3Q-D`?37Vg*xmo%+KD)(U~kXU^b+=70V
zFr?b3DvlFq^B*d%lz=lr!ch6nQbvcwWCD}n2DT3`r?RDkq=7r}qFvEhdfApA7yR;
z8`OB;$TEKm@jkwG=8vlcKFdrcM8+9eP66QueaUSz6NhphD3O@E(iX7T@2kxV(dXd
zxG_$z;qqdhNp6*{J!OD7b;8M6otxRi0UUN|OngZ4NdYbeU0hneiVEwY2h5xZ4z{
zsXL4UkGxJ?Pc}MK(eQlV@oumS)Bxd9CXQA~f9M*#=`sXI#Mh|Ll%uxJqK}$
zc)VOU|M7H$nU~Bd(!9WlLk)V)zUY~vSc_rVFe1JAY7ThlG1IzE9z6ly;9WIe
zrW+EU|2KpxZ$0rq-Jn9T{vsi6Qh0j42I3_PLQ5$)qa$}3etj89nJkr^3#KD!xh9*
z4BvmZ0TM||zk);(8@JN=fad9d4hK=Hf1{J5DP=J9wW0IGZ_*p$xvH=WZu8h1r#
z-1Cjfqp9S1m|b%7S$7Du_^GfwyFyObC4EMw~6SwH1W45gXD*OLAz9x~@tJ8?
z^Jtn;2$K^0SUY?f@Cu@)V5#TJ_yB~Qw!nY4S8m?1+xGV0RRDs0Urby0RjLC1p-k_
zmPY~`3;+rV5Mc=}g|{MciR^I*0LA1n{v0`JrvmHmaVfVdlHi8rqLbHhetm3X@@Ir
zXeh6;re%LlP-QWig+n`!Ae0mFQrnvwOO!ZC8%_sj-2tVNM{p{WTNusbkjMdd`2^

[PATCH v4 5/5] pretty printing: extend %G? to include 'N' and 'U'

2013-03-26 Thread Sebastian Götte
Expand %G? in pretty format strings to 'N' in case of no GPG signature
and 'U' in case of a good but untrusted GPG signature in addition to
the previous 'G'ood and 'B'ad. This eases writing anyting parsing
git-log output.

Signed-off-by: Sebastian Götte ja...@physik-pool.tu-berlin.de
---
 Documentation/pretty-formats.txt | 3 ++-
 pretty.c | 2 ++
 2 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/Documentation/pretty-formats.txt b/Documentation/pretty-formats.txt
index 2939655..afac703 100644
--- a/Documentation/pretty-formats.txt
+++ b/Documentation/pretty-formats.txt
@@ -131,7 +131,8 @@ The placeholders are:
 - '%B': raw body (unwrapped subject and body)
 - '%N': commit notes
 - '%GG': raw verification message from GPG for a signed commit
-- '%G?': show either G for Good or B for Bad for a signed commit
+- '%G?': show G for a Good signature, B for a Bad signature, U for a 
good,
+  untrusted signature and N for no signature
 - '%GS': show the name of the signer for a signed commit
 - '%GK': show the key used to sign a signed commit
 - '%gD': reflog selector, e.g., `refs/stash@{1}`
diff --git a/pretty.c b/pretty.c
index 3aac5d8..fc74795 100644
--- a/pretty.c
+++ b/pretty.c
@@ -1135,6 +1135,8 @@ static size_t format_commit_one(struct strbuf *sb, const 
char *placeholder,
switch (c-signature.check_result) {
case 'G':
case 'B':
+   case 'U':
+   case 'N':
strbuf_addch(sb, c-signature.check_result);
}
break;
-- 
1.8.1.5

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 2/3] Refactor parts of in_delta_base_cache/cache_or_unpack_entry

2013-03-26 Thread thomas
Junio C Hamano gits...@pobox.com writes:

 Thomas Rast tr...@student.ethz.ch writes:

 The delta base cache lookup and test were shared.  Refactor them;
 we'll need both parts again.  Also, we'll use the clearing routine
 later.

 Signed-off-by: Thomas Rast tr...@student.ethz.ch
 ---

 Looks like a very straight-forward rewrite.

 The only little concern I may have is this cmp_* function tells us
 I found it! by returning true, which is counter-intuitive to the
 readers of the caller (not the callee).

 I think it makes sense to compare delta-base-cache entries only for
 equality, so eq-delta-base-cache-entry might be a better name for
 it, perhaps?

True.  I'll resend.

-- 
Thomas Rast
trast@{inf,student}.ethz.ch
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: git ate my home directory :-(

2013-03-26 Thread Richard Weinberger

Am 26.03.2013 09:02, schrieb Philip Oakley:

From: Junio C Hamano gits...@pobox.com
Sent: Monday, March 25, 2013 10:06 PM

Jonathan Nieder jrnie...@gmail.com writes:


Richard Weinberger wrote:


In my scripts I'm setting GIT_DIR to use git-fetch and git-reset without 
changing the
current working directory all the time.


Yeah, for historical reasons GIT_WORK_TREE defaults to $(pwd) when
GIT_DIR is explicitly set.


And it *WILL* be that way til the end of time.  Unless you are at
the top level of your working tree, you are supposed to tell where
the top level is with GIT_WORK_TREE when you use GIT_DIR.  Always.


Should this important warning be part of the git(1) documentation on the 
environment variables (and possibly other places) given the consequences of 
this case? It wasn't something
I'd appreciated from a simple reading.


BTW: Can't we change git-clean such that it will not delete any files if GIT_DIR is set 
and GIT_WORK_TREE is .?

Thanks,
//richard

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: propagating repo corruption across clone

2013-03-26 Thread Jeff Mitchell
On Mon, Mar 25, 2013 at 4:07 PM, Jeff King p...@peff.net wrote:
 On Mon, Mar 25, 2013 at 12:32:50PM -0400, Jeff Mitchell wrote:
 For commit corruptions, the --no-hardlinks, non --mirror case refused
 to create the new repository and exited with an error code of 128. The
 --no-hardlinks, --mirror case spewed errors to the console, yet
 *still* created the new clone *and* returned an error code of zero.

 I wasn't able to reproduce this; can you post a succint test case?

This actually seems hard to reproduce. I found this during testing
with an existing repository on-disk, but when I tried creating a new
repository with some commit objects, and modifying one of the commit
objects the same way I modified the an object in the previous
repository, I was unable to reproduce it.

I do have the original repository though, so I'll tar.gz it up so that
you can have exactly the same content as I do. It's about 40MB and you
can grab it here:
https://www.dropbox.com/s/e8dhedmpd1a1axs/tomahawk-corrupt.tar.gz (MD5
sum: cde8a43233db5d649932407891f8366b).

Once you extract that, you should be able to run a clone using paths
(not file://) with --no-hardlinks --mirror and replicate the behavior
I saw. FYI, I'm on Git 1.8.2.

Thanks,
Jeff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] Correct the docs about GIT_SSH.

2013-03-26 Thread Junio C Hamano
Dan Bornstein danf...@milk.com writes:

 In particular, it can get called with four arguments if you happen to
 be referring to a repo using the ssh:// scheme with a non-default port
 number.

 Signed-off-by: Dan Bornstein danf...@milk.com
 ---
  Documentation/git.txt |9 ++---
  1 files changed, 6 insertions(+), 3 deletions(-)

Looks good.  Thanks.

 diff --git a/Documentation/git.txt b/Documentation/git.txt
 index 7efaa59..4307d62 100644
 --- a/Documentation/git.txt
 +++ b/Documentation/git.txt
 @@ -774,9 +774,12 @@ other
   If this environment variable is set then 'git fetch'
   and 'git push' will use this command instead
   of 'ssh' when they need to connect to a remote system.
 - The '$GIT_SSH' command will be given exactly two arguments:
 - the 'username@host' (or just 'host') from the URL and the
 - shell command to execute on that remote system.
 + The '$GIT_SSH' command will be given exactly two or
 + four arguments: the 'username@host' (or just 'host')
 + from the URL and the shell command to execute on that
 + remote system, optionally preceded by '-p' (literally) and
 + the 'port' from the URL when it specifies something other
 + than the default SSH port.
  +
  To pass options to the program that you want to list in GIT_SSH
  you will need to wrap the program and options into a shell script,
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


verify-pack regression and/or doc error

2013-03-26 Thread Tim Walberg
The documentation for verify-pack states under the -s, --stat-only option,
that With --verbose, list of objects is also shown.. However, this seems
to not be true in either 1.8.2 or 1.7.11.4, the two versions I have readily
at hand. I'm guessing this might be a documentation error (and is probably
easiest to fix), but it's also possible that this was true at some point
and would thus be a (somewhat long-standing) regression. Not a particularly
high-profile issue, but it is an inconsistency...


  tw

-- 
twalb...@comcast.net
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v4 0/5] Verify GPG signatures when merging and extend %G? pretty string

2013-03-26 Thread Junio C Hamano
Sebastian Götte ja...@physik.tu-berlin.de writes:

 On 03/26/2013 02:46 AM, Junio C Hamano wrote: Sebastian Götte 
 ja...@physik.tu-berlin.de writes:
 Rebased it onto the current 'master'. The second patch fixes that the GPG
 status parser ignores the first line of GPG status output (that would be 
 caught
 by the new merge signature verification test case).
 
 Thanks.
 
 Does it still make sure that it won't be fooled by the expected
 string appearing in the middle of a line, not at the beginning?

 I thought that would not be a problem until I noticed it checks for GOODSIG
 before it checks for BADSIG. Here is a fix.

What does the order of checking have to do with it?  I am confused...

I was more worried about a case where you may end up misinterpreting

[GNUPG:] BADSIG B0B5E88696AFE6CB [GNUPG:] GOODSIG B0B5E88696AFE6CB y@xz

as showing goodsig when the signer's name was set to [GNUPG:]
GOODSIG B0B5E88696AFE6CB

The \n in the original was to make sure the expected message is at
the beginning of a line.

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: git ate my home directory :-(

2013-03-26 Thread Junio C Hamano
Jeff King p...@peff.net writes:

 On Tue, Mar 26, 2013 at 04:48:44PM +0700, Nguyen Thai Ngoc Duy wrote:

 Something like this, maybe?
 
 -- 8 --
 Subject: [PATCH] git.txt: document the implicit working tree setting with 
 GIT_DIR
 
 Signed-off-by: Nguyễn Thái Ngọc Duy pclo...@gmail.com
 ---
  Documentation/git.txt | 2 ++
  1 file changed, 2 insertions(+)
 
 diff --git a/Documentation/git.txt b/Documentation/git.txt
 index 7efaa59..ce55abf 100644
 --- a/Documentation/git.txt
 +++ b/Documentation/git.txt
 @@ -671,6 +671,8 @@ Git so take care if using Cogito etc.
  specifies a path to use instead of the default `.git`
  for the base of the repository.
  The '--git-dir' command-line option also sets this value.
 +If neither GIT_WORK_TREE nor '--work-tree' is set, the
 +current directory will become the working tree.

 I think this is a good thing to mention, but a few nits:

   1. core.worktree is another way of setting it

   2. This can also be overridden by --bare (at least in next).

   3. I think having core.bare set will also override this

Yeah.  And sorry I kept typing alias. I obviously meant bare; the
user visible symptom is closely linked to the use of alias, but the
most important aspect of the change in 2cd83d10bb6b (setup: suppress
implicit . work-tree for bare repos, 2013-03-08) is about being in
a bare repository where by definition working tree does not exist.

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Composing git repositories

2013-03-26 Thread Junio C Hamano
Ramkumar Ramachandra artag...@gmail.com writes:

 Apart from the implementation glitches, I don't like the design;
 submodules don't compose well:

 1. There's an inherent asymmetry between the superproject and each of
 the subprojects, because the superproject owns all the object stores.
 Why is it absolutely necessary to relocate the object stores?

Imagine doing git checkout oldbranch in the superproject when your
current branch has a submodule A bound to it, but the oldbranch that
is an old version of the superproject did not (yet) have it.

You obviously want the directory A disappear, but you would be
unhappy if you have to lose A/.git and everything in it while doing
so.  If you are lucky, you can re-clone, but you may have your own
changes.

So you have to stash it somewhere.  We could have made it to move
them to $HOME/.safeplace or somewhere totally unrelated to the
superproject.  So in that sense, the repositories are *not* owned by
the superproject in any way.  However, you are working within the
context of the superproject on the submodule after all, and
somewhere under $GIT_DIR/ of the superorject is not too wrong a
place to use as such a safe place.

 3. The current implementation only allows me to compose with commit
 objects, but what if I want to compose with refs?  ie. What if I want
 to track the tip of the 'master' of a submodule in a superproject?

Look for floating submodules in the list archive.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v4 0/5] Verify GPG signatures when merging and extend %G? pretty string

2013-03-26 Thread Sebastian Götte
On 03/26/2013 05:26 PM, Junio C Hamano wrote:
 Sebastian Götte ja...@physik.tu-berlin.de writes:
 
 On 03/26/2013 02:46 AM, Junio C Hamano wrote: Sebastian Götte 
 ja...@physik.tu-berlin.de writes:
 Rebased it onto the current 'master'. The second patch fixes that the GPG
 status parser ignores the first line of GPG status output (that would be 
 caught
 by the new merge signature verification test case).

 Thanks.

 Does it still make sure that it won't be fooled by the expected
 string appearing in the middle of a line, not at the beginning?

 I thought that would not be a problem until I noticed it checks for GOODSIG
 before it checks for BADSIG. Here is a fix.
 
 What does the order of checking have to do with it?  I am confused...
 
 I was more worried about a case where you may end up misinterpreting
 
 [GNUPG:] BADSIG B0B5E88696AFE6CB [GNUPG:] GOODSIG B0B5E88696AFE6CB y@xz
 
 as showing goodsig when the signer's name was set to [GNUPG:]
 GOODSIG B0B5E88696AFE6CB
 
 The \n in the original was to make sure the expected message is at
 the beginning of a line.
I was assuming only a malicious user would use a name containing [GNUPG:] 
SOMETHING_ALLCAPS. In this case, if the code would check for 
BADSIG/TRUST_NEVER/TRUST_UNKNOWN messages first, the signature would still be 
rejected. Of course, in that case a non-malicious user with a name containing 
[GNUPG:] BADSIG etc. would still run into problems.

This 4th version fixes that by checking whether the search string is at the 
beginning of the status buffer (index 0) or at the beginning of a line 
(prefixed by '\n').
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: propagating repo corruption across clone

2013-03-26 Thread Jeff King
On Tue, Mar 26, 2013 at 09:43:01AM -0400, Jeff Mitchell wrote:

 On Mon, Mar 25, 2013 at 4:07 PM, Jeff King p...@peff.net wrote:
  On Mon, Mar 25, 2013 at 12:32:50PM -0400, Jeff Mitchell wrote:
  For commit corruptions, the --no-hardlinks, non --mirror case refused
  to create the new repository and exited with an error code of 128. The
  --no-hardlinks, --mirror case spewed errors to the console, yet
  *still* created the new clone *and* returned an error code of zero.
 
  I wasn't able to reproduce this; can you post a succint test case?

 [...link to tar.gz...]
 Once you extract that, you should be able to run a clone using paths
 (not file://) with --no-hardlinks --mirror and replicate the behavior
 I saw. FYI, I'm on Git 1.8.2.

Thanks for providing an example.

The difference is the same --mirror implies --bare issue; the non-bare
case dies during the checkout (even before my patches, as the corruption
is not in a blob, but rather in the HEAD commit object itself). You can
replace --mirror with --bare and see the same behavior.

The troubling part is that we see errors in the bare case, but do not
die. Those errors all come from upload-pack, the sending side of a
clone/fetch. Even though we do not transfer the objects via the git
protocol, we still invoke upload-pack to get the ref list (and then copy
the objects themselves out-of-band).

What happens is that upload-pack sees the errors while trying to see if
the object is a tag that can be peeled (the server advertises both tags
and the objects they point to). It does not distinguish between errors
did not let me peel this object and this object is not a tag, and
therefore there is nothing to peel.

We could change that, but I'm not sure whether it is a good idea. I
think the intent is that upload-pack's ref advertisement would remain
resilient to corruption in the repository (e.g., even if that commit is
corrupt, you can still fetch the rest of the data). We should not worry
about advertising broken objects, because we will encounter the same
error when we actually do try to send the objects. Dying at the
advertisement phase would be premature, since we do not yet know what
the client will request.

The problem, of course, is that the --local optimization _skips_ the
part where we actually ask upload-pack for data, and instead blindly
copies it. So this is the same issue as usual, which is that the local
transport is not thorough enough to catch corruption. It seems like a
failing in this case, because upload-pack does notice the problem, but
that is only luck; if the corruption were in a non-tip object, it would
not notice it at all. So trying to die on errors in the ref
advertisement would just be a band-aid. Fundamentally the problem is
that the --local transport is not safe from propagating corruption, and
should not be used if that's a requirement.

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Why does 'submodule add' stage the relevant portions?

2013-03-26 Thread Jens Lehmann
Am 26.03.2013 08:57, schrieb Ramkumar Ramachandra:
 Jens Lehmann wrote:
 And leaving aside 'add', there are tons of submodules out there
 which were cloned with older Git who have their .git directory
 inside the work tree. So a new subcommand (or at least a helper
 script in contrib) to relocate the .git directory would really help
 here to migrate these legacy submodules without users having to
 worry about data loss.
 
 The question is: after using a non-relocated submodule for some
 time, will the user suddenly decide to make it a relocated submodule
 one day?

I think quite some users - and definitely myself - will do that as
soon as the full recursive checkout materialized, as that allows to
remove submodules without any directory remaining and even to replace
a submodule with a directory containing other files. And even with
current Git you already get a clean work tree when using git rm or
git submodule deinit (in current master) on a submodule iff it has
its .git directory stored in the .git directory of the superproject.
It is definitely not a Must Have right now, but I expect it to be
that in the near future.

 I meant a variant of add that would clone, but not stage.  I was
 arguing for a new subcommand so that I don't have to touch 'submodule
 add', not for a rename.

 Ah, now I get it, I was confused by your reference to 'git submodule
 add repository path'. I have to admit I still don't understand
 the use case you have for adding a submodule without staging it, but
 maybe it is just too late here.
 
 I usually reset after running 'git submodule add', because I rarely
 commit the added submodule immediately after adding it.

I'm not sure many submodule users do the same, but even then the
normal ways of unstaging the submodule and .gitmodules should be
sufficient here, no? I'd rather avoid adding a new command to git
submodule for that.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: git ate my home directory :-(

2013-03-26 Thread Richard Weinberger

Am 26.03.2013 15:56, schrieb Jeff King:

On Tue, Mar 26, 2013 at 02:07:44PM +0100, Richard Weinberger wrote:


Should this important warning be part of the git(1) documentation on
the environment variables (and possibly other places) given the
consequences of this case? It wasn't something
I'd appreciated from a simple reading.


BTW: Can't we change git-clean such that it will not delete any files
if GIT_DIR is set and GIT_WORK_TREE is .?s


We could, but that would break the existing behavior for other people
(and I assume you mean when GIT_WORK_TREE is not set at all, as I
would think GIT_WORK_TREE=. is explicit enough).


Is there a valid use case to call git-clean with GIT_DIR set but GIT_WORK_TREE
not (or to .)?
It will delete . ;)


I am sympathetic to your data loss, but I wonder how common a problem it
is in practice. Git-clean already does a dry-run by default; you have to
give it `-f`. This is the first such report we've had. This seems more
akin to oops, I accidentally ran `rm -rf` in the wrong directory. Yes,
it's catastrophic, but at some point you have to accept that deleting
files is what rm (and git-clean) does; you can only put so many safety
hoops in place.


The data loss was not too bad. I was able to restore anything within 2 hours.
But was kinda shocked that git-clean deletes files outside my git tree.
I'm aware of -d. But in my case it happened within a fully automated script.
I simply thought GIT_DIR=.. git-clean -f -d does the right thing...


I don't know. It's an uncommon enough case that we could deprecate
GIT_WORK_TREE is implicitly `.` entirely, but I think it would need a
deprecation period, and a way to get the same behavior (e.g., allowing
GIT_WORK_TREE=.).


Yeah, this sounds sane.

Thanks,
//richard

P.s: I've told this story to some friends and co-workers which use git like me 
very day.
All of them were shocked about the behavior of git-clean and GIT_DIR.

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: git ate my home directory :-(

2013-03-26 Thread Junio C Hamano
Jeff King p...@peff.net writes:

 Yes, setting GIT_DIR but not GIT_WORK_TREE has always been a valid way
 to work on a repository where you do not want the working tree polluted
 with your .git file. It's not a common setup, but people do use it.
 E.g., you might keep ~/mail as a git repo, but do not want the presence
 of ~/mail/.git to confuse your mail tools. You can keep ~/git/mail.git
 as just a repository, and do cd ~/mail  GIT_DIR=~/git/mail.git git
 foo (or git --git-dir=~git/mail.git foo).

 Later, we introduced GIT_WORK_TREE (and core.worktree), which provided
 another way of doing the same thing (instead of the cd, you could set
 GIT_WORK_TREE).

A small correction is necessary as the above invites confusion.

When you are in ~/mail/subdir, because GIT_DIR alone does not give
you to specify where the root-level of the working tree is, you had
to cd .. before running GIT_DIR=~/git/mail.git git   By
setting GIT_WORK_TREE to point at ~/mail once, you can freely chdir
around inside subdirectories of ~/mail without losing sight of where
the root-level is, and if your ~/git/mail.git is tied to a single
working tree (and that is true in this example, it is always ~/mail),
you can even set core.worktree in ~/git/mail.git/config.

These two mechanisms are *not* about allowing you to run git from
any random place, e.g. /tmp.

 For the most part, I'd expect setting core.worktree to be the
 simplest for such setups, as once it is set, you can just do cd
 ~/git/mail.git git foo, and everything should just work.

Yes.

 We could do so now, as long as we provide an escape hatch (and I think
 spelling that hatch as GIT_WORK_TREE=. is probably sane, but I am open
 to other suggestions).

If we were to do so, GIT_WORK_TREE=. would be the most sensible, but
I do not think it is worth breaking.  Why do these people set GIT_DIR
without setting GIT_WORK_TREE in the first place?

That is the source of the confusion.  Perhaps some random but
popular websites are spreading bad pieces of advice?

 The problem is not with clean, which just happens to be a destructive
 command, but rather with the notion of what the git tree is when you
 provide GIT_DIR.

Yes, git add . would happily add random cruft to your index, which
is equally bad.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: Rename conflicts in the index

2013-03-26 Thread Edward Thomson
Junio C Hamano [mailto:gits...@pobox.com] wrote:
 Edward Thomson ethom...@microsoft.com writes:
 
  I would propose that we store the data about the file in conflict as
  it occurred through the renames.  For example, in a rename 1-2
  conflict where A was renamed to both B and C, you would have a single
  conflict entry containing the data for A, B and C.  This would allow
  us to provide more detailed information to the user - and allow them
  to (say) choose a single name to proceed with.
 
  Is this something that has value to core git as well?  Alternately, is
  there something particularly stupid about this proposal?
 
 I do not offhand see anything particularly stupid; a new optional index 
 extension
 section CACHE_EXT_RENAME_CONFLICT might be a good addition.
 
 Is one side moves A to B while the other side moves it to C the only case, 
 or is
 it just an example?  Off the top of my head, one side moves A to x while the
 other side moves B to x/y would also be something we would want to know.  I
 am sure there are other cases that need to be considered.
 
 I do not think we can discuss the design at the concrete level until the 
 proposal
 spells out to cover all interesting cases in order for implementations to 
 agree on
 the common semantics.

Sorry about the delay here:  besides getting busy with some other things,
I wanted both a complete writeup and to have taken a pass at a test
implementation this in libgit2 to make sure seemed like a reasonably sensible
approach.

I would propose a new extension, 'CONF', to handle conflict data, differing
from the stage 0 entries in the index in that this extension tracks the
conflicting file across names if the underlying merge engine has support
for renames.

I made an attempt to keep the entry data similar to other entries in the
index.  I would propose that entries in the conflict are as follows:

Flags
  Four octets that describe the conflict.  Data includes:

  0x01  HAS_ANCESTOR
There is a file in the common ancestor branch that contributes
to this conflict.  Its data will follow.
  0x02  HAS_OURS
There is a file in our branch that contributes to this conflict.
Its data will follow.
  0x04  HAS_THEIRS
There is a file in their branch that contributes to this conflict.
Its data will follow.

  0x08  NAME_CONFLICT_OURS
This item has a path in our branch that overlaps a different
item in their branch.  (Eg, this conflict represents the our
side of a rename/add conflict.)
  0x10  NAME_CONFLICT_THEIRS
This item has a path in their branch that overlaps a different
item in our branch.  (Eg, this conflict represents the theirs
side of a rename/add conflict.)

  0x20  DF_CONFLICT_FILE
This is the file involved in a directory/file conflict.
  0x40  DF_CONFLICT_CHILD
This is a child of a directory involved in a directory/file conflict.

  Other bits are reserved.

Conflict Sides
  The data about one side of a conflict will contain:
  mode (ASCII string representation of octal, null-terminated)
  path (null terminated)
  sha1 (raw bytes)

The conflict sides will be written in this order:
  Ancestor (if HAS_ANCESTOR is set)
  Ours (if HAS_OURS is set)
  Theirs (if HAS_THEIRS is set)

I would propose that this not simply track rename conflicts, but all
conflicts.  Having a single canonical location is preferable - if the index
contains a CONF section (and the client supports it), it would use that.
Otherwise, the client would look at stage 0 entries.

I would propose that another extension, 'RSVD', track these conflicts once
they are resolved.  The format would be the same - when a conflict is
resolved from the CONF the entry will be placed as-is in the RSVD.

Examples are not an exhaustive list, but should help elucidate the name
and d/f conflicts:

Normal edit / edit conflict, where A is edited in ours and theirs:

  Conflict one:
Flags = HAS_ANCESTOR|HAS_OURS|HAS_THEIRS
Entry 1 = A [Ancestor]
Entry 2 = B [Ancestor]
Entry 3 = C [Ancestor]

Rename / add conflict, where A is renamed to B in ours and B is added in
theirs:

  Conflict one:
Flags = HAS_ANCESTOR|HAS_OURS|NAME_CONFLICT_OURS
Entry 1 = A [Ancestor]
Entry 2 = B [Ours]
Entry 3 = A [Theirs]
  Conflict two:
Flags = HAS_THEIRS|NAME_CONFLICT_THEIRS
Entry 1 = File B [Theirs]

D/F conflict, where some file A is deleted in theirs, and a directory
A is created with file child:

  Conflict one:
Flags = HAS_ANCESTOR|HAS_OURS|HAS_THEIRS|DF_CONFLICT_FILE
Entry 1 = A [Ancestor]
Entry 2 = A [Ours]
  Conflict two:
Flags = HAS_THEIRS|DF_CONFLICT_CHILD
Entry 1 = A/child [Theirs]

Thanks for your input on this.

-ed
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 0/4] attribute regression fix for maint-1.8.1 and upward

2013-03-26 Thread Junio C Hamano
So here is an attempt to fix the unintended regression, on top of
9db9eecfe5c2 (attr: avoid calling find_basename() twice per path,
2013-01-16).  It consists of four patches.

The first patch is not essential to the fix, but I think it
clarifies what is going on in this codepath.

The second patch addresses the issue Jeff noticed; it appears as if
match_basename() takes counted strings, but one of the strings was
not a counted string at all.  Its length was given to the function
because the caller already had one, so that we do not have to do
strlen() ourselves.  And the other one was meant to be a counted
string, but the callee was not using it as such.  The patch makes
them both counted strings and treat them as such.

The third patch is the main fix.  As I said in the log message, I
didn't look at it very carefully, so extra sets of eyeballs are very
much appreciated.

The last one is a test stolen from Jeff to seal the series.  It
needs sign-off from Jeff.

Jeff King (1):
  make sure a pattern without trailing slash matches a directory

Junio C Hamano (3):
  attr.c::path_matches(): the basename is part of the pathname
  dir.c::match_basename(): pay attention to the length of string
parameters
  attr.c::path_matches(): special case paths that end with a slash

 attr.c  | 23 ---
 dir.c   | 31 +++
 t/t5002-archive-attr-pattern.sh |  6 ++
 3 files changed, 45 insertions(+), 15 deletions(-)

-- 
1.8.2-350-g3df87a1

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 2/4] dir.c::match_basename(): pay attention to the length of string parameters

2013-03-26 Thread Junio C Hamano
The function takes two counted strings (basename, basenamelen and
pattern, patternlen) as parameters, together with prefix (the
length of the prefix in pattern that is to be matched literally
without globbing against the basename) and EXC_* flags that tells it
how to match the pattern against the basename.

However, it did not pay attention to the length of these counted
strings.  Update them to do the following:

 * When the entire pattern is to be matched literally, the pattern
   matches the basename only when the lengths of them are the same,
   and they match up to that length.

 * When the pattern is * followed by a string to be matched
   literally, make sure that the basenamelen is equal or longer than
   the literal part of the pattern, and the tail of the basename
   string matches that literal part.

 * Otherwise, make sure we use only the counted part of the strings
   when calling fnmatch_icase().  Because these counted strings are
   full strings most of the time, avoid unnecessary allocation.

Signed-off-by: Junio C Hamano gits...@pobox.com
---
 dir.c | 31 +++
 1 file changed, 27 insertions(+), 4 deletions(-)

diff --git a/dir.c b/dir.c
index 5a83aa7..aa16303 100644
--- a/dir.c
+++ b/dir.c
@@ -537,15 +537,38 @@ int match_basename(const char *basename, int basenamelen,
   int flags)
 {
if (prefix == patternlen) {
-   if (!strcmp_icase(pattern, basename))
+   if (patternlen == basenamelen 
+   !strncmp_icase(pattern, basename, basenamelen))
return 1;
} else if (flags  EXC_FLAG_ENDSWITH) {
+   /* *literal matching against fooliteral */
if (patternlen - 1 = basenamelen 
-   !strcmp_icase(pattern + 1,
- basename + basenamelen - patternlen + 1))
+   !strncmp_icase(pattern + 1,
+  basename + basenamelen - (patternlen - 1),
+  patternlen - 1))
return 1;
} else {
-   if (fnmatch_icase(pattern, basename, 0) == 0)
+   int match_status;
+   struct strbuf pat = STRBUF_INIT;
+   struct strbuf base = STRBUF_INIT;
+   const char *use_pat = pattern;
+   const char *use_base = basename;
+
+   if (pattern[patternlen]) {
+   strbuf_add(pat, pattern, patternlen);
+   use_pat = pat.buf;
+   }
+   if (basename[basenamelen]) {
+   strbuf_add(base, basename, basenamelen);
+   use_base = base.buf;
+   }
+   match_status = fnmatch_icase(use_pat, use_base, 0);
+   if (use_pat)
+   strbuf_release(pat);
+   if (use_base)
+   strbuf_release(base);
+
+   if (match_status == 0)
return 1;
}
return 0;
-- 
1.8.2-350-g3df87a1

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/4] attr.c::path_matches(): the basename is part of the pathname

2013-03-26 Thread Jeff King
On Tue, Mar 26, 2013 at 11:39:28AM -0700, Junio C Hamano wrote:

 The function takes two strings (pathname and basename) as if they
 are independent strings, but in reality, the latter is always
 pointing into a substring in the former.
 
 Clarify this relationship by expressing the latter as an offset into
 the former.
 
 Signed-off-by: Junio C Hamano gits...@pobox.com

This is a huge improvement in maintainability. My initial fix attempt
was to just xstrdup() the strings (knowing that the performance would be
horrible, but I was still investigating correctness issues at that
point). And of course I ran into this same issue as I tried to make a
copy of pathname.

So even without the rest of the fix, this is definitely a good idea. :)

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/4] dir.c::match_basename(): pay attention to the length of string parameters

2013-03-26 Thread Jeff King
On Tue, Mar 26, 2013 at 11:39:29AM -0700, Junio C Hamano wrote:

 The function takes two counted strings (basename, basenamelen and
 pattern, patternlen) as parameters, together with prefix (the
 length of the prefix in pattern that is to be matched literally
 without globbing against the basename) and EXC_* flags that tells it
 how to match the pattern against the basename.
 
 However, it did not pay attention to the length of these counted
 strings.  Update them to do the following:
 
  * When the entire pattern is to be matched literally, the pattern
matches the basename only when the lengths of them are the same,
and they match up to that length.
 
  * When the pattern is * followed by a string to be matched
literally, make sure that the basenamelen is equal or longer than
the literal part of the pattern, and the tail of the basename
string matches that literal part.
 
  * Otherwise, make sure we use only the counted part of the strings
when calling fnmatch_icase().  Because these counted strings are
full strings most of the time, avoid unnecessary allocation.

I think this is OK, with the intention that we would eventually drop the
allocations from your third bullet point in favor of using a
byte-counted version of fnmatch (i.e., nwildmatch). But until then we're
going to see a performance drop.

The pattern is usually going to be NUL-terminated at the length counter,
but every time we feed a directory, it's going to run into this
allocation. And we do it once for _every_ directory against _every_
wildcard gitignore pattern. So I think it is probably going to be
measurable. I guess we can try measuring it on something like WebKit,
which has plenty of both directories and gitattributes.

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 3/4] attr.c::path_matches(): special case paths that end with a slash

2013-03-26 Thread Jeff King
On Tue, Mar 26, 2013 at 11:39:30AM -0700, Junio C Hamano wrote:

 A similar adjustment for match_pathname() might be needed, but I
 didn't look into it.

I notice that match_pathname takes _two_ lengths for the pattern: the
nowildcardlen (called prefix, and the full patternlen). But the first
thing it does is:

  if (*pattern == '/') {
  pattern++;
  prefix--;
  }

which seems obviously wrong, as patternlen should be dropped, too. But
we do not seem to look at patternlen at all! I think we can drop the
parameter totally.

We do seem to use strncmp_icase through the rest of the function,
though, which should be OK. The one exception is that we call fnmatch at
the end. Should the allocation hack from the previous patch make its way
into an fnmatch_icase_bytes() function, so we can use it here, too?
And then when we have a more efficient solution, we can just plug it in
there.

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 4/4] make sure a pattern without trailing slash matches a directory

2013-03-26 Thread Jeff King
On Tue, Mar 26, 2013 at 11:39:31AM -0700, Junio C Hamano wrote:

 From: Jeff King p...@peff.net
 
 Prior to v1.8.1.1, with:
 
   git init
   echo content foo 
   mkdir subdir 
   echo content subdir/bar 
   echo subdir export-ignore .gitattributes
   git add . 
   git commit -m one 
   git archive HEAD | tar tf -
 
 the resulting archive would contain only foo and .gitattributes,
 not subdir.  This was broken with a recent change that intended to
 allow subdir/ export-ignore to also exclude the directory, but
 instead ended up _requiring_ the trailing slash by mistake.

Yeah, I think that is fine. I'd just squash this test and description
into the previous patch, though (I do not care about dropping my commit
count by 1).

And of course,

Signed-off-by: Jeff King p...@peff.net

 A pattern subdir should match any path subdir, whether it is a
 directory or a non-diretory.  A pattern subdir/ insists that a
 path subdir must be a directory for it to match.

s/diretory/directory/

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 0/2] Fix -Wuninitialized warnings on older gcc

2013-03-26 Thread Ramsay Jones
Hi Junio,

I had prepared 3 patches, but I noticed this afternoon that
the warning in transport.c has already been fixed in maint.
These were built on the tip of master as of saturday evening
(master @ 7b592fad).

Just FYI, as of saturday, Jeff's patches in maint/master had
removed 40 warnings from the MSVC build; 34 of the warnings
were due to commit 25043d8a all on it's lonesome! [1]. The
additional patches in pu, remove an additional 14 warnings.
This leaves only the following:

$ grep warning msvc-pout | grep uninit
c:\msysgit\msysgit\home\ramsay\git\fast-import.c(2916) : warning C4700:\
uninitialized local variable 'oe' used
c:\msysgit\msysgit\home\ramsay\git\builtin\rev-list.c(341) : warning C4700:\
uninitialized local variable 'reaches' used
c:\msysgit\msysgit\home\ramsay\git\builtin\rev-list.c(341) : warning C4700:\
uninitialized local variable 'all' used
c:\msysgit\msysgit\home\ramsay\git\merge-recursive.c(1886) : warning C4700:\
uninitialized local variable 'mrtree' used

ATB,
Ramsay Jones

[1] In case you are puzzled by this, the warning is coming from
the linker, since the msvc build is using link-time code-gen/optimisation.


Ramsay Allan Jones (2):
  fast-import: Fix an gcc -Wuninitialized warning
  cat-file: Fix an gcc -Wuninitialized warning

 builtin/cat-file.c | 2 +-
 fast-import.c  | 1 +
 2 files changed, 2 insertions(+), 1 deletion(-)

-- 
1.8.2

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 1/2] fast-import: Fix an gcc -Wuninitialized warning

2013-03-26 Thread Ramsay Jones

Commit cbfd5e1c (drop some obsolete x = x compiler warning hacks,
21-03-2013) removed a gcc hack that suppressed an might be used
uninitialized warning issued by older versions of gcc.

However, commit 3aa99df8 ('fast-import: clarify inline logic in
file_change_m', 21-03-2013) addresses an (almost) identical issue
(with very similar code), but includes additional code in it's
resolution. The solution used by this commit, unlike that used by
commit cbfd5e1c, also suppresses the -Wuninitialized warning on
older versions of gcc.

In order to suppress the warning (against the 'oe' symbol) in the
note_change_n() function, we adopt the same solution used by commit
3aa99df8.

Signed-off-by: Ramsay Jones ram...@ramsay1.demon.co.uk
---
 fast-import.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/fast-import.c b/fast-import.c
index a0c2c2f..5f539d7 100644
--- a/fast-import.c
+++ b/fast-import.c
@@ -2465,6 +2465,7 @@ static void note_change_n(struct branch *b, unsigned char 
*old_fanout)
hashcpy(sha1, oe-idx.sha1);
} else if (!prefixcmp(p, inline )) {
inline_data = 1;
+   oe = NULL; /* not used with inline_data, but makes gcc happy */
p += strlen(inline);  /* advance to space */
} else {
if (get_sha1_hex(p, sha1))
-- 
1.8.2

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 2/2] cat-file: Fix an gcc -Wuninitialized warning

2013-03-26 Thread Ramsay Jones

After commit cbfd5e1c (drop some obsolete x = x compiler warning
hacks, 21-03-2013) removed a gcc specific hack, older versions of
gcc now issue an 'contents' might be used uninitialized warning.
In order to suppress the warning, we simply initialize the variable
to NULL in it's declaration.

Signed-off-by: Ramsay Jones ram...@ramsay1.demon.co.uk
---

An alternative solution may look like this (note: *untested*):

diff --git a/builtin/cat-file.c b/builtin/cat-file.c
index ad29000..e50b20f 100644
--- a/builtin/cat-file.c
+++ b/builtin/cat-file.c
@@ -193,7 +193,6 @@ static int batch_one_object(const char *obj_name, int 
print_contents)
unsigned char sha1[20];
enum object_type type = 0;
unsigned long size;
-   void *contents;
 
if (!obj_name)
   return 1;
@@ -204,16 +203,11 @@ static int batch_one_object(const char *obj_name, int 
print_contents)
return 0;
}
 
-   if (print_contents == BATCH)
-   contents = read_sha1_file(sha1, type, size);
-   else
-   type = sha1_object_info(sha1, size);
+   type = sha1_object_info(sha1, size);
 
if (type = 0) {
printf(%s missing\n, obj_name);
fflush(stdout);
-   if (print_contents == BATCH)
-   free(contents);
return 0;
}
 
@@ -221,6 +215,7 @@ static int batch_one_object(const char *obj_name, int 
print_contents)
fflush(stdout);
 
if (print_contents == BATCH) {
+   void *contents = read_sha1_file(sha1, type, size);
write_or_die(1, contents, size);
printf(\n);
fflush(stdout);
-- 

However, this would add an additional call to sha1_object_info() to
the --batch code path, with potential performance consequences
(again untested). Also, if you are paranoid, I guess you should
check that the (type,size) returned by sha1_object_info() was the
same as that returned by read_sha1_file(). ;-)

ATB,
Ramsay Jones

 builtin/cat-file.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/builtin/cat-file.c b/builtin/cat-file.c
index ad29000..40f87b4 100644
--- a/builtin/cat-file.c
+++ b/builtin/cat-file.c
@@ -193,7 +193,7 @@ static int batch_one_object(const char *obj_name, int 
print_contents)
unsigned char sha1[20];
enum object_type type = 0;
unsigned long size;
-   void *contents;
+   void *contents = NULL;
 
if (!obj_name)
   return 1;
-- 
1.8.2

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Rename conflicts in the index

2013-03-26 Thread Junio C Hamano
Edward Thomson ethom...@microsoft.com writes:

 I would propose a new extension, 'CONF', to handle conflict data, differing
 from the stage 0 entries in the index in that this extension tracks the
 conflicting file across names if the underlying merge engine has support
 for renames.

 I made an attempt to keep the entry data similar to other entries in the
 index.  I would propose that entries in the conflict are as follows:

 Flags
   Four octets that describe the conflict.  Data includes:

   0x01  HAS_ANCESTOR
 There is a file in the common ancestor branch that contributes
 to this conflict.  Its data will follow.
   0x02  HAS_OURS
 There is a file in our branch that contributes to this conflict.
 Its data will follow.
   0x04  HAS_THEIRS
 There is a file in their branch that contributes to this conflict.
 Its data will follow.

   0x08  NAME_CONFLICT_OURS
 This item has a path in our branch that overlaps a different
 item in their branch.  (Eg, this conflict represents the our
 side of a rename/add conflict.)
   0x10  NAME_CONFLICT_THEIRS
 This item has a path in their branch that overlaps a different
 item in our branch.  (Eg, this conflict represents the theirs
 side of a rename/add conflict.)

   0x20  DF_CONFLICT_FILE
 This is the file involved in a directory/file conflict.
   0x40  DF_CONFLICT_CHILD
 This is a child of a directory involved in a directory/file conflict.

   Other bits are reserved.

 Conflict Sides
   The data about one side of a conflict will contain:
   mode (ASCII string representation of octal, null-terminated)
   path (null terminated)
   sha1 (raw bytes)

 The conflict sides will be written in this order:
   Ancestor (if HAS_ANCESTOR is set)
   Ours (if HAS_OURS is set)
   Theirs (if HAS_THEIRS is set)

Puzzled.  Most of the above, except NAME_CONFLICT_{OURS,THEIRS}
bits, look totally pointless duplication.

When you are working with Git, you have to be prepared to read from
the datafile like the index that other people (and your previous
version) created, and you also have to make sure you do not make
what you write out unusable by other people without a good reason.

So your tool needs code to see higher stage entries in the main
index to find mode,sha1 for the conflicted paths even without the
index extension anyway, and if your tool does also perform merges,
you would need to strive for writing the main index with conflicted
entries and implementations that do not yet understand your
extension can keep operating.  For some types of extensions, the
latter may be hard (and that is why I stopped at you would need to
strive for, and not you must), but for the one under discussion,
I do not think it is the case (by the way CONF sounds as if it is
some sort of configuration data).

If you are starting a brand new system from scratch, keeping only
the resolved entries in the main index and having a separate section
for conflicts might be also a valid design choice, but you do not
live in that world if you are discussing the design on this mailing
list.

 I would propose that this not simply track rename conflicts, but all
 conflicts.

That is a no starter.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/2] fast-import: Fix an gcc -Wuninitialized warning

2013-03-26 Thread Jeff King
On Tue, Mar 26, 2013 at 07:09:44PM +, Ramsay Jones wrote:

 Commit cbfd5e1c (drop some obsolete x = x compiler warning hacks,
 21-03-2013) removed a gcc hack that suppressed an might be used
 uninitialized warning issued by older versions of gcc.
 
 However, commit 3aa99df8 ('fast-import: clarify inline logic in
 file_change_m', 21-03-2013) addresses an (almost) identical issue
 (with very similar code), but includes additional code in it's
 resolution. The solution used by this commit, unlike that used by
 commit cbfd5e1c, also suppresses the -Wuninitialized warning on
 older versions of gcc.
 
 In order to suppress the warning (against the 'oe' symbol) in the
 note_change_n() function, we adopt the same solution used by commit
 3aa99df8.

Yeah, they are essentially the same piece of code, so I don't mind this
change.  It is odd to me that gcc gets it right in one case but not the
other, but I think we are deep into the vagaries of the compiler's code
flow analysis here, and we cannot make too many assumptions.

Were you actually triggering this warning, and if so, on what version of
gcc? Or did the asymmetry just offend your sensibilities?

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2] difftool: don't overwrite modified files

2013-03-26 Thread John Keeping
On Tue, Mar 26, 2013 at 10:53:48AM +0100, Johannes Sixt wrote:
 Am 3/26/2013 10:31, schrieb John Keeping:
  On Tue, Mar 26, 2013 at 09:38:42AM +0100, Johannes Sixt wrote:
  The last test does result in /tmp filling up with temporary directories
  though, it would be good if the test could clean up after itself.  The
  best I can come up with is adding something like this immediately after
  running difftool but I'm not entirely happy with the .. in the
  argument to rm:
  
  test_when_finished rm -rf $(cat tmpdir)/..
 
 Wrap the test in
 
   (
   TMPDIR=$TRASH_DIRECTORY 
   export TMPDIR 
   ...
   )
 
 It works for me.

Nice.  I've reviewed File::Spec and it looks like that TMPDIR takes
priority on every operating system except VMS, and I don't think we care
about that.

Unless Junio says otherwise, I'll hold off sending this until difftool
calms down a bit to avoid too many conflicted merges.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[DONOTAPPLY PATCH 3/3] setup: treat GIT_DIR without GIT_WORK_TREE as a bare repo

2013-03-26 Thread Jeff King
Follow-through on the deprecation warning added by the last
commit.

We can drop all of the IMPLICIT_WORK_TREE code now, since
we default to that case.

Signed-off-by: Jeff King p...@peff.net
---
This would obviously come much later than patch 2, in Git 2.0 or
whatever.

But in case anyone did not read the discussion leading up to this
series, this breaks many tests. It's not meant for application, but
merely to look at what kinds of breakage we could see if we followed
this path.

 cache.h   | 12 
 environment.c |  1 -
 git.c |  1 -
 setup.c   | 26 +-
 t/t1510-repo-setup.sh | 24 ++--
 5 files changed, 11 insertions(+), 53 deletions(-)

diff --git a/cache.h b/cache.h
index c1fe67f..3c6b677 100644
--- a/cache.h
+++ b/cache.h
@@ -366,18 +366,6 @@ static inline enum object_type object_type(unsigned int 
mode)
 #define GIT_NOTES_REWRITE_MODE_ENVIRONMENT GIT_NOTES_REWRITE_MODE
 
 /*
- * This environment variable is expected to contain a boolean indicating
- * whether we should or should not treat:
- *
- *   GIT_DIR=foo.git git ...
- *
- * as if GIT_WORK_TREE=. was given. It's not expected that users will make use
- * of this, but we use it internally to communicate to sub-processes that we
- * are in a bare repo. If not set, defaults to true.
- */
-#define GIT_IMPLICIT_WORK_TREE_ENVIRONMENT GIT_IMPLICIT_WORK_TREE
-
-/*
  * Repository-local GIT_* environment variables; these will be cleared
  * when git spawns a sub-process that runs inside another repository.
  * The array is NULL-terminated, which makes it easy to pass in the env
diff --git a/environment.c b/environment.c
index be2e509..255e277 100644
--- a/environment.c
+++ b/environment.c
@@ -85,7 +85,6 @@ const char * const local_repo_env[] = {
DB_ENVIRONMENT,
GIT_DIR_ENVIRONMENT,
GIT_WORK_TREE_ENVIRONMENT,
-   GIT_IMPLICIT_WORK_TREE_ENVIRONMENT,
GRAFT_ENVIRONMENT,
INDEX_ENVIRONMENT,
NO_REPLACE_OBJECTS_ENVIRONMENT,
diff --git a/git.c b/git.c
index 0ffea57..d33f9b3 100644
--- a/git.c
+++ b/git.c
@@ -125,7 +125,6 @@ static int handle_options(const char ***argv, int *argc, 
int *envchanged)
static char git_dir[PATH_MAX+1];
is_bare_repository_cfg = 1;
setenv(GIT_DIR_ENVIRONMENT, getcwd(git_dir, 
sizeof(git_dir)), 0);
-   setenv(GIT_IMPLICIT_WORK_TREE_ENVIRONMENT, 0, 1);
if (envchanged)
*envchanged = 1;
} else if (!strcmp(cmd, -c)) {
diff --git a/setup.c b/setup.c
index afc245f..319dbb5 100644
--- a/setup.c
+++ b/setup.c
@@ -437,23 +437,6 @@ const char *read_gitfile(const char *path)
return path;
 }
 
-static const char warn_implicit_work_tree_msg[] =
-N_(You have set GIT_DIR (or used --git-dir) without specifying\n
-   a working tree. In Git 2.0, the behavior will change from using current\n
-   working directory as the working tree to having no working tree at all.\n
-   If you wish to continue the current behavior, please set GIT_WORK_TREE\n
-   or core.worktree explicitly. See `git help git` for more details.);
-
-static void warn_implicit_work_tree(void)
-{
-   static int warn_once;
-
-   if (warn_once++)
-   return;
-
-   warning(%s, _(warn_implicit_work_tree_msg));
-}
-
 static const char *setup_explicit_git_dir(const char *gitdirenv,
  char *cwd, int len,
  int *nongit_ok)
@@ -514,16 +497,11 @@ static const char *setup_explicit_git_dir(const char 
*gitdirenv,
set_git_work_tree(core_worktree);
}
}
-   else if (!git_env_bool(GIT_IMPLICIT_WORK_TREE_ENVIRONMENT, 1)) {
-   /* #16d */
+   else { /* #2, #10, #16d */
set_git_dir(gitdirenv);
free(gitfile);
return NULL;
}
-   else { /* #2, #10 */
-   warn_implicit_work_tree();
-   set_git_work_tree(.);
-   }
 
/* set_git_work_tree() must have been called by now */
worktree = get_git_work_tree();
@@ -600,8 +578,6 @@ static const char *setup_bare_git_dir(char *cwd, int 
offset, int len, int *nongi
if (check_repository_format_gently(., nongit_ok))
return NULL;
 
-   setenv(GIT_IMPLICIT_WORK_TREE_ENVIRONMENT, 0, 1);
-
/* --work-tree is set without --git-dir; use discovered one */
if (getenv(GIT_WORK_TREE_ENVIRONMENT) || git_work_tree_cfg) {
const char *gitdir;
diff --git a/t/t1510-repo-setup.sh b/t/t1510-repo-setup.sh
index 0910de1..7db4b3f 100755
--- a/t/t1510-repo-setup.sh
+++ b/t/t1510-repo-setup.sh
@@ -28,7 +28,7 @@ A few rules for repo setup:
 7. Effective core.worktree conflicts with core.bare
 
 8. If GIT_DIR is set but neither worktree nor bare setting is 

Re: [DONOTAPPLY PATCH 1/3] environment: set GIT_WORK_TREE when we figure out work tree

2013-03-26 Thread Jonathan Nieder
Jeff King wrote:

 --- a/environment.c
 +++ b/environment.c
 @@ -194,6 +194,7 @@ void set_git_work_tree(const char *new_work_tree)
   }
   git_work_tree_initialized = 1;
   work_tree = xstrdup(real_path(new_work_tree));
 + setenv(GIT_WORK_TREE_ENVIRONMENT, work_tree, 1);
  }

There's no rush, but I think this is a good change.  It makes the rest
of the codebase more resilient to running commands from a subdir of
the top level of the worktree.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [DONOTAPPLY PATCH 2/3] setup: warn about implicit worktree with $GIT_DIR

2013-03-26 Thread Jonathan Nieder
Jeff King wrote:
 On Tue, Mar 26, 2013 at 01:21:42PM -0700, Jonathan Nieder wrote:

 If we want this warning, would something like the following do?
 
  warning: You have set GIT_DIR without setting GIT_WORK_TREE
  hint: In this case, GIT_WORK_TREE defaults to '.'
  hint: To suppress this message, set GIT_WORK_TREE='.'

 That can help by teaching people how GIT_DIR behaves in general.

Yes, I think it would have helped in this case.  If I understand
correctly then for a while Richard was habitually setting GIT_DIR to
mean act on this repository and thought the worktree was
automatically being set to the containing directory.

I think patch 3 is a bad direction to go because there will always be
old scripts that follow what used to be the recommended way to use
GIT_DIR.  In the long term a warning like this that doesn't break them
(or a fatal error that at least doesn't confuse them) might be a good
way to go.

Thanks for your thoughtful work, as always.

Jonathan
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/4] dir.c::match_basename(): pay attention to the length of string parameters

2013-03-26 Thread Jeff King
On Tue, Mar 26, 2013 at 02:55:59PM -0400, Jeff King wrote:

   * Otherwise, make sure we use only the counted part of the strings
 when calling fnmatch_icase().  Because these counted strings are
 full strings most of the time, avoid unnecessary allocation.
 
 I think this is OK, with the intention that we would eventually drop the
 allocations from your third bullet point in favor of using a
 byte-counted version of fnmatch (i.e., nwildmatch). But until then we're
 going to see a performance drop.
 
 The pattern is usually going to be NUL-terminated at the length counter,
 but every time we feed a directory, it's going to run into this
 allocation. And we do it once for _every_ directory against _every_
 wildcard gitignore pattern. So I think it is probably going to be
 measurable. I guess we can try measuring it on something like WebKit,
 which has plenty of both directories and gitattributes.

I timed this doing git archive HEAD on webkit.git before and after. It
actually ended up not mattering much (I think because it is only the
directories which are affected, not each individually path, so it's a
much smaller number than you'd think). The best-of-five timing was
slightly slower, but was within the noise.

So I do still think it would make sense to go to a byte-limited version
of fnmatch eventually, just for code cleanliness and predictability of
performance, but this is really not a bad solution in the interim.

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: Rename conflicts in the index

2013-03-26 Thread Edward Thomson

Junio C Hamano [mailto:gits...@pobox.com] wrote:
 Edward Thomson ethom...@microsoft.com writes:
  I would propose a new extension, 'CONF', to handle conflict data,
  differing from the stage 0 entries in the index in that this
  extension tracks the conflicting file across names if the underlying
  merge engine has support for renames.
 
  I made an attempt to keep the entry data similar to other entries in
  the index.  I would propose that entries in the conflict are as follows:
 
  Flags
Four octets that describe the conflict.  Data includes:
 
0x01  HAS_ANCESTOR
  There is a file in the common ancestor branch that contributes
  to this conflict.  Its data will follow.
0x02  HAS_OURS
  There is a file in our branch that contributes to this conflict.
  Its data will follow.
0x04  HAS_THEIRS
  There is a file in their branch that contributes to this conflict.
  Its data will follow.
 
0x08  NAME_CONFLICT_OURS
  This item has a path in our branch that overlaps a different
  item in their branch.  (Eg, this conflict represents the our
  side of a rename/add conflict.)
0x10  NAME_CONFLICT_THEIRS
  This item has a path in their branch that overlaps a different
  item in our branch.  (Eg, this conflict represents the theirs
  side of a rename/add conflict.)
 
0x20  DF_CONFLICT_FILE
  This is the file involved in a directory/file conflict.
0x40  DF_CONFLICT_CHILD
  This is a child of a directory involved in a directory/file conflict.
 
Other bits are reserved.
 
  Conflict Sides
The data about one side of a conflict will contain:
mode (ASCII string representation of octal, null-terminated)
path (null terminated)
sha1 (raw bytes)
 
  The conflict sides will be written in this order:
Ancestor (if HAS_ANCESTOR is set)
Ours (if HAS_OURS is set)
Theirs (if HAS_THEIRS is set)
 
 Puzzled.  Most of the above, except NAME_CONFLICT_{OURS,THEIRS} bits, look
 totally pointless duplication.

Obviously HAS_ANCESTOR / HAS_OURS / HAS_THEIRS is to indicate to a reader
whether there is data to be read or not.  Similar to how a mode of 0
in the REUC indicates that the rest of the record should not be read.)

 When you are working with Git, you have to be prepared to read from the
 datafile like the index that other people (and your previous
 version) created, and you also have to make sure you do not make what you
 write out unusable by other people without a good reason.

I'm acutely aware that you need to be able to read an index that other
people created - that's the problem at hand.  git does not produce an
index that allows anyone (including itself) to reason about rename
conflicts.  It doesn't even bother to write high-stage conflict entries
in many instances, so you can have an instance where git tells you that
a conflict occurred but one of those files is staged anyway, the other
is just dirty in the workdir and you can commit immediately thereafter.

While obviously it's possible to handle this situation (is file A
in conflict?  Look in the rename conflict extension.  Not there?  Okay,
look in the index.)  That's not exactly elegant.  My goal here was
to have a single source for conflicts.

 (by the way CONF sounds as if it is some sort of configuration data).

There's only four letters, and not everything's as easy as TREE.  REUC,
for example, sounds like a donkey, though I suppose it depends on the
language in question.

-ed
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/4] dir.c::match_basename(): pay attention to the length of string parameters

2013-03-26 Thread Junio C Hamano
Jeff King p...@peff.net writes:

 I timed this doing git archive HEAD on webkit.git before and after. It
 actually ended up not mattering much (I think because it is only the
 directories which are affected, not each individually path, so it's a
 much smaller number than you'd think). The best-of-five timing was
 slightly slower, but was within the noise.

Interesting.  Because archive has to incur a large I/O cost
anyway, I expected extra allocation for correctness for only the
directory paths would be dwarfed in the noise.

I actually care more about cases other than archive, though.  Do
we even feed directory paths to the machinery?

 So I do still think it would make sense to go to a byte-limited version
 of fnmatch eventually, just for code cleanliness and predictability of
 performance, but this is really not a bad solution in the interim.

Yes, what we do with wildmatch is a separate issue for 'master' and
upwards.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2] difftool: don't overwrite modified files

2013-03-26 Thread Matt McClure
On Mon, Mar 25, 2013 at 5:44 PM, John Keeping j...@keeping.me.uk wrote:
 Instead of copying unconditionally when the files differ, create and
 index from the working tree files and only copy the temporary file back
 if it was modified and the working tree file was not.  If both files
 have been modified, print a warning and exit with an error.

When there's a conflict, does difftool save both conflicting files? Or
only the working tree copy? I think it should preserve both copies on
disk.

-- 
Matt McClure
http://www.matthewlmcclure.com
http://www.mapmyfitness.com/profile/matthewlmcclure
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2] difftool: don't overwrite modified files

2013-03-26 Thread John Keeping
On Tue, Mar 26, 2013 at 04:52:02PM -0400, Matt McClure wrote:
 On Mon, Mar 25, 2013 at 5:44 PM, John Keeping j...@keeping.me.uk wrote:
  Instead of copying unconditionally when the files differ, create and
  index from the working tree files and only copy the temporary file back
  if it was modified and the working tree file was not.  If both files
  have been modified, print a warning and exit with an error.
 
 When there's a conflict, does difftool save both conflicting files? Or
 only the working tree copy? I think it should preserve both copies on
 disk.

It preserves both copies - the clean the temporary directory step is
just skipped.

This isn't ideal since the temporary copy will be under a temporary
directory somewhere but is better than the current behaviour.  It might
be nice to move the temporary file back with an extension so that the
files are at least near each other but I don't think that's needed in
the first version of this change.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


patch series vs. multiple files changed in a commit; storytelling history vs. literal creation history

2013-03-26 Thread Matt McClure
I've read Documentation/SubmittingPatches, followed some of the
discussion on this list, and looked over some of the recent commit
history. I'm impressed by the strong culture of review that produces
readable patches and commit messages, but I think there are some gaps
in my understanding of the prevailing process here.

Most of the code I've worked on has been closed source, and the commit
histories tend to reflect what I'd call the literal creation
history. Reading the Git history, my impression is that it reflects a
different storytelling history. In some cases, that might be the
same as the creation history, but in general the emphasis is on
telling a coherent story of the changes to the other developers rather
than communicating all the messy details of how you arrived at the
order of that story. Is that right?

What are the Git project's rules of thumb for when to create a patch
series vs. putting changes to multiple files in a single commit/patch?

As a patch series evolves before landing on an upstream branch, do you
typically make corrections to the original series in new commits, or
update the respective commits from the original series in a new series
of analogous commits?

-- 
Matt McClure
http://www.matthewlmcclure.com
http://www.mapmyfitness.com/profile/matthewlmcclure
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] git-svn: Support custom tunnel schemes instead of SSH only

2013-03-26 Thread Sebastian Schuberth
This originates from an msysgit pull request, see:

https://github.com/msysgit/git/pull/58

Signed-off-by: Eric Wieser wieser.e...@gmail.com
Signed-off-by: Sebastian Schuberth sschube...@gmail.com
---
 perl/Git/SVN/Ra.pm | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/perl/Git/SVN/Ra.pm b/perl/Git/SVN/Ra.pm
index 049c97b..6a212eb 100644
--- a/perl/Git/SVN/Ra.pm
+++ b/perl/Git/SVN/Ra.pm
@@ -295,7 +295,7 @@ sub gs_do_switch {
my $full_url = add_path_to_url( $self-url, $path );
my ($ra, $reparented);
 
-   if ($old_url =~ m#^svn(\+ssh)?://# ||
+   if ($old_url =~ m#^svn(\+\w+)?://# ||
($full_url =~ m#^https?://# 
 canonicalize_url($full_url) ne $full_url)) {
$_[0] = undef;
-- 
1.8.1.msysgit.1


--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/9] stream_blob_to_fd: detect errors reading from stream

2013-03-26 Thread Junio C Hamano
Jeff King p...@peff.net writes:

 We call read_istream, but never check its return value for
 errors. This can lead to us looping infinitely, as we just
 keep trying to write -1 bytes (and we do not notice the
 error, as we simply check that write_in_full reports the
 same number of bytes we fed it, which of course is also -1).

Looks sane.  Thanks.


 Signed-off-by: Jeff King p...@peff.net
 ---
 No test yet, as my method for triggering this causes _another_ infinite
 loop. So the test comes after the fixes, to avoid infinite loops when
 bisecting the history later. :)

  streaming.c | 2 ++
  1 file changed, 2 insertions(+)

 diff --git a/streaming.c b/streaming.c
 index 4d978e5..f4126a7 100644
 --- a/streaming.c
 +++ b/streaming.c
 @@ -514,6 +514,8 @@ int stream_blob_to_fd(int fd, unsigned const char *sha1, 
 struct stream_filter *f
   ssize_t wrote, holeto;
   ssize_t readlen = read_istream(st, buf, sizeof(buf));
  
 + if (readlen  0)
 + goto close_and_exit;
   if (!readlen)
   break;
   if (can_seek  sizeof(buf) == readlen) {
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/4] dir.c::match_basename(): pay attention to the length of string parameters

2013-03-26 Thread Jeff King
On Tue, Mar 26, 2013 at 01:49:10PM -0700, Junio C Hamano wrote:

 Jeff King p...@peff.net writes:
 
  I timed this doing git archive HEAD on webkit.git before and after. It
  actually ended up not mattering much (I think because it is only the
  directories which are affected, not each individually path, so it's a
  much smaller number than you'd think). The best-of-five timing was
  slightly slower, but was within the noise.
 
 Interesting.  Because archive has to incur a large I/O cost
 anyway, I expected extra allocation for correctness for only the
 directory paths would be dwarfed in the noise.
 
 I actually care more about cases other than archive, though.  Do
 we even feed directory paths to the machinery?

In general, no, I don't think so. That's why I tested archive, since I
knew it did. In the normal case, we should just feed file paths, meaning
we only run into this code path when somebody has foo/ in their
pattern. Testing like:

  git ls-files -z files
  time git check-attr --stdin -z -a files /dev/null

showed a difference well within the noise.

  So I do still think it would make sense to go to a byte-limited version
  of fnmatch eventually, just for code cleanliness and predictability of
  performance, but this is really not a bad solution in the interim.
 
 Yes, what we do with wildmatch is a separate issue for 'master' and
 upwards.

Oh, agreed. I just wanted to see how much performance would be impacted
for the interim. But it seems that it's not.

So I think your series is the right direction, but we would want to
factor out the allocation code and use it from match_pathname, as well.

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Change the committer username

2013-03-26 Thread Eric Kom

Good day,

Please how can I change the committer username from system default to 
personalize?


--
Kind Regards

Eric Kom

System Administrator  Programmer - Metropolitan College
 _
/ You are scrupulously honest, frank, and \
| straightforward. Therefore you have few |
\ friends./
 -
   \
\
.--.
   |o_o |
   |:_/ |
  //   \ \
 (| Kom | )
/'\_   _/`\
\___)=(___/

2 Hennie Van Till, White River, 1240
Tel: 013 750 2255 | Fax: 013 750 0105 | Cell: 078 879 1334
eric...@kom.za.net | eric...@metropolitancollege.co.za
www.kom.za.net | www.kom.za.org | www.erickom.co.za

Key fingerprint: 513E E91A C243 3020 8735 09BB 2DBC 5AD7 A9DA 1EF5

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 3/4] attr.c::path_matches(): special case paths that end with a slash

2013-03-26 Thread Jeff King
On Tue, Mar 26, 2013 at 03:05:58PM -0400, Jeff King wrote:

 On Tue, Mar 26, 2013 at 11:39:30AM -0700, Junio C Hamano wrote:
 
  A similar adjustment for match_pathname() might be needed, but I
  didn't look into it.
 
 I notice that match_pathname takes _two_ lengths for the pattern: the
 nowildcardlen (called prefix, and the full patternlen). But the first
 thing it does is:
 
   if (*pattern == '/') {
   pattern++;
   prefix--;
   }
 
 which seems obviously wrong, as patternlen should be dropped, too. But
 we do not seem to look at patternlen at all! I think we can drop the
 parameter totally.
 
 We do seem to use strncmp_icase through the rest of the function,
 though, which should be OK. The one exception is that we call fnmatch at
 the end. Should the allocation hack from the previous patch make its way
 into an fnmatch_icase_bytes() function, so we can use it here, too?
 And then when we have a more efficient solution, we can just plug it in
 there.

Hmm. match_pathname does have this:

/*
 * baselen does not count the trailing slash. base[] may or
 * may not end with a trailing slash though.
 */
if (pathlen  baselen + 1 ||
(baselen  pathname[baselen] != '/') ||
strncmp_icase(pathname, base, baselen))
return 0;

which seems to imply that the trailing slash is important here, and that
we should not drop it when passing the path to match_pathname. I'm
still trying to figure out exactly what it is that the extra slash check
is for, and whether it might not have the same problem.

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 6/9] streaming_write_entry: propagate streaming errors

2013-03-26 Thread Junio C Hamano
Jeff King p...@peff.net writes:

 Subject: [PATCH] streaming_write_entry: propagate streaming errors

 When we are streaming an index blob to disk, we store the
 error from stream_blob_to_fd in the result variable, and
 then immediately overwrite that with the return value of
 close. That means we catch errors on close (e.g., problems
 committing the file to disk), but miss anything which
 happened before then.

 We can fix this by using bitwise-OR to accumulate errors in
 our result variable.

 While we're here, we can also simplify the error handling
 with an early return, which makes it easier to see under
 which circumstances we need to clean up.

 Signed-off-by: Jeff King p...@peff.net

Very sensible.  Thanks.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 8/9] clone: die on errors from unpack_trees

2013-03-26 Thread Junio C Hamano
Jeff King p...@peff.net writes:

 When clone is populating the working tree, it ignores the
 return status from unpack_trees; this means we may report a
 successful clone, even when the checkout fails.

 When checkout fails, we may want to leave the $GIT_DIR in
 place, as it might be possible to recover the data through
 further use of git checkout (e.g., if the checkout failed
 due to a transient error, disk full, etc). However, we
 already die on a number of other checkout-related errors, so
 this patch follows that pattern.

 In addition to marking a now-passing test, we need to adjust
 t5710, which blindly assumed it could make bogus clones of
 very deep alternates hierarchies. By using --bare, we can
 avoid it actually touching any objects.

 Signed-off-by: Jeff King p...@peff.net
 ---

Thanks.

 I think the leave the data behind fix may be to just set junk_pid =
 0 a little sooner in cmd_clone (i.e., before checkout()). Then we
 would still die, but at least leave the fetched objects intact.

Yeah, perhaps, but I agree that is a much lower priority change.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: git ate my home directory :-(

2013-03-26 Thread Philip Oakley

From: Duy Nguyen pclo...@gmail.com
Sent: Tuesday, March 26, 2013 9:48 AM

On Tue, Mar 26, 2013 at 08:02:30AM -, Philip Oakley wrote:

 Yeah, for historical reasons GIT_WORK_TREE defaults to $(pwd) when
 GIT_DIR is explicitly set.

 And it *WILL* be that way til the end of time.  Unless you are at
 the top level of your working tree, you are supposed to tell where
 the top level is with GIT_WORK_TREE when you use GIT_DIR.  Always.

Should this important warning be part of the git(1) documentation on
the
environment variables (and possibly other places) given the
consequences
of this case? It wasn't something I'd appreciated from a simple
reading.


Something like this, maybe?

-- 8 --
Subject: [PATCH] git.txt: document the implicit working tree setting
with GIT_DIR

Signed-off-by: Nguyễn Thái Ngọc Duy pclo...@gmail.com
---
Documentation/git.txt | 2 ++
1 file changed, 2 insertions(+)

diff --git a/Documentation/git.txt b/Documentation/git.txt
index 7efaa59..ce55abf 100644
--- a/Documentation/git.txt
+++ b/Documentation/git.txt
@@ -671,6 +671,8 @@ Git so take care if using Cogito etc.
 specifies a path to use instead of the default `.git`
 for the base of the repository.
 The '--git-dir' command-line option also sets this value.
+ If neither GIT_WORK_TREE nor '--work-tree' is set, the
+ current directory will become the working tree.


I didn't feel this conveyed the Dire Warning effect that would be needed
to avoid the original misunderstanding.

It is easy to miss some of the potential consequences when other
priorities are pressing.

As Junio wondered, perhaps rhetorically, in a later message  Why do
these people set GIT_DIR without setting GIT_WORK_TREE in the first
place?

Perhaps
If the GIT_DIR environment variable is set then it specifies a path to
use instead of the default `.git` for the base of the repository. Note
that the current directory `.` will be used as the working
GIT_WORK_TREE, if not set elsewhere. The --git-dir command-line
option also sets the GIT_DIR environment variable.




'GIT_WORK_TREE'::
 Set the path to the working tree.  The value will not be
--
1.8.2.82.gc24b958
-- 8 --
--
Philip 


--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 9/9] clone: run check_everything_connected

2013-03-26 Thread Junio C Hamano
Jeff King p...@peff.net writes:

 The slowdown is really quite terrible if you try git clone --bare
 linux-2.6.git. Even with this, the local-clone case already misses blob
 corruption. So it probably makes sense to restrict it to just the
 non-local clone case, which already has to do more work.

Probably.  We may want to enable fsck even for local clones in the
longer term and also have this check.  Those who know their filesystem
is trustworthy can do the filesystem-level copy with cp -R themselves
after all.

 Even still, it adds a non-trivial amount of work (linux-2.6 takes
 something like a minute to check). I don't like the idea of declaring
 git clone non-safe unless you turn on transfer.fsckObjects, though. It
 should have the same safety as git fetch.

True.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: propagating repo corruption across clone

2013-03-26 Thread Philip Oakley

From: Jeff King p...@peff.net
Sent: Tuesday, March 26, 2013 4:55 PM

On Tue, Mar 26, 2013 at 09:43:01AM -0400, Jeff Mitchell wrote:


On Mon, Mar 25, 2013 at 4:07 PM, Jeff King p...@peff.net wrote:
 On Mon, Mar 25, 2013 at 12:32:50PM -0400, Jeff Mitchell wrote:
 For commit corruptions, the --no-hardlinks, non --mirror case
 refused
 to create the new repository and exited with an error code of 128.
 The
 --no-hardlinks, --mirror case spewed errors to the console, yet
 *still* created the new clone *and* returned an error code of
 zero.

 I wasn't able to reproduce this; can you post a succint test case?

[...link to tar.gz...]
Once you extract that, you should be able to run a clone using paths
(not file://) with --no-hardlinks --mirror and replicate the behavior
I saw. FYI, I'm on Git 1.8.2.


Thanks for providing an example.

The difference is the same --mirror implies --bare issue; the
non-bare
case dies during the checkout (even before my patches, as the
corruption
is not in a blob, but rather in the HEAD commit object itself). You
can
replace --mirror with --bare and see the same behavior.

The troubling part is that we see errors in the bare case, but do not
die. Those errors all come from upload-pack, the sending side of a
clone/fetch. Even though we do not transfer the objects via the git
protocol, we still invoke upload-pack to get the ref list (and then
copy
the objects themselves out-of-band).

What happens is that upload-pack sees the errors while trying to see
if
the object is a tag that can be peeled (the server advertises both
tags
and the objects they point to). It does not distinguish between
errors
did not let me peel this object and this object is not a tag, and
therefore there is nothing to peel.

We could change that, but I'm not sure whether it is a good idea. I
think the intent is that upload-pack's ref advertisement would remain
resilient to corruption in the repository (e.g., even if that commit
is
corrupt, you can still fetch the rest of the data). We should not
worry
about advertising broken objects, because we will encounter the same
error when we actually do try to send the objects. Dying at the
advertisement phase would be premature, since we do not yet know what
the client will request.

The problem, of course, is that the --local optimization _skips_ the
part where we actually ask upload-pack for data, and instead blindly
copies it. So this is the same issue as usual, which is that the local
transport is not thorough enough to catch corruption. It seems like a
failing in this case, because upload-pack does notice the problem, but
that is only luck; if the corruption were in a non-tip object, it
would
not notice it at all. So trying to die on errors in the ref
advertisement would just be a band-aid. Fundamentally the problem is
that the --local transport is not safe from propagating corruption,
and
should not be used if that's a requirement.

-Peff
--


Which way does `git bundle file.bundl --all` perform after the changes
for both the 'transport' checking and being reliable during updates.

Is it an option for creating an archivable file that can be used for a
later `clone`?

I wasn't sure if the bundle capability had been considered.

Philip


--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: propagating repo corruption across clone

2013-03-26 Thread Jeff King
On Tue, Mar 26, 2013 at 09:59:42PM -, Philip Oakley wrote:

 Which way does `git bundle file.bundl --all` perform after the changes
 for both the 'transport' checking and being reliable during updates.

Bundles are treated at a fairly low level the same as a remote who
provides us a particular set of refs and a packfile. So we should get
the same protections via index-pack, and still run
check_everything_connected on it, just as we would with a fetch over the
git protocol.

I didn't test it, though.

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: patch series vs. multiple files changed in a commit; storytelling history vs. literal creation history

2013-03-26 Thread Junio C Hamano
Matt McClure matthewlmccl...@gmail.com writes:

 I've read Documentation/SubmittingPatches, followed some of the
 discussion on this list, and looked over some of the recent commit
 history. I'm impressed by the strong culture of review that produces
 readable patches and commit messages, but I think there are some gaps
 in my understanding of the prevailing process here.

 Most of the code I've worked on has been closed source, and the commit
 histories tend to reflect what I'd call the literal creation
 history. Reading the Git history, my impression is that it reflects a
 different storytelling history. In some cases, that might be the
 same as the creation history, but in general the emphasis is on
 telling a coherent story of the changes to the other developers rather
 than communicating all the messy details of how you arrived at the
 order of that story. Is that right?

We do not try to keep records of oops, the previous was wrong when
we can easily tell the previous was wrong.  Usually such a shallow
mistake that can be spotted during the review is not worth keeping.

For example, the 4 patch series I posted today had three iterations
of botched attempts that were never even published.  I am sure other
people do the same for their initial round for their patches, and
any message on this list whose subject begins with [PATCH vN X/Y]
for N  2 are rewritten betterment based on list feedback.

Our history tends to become a coherent story because of this.

It is a different story for more involved changes that cook in
'next' for a while and then later turns out to have flaws. We update
them with follow-up fixes and at that point, we do have records of
mistakes. They often are tricky cases that are worth recording, as
people can later make similar kinds of mistakes in other parts of
the codebase.

The early part of the history back when Linus was running the show
is somewhat different; you see more reverts and rewrites. But even
back then, there were more experimental changes that were rewritten
than the changes that were finally committed to the history.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Change the committer username

2013-03-26 Thread Junio C Hamano
Eric Kom eric...@metropolitancollege.co.za writes:

 Good day,

 Please how can I change the committer username from system default to
 personalize?

Quoting from a very early part of

http://git-htmldocs.googlecode.com/git/gittutorial.html


It is a good idea to introduce yourself to Git with your name and
public email address before doing any operation.  The easiest
way to do so is:


$ git config --global user.name Your Name Comes Here
$ git config --global user.email y...@yourdomain.example.com


HTH
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


merge help

2013-03-26 Thread J.V.
I have a branch for which I have made 0 (nada) changes.  I did the 
following:


$git pull --rebase --no-stat -v --progress origin mybranch

I get the following

U   java/Profile.java
Pull is not possible because you have unmerged files.
Please, fix them up in the work tree, and then use 'git add/rm file'
as appropriate to mark resolution, or use 'git commit -a'.
--   (there is no local master branch), because of the conflict 
I was put on (no branch)

* (no branch)
  branch1
  dev
  tmpWork

Question 1)
Why did I get a merge conflict if I have not changed any files?


Question 2)
What is the command to show the difference between the files? (is there 
a visual tool that would let me merge)?


Question 3)
What is the command I would type to accept theirs and overwrite my local?

After this would I need to $git add  commit  push the file?

If someone could guide me through this initial process, (what is best 
recommended it would be most helpful).


thanks

J.V.

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 10/9] clone: leave repo in place after checkout errors

2013-03-26 Thread Jeff King
On Tue, Mar 26, 2013 at 02:40:57PM -0700, Junio C Hamano wrote:

  I think the leave the data behind fix may be to just set junk_pid =
  0 a little sooner in cmd_clone (i.e., before checkout()). Then we
  would still die, but at least leave the fetched objects intact.
 
 Yeah, perhaps, but I agree that is a much lower priority change.

As it turns out, the checkout() error path sometimes _already_ leaves
the repository intact, but it's due to a bug. And it ends up deleting
something random instead. :)

I agree it's not a high priority, but I think it makes sense while we're
in the area. And while it's very unlikely that the deletion would be
disastrous (see below), it makes me nervous. Patch is below.

-- 8 --
Subject: [PATCH] clone: leave repo in place after checkout errors

If we manage to clone a remote repository but run into an
error in the checkout, it is probably sane to leave the repo
directory in place. That lets the user examine the situation
without spending time to re-clone from the remote (which may
be a lengthy process).

Rather than try to convert each die() from the checkout code
path into an error(), we simply set a flag that tells the
remove_junk atexit function to print a helpful message and
leave the repo in place.

Note that the test added in this patch actually passes
without the code change. The reason is that the cleanup code
is buggy; we chdir into the working tree for the checkout,
but still may use relative paths to remove the directories
(which means if you cloned into foo, we would accidentally
remove foo from the working tree!).  There's no point in
fixing it now, since this patch means we will never try to
remove anything after the chdir, anyway.

Signed-off-by: Jeff King p...@peff.net
---
I think the accidental deletion could also escape the repository if you
did something like:

  git clone $remote ../../foo

which would delete ../../foo/../../foo, or ../../../foo, which is not
related to what you just cloned. But I didn't test, and we don't have to
care anymore after this patch.

 builtin/clone.c  | 33 -
 t/t1060-object-corruption.sh |  4 
 2 files changed, 36 insertions(+), 1 deletion(-)

diff --git a/builtin/clone.c b/builtin/clone.c
index eceaa74..e145dfc 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -377,10 +377,40 @@ static void remove_junk(void)
 static const char *junk_work_tree;
 static const char *junk_git_dir;
 static pid_t junk_pid;
+enum {
+   JUNK_LEAVE_NONE,
+   JUNK_LEAVE_REPO,
+   JUNK_LEAVE_ALL
+} junk_mode = JUNK_LEAVE_NONE;
+
+static const char junk_leave_repo_msg[] =
+N_(The remote repository was cloned successfully, but there was\n
+   an error checking out the HEAD branch. The repository has been left in\n
+   place but the working tree may be in an inconsistent state. You can\n
+   can inspect the contents with:\n
+   \n
+   git status\n
+   \n
+   and retry the checkout with\n
+   \n
+   git checkout -f HEAD\n
+   \n);
 
 static void remove_junk(void)
 {
struct strbuf sb = STRBUF_INIT;
+
+   switch (junk_mode) {
+   case JUNK_LEAVE_REPO:
+   warning(%s, _(junk_leave_repo_msg));
+   /* fall-through */
+   case JUNK_LEAVE_ALL:
+   return;
+   default:
+   /* proceed to removal */
+   break;
+   }
+
if (getpid() != junk_pid)
return;
if (junk_git_dir) {
@@ -925,12 +955,13 @@ int cmd_clone(int argc, const char **argv, const char 
*prefix)
transport_unlock_pack(transport);
transport_disconnect(transport);
 
+   junk_mode = JUNK_LEAVE_REPO;
err = checkout();
 
strbuf_release(reflog_msg);
strbuf_release(branch_top);
strbuf_release(key);
strbuf_release(value);
-   junk_pid = 0;
+   junk_mode = JUNK_LEAVE_ALL;
return err;
 }
diff --git a/t/t1060-object-corruption.sh b/t/t1060-object-corruption.sh
index a405b70..a84deb1 100755
--- a/t/t1060-object-corruption.sh
+++ b/t/t1060-object-corruption.sh
@@ -89,6 +89,10 @@ test_expect_success 'clone --local detects corruption' '
test_must_fail git clone --local bit-error corrupt-checkout
 '
 
+test_expect_success 'error detected during checkout leaves repo intact' '
+   test_path_is_dir corrupt-checkout/.git
+'
+
 test_expect_success 'clone --local detects missing objects' '
test_must_fail git clone --local missing missing-checkout
 '
-- 
1.8.2.13.g0f18d3c

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 9/9] clone: run check_everything_connected

2013-03-26 Thread Jeff King
On Tue, Mar 26, 2013 at 07:53:42AM +0700, Nguyen Thai Ngoc Duy wrote:

 On Tue, Mar 26, 2013 at 3:26 AM, Jeff King p...@peff.net wrote:
   static void update_remote_refs(const struct ref *refs,
 const struct ref *mapped_refs,
 const struct ref *remote_head_points_at,
 const char *branch_top,
 const char *msg)
   {
  +   const struct ref *rm = mapped_refs;
  +
  +   if (check_everything_connected(iterate_ref_map, 0, rm))
  +   die(_(remote did not send all necessary objects));
  +
  if (refs) {
  write_remote_refs(mapped_refs);
  if (option_single_branch)
 
 Maybe move this after checkout, so that I can switch terminal and
 start working while it's verifying? And maybe an option not to
 check_everything_connected, instead print a big fat warning telling
 the user to fsck later?

I tried to follow the fetch process of not installing the refs until we
had verified that the objects were reasonable. It probably doesn't
matter that much for clone, since you would not have simultaneous users
expecting the repository to be in a reasonable state until after clone
completes, though.

We also would have to tweak check_everything_connected, which does
something like --not --all to avoid rechecking objects we already
have. But that is not too hard to do.

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 10/9] clone: leave repo in place after checkout errors

2013-03-26 Thread Jonathan Nieder
Jeff King wrote:

 --- a/builtin/clone.c
 +++ b/builtin/clone.c
 @@ -377,10 +377,40 @@ static void remove_junk(void)
  static const char *junk_work_tree;
  static const char *junk_git_dir;
  static pid_t junk_pid;
 +enum {
 + JUNK_LEAVE_NONE,
 + JUNK_LEAVE_REPO,
 + JUNK_LEAVE_ALL
 +} junk_mode = JUNK_LEAVE_NONE;

Neat.

 +
 +static const char junk_leave_repo_msg[] =
 +N_(The remote repository was cloned successfully, but there was\n
 +   an error checking out the HEAD branch. The repository has been left in\n
 +   place but the working tree may be in an inconsistent state. You can\n
 +   can inspect the contents with:\n
 +   \n
 +   git status\n
 +   \n
 +   and retry the checkout with\n
 +   \n
 +   git checkout -f HEAD\n
 +   \n);

Can this be made more precise?  I don't know what it means for the
working tree to be in an inconsistent state: do you mean that some files
might be partially checked out or not have been checked out at all yet?

error: Clone succeeded, but checkout failed.
hint: You can inspect what was checked out with git status.
hint: To retry the checkout, run git checkout -f HEAD.

Aside from that, this looks very nice.

Thanks,
Jonathan
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/4] dir.c::match_basename(): pay attention to the length of string parameters

2013-03-26 Thread Junio C Hamano
Jeff King p...@peff.net writes:

 So I think your series is the right direction, but we would want to
 factor out the allocation code and use it from match_pathname, as well.

I am deep into today's integration cycle, so perhaps in the meantime
you can help with a follow-up patch ;-)?
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


What's cooking in git.git (Mar 2013, #07; Tue, 26)

2013-03-26 Thread Junio C Hamano
Here are the topics that have been cooking.  Commits prefixed with
'-' are only in 'pu' (proposed updates) while commits prefixed with
'+' are in 'next'.

You can find the changes described here in the integration branches
of the repositories listed at

http://git-blame.blogspot.com/p/git-public-repositories.html

--
[Graduated to master]

* ap/maint-diff-rename-avoid-overlap (2013-03-06) 3 commits
  (merged to 'next' on 2013-03-19 at c3276cf)
 + tests: make sure rename pretty print works
 + diff: prevent pprint_rename from underrunning input
 + diff: Fix rename pretty-print when suffix and prefix overlap

 Originally merged to 'next' on 2013-03-06

 The logic used by git diff -M --stat to shorten the names of
 files before and after a rename did not work correctly when the
 common prefix and suffix between the two filenames overlapped.


* jc/describe (2013-02-28) 1 commit
  (merged to 'next' on 2013-03-19 at 89e6e47)
 + describe: --match=pattern must limit the refs even when used with --all

 Originally merged to 'next' on 2013-03-05

 The --match=pattern option of git describe, when used with
 --all to allow refs that are not annotated tags to be used as a
 base of description, did not restrict the output from the command
 to those that match the given pattern.

 We may want to have a looser matching that does not restrict to tags,
 but that can be done as a follow-up topic; this step is purely a bugfix.


* jc/maint-reflog-expire-clean-mark-typofix (2013-03-05) 1 commit
  (merged to 'next' on 2013-03-19 at a4f9eac)
 + reflog: fix typo in reflog expire clean-up codepath

 Originally merged to 'next' on 2013-03-07

 In git reflog expire, REACHABLE bit was not cleared from the
 correct objects.


* jc/push-follow-tag (2013-03-05) 4 commits
  (merged to 'next' on 2013-03-19 at d302a10)
 + push: --follow-tags
 + commit.c: use clear_commit_marks_many() in in_merge_bases_many()
 + commit.c: add in_merge_bases_many()
 + commit.c: add clear_commit_marks_many()

 Originally merged to 'next' on 2013-03-09

 The new --follow-tags option tells git push to push relevant
 annotated tags when pushing branches out.


* jc/reflog-reverse-walk (2013-03-23) 4 commits
  (merged to 'next' on 2013-03-25 at 1bcc1c4)
 + refs.c: fix fread error handling
  (merged to 'next' on 2013-03-19 at 25beb2a)
 + reflog: add for_each_reflog_ent_reverse() API
 + for_each_recent_reflog_ent(): simplify opening of a reflog file
 + for_each_reflog_ent(): extract a helper to process a single entry
 (this branch is tangled with nd/branch-show-rebase-bisect-state.)

 An internal function used to implement git checkout @{-1} was
 hard to use correctly.


* jk/alias-in-bare (2013-03-08) 3 commits
  (merged to 'next' on 2013-03-19 at d2b4227)
 + setup: suppress implicit . work-tree for bare repos
 + environment: add GIT_PREFIX to local_repo_env
 + cache.h: drop LOCAL_REPO_ENV_SIZE

 Originally merged to 'next' on 2013-03-09

 An aliased command spawned from a bare repository that does not say
 it is bare with core.bare = yes is treated as non-bare by mistake.


* jk/empty-archive (2013-03-10) 2 commits
  (merged to 'next' on 2013-03-19 at bb4eb61)
 + archive: handle commits with an empty tree
 + test-lib: factor out $GIT_UNZIP setup

 Originally merged to 'next' on 2013-03-12

 git archive reports a failure when asked to create an archive out
 of an empty tree.  It would be more intuitive to give an empty
 archive back in such a case.


* jk/fast-export-object-lookup (2013-03-17) 2 commits
  (merged to 'next' on 2013-03-19 at 026ac3d)
 + fast-export: do not load blob objects twice
 + fast-export: rename handle_object function

 Originally merged to 'next' on 2013-03-18


* jk/fully-peeled-packed-ref (2013-03-18) 4 commits
  (merged to 'next' on 2013-03-19 at fa92bc7)
 + pack-refs: add fully-peeled trait
 + pack-refs: write peeled entry for non-tags
 + use parse_object_or_die instead of die(bad object)
 + avoid segfaults on parse_object failure

 Originally merged to 'next' on 2013-03-18

 Not that we do not actively encourage having annotated tags outside
 refs/tags/ hierarchy, but they were not advertised correctly to the
 ls-remote and fetch with recent version of Git.


* jk/peel-ref (2013-03-16) 3 commits
  (merged to 'next' on 2013-03-19 at f0d4c16)
 + upload-pack: load non-tip want objects from disk
 + upload-pack: make sure want objects are parsed
 + upload-pack: drop lookup-before-parse optimization

 Originally merged to 'next' on 2013-03-18

 Recent optimization broke shallow clones.


* jk/suppress-clang-warning (2013-02-25) 1 commit
  (merged to 'next' on 2013-03-19 at 1fd6858)
 + fix clang -Wtautological-compare with unsigned enum

 Originally merged to 'next' on 2013-03-14


* jl/submodule-deinit (2013-03-04) 1 commit
  (merged to 'next' on 2013-03-19 at d8367c5)
 + submodule: add 'deinit' command

 Originally merged to 'next' on 2013-03-05

 There was no Porcelain way to say I no 

Re: propagating repo corruption across clone

2013-03-26 Thread Rich Fromm
Jeff King wrote
 Fundamentally the problem is
 that the --local transport is not safe from propagating corruption, and
 should not be used if that's a requirement.

I've read Jeff Mitchell's blog post, his update, relevant parts of the
git-clone(1) man page, and a decent chunk of this thread, and I'm still not
clear on one thing.  Is the danger of `git clone --mirror` propagating
corruption only true when using the --local option ?

Specifically, in my case, I'm using `git clone --mirror`, but I'm *not*
using --local, nor am I using --no-hardlinks.  The host executing the clone
command is different than the the host on which the remote repository lives,
and I am using ssh as a transport protocol.  If there is corruption, can I
or can I not expect the clone operation to fail and return a non-zero exit
value?  If I can not expect this, is the workaround to run `git fsck` on the
resulting clone?




--
View this message in context: 
http://git.661346.n2.nabble.com/propagating-repo-corruption-across-clone-tp7580504p7580771.html
Sent from the git mailing list archive at Nabble.com.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 10/9] clone: leave repo in place after checkout errors

2013-03-26 Thread Jeff King
On Tue, Mar 26, 2013 at 03:32:59PM -0700, Jonathan Nieder wrote:

  +static const char junk_leave_repo_msg[] =
  +N_(The remote repository was cloned successfully, but there was\n
  +   an error checking out the HEAD branch. The repository has been left 
  in\n
  +   place but the working tree may be in an inconsistent state. You can\n
  +   can inspect the contents with:\n
  +   \n
  +   git status\n
  +   \n
  +   and retry the checkout with\n
  +   \n
  +   git checkout -f HEAD\n
  +   \n);
 
 Can this be made more precise?  I don't know what it means for the
 working tree to be in an inconsistent state: do you mean that some files
 might be partially checked out or not have been checked out at all yet?

It means that we died during the checkout procedure, and we don't have
any idea what was left. Maybe something, maybe nothing. Maybe an index,
maybe not.

   error: Clone succeeded, but checkout failed.
   hint: You can inspect what was checked out with git status.
   hint: To retry the checkout, run git checkout -f HEAD.

That is certainly more succint, if not more precise. I'd be fine with
it.

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/4] dir.c::match_basename(): pay attention to the length of string parameters

2013-03-26 Thread Jeff King
On Tue, Mar 26, 2013 at 03:33:40PM -0700, Junio C Hamano wrote:

 Jeff King p...@peff.net writes:
 
  So I think your series is the right direction, but we would want to
  factor out the allocation code and use it from match_pathname, as well.
 
 I am deep into today's integration cycle, so perhaps in the meantime
 you can help with a follow-up patch ;-)?

Your afternoon integration is my dinner-time. :)

I'll try to look at it tomorrow.

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/4] attribute regression fix for maint-1.8.1 and upward

2013-03-26 Thread Duy Nguyen
qOn Tue, Mar 26, 2013 at 11:39:27AM -0700, Junio C Hamano wrote:
 So here is an attempt to fix the unintended regression, on top of
 9db9eecfe5c2 (attr: avoid calling find_basename() twice per path,
 2013-01-16).  It consists of four patches.

Not that I disagree with this. Just wanted to see how far the dtype
idea went. How about this? git_check_attr() now takes dtype as an
argument and the caller must not add the trailing slash. This could be
split into two patches, one for git_check_attr prototype change, and
the other the real meat.

-- 8 --
diff --git a/archive.c b/archive.c
index 93e00bb..ab811b3 100644
--- a/archive.c
+++ b/archive.c
@@ -112,7 +112,7 @@ static int write_archive_entry(const unsigned char *sha1, 
const char *base,
write_archive_entry_fn_t write_entry = c-write_entry;
struct git_attr_check check[2];
const char *path_without_prefix;
-   int err;
+   int err, dtype;
 
args-convert = 0;
strbuf_reset(path);
@@ -120,18 +120,18 @@ static int write_archive_entry(const unsigned char *sha1, 
const char *base,
strbuf_add(path, args-base, args-baselen);
strbuf_add(path, base, baselen);
strbuf_addstr(path, filename);
-   if (S_ISDIR(mode) || S_ISGITLINK(mode))
-   strbuf_addch(path, '/');
+   dtype = S_ISDIR(mode) || S_ISGITLINK(mode) ? DT_DIR : DT_REG;
path_without_prefix = path.buf + args-baselen;
 
setup_archive_check(check);
-   if (!git_check_attr(path_without_prefix, ARRAY_SIZE(check), check)) {
+   if (!git_check_attr(path_without_prefix, dtype, ARRAY_SIZE(check), 
check)) {
if (ATTR_TRUE(check[0].value))
return 0;
args-convert = ATTR_TRUE(check[1].value);
}
 
if (S_ISDIR(mode) || S_ISGITLINK(mode)) {
+   strbuf_addch(path, '/');
if (args-verbose)
fprintf(stderr, %.*s\n, (int)path.len, path.buf);
err = write_entry(args, sha1, path.buf, path.len, mode);
diff --git a/attr.c b/attr.c
index e2f9377..ab6422c 100644
--- a/attr.c
+++ b/attr.c
@@ -255,6 +255,8 @@ static struct match_attr *parse_attr_line(const char *line, 
const char *src,
  res-u.pat.patternlen,
  res-u.pat.flags,
  res-u.pat.nowildcardlen);
+   if (res-u.pat.flags  EXC_FLAG_MUSTBEDIR)
+   p[res-u.pat.patternlen] = '\0';
if (res-u.pat.flags  EXC_FLAG_NEGATIVE) {
warning(_(Negative patterns are ignored in git 
attributes\n
  Use '\\!' for literal leading 
exclamation.));
@@ -657,15 +659,14 @@ static void prepare_attr_stack(const char *path, int 
dirlen)
 }
 
 static int path_matches(const char *pathname, int pathlen,
-   const char *basename,
+   const char *basename, int dtype,
const struct pattern *pat,
const char *base, int baselen)
 {
const char *pattern = pat-pattern;
int prefix = pat-nowildcardlen;
 
-   if ((pat-flags  EXC_FLAG_MUSTBEDIR) 
-   ((!pathlen) || (pathname[pathlen-1] != '/')))
+   if ((pat-flags  EXC_FLAG_MUSTBEDIR)  dtype != DT_DIR)
return 0;
 
if (pat-flags  EXC_FLAG_NODIR) {
@@ -704,7 +705,7 @@ static int fill_one(const char *what, struct match_attr *a, 
int rem)
 }
 
 static int fill(const char *path, int pathlen, const char *basename,
-   struct attr_stack *stk, int rem)
+   int dtype, struct attr_stack *stk, int rem)
 {
int i;
const char *base = stk-origin ? stk-origin : ;
@@ -713,7 +714,7 @@ static int fill(const char *path, int pathlen, const char 
*basename,
struct match_attr *a = stk-attrs[i];
if (a-is_macro)
continue;
-   if (path_matches(path, pathlen, basename,
+   if (path_matches(path, pathlen, basename, dtype,
 a-u.pat, base, stk-originlen))
rem = fill_one(fill, a, rem);
}
@@ -748,20 +749,17 @@ static int macroexpand_one(int attr_nr, int rem)
  * Collect all attributes for path into the array pointed to by
  * check_all_attr.
  */
-static void collect_all_attrs(const char *path)
+static void collect_all_attrs(const char *path, int dtype)
 {
struct attr_stack *stk;
int i, pathlen, rem, dirlen;
-   const char *basename, *cp, *last_slash = NULL;
+   const char *basename, *cp;
 
-   for (cp = path; *cp; cp++) {
-   if (*cp == '/'  cp[1])
-   last_slash = cp;
-   }
-   pathlen = cp - path;
-   if (last_slash) {
-   basename = last_slash + 1;
-   dirlen = last_slash - path;
+   cp = strrchr(path, '/');
+   pathlen = 

Re: propagating repo corruption across clone

2013-03-26 Thread Jonathan Nieder
Hi,

Rich Fromm wrote:

The host executing the clone
 command is different than the the host on which the remote repository lives,
 and I am using ssh as a transport protocol.  If there is corruption, can I
 or can I not expect the clone operation to fail and return a non-zero exit
 value?  If I can not expect this, is the workaround to run `git fsck` on the
 resulting clone?

Is the [transfer] fsckObjects configuration on the host executing the
clone set to true?
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 3/4] attr.c::path_matches(): special case paths that end with a slash

2013-03-26 Thread Duy Nguyen
On Wed, Mar 27, 2013 at 4:33 AM, Jeff King p...@peff.net wrote:
 Hmm. match_pathname does have this:

 /*
  * baselen does not count the trailing slash. base[] may or
  * may not end with a trailing slash though.
  */
 if (pathlen  baselen + 1 ||
 (baselen  pathname[baselen] != '/') ||
 strncmp_icase(pathname, base, baselen))
 return 0;

 which seems to imply that the trailing slash is important here, and that
 we should not drop it when passing the path to match_pathname. I'm
 still trying to figure out exactly what it is that the extra slash check
 is for, and whether it might not have the same problem.

The may not end with a trailing slash can only happen when baselen
== 0. And that rule is documented in code, at this line in
last_exclude_matching_from_list:

assert(x-baselen == 0 || x-base[x-baselen - 1] == '/');
-- 
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/4] attr.c::path_matches(): the basename is part of the pathname

2013-03-26 Thread Duy Nguyen
On Wed, Mar 27, 2013 at 1:49 AM, Jeff King p...@peff.net wrote:
 On Tue, Mar 26, 2013 at 11:39:28AM -0700, Junio C Hamano wrote:

 The function takes two strings (pathname and basename) as if they
 are independent strings, but in reality, the latter is always
 pointing into a substring in the former.

 Clarify this relationship by expressing the latter as an offset into
 the former.

 Signed-off-by: Junio C Hamano gits...@pobox.com

 This is a huge improvement in maintainability. My initial fix attempt
 was to just xstrdup() the strings (knowing that the performance would be
 horrible, but I was still investigating correctness issues at that
 point). And of course I ran into this same issue as I tried to make a
 copy of pathname.

 So even without the rest of the fix, this is definitely a good idea. :)

match_{base,path}name and their exclude callers do the same thing. I
guess I'm used to it and did not see the maintainability issue. Maybe
we should do the same there too.
-- 
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


More detailed error message for 403 forbidden.

2013-03-26 Thread Yi, EungJun
Currently, if user tried to access a git repository via HTTP and it
fails because the user's permission is not enough to access the
repository, git client tells that http request failed and the error
was 403 forbidden.

But It is not enough for user to understand why it fails, especially
if the user don't know the username because git-credential-osxkeychain
authenticate implicitly without user knowing.

It would be much better if git client shows response body which might
include an explanation of the failure. For example,

before:

$ git clone http://localhost/foo/bar
error: The requested URL returned error: 403 while accessing
http://localhost/foo/bar
fatal: HTTP request failed

after:

$ git clone http://localhost/foo/bar
error: The requested URL returned error: 403 while accessing
http://localhost/foo/bar
remote: User 'me' does not have enough permission to access the repository.
fatal: HTTP request failed
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: propagating repo corruption across clone

2013-03-26 Thread Junio C Hamano
Rich Fromm richard_fr...@yahoo.com writes:

 Jeff King wrote
 Fundamentally the problem is
 that the --local transport is not safe from propagating corruption, and
 should not be used if that's a requirement.

 I've read Jeff Mitchell's blog post, his update, relevant parts of the
 git-clone(1) man page, and a decent chunk of this thread, and I'm still not
 clear on one thing.  Is the danger of `git clone --mirror` propagating
 corruption only true when using the --local option ?

If you use --local, that is equivalent to cp -R.  Your corruption
in the source will faithfully be byte-for-byte copied to the
destination.  If you do not (and in your case you have two different
machines), unless you are using the long deprecated rsync transport
(which again is the same as cp -R), transport layer will notice
object corruption.  See Jeff's analysis earlier in the thread.

If you are lucky (or unlucky, depending on how you look at it), the
corruption you have in your object store may affect objects that are
needed to check out the version at the tip of the history, and git
checkout that happens as the last step of cloning may loudly
complain, but that just means you can immediately notice the
breakage in that case.  You may be unlucky and the corruption may
not affect objects that are needed to check out the tip. The initial
checkout will succeed as if nothing is wrong, but the corruption in
your object store is still there nevertheless.  git log -p --all
or git fsck will certainly be unhappy.

The difference between --mirror and no --mirror is a red herring.
You may want to ask Jeff Mitchell to remove the mention of it; it
only adds to the confusion without helping users.  If you made
byte-for-byte copy of corrupt repository, it wouldn't make any
difference if the first checkout notices it.

To be paranoid, you may want to set transfer.fsckObjects to true,
perhaps in your ~/.gitconfig.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/4] attribute regression fix for maint-1.8.1 and upward

2013-03-26 Thread Junio C Hamano
Duy Nguyen pclo...@gmail.com writes:

 How about this? git_check_attr() now takes dtype as an argument
 and the caller must not add the trailing slash.  This could be
 split into two patches, one for git_check_attr prototype change,
 and the other the real meat.

git check-attr fundamentally cannot know, but aside from that do
all the callsites know if the path in question is a directory or
not?  My impression was that there are some cases you do not
necessarily know.

Add slash when you _know_ it is a directory, but otherwise pass the
path without trailing slash. is easier to understand than Pass
04 if you know it is a directory, but otherwise pass 100644,
exactly because otherwise in both of these instructions include
the case where the path in question _is_ a directory (you just do
not know what it is).

I do not particularly like the trailing slash on the basename
approach, but it feels less bad than passing dtype down.

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/4] attribute regression fix for maint-1.8.1 and upward

2013-03-26 Thread Duy Nguyen
On Wed, Mar 27, 2013 at 10:57 AM, Junio C Hamano gits...@pobox.com wrote:
 Duy Nguyen pclo...@gmail.com writes:

 How about this? git_check_attr() now takes dtype as an argument
 and the caller must not add the trailing slash.  This could be
 split into two patches, one for git_check_attr prototype change,
 and the other the real meat.

 git check-attr fundamentally cannot know, but aside from that do
 all the callsites know if the path in question is a directory or
 not?  My impression was that there are some cases you do not
 necessarily know.

 Add slash when you _know_ it is a directory, but otherwise pass the
 path without trailing slash. is easier to understand than Pass
 04 if you know it is a directory, but otherwise pass 100644,
 exactly because otherwise in both of these instructions include
 the case where the path in question _is_ a directory (you just do
 not know what it is).

 I do not particularly like the trailing slash on the basename
 approach, but it feels less bad than passing dtype down.

Fair enough. I'll rebase my changes on top of yours as long term
cleanup. Maybe I can make nwildmatch take patternlen too.
-- 
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v3] checkout: avoid unnecessary match_pathspec calls

2013-03-26 Thread Nguyễn Thái Ngọc Duy
In checkout_paths() we do this

 - for all updated items, call match_pathspec
 - for all items, call match_pathspec (inside unmerge_cache)
 - for all items, call match_pathspec (for showing path .. is unmerged)
 - for updated items, call match_pathspec and update paths

That's a lot of duplicate match_pathspec(s) and the function is not
exactly cheap to be called so many times, especially on large indexes.
This patch makes it call match_pathspec once per updated index entry,
save the result in ce_flags and reuse the results in the following
loops.

The changes in 0a1283b (checkout $tree $path: do not clobber local
changes in $path not in $tree - 2011-09-30) limit the affected paths
to ones we read from $tree. We do not do anything to other modified
entries in this case, so the for all items above could be modified
to for all updated items. But..

The command's behavior now is modified slightly: unmerged entries that
match $path, but not updated by $tree, are now NOT touched.  Although
this should be considered a bug fix, not a regression. A new test is
added for this change.

And while at there, free ps_matched after use.

The following command is tested on webkit, 215k entries. The pattern
is chosen mainly to make match_pathspec sweat:

git checkout -- *[a-zA-Z]*[a-zA-Z]*[a-zA-Z]*

before  after
real0m3.493s0m2.737s
user0m2.239s0m1.586s
sys 0m1.252s0m1.151s

Signed-off-by: Nguyễn Thái Ngọc Duy pclo...@gmail.com
---
 Changes from v2: a new test and some note about matching twice in
 checkout $tree $path, once in read_tree_some and once checkout_paths.
 We may be able avoid match_pathspec entirely in this case when
 tree_entry_interesting learns to fill ps_matched.

 builtin/checkout.c| 43 ---
 cache.h   |  1 +
 resolve-undo.c| 19 ++-
 resolve-undo.h|  1 +
 t/t2022-checkout-paths.sh | 21 +
 5 files changed, 77 insertions(+), 8 deletions(-)

diff --git a/builtin/checkout.c b/builtin/checkout.c
index a9c1b5a..f8033f4 100644
--- a/builtin/checkout.c
+++ b/builtin/checkout.c
@@ -271,24 +271,55 @@ static int checkout_paths(const struct checkout_opts 
*opts,
;
ps_matched = xcalloc(1, pos);
 
+   /*
+* Make sure all pathspecs participated in locating the paths
+* to be checked out.
+*/
for (pos = 0; pos  active_nr; pos++) {
struct cache_entry *ce = active_cache[pos];
+   ce-ce_flags = ~CE_MATCHED;
if (opts-source_tree  !(ce-ce_flags  CE_UPDATE))
+   /*
+* git checkout tree-ish -- path, but this entry
+* is in the original index; it will not be checked
+* out to the working tree and it does not matter
+* if pathspec matched this entry.  We will not do
+* anything to this entry at all.
+*/
continue;
-   match_pathspec(opts-pathspec, ce-name, ce_namelen(ce), 0, 
ps_matched);
+   /*
+* Either this entry came from the tree-ish we are
+* checking the paths out of, or we are checking out
+* of the index.
+*
+* If it comes from the tree-ish, we already know it
+* matches the pathspec and could just stamp
+* CE_MATCHED to it from update_some(). But we still
+* need ps_matched and read_tree_recursive (and
+* eventually tree_entry_interesting) cannot fill
+* ps_matched yet. Once it can, we can avoid calling
+* match_pathspec() for _all_ entries when
+* opts-source_tree != NULL.
+*/
+   if (match_pathspec(opts-pathspec, ce-name, ce_namelen(ce),
+  0, ps_matched))
+   ce-ce_flags |= CE_MATCHED;
}
 
-   if (report_path_error(ps_matched, opts-pathspec, opts-prefix))
+   if (report_path_error(ps_matched, opts-pathspec, opts-prefix)) {
+   free(ps_matched);
return 1;
+   }
+   free(ps_matched);
 
/* checkout -m path to recreate conflicted state */
if (opts-merge)
-   unmerge_cache(opts-pathspec);
+   unmerge_marked_index(the_index);
 
/* Any unmerged paths? */
for (pos = 0; pos  active_nr; pos++) {
struct cache_entry *ce = active_cache[pos];
-   if (match_pathspec(opts-pathspec, ce-name, ce_namelen(ce), 0, 
NULL)) {
+   if (ce-ce_flags  CE_MATCHED) {
if (!ce_stage(ce))
continue;
if (opts-force) {
@@ -313,9 +344,7 @@ static int checkout_paths(const struct checkout_opts