Re: [RFC PATCH 4/7] merge-recursive: fix assumption that head tree being merged is HEAD

2018-06-05 Thread Elijah Newren
On Sun, Jun 3, 2018 at 8:19 PM, Junio C Hamano  wrote:
> Elijah Newren  writes:
>
>> `git merge-recursive` does a three-way merge between user-specified trees
>> base, head, and remote.  Since the user is allowed to specify head, we can
>> not necesarily assume that head == HEAD.
>>
>> We modify index_has_changes() to take an extra argument specifying the
>> tree to compare the index to.  If NULL, it will compare to HEAD.  We then
>> use this from merge-recursive to make sure we compare to the
>> user-specified head.
>>
>> Signed-off-by: Elijah Newren 
>> ---
>>
>> I'm really unsure where the index_has_changes() declaration should go;
>> I stuck it in tree.h, but is there a better spot?
>
> I think I saw you tried to lift an assumption that we're always
> working on the_index in a separate patch recently.  Should that
> logic apply also to this part of the codebase?  IOW, shouldn't
> index_has_changes() take a pointer to istate (as opposed to a
> function that uses the implicit the_index that should be named as
> "cache_has_changes()" or something?)
>
> I tend to think this function as part of the larger read-cache.c
> family whose definitions are in cache.h and accompanied by macros
> that are protected by NO_THE_INDEX_COMPATIBILITY_MACROS so if we
> were to move it elsewhere, I'd keep the header part as-is and
> implementation to read-cache.c to keep it together with the family,
> but I do not see a huge issue with the current placement, either.

That's good point; the goal to lift assumptions on the_index should
probably also apply here.  I'll make the change.
(And it was actually Duy's patch that I was reviewing, but close
enough.)   I'll take a look at moving it to read-cache.c as well.


[PATCHv1 0/2] git-p4: disable sync after submit

2018-06-05 Thread Luke Diamand
This is a small patch to git-p4 to disable the automatic sync after
submit.

In my day-to-day work, I have a central git-p4 repo which is
automatically kept up-to-date, so the repos where I actually work and
submit from don't even need the sync. I usually end up hitting Ctrl-C
partway through the sync.

I imagine other people with large git-p4 projects do something similar
and have the same problem.

This also updates Merland's recent submit-selection change so that both
options can be set via configuration rather than being set on the
command line.

Luke Diamand (2):
  git-p4: disable-rebase: allow setting this via configuration
  git-p4: add option to disable syncing of p4/master with p4

 Documentation/git-p4.txt | 13 -
 git-p4.py| 33 +
 2 files changed, 33 insertions(+), 13 deletions(-)

-- 
2.17.0.392.gdeb1a6e9b7



[PATCHv1 2/2] git-p4: add option to disable syncing of p4/master with p4

2018-06-05 Thread Luke Diamand
Add an option to the git-p4 submit command to disable syncing
with Perforce.

This is useful for the case where a git-p4 mirror has been setup
on a server somewhere, running from (e.g.) cron, and developers
then clone from this. Having the local cloned copy also sync
from Perforce just isn't useful.

Signed-off-by: Luke Diamand 
---
 Documentation/git-p4.txt |  8 
 git-p4.py| 31 ---
 2 files changed, 28 insertions(+), 11 deletions(-)

diff --git a/Documentation/git-p4.txt b/Documentation/git-p4.txt
index 3d83842e47..8f6a7543fd 100644
--- a/Documentation/git-p4.txt
+++ b/Documentation/git-p4.txt
@@ -369,6 +369,11 @@ These options can be used to modify 'git p4 submit' 
behavior.
 Disable the automatic rebase after all commits have been successfully
 submitted. Can also be set with git-p4.disableRebase.
 
+--disable-p4sync::
+Disable the automatic sync of p4/master from Perforce after commit have
+been submitted. Implies --disable-rebase. Can also be set with
+git-p4.disableP4Sync. Sync with origin/master still goes ahead if possible.
+
 Rebase options
 ~~
 These options can be used to modify 'git p4 rebase' behavior.
@@ -693,6 +698,9 @@ git-p4.conflict::
 git-p4.disableRebase::
 Do not rebase the tree against p4/master following a submit.
 
+git-p4.disableP4Sync::
+Do not sync p4/master with Perforce following a submit. Implies 
git-p4.disableRebase.
+
 IMPLEMENTATION DETAILS
 --
 * Changesets from p4 are imported using Git fast-import.
diff --git a/git-p4.py b/git-p4.py
index 5ab9421af8..b9e79f1d8b 100755
--- a/git-p4.py
+++ b/git-p4.py
@@ -1368,7 +1368,9 @@ def __init__(self):
  help="submit only the specified 
commit(s), one commit or xxx..xxx"),
 optparse.make_option("--disable-rebase", 
dest="disable_rebase", action="store_true",
  help="Disable rebase after submit is 
completed. Can be useful if you "
- "work from a local git branch that is not 
master")
+ "work from a local git branch that is not 
master"),
+optparse.make_option("--disable-p4sync", 
dest="disable_p4sync", action="store_true",
+ help="Skip perforce sync of p4/master 
after submit or shelve"),
 ]
 self.description = "Submit changes from git to the perforce depot."
 self.usage += " [name of git branch to submit into perforce depot]"
@@ -1380,6 +1382,7 @@ def __init__(self):
 self.update_shelve = list()
 self.commit = ""
 self.disable_rebase = gitConfigBool("git-p4.disableRebase")
+self.disable_p4sync = gitConfigBool("git-p4.disableP4Sync")
 self.prepare_p4_only = False
 self.conflict_behavior = None
 self.isWindows = (platform.system() == "Windows")
@@ -2240,11 +2243,14 @@ def run(self, args):
 sync = P4Sync()
 if self.branch:
 sync.branch = self.branch
-sync.run([])
+if self.disable_p4sync:
+sync.sync_origin_only()
+else:
+sync.run([])
 
-if self.disable_rebase is False:
-rebase = P4Rebase()
-rebase.rebase()
+if not self.disable_rebase:
+rebase = P4Rebase()
+rebase.rebase()
 
 else:
 if len(applied) == 0:
@@ -3324,6 +3330,14 @@ def importChanges(self, changes, shelved=False, 
origin_revision=0):
 print self.gitError.read()
 sys.exit(1)
 
+def sync_origin_only(self):
+if self.syncWithOrigin:
+self.hasOrigin = originP4BranchesExist()
+if self.hasOrigin:
+if not self.silent:
+print 'Syncing with origin first, using "git fetch origin"'
+system("git fetch origin")
+
 def importHeadRevision(self, revision):
 print "Doing initial import of %s from revision %s into %s" % (' 
'.join(self.depotPaths), revision, self.branch)
 
@@ -3402,12 +3416,7 @@ def run(self, args):
 else:
 self.refPrefix = "refs/heads/p4/"
 
-if self.syncWithOrigin:
-self.hasOrigin = originP4BranchesExist()
-if self.hasOrigin:
-if not self.silent:
-print 'Syncing with origin first, using "git fetch origin"'
-system("git fetch origin")
+self.sync_origin_only()
 
 branch_arg_given = bool(self.branch)
 if len(self.branch) == 0:
-- 
2.17.0.392.gdeb1a6e9b7



[PATCHv1 1/2] git-p4: disable-rebase: allow setting this via configuration

2018-06-05 Thread Luke Diamand
This just lets you set the --disable-rebase option with the
git configuration options git-p4.disableRebase. If you're
using this option, you probably want to set it all the time
for a given repo.

Signed-off-by: Luke Diamand 
---
 Documentation/git-p4.txt | 5 -
 git-p4.py| 2 +-
 2 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/Documentation/git-p4.txt b/Documentation/git-p4.txt
index e8452528fc..3d83842e47 100644
--- a/Documentation/git-p4.txt
+++ b/Documentation/git-p4.txt
@@ -367,7 +367,7 @@ These options can be used to modify 'git p4 submit' 
behavior.
 
 --disable-rebase::
 Disable the automatic rebase after all commits have been successfully
-submitted.
+submitted. Can also be set with git-p4.disableRebase.
 
 Rebase options
 ~~
@@ -690,6 +690,9 @@ git-p4.conflict::
Specify submit behavior when a conflict with p4 is found, as per
--conflict.  The default behavior is 'ask'.
 
+git-p4.disableRebase::
+Do not rebase the tree against p4/master following a submit.
+
 IMPLEMENTATION DETAILS
 --
 * Changesets from p4 are imported using Git fast-import.
diff --git a/git-p4.py b/git-p4.py
index c4581d20a6..5ab9421af8 100755
--- a/git-p4.py
+++ b/git-p4.py
@@ -1379,7 +1379,7 @@ def __init__(self):
 self.shelve = False
 self.update_shelve = list()
 self.commit = ""
-self.disable_rebase = False
+self.disable_rebase = gitConfigBool("git-p4.disableRebase")
 self.prepare_p4_only = False
 self.conflict_behavior = None
 self.isWindows = (platform.system() == "Windows")
-- 
2.17.0.392.gdeb1a6e9b7



Re: [PATCHv1 2/2] git-p4: add option to disable syncing of p4/master with p4

2018-06-05 Thread Eric Sunshine
On Tue, Jun 5, 2018 at 4:29 AM Luke Diamand  wrote:
> Add an option to the git-p4 submit command to disable syncing
> with Perforce.
>
> This is useful for the case where a git-p4 mirror has been setup
> on a server somewhere, running from (e.g.) cron, and developers
> then clone from this. Having the local cloned copy also sync
> from Perforce just isn't useful.
>
> Signed-off-by: Luke Diamand 
> ---
> diff --git a/Documentation/git-p4.txt b/Documentation/git-p4.txt
> @@ -369,6 +369,11 @@ These options can be used to modify 'git p4 submit' 
> behavior.
> +--disable-p4sync::
> +Disable the automatic sync of p4/master from Perforce after commit have

s/commit/commits/

> +been submitted. Implies --disable-rebase. Can also be set with
> +git-p4.disableP4Sync. Sync with origin/master still goes ahead if 
> possible.
> diff --git a/git-p4.py b/git-p4.py
> @@ -1368,7 +1368,9 @@ def __init__(self):
> +optparse.make_option("--disable-p4sync", 
> dest="disable_p4sync", action="store_true",
> + help="Skip perforce sync of p4/master 
> after submit or shelve"),

s/perforce/Perforce/


[PATCHv1 1/1] git-p4: better error reporting when p4 fails

2018-06-05 Thread Luke Diamand
Currently when p4 fails to run, git-p4 just crashes with an obscure
error message.

For example, if the P4 ticket has expired, you get:

  Error: Cannot locate perforce checkout of  in client view

This change checks whether git-p4 can talk to the Perforce server when
the first P4 operation is attempted, and tries to print a meaningful
error message if it fails.

Signed-off-by: Luke Diamand 
---
 git-p4.py | 55 +
 t/t9833-errors.sh | 78 +++
 2 files changed, 133 insertions(+)
 create mode 100755 t/t9833-errors.sh

diff --git a/git-p4.py b/git-p4.py
index b9e79f1d8b..66652f8280 100755
--- a/git-p4.py
+++ b/git-p4.py
@@ -50,6 +50,8 @@ def __str__(self):
 # Grab changes in blocks of this many revisions, unless otherwise requested
 defaultBlockSize = 512
 
+p4_access_checked = False
+
 def p4_build_cmd(cmd):
 """Build a suitable p4 command line.
 
@@ -91,6 +93,13 @@ def p4_build_cmd(cmd):
 real_cmd = ' '.join(real_cmd) + ' ' + cmd
 else:
 real_cmd += cmd
+
+# now check that we can actually talk to the server
+global p4_access_checked
+if not p4_access_checked:
+p4_access_checked = True
+p4_check_access()
+
 return real_cmd
 
 def git_dir(path):
@@ -264,6 +273,52 @@ def p4_system(cmd):
 if retcode:
 raise CalledProcessError(retcode, real_cmd)
 
+def die_bad_access(s):
+die("failure accessing depot: {0}".format(s.rstrip()))
+
+def p4_check_access(min_expiration=1):
+""" Check if we can access Perforce - account still logged in
+"""
+results = p4CmdList(["login", "-s"])
+
+if len(results) == 0:
+# should never get here: always get either some results, or a 
p4ExitCode
+assert("could not parse response from perforce")
+
+result = results[0]
+
+if 'p4ExitCode' in result:
+# p4 returned non-zero status, e.g. P4PORT invalid, or p4 not in path
+die_bad_access("could not run p4")
+
+code = result.get("code")
+if not code:
+# we get here if we couldn't connect and there was nothing to unmarshal
+die_bad_access("could not connect")
+
+elif code == "stat":
+expiry = result.get("TicketExpiration")
+if expiry:
+expiry = int(expiry)
+if expiry > min_expiration:
+# ok to carry on
+return
+else:
+die_bad_access("perforce ticket expires in {0} 
seconds".format(expiry))
+
+else:
+# account without a timeout - all ok
+return
+
+elif code == "error":
+data = result.get("data")
+if data:
+die_bad_access("p4 error: {0}".format(data))
+else:
+die_bad_access("unknown error")
+else:
+die_bad_access("unknown error code {0}".format(code))
+
 _p4_version_string = None
 def p4_version_string():
 """Read the version string, showing just the last line, which
diff --git a/t/t9833-errors.sh b/t/t9833-errors.sh
new file mode 100755
index 00..9ba892de7a
--- /dev/null
+++ b/t/t9833-errors.sh
@@ -0,0 +1,78 @@
+#!/bin/sh
+
+test_description='git p4 errors'
+
+. ./lib-git-p4.sh
+
+test_expect_success 'start p4d' '
+   start_p4d
+'
+
+test_expect_success 'add p4 files' '
+   (
+   cd "$cli" &&
+   echo file1 >file1 &&
+   p4 add file1 &&
+   p4 submit -d "file1"
+   )
+'
+
+# after this test, the default user requires a password
+test_expect_success 'error handling' '
+   git p4 clone --dest="$git" //depot@all &&
+   (
+   cd "$git" &&
+   P4PORT=: test_must_fail git p4 submit 2>errmsg
+   ) &&
+   p4 passwd -P newpassword &&
+   (
+   P4PASSWD=badpassword test_must_fail git p4 clone //depot/foo 
2>errmsg &&
+   grep -q "failure accessing depot.*P4PASSWD" errmsg
+   )
+'
+
+test_expect_success 'ticket logged out' '
+   P4TICKETS="$cli/tickets" &&
+   echo "newpassword" | p4 login &&
+   (
+   cd "$git" &&
+   test_commit "ticket-auth-check" &&
+   p4 logout &&
+   test_must_fail git p4 submit 2>errmsg &&
+   grep -q "failure accessing depot" errmsg
+   )
+'
+
+test_expect_success 'create group with short ticket expiry' '
+   P4TICKETS="$cli/tickets" &&
+   echo "newpassword" | p4 login &&
+   p4_add_user short_expiry_user &&
+   p4 -u short_expiry_user passwd -P password &&
+   p4 group -i <<-EOF &&
+   Group: testgroup
+   Timeout: 3
+   Users: short_expiry_user
+   EOF
+
+   p4 users | grep short_expiry_user
+'
+
+test_expect_success 'git operation with expired ticket' '
+   P4TICKETS="$cli/tickets" &&
+   P4USER=short_expiry_user &&
+   echo "password" | p4 login &&
+   (
+   cd "$git" &&
+   git p4 sync &&
+   sl

[PATCHv1 0/1] git-p4: better error reporting

2018-06-05 Thread Luke Diamand
If git-p4 cannot talk to the Perforce server, it will either give a
confusing error message, or just crash. Even I get tripped up by this.

This change just checks that the client can talk to the server, and in
particular that the user is logged in (the default login timeout is 12
hours).

It would be nice to also check for ticket expiration during long
Perforce operations, but this change does not attempt to do that.

Luke Diamand (1):
  git-p4: better error reporting when p4 fails

 git-p4.py | 55 +
 t/t9833-errors.sh | 78 +++
 2 files changed, 133 insertions(+)
 create mode 100755 t/t9833-errors.sh

-- 
2.17.0.392.gdeb1a6e9b7



[PATCHv1 3/3] git-p4: auto-size the block

2018-06-05 Thread Luke Diamand
git-p4 originally would fetch changes in one query. On large repos this
could fail because of the limits that Perforce imposes on the number of
items returned and the number of queries in the database.

To fix this, git-p4 learned to query changes in blocks of 512 changes,
However, this can be very slow - if you have a few million changes,
with each chunk taking about a second, it can be an hour or so.

Although it's possible to tune this value manually with the
"--changes-block-size" option, it's far from obvious to ordinary users
that this is what needs doing.

This change alters the block size dynamically by looking for the
specific error messages returned from the Perforce server, and reducing
the block size if the error is seen, either to the limit reported by the
server, or to half the current block size.

That means we can start out with a very large block size, and then let
it automatically drop down to a value that works without error, while
still failing correctly if some other error occurs.

Signed-off-by: Luke Diamand 
---
 git-p4.py   | 26 +-
 t/t9818-git-p4-block.sh | 11 +++
 2 files changed, 32 insertions(+), 5 deletions(-)

diff --git a/git-p4.py b/git-p4.py
index b337562b39..6736f81b60 100755
--- a/git-p4.py
+++ b/git-p4.py
@@ -48,7 +48,8 @@ def __str__(self):
 defaultLabelRegexp = r'[a-zA-Z0-9_\-.]+$'
 
 # Grab changes in blocks of this many revisions, unless otherwise requested
-defaultBlockSize = 512
+# The code should now automatically reduce the block size if it is required
+defaultBlockSize = 1<<20
 
 p4_access_checked = False
 
@@ -969,7 +970,8 @@ def p4ChangesForPaths(depotPaths, changeRange, 
requestedBlockSize):
 changes = set()
 
 # Retrieve changes a block at a time, to prevent running
-# into a MaxResults/MaxScanRows error from the server.
+# into a MaxResults/MaxScanRows error from the server. If
+# we _do_ hit one of those errors, turn down the block size
 
 while True:
 cmd = ['changes']
@@ -983,10 +985,24 @@ def p4ChangesForPaths(depotPaths, changeRange, 
requestedBlockSize):
 for p in depotPaths:
 cmd += ["%s...@%s" % (p, revisionRange)]
 
+# fetch the changes
+try:
+result = p4CmdList(cmd, errors_as_exceptions=True)
+except P4RequestSizeException as e:
+if not block_size:
+block_size = e.limit
+elif block_size > e.limit:
+block_size = e.limit
+else:
+block_size = max(2, block_size // 2)
+
+if verbose: print("block size error, retry with block size 
{0}".format(block_size))
+continue
+except P4Exception as e:
+die('Error retrieving changes description 
({0})'.format(e.p4ExitCode))
+
 # Insert changes in chronological order
-for entry in reversed(p4CmdList(cmd)):
-if entry.has_key('p4ExitCode'):
-die('Error retrieving changes descriptions 
({})'.format(entry['p4ExitCode']))
+for entry in reversed(result):
 if not entry.has_key('change'):
 continue
 changes.add(int(entry['change']))
diff --git a/t/t9818-git-p4-block.sh b/t/t9818-git-p4-block.sh
index 8840a183ac..e5ec9cdec8 100755
--- a/t/t9818-git-p4-block.sh
+++ b/t/t9818-git-p4-block.sh
@@ -129,6 +129,7 @@ test_expect_success 'Create a repo with multiple depot 
paths' '
 '
 
 test_expect_success 'Clone repo with multiple depot paths' '
+   test_when_finished cleanup_git &&
(
cd "$git" &&
git p4 clone --changes-block-size=4 //depot/pathA@all 
//depot/pathB@all \
@@ -138,6 +139,16 @@ test_expect_success 'Clone repo with multiple depot paths' 
'
)
 '
 
+test_expect_success 'Clone repo with self-sizing block size' '
+   test_when_finished cleanup_git &&
+   git p4 clone --changes-block-size=100 //depot@all 
--destination="$git" &&
+   (
+   cd "$git" &&
+   git log --oneline > log &&
+   test_line_count \> 10 log
+   )
+'
+
 test_expect_success 'kill p4d' '
kill_p4d
 '
-- 
2.17.0.392.gdeb1a6e9b7



[PATCHv1 1/3] git-p4: raise exceptions from p4CmdList based on error from p4 server

2018-06-05 Thread Luke Diamand
This change lays some groundwork for better handling of rowcount errors
from the server, where it fails to send us results because we requested
too many.

It adds an option to p4CmdList() to return errors as a Python exception.

The exceptions are derived from P4Exception (something went wrong),
P4ServerException (the server sent us an error code) and
P4RequestSizeException (we requested too many rows/results from the
server database).

This makes makes the code that handles the errors a bit easier.

The default behavior is unchanged; the new code is enabled with a flag.

Signed-off-by: Luke Diamand 
---
 git-p4.py | 44 
 1 file changed, 40 insertions(+), 4 deletions(-)

diff --git a/git-p4.py b/git-p4.py
index 66652f8280..d856078ec8 100755
--- a/git-p4.py
+++ b/git-p4.py
@@ -566,10 +566,30 @@ def isModeExec(mode):
 # otherwise False.
 return mode[-3:] == "755"
 
+class P4Exception(Exception):
+""" Base class for exceptions from the p4 client """
+def __init__(self, exit_code):
+self.p4ExitCode = exit_code
+
+class P4ServerException(Exception):
+""" Base class for exceptions where we get some kind of marshalled up 
result from the server """
+def __init__(self, exit_code, p4_result):
+super(P4ServerException, self).__init__(exit_code)
+self.p4_result = p4_result
+self.code = p4_result[0]['code']
+self.data = p4_result[0]['data']
+
+class P4RequestSizeException(P4ServerException):
+""" One of the maxresults or maxscanrows errors """
+def __init__(self, exit_code, p4_result, limit):
+super(P4RequestSizeException, self).__init__(exit_code, p4_result)
+self.limit = limit
+
 def isModeExecChanged(src_mode, dst_mode):
 return isModeExec(src_mode) != isModeExec(dst_mode)
 
-def p4CmdList(cmd, stdin=None, stdin_mode='w+b', cb=None, skip_info=False):
+def p4CmdList(cmd, stdin=None, stdin_mode='w+b', cb=None, skip_info=False,
+errors_as_exceptions=False):
 
 if isinstance(cmd,basestring):
 cmd = "-G " + cmd
@@ -616,9 +636,25 @@ def p4CmdList(cmd, stdin=None, stdin_mode='w+b', cb=None, 
skip_info=False):
 pass
 exitCode = p4.wait()
 if exitCode != 0:
-entry = {}
-entry["p4ExitCode"] = exitCode
-result.append(entry)
+if errors_as_exceptions:
+if len(result) > 0:
+data = result[0].get('data')
+if data:
+m = re.search('Too many rows scanned \(over (\d+)\)', data)
+if not m:
+m = re.search('Request too large \(over (\d+)\)', data)
+
+if m:
+limit = int(m.group(1))
+raise P4RequestSizeException(exitCode, result, limit)
+
+raise P4ServerException(exitCode, result)
+else:
+raise P4Exception(exitCode)
+else:
+entry = {}
+entry["p4ExitCode"] = exitCode
+result.append(entry)
 
 return result
 
-- 
2.17.0.392.gdeb1a6e9b7



[PATCHv1 2/3] git-p4: narrow the scope of exceptions caught when parsing an int

2018-06-05 Thread Luke Diamand
The current code traps all exceptions around some code which parses an
integer, and then talks to Perforce.

That can result in errors from Perforce being ignored. Change the code
to only catch the integer conversion exceptions.

Signed-off-by: Luke Diamand 
---
 git-p4.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/git-p4.py b/git-p4.py
index d856078ec8..b337562b39 100755
--- a/git-p4.py
+++ b/git-p4.py
@@ -959,7 +959,7 @@ def p4ChangesForPaths(depotPaths, changeRange, 
requestedBlockSize):
 try:
 (changeStart, changeEnd) = p4ParseNumericChangeRange(parts)
 block_size = chooseBlockSize(requestedBlockSize)
-except:
+except ValueError:
 changeStart = parts[0][1:]
 changeEnd = parts[1]
 if requestedBlockSize:
-- 
2.17.0.392.gdeb1a6e9b7



[PATCHv1 0/3] git-p4: improvements to p4 "blocking"

2018-06-05 Thread Luke Diamand
This patchset improves the way that git-p4 sends requests in "blocks".

The problem comes about because the Perforce server limits the maximum
number of results it will send back (there are actually two different
limits).

This causes git-p4 failures if you ask for too much data.

In commit 96b2d54aee ("git-p4: use -m when running p4 changes",
2015-04-20), git-p4 learned to make requests in blocks, by default 512
in size.

The problem with this, however, is that it can be very slow on large
repositories - you might have millions of commits, although only a
handful actually relate to the code you are trying to clone. Each 512
block request takes a sizeable fraction of a second to complete.

There's a command-line option to increase the block size, but unless you
know about it, it won't be at all obvious that you could use this to
speed things up.

This change
~~~

The function p4CmdList() has been taught to raise an exception when it
gets an error, and in particular, to notice and report the two "too many
results" errors.

The code that does the blocking can now start out with a nice large
block size, and then reduce it if it sees an error.

Luke Diamand (3):
  git-p4: raise exceptions from p4CmdList based on error from p4 server
  git-p4: narrow the scope of exceptions caught when parsing an int
  git-p4: auto-size the block

 git-p4.py   | 72 +++--
 t/t9818-git-p4-block.sh | 11 +++
 2 files changed, 73 insertions(+), 10 deletions(-)

-- 
2.17.0.392.gdeb1a6e9b7



Re: [PATCHv1 1/3] git-p4: raise exceptions from p4CmdList based on error from p4 server

2018-06-05 Thread Eric Sunshine
On Tue, Jun 5, 2018 at 5:14 AM Luke Diamand  wrote:
> This change lays some groundwork for better handling of rowcount errors
> from the server, where it fails to send us results because we requested
> too many.
>
> It adds an option to p4CmdList() to return errors as a Python exception.
>
> The exceptions are derived from P4Exception (something went wrong),
> P4ServerException (the server sent us an error code) and
> P4RequestSizeException (we requested too many rows/results from the
> server database).
>
> This makes makes the code that handles the errors a bit easier.
>
> The default behavior is unchanged; the new code is enabled with a flag.
>
> Signed-off-by: Luke Diamand 
> ---
> diff --git a/git-p4.py b/git-p4.py
> @@ -566,10 +566,30 @@ def isModeExec(mode):
> +class P4ServerException(Exception):
> +""" Base class for exceptions where we get some kind of marshalled up 
> result from the server """
> +def __init__(self, exit_code, p4_result):
> +super(P4ServerException, self).__init__(exit_code)
> +self.p4_result = p4_result
> +self.code = p4_result[0]['code']
> +self.data = p4_result[0]['data']

The subsequent patches never seem to access any of these fields, so
it's difficult to judge whether it's worthwhile having 'code' and
'data' bits split out like this.

> @@ -616,9 +636,25 @@ def p4CmdList(cmd, stdin=None, stdin_mode='w+b', 
> cb=None, skip_info=False):
>  if exitCode != 0:
> -entry = {}
> -entry["p4ExitCode"] = exitCode
> -result.append(entry)
> +if errors_as_exceptions:
> +if len(result) > 0:
> +data = result[0].get('data')
> +if data:
> +m = re.search('Too many rows scanned \(over (\d+)\)', 
> data)
> +if not m:
> +m = re.search('Request too large \(over (\d+)\)', 
> data)

Does 'p4' localize these error messages?

> +if m:
> +limit = int(m.group(1))
> +raise P4RequestSizeException(exitCode, result, limit)
> +
> +raise P4ServerException(exitCode, result)
> +else:
> +raise P4Exception(exitCode)
> +else:
> +entry = {}
> +entry["p4ExitCode"] = exitCode
> +result.append(entry)


Re: [PATCHv1 3/3] git-p4: auto-size the block

2018-06-05 Thread Eric Sunshine
On Tue, Jun 5, 2018 at 5:14 AM Luke Diamand  wrote:
> git-p4 originally would fetch changes in one query. On large repos this
> could fail because of the limits that Perforce imposes on the number of
> items returned and the number of queries in the database.
>
> To fix this, git-p4 learned to query changes in blocks of 512 changes,
> However, this can be very slow - if you have a few million changes,
> with each chunk taking about a second, it can be an hour or so.
>
> Although it's possible to tune this value manually with the
> "--changes-block-size" option, it's far from obvious to ordinary users
> that this is what needs doing.
>
> This change alters the block size dynamically by looking for the
> specific error messages returned from the Perforce server, and reducing
> the block size if the error is seen, either to the limit reported by the
> server, or to half the current block size.
>
> That means we can start out with a very large block size, and then let
> it automatically drop down to a value that works without error, while
> still failing correctly if some other error occurs.
>
> Signed-off-by: Luke Diamand 
> ---
> diff --git a/git-p4.py b/git-p4.py
> @@ -48,7 +48,8 @@ def __str__(self):
>  # Grab changes in blocks of this many revisions, unless otherwise requested
> -defaultBlockSize = 512
> +# The code should now automatically reduce the block size if it is required
> +defaultBlockSize = 1<<20

As worded, the new comment only has value to someone who knew the old
behavior (before this patch), not for someone reading the code fresh.
However, if reworded, it might be meaningful to all readers (new and
old):

# The block size is reduced automatically if required

> @@ -983,10 +985,24 @@ def p4ChangesForPaths(depotPaths, changeRange, 
> requestedBlockSize):
> +try:
> +result = p4CmdList(cmd, errors_as_exceptions=True)
> +except P4RequestSizeException as e:
> +if not block_size:
> +block_size = e.limit
> +elif block_size > e.limit:
> +block_size = e.limit
> +else:
> +block_size = max(2, block_size // 2)
> +
> +if verbose: print("block size error, retry with block size 
> {0}".format(block_size))

Perhaps: s/retry/retrying/

> diff --git a/t/t9818-git-p4-block.sh b/t/t9818-git-p4-block.sh
> @@ -129,6 +129,7 @@ test_expect_success 'Create a repo with multiple depot 
> paths' '
> +test_expect_success 'Clone repo with self-sizing block size' '
> +   test_when_finished cleanup_git &&
> +   git p4 clone --changes-block-size=100 //depot@all 
> --destination="$git" &&
> +   (
> +   cd "$git" &&
> +   git log --oneline > log &&
> +   test_line_count \> 10 log
> +   )
> +'

Or, without a subshell (and dropping whitespace after redirection operator):

git -C git log --oneline >log &&
test_line_count \> 10 log

(not worth a re-roll)


Re: [PATCHv1 1/3] git-p4: raise exceptions from p4CmdList based on error from p4 server

2018-06-05 Thread Luke Diamand
On 5 June 2018 at 10:54, Eric Sunshine  wrote:
> On Tue, Jun 5, 2018 at 5:14 AM Luke Diamand  wrote:
>> This change lays some groundwork for better handling of rowcount errors
>> from the server, where it fails to send us results because we requested
>> too many.
>>
>> It adds an option to p4CmdList() to return errors as a Python exception.
>>
>> The exceptions are derived from P4Exception (something went wrong),
>> P4ServerException (the server sent us an error code) and
>> P4RequestSizeException (we requested too many rows/results from the
>> server database).
>>
>> This makes makes the code that handles the errors a bit easier.
>>
>> The default behavior is unchanged; the new code is enabled with a flag.
>>
>> Signed-off-by: Luke Diamand 
>> ---
>> diff --git a/git-p4.py b/git-p4.py
>> @@ -566,10 +566,30 @@ def isModeExec(mode):
>> +class P4ServerException(Exception):
>> +""" Base class for exceptions where we get some kind of marshalled up 
>> result from the server """
>> +def __init__(self, exit_code, p4_result):
>> +super(P4ServerException, self).__init__(exit_code)
>> +self.p4_result = p4_result
>> +self.code = p4_result[0]['code']
>> +self.data = p4_result[0]['data']
>
> The subsequent patches never seem to access any of these fields, so
> it's difficult to judge whether it's worthwhile having 'code' and
> 'data' bits split out like this.

These changes don't use it, but I thought that future changes might
need them, and perhaps when I put that code in I was thinking I might
need it myself.

>
>> @@ -616,9 +636,25 @@ def p4CmdList(cmd, stdin=None, stdin_mode='w+b', 
>> cb=None, skip_info=False):
>>  if exitCode != 0:
>> -entry = {}
>> -entry["p4ExitCode"] = exitCode
>> -result.append(entry)
>> +if errors_as_exceptions:
>> +if len(result) > 0:
>> +data = result[0].get('data')
>> +if data:
>> +m = re.search('Too many rows scanned \(over (\d+)\)', 
>> data)
>> +if not m:
>> +m = re.search('Request too large \(over (\d+)\)', 
>> data)
>
> Does 'p4' localize these error messages?

That's a good question.

The marshalled-up error from Perforce looks like this:

 ([{'generic': 35, 'code': 'error', 'data': "Too many rows scanned
(over 40); see 'p4 help maxscanrows'.\n", 'severity': 3}])

It turns out that Perforce open-sourced the P4 client in 2014 (I only
recently found this out) so we can actually look at the code now!

https://swarm.workshop.perforce.com/projects/perforce_software-p4

Clone it like this:

mkdir p4 &&
(cd p4 && git init && git config --add git-p4.branchList p4/2018-1:2018-1) &&
P4USER=guest P4PORT=workshop.perforce.com:1666 git p4 clone
--detect-branches --destination p4  //guest/perforce_software/p4@all

Here's the code:

// ErrorId graveyard: retired/deprecated ErrorIds.

ErrorId MsgDb::MaxResults  = { ErrorOf( ES_DB, 32,
E_FAILED, EV_ADMIN, 1 ), "Request too large (over %maxResults%); see
'p4 help maxresults'." } ;//NOTRANS
ErrorId MsgDb::MaxScanRows = { ErrorOf( ES_DB, 61,
E_FAILED, EV_ADMIN, 1 ), "Too many rows scanned (over %maxScanRows%);
see 'p4 help maxscanrows'." } ;//NOTRANS


I don't think there's actually a way to make it return any language
other than English though. There's a P4LANGUAGE environment variable,
but it just says "this is for system integrators":

https://www.perforce.com/perforce/r15.2/manuals/cmdref/P4LANGUAGE.html

So I think probably the language is always English.

Luke


Re: [PATCHv1 1/1] git-p4: better error reporting when p4 fails

2018-06-05 Thread Luke Diamand
On 5 June 2018 at 11:49, Merland Romain  wrote:
>
>> @@ -91,6 +93,13 @@ def p4_build_cmd(cmd):
>> real_cmd = ' '.join(real_cmd) + ' ' + cmd
>> else:
>> real_cmd += cmd
>> +
>> +# now check that we can actually talk to the server
>> +global p4_access_checked
>> +if not p4_access_checked:
>> +p4_access_checked = True
>> +p4_check_access()
>> +
>
> Just switch the 2 lines 'p4_access_checked = True' and 'p4_check_access()'
> It seems to me more logical

Like this:

+p4_check_access()
+p4_access_checked = True

You need to set p4_access_checked first so that it doesn't go and try
to check the p4 access before running "p4 login -s", which would then
get stuck forever.

>
> Romain


[PATCH] t3200-branch.sh: use "--set-upstream-to" in test

2018-06-05 Thread Robert P. J. Day


Change deprecated "--set-upstream" branch option to
preferred "--set-upstream-to".

Signed-off-by: Robert P. J. Day 

---

  i'm assuming this should use "--set-upstream-to" as do all the
others.

diff --git a/t/t3200-branch.sh b/t/t3200-branch.sh
index 69ea8202f4..ef887a0b32 100755
--- a/t/t3200-branch.sh
+++ b/t/t3200-branch.sh
@@ -885,8 +885,8 @@ test_expect_success 'test --unset-upstream on a particular 
branch' '
test_must_fail git config branch.my14.merge
 '

-test_expect_success '--set-upstream fails' '
-test_must_fail git branch --set-upstream origin/master
+test_expect_success '--set-upstream-to fails' '
+test_must_fail git branch --set-upstream-to origin/master
 '

 test_expect_success '--set-upstream-to notices an error to set branch as own 
upstream' '

-- 


Robert P. J. Day Ottawa, Ontario, CANADA
  http://crashcourse.ca/dokuwiki

Twitter:   http://twitter.com/rpjday
LinkedIn:   http://ca.linkedin.com/in/rpjday



Re: [PATCH] t3200-branch.sh: use "--set-upstream-to" in test

2018-06-05 Thread SZEDER Gábor
> Change deprecated "--set-upstream" branch option to
> preferred "--set-upstream-to".
> 
> Signed-off-by: Robert P. J. Day 
> 
> ---
> 
>   i'm assuming this should use "--set-upstream-to" as do all the
> others.

I don't think so, see 52668846ea (builtin/branch: stop supporting the
"--set-upstream" option, 2017-08-17).

Though arguably the test name could be more descriptive and tell why
it should fail.

> diff --git a/t/t3200-branch.sh b/t/t3200-branch.sh
> index 69ea8202f4..ef887a0b32 100755
> --- a/t/t3200-branch.sh
> +++ b/t/t3200-branch.sh
> @@ -885,8 +885,8 @@ test_expect_success 'test --unset-upstream on a 
> particular branch' '
>   test_must_fail git config branch.my14.merge
>  '
> 
> -test_expect_success '--set-upstream fails' '
> -test_must_fail git branch --set-upstream origin/master
> +test_expect_success '--set-upstream-to fails' '
> +test_must_fail git branch --set-upstream-to origin/master
>  '
> 
>  test_expect_success '--set-upstream-to notices an error to set branch as own 
> upstream' '
> 
> -- 
> 
> 
> Robert P. J. Day Ottawa, Ontario, CANADA
>   http://crashcourse.ca/dokuwiki
> 
> Twitter:   http://twitter.com/rpjday
> LinkedIn:   http://ca.linkedin.com/in/rpjday
> 
> 


Re: [PATCH] t3200-branch.sh: use "--set-upstream-to" in test

2018-06-05 Thread Robert P. J. Day
On Tue, 5 Jun 2018, SZEDER Gábor wrote:

> > Change deprecated "--set-upstream" branch option to
> > preferred "--set-upstream-to".
> >
> > Signed-off-by: Robert P. J. Day 
> >
> > ---
> >
> >   i'm assuming this should use "--set-upstream-to" as do all the
> > others.
>
> I don't think so, see 52668846ea (builtin/branch: stop supporting
> the "--set-upstream" option, 2017-08-17).

  yes, you're right, i didn't look at the context enough. my bad.

rday

-- 


Robert P. J. Day Ottawa, Ontario, CANADA
  http://crashcourse.ca/dokuwiki

Twitter:   http://twitter.com/rpjday
LinkedIn:   http://ca.linkedin.com/in/rpjday


Git not creating merge commit when merging signed/annotated tag

2018-06-05 Thread Tim Friske
Hi Everyone,

ten days ago I asked on https://unix.stackexchange.com/ why Git is not creating 
a merge commit when merging a signed/annotated tag. Does someone has an answer 
to my question 
https://unix.stackexchange.com/questions/446154/git-not-creating-merge-commit-when-merging-signed-annotated-tag?

Thank you in advance!

– Tim


Re: Git not creating merge commit when merging signed/annotated tag

2018-06-05 Thread Ævar Arnfjörð Bjarmason


On Tue, Jun 05 2018, Tim Friske wrote:

> Hi Everyone,
>
> ten days ago I asked on https://unix.stackexchange.com/ why Git is not 
> creating a merge commit when merging a signed/annotated tag. Does someone has 
> an answer to my question 
> https://unix.stackexchange.com/questions/446154/git-not-creating-merge-commit-when-merging-signed-annotated-tag?
>
> Thank you in advance!

I believe my answer in this recent thread which brought up the same
topic applies to your situation as well:
https://public-inbox.org/git/CANgJU+VFCY0LNRLMSGtD7ScpcLaPFMzUOyw6Bjgk6q=kx9d...@mail.gmail.com/

Try to apply the patch I noted to builtin/merge.c there and see if you
don't get the same message printed out.


[GSoC][PATCH v2 1/1] rebase--interactive: rewrite append_todo_help() in C

2018-06-05 Thread Alban Gruin
This rewrites append_todo_help() from shell to C. It also incorporates
some parts of initiate_action() and complete_action() that also write
help texts to the todo file.

Two flags are added to rebase--helper.c: one to call
append_todo_help() (`--append-todo-help`), and another one to tell
append_todo_help() to write the help text suited for the edit-todo
mode (`--write-edit-todo`).

Finally, append_todo_help() is removed from git-rebase--interactive.sh
to use `rebase--helper --append-todo-help` instead.

Signed-off-by: Alban Gruin 
---
 builtin/rebase--helper.c   | 10 ++--
 git-rebase--interactive.sh | 52 ++--
 sequencer.c| 60 ++
 sequencer.h|  1 +
 4 files changed, 71 insertions(+), 52 deletions(-)

diff --git a/builtin/rebase--helper.c b/builtin/rebase--helper.c
index f7c2a5fdc..7ef92fbb6 100644
--- a/builtin/rebase--helper.c
+++ b/builtin/rebase--helper.c
@@ -12,12 +12,12 @@ static const char * const builtin_rebase_helper_usage[] = {
 int cmd_rebase__helper(int argc, const char **argv, const char *prefix)
 {
struct replay_opts opts = REPLAY_OPTS_INIT;
-   unsigned flags = 0, keep_empty = 0, rebase_merges = 0;
+   unsigned flags = 0, keep_empty = 0, rebase_merges = 0, edit_todo = 0;
int abbreviate_commands = 0, rebase_cousins = -1;
enum {
CONTINUE = 1, ABORT, MAKE_SCRIPT, SHORTEN_OIDS, EXPAND_OIDS,
CHECK_TODO_LIST, SKIP_UNNECESSARY_PICKS, REARRANGE_SQUASH,
-   ADD_EXEC
+   ADD_EXEC, APPEND_TODO_HELP
} command = 0;
struct option options[] = {
OPT_BOOL(0, "ff", &opts.allow_ff, N_("allow fast-forward")),
@@ -27,6 +27,8 @@ int cmd_rebase__helper(int argc, const char **argv, const 
char *prefix)
OPT_BOOL(0, "rebase-merges", &rebase_merges, N_("rebase merge 
commits")),
OPT_BOOL(0, "rebase-cousins", &rebase_cousins,
 N_("keep original branch points of cousins")),
+   OPT_BOOL(0, "write-edit-todo", &edit_todo,
+N_("append the edit-todo message to the todo-list")),
OPT_CMDMODE(0, "continue", &command, N_("continue rebase"),
CONTINUE),
OPT_CMDMODE(0, "abort", &command, N_("abort rebase"),
@@ -45,6 +47,8 @@ int cmd_rebase__helper(int argc, const char **argv, const 
char *prefix)
N_("rearrange fixup/squash lines"), REARRANGE_SQUASH),
OPT_CMDMODE(0, "add-exec-commands", &command,
N_("insert exec commands in todo list"), ADD_EXEC),
+   OPT_CMDMODE(0, "append-todo-help", &command,
+   N_("insert the help in the todo list"), 
APPEND_TODO_HELP),
OPT_END()
};
 
@@ -84,5 +88,7 @@ int cmd_rebase__helper(int argc, const char **argv, const 
char *prefix)
return !!rearrange_squash();
if (command == ADD_EXEC && argc == 2)
return !!sequencer_add_exec_commands(argv[1]);
+   if (command == APPEND_TODO_HELP && argc == 1)
+   return !!append_todo_help(edit_todo, keep_empty);
usage_with_options(builtin_rebase_helper_usage, options);
 }
diff --git a/git-rebase--interactive.sh b/git-rebase--interactive.sh
index 299ded213..94c23a7af 100644
--- a/git-rebase--interactive.sh
+++ b/git-rebase--interactive.sh
@@ -39,38 +39,6 @@ comment_for_reflog () {
esac
 }
 
-append_todo_help () {
-   gettext "
-Commands:
-p, pick  = use commit
-r, reword  = use commit, but edit the commit message
-e, edit  = use commit, but stop for amending
-s, squash  = use commit, but meld into previous commit
-f, fixup  = like \"squash\", but discard this commit's log message
-x, exec  = run command (the rest of the line) using shell
-d, drop  = remove commit
-l, label  = label current HEAD with a name
-t, reset  = reset HEAD to a label
-m, merge [-C  | -c ]  [# ]
-.   create a merge commit using the original merge commit's
-.   message (or the oneline, if no original merge commit was
-.   specified). Use -c  to reword the commit message.
-
-These lines can be re-ordered; they are executed from top to bottom.
-" | git stripspace --comment-lines >>"$todo"
-
-   if test $(get_missing_commit_check_level) = error
-   then
-   gettext "
-Do not remove any line. Use 'drop' explicitly to remove a commit.
-" | git stripspace --comment-lines >>"$todo"
-   else
-   gettext "
-If you remove a line here THAT COMMIT WILL BE LOST.
-" | git stripspace --comment-lines >>"$todo"
-   fi
-}
-
 die_abort () {
apply_autostash
rm -rf "$state_dir"
@@ -143,13 +111,7 @@ initiate_action () {
git stripspace --strip-comments <"$todo" >"$todo".new
mv -f "$todo".new "$todo"
collapse_todo_ids
-   

[GSoC][PATCH v2 0/1] rebase -i: rewrite append_todo_help() in C

2018-06-05 Thread Alban Gruin
This patch rewrites append_todo_help() from shell to C. The C version
covers a bit more than the old shell version. To achieve that, some
parameters were added to rebase--helper.

This is part of the effort to rewrite interactive rebase in C.

Changes since v1:

 - Renaming the parameter to insert the edit-todo message from
   `edit-todo` to `write-edit-todo`.

 - Clarifying the `write-edit-todo` help message.

 - Dropping the commit that removed newlines in the messages.

Alban Gruin (1):
  rebase--interactive: rewrite append_todo_help() in C

 builtin/rebase--helper.c   | 10 ++--
 git-rebase--interactive.sh | 52 ++--
 sequencer.c| 60 ++
 sequencer.h|  1 +
 4 files changed, 71 insertions(+), 52 deletions(-)

-- 
2.16.4



[PATCH v7 0/8] ambiguous checkout UI & checkout.defaultRemote

2018-06-05 Thread Ævar Arnfjörð Bjarmason
Fixes issues noted with v6, hopefully ready for queuing. A tbdiff with
v6:

1: ab4529d9f5 = 1: 2ca81c76fc checkout tests: index should be clean after dwim 
checkout
2: c8bbece403 = 2: 19b14a1c75 checkout.h: wrap the arguments to 
unique_tracking_name()
3: 881fe63f4f = 3: 8bc6a9c052 checkout.c: introduce an *_INIT macro
4: 72ddaeddd3 ! 4: 34f3b67f9b checkout.c: change "unique" member to 
"num_matches"
@@ -1,6 +1,6 @@
 Author: Ævar Arnfjörð Bjarmason 
 
-checkout.c]: change "unique" member to "num_matches"
+checkout.c: change "unique" member to "num_matches"
 
 Internally track how many matches we find in the check_tracking_name()
 callback. Nothing uses this now, but it will be made use of in a later
5: 5e8c82680b = 5: 7d81c06a23 checkout: pass the "num_matches" up to callers
6: 07e667f80a = 6: e86636ad2c builtin/checkout.c: use "ret" variable for return
7: 0a148182e6 ! 7: c2130b347c checkout: add advice for ambiguous "checkout 
"
@@ -27,6 +27,28 @@
 hint: you can do so by fully qualifying the name with the --track 
option:
 hint:
 hint: git checkout --track origin/
+
+Note that the "error: pathspec[...]" message is still printed. This is
+because whatever else checkout may have tried earlier, its final
+fallback is to try to resolve the argument as a path. E.g. in this
+case:
+
+$ ./git --exec-path=$PWD checkout master pu
+error: pathspec 'master' did not match any file(s) known to git.
+error: pathspec 'pu' did not match any file(s) known to git.
+
+There we don't print the "hint:" implicitly due to earlier logic
+around the DWIM fallback. That fallback is only used if it looks like
+we have one argument that might be a branch.
+
+I can't think of an intrinsic reason for why we couldn't in some
+future change skip printing the "error: pathspec[...]" error. However,
+to do so we'd need to pass something down to checkout_paths() to make
+it suppress printing an error on its own, and for us to be confident
+that we're not silencing cases where those errors are meaningful.
+
+I don't think that's worth it since determining whether that's the
+case could easily change due to future changes in the checkout logic.
 
 Signed-off-by: Ævar Arnfjörð Bjarmason 
 
8: f3a52a26a2 ! 8: f1ac0f7351 checkout & worktree: introduce 
checkout.defaultRemote
@@ -53,12 +53,12 @@
 
 $ ./git --exec-path=$PWD checkout master
 error: pathspec 'master' did not match any file(s) known to git.
-hint: The argument 'master' matched more than one remote tracking 
branch.
+hint: 'master' matched more than one remote tracking branch.
 hint: We found 26 remotes with a reference that matched. So we 
fell back
 hint: on trying to resolve the argument as a path, but failed 
there too!
 hint:
-hint: If you meant to check out a remote tracking branch on e.g. 
'origin'
-hint: you can do so by fully-qualifying the name with the --track 
option:
+hint: If you meant to check out a remote tracking branch on, e.g. 
'origin',
+hint: you can do so by fully qualifying the name with the --track 
option:
 hint:
 hint: git checkout --track origin/
 hint:
@@ -263,7 +263,7 @@
status_uno_is_clean &&
 -  test_i18ngrep ! "^hint: " stderr
 +  test_i18ngrep ! "^hint: " stderr &&
-+  # Make sure the likes of checkout -p don not print this hint
++  # Make sure the likes of checkout -p do not print this hint
 +  git checkout -p foo 2>stderr &&
 +  test_i18ngrep ! "^hint: " stderr &&
 +  status_uno_is_clean

Ævar Arnfjörð Bjarmason (8):
  checkout tests: index should be clean after dwim checkout
  checkout.h: wrap the arguments to unique_tracking_name()
  checkout.c: introduce an *_INIT macro
  checkout.c: change "unique" member to "num_matches"
  checkout: pass the "num_matches" up to callers
  builtin/checkout.c: use "ret" variable for return
  checkout: add advice for ambiguous "checkout "
  checkout & worktree: introduce checkout.defaultRemote

 Documentation/config.txt   | 26 +++
 Documentation/git-checkout.txt |  9 ++
 Documentation/git-worktree.txt |  9 ++
 advice.c   |  2 ++
 advice.h   |  1 +
 builtin/checkout.c | 41 ++-
 builtin/worktree.c |  4 +--
 checkout.c | 37 ++---
 checkout.h |  4 ++-
 t/t2024-checkout-dwim.sh   | 59 ++
 t/t2025-worktree-add.sh| 21 
 11 files changed, 197 insertions(+), 16 deletions(-)

-- 
2.17.0.290.g

[PATCH v7 4/8] checkout.c: change "unique" member to "num_matches"

2018-06-05 Thread Ævar Arnfjörð Bjarmason
Internally track how many matches we find in the check_tracking_name()
callback. Nothing uses this now, but it will be made use of in a later
change.

Signed-off-by: Ævar Arnfjörð Bjarmason 
---
 checkout.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/checkout.c b/checkout.c
index 80e430cda8..7662a39a62 100644
--- a/checkout.c
+++ b/checkout.c
@@ -7,10 +7,10 @@ struct tracking_name_data {
/* const */ char *src_ref;
char *dst_ref;
struct object_id *dst_oid;
-   int unique;
+   int num_matches;
 };
 
-#define TRACKING_NAME_DATA_INIT { NULL, NULL, NULL, 1 }
+#define TRACKING_NAME_DATA_INIT { NULL, NULL, NULL, 0 }
 
 static int check_tracking_name(struct remote *remote, void *cb_data)
 {
@@ -23,9 +23,9 @@ static int check_tracking_name(struct remote *remote, void 
*cb_data)
free(query.dst);
return 0;
}
+   cb->num_matches++;
if (cb->dst_ref) {
free(query.dst);
-   cb->unique = 0;
return 0;
}
cb->dst_ref = query.dst;
@@ -39,7 +39,7 @@ const char *unique_tracking_name(const char *name, struct 
object_id *oid)
cb_data.dst_oid = oid;
for_each_remote(check_tracking_name, &cb_data);
free(cb_data.src_ref);
-   if (cb_data.unique)
+   if (cb_data.num_matches == 1)
return cb_data.dst_ref;
free(cb_data.dst_ref);
return NULL;
-- 
2.17.0.290.gded63e768a



[PATCH v7 3/8] checkout.c: introduce an *_INIT macro

2018-06-05 Thread Ævar Arnfjörð Bjarmason
Add an *_INIT macro for the tracking_name_data similar to what exists
elsewhere in the codebase, e.g. OID_ARRAY_INIT in sha1-array.h. This
will make it more idiomatic in later changes to add more fields to the
struct & its initialization macro.

Signed-off-by: Ævar Arnfjörð Bjarmason 
---
 checkout.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/checkout.c b/checkout.c
index bdefc888ba..80e430cda8 100644
--- a/checkout.c
+++ b/checkout.c
@@ -10,6 +10,8 @@ struct tracking_name_data {
int unique;
 };
 
+#define TRACKING_NAME_DATA_INIT { NULL, NULL, NULL, 1 }
+
 static int check_tracking_name(struct remote *remote, void *cb_data)
 {
struct tracking_name_data *cb = cb_data;
@@ -32,7 +34,7 @@ static int check_tracking_name(struct remote *remote, void 
*cb_data)
 
 const char *unique_tracking_name(const char *name, struct object_id *oid)
 {
-   struct tracking_name_data cb_data = { NULL, NULL, NULL, 1 };
+   struct tracking_name_data cb_data = TRACKING_NAME_DATA_INIT;
cb_data.src_ref = xstrfmt("refs/heads/%s", name);
cb_data.dst_oid = oid;
for_each_remote(check_tracking_name, &cb_data);
-- 
2.17.0.290.gded63e768a



[PATCH v7 5/8] checkout: pass the "num_matches" up to callers

2018-06-05 Thread Ævar Arnfjörð Bjarmason
Pass the previously added "num_matches" struct value up to the callers
of unique_tracking_name(). This will allow callers to optionally print
better error messages in a later change.

Signed-off-by: Ævar Arnfjörð Bjarmason 
---
 builtin/checkout.c | 10 +++---
 builtin/worktree.c |  4 ++--
 checkout.c |  5 -
 checkout.h |  3 ++-
 4 files changed, 15 insertions(+), 7 deletions(-)

diff --git a/builtin/checkout.c b/builtin/checkout.c
index 2e1d2376d2..72457fb7d5 100644
--- a/builtin/checkout.c
+++ b/builtin/checkout.c
@@ -878,7 +878,8 @@ static int parse_branchname_arg(int argc, const char **argv,
int dwim_new_local_branch_ok,
struct branch_info *new_branch_info,
struct checkout_opts *opts,
-   struct object_id *rev)
+   struct object_id *rev,
+   int *dwim_remotes_matched)
 {
struct tree **source_tree = &opts->source_tree;
const char **new_branch = &opts->new_branch;
@@ -972,7 +973,8 @@ static int parse_branchname_arg(int argc, const char **argv,
recover_with_dwim = 0;
 
if (recover_with_dwim) {
-   const char *remote = unique_tracking_name(arg, rev);
+   const char *remote = unique_tracking_name(arg, rev,
+ 
dwim_remotes_matched);
if (remote) {
*new_branch = arg;
arg = remote;
@@ -1109,6 +,7 @@ int cmd_checkout(int argc, const char **argv, const char 
*prefix)
struct branch_info new_branch_info;
char *conflict_style = NULL;
int dwim_new_local_branch = 1;
+   int dwim_remotes_matched = 0;
struct option options[] = {
OPT__QUIET(&opts.quiet, N_("suppress progress reporting")),
OPT_STRING('b', NULL, &opts.new_branch, N_("branch"),
@@ -1219,7 +1222,8 @@ int cmd_checkout(int argc, const char **argv, const char 
*prefix)
opts.track == BRANCH_TRACK_UNSPECIFIED &&
!opts.new_branch;
int n = parse_branchname_arg(argc, argv, dwim_ok,
-&new_branch_info, &opts, &rev);
+&new_branch_info, &opts, &rev,
+&dwim_remotes_matched);
argv += n;
argc -= n;
}
diff --git a/builtin/worktree.c b/builtin/worktree.c
index 5c7d2bb180..a763dbdccb 100644
--- a/builtin/worktree.c
+++ b/builtin/worktree.c
@@ -412,7 +412,7 @@ static const char *dwim_branch(const char *path, const char 
**new_branch)
if (guess_remote) {
struct object_id oid;
const char *remote =
-   unique_tracking_name(*new_branch, &oid);
+   unique_tracking_name(*new_branch, &oid, NULL);
return remote;
}
return NULL;
@@ -484,7 +484,7 @@ static int add(int ac, const char **av, const char *prefix)
 
commit = lookup_commit_reference_by_name(branch);
if (!commit) {
-   remote = unique_tracking_name(branch, &oid);
+   remote = unique_tracking_name(branch, &oid, NULL);
if (remote) {
new_branch = branch;
branch = remote;
diff --git a/checkout.c b/checkout.c
index 7662a39a62..ee3a7e9c05 100644
--- a/checkout.c
+++ b/checkout.c
@@ -32,12 +32,15 @@ static int check_tracking_name(struct remote *remote, void 
*cb_data)
return 0;
 }
 
-const char *unique_tracking_name(const char *name, struct object_id *oid)
+const char *unique_tracking_name(const char *name, struct object_id *oid,
+int *dwim_remotes_matched)
 {
struct tracking_name_data cb_data = TRACKING_NAME_DATA_INIT;
cb_data.src_ref = xstrfmt("refs/heads/%s", name);
cb_data.dst_oid = oid;
for_each_remote(check_tracking_name, &cb_data);
+   if (dwim_remotes_matched)
+   *dwim_remotes_matched = cb_data.num_matches;
free(cb_data.src_ref);
if (cb_data.num_matches == 1)
return cb_data.dst_ref;
diff --git a/checkout.h b/checkout.h
index 4cd4cd1c23..6b2073310c 100644
--- a/checkout.h
+++ b/checkout.h
@@ -9,6 +9,7 @@
  * exists, NULL otherwise.
  */
 extern const char *unique_tracking_name(const char *name,
-   struct object_id *oid);
+   struct object_id *oid,
+   int *dwim_remotes_matched);
 
 #endif /* CHECKOUT_H */
-- 
2.17.0.290.gded63e768a



[PATCH v7 8/8] checkout & worktree: introduce checkout.defaultRemote

2018-06-05 Thread Ævar Arnfjörð Bjarmason
Introduce a checkout.defaultRemote setting which can be used to
designate a remote to prefer (via checkout.defaultRemote=origin) when
running e.g. "git checkout master" to mean origin/master, even though
there's other remotes that have the "master" branch.

I want this because it's very handy to use this workflow to checkout a
repository and create a topic branch, then get back to a "master" as
retrieved from upstream:

(
cd /tmp &&
rm -rf tbdiff &&
git clone g...@github.com:trast/tbdiff.git &&
cd tbdiff &&
git branch -m topic &&
git checkout master
)

That will output:

Branch 'master' set up to track remote branch 'master' from 'origin'.
Switched to a new branch 'master'

But as soon as a new remote is added (e.g. just to inspect something
from someone else) the DWIMery goes away:

(
cd /tmp &&
rm -rf tbdiff &&
git clone g...@github.com:trast/tbdiff.git &&
cd tbdiff &&
git branch -m topic &&
git remote add avar g...@github.com:avar/tbdiff.git &&
git fetch avar &&
git checkout master
)

Will output (without the advice output added earlier in this series):

error: pathspec 'master' did not match any file(s) known to git.

The new checkout.defaultRemote config allows me to say that whenever
that ambiguity comes up I'd like to prefer "origin", and it'll still
work as though the only remote I had was "origin".

Also adjust the advice.checkoutAmbiguousRemoteBranchName message to
mention this new config setting to the user, the full output on my
git.git is now (the last paragraph is new):

$ ./git --exec-path=$PWD checkout master
error: pathspec 'master' did not match any file(s) known to git.
hint: 'master' matched more than one remote tracking branch.
hint: We found 26 remotes with a reference that matched. So we fell back
hint: on trying to resolve the argument as a path, but failed there too!
hint:
hint: If you meant to check out a remote tracking branch on, e.g. 'origin',
hint: you can do so by fully qualifying the name with the --track option:
hint:
hint: git checkout --track origin/
hint:
hint: If you'd like to always have checkouts of an ambiguous  prefer
hint: one remote, e.g. the 'origin' remote, consider setting
hint: checkout.defaultRemote=origin in your config.

I considered splitting this into checkout.defaultRemote and
worktree.defaultRemote, but it's probably less confusing to break our
own rules that anything shared between config should live in core.*
than have two config settings, and I couldn't come up with a short
name under core.* that made sense (core.defaultRemoteForCheckout?).

See also 70c9ac2f19 ("DWIM "git checkout frotz" to "git checkout -b
frotz origin/frotz"", 2009-10-18) which introduced this DWIM feature
to begin with, and 4e85333197 ("worktree: make add  
dwim", 2017-11-26) which added it to git-worktree.

Signed-off-by: Ævar Arnfjörð Bjarmason 
---
 Documentation/config.txt   | 21 -
 Documentation/git-checkout.txt |  9 +
 Documentation/git-worktree.txt |  9 +
 builtin/checkout.c | 12 +---
 checkout.c | 26 --
 t/t2024-checkout-dwim.sh   | 18 +-
 t/t2025-worktree-add.sh| 21 +
 7 files changed, 109 insertions(+), 7 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index dfc0413a84..aef2769211 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -350,7 +350,10 @@ advice.*::
remote tracking branch on more than one remote in
situations where an unambiguous argument would have
otherwise caused a remote-tracking branch to be
-   checked out.
+   checked out. See the `checkout.defaultRemote`
+   configuration variable for how to set a given remote
+   to used by default in some situations where this
+   advice would be printed.
amWorkDir::
Advice that shows the location of the patch file when
linkgit:git-am[1] fails to apply it.
@@ -1105,6 +1108,22 @@ browser..path::
browse HTML help (see `-w` option in linkgit:git-help[1]) or a
working repository in gitweb (see linkgit:git-instaweb[1]).
 
+checkout.defaultRemote::
+   When you run 'git checkout ' and only have one
+   remote, it may implicitly fall back on checking out and
+   tracking e.g. 'origin/'. This stops working as soon
+   as you have more than one remote with a ''
+   reference. This setting allows for setting the name of a
+   preferred remote that should always win when it comes to
+   disambiguation. The typical use-case is to set this to
+   `origin`.
++
+Currently this is used by linkgit:git-checkout[1] when 'git checkout
+' will check

[PATCH v7 6/8] builtin/checkout.c: use "ret" variable for return

2018-06-05 Thread Ævar Arnfjörð Bjarmason
There is no point in doing this right now, but in later change the
"ret" variable will be inspected. This change makes that meaningful
change smaller.

Signed-off-by: Ævar Arnfjörð Bjarmason 
---
 builtin/checkout.c | 8 +---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/builtin/checkout.c b/builtin/checkout.c
index 72457fb7d5..8c93c55cbc 100644
--- a/builtin/checkout.c
+++ b/builtin/checkout.c
@@ -1265,8 +1265,10 @@ int cmd_checkout(int argc, const char **argv, const char 
*prefix)
}
 
UNLEAK(opts);
-   if (opts.patch_mode || opts.pathspec.nr)
-   return checkout_paths(&opts, new_branch_info.name);
-   else
+   if (opts.patch_mode || opts.pathspec.nr) {
+   int ret = checkout_paths(&opts, new_branch_info.name);
+   return ret;
+   } else {
return checkout_branch(&opts, &new_branch_info);
+   }
 }
-- 
2.17.0.290.gded63e768a



[PATCH v7 2/8] checkout.h: wrap the arguments to unique_tracking_name()

2018-06-05 Thread Ævar Arnfjörð Bjarmason
The line was too long already, and will be longer still when a later
change adds another argument.

Signed-off-by: Ævar Arnfjörð Bjarmason 
---
 checkout.h | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/checkout.h b/checkout.h
index 9980711179..4cd4cd1c23 100644
--- a/checkout.h
+++ b/checkout.h
@@ -8,6 +8,7 @@
  * tracking branch.  Return the name of the remote if such a branch
  * exists, NULL otherwise.
  */
-extern const char *unique_tracking_name(const char *name, struct object_id 
*oid);
+extern const char *unique_tracking_name(const char *name,
+   struct object_id *oid);
 
 #endif /* CHECKOUT_H */
-- 
2.17.0.290.gded63e768a



[PATCH v7 1/8] checkout tests: index should be clean after dwim checkout

2018-06-05 Thread Ævar Arnfjörð Bjarmason
Assert that whenever there's a DWIM checkout that the index should be
clean afterwards, in addition to the correct branch being checked-out.

The way the DWIM checkout code in checkout.[ch] works is by looping
over all remotes, and for each remote trying to find if a given
reference name only exists on that remote, or if it exists anywhere
else.

This is done by starting out with a `unique = 1` tracking variable in
a struct shared by the entire loop, which will get set to `0` if the
data reference is not unique.

Thus if we find a match we know the dst_oid member of
tracking_name_data must be correct, since it's associated with the
only reference on the only remote that could have matched our query.

But if there was ever a mismatch there for some reason we might end up
with the correct branch checked out, but at the wrong oid, which would
show whatever the difference between the two staged in the
index (checkout branch A, stage changes from the state of branch B).

So let's amend the tests (mostly added in) 399e4a1c56 ("t2024: Add
tests verifying current DWIM behavior of 'git checkout '",
2013-04-21) to always assert that "status" is clean after we run
"checkout", that's being done with "-uno" because there's going to be
some untracked files related to the test itself which we don't care
about.

In all these tests (DWIM or otherwise) we start with a clean index, so
these tests are asserting that that's still the case after the
"checkout", failed or otherwise.

Then if we ever run into this sort of regression, either in the
existing code or with a new feature, we'll know.

Signed-off-by: Ævar Arnfjörð Bjarmason 
---
 t/t2024-checkout-dwim.sh | 29 +
 1 file changed, 29 insertions(+)

diff --git a/t/t2024-checkout-dwim.sh b/t/t2024-checkout-dwim.sh
index 3e5ac81bd2..ed32828105 100755
--- a/t/t2024-checkout-dwim.sh
+++ b/t/t2024-checkout-dwim.sh
@@ -23,6 +23,12 @@ test_branch_upstream () {
test_cmp expect.upstream actual.upstream
 }
 
+status_uno_is_clean() {
+   >status.expect &&
+   git status -uno --porcelain >status.actual &&
+   test_cmp status.expect status.actual
+}
+
 test_expect_success 'setup' '
test_commit my_master &&
git init repo_a &&
@@ -55,6 +61,7 @@ test_expect_success 'checkout of non-existing branch fails' '
test_might_fail git branch -D xyzzy &&
 
test_must_fail git checkout xyzzy &&
+   status_uno_is_clean &&
test_must_fail git rev-parse --verify refs/heads/xyzzy &&
test_branch master
 '
@@ -64,6 +71,7 @@ test_expect_success 'checkout of branch from multiple remotes 
fails #1' '
test_might_fail git branch -D foo &&
 
test_must_fail git checkout foo &&
+   status_uno_is_clean &&
test_must_fail git rev-parse --verify refs/heads/foo &&
test_branch master
 '
@@ -73,6 +81,7 @@ test_expect_success 'checkout of branch from a single remote 
succeeds #1' '
test_might_fail git branch -D bar &&
 
git checkout bar &&
+   status_uno_is_clean &&
test_branch bar &&
test_cmp_rev remotes/repo_a/bar HEAD &&
test_branch_upstream bar repo_a bar
@@ -83,6 +92,7 @@ test_expect_success 'checkout of branch from a single remote 
succeeds #2' '
test_might_fail git branch -D baz &&
 
git checkout baz &&
+   status_uno_is_clean &&
test_branch baz &&
test_cmp_rev remotes/other_b/baz HEAD &&
test_branch_upstream baz repo_b baz
@@ -90,6 +100,7 @@ test_expect_success 'checkout of branch from a single remote 
succeeds #2' '
 
 test_expect_success '--no-guess suppresses branch auto-vivification' '
git checkout -B master &&
+   status_uno_is_clean &&
test_might_fail git branch -D bar &&
 
test_must_fail git checkout --no-guess bar &&
@@ -99,6 +110,7 @@ test_expect_success '--no-guess suppresses branch 
auto-vivification' '
 
 test_expect_success 'setup more remotes with unconventional refspecs' '
git checkout -B master &&
+   status_uno_is_clean &&
git init repo_c &&
(
cd repo_c &&
@@ -128,27 +140,33 @@ test_expect_success 'setup more remotes with 
unconventional refspecs' '
 
 test_expect_success 'checkout of branch from multiple remotes fails #2' '
git checkout -B master &&
+   status_uno_is_clean &&
test_might_fail git branch -D bar &&
 
test_must_fail git checkout bar &&
+   status_uno_is_clean &&
test_must_fail git rev-parse --verify refs/heads/bar &&
test_branch master
 '
 
 test_expect_success 'checkout of branch from multiple remotes fails #3' '
git checkout -B master &&
+   status_uno_is_clean &&
test_might_fail git branch -D baz &&
 
test_must_fail git checkout baz &&
+   status_uno_is_clean &&
test_must_fail git rev-parse --verify refs/heads/baz &&
test_branch master
 '
 
 test_expect_success 'checkout of branch from a single re

[PATCH v7 7/8] checkout: add advice for ambiguous "checkout "

2018-06-05 Thread Ævar Arnfjörð Bjarmason
As the "checkout" documentation describes:

If  is not found but there does exist a tracking branch in
exactly one remote (call it ) with a matching name, treat
as equivalent to [...] /

Note that the "error: pathspec[...]" message is still printed. This is
because whatever else checkout may have tried earlier, its final
fallback is to try to resolve the argument as a path. E.g. in this
case:

$ ./git --exec-path=$PWD checkout master pu
error: pathspec 'master' did not match any file(s) known to git.
error: pathspec 'pu' did not match any file(s) known to git.

There we don't print the "hint:" implicitly due to earlier logic
around the DWIM fallback. That fallback is only used if it looks like
we have one argument that might be a branch.

I can't think of an intrinsic reason for why we couldn't in some
future change skip printing the "error: pathspec[...]" error. However,
to do so we'd need to pass something down to checkout_paths() to make
it suppress printing an error on its own, and for us to be confident
that we're not silencing cases where those errors are meaningful.

I don't think that's worth it since determining whether that's the
case could easily change due to future changes in the checkout logic.

Signed-off-by: Ævar Arnfjörð Bjarmason 
---
 Documentation/config.txt |  7 +++
 advice.c |  2 ++
 advice.h |  1 +
 builtin/checkout.c   | 13 +
 t/t2024-checkout-dwim.sh | 14 ++
 5 files changed, 37 insertions(+)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index ab641bf5a9..dfc0413a84 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -344,6 +344,13 @@ advice.*::
Advice shown when you used linkgit:git-checkout[1] to
move to the detach HEAD state, to instruct how to create
a local branch after the fact.
+   checkoutAmbiguousRemoteBranchName::
+   Advice shown when the argument to
+   linkgit:git-checkout[1] ambiguously resolves to a
+   remote tracking branch on more than one remote in
+   situations where an unambiguous argument would have
+   otherwise caused a remote-tracking branch to be
+   checked out.
amWorkDir::
Advice that shows the location of the patch file when
linkgit:git-am[1] fails to apply it.
diff --git a/advice.c b/advice.c
index 370a56d054..75e7dede90 100644
--- a/advice.c
+++ b/advice.c
@@ -21,6 +21,7 @@ int advice_add_embedded_repo = 1;
 int advice_ignored_hook = 1;
 int advice_waiting_for_editor = 1;
 int advice_graft_file_deprecated = 1;
+int advice_checkout_ambiguous_remote_branch_name = 1;
 
 static int advice_use_color = -1;
 static char advice_colors[][COLOR_MAXLEN] = {
@@ -72,6 +73,7 @@ static struct {
{ "ignoredhook", &advice_ignored_hook },
{ "waitingforeditor", &advice_waiting_for_editor },
{ "graftfiledeprecated", &advice_graft_file_deprecated },
+   { "checkoutambiguousremotebranchname", 
&advice_checkout_ambiguous_remote_branch_name },
 
/* make this an alias for backward compatibility */
{ "pushnonfastforward", &advice_push_update_rejected }
diff --git a/advice.h b/advice.h
index 9f5064e82a..4d11d51d43 100644
--- a/advice.h
+++ b/advice.h
@@ -22,6 +22,7 @@ extern int advice_add_embedded_repo;
 extern int advice_ignored_hook;
 extern int advice_waiting_for_editor;
 extern int advice_graft_file_deprecated;
+extern int advice_checkout_ambiguous_remote_branch_name;
 
 int git_default_advice_config(const char *var, const char *value);
 __attribute__((format (printf, 1, 2)))
diff --git a/builtin/checkout.c b/builtin/checkout.c
index 8c93c55cbc..baa027455a 100644
--- a/builtin/checkout.c
+++ b/builtin/checkout.c
@@ -22,6 +22,7 @@
 #include "resolve-undo.h"
 #include "submodule-config.h"
 #include "submodule.h"
+#include "advice.h"
 
 static const char * const checkout_usage[] = {
N_("git checkout [] "),
@@ -1267,6 +1268,18 @@ int cmd_checkout(int argc, const char **argv, const char 
*prefix)
UNLEAK(opts);
if (opts.patch_mode || opts.pathspec.nr) {
int ret = checkout_paths(&opts, new_branch_info.name);
+   if (ret && dwim_remotes_matched > 1 &&
+   advice_checkout_ambiguous_remote_branch_name)
+   advise(_("'%s' matched more than one remote tracking 
branch.\n"
+"We found %d remotes with a reference that 
matched. So we fell back\n"
+"on trying to resolve the argument as a path, 
but failed there too!\n"
+"\n"
+"If you meant to check out a remote tracking 
branch on, e.g. 'origin',\n"
+"you can do so by fully qualifying the name 
with the --track option:\n"
+"\n"
+ 

Re: [PATCH v4 00/21] Integrate commit-graph into 'fsck' and 'gc'

2018-06-05 Thread Ævar Arnfjörð Bjarmason
On Mon, Jun 4, 2018 at 6:52 PM, Derrick Stolee  wrote:
> Thanks for the feedback on v3. There were several small cleanups, but
> perhaps the biggest change is the addition of "commit-graph: use
> string-list API for input" which makes "commit-graph: add '--reachable'
> option" much simpler.

Do you have a version of this pushed somewhere? Your fsck/upstream
branch conflicts with e.g. long-since-merged nd/repack-keep-pack.


BUG: submodule code prints '(null)'

2018-06-05 Thread Duy Nguyen
I do not know how to reproduce this (and didn't bother to look deeply
into it after I found it was not a trivial fix) but one of my "git
fetch" showed

warning: Submodule in commit be2db96a6c506464525f588da59cade0cedddb5e
at path: '(null)' collides with a submodule named the same. Skipping
it.

I think it's reported that some libc implementation will not be able
to gracefully handle NULL strings like glibc and may crash instead of
printing '(null)' here. I'll leave it to submodule people to fix this
:)
-- 
Duy


[PATCH 6/6] Forbid "the_index" in dir.c and unpack-trees.c

2018-06-05 Thread Nguyễn Thái Ngọc Duy
Use of global variables like the_index makes it very hard to follow
the code flow, especially when it's buried deep in some minor helper
function.

We are gradually avoiding global variables, this hopefully will make
it one step closer. The end game is, the set of "library" source files
will have just one macro set: "LIB_CODE" (or something). This macro
will prevent access to most (if not all) global variables in those
files. We can't do that now, so we just prevent one thing at a time.

Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 Should I keep my trick of defining the_index to
 the_index_should_not_be_used_here? It may help somewhat when people
 accidentally use the_index.

 cache.h| 2 ++
 dir.c  | 2 ++
 unpack-trees.c | 2 ++
 3 files changed, 6 insertions(+)

diff --git a/cache.h b/cache.h
index 89a107a7f7..ecc96ccb0e 100644
--- a/cache.h
+++ b/cache.h
@@ -330,7 +330,9 @@ struct index_state {
struct ewah_bitmap *fsmonitor_dirty;
 };
 
+#ifndef NO_GLOBAL_INDEX
 extern struct index_state the_index;
+#endif
 
 /* Name hashing */
 extern int test_lazy_init_name_hash(struct index_state *istate, int 
try_threaded);
diff --git a/dir.c b/dir.c
index ccf8b4975e..74d848db5a 100644
--- a/dir.c
+++ b/dir.c
@@ -8,6 +8,8 @@
  *  Junio Hamano, 2005-2006
  */
 #define NO_THE_INDEX_COMPATIBILITY_MACROS
+/* Do not use the_index. You should have access to it via function arg */
+#define NO_GLOBAL_INDEX
 #include "cache.h"
 #include "config.h"
 #include "dir.h"
diff --git a/unpack-trees.c b/unpack-trees.c
index 3ace82ca27..9aebe9762b 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -1,4 +1,6 @@
 #define NO_THE_INDEX_COMPATIBILITY_MACROS
+/* Do not use the_index here, you probably want o->src_index */
+#define NO_GLOBAL_INDEX
 #include "cache.h"
 #include "argv-array.h"
 #include "repository.h"
-- 
2.18.0.rc0.333.g22e6ee6cdf



[PATCH 2/6] unpack-trees: add a note about path invalidation

2018-06-05 Thread Nguyễn Thái Ngọc Duy
Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 unpack-trees.c | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/unpack-trees.c b/unpack-trees.c
index 3a85a02a77..5d06aa9c98 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -1545,6 +1545,17 @@ static int verify_uptodate_sparse(const struct 
cache_entry *ce,
return verify_uptodate_1(ce, o, ERROR_SPARSE_NOT_UPTODATE_FILE);
 }
 
+/*
+ * TODO: We should actually invalidate o->result, not src_index [1].
+ * But since cache tree and untracked cache both are not copied to
+ * o->result until unpacking is complete, we invalidate them on
+ * src_index instead with the assumption that they will be copied to
+ * dst_index at the end.
+ *
+ * [1] src_index->cache_tree is also used in unpack_callback() so if
+ * we invalidate o->result, we need to update it to use
+ * o->result.cache_tree as well.
+ */
 static void invalidate_ce_path(const struct cache_entry *ce,
   struct unpack_trees_options *o)
 {
-- 
2.18.0.rc0.333.g22e6ee6cdf



[PATCH 5/6] unpack-trees: avoid the_index in verify_absent()

2018-06-05 Thread Nguyễn Thái Ngọc Duy
Both functions that are updated in this commit are called by
verify_absent(), which is part of the "unpack-trees" operation that is
supposed to work on any index file specified by the caller. Thanks to
Brandon [1] [2], an implicit dependency on the_index is exposed. This
commit fixes it.

In both functions, it makes sense to use src_index to check for
exclusion because it's almost unchanged and should give us the same
outcome as if running the exclude check before the unpack.

It's "almost unchanged" because we do invalidate cache-tree and
untracked cache in the source index. But this should not affect how
exclude machinery uses the index: to see if a file is tracked, and to
read a blob from the index instead of worktree if it's marked
skip-worktree (i.e. it's not available in worktree)

[1] a0bba65b10 (dir: convert is_excluded to take an index - 2017-05-05
[2] 2c1eb10454 (dir: convert read_directory to take an index - 2017-05-05)

Helped-by: Elijah Newren 
Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 unpack-trees.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/unpack-trees.c b/unpack-trees.c
index 5268de7af5..3ace82ca27 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -1651,7 +1651,7 @@ static int verify_clean_subdirectory(const struct 
cache_entry *ce,
memset(&d, 0, sizeof(d));
if (o->dir)
d.exclude_per_dir = o->dir->exclude_per_dir;
-   i = read_directory(&d, &the_index, pathbuf, namelen+1, NULL);
+   i = read_directory(&d, o->src_index, pathbuf, namelen+1, NULL);
if (i)
return o->gently ? -1 :
add_rejected_path(o, ERROR_NOT_UPTODATE_DIR, ce->name);
@@ -1693,7 +1693,7 @@ static int check_ok_to_remove(const char *name, int len, 
int dtype,
return 0;
 
if (o->dir &&
-   is_excluded(o->dir, &the_index, name, &dtype))
+   is_excluded(o->dir, o->src_index, name, &dtype))
/*
 * ce->name is explicitly excluded, so it is Ok to
 * overwrite it.
-- 
2.18.0.rc0.333.g22e6ee6cdf



[PATCH 4/6] unpack-tress: convert clear_ce_flags* to avoid the_index

2018-06-05 Thread Nguyễn Thái Ngọc Duy
Prior to fba92be8f7, this code implicitly (and incorrectly) assumes
the_index when running the exclude machinery. fba92be8f7 helps show
this problem clearer because unpack-trees operation is supposed to
work on whatever index the caller specifies... not specifically
the_index.

Update the code to use "istate" argument that's originally from
mark_new_skip_worktree(). From the call sites, both in unpack_trees(),
you can see that this function works on two separate indexes:
o->src_index and o->result. The second mark_new_skip_worktree() so far
has incorecctly applied exclude rules on o->src_index instead of
o->result. It's unclear what is the consequences of this, but it's
definitely wrong.

[1] fba92be8f7 (dir: convert is_excluded_from_list to take an index -
2017-05-05)

Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 unpack-trees.c | 29 +
 1 file changed, 17 insertions(+), 12 deletions(-)

diff --git a/unpack-trees.c b/unpack-trees.c
index 60d1138e08..5268de7af5 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -1085,13 +1085,15 @@ static int unpack_callback(int n, unsigned long mask, 
unsigned long dirmask, str
return mask;
 }
 
-static int clear_ce_flags_1(struct cache_entry **cache, int nr,
+static int clear_ce_flags_1(struct index_state *istate,
+   struct cache_entry **cache, int nr,
struct strbuf *prefix,
int select_mask, int clear_mask,
struct exclude_list *el, int defval);
 
 /* Whole directory matching */
-static int clear_ce_flags_dir(struct cache_entry **cache, int nr,
+static int clear_ce_flags_dir(struct index_state *istate,
+ struct cache_entry **cache, int nr,
  struct strbuf *prefix,
  char *basename,
  int select_mask, int clear_mask,
@@ -1100,7 +1102,7 @@ static int clear_ce_flags_dir(struct cache_entry **cache, 
int nr,
struct cache_entry **cache_end;
int dtype = DT_DIR;
int ret = is_excluded_from_list(prefix->buf, prefix->len,
-   basename, &dtype, el, &the_index);
+   basename, &dtype, el, istate);
int rc;
 
strbuf_addch(prefix, '/');
@@ -1122,7 +1124,7 @@ static int clear_ce_flags_dir(struct cache_entry **cache, 
int nr,
 * calling clear_ce_flags_1(). That function will call
 * the expensive is_excluded_from_list() on every entry.
 */
-   rc = clear_ce_flags_1(cache, cache_end - cache,
+   rc = clear_ce_flags_1(istate, cache, cache_end - cache,
  prefix,
  select_mask, clear_mask,
  el, ret);
@@ -1145,7 +1147,8 @@ static int clear_ce_flags_dir(struct cache_entry **cache, 
int nr,
  *   cache[0]->name[0..(prefix_len-1)]
  * Top level path has prefix_len zero.
  */
-static int clear_ce_flags_1(struct cache_entry **cache, int nr,
+static int clear_ce_flags_1(struct index_state *istate,
+   struct cache_entry **cache, int nr,
struct strbuf *prefix,
int select_mask, int clear_mask,
struct exclude_list *el, int defval)
@@ -1179,7 +1182,7 @@ static int clear_ce_flags_1(struct cache_entry **cache, 
int nr,
len = slash - name;
strbuf_add(prefix, name, len);
 
-   processed = clear_ce_flags_dir(cache, cache_end - cache,
+   processed = clear_ce_flags_dir(istate, cache, cache_end 
- cache,
   prefix,
   prefix->buf + 
prefix->len - len,
   select_mask, clear_mask,
@@ -1193,7 +1196,7 @@ static int clear_ce_flags_1(struct cache_entry **cache, 
int nr,
}
 
strbuf_addch(prefix, '/');
-   cache += clear_ce_flags_1(cache, cache_end - cache,
+   cache += clear_ce_flags_1(istate, cache, cache_end - 
cache,
  prefix,
  select_mask, clear_mask, el, 
defval);
strbuf_setlen(prefix, prefix->len - len - 1);
@@ -1203,7 +1206,7 @@ static int clear_ce_flags_1(struct cache_entry **cache, 
int nr,
/* Non-directory */
dtype = ce_to_dtype(ce);
ret = is_excluded_from_list(ce->name, ce_namelen(ce),
-   name, &dtype, el, &the_index);
+   name, &dtype, el, istate);
if (ret < 0)
ret = defval;
if (ret > 0)
@@ -1213,15

[PATCH 3/6] unpack-trees: don't shadow global var the_index

2018-06-05 Thread Nguyễn Thái Ngọc Duy
This function mark_new_skip_worktree() has an argument named the_index
which is also the name of a global variable. While they have different
types (the global the_index is not a pointer) mistakes can easily
happen and it's also confusing for readers. Rename the function
argument to something other than the_index.

Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 unpack-trees.c | 9 -
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/unpack-trees.c b/unpack-trees.c
index 5d06aa9c98..60d1138e08 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -1231,7 +1231,7 @@ static int clear_ce_flags(struct cache_entry **cache, int 
nr,
  * Set/Clear CE_NEW_SKIP_WORKTREE according to $GIT_DIR/info/sparse-checkout
  */
 static void mark_new_skip_worktree(struct exclude_list *el,
-  struct index_state *the_index,
+  struct index_state *istate,
   int select_flag, int skip_wt_flag)
 {
int i;
@@ -1240,8 +1240,8 @@ static void mark_new_skip_worktree(struct exclude_list 
*el,
 * 1. Pretend the narrowest worktree: only unmerged entries
 * are checked out
 */
-   for (i = 0; i < the_index->cache_nr; i++) {
-   struct cache_entry *ce = the_index->cache[i];
+   for (i = 0; i < istate->cache_nr; i++) {
+   struct cache_entry *ce = istate->cache[i];
 
if (select_flag && !(ce->ce_flags & select_flag))
continue;
@@ -1256,8 +1256,7 @@ static void mark_new_skip_worktree(struct exclude_list 
*el,
 * 2. Widen worktree according to sparse-checkout file.
 * Matched entries will have skip_wt_flag cleared (i.e. "in")
 */
-   clear_ce_flags(the_index->cache, the_index->cache_nr,
-  select_flag, skip_wt_flag, el);
+   clear_ce_flags(istate, select_flag, skip_wt_flag, el);
 }
 
 static int verify_absent(const struct cache_entry *,
-- 
2.18.0.rc0.333.g22e6ee6cdf



[PATCH 0/6] Fix "the_index" usage in unpack-trees.c

2018-06-05 Thread Nguyễn Thái Ngọc Duy
This is a potential problem that is exposed after Brandon kicked
the_index out of dir.c (big thanks!) and could be seen as a
continuation of bw/dir-c-stops-relying-on-the-index. Also thanks to
Elijah for helping analyze this code.

The last patch may be debatable. If we go this route, we may end up
with plenty of "#define NO_THIS" and "#define NO_THAT" until we
finally achieve "#define THIS_IS_TRUE_LIB_CODE" many years later.

Nguyễn Thái Ngọc Duy (6):
  unpack-trees: remove 'extern' on function declaration
  unpack-trees: add a note about path invalidation
  unpack-trees: don't shadow global var the_index
  unpack-tress: convert clear_ce_flags* to avoid the_index
  unpack-trees: avoid the_index in verify_absent()
  Forbid "the_index" in dir.c and unpack-trees.c

 cache.h|  2 ++
 dir.c  |  2 ++
 unpack-trees.c | 55 +-
 unpack-trees.h |  4 ++--
 4 files changed, 42 insertions(+), 21 deletions(-)

-- 
2.18.0.rc0.333.g22e6ee6cdf



[PATCH 1/6] unpack-trees: remove 'extern' on function declaration

2018-06-05 Thread Nguyễn Thái Ngọc Duy
Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 unpack-trees.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/unpack-trees.h b/unpack-trees.h
index c2b434c606..534358fcc5 100644
--- a/unpack-trees.h
+++ b/unpack-trees.h
@@ -82,8 +82,8 @@ struct unpack_trees_options {
struct exclude_list *el; /* for internal use */
 };
 
-extern int unpack_trees(unsigned n, struct tree_desc *t,
-   struct unpack_trees_options *options);
+int unpack_trees(unsigned n, struct tree_desc *t,
+struct unpack_trees_options *options);
 
 int verify_uptodate(const struct cache_entry *ce,
struct unpack_trees_options *o);
-- 
2.18.0.rc0.333.g22e6ee6cdf



Re: [PATCH v7 1/8] checkout tests: index should be clean after dwim checkout

2018-06-05 Thread SZEDER Gábor
> diff --git a/t/t2024-checkout-dwim.sh b/t/t2024-checkout-dwim.sh
> index 3e5ac81bd2..ed32828105 100755
> --- a/t/t2024-checkout-dwim.sh
> +++ b/t/t2024-checkout-dwim.sh
> @@ -23,6 +23,12 @@ test_branch_upstream () {
>   test_cmp expect.upstream actual.upstream
>  }
>  
> +status_uno_is_clean() {
> + >status.expect &&
> + git status -uno --porcelain >status.actual &&
> + test_cmp status.expect status.actual

This function could be written a tad simpler as:

  git status -uno --porcelain >status.actual &&
  test_must_be_empty status.actual



Re: [RFC PATCH 1/2] docs: reflect supported fetch options of git pull

2018-06-05 Thread Duy Nguyen
On Mon, Jun 4, 2018 at 11:50 PM, Rafael Ascensão  wrote:
> `git pull` understands some options of `git fetch` which then uses in
> its operation. The documentation of `git pull` doesn't reflect this
> clearly, showing options that are not yet supported (e.g. `--deepen`)
> and omitting options that are supported (e.g. `--prune`).
>
> Make the documentation consistent with present behaviour by hiding
> unavailable options only.

A better option may be making git-pull accept those options as well. I
see no reason git-pull should support options that git-fetch does (at
least most of them). But I would understand if you would not want to
go touch the code. It's basically a couple of OPT_PASSTRHU though so
not very hard to do.

PS. Anybody up to making parse-options accept multiple struct option
arrays? This way we can have much better option passthru without
specifying them again and again.

>
> Reported-by: Marius Giurgi 
> Signed-off-by: Rafael Ascensão 
> ---
>
> Marius asked on freenode.#git if pull supported `--prune`, upon
> inspection seems like the man page was missing some of the supported
> options and listing others that are not supported via pull.
>
> Here's a quick summary of the changes to pull's documentation:
>
> add:  remove:
>   --dry-run --deepen=
>   -p, --prune   --shallow-since=
>   --refmap=--shallow-exclude=
>   -t, --tags-u, --update-head-ok
>   -j, --jobs=
>
>  Documentation/fetch-options.txt | 20 +---
>  1 file changed, 13 insertions(+), 7 deletions(-)
>
> diff --git a/Documentation/fetch-options.txt b/Documentation/fetch-options.txt
> index 8631e365f..da17d27c1 100644
> --- a/Documentation/fetch-options.txt
> +++ b/Documentation/fetch-options.txt
> @@ -14,6 +14,7 @@
> linkgit:git-clone[1]), deepen or shorten the history to the specified
> number of commits. Tags for the deepened commits are not fetched.
>
> +ifndef::git-pull[]
>  --deepen=::
> Similar to --depth, except it specifies the number of commits
> from the current shallow boundary instead of from the tip of
> @@ -27,6 +28,7 @@
> Deepen or shorten the history of a shallow repository to
> exclude commits reachable from a specified remote branch or tag.
> This option can be specified multiple times.
> +endif::git-pull[]
>
>  --unshallow::
> If the source repository is complete, convert a shallow
> @@ -42,10 +44,8 @@ the current repository has the same history as the source 
> repository.
> .git/shallow. This option updates .git/shallow and accept such
> refs.
>
> -ifndef::git-pull[]
>  --dry-run::
> Show what would be done, without making any changes.
> -endif::git-pull[]
>
>  -f::
>  --force::
> @@ -63,6 +63,7 @@ ifndef::git-pull[]
>  --multiple::
> Allow several  and  arguments to be
> specified. No s may be specified.
> +endif::git-pull[]
>
>  -p::
>  --prune::
> @@ -76,8 +77,14 @@ ifndef::git-pull[]
> subject to pruning. Supplying `--prune-tags` is a shorthand for
> providing the tag refspec.
>  +
> +ifdef::git-pull[]
> +See the PRUNING section on linkgit:git-fetch[1] for more details.
> +endif::git-pull[]
> +ifndef::git-pull[]
>  See the PRUNING section below for more details.
> +endif::git-pull[]
>
> +ifndef::git-pull[]
>  -P::
>  --prune-tags::
> Before fetching, remove any local tags that no longer exist on
> @@ -89,9 +96,6 @@ See the PRUNING section below for more details.
>  +
>  See the PRUNING section below for more details.
>
> -endif::git-pull[]
> -
> -ifndef::git-pull[]
>  -n::
>  endif::git-pull[]
>  --no-tags::
> @@ -101,7 +105,6 @@ endif::git-pull[]
> behavior for a remote may be specified with the remote..tagOpt
> setting. See linkgit:git-config[1].
>
> -ifndef::git-pull[]
>  --refmap=::
> When fetching refs listed on the command line, use the
> specified refspec (can be given more than once) to map the
> @@ -119,6 +122,7 @@ ifndef::git-pull[]
> is used (though tags may be pruned anyway if they are also the
> destination of an explicit refspec; see `--prune`).
>
> +ifndef::git-pull[]
>  --recurse-submodules[=yes|on-demand|no]::
> This option controls if and under what conditions new commits of
> populated submodules should be fetched too. It can be used as a
> @@ -129,6 +133,7 @@ ifndef::git-pull[]
> when the superproject retrieves a commit that updates the submodule's
> reference to a commit that isn't already in the local submodule
> clone.
> +endif::git-pull[]
>
>  -j::
>  --jobs=::
> @@ -137,6 +142,7 @@ ifndef::git-pull[]
> submodules will be faster. By default submodules will be fetched
> one at a time.
>
> +ifndef::git-pull[]
>  --no-recurse-submodules::
> Disable recursive fetching of submodules (this has the same effect as
> using the `--recurse-submodules=no` option).
> 

Re: [PATCH] refspec: initalize `refspec_item` in `valid_fetch_refspec()`

2018-06-05 Thread Brandon Williams
On 06/04, Ævar Arnfjörð Bjarmason wrote:
> 
> On Mon, Jun 04 2018, Martin Ågren wrote:
> 
> > We allocate a `struct refspec_item` on the stack without initializing
> > it. In particular, its `dst` and `src` members will contain some random
> > data from the stack. When we later call `refspec_item_clear()`, it will
> > call `free()` on those pointers. So if the call to `parse_refspec()` did
> > not assign to them, we will be freeing some random "pointers". This is
> > undefined behavior.
> >
> > To the best of my understanding, this cannot currently be triggered by
> > user-provided data. And for what it's worth, the test-suite does not
> > trigger this with SANITIZE=address. It can be provoked by calling
> > `valid_fetch_refspec(":*")`.
> >
> > Zero the struct, as is done in other users of `struct refspec_item`.
> >
> > Signed-off-by: Martin Ågren 
> > ---
> > I found some time to look into this. It does not seem to be a
> > user-visible bug, so not particularly critical.
> >
> >  refspec.c | 5 -
> >  1 file changed, 4 insertions(+), 1 deletion(-)
> >
> > diff --git a/refspec.c b/refspec.c
> > index ada7854f7a..7dd7e361e5 100644
> > --- a/refspec.c
> > +++ b/refspec.c
> > @@ -189,7 +189,10 @@ void refspec_clear(struct refspec *rs)
> >  int valid_fetch_refspec(const char *fetch_refspec_str)
> >  {
> > struct refspec_item refspec;
> > -   int ret = parse_refspec(&refspec, fetch_refspec_str, REFSPEC_FETCH);
> > +   int ret;
> > +
> > +   memset(&refspec, 0, sizeof(refspec));
> > +   ret = parse_refspec(&refspec, fetch_refspec_str, REFSPEC_FETCH);
> > refspec_item_clear(&refspec);
> > return ret;
> >  }
> 
> I think this makes more sense instead of this fix:

I like this diff.  The only nit I have is the same as what Martin
pointed out.  At least this way all memory will be initialized by the
time a call to parse_refspec is made.

> 
> diff --git a/builtin/clone.c b/builtin/clone.c
> index 99e73dae85..74a804f2e8 100644
> --- a/builtin/clone.c
> +++ b/builtin/clone.c
> @@ -1077,7 +1077,7 @@ int cmd_clone(int argc, const char **argv, const char 
> *prefix)
>   if (option_required_reference.nr || option_optional_reference.nr)
>   setup_reference();
> 
> - refspec_item_init(&refspec, value.buf, REFSPEC_FETCH);
> + refspec_item_init_or_die(&refspec, value.buf, REFSPEC_FETCH);
> 
>   strbuf_reset(&value);
> 
> diff --git a/builtin/pull.c b/builtin/pull.c
> index 1f2ecf3a88..bb64631d98 100644
> --- a/builtin/pull.c
> +++ b/builtin/pull.c
> @@ -684,7 +684,7 @@ static const char *get_tracking_branch(const char 
> *remote, const char *refspec)
>   const char *spec_src;
>   const char *merge_branch;
> 
> - refspec_item_init(&spec, refspec, REFSPEC_FETCH);
> + refspec_item_init_or_die(&spec, refspec, REFSPEC_FETCH);
>   spec_src = spec.src;
>   if (!*spec_src || !strcmp(spec_src, "HEAD"))
>   spec_src = "HEAD";
> diff --git a/refspec.c b/refspec.c
> index 78edc48ae8..8806df0fd2 100644
> --- a/refspec.c
> +++ b/refspec.c
> @@ -124,11 +124,16 @@ static int parse_refspec(struct refspec_item *item, 
> const char *refspec, int fet
>   return 1;
>  }
> 
> -void refspec_item_init(struct refspec_item *item, const char *refspec, int 
> fetch)
> +int refspec_item_init(struct refspec_item *item, const char *refspec, int 
> fetch)
>  {
>   memset(item, 0, sizeof(*item));
> + int ret = parse_refspec(item, refspec, fetch);
> + return ret;
> +}
> 
> - if (!parse_refspec(item, refspec, fetch))
> +void refspec_item_init_or_die(struct refspec_item *item, const char 
> *refspec, int fetch)
> +{
> + if (!refspec_item_init(item, refspec, fetch))
>   die("Invalid refspec '%s'", refspec);
>  }
> 
> @@ -152,7 +157,7 @@ void refspec_append(struct refspec *rs, const char 
> *refspec)
>  {
>   struct refspec_item item;
> 
> - refspec_item_init(&item, refspec, rs->fetch);
> + refspec_item_init_or_die(&item, refspec, rs->fetch);
> 
>   ALLOC_GROW(rs->items, rs->nr + 1, rs->alloc);
>   rs->items[rs->nr++] = item;
> @@ -191,7 +196,7 @@ void refspec_clear(struct refspec *rs)
>  int valid_fetch_refspec(const char *fetch_refspec_str)
>  {
>   struct refspec_item refspec;
> - int ret = parse_refspec(&refspec, fetch_refspec_str, REFSPEC_FETCH);
> + int ret = refspec_item_init(&refspec, fetch_refspec_str, REFSPEC_FETCH);
>   refspec_item_clear(&refspec);
>   return ret;
>  }
> diff --git a/refspec.h b/refspec.h
> index 3a9363887c..ed5d997f7f 100644
> --- a/refspec.h
> +++ b/refspec.h
> @@ -32,7 +32,8 @@ struct refspec {
>   int fetch;
>  };
> 
> -void refspec_item_init(struct refspec_item *item, const char *refspec, int 
> fetch);
> +int refspec_item_init(struct refspec_item *item, const char *refspec, int 
> fetch);
> +void refspec_item_init_or_die(struct refspec_item *item, const char 
> *refspec, int fetch);
>  void refspec_item_clear(struct refspec_item *item);
>  void refspec_init(struct refspec 

[PATCH v7 0/2] json-writer V7

2018-06-05 Thread git
From: Jeff Hostetler 

Here is V7 of my json-writer patches.  Please replace the existing V5/V6
version of jh/json-writer branch with this one.

This version cleans up the die()-vs-BUG() issue that Duy mentioned recently.
It also fixes a formatting bug when composing empty sub-objects/-arrays.

It also includes a new Python-based test to consume the generated JSON.

I plan to use the json-writer routines in a followup telemetry patch series.

Jeff Hostetler (2):
  json_writer: new routines to create data in JSON format
  json-writer: t0019: add Python unit test

 Makefile|   2 +
 json-writer.c   | 419 
 json-writer.h   | 113 +
 t/helper/test-json-writer.c | 572 
 t/t0019-json-writer.sh  | 274 +
 t/t0019/parse_json_1.py |  35 +++
 6 files changed, 1415 insertions(+)
 create mode 100644 json-writer.c
 create mode 100644 json-writer.h
 create mode 100644 t/helper/test-json-writer.c
 create mode 100755 t/t0019-json-writer.sh
 create mode 100644 t/t0019/parse_json_1.py

-- 
2.9.3



[PATCH v7 2/2] json-writer: t0019: add Python unit test

2018-06-05 Thread git
From: Jeff Hostetler 

Test json-writer output using Python.

Signed-off-by: Jeff Hostetler 
---
 t/t0019-json-writer.sh  | 38 ++
 t/t0019/parse_json_1.py | 35 +++
 2 files changed, 73 insertions(+)
 create mode 100644 t/t0019/parse_json_1.py

diff --git a/t/t0019-json-writer.sh b/t/t0019-json-writer.sh
index c9c2e23..951cd89 100755
--- a/t/t0019-json-writer.sh
+++ b/t/t0019-json-writer.sh
@@ -233,4 +233,42 @@ test_expect_success 'inline array with no members' '
test_cmp expect actual
 '
 
+# As a sanity check, ask Python to parse our generated JSON.  Let Python
+# recursively dump the resulting dictionary in sorted order.  Confirm that
+# that matches our expectations.
+test_expect_success PYTHON 'parse JSON using Python' '
+   cat >expect <<-\EOF &&
+   a abc
+   b 42
+   sub1 dict
+   sub1.c 3.14
+   sub1.d True
+   sub1.sub2 list
+   sub1.sub2[0] False
+   sub1.sub2[1] dict
+   sub1.sub2[1].g 0
+   sub1.sub2[1].h 1
+   sub1.sub2[2] None
+   EOF
+   test-json-writer >output.json \
+   @object \
+   @object-string a abc \
+   @object-int b 42 \
+   @object-object "sub1" \
+   @object-double c 2 3.140 \
+   @object-true d \
+   @object-array "sub2" \
+   @array-false \
+   @array-object \
+   @object-int g 0 \
+   @object-int h 1 \
+   @end \
+   @array-null \
+   @end \
+   @end \
+   @end &&
+   python "$TEST_DIRECTORY"/t0019/parse_json_1.py actual &&
+   test_cmp expect actual
+'
+
 test_done
diff --git a/t/t0019/parse_json_1.py b/t/t0019/parse_json_1.py
new file mode 100644
index 000..9d928a3
--- /dev/null
+++ b/t/t0019/parse_json_1.py
@@ -0,0 +1,35 @@
+import os
+import sys
+import json
+
+def dump_item(label_input, v):
+if type(v) is dict:
+print("%s dict" % (label_input))
+dump_dict(label_input, v)
+elif type(v) is list:
+print("%s list" % (label_input))
+dump_list(label_input, v)
+else:
+print("%s %s" % (label_input, v))
+
+def dump_list(label_input, list_input):
+ix = 0
+for v in list_input:
+label = ("%s[%d]" % (label_input, ix))
+dump_item(label, v)
+ix += 1
+return
+  
+def dump_dict(label_input, dict_input):
+for k in sorted(dict_input.iterkeys()):
+v = dict_input[k]
+if (len(label_input) > 0):
+label = ("%s.%s" % (label_input, k))
+else:
+label = k
+dump_item(label, v)
+return
+
+for line in sys.stdin:
+data = json.loads(line)
+dump_dict("", data)
-- 
2.9.3



[PATCH v7 1/2] json_writer: new routines to create data in JSON format

2018-06-05 Thread git
From: Jeff Hostetler 

Add a series of jw_ routines and "struct json_writer" structure to compose
JSON data.  The resulting string data can then be output by commands wanting
to support a JSON output format.

The json-writer routines can be used to generate structured data in a
JSON-like format.  We say "JSON-like" because we do not enforce the Unicode
(usually UTF-8) requirement on string fields.  Internally, Git does not
necessarily have Unicode/UTF-8 data for most fields, so it is currently
unclear the best way to enforce that requirement.  For example, on Linx
pathnames can contain arbitrary 8-bit character data, so a command like
"status" would not know how to encode the reported pathnames.  We may want
to revisit this (or double encode such strings) in the future.

The initial use for the json-writer routines is for generating telemetry
data for executed Git commands.  Later, we may want to use them in other
commands, such as status.

Helped-by: René Scharfe 
Helped-by: Wink Saville 
Helped-by: Ramsay Jones 
Signed-off-by: Jeff Hostetler 
---
 Makefile|   2 +
 json-writer.c   | 419 
 json-writer.h   | 113 +
 t/helper/test-json-writer.c | 572 
 t/t0019-json-writer.sh  | 236 ++
 5 files changed, 1342 insertions(+)
 create mode 100644 json-writer.c
 create mode 100644 json-writer.h
 create mode 100644 t/helper/test-json-writer.c
 create mode 100755 t/t0019-json-writer.sh

diff --git a/Makefile b/Makefile
index a1d8775..4ae6946 100644
--- a/Makefile
+++ b/Makefile
@@ -666,6 +666,7 @@ TEST_PROGRAMS_NEED_X += test-fake-ssh
 TEST_PROGRAMS_NEED_X += test-genrandom
 TEST_PROGRAMS_NEED_X += test-hashmap
 TEST_PROGRAMS_NEED_X += test-index-version
+TEST_PROGRAMS_NEED_X += test-json-writer
 TEST_PROGRAMS_NEED_X += test-lazy-init-name-hash
 TEST_PROGRAMS_NEED_X += test-line-buffer
 TEST_PROGRAMS_NEED_X += test-match-trees
@@ -820,6 +821,7 @@ LIB_OBJS += hashmap.o
 LIB_OBJS += help.o
 LIB_OBJS += hex.o
 LIB_OBJS += ident.o
+LIB_OBJS += json-writer.o
 LIB_OBJS += kwset.o
 LIB_OBJS += levenshtein.o
 LIB_OBJS += line-log.o
diff --git a/json-writer.c b/json-writer.c
new file mode 100644
index 000..f35ce19
--- /dev/null
+++ b/json-writer.c
@@ -0,0 +1,419 @@
+#include "cache.h"
+#include "json-writer.h"
+
+void jw_init(struct json_writer *jw)
+{
+   strbuf_reset(&jw->json);
+   strbuf_reset(&jw->open_stack);
+   strbuf_reset(&jw->first_stack);
+   jw->pretty = 0;
+}
+
+void jw_release(struct json_writer *jw)
+{
+   strbuf_release(&jw->json);
+   strbuf_release(&jw->open_stack);
+   strbuf_release(&jw->first_stack);
+}
+
+/*
+ * Append JSON-quoted version of the given string to 'out'.
+ */
+static void append_quoted_string(struct strbuf *out, const char *in)
+{
+   unsigned char c;
+
+   strbuf_addch(out, '"');
+   while ((c = *in++) != '\0') {
+   if (c == '"')
+   strbuf_addstr(out, "\\\"");
+   else if (c == '\\')
+   strbuf_addstr(out, "");
+   else if (c == '\n')
+   strbuf_addstr(out, "\\n");
+   else if (c == '\r')
+   strbuf_addstr(out, "\\r");
+   else if (c == '\t')
+   strbuf_addstr(out, "\\t");
+   else if (c == '\f')
+   strbuf_addstr(out, "\\f");
+   else if (c == '\b')
+   strbuf_addstr(out, "\\b");
+   else if (c < 0x20)
+   strbuf_addf(out, "\\u%04x", c);
+   else
+   strbuf_addch(out, c);
+   }
+   strbuf_addch(out, '"');
+}
+
+static inline void indent_pretty(struct json_writer *jw)
+{
+   int k;
+
+   if (!jw->pretty)
+   return;
+
+   for (k = 0; k < jw->open_stack.len; k++)
+   strbuf_addstr(&jw->json, "  ");
+}
+
+/*
+ * Begin an object or array (either top-level or nested within the currently
+ * open object or array).
+ */
+static inline void begin(struct json_writer *jw, char ch_open, int pretty)
+{
+   jw->pretty = pretty;
+
+   strbuf_addch(&jw->json, ch_open);
+
+   strbuf_addch(&jw->open_stack, ch_open);
+   strbuf_addch(&jw->first_stack, '1');
+}
+
+/*
+ * Assert that the top of the open-stack is an object.
+ */
+static inline void assert_in_object(const struct json_writer *jw, const char 
*key)
+{
+   if (!jw->open_stack.len)
+   BUG("json-writer: object: missing jw_object_begin(): '%s'", 
key);
+   if (jw->open_stack.buf[jw->open_stack.len - 1] != '{')
+   BUG("json-writer: object: not in object: '%s'", key);
+}
+
+/*
+ * Assert that the top of the open-stack is an array.
+ */
+static inline void assert_in_array(const struct json_writer *jw)
+{
+   if (!jw->open_stack.len)
+   BUG("json-writer: array: missin

Re: [PATCH v4 00/21] Integrate commit-graph into 'fsck' and 'gc'

2018-06-05 Thread Derrick Stolee

On 6/5/2018 10:51 AM, Ævar Arnfjörð Bjarmason wrote:

On Mon, Jun 4, 2018 at 6:52 PM, Derrick Stolee  wrote:

Thanks for the feedback on v3. There were several small cleanups, but
perhaps the biggest change is the addition of "commit-graph: use
string-list API for input" which makes "commit-graph: add '--reachable'
option" much simpler.

Do you have a version of this pushed somewhere? Your fsck/upstream
branch conflicts with e.g. long-since-merged nd/repack-keep-pack.


Sorry I forgot to push. It is now available on GitHub [1]. I based my 
commits on 'next' including the core.commitGraph fix for 
t5318-commit-graph.sh I already sent [2].


[1] https://github.com/derrickstolee/git/tree/fsck/upstream
    fsck/upstream branch matches this patch series

[2] 
https://public-inbox.org/git/20180604123906.136417-1-dsto...@microsoft.com/

    [PATCH] t5318-commit-graph.sh: use core.commitGraph


Re: [PATCH 6/6] Forbid "the_index" in dir.c and unpack-trees.c

2018-06-05 Thread Brandon Williams
On 06/05, Nguyễn Thái Ngọc Duy wrote:
> Use of global variables like the_index makes it very hard to follow
> the code flow, especially when it's buried deep in some minor helper
> function.
> 
> We are gradually avoiding global variables, this hopefully will make
> it one step closer. The end game is, the set of "library" source files
> will have just one macro set: "LIB_CODE" (or something). This macro
> will prevent access to most (if not all) global variables in those
> files. We can't do that now, so we just prevent one thing at a time.
> 
> Signed-off-by: Nguyễn Thái Ngọc Duy 
> ---
>  Should I keep my trick of defining the_index to
>  the_index_should_not_be_used_here? It may help somewhat when people
>  accidentally use the_index.

We already have a set of index compatibility macros which this could
maybe be a part of.  Though if we want to go this way I think we should
do the reverse of this and instead have GLOBAL_INDEX which must be
defined in order to have access to the global.  That way new files are
already protected by this and it may be easier to reduce the number of
these defines through our codebase than to add them to every single
file.

> 
>  cache.h| 2 ++
>  dir.c  | 2 ++
>  unpack-trees.c | 2 ++
>  3 files changed, 6 insertions(+)
> 
> diff --git a/cache.h b/cache.h
> index 89a107a7f7..ecc96ccb0e 100644
> --- a/cache.h
> +++ b/cache.h
> @@ -330,7 +330,9 @@ struct index_state {
>   struct ewah_bitmap *fsmonitor_dirty;
>  };
>  
> +#ifndef NO_GLOBAL_INDEX
>  extern struct index_state the_index;
> +#endif
>  
>  /* Name hashing */
>  extern int test_lazy_init_name_hash(struct index_state *istate, int 
> try_threaded);
> diff --git a/dir.c b/dir.c
> index ccf8b4975e..74d848db5a 100644
> --- a/dir.c
> +++ b/dir.c
> @@ -8,6 +8,8 @@
>   *Junio Hamano, 2005-2006
>   */
>  #define NO_THE_INDEX_COMPATIBILITY_MACROS
> +/* Do not use the_index. You should have access to it via function arg */
> +#define NO_GLOBAL_INDEX
>  #include "cache.h"
>  #include "config.h"
>  #include "dir.h"
> diff --git a/unpack-trees.c b/unpack-trees.c
> index 3ace82ca27..9aebe9762b 100644
> --- a/unpack-trees.c
> +++ b/unpack-trees.c
> @@ -1,4 +1,6 @@
>  #define NO_THE_INDEX_COMPATIBILITY_MACROS
> +/* Do not use the_index here, you probably want o->src_index */
> +#define NO_GLOBAL_INDEX
>  #include "cache.h"
>  #include "argv-array.h"
>  #include "repository.h"
> -- 
> 2.18.0.rc0.333.g22e6ee6cdf
> 

-- 
Brandon Williams


Where is git checkout --orphan implemented at.

2018-06-05 Thread Sean Hunt
I would like to see the source code to git checkout --orphan so I can learn how 
it works and so I can manually do what it does by hand on making a new branch 
with no history in the refs folder. I can only do it on my iPhone as my laptop 
has no internet or way to do it there, and the program on my iPhone does not 
have the method implemented yet visually to do it, forcing manual creation of 
the orphan branch by hand in the end. If the public github had all the codes to 
all commands and subcommands like the one above it would be nice or well at 
least a file that explains the source file each command and subcommands are 
from so that way a person like me can use as a reference to make our own git 
gui that has 100% of command line git features.

Sent from my iPhone

Re: [PATCH 3/6] unpack-trees: don't shadow global var the_index

2018-06-05 Thread Ramsay Jones



On 05/06/18 16:43, Nguyễn Thái Ngọc Duy wrote:
> This function mark_new_skip_worktree() has an argument named the_index
> which is also the name of a global variable. While they have different
> types (the global the_index is not a pointer) mistakes can easily
> happen and it's also confusing for readers. Rename the function
> argument to something other than the_index.
> 
> Signed-off-by: Nguyễn Thái Ngọc Duy 
> ---
>  unpack-trees.c | 9 -
>  1 file changed, 4 insertions(+), 5 deletions(-)
> 
> diff --git a/unpack-trees.c b/unpack-trees.c
> index 5d06aa9c98..60d1138e08 100644
> --- a/unpack-trees.c
> +++ b/unpack-trees.c
> @@ -1231,7 +1231,7 @@ static int clear_ce_flags(struct cache_entry **cache, 
> int nr,
>   * Set/Clear CE_NEW_SKIP_WORKTREE according to $GIT_DIR/info/sparse-checkout
>   */
>  static void mark_new_skip_worktree(struct exclude_list *el,
> -struct index_state *the_index,
> +struct index_state *istate,
>  int select_flag, int skip_wt_flag)
>  {
>   int i;
> @@ -1240,8 +1240,8 @@ static void mark_new_skip_worktree(struct exclude_list 
> *el,
>* 1. Pretend the narrowest worktree: only unmerged entries
>* are checked out
>*/
> - for (i = 0; i < the_index->cache_nr; i++) {
> - struct cache_entry *ce = the_index->cache[i];
> + for (i = 0; i < istate->cache_nr; i++) {
> + struct cache_entry *ce = istate->cache[i];
>  
>   if (select_flag && !(ce->ce_flags & select_flag))
>   continue;
> @@ -1256,8 +1256,7 @@ static void mark_new_skip_worktree(struct exclude_list 
> *el,
>* 2. Widen worktree according to sparse-checkout file.
>* Matched entries will have skip_wt_flag cleared (i.e. "in")
>*/
> - clear_ce_flags(the_index->cache, the_index->cache_nr,
> -select_flag, skip_wt_flag, el);
> + clear_ce_flags(istate, select_flag, skip_wt_flag, el);

This looks a bit suspect. The clear_ce_flags() function has
not been modified to take a 'struct index_state *' as its first
parameter, right? (If you look back at the first hunk header, you
will see that it still takes 'struct cache_entry **, int, ...') ;-)

ATB,
Ramsay Jones



Re: [PATCH 4/6] unpack-tress: convert clear_ce_flags* to avoid the_index

2018-06-05 Thread Ramsay Jones



On 05/06/18 16:43, Nguyễn Thái Ngọc Duy wrote:
> Prior to fba92be8f7, this code implicitly (and incorrectly) assumes
> the_index when running the exclude machinery. fba92be8f7 helps show
> this problem clearer because unpack-trees operation is supposed to
> work on whatever index the caller specifies... not specifically
> the_index.
> 
> Update the code to use "istate" argument that's originally from
> mark_new_skip_worktree(). From the call sites, both in unpack_trees(),
> you can see that this function works on two separate indexes:
> o->src_index and o->result. The second mark_new_skip_worktree() so far
> has incorecctly applied exclude rules on o->src_index instead of
> o->result. It's unclear what is the consequences of this, but it's
> definitely wrong.
> 
> [1] fba92be8f7 (dir: convert is_excluded_from_list to take an index -
> 2017-05-05)
> 
> Signed-off-by: Nguyễn Thái Ngọc Duy 
> ---
>  unpack-trees.c | 29 +
>  1 file changed, 17 insertions(+), 12 deletions(-)
> 
> diff --git a/unpack-trees.c b/unpack-trees.c
> index 60d1138e08..5268de7af5 100644
> --- a/unpack-trees.c
> +++ b/unpack-trees.c
> @@ -1085,13 +1085,15 @@ static int unpack_callback(int n, unsigned long mask, 
> unsigned long dirmask, str
>   return mask;
>  }
>  
> -static int clear_ce_flags_1(struct cache_entry **cache, int nr,
> +static int clear_ce_flags_1(struct index_state *istate,
> + struct cache_entry **cache, int nr,
>   struct strbuf *prefix,
>   int select_mask, int clear_mask,
>   struct exclude_list *el, int defval);
>  
>  /* Whole directory matching */
> -static int clear_ce_flags_dir(struct cache_entry **cache, int nr,
> +static int clear_ce_flags_dir(struct index_state *istate,
> +   struct cache_entry **cache, int nr,
> struct strbuf *prefix,
> char *basename,
> int select_mask, int clear_mask,
> @@ -1100,7 +1102,7 @@ static int clear_ce_flags_dir(struct cache_entry 
> **cache, int nr,
>   struct cache_entry **cache_end;
>   int dtype = DT_DIR;
>   int ret = is_excluded_from_list(prefix->buf, prefix->len,
> - basename, &dtype, el, &the_index);
> + basename, &dtype, el, istate);
>   int rc;
>  
>   strbuf_addch(prefix, '/');
> @@ -1122,7 +1124,7 @@ static int clear_ce_flags_dir(struct cache_entry 
> **cache, int nr,
>* calling clear_ce_flags_1(). That function will call
>* the expensive is_excluded_from_list() on every entry.
>*/
> - rc = clear_ce_flags_1(cache, cache_end - cache,
> + rc = clear_ce_flags_1(istate, cache, cache_end - cache,
> prefix,
> select_mask, clear_mask,
> el, ret);
> @@ -1145,7 +1147,8 @@ static int clear_ce_flags_dir(struct cache_entry 
> **cache, int nr,
>   *   cache[0]->name[0..(prefix_len-1)]
>   * Top level path has prefix_len zero.
>   */
> -static int clear_ce_flags_1(struct cache_entry **cache, int nr,
> +static int clear_ce_flags_1(struct index_state *istate,
> + struct cache_entry **cache, int nr,
>   struct strbuf *prefix,
>   int select_mask, int clear_mask,
>   struct exclude_list *el, int defval)
> @@ -1179,7 +1182,7 @@ static int clear_ce_flags_1(struct cache_entry **cache, 
> int nr,
>   len = slash - name;
>   strbuf_add(prefix, name, len);
>  
> - processed = clear_ce_flags_dir(cache, cache_end - cache,
> + processed = clear_ce_flags_dir(istate, cache, cache_end 
> - cache,
>  prefix,
>  prefix->buf + 
> prefix->len - len,
>  select_mask, clear_mask,
> @@ -1193,7 +1196,7 @@ static int clear_ce_flags_1(struct cache_entry **cache, 
> int nr,
>   }
>  
>   strbuf_addch(prefix, '/');
> - cache += clear_ce_flags_1(cache, cache_end - cache,
> + cache += clear_ce_flags_1(istate, cache, cache_end - 
> cache,
> prefix,
> select_mask, clear_mask, el, 
> defval);
>   strbuf_setlen(prefix, prefix->len - len - 1);
> @@ -1203,7 +1206,7 @@ static int clear_ce_flags_1(struct cache_entry **cache, 
> int nr,
>   /* Non-directory */
>   dtype = ce_to_dtype(ce);
>   ret = is_excluded_from_list(ce->name, ce_namelen(ce),
> - name, &dtype, el, &the_index);
> + 

[PATCH 6/8] fetch: refactor to make function args narrower

2018-06-05 Thread Brandon Williams
Refactor find_non_local_tags and get_ref_map to only take the
information they need instead of the entire transport struct. Besides
improving code clarity, this also improves their flexibility, allowing
for a different set of refs to be used instead of relying on the ones
stored in the transport struct.

Signed-off-by: Brandon Williams 
---
 builtin/fetch.c | 52 -
 1 file changed, 26 insertions(+), 26 deletions(-)

diff --git a/builtin/fetch.c b/builtin/fetch.c
index ee8b87c78..b600e1f10 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -254,9 +254,9 @@ static int will_fetch(struct ref **head, const unsigned 
char *sha1)
return 0;
 }
 
-static void find_non_local_tags(struct transport *transport,
-   struct ref **head,
-   struct ref ***tail)
+static void find_non_local_tags(const struct ref *refs,
+   struct ref **head,
+   struct ref ***tail)
 {
struct string_list existing_refs = STRING_LIST_INIT_DUP;
struct string_list remote_refs = STRING_LIST_INIT_NODUP;
@@ -264,7 +264,7 @@ static void find_non_local_tags(struct transport *transport,
struct string_list_item *item = NULL;
 
for_each_ref(add_existing, &existing_refs);
-   for (ref = transport_get_remote_refs(transport, NULL); ref; ref = 
ref->next) {
+   for (ref = refs; ref; ref = ref->next) {
if (!starts_with(ref->name, "refs/tags/"))
continue;
 
@@ -338,7 +338,8 @@ static void find_non_local_tags(struct transport *transport,
string_list_clear(&remote_refs, 0);
 }
 
-static struct ref *get_ref_map(struct transport *transport,
+static struct ref *get_ref_map(struct remote *remote,
+  const struct ref *remote_refs,
   struct refspec *rs,
   int tags, int *autotags)
 {
@@ -346,27 +347,11 @@ static struct ref *get_ref_map(struct transport 
*transport,
struct ref *rm;
struct ref *ref_map = NULL;
struct ref **tail = &ref_map;
-   struct argv_array ref_prefixes = ARGV_ARRAY_INIT;
 
/* opportunistically-updated references: */
struct ref *orefs = NULL, **oref_tail = &orefs;
 
struct string_list existing_refs = STRING_LIST_INIT_DUP;
-   const struct ref *remote_refs;
-
-   if (rs->nr)
-   refspec_ref_prefixes(rs, &ref_prefixes);
-   else if (transport->remote && transport->remote->fetch.nr)
-   refspec_ref_prefixes(&transport->remote->fetch, &ref_prefixes);
-
-   if (ref_prefixes.argc &&
-   (tags == TAGS_SET || (tags == TAGS_DEFAULT && !rs->nr))) {
-   argv_array_push(&ref_prefixes, "refs/tags/");
-   }
-
-   remote_refs = transport_get_remote_refs(transport, &ref_prefixes);
-
-   argv_array_clear(&ref_prefixes);
 
if (rs->nr) {
struct refspec *fetch_refspec;
@@ -403,7 +388,7 @@ static struct ref *get_ref_map(struct transport *transport,
if (refmap.nr)
fetch_refspec = &refmap;
else
-   fetch_refspec = &transport->remote->fetch;
+   fetch_refspec = &remote->fetch;
 
for (i = 0; i < fetch_refspec->nr; i++)
get_fetch_map(ref_map, &fetch_refspec->items[i], 
&oref_tail, 1);
@@ -411,7 +396,6 @@ static struct ref *get_ref_map(struct transport *transport,
die("--refmap option is only meaningful with command-line 
refspec(s).");
} else {
/* Use the defaults */
-   struct remote *remote = transport->remote;
struct branch *branch = branch_get(NULL);
int has_merge = branch_has_merge_config(branch);
if (remote &&
@@ -450,7 +434,7 @@ static struct ref *get_ref_map(struct transport *transport,
/* also fetch all tags */
get_fetch_map(remote_refs, tag_refspec, &tail, 0);
else if (tags == TAGS_DEFAULT && *autotags)
-   find_non_local_tags(transport, &ref_map, &tail);
+   find_non_local_tags(remote_refs, &ref_map, &tail);
 
/* Now append any refs to be updated opportunistically: */
*tail = orefs;
@@ -1137,6 +1121,8 @@ static int do_fetch(struct transport *transport,
struct ref *ref_map;
int autotags = (transport->remote->fetch_tags == 1);
int retcode = 0;
+   const struct ref *remote_refs;
+   struct argv_array ref_prefixes = ARGV_ARRAY_INIT;
 
if (tags == TAGS_DEFAULT) {
if (transport->remote->fetch_tags == 2)
@@ -1152,7 +1138,21 @@ static int do_fetch(struct transport *transport,
goto cleanup;
}
 
-   ref_map = get_ref_map(transport, rs, tags, &autotags);
+   if (rs->nr)
+   refspec_ref_prefixe

[PATCH 3/8] upload-pack: test negotiation with changing repository

2018-06-05 Thread Brandon Williams
Add tests to check the behavior of fetching from a repository which
changes between rounds of negotiation (for example, when different
servers in a load-balancing agreement participate in the same stateless
RPC negotiation). This forms a baseline of comparison to the ref-in-want
functionality (which will be introduced to the client in subsequent
commits), and ensures that subsequent commits do not change existing
behavior.

As part of this effort, a mechanism to substitute strings in a single
HTTP response is added.

Signed-off-by: Brandon Williams 
---
 t/lib-httpd.sh |  1 +
 t/lib-httpd/apache.conf|  8 +++
 t/lib-httpd/one-time-sed.sh| 16 ++
 t/t5703-upload-pack-ref-in-want.sh | 92 ++
 4 files changed, 117 insertions(+)
 create mode 100644 t/lib-httpd/one-time-sed.sh

diff --git a/t/lib-httpd.sh b/t/lib-httpd.sh
index 435a37465..84f8efdd4 100644
--- a/t/lib-httpd.sh
+++ b/t/lib-httpd.sh
@@ -132,6 +132,7 @@ prepare_httpd() {
cp "$TEST_PATH"/passwd "$HTTPD_ROOT_PATH"
install_script broken-smart-http.sh
install_script error.sh
+   install_script one-time-sed.sh
 
ln -s "$LIB_HTTPD_MODULE_PATH" "$HTTPD_ROOT_PATH/modules"
 
diff --git a/t/lib-httpd/apache.conf b/t/lib-httpd/apache.conf
index 724d9ae46..fe68d37bb 100644
--- a/t/lib-httpd/apache.conf
+++ b/t/lib-httpd/apache.conf
@@ -111,9 +111,14 @@ Alias /auth/dumb/ www/auth/dumb/
SetEnv GIT_EXEC_PATH ${GIT_EXEC_PATH}
SetEnv GIT_HTTP_EXPORT_ALL
 
+
+   SetEnv GIT_EXEC_PATH ${GIT_EXEC_PATH}
+   SetEnv GIT_HTTP_EXPORT_ALL
+
 ScriptAliasMatch /smart_*[^/]*/(.*) ${GIT_EXEC_PATH}/git-http-backend/$1
 ScriptAlias /broken_smart/ broken-smart-http.sh/
 ScriptAlias /error/ error.sh/
+ScriptAliasMatch /one_time_sed/(.*) one-time-sed.sh/$1
 
Options FollowSymlinks
 
@@ -123,6 +128,9 @@ ScriptAlias /error/ error.sh/
 
   Options ExecCGI
 
+
+   Options ExecCGI
+
 
Options ExecCGI
 
diff --git a/t/lib-httpd/one-time-sed.sh b/t/lib-httpd/one-time-sed.sh
new file mode 100644
index 0..a9c4aa5f4
--- /dev/null
+++ b/t/lib-httpd/one-time-sed.sh
@@ -0,0 +1,16 @@
+#!/bin/sh
+
+if [ -e one-time-sed ]; then
+   "$GIT_EXEC_PATH/git-http-backend" >out
+
+   sed "$(cat one-time-sed)" out_modified
+
+   if diff out out_modified >/dev/null; then
+   cat out
+   else
+   cat out_modified
+   rm one-time-sed
+   fi
+else
+   "$GIT_EXEC_PATH/git-http-backend"
+fi
diff --git a/t/t5703-upload-pack-ref-in-want.sh 
b/t/t5703-upload-pack-ref-in-want.sh
index 0ef182970..979ab6d03 100755
--- a/t/t5703-upload-pack-ref-in-want.sh
+++ b/t/t5703-upload-pack-ref-in-want.sh
@@ -150,4 +150,96 @@ test_expect_success 'want-ref with ref we already have 
commit for' '
check_output
 '
 
+. "$TEST_DIRECTORY"/lib-httpd.sh
+start_httpd
+
+REPO="$HTTPD_DOCUMENT_ROOT_PATH/repo"
+LOCAL_PRISTINE="$(pwd)/local_pristine"
+
+test_expect_success 'setup repos for change-while-negotiating test' '
+   (
+   git init "$REPO" &&
+   cd "$REPO" &&
+   >.git/git-daemon-export-ok &&
+   test_commit m1 &&
+   git tag -d m1 &&
+
+   # Local repo with many commits (so that negotiation will take
+   # more than 1 request/response pair)
+   git clone "http://127.0.0.1:$LIB_HTTPD_PORT/smart/repo"; 
"$LOCAL_PRISTINE" &&
+   cd "$LOCAL_PRISTINE" &&
+   git checkout -b side &&
+   for i in $(seq 1 33); do test_commit s$i; done &&
+
+   # Add novel commits to upstream
+   git checkout master &&
+   cd "$REPO" &&
+   test_commit m2 &&
+   test_commit m3 &&
+   git tag -d m2 m3
+   ) &&
+   git -C "$LOCAL_PRISTINE" remote set-url origin 
"http://127.0.0.1:$LIB_HTTPD_PORT/one_time_sed/repo"; &&
+   git -C "$LOCAL_PRISTINE" config protocol.version 2
+'
+
+inconsistency() {
+   # Simulate that the server initially reports $2 as the ref
+   # corresponding to $1, and after that, $1 as the ref corresponding to
+   # $1. This corresponds to the real-life situation where the server's
+   # repository appears to change during negotiation, for example, when
+   # different servers in a load-balancing arrangement serve (stateless)
+   # RPCs during a single negotiation.
+   printf "s/%s/%s/" \
+  $(git -C "$REPO" rev-parse $1 | tr -d "\n") \
+  $(git -C "$REPO" rev-parse $2 | tr -d "\n") \
+  >"$HTTPD_ROOT_PATH/one-time-sed"
+}
+
+test_expect_success 'server is initially ahead - no ref in want' '
+   git -C "$REPO" config uploadpack.allowRefInWant false &&
+   rm -rf local &&
+   cp -r "$LOCAL_PRISTINE" local &&
+   inconsistency master 1234567890123456789012345678901234567890 &&
+   test_must_fail git -C local fetch 2>err &&

[PATCH 4/8] fetch: refactor the population of peer ref OIDs

2018-06-05 Thread Brandon Williams
Populate peer ref OIDs in get_ref_map instead of do_fetch. Besides
tightening scopes of variables in the code, this also prepares for
get_ref_map being able to be called multiple times within do_fetch.

Signed-off-by: Brandon Williams 
---
 builtin/fetch.c | 36 ++--
 1 file changed, 18 insertions(+), 18 deletions(-)

diff --git a/builtin/fetch.c b/builtin/fetch.c
index ea5b9669a..545635448 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -351,6 +351,7 @@ static struct ref *get_ref_map(struct transport *transport,
/* opportunistically-updated references: */
struct ref *orefs = NULL, **oref_tail = &orefs;
 
+   struct string_list existing_refs = STRING_LIST_INIT_DUP;
const struct ref *remote_refs;
 
if (rs->nr)
@@ -458,7 +459,23 @@ static struct ref *get_ref_map(struct transport *transport,
tail = &rm->next;
}
 
-   return ref_remove_duplicates(ref_map);
+   ref_map = ref_remove_duplicates(ref_map);
+
+   for_each_ref(add_existing, &existing_refs);
+   for (rm = ref_map; rm; rm = rm->next) {
+   if (rm->peer_ref) {
+   struct string_list_item *peer_item =
+   string_list_lookup(&existing_refs,
+  rm->peer_ref->name);
+   if (peer_item) {
+   struct object_id *old_oid = peer_item->util;
+   oidcpy(&rm->peer_ref->old_oid, old_oid);
+   }
+   }
+   }
+   string_list_clear(&existing_refs, 1);
+
+   return ref_map;
 }
 
 #define STORE_REF_ERROR_OTHER 1
@@ -1110,14 +1127,10 @@ static void backfill_tags(struct transport *transport, 
struct ref *ref_map)
 static int do_fetch(struct transport *transport,
struct refspec *rs)
 {
-   struct string_list existing_refs = STRING_LIST_INIT_DUP;
struct ref *ref_map;
-   struct ref *rm;
int autotags = (transport->remote->fetch_tags == 1);
int retcode = 0;
 
-   for_each_ref(add_existing, &existing_refs);
-
if (tags == TAGS_DEFAULT) {
if (transport->remote->fetch_tags == 2)
tags = TAGS_SET;
@@ -1136,18 +1149,6 @@ static int do_fetch(struct transport *transport,
if (!update_head_ok)
check_not_current_branch(ref_map);
 
-   for (rm = ref_map; rm; rm = rm->next) {
-   if (rm->peer_ref) {
-   struct string_list_item *peer_item =
-   string_list_lookup(&existing_refs,
-  rm->peer_ref->name);
-   if (peer_item) {
-   struct object_id *old_oid = peer_item->util;
-   oidcpy(&rm->peer_ref->old_oid, old_oid);
-   }
-   }
-   }
-
if (tags == TAGS_DEFAULT && autotags)
transport_set_option(transport, TRANS_OPT_FOLLOWTAGS, "1");
if (prune) {
@@ -1183,7 +1184,6 @@ static int do_fetch(struct transport *transport,
}
 
  cleanup:
-   string_list_clear(&existing_refs, 1);
return retcode;
 }
 
-- 
2.17.1.1185.g55be947832-goog



Re: Where is git checkout --orphan implemented at.

2018-06-05 Thread Jeff King
On Tue, Jun 05, 2018 at 12:02:12PM -0400, Sean Hunt wrote:

> I would like to see the source code to git checkout --orphan so I can
> learn how it works and so I can manually do what it does by hand on
> making a new branch with no history in the refs folder. I can only do
> it on my iPhone as my laptop has no internet or way to do it there,
> and the program on my iPhone does not have the method implemented yet
> visually to do it, forcing manual creation of the orphan branch by
> hand in the end. If the public github had all the codes to all
> commands and subcommands like the one above it would be nice or well
> at least a file that explains the source file each command and
> subcommands are from so that way a person like me can use as a
> reference to make our own git gui that has 100% of command line git
> features.

The code for "checkout --orphan" is in builtin/checkout.c[1].

But if you want to do roughly the same thing with other tools, you can
do:

 git symbolic-ref HEAD refs/heads/new-branch

If you don't even have symbolic-ref handy, you can do:

  echo "ref: refs/heads/new-branch" >.git/HEAD

That's not generally recommended, since future versions of Git may
change the ref storage format, but it would work with any current
version of Git.

-Peff

[1] Try update_refs_for_switch(), especially:


https://github.com/git/git/blob/61856ae69a2ceb241a90e47953e18f218e4d5f2f/builtin/checkout.c#L635

and


https://github.com/git/git/blob/61856ae69a2ceb241a90e47953e18f218e4d5f2f/builtin/checkout.c#L695


[PATCH 8/8] fetch-pack: implement ref-in-want

2018-06-05 Thread Brandon Williams
Implement ref-in-want on the client side so that when a server supports
the "ref-in-want" feature, a client will send "want-ref" lines for each
reference the client wants to fetch.

Signed-off-by: Brandon Williams 
---
 fetch-pack.c   | 35 +++---
 remote.c   |  1 +
 remote.h   |  1 +
 t/t5703-upload-pack-ref-in-want.sh |  4 ++--
 4 files changed, 36 insertions(+), 5 deletions(-)

diff --git a/fetch-pack.c b/fetch-pack.c
index 7799ee2cd..51e8356ba 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -1102,9 +1102,10 @@ static void add_shallow_requests(struct strbuf *req_buf,
 
 static void add_wants(const struct ref *wants, struct strbuf *req_buf)
 {
+   int use_ref_in_want = server_supports_feature("fetch", "ref-in-want", 
0);
+
for ( ; wants ; wants = wants->next) {
const struct object_id *remote = &wants->old_oid;
-   const char *remote_hex;
struct object *o;
 
/*
@@ -1122,8 +1123,10 @@ static void add_wants(const struct ref *wants, struct 
strbuf *req_buf)
continue;
}
 
-   remote_hex = oid_to_hex(remote);
-   packet_buf_write(req_buf, "want %s\n", remote_hex);
+   if (!use_ref_in_want || wants->exact_sha1)
+   packet_buf_write(req_buf, "want %s\n", 
oid_to_hex(remote));
+   else
+   packet_buf_write(req_buf, "want-ref %s\n", wants->name);
}
 }
 
@@ -1334,6 +1337,29 @@ static void receive_shallow_info(struct fetch_pack_args 
*args,
args->deepen = 1;
 }
 
+static void receive_wanted_refs(struct packet_reader *reader, struct ref *refs)
+{
+   process_section_header(reader, "wanted-refs", 0);
+   while (packet_reader_read(reader) == PACKET_READ_NORMAL) {
+   struct object_id oid;
+   const char *end;
+   struct ref *r = NULL;
+
+   if (parse_oid_hex(reader->line, &oid, &end) || *end++ != ' ')
+   die("expected wanted-ref, got '%s'", reader->line);
+
+   for (r = refs; r; r = r->next) {
+   if (!strcmp(end, r->name)) {
+   oidcpy(&r->old_oid, &oid);
+   break;
+   }
+   }
+   }
+
+   if (reader->status != PACKET_READ_DELIM)
+   die("error processing wanted refs: %d", reader->status);
+}
+
 enum fetch_state {
FETCH_CHECK_LOCAL = 0,
FETCH_SEND_REQUEST,
@@ -1408,6 +1434,9 @@ static struct ref *do_fetch_pack_v2(struct 
fetch_pack_args *args,
if (process_section_header(&reader, "shallow-info", 1))
receive_shallow_info(args, &reader);
 
+   if (process_section_header(&reader, "wanted-refs", 1))
+   receive_wanted_refs(&reader, ref);
+
/* get the pack */
process_section_header(&reader, "packfile", 0);
if (get_pack(args, fd, pack_lockfile))
diff --git a/remote.c b/remote.c
index abe80c139..c9d452ac0 100644
--- a/remote.c
+++ b/remote.c
@@ -1735,6 +1735,7 @@ int get_fetch_map(const struct ref *remote_refs,
if (refspec->exact_sha1) {
ref_map = alloc_ref(name);
get_oid_hex(name, &ref_map->old_oid);
+   ref_map->exact_sha1 = 1;
} else {
ref_map = get_remote_ref(remote_refs, name);
}
diff --git a/remote.h b/remote.h
index 45ecc6cef..e5338e368 100644
--- a/remote.h
+++ b/remote.h
@@ -73,6 +73,7 @@ struct ref {
force:1,
forced_update:1,
expect_old_sha1:1,
+   exact_sha1:1,
deletion:1;
 
enum {
diff --git a/t/t5703-upload-pack-ref-in-want.sh 
b/t/t5703-upload-pack-ref-in-want.sh
index 979ab6d03..b94a51380 100755
--- a/t/t5703-upload-pack-ref-in-want.sh
+++ b/t/t5703-upload-pack-ref-in-want.sh
@@ -204,7 +204,7 @@ test_expect_success 'server is initially ahead - no ref in 
want' '
grep "ERR upload-pack: not our ref" err
 '
 
-test_expect_failure 'server is initially ahead - ref in want' '
+test_expect_success 'server is initially ahead - ref in want' '
git -C "$REPO" config uploadpack.allowRefInWant true &&
rm -rf local &&
cp -r "$LOCAL_PRISTINE" local &&
@@ -228,7 +228,7 @@ test_expect_success 'server is initially behind - no ref in 
want' '
test_cmp expected actual
 '
 
-test_expect_failure 'server is initially behind - ref in want' '
+test_expect_success 'server is initially behind - ref in want' '
git -C "$REPO" config uploadpack.allowRefInWant true &&
rm -rf local &&
cp -r "$LOCAL_PRISTINE" local &&
-- 
2.17.1.1185.g55be947832-goog



[PATCH 7/8] fetch-pack: put shallow info in output parameter

2018-06-05 Thread Brandon Williams
Expand the transport fetch method signature, by adding an output
parameter, to allow transports to return information about the refs they
have fetched.  Then communicate shallow status information through this
mechanism instead of by modifying the input list of refs.

This does require clients to sometimes generate the ref map twice: once
from the list of refs provided by the remote (as is currently done) and
potentially once from the new list of refs that the fetch mechanism
provides.

Signed-off-by: Brandon Williams 
---
 builtin/clone.c  |  4 ++--
 builtin/fetch.c  | 23 +++
 fetch-object.c   |  2 +-
 fetch-pack.c | 17 +
 transport-helper.c   |  6 --
 transport-internal.h |  9 -
 transport.c  | 34 --
 transport.h  |  3 ++-
 8 files changed, 73 insertions(+), 25 deletions(-)

diff --git a/builtin/clone.c b/builtin/clone.c
index 99e73dae8..8f86d99c5 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -1155,7 +1155,7 @@ int cmd_clone(int argc, const char **argv, const char 
*prefix)
}
 
if (!is_local && !complete_refs_before_fetch)
-   transport_fetch_refs(transport, mapped_refs);
+   transport_fetch_refs(transport, mapped_refs, NULL);
 
remote_head = find_ref_by_name(refs, "HEAD");
remote_head_points_at =
@@ -1197,7 +1197,7 @@ int cmd_clone(int argc, const char **argv, const char 
*prefix)
if (is_local)
clone_local(path, git_dir);
else if (refs && complete_refs_before_fetch)
-   transport_fetch_refs(transport, mapped_refs);
+   transport_fetch_refs(transport, mapped_refs, NULL);
 
update_remote_refs(refs, mapped_refs, remote_head_points_at,
   branch_top.buf, reflog_msg.buf, transport,
diff --git a/builtin/fetch.c b/builtin/fetch.c
index b600e1f10..ddf44ba1a 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -946,11 +946,13 @@ static int quickfetch(struct ref *ref_map)
return check_connected(iterate_ref_map, &rm, &opt);
 }
 
-static int fetch_refs(struct transport *transport, struct ref *ref_map)
+static int fetch_refs(struct transport *transport, struct ref *ref_map,
+ struct ref **updated_remote_refs)
 {
int ret = quickfetch(ref_map);
if (ret)
-   ret = transport_fetch_refs(transport, ref_map);
+   ret = transport_fetch_refs(transport, ref_map,
+  updated_remote_refs);
if (ret)
transport_unlock_pack(transport);
return ret;
@@ -1106,7 +1108,7 @@ static void backfill_tags(struct transport *transport, 
struct ref *ref_map)
transport_set_option(transport, TRANS_OPT_FOLLOWTAGS, NULL);
transport_set_option(transport, TRANS_OPT_DEPTH, "0");
transport_set_option(transport, TRANS_OPT_DEEPEN_RELATIVE, NULL);
-   if (!fetch_refs(transport, ref_map))
+   if (!fetch_refs(transport, ref_map, NULL))
consume_refs(transport, ref_map);
 
if (gsecondary) {
@@ -1122,6 +1124,7 @@ static int do_fetch(struct transport *transport,
int autotags = (transport->remote->fetch_tags == 1);
int retcode = 0;
const struct ref *remote_refs;
+   struct ref *new_remote_refs = NULL;
struct argv_array ref_prefixes = ARGV_ARRAY_INIT;
 
if (tags == TAGS_DEFAULT) {
@@ -1172,7 +1175,19 @@ static int do_fetch(struct transport *transport,
   transport->url);
}
}
-   if (fetch_refs(transport, ref_map) || consume_refs(transport, ref_map)) 
{
+
+   if (fetch_refs(transport, ref_map, &new_remote_refs)) {
+   free_refs(ref_map);
+   retcode = 1;
+   goto cleanup;
+   }
+   if (new_remote_refs) {
+   free_refs(ref_map);
+   ref_map = get_ref_map(transport->remote, new_remote_refs, rs,
+ tags, &autotags);
+   free_refs(new_remote_refs);
+   }
+   if (consume_refs(transport, ref_map)) {
free_refs(ref_map);
retcode = 1;
goto cleanup;
diff --git a/fetch-object.c b/fetch-object.c
index 853624f81..48fe63dd6 100644
--- a/fetch-object.c
+++ b/fetch-object.c
@@ -19,7 +19,7 @@ static void fetch_refs(const char *remote_name, struct ref 
*ref)
 
transport_set_option(transport, TRANS_OPT_FROM_PROMISOR, "1");
transport_set_option(transport, TRANS_OPT_NO_DEPENDENTS, "1");
-   transport_fetch_refs(transport, ref);
+   transport_fetch_refs(transport, ref, NULL);
fetch_if_missing = original_fetch_if_missing;
 }
 
diff --git a/fetch-pack.c b/fetch-pack.c
index a320ce987..7799ee2cd 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -1470,12 +1470,13 @@ static int 

[PATCH 5/8] fetch: refactor fetch_refs into two functions

2018-06-05 Thread Brandon Williams
Refactor the fetch_refs function into a function that does the fetching
of refs and another function that stores them.

Signed-off-by: Brandon Williams 
---
 builtin/fetch.c | 19 +--
 1 file changed, 13 insertions(+), 6 deletions(-)

diff --git a/builtin/fetch.c b/builtin/fetch.c
index 545635448..ee8b87c78 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -967,10 +967,16 @@ static int fetch_refs(struct transport *transport, struct 
ref *ref_map)
int ret = quickfetch(ref_map);
if (ret)
ret = transport_fetch_refs(transport, ref_map);
-   if (!ret)
-   ret |= store_updated_refs(transport->url,
-   transport->remote->name,
-   ref_map);
+   if (ret)
+   transport_unlock_pack(transport);
+   return ret;
+}
+
+static int consume_refs(struct transport *transport, struct ref *ref_map)
+{
+   int ret = store_updated_refs(transport->url,
+transport->remote->name,
+ref_map);
transport_unlock_pack(transport);
return ret;
 }
@@ -1116,7 +1122,8 @@ static void backfill_tags(struct transport *transport, 
struct ref *ref_map)
transport_set_option(transport, TRANS_OPT_FOLLOWTAGS, NULL);
transport_set_option(transport, TRANS_OPT_DEPTH, "0");
transport_set_option(transport, TRANS_OPT_DEEPEN_RELATIVE, NULL);
-   fetch_refs(transport, ref_map);
+   if (!fetch_refs(transport, ref_map))
+   consume_refs(transport, ref_map);
 
if (gsecondary) {
transport_disconnect(gsecondary);
@@ -1165,7 +1172,7 @@ static int do_fetch(struct transport *transport,
   transport->url);
}
}
-   if (fetch_refs(transport, ref_map)) {
+   if (fetch_refs(transport, ref_map) || consume_refs(transport, ref_map)) 
{
free_refs(ref_map);
retcode = 1;
goto cleanup;
-- 
2.17.1.1185.g55be947832-goog



[PATCH 1/8] test-pkt-line: add unpack-sideband subcommand

2018-06-05 Thread Brandon Williams
Add an 'unpack-sideband' subcommand to the test-pkt-line helper to
enable unpacking packet line data sent multiplexed using a sideband.

Signed-off-by: Brandon Williams 
---
 t/helper/test-pkt-line.c | 37 +
 1 file changed, 37 insertions(+)

diff --git a/t/helper/test-pkt-line.c b/t/helper/test-pkt-line.c
index 0f19e53c7..2a551 100644
--- a/t/helper/test-pkt-line.c
+++ b/t/helper/test-pkt-line.c
@@ -1,3 +1,4 @@
+#include "cache.h"
 #include "pkt-line.h"
 
 static void pack_line(const char *line)
@@ -48,6 +49,40 @@ static void unpack(void)
}
 }
 
+static void unpack_sideband(void)
+{
+   struct packet_reader reader;
+   packet_reader_init(&reader, 0, NULL, 0,
+  PACKET_READ_GENTLE_ON_EOF |
+  PACKET_READ_CHOMP_NEWLINE);
+
+   while (packet_reader_read(&reader) != PACKET_READ_EOF) {
+   int band;
+   int fd;
+
+   switch (reader.status) {
+   case PACKET_READ_EOF:
+   break;
+   case PACKET_READ_NORMAL:
+   band = reader.line[0] & 0xff;
+   if (band == 1)
+   fd = 1;
+   else
+   fd = 2;
+
+   write_or_die(fd, reader.line+1, reader.pktlen-1);
+
+   if (band == 3)
+   die("sind-band error");
+   break;
+   case PACKET_READ_FLUSH:
+   return;
+   case PACKET_READ_DELIM:
+   break;
+   }
+   }
+}
+
 int cmd_main(int argc, const char **argv)
 {
if (argc < 2)
@@ -57,6 +92,8 @@ int cmd_main(int argc, const char **argv)
pack(argc - 2, argv + 2);
else if (!strcmp(argv[1], "unpack"))
unpack();
+   else if (!strcmp(argv[1], "unpack-sideband"))
+   unpack_sideband();
else
die("invalid argument '%s'", argv[1]);
 
-- 
2.17.1.1185.g55be947832-goog



[PATCH 0/8] ref-in-want

2018-06-05 Thread Brandon Williams
This series adds the ref-in-want feature which was originally proposed
by Jonathan Tan
(https://public-inbox.org/git/cover.1485381677.git.jonathanta...@google.com/).

Back when ref-in-want was first discussed it was decided that we should
first solve the issue of moving to a new wire format and find a way to
limit the ref-advertisement before moving forward with ref-in-want.  Now
that protocol version 2 is a reality, and that refs can be filtered on
the server side, we can revisit ref-in-want.

This version of ref-in-want is a bit more restrictive than what Jonathan
originally proposed (only full ref names are allowed instead of globs
and OIDs), but it is meant to accomplish the same goal (solve the issues
of refs changing during negotiation).

Brandon Williams (8):
  test-pkt-line: add unpack-sideband subcommand
  upload-pack: implement ref-in-want
  upload-pack: test negotiation with changing repository
  fetch: refactor the population of peer ref OIDs
  fetch: refactor fetch_refs into two functions
  fetch: refactor to make function args narrower
  fetch-pack: put shallow info in output parameter
  fetch-pack: implement ref-in-want

 Documentation/config.txt|   4 +
 Documentation/technical/protocol-v2.txt |  28 ++-
 builtin/clone.c |   4 +-
 builtin/fetch.c | 126 +++-
 fetch-object.c  |   2 +-
 fetch-pack.c|  52 +++--
 remote.c|   1 +
 remote.h|   1 +
 t/helper/test-pkt-line.c|  37 
 t/lib-httpd.sh  |   1 +
 t/lib-httpd/apache.conf |   8 +
 t/lib-httpd/one-time-sed.sh |  16 ++
 t/t5703-upload-pack-ref-in-want.sh  | 245 
 transport-helper.c  |   6 +-
 transport-internal.h|   9 +-
 transport.c |  34 +++-
 transport.h |   3 +-
 upload-pack.c   |  64 +++
 18 files changed, 564 insertions(+), 77 deletions(-)
 create mode 100644 t/lib-httpd/one-time-sed.sh
 create mode 100755 t/t5703-upload-pack-ref-in-want.sh

-- 
2.17.1.1185.g55be947832-goog



[PATCH 2/8] upload-pack: implement ref-in-want

2018-06-05 Thread Brandon Williams
Currently, while performing packfile negotiation, clients are only
allowed to specify their desired objects using object ids.  This causes
a vulnerability to failure when an object turns non-existent during
negotiation, which may happen if, for example, the desired repository is
provided by multiple Git servers in a load-balancing arrangement.

In order to eliminate this vulnerability, implement the ref-in-want
feature for the 'fetch' command in protocol version 2.  This feature
enables the 'fetch' command to support requests in the form of ref names
through a new "want-ref " parameter.  At the conclusion of
negotiation, the server will send a list of all of the wanted references
(as provided by "want-ref" lines) in addition to the generated packfile.

Signed-off-by: Brandon Williams 
---
 Documentation/config.txt|   4 +
 Documentation/technical/protocol-v2.txt |  28 -
 t/t5703-upload-pack-ref-in-want.sh  | 153 
 upload-pack.c   |  64 ++
 4 files changed, 248 insertions(+), 1 deletion(-)
 create mode 100755 t/t5703-upload-pack-ref-in-want.sh

diff --git a/Documentation/config.txt b/Documentation/config.txt
index ab641bf5a..acafe6c8d 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -3479,6 +3479,10 @@ Note that this configuration variable is ignored if it 
is seen in the
 repository-level config (this is a safety measure against fetching from
 untrusted repositories).
 
+uploadpack.allowRefInWant::
+   If this option is set, `upload-pack` will support the `ref-in-want`
+   feature of the protocol version 2 `fetch` command.
+
 url..insteadOf::
Any URL that starts with this value will be rewritten to
start, instead, with . In cases where some site serves a
diff --git a/Documentation/technical/protocol-v2.txt 
b/Documentation/technical/protocol-v2.txt
index 49bda76d2..8367e09b8 100644
--- a/Documentation/technical/protocol-v2.txt
+++ b/Documentation/technical/protocol-v2.txt
@@ -299,12 +299,21 @@ included in the client's request:
for use with partial clone and partial fetch operations. See
`rev-list` for possible "filter-spec" values.
 
+If the 'ref-in-want' feature is advertised, the following argument can
+be included in the client's request as well as the potential addition of
+the 'wanted-refs' section in the server's response as explained below.
+
+want-ref 
+   Indicates to the server than the client wants to retrieve a
+   particular ref, where  is the full name of a ref on the
+   server.
+
 The response of `fetch` is broken into a number of sections separated by
 delimiter packets (0001), with each section beginning with its section
 header.
 
 output = *section
-section = (acknowledgments | shallow-info | packfile)
+section = (acknowledgments | shallow-info | wanted-refs | packfile)
  (flush-pkt | delim-pkt)
 
 acknowledgments = PKT-LINE("acknowledgments" LF)
@@ -319,6 +328,10 @@ header.
 shallow = "shallow" SP obj-id
 unshallow = "unshallow" SP obj-id
 
+wanted-refs = PKT-LINE("wanted-refs" LF)
+ *PKT-Line(wanted-ref LF)
+wanted-ref = obj-id SP refname
+
 packfile = PKT-LINE("packfile" LF)
   *PKT-LINE(%x01-03 *%x00-ff)
 
@@ -379,6 +392,19 @@ header.
* This section is only included if a packfile section is also
  included in the response.
 
+wanted-refs section
+   * This section is only included if the client has requested a
+ ref using a 'want-ref' line and if a packfile section is also
+ included in the response.
+
+   * Always begins with the section header "wanted-refs"
+
+   * The server will send a ref listing (" ") for
+ each reference requested using 'want-ref' lines.
+
+   * Ther server MUST NOT send any refs which were not requested
+ using 'want-ref' lines.
+
 packfile section
* This section is only included if the client has sent 'want'
  lines in its request and either requested that no more
diff --git a/t/t5703-upload-pack-ref-in-want.sh 
b/t/t5703-upload-pack-ref-in-want.sh
new file mode 100755
index 0..0ef182970
--- /dev/null
+++ b/t/t5703-upload-pack-ref-in-want.sh
@@ -0,0 +1,153 @@
+#!/bin/sh
+
+test_description='upload-pack ref-in-want'
+
+. ./test-lib.sh
+
+get_actual_refs() {
+   sed -n '/wanted-refs/,/0001/p' actual_refs
+}
+
+get_actual_commits() {
+   sed -n '/packfile/,//p' o.pack &&
+   git index-pack o.pack &&
+   git verify-pack -v o.idx | grep commit | cut -c-40 | sort 
>actual_commits
+}
+
+check_output() {
+   get_actual_refs &&
+   test_cmp expected_refs actual_refs &&
+   get_actual_commits &&
+   test_cmp expected_commits actual_commits
+}
+
+# c(o/foo) d(o/bar)
+#\ /
+# b   e(baz)  f(master)
+#  \__  |  __/
+# \ | /
+#   a
+test_expect_success 'setup reposi

Re: [PATCH 2/8] upload-pack: implement ref-in-want

2018-06-05 Thread Ramsay Jones



On 05/06/18 18:51, Brandon Williams wrote:
> Currently, while performing packfile negotiation, clients are only
> allowed to specify their desired objects using object ids.  This causes
> a vulnerability to failure when an object turns non-existent during
> negotiation, which may happen if, for example, the desired repository is
> provided by multiple Git servers in a load-balancing arrangement.
> 
> In order to eliminate this vulnerability, implement the ref-in-want
> feature for the 'fetch' command in protocol version 2.  This feature
> enables the 'fetch' command to support requests in the form of ref names
> through a new "want-ref " parameter.  At the conclusion of
> negotiation, the server will send a list of all of the wanted references
> (as provided by "want-ref" lines) in addition to the generated packfile.
> 
> Signed-off-by: Brandon Williams 
> ---
>  Documentation/config.txt|   4 +
>  Documentation/technical/protocol-v2.txt |  28 -
>  t/t5703-upload-pack-ref-in-want.sh  | 153 
>  upload-pack.c   |  64 ++
>  4 files changed, 248 insertions(+), 1 deletion(-)
>  create mode 100755 t/t5703-upload-pack-ref-in-want.sh
> 
> diff --git a/Documentation/config.txt b/Documentation/config.txt
> index ab641bf5a..acafe6c8d 100644
> --- a/Documentation/config.txt
> +++ b/Documentation/config.txt
> @@ -3479,6 +3479,10 @@ Note that this configuration variable is ignored if it 
> is seen in the
>  repository-level config (this is a safety measure against fetching from
>  untrusted repositories).
>  
> +uploadpack.allowRefInWant::
> + If this option is set, `upload-pack` will support the `ref-in-want`
> + feature of the protocol version 2 `fetch` command.
> +
>  url..insteadOf::
>   Any URL that starts with this value will be rewritten to
>   start, instead, with . In cases where some site serves a
> diff --git a/Documentation/technical/protocol-v2.txt 
> b/Documentation/technical/protocol-v2.txt
> index 49bda76d2..8367e09b8 100644
> --- a/Documentation/technical/protocol-v2.txt
> +++ b/Documentation/technical/protocol-v2.txt
> @@ -299,12 +299,21 @@ included in the client's request:
>   for use with partial clone and partial fetch operations. See
>   `rev-list` for possible "filter-spec" values.
>  
> +If the 'ref-in-want' feature is advertised, the following argument can
> +be included in the client's request as well as the potential addition of
> +the 'wanted-refs' section in the server's response as explained below.
> +
> +want-ref 
> + Indicates to the server than the client wants to retrieve a
> + particular ref, where  is the full name of a ref on the
> + server.
> +
>  The response of `fetch` is broken into a number of sections separated by
>  delimiter packets (0001), with each section beginning with its section
>  header.
>  
>  output = *section
> -section = (acknowledgments | shallow-info | packfile)
> +section = (acknowledgments | shallow-info | wanted-refs | packfile)
> (flush-pkt | delim-pkt)
>  
>  acknowledgments = PKT-LINE("acknowledgments" LF)
> @@ -319,6 +328,10 @@ header.
>  shallow = "shallow" SP obj-id
>  unshallow = "unshallow" SP obj-id
>  
> +wanted-refs = PKT-LINE("wanted-refs" LF)
> +   *PKT-Line(wanted-ref LF)

s/PKT-Line/PKT-LINE/

ATB,
Ramsay Jones



Re: [PATCHv1 1/3] git-p4: raise exceptions from p4CmdList based on error from p4 server

2018-06-05 Thread Eric Sunshine
On Tue, Jun 5, 2018 at 6:56 AM Luke Diamand  wrote:
> On 5 June 2018 at 10:54, Eric Sunshine  wrote:
> > On Tue, Jun 5, 2018 at 5:14 AM Luke Diamand  wrote:
> >> +m = re.search('Too many rows scanned \(over (\d+)\)', 
> >> data)
> >> +if not m:
> >> +m = re.search('Request too large \(over (\d+)\)', 
> >> data)
> >
> > Does 'p4' localize these error messages?
>
> That's a good question.
>
> It turns out that Perforce open-sourced the P4 client in 2014 (I only
> recently found this out) so we can actually look at the code now!
>
> Here's the code:
>
> // ErrorId graveyard: retired/deprecated ErrorIds.

Hmm, the "too many rows" error you're seeing is retired/deprecated(?).

> ErrorId MsgDb::MaxResults  = { ErrorOf( ES_DB, 32,
> E_FAILED, EV_ADMIN, 1 ), "Request too large (over %maxResults%); see
> 'p4 help maxresults'." } ;//NOTRANS
> ErrorId MsgDb::MaxScanRows = { ErrorOf( ES_DB, 61,
> E_FAILED, EV_ADMIN, 1 ), "Too many rows scanned (over %maxScanRows%);
> see 'p4 help maxscanrows'." } ;//NOTRANS
>
> I don't think there's actually a way to make it return any language
> other than English though. [...]
> So I think probably the language is always English.

The "NOTRANS" annotation on the error messages is reassuring.


Re: [PATCHv1 1/1] git-p4: better error reporting when p4 fails

2018-06-05 Thread Eric Sunshine
On Tue, Jun 5, 2018 at 6:59 AM Luke Diamand  wrote:
> On 5 June 2018 at 11:49, Merland Romain  wrote:
> >> +# now check that we can actually talk to the server
> >> +global p4_access_checked
> >> +if not p4_access_checked:
> >> +p4_access_checked = True
> >> +p4_check_access()
> >> +
> >
> > Just switch the 2 lines 'p4_access_checked = True' and 'p4_check_access()'
> > It seems to me more logical
>
> Like this:
>
> +p4_check_access()
> +p4_access_checked = True
>
> You need to set p4_access_checked first so that it doesn't go and try
> to check the p4 access before running "p4 login -s", which would then
> get stuck forever.

Such subtlety may deserve an in-code comment.


[PATCH 2/3] refspec: add back a refspec_item_init() function

2018-06-05 Thread Ævar Arnfjörð Bjarmason
Re-add the non-fatal version of refspec_item_init_or_die() renamed
away in an earlier change to get a more minimal diff. This should be
used by callers that have their own error handling.

This new function could be marked "static" since nothing outside of
refspec.c uses it, but expecting future use of it, let's make it
available to other users.

Signed-off-by: Ævar Arnfjörð Bjarmason 
---
 refspec.c | 10 +++---
 refspec.h |  2 ++
 2 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/refspec.c b/refspec.c
index 0fd392e96b..a35493e35e 100644
--- a/refspec.c
+++ b/refspec.c
@@ -124,12 +124,16 @@ static int parse_refspec(struct refspec_item *item, const 
char *refspec, int fet
return 1;
 }
 
-void refspec_item_init_or_die(struct refspec_item *item, const char *refspec,
- int fetch)
+int refspec_item_init(struct refspec_item *item, const char *refspec, int 
fetch)
 {
memset(item, 0, sizeof(*item));
+   return parse_refspec(item, refspec, fetch);
+}
 
-   if (!parse_refspec(item, refspec, fetch))
+void refspec_item_init_or_die(struct refspec_item *item, const char *refspec,
+ int fetch)
+{
+   if (!refspec_item_init(item, refspec, fetch))
die("Invalid refspec '%s'", refspec);
 }
 
diff --git a/refspec.h b/refspec.h
index 4caaf1f8e3..9b6e64a824 100644
--- a/refspec.h
+++ b/refspec.h
@@ -32,6 +32,8 @@ struct refspec {
int fetch;
 };
 
+int refspec_item_init(struct refspec_item *item, const char *refspec,
+ int fetch);
 void refspec_item_init_or_die(struct refspec_item *item, const char *refspec,
  int fetch);
 void refspec_item_clear(struct refspec_item *item);
-- 
2.17.0.290.gded63e768a



git@vger.kernel.org

2018-06-05 Thread Ævar Arnfjörð Bjarmason
Rename the refspec_item_init() function introduced in
6d4c057859 ("refspec: introduce struct refspec", 2018-05-16) to
refspec_item_init_or_die().

This follows the convention of other *_or_die() functions, and is done
in preparation for making it a wrapper for a non-fatal variant.

Signed-off-by: Ævar Arnfjörð Bjarmason 
---
 builtin/clone.c | 2 +-
 builtin/pull.c  | 2 +-
 refspec.c   | 5 +++--
 refspec.h   | 3 ++-
 4 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/builtin/clone.c b/builtin/clone.c
index 99e73dae85..74a804f2e8 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -1077,7 +1077,7 @@ int cmd_clone(int argc, const char **argv, const char 
*prefix)
if (option_required_reference.nr || option_optional_reference.nr)
setup_reference();
 
-   refspec_item_init(&refspec, value.buf, REFSPEC_FETCH);
+   refspec_item_init_or_die(&refspec, value.buf, REFSPEC_FETCH);
 
strbuf_reset(&value);
 
diff --git a/builtin/pull.c b/builtin/pull.c
index 1f2ecf3a88..bb64631d98 100644
--- a/builtin/pull.c
+++ b/builtin/pull.c
@@ -684,7 +684,7 @@ static const char *get_tracking_branch(const char *remote, 
const char *refspec)
const char *spec_src;
const char *merge_branch;
 
-   refspec_item_init(&spec, refspec, REFSPEC_FETCH);
+   refspec_item_init_or_die(&spec, refspec, REFSPEC_FETCH);
spec_src = spec.src;
if (!*spec_src || !strcmp(spec_src, "HEAD"))
spec_src = "HEAD";
diff --git a/refspec.c b/refspec.c
index 78edc48ae8..0fd392e96b 100644
--- a/refspec.c
+++ b/refspec.c
@@ -124,7 +124,8 @@ static int parse_refspec(struct refspec_item *item, const 
char *refspec, int fet
return 1;
 }
 
-void refspec_item_init(struct refspec_item *item, const char *refspec, int 
fetch)
+void refspec_item_init_or_die(struct refspec_item *item, const char *refspec,
+ int fetch)
 {
memset(item, 0, sizeof(*item));
 
@@ -152,7 +153,7 @@ void refspec_append(struct refspec *rs, const char *refspec)
 {
struct refspec_item item;
 
-   refspec_item_init(&item, refspec, rs->fetch);
+   refspec_item_init_or_die(&item, refspec, rs->fetch);
 
ALLOC_GROW(rs->items, rs->nr + 1, rs->alloc);
rs->items[rs->nr++] = item;
diff --git a/refspec.h b/refspec.h
index 3a9363887c..4caaf1f8e3 100644
--- a/refspec.h
+++ b/refspec.h
@@ -32,7 +32,8 @@ struct refspec {
int fetch;
 };
 
-void refspec_item_init(struct refspec_item *item, const char *refspec, int 
fetch);
+void refspec_item_init_or_die(struct refspec_item *item, const char *refspec,
+ int fetch);
 void refspec_item_clear(struct refspec_item *item);
 void refspec_init(struct refspec *rs, int fetch);
 void refspec_append(struct refspec *rs, const char *refspec);
-- 
2.17.0.290.gded63e768a



[PATCH 0/3] refspec: refactor & fix free() behavior

2018-06-05 Thread Ævar Arnfjörð Bjarmason
Since Martin & Brandon both liked this direction I've fixed it
up.

Martin: I didn't want to be the author of the actual fix for the bug
you found, so I rewrote your commit in 3/3. The diff is different, and
I slightly modified the 3rd paragraph of the commit message & added my
sign-off, but otherwise it's the same.

Martin Ågren (1):
  refspec: initalize `refspec_item` in `valid_fetch_refspec()`

Ævar Arnfjörð Bjarmason (2):
  refspec: s/refspec_item_init/&_or_die/g
  refspec: add back a refspec_item_init() function

 builtin/clone.c |  2 +-
 builtin/pull.c  |  2 +-
 refspec.c   | 13 +
 refspec.h   |  5 -
 4 files changed, 15 insertions(+), 7 deletions(-)

-- 
2.17.0.290.gded63e768a



[PATCH 3/3] refspec: initalize `refspec_item` in `valid_fetch_refspec()`

2018-06-05 Thread Ævar Arnfjörð Bjarmason
From: Martin Ågren 

We allocate a `struct refspec_item` on the stack without initializing
it. In particular, its `dst` and `src` members will contain some random
data from the stack. When we later call `refspec_item_clear()`, it will
call `free()` on those pointers. So if the call to `parse_refspec()` did
not assign to them, we will be freeing some random "pointers". This is
undefined behavior.

To the best of my understanding, this cannot currently be triggered by
user-provided data. And for what it's worth, the test-suite does not
trigger this with SANITIZE=address. It can be provoked by calling
`valid_fetch_refspec(":*")`.

Zero the struct, as is done in other users of `struct refspec_item` by
using the refspec_item_init() initialization function.

Signed-off-by: Martin Ågren 
Signed-off-by: Ævar Arnfjörð Bjarmason 
---
 refspec.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/refspec.c b/refspec.c
index a35493e35e..e8010dce0c 100644
--- a/refspec.c
+++ b/refspec.c
@@ -196,7 +196,7 @@ void refspec_clear(struct refspec *rs)
 int valid_fetch_refspec(const char *fetch_refspec_str)
 {
struct refspec_item refspec;
-   int ret = parse_refspec(&refspec, fetch_refspec_str, REFSPEC_FETCH);
+   int ret = refspec_item_init(&refspec, fetch_refspec_str, REFSPEC_FETCH);
refspec_item_clear(&refspec);
return ret;
 }
-- 
2.17.0.290.gded63e768a



Re: [PATCH 0/3] refspec: refactor & fix free() behavior

2018-06-05 Thread Brandon Williams
On 06/05, Ævar Arnfjörð Bjarmason wrote:
> Since Martin & Brandon both liked this direction I've fixed it
> up.
> 
> Martin: I didn't want to be the author of the actual fix for the bug
> you found, so I rewrote your commit in 3/3. The diff is different, and
> I slightly modified the 3rd paragraph of the commit message & added my
> sign-off, but otherwise it's the same.

Thanks for writing up a proper patch series for this fix.  I liked
breaking up your diff into two different patches to make it clear that
all callers of refpsec_item_init relying on dieing.

> 
> Martin Ågren (1):
>   refspec: initalize `refspec_item` in `valid_fetch_refspec()`
> 
> Ævar Arnfjörð Bjarmason (2):
>   refspec: s/refspec_item_init/&_or_die/g
>   refspec: add back a refspec_item_init() function
> 
>  builtin/clone.c |  2 +-
>  builtin/pull.c  |  2 +-
>  refspec.c   | 13 +
>  refspec.h   |  5 -
>  4 files changed, 15 insertions(+), 7 deletions(-)
> 
> -- 
> 2.17.0.290.gded63e768a
> 

-- 
Brandon Williams


Re: [PATCH 0/3] refspec: refactor & fix free() behavior

2018-06-05 Thread Martin Ågren
On 5 June 2018 at 21:58, Brandon Williams  wrote:
> On 06/05, Ævar Arnfjörð Bjarmason wrote:
>> Since Martin & Brandon both liked this direction I've fixed it
>> up.
>>
>> Martin: I didn't want to be the author of the actual fix for the bug
>> you found, so I rewrote your commit in 3/3. The diff is different, and
>> I slightly modified the 3rd paragraph of the commit message & added my
>> sign-off, but otherwise it's the same.
>
> Thanks for writing up a proper patch series for this fix.  I liked
> breaking up your diff into two different patches to make it clear that
> all callers of refpsec_item_init relying on dieing.

Me too.

>> Martin Ågren (1):
>>   refspec: initalize `refspec_item` in `valid_fetch_refspec()`

I was a bit surprised at first that this wasn't a "while at it" in the
second patch, but on second thought, it does make sense to keep this
separate. Thanks for picking this up and polishing it.

Just noticed: s/initalize/initialize/. That would be my fault...

Martin


[PATCH] docs: link to gitsubmodules

2018-06-05 Thread Brandon Williams
Add a link to gitsubmodules(7) under the `submodule.active` entry in
git-config(1).

Signed-off-by: Brandon Williams 
---
 Documentation/config.txt | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index ab641bf5a..1277731aa 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -3327,12 +3327,13 @@ submodule..ignore::
 submodule..active::
Boolean value indicating if the submodule is of interest to git
commands.  This config option takes precedence over the
-   submodule.active config option.
+   submodule.active config option. See linkgit:git-submodule[1] for
+   details.
 
 submodule.active::
A repeated field which contains a pathspec used to match against a
submodule's path to determine if the submodule is of interest to git
-   commands.
+   commands. See linkgit:git-submodule[1] for details.
 
 submodule.recurse::
Specifies if commands recurse into submodules by default. This
-- 
2.17.1.1185.g55be947832-goog



Re: [PATCH 2/8] upload-pack: implement ref-in-want

2018-06-05 Thread Ævar Arnfjörð Bjarmason


On Tue, Jun 05 2018, Brandon Williams wrote:

> +uploadpack.allowRefInWant::
> + If this option is set, `upload-pack` will support the `ref-in-want`
> + feature of the protocol version 2 `fetch` command.
> +

I think it makes sense to elaborate a bit on what this is for. Having
read this series through, and to make sure I understood this, maybe
something like this:

   This feature is intended for the benefit of load-balanced servers
   which may not have the same view of what SHA-1s their refs point to,
   but are guaranteed to never advertise a reference that another server
   serving the request doesn't know about.

I.e. from what I can tell this gives no benefits for someone using a
monolithic git server, except insofar as there would be a slight
decrease in network traffic if the average length of refs is less than
the length of a SHA-1.

That's fair enough, just something we should prominently say.

It does have the "disadvantage", if you can call it that, that it's
introducing a race condition between when we read the ref advertisement
and are promised XYZ refs, but may actually get ABC, but I can't think
of a reason anyone would care about this in practice.

The reason I'm saying "another server [...] doesn't know about" above is
that 2/8 has this:

if (read_ref(arg, &oid))
die("unknown ref %s", arg);

Doesn't that mean that if server A in your pool advertises master, next
& pu, and you then go and fetch from server B advertising master & next,
but not "pu" that the clone will die?

Presumably at Google you either have something to ensure a consistent
view, e.g. only advertise refs by name older than N seconds, or globally
update ref name but not their contents, and don't allow deleting refs
(or give them the same treatment).

But that, and again, I may have misunderstood this whole thing,
significantly reduces the utility of this feature for anyone "in the
wild" since nothing shipped with "git" gives you that feature.

The naïve way to do slave mirroring with stock git is to have a
post-receive hook that pushes to your mirrors in a for-loop, or has them
fetch from the master in a loop, and then round-robin LB those
servers. Due to the "die on nonexisting" semantics in this extension
that'll result in failed clones.

So I think we should either be really vocal about that caveat, or
perhaps think of how we could make that configurable, e.g. what happens
if the server says "sorry, don't know about that one", and carries on
with the rest it does know about?

Is there a way for client & server to gracefully recover from that?
E.g. send "master" & "next" now, and when I pull again in a few seconds
I get the new "pu"?

Also, as a digression isn't that a problem shared with protocol v2 in
general? I.e. without this extension isn't it going to make another
connection to the naïve LB'd mirroring setup described above and find
that SHA-1s as well as refs don't match?

BREAK.

Also is if this E-Mail wasn't long enough, on a completely different
topic, in an earlier discussion in
https://public-inbox.org/git/87inaje1uv@evledraar.gmail.com/ I noted
that it would be neat-o to have optional wildmatch/pcre etc. matching
for the use case you're not caring about here (and I don't expect you
to, you're solving a different problem).

But let's say I want to add that after this, and being unfamiliar with
the protocol v2 conventions. Would that be a whole new
ref-in-want-wildmatch-prefix capability with a new
want-ref-wildmatch-prefix verb, or is there some less verbose way we can
anticipate that use-case and internally version / advertise
sub-capabilities?

I don't know if that makes any sense, and would be fine with just a
ref-in-want-wildmatch-prefix if that's the way to do it. I just think
it's inevitable that we'll have such a thing eventually, so it's worth
thinking about how such a future extension fits in.


Re: [PATCH] docs: link to gitsubmodules

2018-06-05 Thread Ævar Arnfjörð Bjarmason


On Tue, Jun 05 2018, Brandon Williams wrote:

> Add a link to gitsubmodules(7) under the `submodule.active` entry in
> git-config(1).

Did you mean to change either the subject or content of this patch? Your
subject says gitsubmodules(7), but you link to git-submodule(1).


Re: [PATCH] docs: link to gitsubmodules

2018-06-05 Thread Jonathan Nieder
Hi,

Brandon Williams wrote:

> Add a link to gitsubmodules(7) under the `submodule.active` entry in
> git-config(1).
>
> Signed-off-by: Brandon Williams 
> ---
>  Documentation/config.txt | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/Documentation/config.txt b/Documentation/config.txt
> index ab641bf5a..1277731aa 100644
> --- a/Documentation/config.txt
> +++ b/Documentation/config.txt
> @@ -3327,12 +3327,13 @@ submodule..ignore::
>  submodule..active::
>   Boolean value indicating if the submodule is of interest to git
>   commands.  This config option takes precedence over the
> - submodule.active config option.
> + submodule.active config option. See linkgit:git-submodule[1] for
> + details.

This takes the user to gitsubmodules(7), but with a hop to
git-submodule(1) along the way there:

DESCRIPTION
   Inspects, updates and manages submodules.

   For more information about submodules, see gitsubmodules(7).

I suppose I'd prefer that it links directly to
linkgit:gitsubmodules[7] just because that would steer people toward
commands like "git checkout --recurse-submodules" instead of "git
submodule init".

With or without that tweak,
Reviewed-by: Jonathan Nieder 

Tested using

make -C Documentation/ git-config.1
man Documentation/git-config.1

Thanks,
Jonathan

diff --git i/Documentation/config.txt w/Documentation/config.txt
index 1277731aa4..efbd7e5652 100644
--- i/Documentation/config.txt
+++ w/Documentation/config.txt
@@ -3327,13 +3327,13 @@ submodule..ignore::
 submodule..active::
Boolean value indicating if the submodule is of interest to git
commands.  This config option takes precedence over the
-   submodule.active config option. See linkgit:git-submodule[1] for
+   submodule.active config option. See linkgit:gitsubmodules[7] for
details.
 
 submodule.active::
A repeated field which contains a pathspec used to match against a
submodule's path to determine if the submodule is of interest to git
-   commands. See linkgit:git-submodule[1] for details.
+   commands. See linkgit:gitsubmodule[7] for details.
 
 submodule.recurse::
Specifies if commands recurse into submodules by default. This


Re: [PATCH] docs: link to gitsubmodules

2018-06-05 Thread Brandon Williams
On 06/05, Ævar Arnfjörð Bjarmason wrote:
> 
> On Tue, Jun 05 2018, Brandon Williams wrote:
> 
> > Add a link to gitsubmodules(7) under the `submodule.active` entry in
> > git-config(1).
> 
> Did you mean to change either the subject or content of this patch? Your
> subject says gitsubmodules(7), but you link to git-submodule(1).

Yep I meant for it to be to gitsubmodules(7), turns out I don't know how
our documentation is built :)

-- 
Brandon Williams


Re: [PATCH] docs: link to gitsubmodules

2018-06-05 Thread Jonathan Nieder
Jonathan Nieder wrote:

> --- i/Documentation/config.txt
> +++ w/Documentation/config.txt
> @@ -3327,13 +3327,13 @@ submodule..ignore::
>  submodule..active::
>   Boolean value indicating if the submodule is of interest to git
>   commands.  This config option takes precedence over the
> - submodule.active config option. See linkgit:git-submodule[1] for
> + submodule.active config option. See linkgit:gitsubmodules[7] for
>   details.
>  
>  submodule.active::
>   A repeated field which contains a pathspec used to match against a
>   submodule's path to determine if the submodule is of interest to git
> - commands. See linkgit:git-submodule[1] for details.
> + commands. See linkgit:gitsubmodule[7] for details.

Gah, and I can't spell.  This one should have been
linkgit:gitsubmodules[7].  Updated diff below.  Tested using

make -C Documentation/ git-config.html gitsubmodules.html
w3m Documentation/git-config.html

Thanks and sorry for the noise,
Jonathan

diff --git i/Documentation/config.txt w/Documentation/config.txt
index 1277731aa4..340eb1f3c4 100644
--- i/Documentation/config.txt
+++ w/Documentation/config.txt
@@ -3327,13 +3327,13 @@ submodule..ignore::
 submodule..active::
Boolean value indicating if the submodule is of interest to git
commands.  This config option takes precedence over the
-   submodule.active config option. See linkgit:git-submodule[1] for
+   submodule.active config option. See linkgit:gitsubmodules[7] for
details.
 
 submodule.active::
A repeated field which contains a pathspec used to match against a
submodule's path to determine if the submodule is of interest to git
-   commands. See linkgit:git-submodule[1] for details.
+   commands. See linkgit:gitsubmodules[7] for details.
 
 submodule.recurse::
Specifies if commands recurse into submodules by default. This


Re: [PATCH] docs: link to gitsubmodules

2018-06-05 Thread Brandon Williams
On 06/05, Jonathan Nieder wrote:
> Jonathan Nieder wrote:
> 
> > --- i/Documentation/config.txt
> > +++ w/Documentation/config.txt
> > @@ -3327,13 +3327,13 @@ submodule..ignore::
> >  submodule..active::
> > Boolean value indicating if the submodule is of interest to git
> > commands.  This config option takes precedence over the
> > -   submodule.active config option. See linkgit:git-submodule[1] for
> > +   submodule.active config option. See linkgit:gitsubmodules[7] for
> > details.
> >  
> >  submodule.active::
> > A repeated field which contains a pathspec used to match against a
> > submodule's path to determine if the submodule is of interest to git
> > -   commands. See linkgit:git-submodule[1] for details.
> > +   commands. See linkgit:gitsubmodule[7] for details.
> 
> Gah, and I can't spell.  This one should have been
> linkgit:gitsubmodules[7].  Updated diff below.  Tested using
> 
>   make -C Documentation/ git-config.html gitsubmodules.html
>   w3m Documentation/git-config.html
> 
> Thanks and sorry for the noise,
> Jonathan
> 
> diff --git i/Documentation/config.txt w/Documentation/config.txt
> index 1277731aa4..340eb1f3c4 100644
> --- i/Documentation/config.txt
> +++ w/Documentation/config.txt
> @@ -3327,13 +3327,13 @@ submodule..ignore::
>  submodule..active::
>   Boolean value indicating if the submodule is of interest to git
>   commands.  This config option takes precedence over the
> - submodule.active config option. See linkgit:git-submodule[1] for
> + submodule.active config option. See linkgit:gitsubmodules[7] for
>   details.
>  
>  submodule.active::
>   A repeated field which contains a pathspec used to match against a
>   submodule's path to determine if the submodule is of interest to git
> - commands. See linkgit:git-submodule[1] for details.
> + commands. See linkgit:gitsubmodules[7] for details.
>  
>  submodule.recurse::
>   Specifies if commands recurse into submodules by default. This

Yep this is what I meant.

-- 
Brandon Williams


Re: [RFC PATCH 1/2] docs: reflect supported fetch options of git pull

2018-06-05 Thread Rafael Ascensão
On Tue, Jun 05, 2018 at 06:05:56PM +0200, Duy Nguyen wrote:
> A better option may be making git-pull accept those options as well. I
> see no reason git-pull should support options that git-fetch does (at
> least most of them).

I sent this as a RFC, mostly to discuss what is the correct path to
follow. Updating the documentation was trivial and would still be useful
if nothing came out from this.

> It's basically a couple of OPT_PASSTRHU though so not very hard to do.

My impression was that in the past git was very permissive on adding new
options but nowadays it tries exercise more restraint. But not sure how
relevant this is anyways, as pull already supports the majority of the
options from both `fetch` and `merge`.

> PS. Anybody up to making parse-options accept multiple struct option
> arrays? This way we can have much better option passthru without
> specifying them again and again.

If the path is adding just couple of OPT_PASSTRHU, I could do it. But
I'll wait and see if someone picks your parse-options suggestion.

Thanks for the review.

--
Rafael Ascensão


[PATCH v2 01/10] rerere: unify error messages when read_cache fails

2018-06-05 Thread Thomas Gummerer
We have multiple different variants of the error message we show to
the user if 'read_cache' fails.  The "Could not read index" variant we
are using in 'rerere.c' is currently not used anywhere in translated
form.

As a subsequent commit will mark all output that comes from 'rerere.c'
for translation, make the life of the translators a little bit easier
by using a string that is used elsewhere, and marked for translation
there, and thus most likely already translated.

"index file corrupt" seems to be the most common error message we show
when 'read_cache' fails, so use that here as well.

Signed-off-by: Thomas Gummerer 
---
 rerere.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/rerere.c b/rerere.c
index 18cae2d11c..4b4869662d 100644
--- a/rerere.c
+++ b/rerere.c
@@ -568,7 +568,7 @@ static int find_conflict(struct string_list *conflict)
 {
int i;
if (read_cache() < 0)
-   return error("Could not read index");
+   return error("index file corrupt");
 
for (i = 0; i < active_nr;) {
int conflict_type;
@@ -601,7 +601,7 @@ int rerere_remaining(struct string_list *merge_rr)
if (setup_rerere(merge_rr, RERERE_READONLY))
return 0;
if (read_cache() < 0)
-   return error("Could not read index");
+   return error("index file corrupt");
 
for (i = 0; i < active_nr;) {
int conflict_type;
@@ -1104,7 +1104,7 @@ int rerere_forget(struct pathspec *pathspec)
struct string_list merge_rr = STRING_LIST_INIT_DUP;
 
if (read_cache() < 0)
-   return error("Could not read index");
+   return error("index file corrupt");
 
fd = setup_rerere(&merge_rr, RERERE_NOAUTOUPDATE);
if (fd < 0)
-- 
2.17.0.410.g65aef3a6c4



[PATCH v2 03/10] rerere: wrap paths in output in sq

2018-06-05 Thread Thomas Gummerer
It looks like most paths in the output in the git codebase are wrapped
in single quotes.  Standardize on that in rerere as well.

Apart from being more consistent, this also makes some of the strings
match strings that are already translated in other parts of the
codebase, thus reducing the work for translators, when the strings are
marked for translation in a subsequent commit.

Signed-off-by: Thomas Gummerer 
---
 builtin/rerere.c |  2 +-
 rerere.c | 26 +-
 2 files changed, 14 insertions(+), 14 deletions(-)

diff --git a/builtin/rerere.c b/builtin/rerere.c
index 0bc40298c2..e0c67c98e9 100644
--- a/builtin/rerere.c
+++ b/builtin/rerere.c
@@ -107,7 +107,7 @@ int cmd_rerere(int argc, const char **argv, const char 
*prefix)
const char *path = merge_rr.items[i].string;
const struct rerere_id *id = merge_rr.items[i].util;
if (diff_two(rerere_path(id, "preimage"), path, path, 
path))
-   die("unable to generate diff for %s", 
rerere_path(id, NULL));
+   die("unable to generate diff for '%s'", 
rerere_path(id, NULL));
}
} else
usage_with_options(rerere_usage, options);
diff --git a/rerere.c b/rerere.c
index eca182023f..0e5956a51c 100644
--- a/rerere.c
+++ b/rerere.c
@@ -484,12 +484,12 @@ static int handle_file(const char *path, unsigned char 
*sha1, const char *output
io.input = fopen(path, "r");
io.io.wrerror = 0;
if (!io.input)
-   return error_errno("could not open %s", path);
+   return error_errno("could not open '%s'", path);
 
if (output) {
io.io.output = fopen(output, "w");
if (!io.io.output) {
-   error_errno("could not write %s", output);
+   error_errno("could not write '%s'", output);
fclose(io.input);
return -1;
}
@@ -499,15 +499,15 @@ static int handle_file(const char *path, unsigned char 
*sha1, const char *output
 
fclose(io.input);
if (io.io.wrerror)
-   error("there were errors while writing %s (%s)",
+   error("there were errors while writing '%s' (%s)",
  path, strerror(io.io.wrerror));
if (io.io.output && fclose(io.io.output))
-   io.io.wrerror = error_errno("failed to flush %s", path);
+   io.io.wrerror = error_errno("failed to flush '%s'", path);
 
if (hunk_no < 0) {
if (output)
unlink_or_warn(output);
-   return error("could not parse conflict hunks in %s", path);
+   return error("could not parse conflict hunks in '%s'", path);
}
if (io.io.wrerror)
return -1;
@@ -684,17 +684,17 @@ static int merge(const struct rerere_id *id, const char 
*path)
 * Mark that "postimage" was used to help gc.
 */
if (utime(rerere_path(id, "postimage"), NULL) < 0)
-   warning_errno("failed utime() on %s",
+   warning_errno("failed utime() on '%s'",
  rerere_path(id, "postimage"));
 
/* Update "path" with the resolution */
f = fopen(path, "w");
if (!f)
-   return error_errno("could not open %s", path);
+   return error_errno("could not open '%s'", path);
if (fwrite(result.ptr, result.size, 1, f) != 1)
-   error_errno("could not write %s", path);
+   error_errno("could not write '%s'", path);
if (fclose(f))
-   return error_errno("writing %s failed", path);
+   return error_errno("writing '%s' failed", path);
 
 out:
free(cur.ptr);
@@ -879,7 +879,7 @@ static int is_rerere_enabled(void)
return rr_cache_exists;
 
if (!rr_cache_exists && mkdir_in_gitdir(git_path_rr_cache()))
-   die("could not create directory %s", git_path_rr_cache());
+   die("could not create directory '%s'", git_path_rr_cache());
return 1;
 }
 
@@ -1068,9 +1068,9 @@ static int rerere_forget_one_path(const char *path, 
struct string_list *rr)
filename = rerere_path(id, "postimage");
if (unlink(filename)) {
if (errno == ENOENT)
-   error("no remembered resolution for %s", path);
+   error("no remembered resolution for '%s'", path);
else
-   error_errno("cannot unlink %s", filename);
+   error_errno("cannot unlink '%s'", filename);
goto fail_exit;
}
 
@@ -1089,7 +1089,7 @@ static int rerere_forget_one_path(const char *path, 
struct string_list *rr)
item = string_list_insert(rr, path);
free_rerere_id(item);
item->util = id;
-   fprintf(stderr, "Fo

[PATCH v2 05/10] rerere: add some documentation

2018-06-05 Thread Thomas Gummerer
Add some documentation for the logic behind the conflict normalization
in rerere.

Helped-by: Junio C Hamano 
Signed-off-by: Thomas Gummerer 
---
 Documentation/technical/rerere.txt | 142 +
 rerere.c   |   4 -
 2 files changed, 142 insertions(+), 4 deletions(-)
 create mode 100644 Documentation/technical/rerere.txt

diff --git a/Documentation/technical/rerere.txt 
b/Documentation/technical/rerere.txt
new file mode 100644
index 00..2c517fe0fc
--- /dev/null
+++ b/Documentation/technical/rerere.txt
@@ -0,0 +1,142 @@
+Rerere
+==
+
+This document describes the rerere logic.
+
+Conflict normalization
+--
+
+To ensure recorded conflict resolutions can be looked up in the rerere
+database, even when branches are merged in a different order,
+different branches are merged that result in the same conflict, or
+when different conflict style settings are used, rerere normalizes the
+conflicts before writing them to the rerere database.
+
+Differnt conflict styles and branch names are dealt with by stripping
+that information from the conflict markers, and removing extraneous
+information from the `diff3` conflict style.
+
+Branches being merged in different order are dealt with by sorting the
+conflict hunks.  More on each of those parts in the following
+sections.
+
+Once these two normalization operations are applied, a conflict ID is
+created based on the normalized conflict, which is later used by
+rerere to look up the conflict in the rerere database.
+
+Stripping extraneous information
+
+
+Say we have three branches AB, AC and AC2.  The common ancestor of
+these branches has a file with with a line with the string "A" (for
+brevity this line is called "line A" for brevity in the following) in
+it.  In branch AB this line is changed to "B", in AC, this line is
+changed to C, and branch AC2 is forked off of AC, after the line was
+changed to C.
+
+Now forking a branch ABAC off of branch AB and then merging AC into it,
+we'd get a conflict like the following:
+
+<<< HEAD
+B
+===
+C
+>>> AC
+
+Now doing the analogous with AC2 (forking a branch ABAC2 off of branch
+AB and then merging branch AC2 into it), maybe using the diff3
+conflict style, we'd get a conflict like the following:
+
+<<< HEAD
+B
+||| merged common ancestors
+A
+===
+C
+>>> AC2
+
+By resolving this conflict, to leave line D, the user declares:
+
+After examining what branches AB and AC did, I believe that making
+line A into line D is the best thing to do that is compatible with
+what AB and AC wanted to do.
+
+As branch AC2 refers to the same commit as AC, the above implies that
+this is also compatible what AB and AC2 wanted to do.
+
+By extension, this means that rerere should recognize that the above
+conflicts are the same.  To do this, the labels on the conflict
+markers are stripped, and the diff3 output is removed.  The above
+examples would both result in the following normalized conflict:
+
+<<<
+B
+===
+C
+>>>
+
+Sorting hunks
+~
+
+As before, lets imagine that a common ancestor had a file with line A
+its early part, and line X in its late part.  And then four branches
+are forked that do these things:
+
+- AB: changes A to B
+- AC: changes A to C
+- XY: changes X to Y
+- XZ: changes X to Z
+
+Now, forking a branch ABAC off of branch AB and then merging AC into
+it, and forking a branch ACAB off of branch AC and then merging AB
+into it, would yield the conflict in a different order.  The former
+would say "A became B or C, what now?" while the latter would say "A
+became C or B, what now?"
+
+As a reminder, the act of merging AC into ABAC and resolving the
+conflict to leave line D means that the user declares:
+
+After examining what branches AB and AC did, I believe that
+making line A into line D is the best thing to do that is
+compatible with what AB and AC wanted to do.
+
+So the conflict we would see when merging AB into ACAB should be
+resolved the same way---it is the resolution that is in line with that
+declaration.
+
+Imagine that similarly previously a branch XYXZ was forked from XY,
+and XZ was merged into it, and resolved "X became Y or Z" into "X
+became W".
+
+Now, if a branch ABXY was forked from AB and then merged XY, then ABXY
+would have line B in its early part and line Y in its later part.
+Such a merge would be quite clean.  We can construct 4 combinations
+using these four branches ((AB, AC) x (XY, XZ)).
+
+Merging ABXY and ACXZ would make "an early A became B or C, a late X
+became Y or Z" conflict, while merging ACXY and ABXZ would make "an
+early A became C or B, a late X became Y or Z".  We can see there are
+4 combinations of ("B or C", "C or B") x ("X or Y", "Y or X").
+
+By sorting, the conflict is given its canonical name, namely, "

[PATCH v2 00/10] rerere: handle nested conflicts

2018-06-05 Thread Thomas Gummerer
The previous round was at
<20180520211210.1248-1-t.gumme...@gmail.com>.

Thanks Junio for the comments on the previous round.

Changes since v2:
 - lowercase the first letter in some error/warning messages before
   marking them for translation
 - wrap paths in output in single quotes, for consistency, and to make
   some of the messages the same as ones that are already translated
 - mark messages in builtin/rerere.c for translation as well, which I
   had previously forgotten.
 - expanded the technical documentation on rerere.  The entire
   document is basically rewritten.
 - changed the test in 6/10 to just fake a conflict marker inside of
   one of the hunks instead of using an inner conflict created by a
   merge.  This is to make sure the codepath is still hit after we
   handle inner conflicts properly.
 - added tests for handling inner conflict markers
 - added one commit to recalculate the conflict ID when an unresolved
   conflict is committed, and the subsequent operation conflicts again
   in the same file.  More explanation in the commit message of that
   commit.

range-diff below.  A few commits changed enough for range-diff
to give up showing the differences in those, they are probably best
reviewed as the whole patch anyway:

1:  901b638400 ! 1:  2825342cc2 rerere: unify error message when read_cache 
fails
@@ -1,6 +1,6 @@
 Author: Thomas Gummerer 
 
-rerere: unify error message when read_cache fails
+rerere: unify error messages when read_cache fails
 
 We have multiple different variants of the error message we show to
 the user if 'read_cache' fails.  The "Could not read index" variant we
-:  -- > 2:  d1500028aa rerere: lowercase error messages
-:  -- > 3:  ed3601ee71 rerere: wrap paths in output in sq
2:  c48ffededd ! 4:  6ead84a199 rerere: mark strings for translation
@@ -9,6 +9,28 @@
 
 Signed-off-by: Thomas Gummerer 
 
+diff --git a/builtin/rerere.c b/builtin/rerere.c
+--- a/builtin/rerere.c
 b/builtin/rerere.c
+@@
+   if (!strcmp(argv[0], "forget")) {
+   struct pathspec pathspec;
+   if (argc < 2)
+-  warning("'git rerere forget' without paths is 
deprecated");
++  warning(_("'git rerere forget' without paths is 
deprecated"));
+   parse_pathspec(&pathspec, 0, PATHSPEC_PREFER_CWD,
+  prefix, argv + 1);
+   return rerere_forget(&pathspec);
+@@
+   const char *path = merge_rr.items[i].string;
+   const struct rerere_id *id = merge_rr.items[i].util;
+   if (diff_two(rerere_path(id, "preimage"), path, path, 
path))
+-  die("unable to generate diff for '%s'", 
rerere_path(id, NULL));
++  die(_("unable to generate diff for '%s'"), 
rerere_path(id, NULL));
+   }
+   } else
+   usage_with_options(rerere_usage, options);
+
 diff --git a/rerere.c b/rerere.c
 --- a/rerere.c
 +++ b/rerere.c
@@ -53,14 +75,14 @@
io.input = fopen(path, "r");
io.io.wrerror = 0;
if (!io.input)
--  return error_errno("Could not open %s", path);
-+  return error_errno(_("Could not open %s"), path);
+-  return error_errno("could not open '%s'", path);
++  return error_errno(_("could not open '%s'"), path);
  
if (output) {
io.io.output = fopen(output, "w");
if (!io.io.output) {
--  error_errno("Could not write %s", output);
-+  error_errno(_("Could not write %s"), output);
+-  error_errno("could not write '%s'", output);
++  error_errno(_("could not write '%s'"), output);
fclose(io.input);
return -1;
}
@@ -68,18 +90,18 @@
  
fclose(io.input);
if (io.io.wrerror)
--  error("There were errors while writing %s (%s)",
-+  error(_("There were errors while writing %s (%s)"),
+-  error("there were errors while writing '%s' (%s)",
++  error(_("there were errors while writing '%s' (%s)"),
  path, strerror(io.io.wrerror));
if (io.io.output && fclose(io.io.output))
--  io.io.wrerror = error_errno("Failed to flush %s", path);
-+  io.io.wrerror = error_errno(_("Failed to flush %s"), path);
+-  io.io.wrerror = error_errno("failed to flush '%s'", path);
++  io.io.wrerror = error_errno(_("failed to flush '%s'"), path);
  
if (hunk_no < 0) {
if (output)
unlink_or_warn(output);
--  return error("Could not parse conflict hunks in %s", path);
-+  return error(_("Could not parse

[PATCH v2 10/10] rerere: recalculate conflict ID when unresolved conflict is committed

2018-06-05 Thread Thomas Gummerer
Currently when a user doesn't resolve a conflict, commits the results,
and does an operation which creates another conflict, rerere will use
the ID of the previously unresolved conflict for the new conflict.
This is because the conflict is kept in the MERGE_RR file, which
'rerere' reads every time it is invoked.

After the new conflict is solved, rerere will record the resolution
with the ID of the old conflict.  So in order to replay the conflict,
both merges would have to be re-done, instead of just the last one, in
order for rerere to be able to automatically resolve the conflict.

Instead of that, assign a new conflict ID if there are still conflicts
in a file and the file had conflicts at a previous step.  This ID
matches the conflict we actually resolved at the corresponding step.

Note that there are no backwards compatibility worries here, as rerere
would have failed to even normalize the conflict before this patch
series.

Signed-off-by: Thomas Gummerer 
---
 rerere.c  | 7 +++
 t/t4200-rerere.sh | 7 +++
 2 files changed, 10 insertions(+), 4 deletions(-)

diff --git a/rerere.c b/rerere.c
index f611db7873..644f185180 100644
--- a/rerere.c
+++ b/rerere.c
@@ -818,7 +818,7 @@ static int do_plain_rerere(struct string_list *rr, int fd)
struct rerere_id *id;
unsigned char sha1[20];
const char *path = conflict.items[i].string;
-   int ret, has_string;
+   int ret;
 
/*
 * Ask handle_file() to scan and assign a
@@ -826,12 +826,11 @@ static int do_plain_rerere(struct string_list *rr, int fd)
 * yet.
 */
ret = handle_file(path, sha1, NULL);
-   has_string = string_list_has_string(rr, path);
-   if (ret < 0 && has_string) {
+   if (ret != 0 && string_list_has_string(rr, path)) {
remove_variant(string_list_lookup(rr, path)->util);
string_list_remove(rr, path, 1);
}
-   if (ret < 1 || has_string)
+   if (ret < 1)
continue;
 
id = new_rerere_id(sha1);
diff --git a/t/t4200-rerere.sh b/t/t4200-rerere.sh
index f433848ccb..9578215ff2 100755
--- a/t/t4200-rerere.sh
+++ b/t/t4200-rerere.sh
@@ -636,6 +636,13 @@ test_expect_success 'rerere with inner conflict markers' '
git commit -q -m "will solve conflicts later" &&
test_must_fail git merge A &&
cat test >actual &&
+   test_cmp expect actual &&
+
+   git add test &&
+   git commit -m "rerere solved conflict" &&
+   git reset --hard HEAD~ &&
+   test_must_fail git merge A &&
+   cat test >actual &&
test_cmp expect actual
 '
 
-- 
2.17.0.410.g65aef3a6c4



[PATCH v2 02/10] rerere: lowercase error messages

2018-06-05 Thread Thomas Gummerer
Documentation/CodingGuidelines mentions that error messages should be
lowercase.  Prior to marking them for translation follow that pattern
in rerere as well.

Signed-off-by: Thomas Gummerer 
---
 rerere.c | 22 +++---
 1 file changed, 11 insertions(+), 11 deletions(-)

diff --git a/rerere.c b/rerere.c
index 4b4869662d..eca182023f 100644
--- a/rerere.c
+++ b/rerere.c
@@ -484,12 +484,12 @@ static int handle_file(const char *path, unsigned char 
*sha1, const char *output
io.input = fopen(path, "r");
io.io.wrerror = 0;
if (!io.input)
-   return error_errno("Could not open %s", path);
+   return error_errno("could not open %s", path);
 
if (output) {
io.io.output = fopen(output, "w");
if (!io.io.output) {
-   error_errno("Could not write %s", output);
+   error_errno("could not write %s", output);
fclose(io.input);
return -1;
}
@@ -499,15 +499,15 @@ static int handle_file(const char *path, unsigned char 
*sha1, const char *output
 
fclose(io.input);
if (io.io.wrerror)
-   error("There were errors while writing %s (%s)",
+   error("there were errors while writing %s (%s)",
  path, strerror(io.io.wrerror));
if (io.io.output && fclose(io.io.output))
-   io.io.wrerror = error_errno("Failed to flush %s", path);
+   io.io.wrerror = error_errno("failed to flush %s", path);
 
if (hunk_no < 0) {
if (output)
unlink_or_warn(output);
-   return error("Could not parse conflict hunks in %s", path);
+   return error("could not parse conflict hunks in %s", path);
}
if (io.io.wrerror)
return -1;
@@ -690,11 +690,11 @@ static int merge(const struct rerere_id *id, const char 
*path)
/* Update "path" with the resolution */
f = fopen(path, "w");
if (!f)
-   return error_errno("Could not open %s", path);
+   return error_errno("could not open %s", path);
if (fwrite(result.ptr, result.size, 1, f) != 1)
-   error_errno("Could not write %s", path);
+   error_errno("could not write %s", path);
if (fclose(f))
-   return error_errno("Writing %s failed", path);
+   return error_errno("writing %s failed", path);
 
 out:
free(cur.ptr);
@@ -721,7 +721,7 @@ static void update_paths(struct string_list *update)
 
if (write_locked_index(&the_index, &index_lock,
   COMMIT_LOCK | SKIP_IF_UNCHANGED))
-   die("Unable to write new index file");
+   die("unable to write new index file");
 }
 
 static void remove_variant(struct rerere_id *id)
@@ -879,7 +879,7 @@ static int is_rerere_enabled(void)
return rr_cache_exists;
 
if (!rr_cache_exists && mkdir_in_gitdir(git_path_rr_cache()))
-   die("Could not create directory %s", git_path_rr_cache());
+   die("could not create directory %s", git_path_rr_cache());
return 1;
 }
 
@@ -1032,7 +1032,7 @@ static int rerere_forget_one_path(const char *path, 
struct string_list *rr)
 */
ret = handle_cache(path, sha1, NULL);
if (ret < 1)
-   return error("Could not parse conflict hunks in '%s'", path);
+   return error("could not parse conflict hunks in '%s'", path);
 
/* Nuke the recorded resolution for the conflict */
id = new_rerere_id(sha1);
-- 
2.17.0.410.g65aef3a6c4



[PATCH v2 06/10] rerere: fix crash when conflict goes unresolved

2018-06-05 Thread Thomas Gummerer
Currently when a user doesn't resolve a conflict in a file, but
commits the file with the conflict markers, and later the file ends up
in a state in which rerere can't handle it, subsequent rerere
operations that are interested in that path, such as 'rerere clear' or
'rerere forget ' will fail, or even worse in the case of 'rerere
clear' segfault.

Such states include nested conflicts, or an extra conflict marker that
doesn't have any match.

This is because the first 'git rerere' when there was only one
conflict in the file leaves an entry in the MERGE_RR file behind.  The
next 'git rerere' will then pick the rerere ID for that file up, and
not assign a new ID as it can't successfully calculate one.  It will
however still try to do the rerere operation, because of the existing
ID.  As the handle_file function fails, it will remove the 'preimage'
for the ID in the process, while leaving the ID in the MERGE_RR file.

Now when 'rerere clear' for example is run, it will segfault in
'has_rerere_resolution', because status is NULL.

To fix this, remove the rerere ID from the MERGE_RR file in the case
when we can't handle it, and remove the corresponding variant from
.git/rr-cache/.  Removing it unconditionally is fine here, because if
the user would have resolved the conflict and ran rerere, the entry
would no longer be in the MERGE_RR file, so we wouldn't have this
problem in the first place, while if the conflict was not resolved,
the only thing that's left in the folder is the 'preimage', which by
itself will be regenerated by git if necessary, so the user won't
loose any work.

Note that other variants that have the same conflict ID will not be
touched.

Signed-off-by: Thomas Gummerer 
---
 rerere.c  | 12 +++-
 t/t4200-rerere.sh | 22 ++
 2 files changed, 29 insertions(+), 5 deletions(-)

diff --git a/rerere.c b/rerere.c
index ef23abe4dd..220020187b 100644
--- a/rerere.c
+++ b/rerere.c
@@ -824,10 +824,7 @@ static int do_plain_rerere(struct string_list *rr, int fd)
struct rerere_id *id;
unsigned char sha1[20];
const char *path = conflict.items[i].string;
-   int ret;
-
-   if (string_list_has_string(rr, path))
-   continue;
+   int ret, has_string;
 
/*
 * Ask handle_file() to scan and assign a
@@ -835,7 +832,12 @@ static int do_plain_rerere(struct string_list *rr, int fd)
 * yet.
 */
ret = handle_file(path, sha1, NULL);
-   if (ret < 1)
+   has_string = string_list_has_string(rr, path);
+   if (ret < 0 && has_string) {
+   remove_variant(string_list_lookup(rr, path)->util);
+   string_list_remove(rr, path, 1);
+   }
+   if (ret < 1 || has_string)
continue;
 
id = new_rerere_id(sha1);
diff --git a/t/t4200-rerere.sh b/t/t4200-rerere.sh
index eaf18c81cb..5ce411b70d 100755
--- a/t/t4200-rerere.sh
+++ b/t/t4200-rerere.sh
@@ -580,4 +580,26 @@ test_expect_success 'multiple identical conflicts' '
count_pre_post 0 0
 '
 
+test_expect_success 'rerere with unexpected conflict markers does not crash' '
+   git reset --hard &&
+
+   git checkout -b branch-1 master &&
+   echo "bar" >test &&
+   git add test &&
+   git commit -q -m two &&
+
+   git reset --hard &&
+   git checkout -b branch-2 master &&
+   echo "foo" >test &&
+   git add test &&
+   git commit -q -a -m one &&
+
+   test_must_fail git merge branch-1 &&
+   sed "s/bar/>>> a/" >test.tmp 

[PATCH v2 09/10] rerere: teach rerere to handle nested conflicts

2018-06-05 Thread Thomas Gummerer
Currently rerere can't handle nested conflicts and will error out when
it encounters such conflicts.  Do that by recursively calling the
'handle_conflict' function to normalize the conflict.

The conflict ID calculation here deserves some explanation:

As we are using the same handle_conflict function, the nested conflict
is normalized the same way as for non-nested conflicts, which means
the ancestor in the diff3 case is stripped out, and the parts of the
conflict are ordered alphabetically.

The conflict ID is however is only calculated in the top level
handle_conflict call, so it will include the markers that 'rerere'
adds to the output.  e.g. say there's the following conflict:

<<< HEAD
1
===
<<< HEAD
3
===
2
>>> branch-2
>>> branch-3~

it would be reordered as follows in the preimage:

<<<
1
===
<<<
2
===
3
>>>
>>>

and the conflict ID would be calculated as
sha1(1<<<
2
===
3
>>>)

Stripping out vs. leaving the conflict markers in place should have no
practical impact, but it simplifies the implementation.

Signed-off-by: Thomas Gummerer 
---

I couldn't actually get the conflict markers the right way just using
merge-recursive.  But I think that would be fixed either way by
d694a17986 ("ll-merge: use a longer conflict marker for internal
merge", 2016-04-14), if I read that correctly.

Either way I still think this can be an improvement for when the user
commits merge conflicts (even though they shouldn't do that in the
first place), and for possible other edge cases that I'm not able to
produce right now, but I may just not be creative enough for those.

 Documentation/technical/rerere.txt | 40 ++
 rerere.c   | 14 ---
 t/t4200-rerere.sh  | 38 
 3 files changed, 88 insertions(+), 4 deletions(-)

diff --git a/Documentation/technical/rerere.txt 
b/Documentation/technical/rerere.txt
index 2c517fe0fc..7077ab4a08 100644
--- a/Documentation/technical/rerere.txt
+++ b/Documentation/technical/rerere.txt
@@ -140,3 +140,43 @@ SHA1('BC').
 If there are multiple conflicts in one file, the sha1 is calculated
 the same way with all hunks appended to each other, in the order in
 which they appear in the file, separated by a  character.
+
+Nested conflicts
+
+
+Nested conflicts are handled very similarly to "simple" conflicts.
+Same as before, labels on conflict markers and diff3 output is
+stripped, and the conflict hunks are sorted, for both the outer and
+the inner conflict.
+
+The only difference is in how the conflict ID is calculated.  For the
+inner conflict, the conflict markers themselves are not stripped out
+before calculating the sha1.
+
+Say we have the following conflict for example:
+
+<<< HEAD
+1
+===
+<<< HEAD
+3
+===
+2
+>>> branch-2
+>>> branch-3~
+
+After stripping out the labels of the conflict markers, the conflict
+would look as follows:
+
+<<<
+1
+===
+<<<
+3
+===
+2
+>>>
+>>>
+
+and finally the conflict ID would be calculated as:
+`sha1('1<<<\n3\n===\n2\n>>>')`
diff --git a/rerere.c b/rerere.c
index fac90663b0..f611db7873 100644
--- a/rerere.c
+++ b/rerere.c
@@ -365,12 +365,18 @@ static int handle_conflict(struct strbuf *out, struct 
rerere_io *io,
RR_SIDE_1 = 0, RR_SIDE_2, RR_ORIGINAL
} hunk = RR_SIDE_1;
struct strbuf one = STRBUF_INIT, two = STRBUF_INIT;
-   struct strbuf buf = STRBUF_INIT;
+   struct strbuf buf = STRBUF_INIT, conflict = STRBUF_INIT;
int has_conflicts = 1;
while (!io->getline(&buf, io)) {
-   if (is_cmarker(buf.buf, '<', marker_size))
-   goto bad;
-   else if (is_cmarker(buf.buf, '|', marker_size)) {
+   if (is_cmarker(buf.buf, '<', marker_size)) {
+   if (handle_conflict(&conflict, io, marker_size, NULL) < 
0)
+   goto bad;
+   if (hunk == RR_SIDE_1)
+   strbuf_addbuf(&one, &conflict);
+   else
+   strbuf_addbuf(&two, &conflict);
+   strbuf_release(&conflict);
+   } else if (is_cmarker(buf.buf, '|', marker_size)) {
if (hunk != RR_SIDE_1)
goto bad;
hunk = RR_ORIGINAL;
diff --git a/t/t4200-rerere.sh b/t/t4200-rerere.sh
index 5ce411b70d..f433848ccb 100755
--- a/t/t4200-rerere.sh
+++ b/t/t4200-rerere.sh
@@ -602,4 +602,42 @@ test_expect_success 'rerere with unexpected conflict 
markers does not crash' '
git rerere clear
 '
 
+test_expect_success 'rerere with inner conflict markers' '
+   git reset --hard &&

[PATCH v2 04/10] rerere: mark strings for translation

2018-06-05 Thread Thomas Gummerer
'git rerere' is considered a plumbing command and as such its output
should be translated.  Its functionality is also only enabled through
a config setting, so scripts really shouldn't rely on its output
either way.

Signed-off-by: Thomas Gummerer 
---
 builtin/rerere.c |  4 +--
 rerere.c | 68 
 2 files changed, 36 insertions(+), 36 deletions(-)

diff --git a/builtin/rerere.c b/builtin/rerere.c
index e0c67c98e9..5ed941b91f 100644
--- a/builtin/rerere.c
+++ b/builtin/rerere.c
@@ -75,7 +75,7 @@ int cmd_rerere(int argc, const char **argv, const char 
*prefix)
if (!strcmp(argv[0], "forget")) {
struct pathspec pathspec;
if (argc < 2)
-   warning("'git rerere forget' without paths is 
deprecated");
+   warning(_("'git rerere forget' without paths is 
deprecated"));
parse_pathspec(&pathspec, 0, PATHSPEC_PREFER_CWD,
   prefix, argv + 1);
return rerere_forget(&pathspec);
@@ -107,7 +107,7 @@ int cmd_rerere(int argc, const char **argv, const char 
*prefix)
const char *path = merge_rr.items[i].string;
const struct rerere_id *id = merge_rr.items[i].util;
if (diff_two(rerere_path(id, "preimage"), path, path, 
path))
-   die("unable to generate diff for '%s'", 
rerere_path(id, NULL));
+   die(_("unable to generate diff for '%s'"), 
rerere_path(id, NULL));
}
} else
usage_with_options(rerere_usage, options);
diff --git a/rerere.c b/rerere.c
index 0e5956a51c..74ce422634 100644
--- a/rerere.c
+++ b/rerere.c
@@ -212,7 +212,7 @@ static void read_rr(struct string_list *rr)
 
/* There has to be the hash, tab, path and then NUL */
if (buf.len < 42 || get_sha1_hex(buf.buf, sha1))
-   die("corrupt MERGE_RR");
+   die(_("corrupt MERGE_RR"));
 
if (buf.buf[40] != '.') {
variant = 0;
@@ -221,10 +221,10 @@ static void read_rr(struct string_list *rr)
errno = 0;
variant = strtol(buf.buf + 41, &path, 10);
if (errno)
-   die("corrupt MERGE_RR");
+   die(_("corrupt MERGE_RR"));
}
if (*(path++) != '\t')
-   die("corrupt MERGE_RR");
+   die(_("corrupt MERGE_RR"));
buf.buf[40] = '\0';
id = new_rerere_id_hex(buf.buf);
id->variant = variant;
@@ -259,12 +259,12 @@ static int write_rr(struct string_list *rr, int out_fd)
rr->items[i].string, 0);
 
if (write_in_full(out_fd, buf.buf, buf.len) < 0)
-   die("unable to write rerere record");
+   die(_("unable to write rerere record"));
 
strbuf_release(&buf);
}
if (commit_lock_file(&write_lock) != 0)
-   die("unable to write rerere record");
+   die(_("unable to write rerere record"));
return 0;
 }
 
@@ -484,12 +484,12 @@ static int handle_file(const char *path, unsigned char 
*sha1, const char *output
io.input = fopen(path, "r");
io.io.wrerror = 0;
if (!io.input)
-   return error_errno("could not open '%s'", path);
+   return error_errno(_("could not open '%s'"), path);
 
if (output) {
io.io.output = fopen(output, "w");
if (!io.io.output) {
-   error_errno("could not write '%s'", output);
+   error_errno(_("could not write '%s'"), output);
fclose(io.input);
return -1;
}
@@ -499,15 +499,15 @@ static int handle_file(const char *path, unsigned char 
*sha1, const char *output
 
fclose(io.input);
if (io.io.wrerror)
-   error("there were errors while writing '%s' (%s)",
+   error(_("there were errors while writing '%s' (%s)"),
  path, strerror(io.io.wrerror));
if (io.io.output && fclose(io.io.output))
-   io.io.wrerror = error_errno("failed to flush '%s'", path);
+   io.io.wrerror = error_errno(_("failed to flush '%s'"), path);
 
if (hunk_no < 0) {
if (output)
unlink_or_warn(output);
-   return error("could not parse conflict hunks in '%s'", path);
+   return error(_("could not parse conflict hunks in '%s'"), path);
}
if (io.io.wrerror)
return -1;
@@ -568,7 +568,7 @@ static int find_conflict(struct string_list *conflict)
 {
int i;
if (read_cache() <

[PATCH v2 07/10] rerere: only return whether a path has conflicts or not

2018-06-05 Thread Thomas Gummerer
We currently return the exact number of conflict hunks a certain path
has from the 'handle_paths' function.  However all of its callers only
care whether there are conflicts or not or if there is an error.
Return only that information, and document that only that information
is returned.  This will simplify the code in the subsequent steps.

Signed-off-by: Thomas Gummerer 
---
 rerere.c | 23 ---
 1 file changed, 12 insertions(+), 11 deletions(-)

diff --git a/rerere.c b/rerere.c
index 220020187b..da3744b86b 100644
--- a/rerere.c
+++ b/rerere.c
@@ -393,12 +393,13 @@ static int is_cmarker(char *buf, int marker_char, int 
marker_size)
  * one side of the conflict, NUL, the other side of the conflict,
  * and NUL concatenated together.
  *
- * Return the number of conflict hunks found.
+ * Return 1 if conflict hunks are found, 0 if there are no conflict
+ * hunks and -1 if an error occured.
  */
 static int handle_path(unsigned char *sha1, struct rerere_io *io, int 
marker_size)
 {
git_SHA_CTX ctx;
-   int hunk_no = 0;
+   int has_conflicts = 0;
enum {
RR_CONTEXT = 0, RR_SIDE_1, RR_SIDE_2, RR_ORIGINAL
} hunk = RR_CONTEXT;
@@ -426,7 +427,7 @@ static int handle_path(unsigned char *sha1, struct 
rerere_io *io, int marker_siz
goto bad;
if (strbuf_cmp(&one, &two) > 0)
strbuf_swap(&one, &two);
-   hunk_no++;
+   has_conflicts = 1;
hunk = RR_CONTEXT;
rerere_io_putconflict('<', marker_size, io);
rerere_io_putmem(one.buf, one.len, io);
@@ -462,7 +463,7 @@ static int handle_path(unsigned char *sha1, struct 
rerere_io *io, int marker_siz
git_SHA1_Final(sha1, &ctx);
if (hunk != RR_CONTEXT)
return -1;
-   return hunk_no;
+   return has_conflicts;
 }
 
 /*
@@ -471,7 +472,7 @@ static int handle_path(unsigned char *sha1, struct 
rerere_io *io, int marker_siz
  */
 static int handle_file(const char *path, unsigned char *sha1, const char 
*output)
 {
-   int hunk_no = 0;
+   int has_conflicts = 0;
struct rerere_io_file io;
int marker_size = ll_merge_marker_size(path);
 
@@ -491,7 +492,7 @@ static int handle_file(const char *path, unsigned char 
*sha1, const char *output
}
}
 
-   hunk_no = handle_path(sha1, (struct rerere_io *)&io, marker_size);
+   has_conflicts = handle_path(sha1, (struct rerere_io *)&io, marker_size);
 
fclose(io.input);
if (io.io.wrerror)
@@ -500,14 +501,14 @@ static int handle_file(const char *path, unsigned char 
*sha1, const char *output
if (io.io.output && fclose(io.io.output))
io.io.wrerror = error_errno(_("failed to flush '%s'"), path);
 
-   if (hunk_no < 0) {
+   if (has_conflicts < 0) {
if (output)
unlink_or_warn(output);
return error(_("could not parse conflict hunks in '%s'"), path);
}
if (io.io.wrerror)
return -1;
-   return hunk_no;
+   return has_conflicts;
 }
 
 /*
@@ -955,7 +956,7 @@ static int handle_cache(const char *path, unsigned char 
*sha1, const char *outpu
mmfile_t mmfile[3] = {{NULL}};
mmbuffer_t result = {NULL, 0};
const struct cache_entry *ce;
-   int pos, len, i, hunk_no;
+   int pos, len, i, has_conflicts;
struct rerere_io_mem io;
int marker_size = ll_merge_marker_size(path);
 
@@ -1009,11 +1010,11 @@ static int handle_cache(const char *path, unsigned char 
*sha1, const char *outpu
 * Grab the conflict ID and optionally write the original
 * contents with conflict markers out.
 */
-   hunk_no = handle_path(sha1, (struct rerere_io *)&io, marker_size);
+   has_conflicts = handle_path(sha1, (struct rerere_io *)&io, marker_size);
strbuf_release(&io.input);
if (io.io.output)
fclose(io.io.output);
-   return hunk_no;
+   return has_conflicts;
 }
 
 static int rerere_forget_one_path(const char *path, struct string_list *rr)
-- 
2.17.0.410.g65aef3a6c4



[PATCH v2 08/10] rerere: factor out handle_conflict function

2018-06-05 Thread Thomas Gummerer
Factor out the handle_conflict function, which handles a single
conflict in a path.  This is a preparation for the next step, where
this function will be re-used.  No functional changes intended.

Signed-off-by: Thomas Gummerer 
---
 rerere.c | 143 +--
 1 file changed, 65 insertions(+), 78 deletions(-)

diff --git a/rerere.c b/rerere.c
index da3744b86b..fac90663b0 100644
--- a/rerere.c
+++ b/rerere.c
@@ -302,38 +302,6 @@ static void rerere_io_putstr(const char *str, struct 
rerere_io *io)
ferr_puts(str, io->output, &io->wrerror);
 }
 
-/*
- * Write a conflict marker to io->output (if defined).
- */
-static void rerere_io_putconflict(int ch, int size, struct rerere_io *io)
-{
-   char buf[64];
-
-   while (size) {
-   if (size <= sizeof(buf) - 2) {
-   memset(buf, ch, size);
-   buf[size] = '\n';
-   buf[size + 1] = '\0';
-   size = 0;
-   } else {
-   int sz = sizeof(buf) - 1;
-
-   /*
-* Make sure we will not write everything out
-* in this round by leaving at least 1 byte
-* for the next round, giving the next round
-* a chance to add the terminating LF.  Yuck.
-*/
-   if (size <= sz)
-   sz -= (sz - size) + 1;
-   memset(buf, ch, sz);
-   buf[sz] = '\0';
-   size -= sz;
-   }
-   rerere_io_putstr(buf, io);
-   }
-}
-
 static void rerere_io_putmem(const char *mem, size_t sz, struct rerere_io *io)
 {
if (io->output)
@@ -384,37 +352,25 @@ static int is_cmarker(char *buf, int marker_char, int 
marker_size)
return isspace(*buf);
 }
 
-/*
- * Read contents a file with conflicts, normalize the conflicts
- * by (1) discarding the common ancestor version in diff3-style,
- * (2) reordering our side and their side so that whichever sorts
- * alphabetically earlier comes before the other one, while
- * computing the "conflict ID", which is just an SHA-1 hash of
- * one side of the conflict, NUL, the other side of the conflict,
- * and NUL concatenated together.
- *
- * Return 1 if conflict hunks are found, 0 if there are no conflict
- * hunks and -1 if an error occured.
- */
-static int handle_path(unsigned char *sha1, struct rerere_io *io, int 
marker_size)
+static void rerere_strbuf_putconflict(struct strbuf *buf, int ch, size_t size)
+{
+   strbuf_addchars(buf, ch, size);
+   strbuf_addch(buf, '\n');
+}
+
+static int handle_conflict(struct strbuf *out, struct rerere_io *io,
+  int marker_size, git_SHA_CTX *ctx)
 {
-   git_SHA_CTX ctx;
-   int has_conflicts = 0;
enum {
-   RR_CONTEXT = 0, RR_SIDE_1, RR_SIDE_2, RR_ORIGINAL
-   } hunk = RR_CONTEXT;
+   RR_SIDE_1 = 0, RR_SIDE_2, RR_ORIGINAL
+   } hunk = RR_SIDE_1;
struct strbuf one = STRBUF_INIT, two = STRBUF_INIT;
struct strbuf buf = STRBUF_INIT;
-
-   if (sha1)
-   git_SHA1_Init(&ctx);
-
+   int has_conflicts = 1;
while (!io->getline(&buf, io)) {
-   if (is_cmarker(buf.buf, '<', marker_size)) {
-   if (hunk != RR_CONTEXT)
-   goto bad;
-   hunk = RR_SIDE_1;
-   } else if (is_cmarker(buf.buf, '|', marker_size)) {
+   if (is_cmarker(buf.buf, '<', marker_size))
+   goto bad;
+   else if (is_cmarker(buf.buf, '|', marker_size)) {
if (hunk != RR_SIDE_1)
goto bad;
hunk = RR_ORIGINAL;
@@ -427,42 +383,73 @@ static int handle_path(unsigned char *sha1, struct 
rerere_io *io, int marker_siz
goto bad;
if (strbuf_cmp(&one, &two) > 0)
strbuf_swap(&one, &two);
-   has_conflicts = 1;
-   hunk = RR_CONTEXT;
-   rerere_io_putconflict('<', marker_size, io);
-   rerere_io_putmem(one.buf, one.len, io);
-   rerere_io_putconflict('=', marker_size, io);
-   rerere_io_putmem(two.buf, two.len, io);
-   rerere_io_putconflict('>', marker_size, io);
-   if (sha1) {
-   git_SHA1_Update(&ctx, one.buf ? one.buf : "",
+   rerere_strbuf_putconflict(out, '<', marker_size);
+   strbuf_addbuf(out, &one);
+   rerere_strbuf_putconflict(out, '=', marker_size);
+   strbuf_addbuf(out, &two);
+   rerere_strbuf_putconflict(

[PATCH 2/2] fetch: send "refs/tags/" prefix upon CLI refspecs

2018-06-05 Thread Jonathan Tan
When performing tag following, in addition to using the server's
"include-tag" capability to send tag objects (and emulating it if the
server does not support that capability), "git fetch" relies upon the
presence of refs/tags/* entries in the initial ref advertisement to
locally create refs pointing to the aforementioned tag objects. When
using protocol v2, refs/tags/* entries in the initial ref advertisement
may be suppressed by a ref-prefix argument, leading to the tag object
being downloaded, but the ref not being created.

Commit dcc73cf7ff ("fetch: generate ref-prefixes when using a configured
refspec", 2018-05-18) ensured that "refs/tags/" is always sent as a ref
prefix when "git fetch" is invoked with no refspecs, but not when "git
fetch" is invoked with refspecs. Extend that functionality to make it
work in both situations.

Signed-off-by: Jonathan Tan 
---
 builtin/fetch.c|  2 +-
 t/t5702-protocol-v2.sh | 18 ++
 2 files changed, 19 insertions(+), 1 deletion(-)

diff --git a/builtin/fetch.c b/builtin/fetch.c
index ea5b9669a..1f447f1e8 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -359,7 +359,7 @@ static struct ref *get_ref_map(struct transport *transport,
refspec_ref_prefixes(&transport->remote->fetch, &ref_prefixes);
 
if (ref_prefixes.argc &&
-   (tags == TAGS_SET || (tags == TAGS_DEFAULT && !rs->nr))) {
+   (tags == TAGS_SET || tags == TAGS_DEFAULT)) {
argv_array_push(&ref_prefixes, "refs/tags/");
}
 
diff --git a/t/t5702-protocol-v2.sh b/t/t5702-protocol-v2.sh
index 261e82b0f..6733579c1 100755
--- a/t/t5702-protocol-v2.sh
+++ b/t/t5702-protocol-v2.sh
@@ -406,6 +406,24 @@ test_expect_success 'fetch supports various ways of have 
lines' '
$(git -C server rev-parse completely-unrelated)
 '
 
+test_expect_success 'fetch supports include-tag and tag following' '
+   rm -rf server client trace &&
+   git init server &&
+
+   test_commit -C server to_fetch &&
+   git -C server tag -a annotated_tag -m message &&
+
+   git init client &&
+   GIT_TRACE_PACKET="$(pwd)/trace" git -C client -c protocol.version=2 \
+   fetch "$(pwd)/server" to_fetch:to_fetch &&
+
+   grep "fetch> ref-prefix to_fetch" trace &&
+   grep "fetch> ref-prefix refs/tags/" trace &&
+   grep "fetch> include-tag" trace &&
+
+   git -C client cat-file -e $(git -C client rev-parse annotated_tag)
+'
+
 # Test protocol v2 with 'http://' transport
 #
 . "$TEST_DIRECTORY"/lib-httpd.sh
-- 
2.17.0.768.g1526ddbba1.dirty



[PATCH 1/2] t5702: test fetch with multiple refspecs at a time

2018-06-05 Thread Jonathan Tan
Extend the protocol v2 tests to also test fetches with multiple refspecs
specified. This also covers the previously uncovered cases of fetching
with prefix matching and fetching by SHA-1.

Signed-off-by: Jonathan Tan 
---
 t/t5702-protocol-v2.sh | 47 ++
 1 file changed, 47 insertions(+)

diff --git a/t/t5702-protocol-v2.sh b/t/t5702-protocol-v2.sh
index a4fe6508b..261e82b0f 100755
--- a/t/t5702-protocol-v2.sh
+++ b/t/t5702-protocol-v2.sh
@@ -359,6 +359,53 @@ test_expect_success 'default refspec is used to filter ref 
when fetchcing' '
grep "ref-prefix refs/tags/" log
 '
 
+test_expect_success 'fetch supports various ways of have lines' '
+   rm -rf server client trace &&
+   git init server &&
+   test_commit -C server dwim &&
+   TREE=$(git -C server rev-parse HEAD^{tree}) &&
+   git -C server tag exact \
+   $(git -C server commit-tree -m a "$TREE") &&
+   git -C server tag dwim-unwanted \
+   $(git -C server commit-tree -m b "$TREE") &&
+   git -C server tag exact-unwanted \
+   $(git -C server commit-tree -m c "$TREE") &&
+   git -C server tag prefix1 \
+   $(git -C server commit-tree -m d "$TREE") &&
+   git -C server tag prefix2 \
+   $(git -C server commit-tree -m e "$TREE") &&
+   git -C server tag fetch-by-sha1 \
+   $(git -C server commit-tree -m f "$TREE") &&
+   git -C server tag completely-unrelated \
+   $(git -C server commit-tree -m g "$TREE") &&
+   
+   git init client &&
+   GIT_TRACE_PACKET="$(pwd)/trace" git -C client -c protocol.version=2 \
+   fetch "file://$(pwd)/server" \
+   dwim \
+   refs/tags/exact \
+   refs/tags/prefix*:refs/tags/prefix* \
+   "$(git -C server rev-parse fetch-by-sha1)" &&
+
+   # Ensure that the appropriate prefixes are sent (using a sample)
+   grep "fetch> ref-prefix dwim" trace &&
+   grep "fetch> ref-prefix refs/heads/dwim" trace &&
+   grep "fetch> ref-prefix refs/tags/prefix" trace &&
+
+   # Ensure that the correct objects are returned
+   git -C client cat-file -e $(git -C server rev-parse dwim) &&
+   git -C client cat-file -e $(git -C server rev-parse exact) &&
+   git -C client cat-file -e $(git -C server rev-parse prefix1) &&
+   git -C client cat-file -e $(git -C server rev-parse prefix2) &&
+   git -C client cat-file -e $(git -C server rev-parse fetch-by-sha1) &&
+   test_must_fail git -C client cat-file -e \
+   $(git -C server rev-parse dwim-unwanted) &&
+   test_must_fail git -C client cat-file -e \
+   $(git -C server rev-parse exact-unwanted) &&
+   test_must_fail git -C client cat-file -e \
+   $(git -C server rev-parse completely-unrelated)
+'
+
 # Test protocol v2 with 'http://' transport
 #
 . "$TEST_DIRECTORY"/lib-httpd.sh
-- 
2.17.0.768.g1526ddbba1.dirty



[PATCH 0/2] Fix protocol v2 tag following with CLI refspec

2018-06-05 Thread Jonathan Tan
After the events that led up to Jonathan Nieder's patch fixing fetch by
SHA-1 in protocol v2 [1], while expanding protocol v2's test coverage, I
found a bug with tag following. Patch 2 is a patch that fixes that bug
(and patch 1 is a related but independent test that I had written
beforehand).

[1] 
https://public-inbox.org/git/20180531072339.ga43...@aiede.svl.corp.google.com/

Jonathan Tan (2):
  t5702: test fetch with multiple refspecs at a time
  fetch: send "refs/tags/" prefix upon CLI refspecs

 builtin/fetch.c|  2 +-
 t/t5702-protocol-v2.sh | 65 ++
 2 files changed, 66 insertions(+), 1 deletion(-)

-- 
2.17.0.768.g1526ddbba1.dirty



Re: [PATCH 2/2] fetch: send "refs/tags/" prefix upon CLI refspecs

2018-06-05 Thread Brandon Williams
On 06/05, Jonathan Tan wrote:
> When performing tag following, in addition to using the server's
> "include-tag" capability to send tag objects (and emulating it if the
> server does not support that capability), "git fetch" relies upon the
> presence of refs/tags/* entries in the initial ref advertisement to
> locally create refs pointing to the aforementioned tag objects. When
> using protocol v2, refs/tags/* entries in the initial ref advertisement
> may be suppressed by a ref-prefix argument, leading to the tag object
> being downloaded, but the ref not being created.
> 
> Commit dcc73cf7ff ("fetch: generate ref-prefixes when using a configured
> refspec", 2018-05-18) ensured that "refs/tags/" is always sent as a ref
> prefix when "git fetch" is invoked with no refspecs, but not when "git
> fetch" is invoked with refspecs. Extend that functionality to make it
> work in both situations.
> 
> Signed-off-by: Jonathan Tan 

Test t5702-protocol-v2.sh doesn't pass with this patch.

> ---
>  builtin/fetch.c|  2 +-
>  t/t5702-protocol-v2.sh | 18 ++
>  2 files changed, 19 insertions(+), 1 deletion(-)
> 
> diff --git a/builtin/fetch.c b/builtin/fetch.c
> index ea5b9669a..1f447f1e8 100644
> --- a/builtin/fetch.c
> +++ b/builtin/fetch.c
> @@ -359,7 +359,7 @@ static struct ref *get_ref_map(struct transport 
> *transport,
>   refspec_ref_prefixes(&transport->remote->fetch, &ref_prefixes);
>  
>   if (ref_prefixes.argc &&
> - (tags == TAGS_SET || (tags == TAGS_DEFAULT && !rs->nr))) {
> + (tags == TAGS_SET || tags == TAGS_DEFAULT)) {
>   argv_array_push(&ref_prefixes, "refs/tags/");
>   }
>  
> diff --git a/t/t5702-protocol-v2.sh b/t/t5702-protocol-v2.sh
> index 261e82b0f..6733579c1 100755
> --- a/t/t5702-protocol-v2.sh
> +++ b/t/t5702-protocol-v2.sh
> @@ -406,6 +406,24 @@ test_expect_success 'fetch supports various ways of have 
> lines' '
>   $(git -C server rev-parse completely-unrelated)
>  '
>  
> +test_expect_success 'fetch supports include-tag and tag following' '
> + rm -rf server client trace &&
> + git init server &&
> +
> + test_commit -C server to_fetch &&
> + git -C server tag -a annotated_tag -m message &&
> +
> + git init client &&
> + GIT_TRACE_PACKET="$(pwd)/trace" git -C client -c protocol.version=2 \
> + fetch "$(pwd)/server" to_fetch:to_fetch &&
> +
> + grep "fetch> ref-prefix to_fetch" trace &&
> + grep "fetch> ref-prefix refs/tags/" trace &&
> + grep "fetch> include-tag" trace &&
> +
> + git -C client cat-file -e $(git -C client rev-parse annotated_tag)
> +'
> +
>  # Test protocol v2 with 'http://' transport
>  #
>  . "$TEST_DIRECTORY"/lib-httpd.sh
> -- 
> 2.17.0.768.g1526ddbba1.dirty
> 

-- 
Brandon Williams


Re: [PATCH 2/2] fetch: send "refs/tags/" prefix upon CLI refspecs

2018-06-05 Thread Brandon Williams
On 06/05, Jonathan Tan wrote:
> When performing tag following, in addition to using the server's
> "include-tag" capability to send tag objects (and emulating it if the
> server does not support that capability), "git fetch" relies upon the
> presence of refs/tags/* entries in the initial ref advertisement to
> locally create refs pointing to the aforementioned tag objects. When
> using protocol v2, refs/tags/* entries in the initial ref advertisement
> may be suppressed by a ref-prefix argument, leading to the tag object
> being downloaded, but the ref not being created.
> 
> Commit dcc73cf7ff ("fetch: generate ref-prefixes when using a configured
> refspec", 2018-05-18) ensured that "refs/tags/" is always sent as a ref
> prefix when "git fetch" is invoked with no refspecs, but not when "git
> fetch" is invoked with refspecs. Extend that functionality to make it
> work in both situations.
> 
> Signed-off-by: Jonathan Tan 
> ---
>  builtin/fetch.c|  2 +-
>  t/t5702-protocol-v2.sh | 18 ++
>  2 files changed, 19 insertions(+), 1 deletion(-)
> 
> diff --git a/builtin/fetch.c b/builtin/fetch.c
> index ea5b9669a..1f447f1e8 100644
> --- a/builtin/fetch.c
> +++ b/builtin/fetch.c
> @@ -359,7 +359,7 @@ static struct ref *get_ref_map(struct transport 
> *transport,
>   refspec_ref_prefixes(&transport->remote->fetch, &ref_prefixes);
>  
>   if (ref_prefixes.argc &&
> - (tags == TAGS_SET || (tags == TAGS_DEFAULT && !rs->nr))) {
> + (tags == TAGS_SET || tags == TAGS_DEFAULT)) {
>   argv_array_push(&ref_prefixes, "refs/tags/");
>   }

This is difficult...Really I don't think the default should be to follow
tags.  Mostly because this defeats the purpose of ref filtering when a
user only requests the master branch.  Now instead of the server only
sending the master branch, you get the whole tags namespace as well.

>  
> diff --git a/t/t5702-protocol-v2.sh b/t/t5702-protocol-v2.sh
> index 261e82b0f..6733579c1 100755
> --- a/t/t5702-protocol-v2.sh
> +++ b/t/t5702-protocol-v2.sh
> @@ -406,6 +406,24 @@ test_expect_success 'fetch supports various ways of have 
> lines' '
>   $(git -C server rev-parse completely-unrelated)
>  '
>  
> +test_expect_success 'fetch supports include-tag and tag following' '
> + rm -rf server client trace &&
> + git init server &&
> +
> + test_commit -C server to_fetch &&
> + git -C server tag -a annotated_tag -m message &&
> +
> + git init client &&
> + GIT_TRACE_PACKET="$(pwd)/trace" git -C client -c protocol.version=2 \
> + fetch "$(pwd)/server" to_fetch:to_fetch &&
> +
> + grep "fetch> ref-prefix to_fetch" trace &&
> + grep "fetch> ref-prefix refs/tags/" trace &&
> + grep "fetch> include-tag" trace &&
> +
> + git -C client cat-file -e $(git -C client rev-parse annotated_tag)
> +'
> +
>  # Test protocol v2 with 'http://' transport
>  #
>  . "$TEST_DIRECTORY"/lib-httpd.sh
> -- 
> 2.17.0.768.g1526ddbba1.dirty
> 

-- 
Brandon Williams


git question from a newbie

2018-06-05 Thread Heinz, Steve
Hi.

I am new to Git and have read quite a few articles on it.
I am planning on setting up a remote repository on a windows 2012 R2 server and 
will access it via HTTPS.
I am setting up a local repository on my desk top (others in my group will do 
the same).
On "server1":  I install Git and create a repository "repos".
On "server1":  I create a dummy webpage "default.htm" and place it in the repo 
folder.
On "server1":  I create a web application in IIS pointing to Git
On Server1":   change permissions so IIS_User  has access to the folders.
On "server1":  inside the "repos" folder and right click and choose "bash here"
On "server1":   $ git init  -bare(it's really 2 hyphens)

On Desktop:  open Chrome and type in URL to make sure we can access it
https://xyz/repos/default.htm
  ** make sure you have access, no certificate issues or firewall issues.  
The pages shows up fine

On Desktop:  install Git and create repository "repos".
On Desktop:  right click in "repos" folder and choose "bash here"
On Desktop:  $ git init
On Desktop : add a folder "testProject" under the "repos" folder and add some 
files to the folder
On Desktop:  $ git add . (will add files and folder to working 
tree)
On Desktop   $ git status   (shows it recognizes the filed were 
added)
On Desktop   $ git commit -m "test project commit"   (will stage 
changes)
On Desktop   $ git push https://xyz.domainname.com/repos master

** this is the error I get,  I have tried many different things.  I am sure I 
am doing something stupid
** I have tried a bunch of variations but I always get the same error.  It 
looks like some type of network/permission
** thing but everything seems OK.
   Fatal: repository 'https://xyz.domainname.com/repos/' not found

*** this is where I get the error trying to push staged items to the remote 
repository.
*** I have tried to clone the empty remote repository still same error
*** I checked port 443 is opened and being used for https
*** tried to set origin to https://xyz.domainname.com/repos"; and then $git push 
origin master   (same error)
*** I tried passing credentials to the remote server as well


Any ideas would be greatly appreciated.
Thanks
Steve



The information contained in this email message is intended only for the 
private and confidential use of the recipient(s) named above, unless the sender 
expressly agrees otherwise. In no event shall AAA Northeast or any of its 
affiliates accept any responsibility for the loss, use or misuse of any 
information including confidential information, which is sent to AAA Northeast 
or its affiliates via email, or email attachment. AAA Northeast does not 
guarantee the accuracy of any email or email attachment. If the reader of this 
message is not the intended recipient and/or you have received this email in 
error, you must take no action based on the information in this email and you 
are hereby notified that any dissemination, misuse or copying or disclosure of 
this communication is strictly prohibited. If you have received this 
communication in error, please notify us immediately by email and delete the 
original message.


[PATCH v2 1/2] t5702: test fetch with multiple refspecs at a time

2018-06-05 Thread Jonathan Tan
Extend the protocol v2 tests to also test fetches with multiple refspecs
specified. This also covers the previously uncovered cases of fetching
with prefix matching and fetching by SHA-1.

Signed-off-by: Jonathan Tan 
---
 t/t5702-protocol-v2.sh | 47 ++
 1 file changed, 47 insertions(+)

diff --git a/t/t5702-protocol-v2.sh b/t/t5702-protocol-v2.sh
index a4fe6508b..261e82b0f 100755
--- a/t/t5702-protocol-v2.sh
+++ b/t/t5702-protocol-v2.sh
@@ -359,6 +359,53 @@ test_expect_success 'default refspec is used to filter ref 
when fetchcing' '
grep "ref-prefix refs/tags/" log
 '
 
+test_expect_success 'fetch supports various ways of have lines' '
+   rm -rf server client trace &&
+   git init server &&
+   test_commit -C server dwim &&
+   TREE=$(git -C server rev-parse HEAD^{tree}) &&
+   git -C server tag exact \
+   $(git -C server commit-tree -m a "$TREE") &&
+   git -C server tag dwim-unwanted \
+   $(git -C server commit-tree -m b "$TREE") &&
+   git -C server tag exact-unwanted \
+   $(git -C server commit-tree -m c "$TREE") &&
+   git -C server tag prefix1 \
+   $(git -C server commit-tree -m d "$TREE") &&
+   git -C server tag prefix2 \
+   $(git -C server commit-tree -m e "$TREE") &&
+   git -C server tag fetch-by-sha1 \
+   $(git -C server commit-tree -m f "$TREE") &&
+   git -C server tag completely-unrelated \
+   $(git -C server commit-tree -m g "$TREE") &&
+   
+   git init client &&
+   GIT_TRACE_PACKET="$(pwd)/trace" git -C client -c protocol.version=2 \
+   fetch "file://$(pwd)/server" \
+   dwim \
+   refs/tags/exact \
+   refs/tags/prefix*:refs/tags/prefix* \
+   "$(git -C server rev-parse fetch-by-sha1)" &&
+
+   # Ensure that the appropriate prefixes are sent (using a sample)
+   grep "fetch> ref-prefix dwim" trace &&
+   grep "fetch> ref-prefix refs/heads/dwim" trace &&
+   grep "fetch> ref-prefix refs/tags/prefix" trace &&
+
+   # Ensure that the correct objects are returned
+   git -C client cat-file -e $(git -C server rev-parse dwim) &&
+   git -C client cat-file -e $(git -C server rev-parse exact) &&
+   git -C client cat-file -e $(git -C server rev-parse prefix1) &&
+   git -C client cat-file -e $(git -C server rev-parse prefix2) &&
+   git -C client cat-file -e $(git -C server rev-parse fetch-by-sha1) &&
+   test_must_fail git -C client cat-file -e \
+   $(git -C server rev-parse dwim-unwanted) &&
+   test_must_fail git -C client cat-file -e \
+   $(git -C server rev-parse exact-unwanted) &&
+   test_must_fail git -C client cat-file -e \
+   $(git -C server rev-parse completely-unrelated)
+'
+
 # Test protocol v2 with 'http://' transport
 #
 . "$TEST_DIRECTORY"/lib-httpd.sh
-- 
2.17.0.768.g1526ddbba1.dirty



[PATCH v2 0/2] Fix protocol v2 tag following with CLI refspec

2018-06-05 Thread Jonathan Tan
> Test t5702-protocol-v2.sh doesn't pass with this patch.

Good catch - I've fixed that.

> This is difficult...Really I don't think the default should be to follow
> tags.  Mostly because this defeats the purpose of ref filtering when a
> user only requests the master branch.  Now instead of the server only
> sending the master branch, you get the whole tags namespace as well.

It's true that there is now a performance drop. My instinctive reaction
is to be conservative and preserve the existing behavior, but I'll see
what others on this list think.

It's worth nothing that with ref-in-want (for example, in your latest
series [1]) we might be able to restore performance, because the server
can send refs/tags/X with the packfile instead of sending all
refs/tags/X refs initially to the client.

[1] https://public-inbox.org/git/20180605175144.4225-1-bmw...@google.com/

Jonathan Tan (2):
  t5702: test fetch with multiple refspecs at a time
  fetch: send "refs/tags/" prefix upon CLI refspecs

 builtin/fetch.c|  2 +-
 t/t5702-protocol-v2.sh | 71 --
 2 files changed, 69 insertions(+), 4 deletions(-)

-- 
2.17.0.768.g1526ddbba1.dirty



[PATCH v2 2/2] fetch: send "refs/tags/" prefix upon CLI refspecs

2018-06-05 Thread Jonathan Tan
When performing tag following, in addition to using the server's
"include-tag" capability to send tag objects (and emulating it if the
server does not support that capability), "git fetch" relies upon the
presence of refs/tags/* entries in the initial ref advertisement to
locally create refs pointing to the aforementioned tag objects. When
using protocol v2, refs/tags/* entries in the initial ref advertisement
may be suppressed by a ref-prefix argument, leading to the tag object
being downloaded, but the ref not being created.

Commit dcc73cf7ff ("fetch: generate ref-prefixes when using a configured
refspec", 2018-05-18) ensured that "refs/tags/" is always sent as a ref
prefix when "git fetch" is invoked with no refspecs, but not when "git
fetch" is invoked with refspecs. Extend that functionality to make it
work in both situations.

This also necessitates a change another test which tested ref
advertisement filtering using tag refs - since tag refs are sent by
default now, the test has been switched to using branch refs instead.

Signed-off-by: Jonathan Tan 
---
 builtin/fetch.c|  2 +-
 t/t5702-protocol-v2.sh | 24 +---
 2 files changed, 22 insertions(+), 4 deletions(-)

diff --git a/builtin/fetch.c b/builtin/fetch.c
index ea5b9669a..1f447f1e8 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -359,7 +359,7 @@ static struct ref *get_ref_map(struct transport *transport,
refspec_ref_prefixes(&transport->remote->fetch, &ref_prefixes);
 
if (ref_prefixes.argc &&
-   (tags == TAGS_SET || (tags == TAGS_DEFAULT && !rs->nr))) {
+   (tags == TAGS_SET || tags == TAGS_DEFAULT)) {
argv_array_push(&ref_prefixes, "refs/tags/");
}
 
diff --git a/t/t5702-protocol-v2.sh b/t/t5702-protocol-v2.sh
index 261e82b0f..b31b6d8d3 100755
--- a/t/t5702-protocol-v2.sh
+++ b/t/t5702-protocol-v2.sh
@@ -204,6 +204,7 @@ test_expect_success 'ref advertisment is filtered during 
fetch using protocol v2
test_when_finished "rm -f log" &&
 
test_commit -C file_parent three &&
+   git -C file_parent branch unwanted-branch three &&
 
GIT_TRACE_PACKET="$(pwd)/log" git -C file_child -c protocol.version=2 \
fetch origin master &&
@@ -212,9 +213,8 @@ test_expect_success 'ref advertisment is filtered during 
fetch using protocol v2
git -C file_parent log -1 --format=%s >expect &&
test_cmp expect actual &&
 
-   ! grep "refs/tags/one" log &&
-   ! grep "refs/tags/two" log &&
-   ! grep "refs/tags/three" log
+   grep "refs/heads/master" log &&
+   ! grep "refs/heads/unwanted-branch" log
 '
 
 test_expect_success 'server-options are sent when fetching' '
@@ -406,6 +406,24 @@ test_expect_success 'fetch supports various ways of have 
lines' '
$(git -C server rev-parse completely-unrelated)
 '
 
+test_expect_success 'fetch supports include-tag and tag following' '
+   rm -rf server client trace &&
+   git init server &&
+
+   test_commit -C server to_fetch &&
+   git -C server tag -a annotated_tag -m message &&
+
+   git init client &&
+   GIT_TRACE_PACKET="$(pwd)/trace" git -C client -c protocol.version=2 \
+   fetch "$(pwd)/server" to_fetch:to_fetch &&
+
+   grep "fetch> ref-prefix to_fetch" trace &&
+   grep "fetch> ref-prefix refs/tags/" trace &&
+   grep "fetch> include-tag" trace &&
+
+   git -C client cat-file -e $(git -C client rev-parse annotated_tag)
+'
+
 # Test protocol v2 with 'http://' transport
 #
 . "$TEST_DIRECTORY"/lib-httpd.sh
-- 
2.17.0.768.g1526ddbba1.dirty



Re: [PATCHv1 1/3] git-p4: raise exceptions from p4CmdList based on error from p4 server

2018-06-05 Thread Luke Diamand
On 5 June 2018 at 20:41, Eric Sunshine  wrote:
> On Tue, Jun 5, 2018 at 6:56 AM Luke Diamand  wrote:
>> On 5 June 2018 at 10:54, Eric Sunshine  wrote:
>> > On Tue, Jun 5, 2018 at 5:14 AM Luke Diamand  wrote:
>> >> +m = re.search('Too many rows scanned \(over 
>> >> (\d+)\)', data)
>> >> +if not m:
>> >> +m = re.search('Request too large \(over 
>> >> (\d+)\)', data)
>> >
>> > Does 'p4' localize these error messages?
>>
>> That's a good question.
>>
>> It turns out that Perforce open-sourced the P4 client in 2014 (I only
>> recently found this out) so we can actually look at the code now!
>>
>> Here's the code:
>>
>> // ErrorId graveyard: retired/deprecated ErrorIds.
>
> Hmm, the "too many rows" error you're seeing is retired/deprecated(?).

There's some code elsewhere that suggests it's being kept alive:

// Retired ErrorIds. We need to keep these so that clients
// built with newer apis can commnunicate with older servers
// still sending these.

static ErrorId MaxResults; // DEPRECATED
static ErrorId MaxScanRows; // DEPRECATED


>
>> ErrorId MsgDb::MaxResults  = { ErrorOf( ES_DB, 32,
>> E_FAILED, EV_ADMIN, 1 ), "Request too large (over %maxResults%); see
>> 'p4 help maxresults'." } ;//NOTRANS
>> ErrorId MsgDb::MaxScanRows = { ErrorOf( ES_DB, 61,
>> E_FAILED, EV_ADMIN, 1 ), "Too many rows scanned (over %maxScanRows%);
>> see 'p4 help maxscanrows'." } ;//NOTRANS
>>
>> I don't think there's actually a way to make it return any language
>> other than English though. [...]
>> So I think probably the language is always English.
>
> The "NOTRANS" annotation on the error messages is reassuring.

I'll check it works OK on Windows; charset translation might cause a problem.


RE: git question from a newbie

2018-06-05 Thread Randall S. Becker
On June 5, 2018 5:24 PM, Steve Heinz wrote:
> I am new to Git and have read quite a few articles on it.
> I am planning on setting up a remote repository on a windows 2012 R2
server
> and will access it via HTTPS.
> I am setting up a local repository on my desk top (others in my group will
do
> the same).
> On "server1":  I install Git and create a repository "repos".
> On "server1":  I create a dummy webpage "default.htm" and place it in the
> repo folder.
> On "server1":  I create a web application in IIS pointing to Git
> On Server1":   change permissions so IIS_User  has access to the folders.
> On "server1":  inside the "repos" folder and right click and choose "bash
> here"
> On "server1":   $ git init  -bare(it's really 2 hyphens)
> 
> On Desktop:  open Chrome and type in URL to make sure we can access it
> https://xyz/repos/default.htm
>   ** make sure you have access, no certificate issues or firewall
issues.  The
> pages shows up fine
> 
> On Desktop:  install Git and create repository "repos".
> On Desktop:  right click in "repos" folder and choose "bash here"
> On Desktop:  $ git init
> On Desktop : add a folder "testProject" under the "repos" folder and add
> some files to the folder
> On Desktop:  $ git add . (will add files and folder to
working tree)
> On Desktop   $ git status   (shows it recognizes the filed
were added)
> On Desktop   $ git commit -m "test project commit"   (will stage
changes)
> On Desktop   $ git push https://xyz.domainname.com/repos master
> 
> ** this is the error I get,  I have tried many different things.  I am
sure I am
> doing something stupid
> ** I have tried a bunch of variations but I always get the same error.  It
looks
> like some type of network/permission
> ** thing but everything seems OK.
>Fatal: repository 'https://xyz.domainname.com/repos/' not found
> 
> *** this is where I get the error trying to push staged items to the
remote
> repository.
> *** I have tried to clone the empty remote repository still same error
> *** I checked port 443 is opened and being used for https
> *** tried to set origin to https://xyz.domainname.com/repos"; and then $git
> push origin master   (same error)
> *** I tried passing credentials to the remote server as well

Missing glue - git remote

git remote add origin https://xyz.domainname.com/repos

Cheers,
Randall

-- Brief whoami:
 NonStop developer since approximately 2112884442
 UNIX developer since approximately 421664400
-- In my real life, I talk too much.





  1   2   >