Re: git-scm.com status report

2017-02-01 Thread Samuel Lijin
In theory, you could also dump the build artifacts to a GH Pages repo
and host it from there, although I don't know if you would run up
against any of the usage limits[0]. The immediate problem I see with
that approach, though, is that I have no idea how any of the dynamic
stuff (e.g. search) would be replaced.

A question: there's a DB schema in there. Does the site still use a DB?

[0] https://help.github.com/articles/what-is-github-pages/#usage-limits

On Wed, Feb 1, 2017 at 10:36 PM, Eric Wong  wrote:
> Jeff King  wrote:
>> With the caveat that I know very little about web hosting, $230/mo
>> sounds like an awful lot for what is essentially a static web site.
>
> Yes, that's a lot.
>
> Fwiw, that covers a year of low-end VPS hosting for the main
> public-inbox.org/git machine + mail host
> (~1GB git objects + ~3GB Xapian index).
>
>> The site does see a lot of hits, but most of the content is a few basic
>> web pages, and copies of other static content that is updated
>> only occasionally (manpage content, lists of downloads, etc).  The biggest
>> dynamic component is the site search, I think.
>
> Maybe optimize search if that's slowest, first.  public-inbox
> uses per-host Xapian indexes so there's no extra network latency
> and it seems to work well.  But maybe you don't get FS write
> access without full VPS access on Heroku...
>
> nginx handles static content easily, and since it looks like you
> guys use unicorn[*] for running the Ruby part.  I really hope
> nginx is in front of unicorn, since (AFAIK) Heroku doesn't put
> nginx in front of it by default.
>
>
> [*] I wrote and maintain unicorn; and have not yet recommended
> any reverse proxy besides nginx to buffer for it.
> However, having varnish or any other caching layer in
> between nginx and unicorn is great, too.  I dunno how Heroku
> (or any proprietary deployment systems) handle it, though.
>
>> I do wonder if there's room for improvement either:
>>
>>   - by measuring and optimizing the Heroku deploy. I have no idea about
>> scaling Rails or Heroku apps. Do we really need three expensive
>> dynos, or a $50/mo database plan? I'm not even sure what to measure,
>> or how. There are some analytics on the site, but I don't have
>> access to them (I could probably dig around for access if there was
>> somebody who actually knew how to do something productive with
>> them).
>
> I track down the most expensive requests in per-request timing
> logs and work on profiling + optimizations from there...
> Nothing fancy and no relying on proprietary tools like NewRelic.
>
> I also watch for queueing in listen socket backlog (with
> raindrops  or ss(8) to
> notice overloading.  Again, I don't know how much visibility
> you have with Heroku.
>
>>   - by moving to a simpler model. I wonder if we could build the site
>> once and then deploy a more static variant of it to a cheaper
>> hosting platform. I'm not really sure what our options would be, how
>> much work it would take to do the conversion, and if we'd lose any
>> functionality.
>
> *shrug*  That'd be more work, at least.  I'd figure out what's
> slow, first.
>
> Fwiw, Varnish definitely helps public-inbox when slammed by
> HN/Reddit traffic.  It's great as long as you don't have
> per-user data to invalidate, which seems to be the case for
> git-scm.
>
>> If anybody is interested in tackling a project like this, let me know,
>> and I can try to provide access to whatever parts are needed.
>
> While I'm not up-to-date with modern Rails or deployment stuff,
> I'm available via email if you have any lower-level
> Ruby/unicorn/nginx-related questions.  I'm sure GitHub/GitLab
> also has folks familiar with nginx+unicorn deployment on
> bare metal or VPS who could also help.


Re: git-scm.com status report

2017-02-01 Thread Eric Wong
Jeff King  wrote:
> With the caveat that I know very little about web hosting, $230/mo
> sounds like an awful lot for what is essentially a static web site.

Yes, that's a lot.

Fwiw, that covers a year of low-end VPS hosting for the main
public-inbox.org/git machine + mail host
(~1GB git objects + ~3GB Xapian index).

> The site does see a lot of hits, but most of the content is a few basic
> web pages, and copies of other static content that is updated
> only occasionally (manpage content, lists of downloads, etc).  The biggest
> dynamic component is the site search, I think.

Maybe optimize search if that's slowest, first.  public-inbox
uses per-host Xapian indexes so there's no extra network latency
and it seems to work well.  But maybe you don't get FS write
access without full VPS access on Heroku...

nginx handles static content easily, and since it looks like you
guys use unicorn[*] for running the Ruby part.  I really hope
nginx is in front of unicorn, since (AFAIK) Heroku doesn't put
nginx in front of it by default.


[*] I wrote and maintain unicorn; and have not yet recommended
any reverse proxy besides nginx to buffer for it.
However, having varnish or any other caching layer in
between nginx and unicorn is great, too.  I dunno how Heroku
(or any proprietary deployment systems) handle it, though.

> I do wonder if there's room for improvement either:
> 
>   - by measuring and optimizing the Heroku deploy. I have no idea about
> scaling Rails or Heroku apps. Do we really need three expensive
> dynos, or a $50/mo database plan? I'm not even sure what to measure,
> or how. There are some analytics on the site, but I don't have
> access to them (I could probably dig around for access if there was
> somebody who actually knew how to do something productive with
> them).

I track down the most expensive requests in per-request timing
logs and work on profiling + optimizations from there...
Nothing fancy and no relying on proprietary tools like NewRelic.

I also watch for queueing in listen socket backlog (with
raindrops  or ss(8) to
notice overloading.  Again, I don't know how much visibility
you have with Heroku.

>   - by moving to a simpler model. I wonder if we could build the site
> once and then deploy a more static variant of it to a cheaper
> hosting platform. I'm not really sure what our options would be, how
> much work it would take to do the conversion, and if we'd lose any
> functionality.

*shrug*  That'd be more work, at least.  I'd figure out what's
slow, first.

Fwiw, Varnish definitely helps public-inbox when slammed by
HN/Reddit traffic.  It's great as long as you don't have
per-user data to invalidate, which seems to be the case for
git-scm.

> If anybody is interested in tackling a project like this, let me know,
> and I can try to provide access to whatever parts are needed.

While I'm not up-to-date with modern Rails or deployment stuff,
I'm available via email if you have any lower-level
Ruby/unicorn/nginx-related questions.  I'm sure GitHub/GitLab
also has folks familiar with nginx+unicorn deployment on
bare metal or VPS who could also help.


init --separate-git-dir does not set core.worktree

2017-02-01 Thread Kyle Meyer
Hello,

As of 6311cfaf9 (init: do not set unnecessary core.worktree,
2016-09-25), "git init --separate-git-dir" no longer sets core.worktree
(test below).  Based on the commit message and the corresponding thread
[1], I don't think this change in behavior was intentional, but I wasn't
able to understand things well enough to attempt a patch.

Thanks.

[1] 
https://public-inbox.org/git/calqjkkzo_y0dncrjjooyz7eso7ybmghvz6fe92oo4su7jec...@mail.gmail.com/T/#u

diff --git a/t/t0001-init.sh b/t/t0001-init.sh
index b8fc588b1..444e75865 100755
--- a/t/t0001-init.sh
+++ b/t/t0001-init.sh
@@ -309,6 +309,7 @@ test_expect_success 'init with separate gitdir' '
git init --separate-git-dir realgitdir newdir &&
echo "gitdir: $(pwd)/realgitdir" >expected &&
test_cmp expected newdir/.git &&
+   check_config realgitdir false "$(pwd)/newdir" &&
test_path_is_dir realgitdir/refs
 '
 


Git / Software Freedom Conservancy status report

2017-02-01 Thread Jeff King
Since it's been about a year since the last one, I'd like to give a
brief overview of the activities of Git as a member project of the
Software Freedom Conservancy. I plan to have a discussion session at the
Git Merge Contributor Summit tomorrow; I'll try to relay any interesting
points from that session to the list.

# Background

Git is a member project of the Software Freedom Conservancy. We joined
in 2010 so that Conservancy could help us manage our money and other
assets, and represent us for legal stuff like trademarks. Conservancy
doesn't hold any copyright on Git code, and our status as a member
project is totally disconnected from the development of the code.

A "Project Leadership Committee" (PLC) represents Git in Conservancy.
The PLC consists of me, Shawn Pearce, and Junio Hamano.

# Financials

In short, we have about $21K (USD) in our account. Here's the breakdown
of money in versus out (these are sums of double-entry transactions, so
negative is good here):

 $-27,468.01  Income:Git
   $6,207.50  Expenses:Git

 $-21,260.51

You can see we don't actually spend that much. Here's the same report
just since 2016-01-01:

  $-4,468.64  Income:Git
   $2,102.32  Expenses:Git

  $-2,366.32

Those reports come from running "ledger" on the file given to us by
Conservancy. I can run other reports if there are things people want to
know.

I _think_ there are some outstanding transactions that actually bump
that closer to $3K (e.g., money invoiced for GSoC that gets collected by
Conservancy and then eventually dumped into a bank account).

Here's the same report over the last three years:

 $-12,990.84  Income:Git
   $2,772.78  Expenses:Git

 $-10,218.06

So we seem to average about $3K/year ahead of our expenses.

## Where does the income come from?

We get money from the mentor stipends for GSoC. We get about $1K/year in
donations (you can donate to Conservancy and earmark for Git, or there's
a "donate" button on git-scm.com, though it's not particularly
prominent).

Links from git-scm.com to buy the paper copy of the Pro Git book are
Amazon affiliate links. If people buy stuff (either the book, or other
things after clicking around the site), we get some money. This accounts
for ~$800 this year.

Packt Publishing has several technical books related to Git, and they
send some of the royalties to the project. That was ~$400 this year.

## Where do expenses go?

Most of the money goes to travel. Note that this accounts for some of
the income, too. if we send a mentor to the GSoC mentor summit, for
example, we invoice Google for the money, which shows up as income.  And
then it leaves us to reimburse the mentor, which is an expense.  So it's
net-zero for the project, but inflates the numbers.

We also spent ~$900 on legal filing fees for the trademark (not this
year, but that's part of the total income/expenses history I showed
earlier).

# Activity Summary for 2016

This is a summary of interesting things related to the project that
happened in the past year.

## Mentoring

We had one GSoC student, Pranit Bauva (who passed). The actual coding
project has little to do with our Conservancy status, but the Git
project did get the stipend money, and handled the financial logistics
for Christian Couder to go to the mentor summit (the money ultimately
comes from Google for that).

We were a potential Outreachy project last Spring for the first time,
but we didn't end up picking an intern. As a reminder, the Outreachy
program is somewhat similar to GSoC, but with a focus on getting
under-represented groups involved in programming (but not necessarily
college students). The projects themselves have to cover the stipend for
the intern (or get funding elsewhere to do so). Last year GitHub offered
to fund one intern for us, but we didn't end up selecting anyone.

## Travel Money

We paid for travel for one developer to Git Merge last year, and we are
covering one this year, too.

Each year I ask if financial circumstances are preventing anyone from
coming, and if the project can help. Each year I have gotten only one
response. The Git PLC discusses each case, and we decide whether and how
much money to provide; Conservancy handles the logistics. That tries to
balance privacy for individuals who need financial support with some
accountability for the project.

## git-scm.com DNS

We now own the git-scm.com domain, which is held for us in a Conservancy
account (this is mostly for convenience; they hold a ton of domains, and
can just auto-renew out of our money). We also have owned git-scm.org
for a while.

There has never really been an "official" Git website, but git-scm.com
has been the de facto one for a while. I think us actually controlling
the domain makes it more so. The actual 

git-scm.com status report

2017-02-01 Thread Jeff King
We (the Git project) got control of the git-scm.com domain this year. We
have never really had an "official" website, but I think a lot of people
consider this to be one.

This is an overview of the current state, as well as some possible
issues and future work.

## What's on the site

We have the domains git-scm.com and git-scm.org (the latter we've had
for a while). They both point to the same website, which has general
information about Git, including:

  - a general overview of Git

  - links to the latest releases (both source and some binary
installers)

  - HTML-rendered copies of the manpages (both for the current version
and historical versions)

  - an HTML rendering of the contents of the Pro Git book, along with
translations. The book content is licensed cc-by-nc-sa and developed
openly.

  - various external links to books, tutorials, GUI tools, etc

## How is it developed and hosted

The site is a Ruby on Rails app. The git repository is
https://github.com/git/git-scm.com. Modifications are generally done by
pull requests there. I have admin access on the repository.

The deployed site is hosted on Heroku. It's part of GitHub's
meta-account, and they pay the bills. I have access to it, and am the
only person who deploys updates. Other technical staff at GitHub have
access, too, because of the account setup, but don't generally
participate in maintenance.

It uses three 1GB Heroku dynos for scaling, which is $150/mo. It also
uses some Heroku addons which add up to another $80/mo.

## Who's the maintainer

These days, it's pretty much me, with a lot of help from Jean-Noël Avila
on issues with the Pro Git import and formatting code.

Long ago, the site content and code was done by Scott Chacon, with
graphic design help from Jason Long.  Scott maintained the site with
help from Bryan Turner for many years. But over time, they both seemed
to get less active, and I haven't seen a peep from either on the site's
GitHub repo in the past year. I've started trying to respond to issues
and pull requests to keep things healthy.

The site is mostly in maintenance mode, but things do need addressing.
People show up with new additions, fixes for typos, broken links and
other formatting problems, etc. There are a lot of long-standing
Asciidoc formatting problems both for the manpages and the imported Pro
Git content.

## What next

We can probably continue in maintenance mode like this for a while.
We've fixed a lot of of the long-standing formatting issues over the
past year, so maintaining seems to have subsided in the past few months
to mostly just merging or rejecting the occasional PR.

Still, if anybody is interested in helping with this work, I'd love to
have more eyes on it. I can give people access to the GitHub repo.
Unfortunately, I can't do so for the Heroku deploy, and part of the
maintenance burden is that the site is finicky and often needs manual
intervention (e.g., a fix to formatting requires rebuilding the
manpages, which needs a job run manually on Heroku).

It's possible that the content or visual design of the site could be
improved in various ways. I don't have any strong desires myself, but
maybe others do. If people start doing larger work, though, we have a
real lack of reviewers, and I have very little expertise with Rails or
with visual design. So anybody who wants to do this should be prepared to
take maintenance ownership.

At some point, GitHub may boot us off of the shared Heroku account,
because my impression is that it's somewhat of an administrative
headache. I don't think the Git project could afford the $230/mo hosting
fees; that's basically all the money we make. On the other hand, we
haven't actively solicited funds to any great degree, and it's possible
we could get GitHub or some other entity to just sponsor us with site
fees (I've heard zero complaints from GitHub about the money; it's
mostly just that the site is an oddball among their other assets).

With the caveat that I know very little about web hosting, $230/mo
sounds like an awful lot for what is essentially a static web site.
The site does see a lot of hits, but most of the content is a few basic
web pages, and copies of other static content that is updated
only occasionally (manpage content, lists of downloads, etc).  The biggest
dynamic component is the site search, I think.

I do wonder if there's room for improvement either:

  - by measuring and optimizing the Heroku deploy. I have no idea about
scaling Rails or Heroku apps. Do we really need three expensive
dynos, or a $50/mo database plan? I'm not even sure what to measure,
or how. There are some analytics on the site, but I don't have
access to them (I could probably dig around for access if there was
somebody who actually knew how to do something productive with
them).

  - by moving to a simpler model. I wonder if we could build the site
once and then deploy a more static variant of it to a cheaper

Git trademark status and policy

2017-02-01 Thread Jeff King
As many of you already know, the Git project (as a member of Software
Freedom Conservancy) holds a trademark on "Git".  This email will try to
lay out a bit of the history and procedure around the enforcement of
that trademark, along with some open questions about policy.

I'll use "we" in the text below, which will generally mean the Git
Project Leadership Committee (PLC). I.e., the people who represent the
Git project as part of Conservancy -- me, Junio Hamano, and Shawn
Pearce.

We approached Conservancy in Feb 2013 about getting a trademark on Git
to ensure that anything calling itself "Git" remained interoperable with
Git. Conservancy's lawyer drafted the USPTO application and submitted it
that summer. The trademark was granted in late 2014 (more on that delay
in a moment).

Concurrently, we developed a written trademark policy, which you can
find here:

  https://git-scm.com/trademark

This was started from a template that Conservancy uses and customized by
Conservancy and the Git PLC.

While the original idea was to prevent people from forking the
software, breaking compatibility, and still calling it Git, the policy
covers several other cases.

One is that you can't imply successorship. So you also can't fork the
software, call it "Git++", and then tell everybody your implementation
is the next big thing.

Another is that you can't use the mark in a way that implies association
with or endorsement by the Git project. To some degree this is necessary
to prevent dilution of the mark for other uses, but there are also cases
we directly want to prevent.

For example, imagine a software project which is only tangentially
related to Git. It might use Git as a side effect, or might just be
"Git-like" in the sense of being a distributed system with chained
hashes. Let's say as an example that it does backups. We'd prefer it
not call itself GitBackups. We don't endorse it, and it's just using the
name to imply association that isn't there. You can come up with similar
hypotheticals: GitMail that stores mailing list archives in Git, or
GitWiki that uses Git as a backing store.

Those are all fictitious examples (actually, there _are_ real projects
that do each of those things, but they gave themselves much more unique
names). But they're indicative of some of the cases we've seen. I'm
intentionally not giving the real names here, because my point isn't to
shame any particular projects, but to discuss general policy.

Careful readers among you may now be wondering about GitHub, GitLab,
Gitolite, etc. And now we get back to why it took over a year to get the
trademark granted.

The USPTO initially rejected our application as confusingly similar to
the existing trademark on GitHub, which was filed in 2008. While one
might imagine where the "Git" in GitHub comes from, by the time we
applied to the USPTO, both marks had been widely used in parallel for
years.  So we worked out an agreement with GitHub which basically says
"we are mutually OK with the other trademark existing".

(There was another delay caused by a competing application from a
proprietary version control company that wanted to re-brand portions of
their system as "GitFocused" (not the real name, but similar in spirit).
We argued our right to the name and refused to settle; they eventually
withdrew their application).

So GitHub is essentially outside the scope of the trademark policy, due
to the history. We also decided to explicitly grandfather some major
projects that were using similar portmanteaus, but which had generally
been good citizens of the Git ecosystem (building on Git in a useful
way, not breaking compatibility). Those include GitLab, JGit, libgit2,
and some others. The reasoning was generally that it would be a big pain
for those projects, which have established their own brands, to have to
switch names. It's hard to hold them responsible for picking a name that
violated a policy that didn't yet exist.

If the "libgit2" project were starting from scratch today, we'd probably
ask it to use a different name (because the name may imply that it's an
official successor). However, we effectively granted permission for this
use and it would be unfair to disrupt that.

There's one other policy point that has come up: the written policy
disallows the use of "Git" or the logo on merchandise. This is something
people have asked about it (e.g., somebody made some Git stress balls,
and another person was printing keycaps with a Git logo). We have always
granted it, but wanted to reserve the right in case there was some use
that we hadn't anticipated that would be confusing or unsavory.

Enforcement of the policy is done as cases are brought to the attention
of Conservancy and the Git PLC. Sometimes people mail Conservancy
directly, and sometimes a use is noticed by the Git PLC, which mails
Conservancy.  In either case, Conservancy's lawyer pings the Git PLC,
and we decide what to do about it, with advice from the lawyer. The end
result is 

Re: [PATCH v2 7/7] completion: recognize more long-options

2017-02-01 Thread SZEDER Gábor
On Wed, Feb 1, 2017 at 5:49 PM, Cornelius Weig
 wrote:
> Hi Gabor,
>
>  thanks for taking a look at these commits.
>
> On 01/31/2017 11:17 PM, SZEDER Gábor wrote:
>> On Fri, Jan 27, 2017 at 10:17 PM,   wrote:
>>> From: Cornelius Weig 
>>>
>>> Recognize several new long-options for bash completion in the following
>>> commands:
>>
>> Adding more long options that git commands learn along the way is
>> always an improvement.  However, seeing "_several_ new long options"
>> (or "some long options" in one of the other patches in the series)
>> makes the reader wonder: are these the only new long options missing
>> or are there more?  If there are more, why only these are added?  If
>> there aren't any more missing long options left, then please say so,
>> e.g. "Add all missing long options, except the potentially
>> desctructive ones, for the following commands: "
>
> Personally, I agree with you that
>> Adding more long options that git commands learn along the way is
>> always an improvement.
> However, people may start complaining that their terminal becomes too
> cluttered when doing a double-Tab. In my cover letter, I go to length
> about this. My assumption was that all options that are mentioned in the
> introduction of the command man-page should be important enough to have
> them in the completion list.

But that doesn't mean that the ones not mentioned in the synopsis
section are not worth completing.

The list of options listed by the completion script for several of
these commands fits on a single line.  The command with the most
options among these is 'git pull', and even its options don't fill
more than half of a 80x25 screen.  I see no danger of people coming
complaining.

> I'll change my commit message accordingly.
>
>>>  - rm: --force
>>
>> '--force' is a potentially destructive option, too.
>
> Thanks for spotting this.
>
> Btw, I haven't found that non-destructive options should not be eligible
> for completion. To avoid confusion about this in the future, I suggest
> to also change the documentation:
>
> index 933bb6e..96f1c7f 100644
> --- a/contrib/completion/git-completion.bash
> +++ b/contrib/completion/git-completion.bash
> @@ -13,7 +13,7 @@
>  #*) git email aliases for git-send-email
>  #*) tree paths within 'ref:path/to/file' expressions
>  #*) file paths within current working directory and index
> -#*) common --long-options
> +#*) common non-destructive --long-options

I don't mind such a change, but I don't think that list was ever meant
to be comprehensive or decisive.  It is definitely not the former, as
it's missing several things that the completion script does support.
OTOH, it talks about .git/remotes, which has been considered legacy
for quite some years (though it's right, because the completion script
still supports it).

> I take it you have also looked at the code itself? Then I would gladly
> mention you as reviewer in my sign-off.

Yeah, most of the changes was rather straightforward, except the
completion of 'git remote's subcommands' options, but that looks
good, too.

Gábor


Re: [PATCH 4/7] completion: teach ls-remote to complete options

2017-02-01 Thread SZEDER Gábor

> ls-remote needs to complete remote names and its own options.

And refnames, too.

> In
> addition to the existing remote name completions, do also complete
> the options --heads, --tags, --refs, and --get-url.

Why only these four options and not the other four?

There are eight options in total here, so there is really no chance
for cluttered terminals, and all eight are mentioned in the synopsis.

> ---
>  contrib/completion/git-completion.bash | 6 ++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/contrib/completion/git-completion.bash 
> b/contrib/completion/git-completion.bash
> index 652c7e2..36fe439 100644
> --- a/contrib/completion/git-completion.bash
> +++ b/contrib/completion/git-completion.bash
> @@ -1449,6 +1449,12 @@ _git_ls_files ()
>  
>  _git_ls_remote ()
>  {
> + case "$cur" in
> + --*)
> + __gitcomp "--heads --tags --refs --get-url"
> + return
> + ;;
> + esac
>   __gitcomp_nl "$(__git_remotes)"
>  }
>  
> -- 
> 2.10.2




Re: [PATCH 6/7] completion: teach remote subcommands option completion

2017-02-01 Thread SZEDER Gábor

> Git-remote needs to complete remote names, its subcommands, and options
> thereof. In addition to the existing subcommand and remote name
> completion, do also complete the options
> 
>  - add: --track --master --fetch --tags --no-tags --mirror=

Oh, '--track' and '--master' are not even in the manpage or in 'git
remote -h', I could only find them after looking at the source code...

Good eyes :)

>  - set-url: --push --add --delete
>  - get-url: --push --all
>  - prune: --dry-run
> ---
>  contrib/completion/git-completion.bash | 36 
> +++---
>  1 file changed, 29 insertions(+), 7 deletions(-)
> 
> diff --git a/contrib/completion/git-completion.bash 
> b/contrib/completion/git-completion.bash
> index e76cbd7..0e09519 100644
> --- a/contrib/completion/git-completion.bash
> +++ b/contrib/completion/git-completion.bash
> @@ -2384,24 +2384,46 @@ _git_config ()
>  
>  _git_remote ()
>  {
> - local subcommands="add rename remove set-head set-branches set-url show 
> prune update"
> + local subcommands="
> + add rename remove set-head set-branches
> + get-url set-url show prune update
> + "
>   local subcommand="$(__git_find_on_cmdline "$subcommands")"
>   if [ -z "$subcommand" ]; then
> - __gitcomp "$subcommands"
> + case "$cur" in
> + --*)
> + __gitcomp "--verbose"
> + ;;
> + *)
> + __gitcomp "$subcommands"
> + ;;
> + esac
>   return
>   fi
>  
> - case "$subcommand" in
> - rename|remove|set-url|show|prune)
> - __gitcomp_nl "$(__git_remotes)"
> + case "$subcommand,$cur" in
> + add,--*)
> + __gitcomp "--track --master --fetch --tags --no-tags --mirror="
>   ;;
> - set-head|set-branches)
> + add,*)
> + ;;
> + set-head,*|set-branches,*)

The 'set-head' subcommand has '--auto' and '--delete' options, and
'set-branches' has '--add'.

>   __git_complete_remote_or_refspec
>   ;;
> - update)
> + update,*)

The 'update' subcommand has a '--prune' option.

Otherwise it all looks good.


>   __gitcomp "$(__git_get_config_variables "remotes")"
>   ;;
> + set-url,--*)
> + __gitcomp "--push --add --delete"
> + ;;
> + get-url,--*)
> + __gitcomp "--push --all"
> + ;;
> + prune,--*)
> + __gitcomp "--dry-run"
> + ;;
>   *)
> + __gitcomp_nl "$(__git_remotes)"
>   ;;
>   esac
>  }
> -- 
> 2.10.2




Re: [PATCH 2/7] completion: add subcommand completion for rerere

2017-02-01 Thread SZEDER Gábor
> Managing recorded resolutions requires command-line usage of git-rerere.
> Added subcommand completion for rerere and path completion for its
> subcommand forget.
> ---
>  contrib/completion/git-completion.bash | 11 +++
>  1 file changed, 11 insertions(+)
> 
> diff --git a/contrib/completion/git-completion.bash 
> b/contrib/completion/git-completion.bash
> index c54a557..8329f09 100644
> --- a/contrib/completion/git-completion.bash
> +++ b/contrib/completion/git-completion.bash
> @@ -2401,6 +2401,17 @@ _git_replace ()
>   __gitcomp_nl "$(__git_refs)"
>  }
>  
> +_git_rerere ()
> +{
> + local subcommands="clear forget diff remaining status gc"
> + local subcommand="$(__git_find_on_cmdline "$subcommands")"
> + if test -z "$subcommand"
> + then
> + __gitcomp "$subcommands"
> + return
> + fi
> +}
> +
>  _git_reset ()
>  {
>   __git_has_doubledash && return
> -- 
> 2.10.2

You didn't add 'rerere' to the list of porcelain commands, i.e. it
won't be listed after 'git '.  I'm not saying it should be
listed there, because I can't decide whether 'rerere' is considered
porcelain or plumbing...  Just wanted to make sure that this omission
was intentional.



Re: git commit results in many lstat()s

2017-02-01 Thread brian m. carlson
On Thu, Feb 02, 2017 at 12:14:30AM +, Gumbel, Matthew K wrote:
> I'm testing such a change locally. Git test suite seems to be running for 
> quite
> a while. Do you know any way to run it in parallel or otherwise speed it
> up?

I usually do something like the following:

  make -j3 all && (cd t && GIT_PROVE_OPTS=-j3 make prove)

This, of course, requires that you have Perl's prove installed, which
has been part of core Perl since 5.10.1.
-- 
brian m. carlson / brian with sandals: Houston, Texas, US
+1 832 623 2791 | https://www.crustytoothpaste.net/~bmc | My opinion only
OpenPGP: https://keybase.io/bk2204


signature.asc
Description: PGP signature


RE: git commit results in many lstat()s

2017-02-01 Thread Gumbel, Matthew K
"Junio C Hamano  writes:

"Gumbel, Matthew K"  writes:

>> Yes, I think that when the user passes --only flag to git-commit, then git 
>> does not
>> need to call refresh_cache() in prepare_index() in builtin/commit.c.
>>
>> I may experiment with that. Do you see any downside or negative side-effects?

> There may be other fallouts, but one that immediately comes to mind
> is that it may break pre-commit hook.

If pre-commit hook exists, we can fall-back to original behavior and call
refresh_cache(). Many repos will not have pre-commit hook and can 
benefit from the speedup.

I'm testing such a change locally. Git test suite seems to be running for quite
a while. Do you know any way to run it in parallel or otherwise speed it
up?

Thanks,
Matt


Re: [PATCH] doc: add note about ignoring --no-create-reflog

2017-02-01 Thread Junio C Hamano
Jeff King  writes:

> On Wed, Feb 01, 2017 at 03:27:09PM -0800, Junio C Hamano wrote:
>
>> I had the same trouble wording.  Another thing I noticed was that I
>> deliberately left it vague what "default" this does not override,
>> because it appears to me that those who do not set logallrefupdates
>> will get the compiled-in default and that is also not overriden.
>> 
>> IOW, "does not negate the setting of core.logallrefupdates" will
>> open us to reports "I do not have the configuration set, but I still
>> get reflog even when --no-create-reflog is given".
>> 
>>The negated form `--no-create-reflog` currently does not negate
>>the default; it overrides an earlier `--create-reflog`, though.
>> 
>> perhaps?
>
> True. I thought the default was "off", and that we merely set the config
> when initializing a repo. But looking again, it really is checking
> is_bare_repository() at runtime.
>
> I still think it is OK to mention, as the description of
> core.logallrefupdates is where we document the behavior and the
> defaults. So even with "I do not have it set", that is still the key to
> find more information.

OK, let's take yours as the final and merge it down to 'next'
soonish.


Re: git commit results in many lstat()s

2017-02-01 Thread Junio C Hamano
"Gumbel, Matthew K"  writes:

> "Junio C Hamano"  writes:
>> There probably are other things that can be optimized.
>
> Yes, I think that when the user passes --only flag to git-commit, then git 
> does not
> need to call refresh_cache() in prepare_index() in builtin/commit.c.
>
> I may experiment with that. Do you see any downside or negative side-effects?

There may be other fallouts, but one that immediately comes to mind
is that it may break pre-commit hook.

When we get "--only", we prepare an temporary index to create the
commit out of, and give it to the pre-commit hook.  The hook expects
that the cached stat information is up-to-date, iow, it does not
have to do 'update-index --refresh' before using plumbing commands
like "diff-index" to do its own inspection of the working tree.


Re: [PATCH] doc: add note about ignoring --no-create-reflog

2017-02-01 Thread Jeff King
On Wed, Feb 01, 2017 at 03:27:09PM -0800, Junio C Hamano wrote:

> I had the same trouble wording.  Another thing I noticed was that I
> deliberately left it vague what "default" this does not override,
> because it appears to me that those who do not set logallrefupdates
> will get the compiled-in default and that is also not overriden.
> 
> IOW, "does not negate the setting of core.logallrefupdates" will
> open us to reports "I do not have the configuration set, but I still
> get reflog even when --no-create-reflog is given".
> 
>The negated form `--no-create-reflog` currently does not negate
>the default; it overrides an earlier `--create-reflog`, though.
> 
> perhaps?

True. I thought the default was "off", and that we merely set the config
when initializing a repo. But looking again, it really is checking
is_bare_repository() at runtime.

I still think it is OK to mention, as the description of
core.logallrefupdates is where we document the behavior and the
defaults. So even with "I do not have it set", that is still the key to
find more information.

I do not care that strongly either way, though. This is a minor issue,
and I suspect just about any note would be helpful.

-Peff


Re: [PATCH] merge-recursive: make "CONFLICT (rename/delete)" message show both paths

2017-02-01 Thread Junio C Hamano
Matt McCutchen  writes:

> On Wed, 2017-02-01 at 12:56 -0800, Junio C Hamano wrote:
>
>> Let me make sure if I understood your changes correctly:
>>  ...
>> So the condition to guard the call to update_file() also looks
>> correct to me.
>
> All of the above matches my understanding.  Would it have saved you
> time if I had included some of this explanation in the patch "cover
> letter"?

The fact that I arrived at the same understanding by reading the
change without peeking at such a cheat-sheet gives me more peace of
mind ;-)

Thanks.


Re: [PATCH] doc: add note about ignoring --no-create-reflog

2017-02-01 Thread Junio C Hamano
Jeff King  writes:

> Should this perhaps say "currently" or "this may change in the future",
> so that people (including those who might want to fix it later) know
> that it's a limitation and not intentional?

Good point.

> I'd also probably say it a little shorter, like:
>
>   The negated form `--no-create-reflog` only overrides an earlier
>   `--create-reflog`, but currently does not negate the setting of
>   `core.logallrefupdates`.
>
> I guess that really isn't much shorter (I wondered if you could cut out
> the "overrides --create-reflog" part, since that is the normal and
> expected behavior, but I had trouble wording it to do so).

I had the same trouble wording.  Another thing I noticed was that I
deliberately left it vague what "default" this does not override,
because it appears to me that those who do not set logallrefupdates
will get the compiled-in default and that is also not overriden.

IOW, "does not negate the setting of core.logallrefupdates" will
open us to reports "I do not have the configuration set, but I still
get reflog even when --no-create-reflog is given".

   The negated form `--no-create-reflog` currently does not negate
   the default; it overrides an earlier `--create-reflog`, though.

perhaps?


Re: [PATCH] merge-recursive: make "CONFLICT (rename/delete)" message show both paths

2017-02-01 Thread Matt McCutchen
On Wed, 2017-02-01 at 12:56 -0800, Junio C Hamano wrote:
> Junio C Hamano  writes:
> 
> > Matt McCutchen  writes:
> > ...
> > > Please check that my refactoring is indeed correct!  All the
> > > existing tests pass
> > > for me, but the existing test coverage of these conflict messages
> > > looks poor.
> > 
> > This unfortunately is doing a bit too many things at once from that
> > point of view.  I need to reserve a solid quiet 20-minutes without
> > distraction to check it, which I am hoping to do tonight.
> 
> Let me make sure if I understood your changes correctly:
> 
>  * conflict_rename_delete() knew which one is renamed and which one
>    is deleted (even though the deleted one was called "other"), but
>    because in the original code handle_change_delete() wants to
>    always see tree A first and then tree B in its parameter list,
>    the original code swapped a/b before calling it.  In the original
>    code, handle_change_delete() needed to figure out which one is
>    the deleted one by looking at a_oid or b_oid.
> 
>  * In the updated code, the knowledge of which branch survives and
>    which branch is deleted is passed from the caller to
>    handle_change_delete(), which no longer needs to figure out by
>    looking at a_oid/b_oid.  The updated API only passes the oid for
>    surviving branch (as deleted one would have been 0{40} anyway).
> 
>  * In the updated code, handle_change_delete() is told the names of
>    both branches (the one that survives and the other that was
>    deleted).  It no longer has to switch between o->branch[12]
>    depending on the NULLness of a_oid/b_oid; it knows both names and
>    which one is which.
> 
>  * handle_modify_delete() also calls handle_change_delete().  Unlike
>    conflict_rename_delete(), it is not told by its caller which
>    branch keeps the path and which branch deletes the path, and
>    instead relies on handle_change_delete() to figure it out.
>    Because of the above change to the API, now it needs to sort it
>    out before calling handle_change_delete().
> 
> It all makes sense to me.  
> 
> The single call to update_file() that appears near the end of
> handle_change_delete() in the updated code corresponds to calls to
> the same function in 3 among 4 codepaths in the function in the
> original code.  It is a bit tricky to reason about, though.
> 
> In the original code, update_file() was omitted when we didn't have
> to come up with a unique alternate filename and the one that is left
> is a_oid (i.e. our side).  The way to tell if we did not come up
> with a unique alternate filename used to be to see the "renamed"
> variable but now it is the NULL-ness of "alt_path".

"alt_path" is the same variable that used to be "renamed".  I just
renamed it to be less confusing.

> And the way to
> tell if the side that is left is ours, we check to see o->branch1
> is the change_branch, not delete_branch.
> 
> So the condition to guard the call to update_file() also looks
> correct to me.

All of the above matches my understanding.  Would it have saved you
time if I had included some of this explanation in the patch "cover
letter"?

Matt



Re: [PATCH] doc: add note about ignoring --no-create-reflog

2017-02-01 Thread Cornelius Weig
>   The negated form `--no-create-reflog` only overrides an earlier
>   `--create-reflog`, but currently does not negate the setting of
>   `core.logallrefupdates`.

Even better than Junio's version. I especially like that it mentions
where the default setting comes from.


Re: [PATCH] doc: add note about ignoring --no-create-reflog

2017-02-01 Thread Cornelius Weig


On 02/02/2017 12:11 AM, Junio C Hamano wrote:
> Jeff King  writes:
> 
>> This might be nitpicking, but it's _not_ ignored. It still negates an
>> earlier "--create-reflog". It is only that it does not override the
>> decision to create a reflog caused by the setting of
>> core.logallrefupdates.

This corner case is quite important. Glad you thought about it!

> -- >8 --
> From: Cornelius Weig 
> Date: Wed, 1 Feb 2017 23:07:27 +0100
> Subject: [PATCH] doc: add note about ignoring '--no-create-reflog'
> 
> The commands git-branch and git-tag accept the '--create-reflog'
> option, and create reflog even when core.logallrefupdates
> configuration is explicitly set not to.
> 
> On the other hand, the negated form '--no-create-reflog' is accepted
> as a valid option but has no effect (other than overriding an
> earlier '--create-reflog' on the command line). This silent noop may
> puzzle users.  To communicate that this is a known limitation, add a
> short note in the manuals for git-branch and git-tag.
> 
> Signed-off-by: Cornelius Weig 
> Signed-off-by: Junio C Hamano 
> ---
>  Documentation/git-branch.txt | 3 +++
>  Documentation/git-tag.txt| 3 +++
>  2 files changed, 6 insertions(+)
> 
> diff --git a/Documentation/git-branch.txt b/Documentation/git-branch.txt
> index 5516a47b54..102e426fd8 100644
> --- a/Documentation/git-branch.txt
> +++ b/Documentation/git-branch.txt
> @@ -91,6 +91,9 @@ OPTIONS
>   based sha1 expressions such as "@\{yesterday}".
>   Note that in non-bare repositories, reflogs are usually
>   enabled by default by the `core.logallrefupdates` config option.
> + The negated form `--no-create-reflog` does not override the
> + default, even though it overrides `--create-reflog` that appears
> + earlier on the command line.
>  
>  -f::
>  --force::
> diff --git a/Documentation/git-tag.txt b/Documentation/git-tag.txt
> index 2ac25a9bb3..fd7eeae075 100644
> --- a/Documentation/git-tag.txt
> +++ b/Documentation/git-tag.txt
> @@ -152,6 +152,9 @@ This option is only applicable when listing tags without 
> annotation lines.
>  --create-reflog::
>   Create a reflog for the tag. To globally enable reflogs for tags, see
>   `core.logAllRefUpdates` in linkgit:git-config[1].
> + The negated form `--no-create-reflog` does not override the
> + default, even though it overrides `--create-reflog` that appears
> + earlier on the command line.
>  
>  ::
>   The name of the tag to create, delete, or describe.
> 

Your amended version is quite concise and says everything there is to
say. Thanks


Re: [PATCH] doc: add note about ignoring --no-create-reflog

2017-02-01 Thread Jeff King
On Wed, Feb 01, 2017 at 03:11:57PM -0800, Junio C Hamano wrote:

> diff --git a/Documentation/git-branch.txt b/Documentation/git-branch.txt
> index 5516a47b54..102e426fd8 100644
> --- a/Documentation/git-branch.txt
> +++ b/Documentation/git-branch.txt
> @@ -91,6 +91,9 @@ OPTIONS
>   based sha1 expressions such as "@\{yesterday}".
>   Note that in non-bare repositories, reflogs are usually
>   enabled by default by the `core.logallrefupdates` config option.
> + The negated form `--no-create-reflog` does not override the
> + default, even though it overrides `--create-reflog` that appears
> + earlier on the command line.

Should this perhaps say "currently" or "this may change in the future",
so that people (including those who might want to fix it later) know
that it's a limitation and not intentional?

I'd also probably say it a little shorter, like:

  The negated form `--no-create-reflog` only overrides an earlier
  `--create-reflog`, but currently does not negate the setting of
  `core.logallrefupdates`.

I guess that really isn't much shorter (I wondered if you could cut out
the "overrides --create-reflog" part, since that is the normal and
expected behavior, but I had trouble wording it to do so).

-Peff


Re: [PATCH 1/2] doc: add doc for git-push --recurse-submodules=only

2017-02-01 Thread Junio C Hamano
cornelius.w...@tngtech.com writes:

> From: Cornelius Weig 
>
> Add documentation for the `--recurse-submodules=only` option of
> git-push. The feature was added in commit 225e8bf (add option to
> push only submodules).
>
> Signed-off-by: Cornelius Weig 
> ---
>
> Notes:
> This feature is already in 'next' but was undocumented. Unless somebody 
> reads
> the release notes, there is no way of knowing about it.

Good eyes; the topic bw/push-submodule-only is already in 'master'.

Looks good to me; Brandon?

>
>  Documentation/git-push.txt | 13 +++--
>  1 file changed, 7 insertions(+), 6 deletions(-)
>
> diff --git a/Documentation/git-push.txt b/Documentation/git-push.txt
> index 8eefabd..1624a35 100644
> --- a/Documentation/git-push.txt
> +++ b/Documentation/git-push.txt
> @@ -272,7 +272,7 @@ origin +master` to force a push to the `master` branch). 
> See the
>   standard error stream is not directed to a terminal.
>  
>  --no-recurse-submodules::
> ---recurse-submodules=check|on-demand|no::
> +--recurse-submodules=check|on-demand|only|no::
>   May be used to make sure all submodule commits used by the
>   revisions to be pushed are available on a remote-tracking branch.
>   If 'check' is used Git will verify that all submodule commits that
> @@ -280,11 +280,12 @@ origin +master` to force a push to the `master` 
> branch). See the
>   remote of the submodule. If any commits are missing the push will
>   be aborted and exit with non-zero status. If 'on-demand' is used
>   all submodules that changed in the revisions to be pushed will be
> - pushed. If on-demand was not able to push all necessary revisions
> - it will also be aborted and exit with non-zero status. A value of
> - 'no' or using `--no-recurse-submodules` can be used to override the
> - push.recurseSubmodules configuration variable when no submodule
> - recursion is required.
> + pushed. If on-demand was not able to push all necessary revisions it 
> will
> + also be aborted and exit with non-zero status. If 'only' is used all
> + submodules will be recursively pushed while the superproject is left
> + unpushed. A value of 'no' or using `--no-recurse-submodules` can be used
> + to override the push.recurseSubmodules configuration variable when no
> + submodule recursion is required.
>  
>  --[no-]verify::
>   Toggle the pre-push hook (see linkgit:githooks[5]).  The


Re: [PATCH] doc: add note about ignoring --no-create-reflog

2017-02-01 Thread Junio C Hamano
Jeff King  writes:

> This might be nitpicking, but it's _not_ ignored. It still negates an
> earlier "--create-reflog". It is only that it does not override the
> decision to create a reflog caused by the setting of
> core.logallrefupdates.

OK, rolling them all into one, how about this as an amend?

-- >8 --
From: Cornelius Weig 
Date: Wed, 1 Feb 2017 23:07:27 +0100
Subject: [PATCH] doc: add note about ignoring '--no-create-reflog'

The commands git-branch and git-tag accept the '--create-reflog'
option, and create reflog even when core.logallrefupdates
configuration is explicitly set not to.

On the other hand, the negated form '--no-create-reflog' is accepted
as a valid option but has no effect (other than overriding an
earlier '--create-reflog' on the command line). This silent noop may
puzzle users.  To communicate that this is a known limitation, add a
short note in the manuals for git-branch and git-tag.

Signed-off-by: Cornelius Weig 
Signed-off-by: Junio C Hamano 
---
 Documentation/git-branch.txt | 3 +++
 Documentation/git-tag.txt| 3 +++
 2 files changed, 6 insertions(+)

diff --git a/Documentation/git-branch.txt b/Documentation/git-branch.txt
index 5516a47b54..102e426fd8 100644
--- a/Documentation/git-branch.txt
+++ b/Documentation/git-branch.txt
@@ -91,6 +91,9 @@ OPTIONS
based sha1 expressions such as "@\{yesterday}".
Note that in non-bare repositories, reflogs are usually
enabled by default by the `core.logallrefupdates` config option.
+   The negated form `--no-create-reflog` does not override the
+   default, even though it overrides `--create-reflog` that appears
+   earlier on the command line.
 
 -f::
 --force::
diff --git a/Documentation/git-tag.txt b/Documentation/git-tag.txt
index 2ac25a9bb3..fd7eeae075 100644
--- a/Documentation/git-tag.txt
+++ b/Documentation/git-tag.txt
@@ -152,6 +152,9 @@ This option is only applicable when listing tags without 
annotation lines.
 --create-reflog::
Create a reflog for the tag. To globally enable reflogs for tags, see
`core.logAllRefUpdates` in linkgit:git-config[1].
+   The negated form `--no-create-reflog` does not override the
+   default, even though it overrides `--create-reflog` that appears
+   earlier on the command line.
 
 ::
The name of the tag to create, delete, or describe.
-- 
2.11.0-800-g4bf73cb6b2



[PATCH 1/2] doc: add doc for git-push --recurse-submodules=only

2017-02-01 Thread cornelius . weig
From: Cornelius Weig 

Add documentation for the `--recurse-submodules=only` option of
git-push. The feature was added in commit 225e8bf (add option to
push only submodules).

Signed-off-by: Cornelius Weig 
---

Notes:
This feature is already in 'next' but was undocumented. Unless somebody 
reads
the release notes, there is no way of knowing about it.

 Documentation/git-push.txt | 13 +++--
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/Documentation/git-push.txt b/Documentation/git-push.txt
index 8eefabd..1624a35 100644
--- a/Documentation/git-push.txt
+++ b/Documentation/git-push.txt
@@ -272,7 +272,7 @@ origin +master` to force a push to the `master` branch). 
See the
standard error stream is not directed to a terminal.
 
 --no-recurse-submodules::
---recurse-submodules=check|on-demand|no::
+--recurse-submodules=check|on-demand|only|no::
May be used to make sure all submodule commits used by the
revisions to be pushed are available on a remote-tracking branch.
If 'check' is used Git will verify that all submodule commits that
@@ -280,11 +280,12 @@ origin +master` to force a push to the `master` branch). 
See the
remote of the submodule. If any commits are missing the push will
be aborted and exit with non-zero status. If 'on-demand' is used
all submodules that changed in the revisions to be pushed will be
-   pushed. If on-demand was not able to push all necessary revisions
-   it will also be aborted and exit with non-zero status. A value of
-   'no' or using `--no-recurse-submodules` can be used to override the
-   push.recurseSubmodules configuration variable when no submodule
-   recursion is required.
+   pushed. If on-demand was not able to push all necessary revisions it 
will
+   also be aborted and exit with non-zero status. If 'only' is used all
+   submodules will be recursively pushed while the superproject is left
+   unpushed. A value of 'no' or using `--no-recurse-submodules` can be used
+   to override the push.recurseSubmodules configuration variable when no
+   submodule recursion is required.
 
 --[no-]verify::
Toggle the pre-push hook (see linkgit:githooks[5]).  The
-- 
2.10.2



[PATCH 2/2] completion: add completion for --recurse-submodules=only

2017-02-01 Thread cornelius . weig
From: Cornelius Weig 

Command completion for 'git-push --recurse-submodules' already knows to
complete some modes. However, the recently added mode 'only' is missing.

Adding 'only' to the recognized modes completes the list of non-trivial
modes.

Signed-off-by: Cornelius Weig 
---
 contrib/completion/git-completion.bash | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/contrib/completion/git-completion.bash 
b/contrib/completion/git-completion.bash
index ff7072a..fe3b0f8 100644
--- a/contrib/completion/git-completion.bash
+++ b/contrib/completion/git-completion.bash
@@ -1675,7 +1675,7 @@ _git_pull ()
__git_complete_remote_or_refspec
 }
 
-__git_push_recurse_submodules="check on-demand"
+__git_push_recurse_submodules="check on-demand only"
 
 __git_complete_force_with_lease ()
 {
-- 
2.10.2



Re: [PATCH v3 4/4] connect: Add the envvar GIT_SSH_VARIANT and ssh.variant config

2017-02-01 Thread Johannes Schindelin
Hi Junio,

On Wed, 1 Feb 2017, Junio C Hamano wrote:

> Junio C Hamano  writes:
> 
> > Johannes Schindelin  writes:
> >
> >> That leaves the "putty" case in handle_ssh_variant(), does it not? Was it
> >> not your specific objection that that is the case?
> >
> > Yup, you can remove that while you reroll.
> >
> >> No worries, I will let this simmer for a while. Your fixup has a lot of
> >> duplicated code (so much for maintainability as an important goal... ;-))
> >> and I will have to think about it. My immediate thinking is to *not*
> >> duplicate code,...
> >
> > You need to realize that the namespaces of the configuration and the
> > command names are distinct.  There is no code duplication.
> 
> To explain this a bit, there is no reason why allowed values for
> SSH_VARIANT must be "putty" and "tortoiseplink".  An alternative
> design could be "port_option=-p,needs_batch=yes" and it may be more
> logical and futureproof if a variant of tortoiseplink decides to use
> "-p" instead of "-P" and still require "-batch".
> 
> Prematurely attempting to share code, only because the current
> vocabularies for two distinct concepts happen to overlap, is not
> de-duplicating the code for maintainability.  It is adding
> unnecessary work other people need to do in the future when they
> want to extend the system.

Except, of course, that your hypothetical port_option and needs_batch
settings would be handled at a different point altogether.

I sense very strongly that this discussion has taken a very emotional
turn, which is detrimental to the quality. So let's take a break here.

Ciao,
Johannes


Re: [PATCH v3 0/4] Handle PuTTY (plink/tortoiseplink) even in GIT_SSH_COMMAND

2017-02-01 Thread Junio C Hamano
Johannes Schindelin  writes:

>> Compared to the correctness issue, these are much harder to spot by
>> the submitter alone, who focused so intensely to get his code
>> "correct".  The review process is of greater value to spot these
>> issues.
>
> We will never agree on this.

That's too bad.

> From my perspective, design, explanation and maintainability are a
> consequence of making it easy for reviewers to spot where the code is
> incorrect.
>
> And correctness is not covered by "the submitter tested this". Correctness
> includes all the corner cases, where the "many eyes make bugs shallow"
> really shines.
>
> I'd rather have reviewers find bugs than users.

I'd rather have submitters find bugs than reviewers.

> I will *never* be a fan of a review process that pushes correctness to a
> back seat (yes, it is much harder than spotting typos or lines longer than
> 80 columns per row, but the ultimate goal is to deliver value to the end
> user, not to make life easy for the maintainer).

Did I ever say correctness is pushed to a back seat?

I said that it is easier to spot correctness issues for you as a
submitter than other kinds of issues without outside help (and
implied that if you are a diligent contributor, you should aim for,
and you should be able to achieve, a patch series where correctness
issues do not need to be pointed out).  But other higher level
issues are harder for any submitter to spot (regardless of
experience and competence of the submitter), because one gets so
married to one's own code, design and worldview.  And that is why
"review is primarily to spot bugs" can never be a correct viewpoint.

A reviewee needs to be prepared to accept review comments on higher
level issues, even more readily than comments on correctness issues,
because it is too easy to be constrained by early decisions one has
already made while preparing a patch series and become blind to
bigger picture after staring one's own new code for number of hours.
Higher level issues can be more easily spotted by reviewers, whose
eyes are still fresh to the series.


Re: [PATCH v3 4/4] connect: Add the envvar GIT_SSH_VARIANT and ssh.variant config

2017-02-01 Thread Junio C Hamano
Junio C Hamano  writes:

> Johannes Schindelin  writes:
>
>> That leaves the "putty" case in handle_ssh_variant(), does it not? Was it
>> not your specific objection that that is the case?
>
> Yup, you can remove that while you reroll.
>
>> No worries, I will let this simmer for a while. Your fixup has a lot of
>> duplicated code (so much for maintainability as an important goal... ;-))
>> and I will have to think about it. My immediate thinking is to *not*
>> duplicate code,...
>
> You need to realize that the namespaces of the configuration and the
> command names are distinct.  There is no code duplication.

To explain this a bit, there is no reason why allowed values for
SSH_VARIANT must be "putty" and "tortoiseplink".  An alternative
design could be "port_option=-p,needs_batch=yes" and it may be more
logical and futureproof if a variant of tortoiseplink decides to use
"-p" instead of "-P" and still require "-batch".

Prematurely attempting to share code, only because the current
vocabularies for two distinct concepts happen to overlap, is not
de-duplicating the code for maintainability.  It is adding
unnecessary work other people need to do in the future when they
want to extend the system.



Re: [PATCH v3 4/4] connect: Add the envvar GIT_SSH_VARIANT and ssh.variant config

2017-02-01 Thread Johannes Schindelin
Hi Junio,

On Wed, 1 Feb 2017, Junio C Hamano wrote:

> Johannes Schindelin  writes:
> 
> > That leaves the "putty" case in handle_ssh_variant(), does it not? Was it
> > not your specific objection that that is the case?
> 
> Yup, you can remove that while you reroll.
> 
> > No worries, I will let this simmer for a while. Your fixup has a lot of
> > duplicated code (so much for maintainability as an important goal... ;-))
> > and I will have to think about it. My immediate thinking is to *not*
> > duplicate code,...
> 
> You need to realize that the namespaces of the configuration and the
> command names are distinct.  There is no code duplication.

Since your 2/4 did away with the "plink" and "tortoiseplink" values in
favor of "port_option" and "batch_option", there is a duplication of logic
which I tried to undo in v3 and which you reintroduce in your fixup.

Maybe that refactoring was not so smart to begin with, and we should have
instead modified the code to use an enum instead and keep the original
conditionals instead of setting a port_option and a batch_option
explicitly.

Ciao,
Johannes


Re: [PATCH] doc: add note about ignoring --no-create-reflog

2017-02-01 Thread Junio C Hamano
cornelius.w...@tngtech.com writes:

> From: Cornelius Weig 
>
> The commands git-branch and git-tag accept a `--create-reflog` argument.

For the purpose of contrasting the above with "--no-create-reflog",
I find it a bit too weak to just say "accept".  How about

The commands git-branch and git-tag accept a `--create-reflog`
option, and creates reflog even in a repository where
core.logallrefupdates configuration is set not to.

or something?  After all "--no-create-reflog" is accepted.  It just
does not override the configured (or unconfigured) default.

> On the other hand, the negated form `--no-create-reflog` is accepted as
> a valid option but has no effect. This silent noop may puzzle users.

True, very true.

> To communicate that this behavior is intentional, add a short note in
> the manuals for git-branch and git-tag.

Hmph.  The added "short note" merely states the fact; it does not
hint that it is intentional or it explains what reasoning is behind
that intention.

> Signed-off-by: Cornelius Weig 
> ---
>
> Notes:
> In a previous discussion () 
> it
> was found that git-branch and git-tag accept a "--no-create-reflog" 
> argument,
> but it has no effect, does not produce a warning, and is undocumented.

Reading what Peff said in the thread, I do not think we actively
wanted this behaviour; we agreed that it is merely acceptable.  

So perhaps s/this behaviour is intentional/this is known/ to weaken
the log message?  That way, when somebody else who really cares
comes later and finds this commit that adds explicit notes to these
manual pages via "git blame", s/he would not be dissuaded from
making things better.  Such an update may make it warn when
core.logallrefupdates is not set to false (and continue to ignore
the command line option), or it may make the command line option
actually override the configured default.

With such an update to the log message, I think the patch looks
good.

Thanks.

>  Documentation/git-branch.txt | 1 +
>  Documentation/git-tag.txt| 1 +
>  2 files changed, 2 insertions(+)
>
> diff --git a/Documentation/git-branch.txt b/Documentation/git-branch.txt
> index 1fae4ee..fca3754 100644
> --- a/Documentation/git-branch.txt
> +++ b/Documentation/git-branch.txt
> @@ -91,6 +91,7 @@ OPTIONS
>   based sha1 expressions such as "@\{yesterday}".
>   Note that in non-bare repositories, reflogs are usually
>   enabled by default by the `core.logallrefupdates` config option.
> + The negated form `--no-create-reflog` is silently ignored.
>  
>  -f::
>  --force::
> diff --git a/Documentation/git-tag.txt b/Documentation/git-tag.txt
> index 5b2288c..b0b933e 100644
> --- a/Documentation/git-tag.txt
> +++ b/Documentation/git-tag.txt
> @@ -152,6 +152,7 @@ This option is only applicable when listing tags without 
> annotation lines.
>  --create-reflog::
>   Create a reflog for the tag. To globally enable reflogs for tags, see
>   `core.logAllRefUpdates` in linkgit:git-config[1].
> + The negated form `--no-create-reflog` is silently ignored.
>  
>  ::
>   The name of the tag to create, delete, or describe.


Re: [PATCH v3 4/4] connect: Add the envvar GIT_SSH_VARIANT and ssh.variant config

2017-02-01 Thread Junio C Hamano
Johannes Schindelin  writes:

> That leaves the "putty" case in handle_ssh_variant(), does it not? Was it
> not your specific objection that that is the case?

Yup, you can remove that while you reroll.

> No worries, I will let this simmer for a while. Your fixup has a lot of
> duplicated code (so much for maintainability as an important goal... ;-))
> and I will have to think about it. My immediate thinking is to *not*
> duplicate code,...

You need to realize that the namespaces of the configuration and the
command names are distinct.  There is no code duplication.


Re: [PATCH] doc: add note about ignoring --no-create-reflog

2017-02-01 Thread Jeff King
On Wed, Feb 01, 2017 at 02:30:38PM -0800, Junio C Hamano wrote:

> > Notes:
> > In a previous discussion 
> > () it
> > was found that git-branch and git-tag accept a "--no-create-reflog" 
> > argument,
> > but it has no effect, does not produce a warning, and is undocumented.
> 
> Reading what Peff said in the thread, I do not think we actively
> wanted this behaviour; we agreed that it is merely acceptable.
> 
> So perhaps s/this behaviour is intentional/this is known/ to weaken
> the log message?  That way, when somebody else who really cares
> comes later and finds this commit that adds explicit notes to these
> manual pages via "git blame", s/he would not be dissuaded from
> making things better.  Such an update may make it warn when
> core.logallrefupdates is not set to false (and continue to ignore
> the command line option), or it may make the command line option
> actually override the configured default.

Yeah, I'd consider it more of a "known bug" or "known limitation" than
anything.

Those can go in a separate section, but they're probably more likely to
be read when supplied next to the actual option.

> With such an update to the log message, I think the patch looks
> good.
> [...]
> > @@ -91,6 +91,7 @@ OPTIONS
> > based sha1 expressions such as "@\{yesterday}".
> > Note that in non-bare repositories, reflogs are usually
> > enabled by default by the `core.logallrefupdates` config option.
> > +   The negated form `--no-create-reflog` is silently ignored.

This might be nitpicking, but it's _not_ ignored. It still negates an
earlier "--create-reflog". It is only that it does not override the
decision to create a reflog caused by the setting of
core.logallrefupdates.

-Peff


RE: git commit results in many lstat()s

2017-02-01 Thread Gumbel, Matthew K
"Junio C Hamano"  writes:
> There probably are other things that can be optimized.

Yes, I think that when the user passes --only flag to git-commit, then git does 
not
need to call refresh_cache() in prepare_index() in builtin/commit.c.

I may experiment with that. Do you see any downside or negative side-effects?

Matt


Re: [PATCH v3 4/4] connect: Add the envvar GIT_SSH_VARIANT and ssh.variant config

2017-02-01 Thread Johannes Schindelin
Hi Junio,

On Wed, 1 Feb 2017, Junio C Hamano wrote:

> Junio C Hamano  writes:
> 
> > I think restructuring along the line of 3/4 of this round is very
> > sensible in the longer term (if anything, handle_ssh_variant() now
> > really handles ssh variants in all cases, which makes it worthy of
> > its name, as opposed to the one in the previous round that only
> > reacts to the overrides).  But it seems that it will take longer to
> > get such a rewrite right, and my priority is seeing this topic to
> > add autodetection to GIT_SSH_COMMAND and other smaller topics in the
> > upcoming -rc0 in a serviceable and correct shape.
> >
> > The restructuring done by 3/4 can come later after the dust settles,
> > if somebody cares deeply enough about performance in the rare cases.
> 
> Having said all that, because I like the patch 3/4 so much, here is
> another way to fix this, and I think (of course I am biased, but...)
> the result is easier to grok.  Not only it makes it clear that the
> namespace for the actual command names on the command line and the
> namespace for the supported values of the configuration are distinct,
> it makes it clear that we do not do anything extra when the user
> explicitly tells us which variant to use.
> 
> This is designed to be squashed into [4/4] of this latest round (as
> opposed to the one in the message I am responding to, which is to be
> squashed into the previous round).
> 
>  connect.c | 42 +++---
>  1 file changed, 31 insertions(+), 11 deletions(-)
> 
> diff --git a/connect.c b/connect.c
> index 7f1f802396..12ad8d4c8b 100644
> --- a/connect.c
> +++ b/connect.c
> @@ -691,17 +691,39 @@ static const char *get_ssh_command(void)
>   return NULL;
>  }
>  
> -static int handle_ssh_variant(const char *ssh_command, int is_cmdline,
> -   int *port_option, int *needs_batch)
> +static int override_ssh_variant(int *port_option, int *needs_batch)
>  {
> - const char *variant = getenv("GIT_SSH_VARIANT");
> + char *variant;
> +
> + variant = xstrdup_or_null(getenv("GIT_SSH_VARIANT"));
> + if (!variant &&
> + git_config_get_string("ssh.variant", ))
> + return 0;
> +
> + if (!strcmp(variant, "plink") || !strcmp(variant, "putty")) {
> + *port_option = 'P';
> + *needs_batch = 0;
> + } else if (!strcmp(variant, "tortoiseplink")) {
> + *port_option = 'P';
> + *needs_batch = 1;
> + } else {
> + *port_option = 'p';
> + *needs_batch = 0;
> + }
> + free(variant);
> + return 1;
> +}
> +
> +static void handle_ssh_variant(const char *ssh_command, int is_cmdline,
> +int *port_option, int *needs_batch)
> +{
> + const char *variant;
>   char *p = NULL;
>  
> - if (variant)
> - ; /* okay, fall through */
> - else if (!git_config_get_string("ssh.variant", ))
> - variant = p;
> - else if (!is_cmdline) {
> + if (override_ssh_variant(port_option, needs_batch))
> + return;
> +
> + if (!is_cmdline) {
>   p = xstrdup(ssh_command);
>   variant = basename(p);
>   } else {
> @@ -717,7 +739,7 @@ static int handle_ssh_variant(const char *ssh_command, 
> int is_cmdline,
>*/
>   free(ssh_argv);
>   } else
> - return 0;
> + return;
>   }
>  
>   if (!strcasecmp(variant, "plink") ||
> @@ -730,8 +752,6 @@ static int handle_ssh_variant(const char *ssh_command, 
> int is_cmdline,
>   *needs_batch = 1;
>   }
>   free(p);
> -
> - return 1;
>  }
>  
>  /*

That leaves the "putty" case in handle_ssh_variant(), does it not? Was it
not your specific objection that that is the case?

No worries, I will let this simmer for a while. Your fixup has a lot of
duplicated code (so much for maintainability as an important goal... ;-))
and I will have to think about it. My immediate thinking is to *not*
duplicate code, strip the .exe extension directly after calling basename()
in the cases where we handle commands, and handle "putty" in the other
cases by replacing it with the string "plink".

But as I said, this will simmer for a while.

Ciao,
Johannes


Re: [PATCH v3 0/4] Handle PuTTY (plink/tortoiseplink) even in GIT_SSH_COMMAND

2017-02-01 Thread Johannes Schindelin
Hi Junio,

On Wed, 1 Feb 2017, Junio C Hamano wrote:

> Johannes Schindelin  writes:
> 
> > It is quite preposterous to call this an "iteration" of the patch
> > series, because the code is so different now. I say this because I want
> > to caution that this code has not been tested as thoroughly, by far, as
> > the first iteration.
> >
> > The primary purpose of code review is correctness, everything else is
> > either a consequence of it, or a means to make reviewing easier.
> 
> You are utterly wrong here.
> 
> The primary purpose of code review is to spot and correct the
> problems the submitter has missed.  The problems can span from
> stupid bugs that affect correctness to maintainability, to design
> mistakes, to clarity of explanation for both end users and future
> developers.
> 
> Among them, correctness problems are, as long as the problem to be
> solved is specified clearly enough, the easiest to spot by the
> submitter before the patch is sent out.  The submitter is focused on
> solving one problem, and if the updated code does not even work as
> the submitter advertises it would, that can be caught by the
> submitter before the patch is even sent out.  
> 
> Of course, humans are not perfect, and catching correctness problems
> is important, but that is not the only (let alone primary) thing.
> 
> When a submitter has been focusing on solving one problem, it is
> easy to lose the larger picture and to end up adding something that
> may be "correct" (in the sense of "works as advertised by the
> submitter") but does not fit well with the rest of the system, or
> covers some use cases but misses other important and related use
> cases.  If the "does not fit well" surfaces to the end user level,
> that would become a design problem.  If it affects the future
> developers, that would become a maintainability problem.
> 
> Compared to the correctness issue, these are much harder to spot by
> the submitter alone, who focused so intensely to get his code
> "correct".  The review process is of greater value to spot these
> issues.

We will never agree on this.

>From my perspective, design, explanation and maintainability are a
consequence of making it easy for reviewers to spot where the code is
incorrect.

And correctness is not covered by "the submitter tested this". Correctness
includes all the corner cases, where the "many eyes make bugs shallow"
really shines.

I'd rather have reviewers find bugs than users.

I will *never* be a fan of a review process that pushes correctness to a
back seat (yes, it is much harder than spotting typos or lines longer than
80 columns per row, but the ultimate goal is to deliver value to the end
user, not to make life easy for the maintainer).

Ciao,
Johannes


Re: git commit results in many lstat()s

2017-02-01 Thread Junio C Hamano
"Gumbel, Matthew K"  writes:

> I do not understand why git commit must call lstat() on every file
> in the repository, even when I specify the name of the file I want
> to commit on the command line.

Assuming the "COPYING" and "README.md" files are already tracked:

$ >COPYING
$ >README.md
$ git commit COPYING

would open an editor, in which you would see list of files under
"Changes to be committed", "Changes not staged for commit", etc.
Among the second class you would see README.md listed.

To figure out what paths are "changed", without having to open all
files and compare their contents with what is recorded in the commit
you are building on top of, we do lstat(2) to see if the timestamp
(and other information in the inode) of the files are the same since
you checked them out of HEAD.

$ git commit --no-status COPYING

would reduce the number of lstat(2) somewhat, because the codepath
is told that it does not have to make the list to be shown in the
editor.  So would

$ git commit -m "empty COPYING" COPYING

These two only halve the number of lstat(2), by taking advantage of
the fact that the list of "modified files" does not have to be
built.  There probably are other things that can be optimized.



[PATCH] doc: add note about ignoring --no-create-reflog

2017-02-01 Thread cornelius . weig
From: Cornelius Weig 

The commands git-branch and git-tag accept a `--create-reflog` argument.
On the other hand, the negated form `--no-create-reflog` is accepted as
a valid option but has no effect. This silent noop may puzzle users.

To communicate that this behavior is intentional, add a short note in
the manuals for git-branch and git-tag.

Signed-off-by: Cornelius Weig 
---

Notes:
In a previous discussion () it
was found that git-branch and git-tag accept a "--no-create-reflog" 
argument,
but it has no effect, does not produce a warning, and is undocumented.

 Documentation/git-branch.txt | 1 +
 Documentation/git-tag.txt| 1 +
 2 files changed, 2 insertions(+)

diff --git a/Documentation/git-branch.txt b/Documentation/git-branch.txt
index 1fae4ee..fca3754 100644
--- a/Documentation/git-branch.txt
+++ b/Documentation/git-branch.txt
@@ -91,6 +91,7 @@ OPTIONS
based sha1 expressions such as "@\{yesterday}".
Note that in non-bare repositories, reflogs are usually
enabled by default by the `core.logallrefupdates` config option.
+   The negated form `--no-create-reflog` is silently ignored.
 
 -f::
 --force::
diff --git a/Documentation/git-tag.txt b/Documentation/git-tag.txt
index 5b2288c..b0b933e 100644
--- a/Documentation/git-tag.txt
+++ b/Documentation/git-tag.txt
@@ -152,6 +152,7 @@ This option is only applicable when listing tags without 
annotation lines.
 --create-reflog::
Create a reflog for the tag. To globally enable reflogs for tags, see
`core.logAllRefUpdates` in linkgit:git-config[1].
+   The negated form `--no-create-reflog` is silently ignored.
 
 ::
The name of the tag to create, delete, or describe.
-- 
2.10.2



git commit results in many lstat()s

2017-02-01 Thread Gumbel, Matthew K
Hello,

My high level problem is to speed up git commit on a large repository stored on 
NFS filesystem. I see via strace that it is slow because it makes a large 
number (~50,000) of lstat() calls in serial. Every call is a round-trip to the 
NFS server.

I do not understand why git commit must call lstat() on every file in the 
repository, even when I specify the name of the file I want to commit on the 
command line. Can somebody explain why it must call lstat on every file?

My command-line looks like this: git commit -uno -o -m asdf file-to-commit.txt

Secondly, are there any optimizations I can make to avoid this behavior?

Thanks,
Matt



Re: [ANNOUNCE] Git Merge Contributor Summit topic planning

2017-02-01 Thread Junio C Hamano
Jeff King  writes:

> For instance, if the server knew that XYZ meant
>
>   - send bytes m through n of packfile p, then...
>   
>   - send the object at position i of packfile p, as a delta against the
> object at position j of packfile q
>
>   - ...and so on
>
> Then you could store very small "instruction sheets" for each XYZ that
> rely on the data in the packfiles. If those packfiles go away (e.g., due
> to a repack) that invalidates all of your current XYZ tags. That's OK as
> long as this is an optimization, not a correctness requirement.

Yes.  You can play optimization games.

>> So in the real life, I think that the exchange needs to be more
>> like this:
>> 
>> C: I need a packfile with this want/have
>> ... C/S negotiate what "have"s are common ...
>> S: Sorry, but our negitiation indicates that you are way too
>>behind.  I'll send you a packfile that brings you up to a
>>slightly older set of "want", so pretend that you asked for
>>these slightly older "want"s instead.  The opaque id of that
>>packfile is XYZ.  After getting XYZ, come back to me with
>>your original set of "want"s.  You would give me more recent
>>"have" in that request.  
>> ... connection interrupted ...
>> C: It's me again.  I have up to byte N of pack XYZ
>> S: OK, resuming (or: I do not have it anymore, start from scratch)
>> ... after 0 or more iterations C fully receives and digests XYZ ...
>> 
>> and then the above will iterate until the server does not have to
>> say "Sorry but you are way too behind" and returns a packfile
>> without having to tweak the "want".
>
> Yes, I think that is a reasonable variant. The client knows about
> seeding, but the XYZ conversation continues to happen inside the git
> protocol. So it loses flexibility versus a true CDN redirection, but it
> would "just work" when the server/client both understand the feature,
> without the server admin having to set up a separate bundle-over-http
> infrastructure.

You can also do a CDN offline as a natural extension.  When the
server says "Sorry, you are way too behind.", the above example
tells "I'll update you to a slightly stale version first" to the
client.  An natural extension could say "Go update yourself to a
slightly stale version first by grabbing that bundle over there."

But I agree that doing everything in-line may be a logical and
simpler first step to get there.


Re: [ANNOUNCE] Git Merge Contributor Summit topic planning

2017-02-01 Thread Jeff King
On Wed, Feb 01, 2017 at 10:06:15AM -0800, Junio C Hamano wrote:

> > If you _can_ do that latter part, and you take "I only care about
> > resumability" to the simplest extreme, you'd probably end up with a
> > protocol more like:
> >
> >   Client: I need a packfile with this want/have
> >   Server: OK, here it is; its opaque id is XYZ.
> >   ... connection interrupted ...
> >   Client: It's me again. I have up to byte N of pack XYZ
> >   Server: OK, resuming
> >   [or: I don't have XYZ anymore; start from scratch]
> >
> > Then generating XYZ and generating that bundle are basically the same
> > task.
> 
> The above allows a simple and naive implementation of generating a
> packstream and "tee"ing it to a spool file to be kept while sending
> to the first client that asks XYZ.
> 
> The story I heard from folks who run git servers at work for Android
> and other projects, however, is that they rarely see two requests
> with want/have that result in an identical XYZ, unless "have" is an
> empty set (aka "clone").  In a busy repository, between two clone
> requests relatively close together, somebody would be pushing, so
> you'd need many XYZs in your spool even if you want to support only
> the "clone" case.

Yeah, I agree a tag "XYZ" does not cover all cases, especially for
fetches.

We do caching at GitHub based on the sha1(want+have+options) tag, and it
does catch quite a lot of parallelism, but not all. It catches most
clones, and many fetches that are done by "thundering herds" of similar
clients.

One thing you could do with such a pure "resume XYZ" tag is to represent
the generated pack _without_ replicating the actual object bytes, but
take shortcuts by basing particular bits on the on-disk packfile. Just
enough to serve a deterministic packfile for the same want/have bits.

For instance, if the server knew that XYZ meant

  - send bytes m through n of packfile p, then...
  
  - send the object at position i of packfile p, as a delta against the
object at position j of packfile q

  - ...and so on

Then you could store very small "instruction sheets" for each XYZ that
rely on the data in the packfiles. If those packfiles go away (e.g., due
to a repack) that invalidates all of your current XYZ tags. That's OK as
long as this is an optimization, not a correctness requirement.

I haven't actually built anything like this, though, so I don't have a
complete language for the instruction sheets, nor numbers on how big
they would be for average cases.

> So in the real life, I think that the exchange needs to be more
> like this:
> 
> C: I need a packfile with this want/have
> ... C/S negotiate what "have"s are common ...
> S: Sorry, but our negitiation indicates that you are way too
>behind.  I'll send you a packfile that brings you up to a
>slightly older set of "want", so pretend that you asked for
>these slightly older "want"s instead.  The opaque id of that
>packfile is XYZ.  After getting XYZ, come back to me with
>your original set of "want"s.  You would give me more recent
>"have" in that request.  
> ... connection interrupted ...
> C: It's me again.  I have up to byte N of pack XYZ
> S: OK, resuming (or: I do not have it anymore, start from scratch)
> ... after 0 or more iterations C fully receives and digests XYZ ...
> 
> and then the above will iterate until the server does not have to
> say "Sorry but you are way too behind" and returns a packfile
> without having to tweak the "want".

Yes, I think that is a reasonable variant. The client knows about
seeding, but the XYZ conversation continues to happen inside the git
protocol. So it loses flexibility versus a true CDN redirection, but it
would "just work" when the server/client both understand the feature,
without the server admin having to set up a separate bundle-over-http
infrastructure.

-Peff


Re: [PATCH] merge-recursive: make "CONFLICT (rename/delete)" message show both paths

2017-02-01 Thread Junio C Hamano
Junio C Hamano  writes:

> Matt McCutchen  writes:
> ...
>> Please check that my refactoring is indeed correct!  All the existing tests 
>> pass
>> for me, but the existing test coverage of these conflict messages looks poor.
>
> This unfortunately is doing a bit too many things at once from that
> point of view.  I need to reserve a solid quiet 20-minutes without
> distraction to check it, which I am hoping to do tonight.

Let me make sure if I understood your changes correctly:

 * conflict_rename_delete() knew which one is renamed and which one
   is deleted (even though the deleted one was called "other"), but
   because in the original code handle_change_delete() wants to
   always see tree A first and then tree B in its parameter list,
   the original code swapped a/b before calling it.  In the original
   code, handle_change_delete() needed to figure out which one is
   the deleted one by looking at a_oid or b_oid.

 * In the updated code, the knowledge of which branch survives and
   which branch is deleted is passed from the caller to
   handle_change_delete(), which no longer needs to figure out by
   looking at a_oid/b_oid.  The updated API only passes the oid for
   surviving branch (as deleted one would have been 0{40} anyway).

 * In the updated code, handle_change_delete() is told the names of
   both branches (the one that survives and the other that was
   deleted).  It no longer has to switch between o->branch[12]
   depending on the NULLness of a_oid/b_oid; it knows both names and
   which one is which.

 * handle_modify_delete() also calls handle_change_delete().  Unlike
   conflict_rename_delete(), it is not told by its caller which
   branch keeps the path and which branch deletes the path, and
   instead relies on handle_change_delete() to figure it out.
   Because of the above change to the API, now it needs to sort it
   out before calling handle_change_delete().

It all makes sense to me.  

The single call to update_file() that appears near the end of
handle_change_delete() in the updated code corresponds to calls to
the same function in 3 among 4 codepaths in the function in the
original code.  It is a bit tricky to reason about, though.

In the original code, update_file() was omitted when we didn't have
to come up with a unique alternate filename and the one that is left
is a_oid (i.e. our side).  The way to tell if we did not come up
with a unique alternate filename used to be to see the "renamed"
variable but now it is the NULL-ness of "alt_path".  And the way to
tell if the side that is left is ours, we check to see o->branch1
is the change_branch, not delete_branch.

So the condition to guard the call to update_file() also looks
correct to me.

Thanks.

>> -char *renamed = NULL;
>> +char *alt_path = NULL;
>> +const char *update_path = path;
>>  int ret = 0;
>> +
>>  if (dir_in_way(path, !o->call_depth, 0)) {
>> -renamed = unique_path(o, path, a_oid ? o->branch1 : o->branch2);
>> +update_path = alt_path = unique_path(o, path, change_branch);
>>  }
>>  
>>  if (o->call_depth) {
>> @@ -1081,43 +1085,43 @@ static int handle_change_delete(struct merge_options 
>> *o,
>>   */
>>  ret = remove_file_from_cache(path);
>>  if (!ret)
>> -ret = update_file(o, 0, o_oid, o_mode,
>> -  renamed ? renamed : path);
>> -} else if (!a_oid) {
>> -if (!renamed) {
>> -output(o, 1, _("CONFLICT (%s/delete): %s deleted in %s "
>> -   "and %s in %s. Version %s of %s left in tree."),
>> -   change, path, o->branch1, change_past,
>> -   o->branch2, o->branch2, path);
>> -ret = update_file(o, 0, b_oid, b_mode, path);
>> -} else {
>> -output(o, 1, _("CONFLICT (%s/delete): %s deleted in %s "
>> -   "and %s in %s. Version %s of %s left in tree at 
>> %s."),
>> -   change, path, o->branch1, change_past,
>> -   o->branch2, o->branch2, path, renamed);
>> -ret = update_file(o, 0, b_oid, b_mode, renamed);
>> -}
>> +ret = update_file(o, 0, o_oid, o_mode, update_path);
>>  } else {
>> -if (!renamed) {
>> -output(o, 1, _("CONFLICT (%s/delete): %s deleted in %s "
>> -   "and %s in %s. Version %s of %s left in tree."),
>> -   change, path, o->branch2, change_past,
>> -   o->branch1, o->branch1, path);
>> +if (!alt_path) {
>> +if (!old_path) {
>> +output(o, 1, _("CONFLICT (%s/delete): %s 
>> deleted in %s "
>> +   "and %s in %s. Version %s of %s left in 
>> 

Re: [ANNOUNCE] Git Merge Contributor Summit topic planning

2017-02-01 Thread Stefan Beller
On Mon, Jan 30, 2017 at 4:48 PM, Jeff King  wrote:
> The Contributor Summit is only a few days away; I'd like to work out a
> few bits of logistics ahead of time.
>
> We're up to 26 attendees. The room layout will probably be three big
> round-tables, with a central projector. We should be able to have
> everybody pay attention to a single speaker, or break into 3 separate
> conversations.
>
> The list of topics is totally open. If you're coming and have something
> you'd like to present or discuss, then propose it here. If you're _not_
> coming, you may still chime in with input on topics, but please don't
> suggest a topic unless somebody who is there will agree to lead the
> discussion.

submodules and X (How do submodules and worktrees interact,
should they?, Which functions need support for submodules, e.g. checkout,
branch,  grep, etc...? Are we interested in keeping a submodule its own
logical unit? Do we want to have dedicated plumbing commands for
submodules?)

... would be my line of talk,

Stefan


Re: [PATCH v3 0/4] Handle PuTTY (plink/tortoiseplink) even in GIT_SSH_COMMAND

2017-02-01 Thread Junio C Hamano
Johannes Schindelin  writes:

> It is quite preposterous to call this an "iteration" of the patch
> series, because the code is so different now. I say this because I want
> to caution that this code has not been tested as thoroughly, by far, as
> the first iteration.
>
> The primary purpose of code review is correctness, everything else is
> either a consequence of it, or a means to make reviewing easier.

You are utterly wrong here.

The primary purpose of code review is to spot and correct the
problems the submitter has missed.  The problems can span from
stupid bugs that affect correctness to maintainability, to design
mistakes, to clarity of explanation for both end users and future
developers.

Among them, correctness problems are, as long as the problem to be
solved is specified clearly enough, the easiest to spot by the
submitter before the patch is sent out.  The submitter is focused on
solving one problem, and if the updated code does not even work as
the submitter advertises it would, that can be caught by the
submitter before the patch is even sent out.  

Of course, humans are not perfect, and catching correctness problems
is important, but that is not the only (let alone primary) thing.

When a submitter has been focusing on solving one problem, it is
easy to lose the larger picture and to end up adding something that
may be "correct" (in the sense of "works as advertised by the
submitter") but does not fit well with the rest of the system, or
covers some use cases but misses other important and related use
cases.  If the "does not fit well" surfaces to the end user level,
that would become a design problem.  If it affects the future
developers, that would become a maintainability problem.

Compared to the correctness issue, these are much harder to spot by
the submitter alone, who focused so intensely to get his code
"correct".  The review process is of greater value to spot these
issues.

I've already read and commented on the series; as I said, I think
the rewrite in 3/4 makes the resulting code much easier to read, and
with the fix-up I just sent out, I think the end result is correct,
works as advertised, the way it is advertised is clear to end users,
and is maintainable.  

But I do share your uneasiness about the "new-ness" of the code.


Re: [PATCH v3 4/4] connect: Add the envvar GIT_SSH_VARIANT and ssh.variant config

2017-02-01 Thread Junio C Hamano
Junio C Hamano  writes:

> I think restructuring along the line of 3/4 of this round is very
> sensible in the longer term (if anything, handle_ssh_variant() now
> really handles ssh variants in all cases, which makes it worthy of
> its name, as opposed to the one in the previous round that only
> reacts to the overrides).  But it seems that it will take longer to
> get such a rewrite right, and my priority is seeing this topic to
> add autodetection to GIT_SSH_COMMAND and other smaller topics in the
> upcoming -rc0 in a serviceable and correct shape.
>
> The restructuring done by 3/4 can come later after the dust settles,
> if somebody cares deeply enough about performance in the rare cases.

Having said all that, because I like the patch 3/4 so much, here is
another way to fix this, and I think (of course I am biased, but...)
the result is easier to grok.  Not only it makes it clear that the
namespace for the actual command names on the command line and the
namespace for the supported values of the configuration are distinct,
it makes it clear that we do not do anything extra when the user
explicitly tells us which variant to use.

This is designed to be squashed into [4/4] of this latest round (as
opposed to the one in the message I am responding to, which is to be
squashed into the previous round).

 connect.c | 42 +++---
 1 file changed, 31 insertions(+), 11 deletions(-)

diff --git a/connect.c b/connect.c
index 7f1f802396..12ad8d4c8b 100644
--- a/connect.c
+++ b/connect.c
@@ -691,17 +691,39 @@ static const char *get_ssh_command(void)
return NULL;
 }
 
-static int handle_ssh_variant(const char *ssh_command, int is_cmdline,
- int *port_option, int *needs_batch)
+static int override_ssh_variant(int *port_option, int *needs_batch)
 {
-   const char *variant = getenv("GIT_SSH_VARIANT");
+   char *variant;
+
+   variant = xstrdup_or_null(getenv("GIT_SSH_VARIANT"));
+   if (!variant &&
+   git_config_get_string("ssh.variant", ))
+   return 0;
+
+   if (!strcmp(variant, "plink") || !strcmp(variant, "putty")) {
+   *port_option = 'P';
+   *needs_batch = 0;
+   } else if (!strcmp(variant, "tortoiseplink")) {
+   *port_option = 'P';
+   *needs_batch = 1;
+   } else {
+   *port_option = 'p';
+   *needs_batch = 0;
+   }
+   free(variant);
+   return 1;
+}
+
+static void handle_ssh_variant(const char *ssh_command, int is_cmdline,
+  int *port_option, int *needs_batch)
+{
+   const char *variant;
char *p = NULL;
 
-   if (variant)
-   ; /* okay, fall through */
-   else if (!git_config_get_string("ssh.variant", ))
-   variant = p;
-   else if (!is_cmdline) {
+   if (override_ssh_variant(port_option, needs_batch))
+   return;
+
+   if (!is_cmdline) {
p = xstrdup(ssh_command);
variant = basename(p);
} else {
@@ -717,7 +739,7 @@ static int handle_ssh_variant(const char *ssh_command, int 
is_cmdline,
 */
free(ssh_argv);
} else
-   return 0;
+   return;
}
 
if (!strcasecmp(variant, "plink") ||
@@ -730,8 +752,6 @@ static int handle_ssh_variant(const char *ssh_command, int 
is_cmdline,
*needs_batch = 1;
}
free(p);
-
-   return 1;
 }
 
 /*


Re: [ANNOUNCE] Git Merge Contributor Summit topic planning

2017-02-01 Thread Christian Couder
On Tue, Jan 31, 2017 at 1:59 AM, Jeff King  wrote:
> On Tue, Jan 31, 2017 at 01:48:05AM +0100, Jeff King wrote:
>
>> The list of topics is totally open. If you're coming and have something
>> you'd like to present or discuss, then propose it here. If you're _not_
>> coming, you may still chime in with input on topics, but please don't
>> suggest a topic unless somebody who is there will agree to lead the
>> discussion.
>
> Here are the two topics I plan on bringing:
>
>   - Git / Software Freedom Conservancy yearly report. I'll plan to give
> a rundown of the past year's activities and financials, along with
> some open questions that could benefit from community input.
>
>   - The git-scm.com website: who runs that thing, anyway? An overview
> of the site, how it's managed, and what it needs.
>
> I plan to send out detailed emails on both topics to the list on
> Wednesday, and then follow-up with a summary of any useful in-person
> discussion (since obviously not everybody will be at the summit).
>
> I'd encourage anybody with a topic to present to consider doing
> something similar.

GitLab people at the Summit (this includes me) would like to spend a
few minutes to introduce https://gitlab.com/gitlab-org/gitaly/ and
answer any questions.


Re: [PATCH v3 4/4] connect: Add the envvar GIT_SSH_VARIANT and ssh.variant config

2017-02-01 Thread Junio C Hamano
Johannes Schindelin  writes:

> index 2734b9a1ca..7f1f802396 100644
> --- a/connect.c
> +++ b/connect.c
> @@ -694,10 +694,14 @@ static const char *get_ssh_command(void)
>  static int handle_ssh_variant(const char *ssh_command, int is_cmdline,
> int *port_option, int *needs_batch)
>  {
> - const char *variant;
> + const char *variant = getenv("GIT_SSH_VARIANT");
>   char *p = NULL;
>  
> - if (!is_cmdline) {
> + if (variant)
> + ; /* okay, fall through */
> + else if (!git_config_get_string("ssh.variant", ))
> + variant = p;
> + else if (!is_cmdline) {
>   p = xstrdup(ssh_command);
>   variant = basename(p);
>   } else {
> @@ -717,7 +721,8 @@ static int handle_ssh_variant(const char *ssh_command, 
> int is_cmdline,
>   }
>  
>   if (!strcasecmp(variant, "plink") ||
> - !strcasecmp(variant, "plink.exe"))
> + !strcasecmp(variant, "plink.exe") ||
> + !strcasecmp(variant, "putty"))

This means that "putty" that appear as the first word of the command
line, not the value configured for ssh.variant or GIT_SSH_VARIANT,
will also trigger the option for "plink", no?

Worse, "plink.exe" configured as the value of "ssh.variant", is
anything other than 'ssh', 'plink', 'putty', or 'tortoiseplink', but
it is not treated as normal ssh, contrary to the documentation.

> +ssh.variant::
> + Depending on the value of the environment variables `GIT_SSH` or
> + `GIT_SSH_COMMAND`, or the config setting `core.sshCommand`, Git
> + auto-detects whether to adjust its command-line parameters for use
> + with plink or tortoiseplink, as opposed to the default (OpenSSH).
> ++
> +The config variable `ssh.variant` can be set to override this auto-detection;
> +valid values are `ssh`, `plink`, `putty` or `tortoiseplink`. Any other value
> +will be treated as normal ssh. This setting can be overridden via the
> +environment variable `GIT_SSH_VARIANT`.

I was hoping that the "rewrite that retains simplicity" was also
correct, but let's not go in this direction.  The more lines you
touch in a hurry, the worse the code will get, and that is to be
expected.

I'd suggest taking the documentation updates from this series, and
then minimally plug the leak introduced by the previous round,
perhaps like the attached SQUASH.  As I said, GIT_SSH_VARIANT and
ssh.variant are expected to be rare cases, and the case in which
they are set does not have to be optimized if it makes the necessary
change smaller and more likely to be correct.

I think restructuring along the line of 3/4 of this round is very
sensible in the longer term (if anything, handle_ssh_variant() now
really handles ssh variants in all cases, which makes it worthy of
its name, as opposed to the one in the previous round that only
reacts to the overrides).  But it seems that it will take longer to
get such a rewrite right, and my priority is seeing this topic to
add autodetection to GIT_SSH_COMMAND and other smaller topics in the
upcoming -rc0 in a serviceable and correct shape.

The restructuring done by 3/4 can come later after the dust settles,
if somebody cares deeply enough about performance in the rare cases.

 Documentation/config.txt | 14 +-
 Documentation/git.txt|  9 -
 connect.c|  9 +
 3 files changed, 18 insertions(+), 14 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index f2c210f0a0..b88df57ab6 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -1950,11 +1950,15 @@ matched against are those given directly to Git 
commands.  This means any URLs
 visited as a result of a redirection do not participate in matching.
 
 ssh.variant::
-   Override the autodetection of plink/tortoiseplink in the SSH
-   command that 'git fetch' and 'git push' use. It can be set to
-   either `ssh`, `plink`, `putty` or `tortoiseplink`. Any other
-   value will be treated as normal ssh. This is useful in case
-   that Git gets this wrong.
+   Depending on the value of the environment variables `GIT_SSH` or
+   `GIT_SSH_COMMAND`, or the config setting `core.sshCommand`, Git
+   auto-detects whether to adjust its command-line parameters for use
+   with plink or tortoiseplink, as opposed to the default (OpenSSH).
++
+The config variable `ssh.variant` can be set to override this auto-detection;
+valid values are `ssh`, `plink`, `putty` or `tortoiseplink`. Any other value
+will be treated as normal ssh. This setting can be overridden via the
+environment variable `GIT_SSH_VARIANT`.
 
 i18n.commitEncoding::
Character encoding the commit messages are stored in; Git itself
diff --git a/Documentation/git.txt b/Documentation/git.txt
index c322558aa7..a0c6728d1a 100644
--- a/Documentation/git.txt
+++ b/Documentation/git.txt
@@ -1021,11 +1021,10 @@ personal `.ssh/config` file.  Please 

Re: What's cooking in git.git (Jan 2017, #06; Tue, 31)

2017-02-01 Thread Junio C Hamano
Patrick Steinhardt  writes:

> I just tried to write additional tests to exercise this in our
> config-tests by using `git config --get-urlmatch` with multiple
> `http.extraheader`s set. I've been stumped that it still didn't
> work correctly with my patch, only the last value was actually
> ever returned.

I think that is very much in line with "git config --get" works, and
"--get-urlmatch" being an enhancement of "--get", I would expect
that "--get-urlmatch" to follow the usual "the last one wins" rule.

If you want to see _all_ values for a multi-valued variable, you
would say "git config --get-all".  IIUC, there currently is no
"--get-all-urlmatch", but there may need one.  I dunno.


Re: [PATCH 1/5] add SWAP macro

2017-02-01 Thread Junio C Hamano
René Scharfe  writes:

> Size checks could be added to SIMPLE_SWAP as well.

Between SWAP() and SIMPLE_SWAP() I am undecided.

If the compiler can infer the type and the size of the two
"locations" given to the macro, there is no technical reason to
require the caller to specify the type as an extra argument, so
SIMPLE_SWAP() may not necessarily an improvement over SWAP() from
that point of view.  If the redundancy is used as a sanity check,
I'd be in favor of SIMPLE_SWAP(), though.

If the goal of SIMPLE_SWAP(), on the other hand, were to support the
"only swap char with int for small value" example earlier in the
thread, it means you cannot sanity check the type of things being
swapped in the macro, and the readers of the code need to know about
the details of what variables are being swapped.  It looks to me
that it defeats the main benefit of using a macro.

> The main benefit of a swap macro is reduced repetition IMHO: Users
> specify the variables to swap once instead of twice in a special
> order, and with SWAP they don't need to declare their type again.
> Squeezing out redundancy makes the code easier to read and modify.

Yes.


Re: [PATCH 1/5] add SWAP macro

2017-02-01 Thread René Scharfe

Am 01.02.2017 um 12:47 schrieb Jeff King:

I'm not altogether convinced that SWAP() is an improvement in
readability. I really like that it's shorter than the code it replaces,
but there is a danger with introducing opaque constructs. It's one more
thing for somebody familiar with C but new to the project to learn or
get wrong.


I'm biased, of course, but SIMPLE_SWAP and SWAP exchange the values of 
two variables -- how could someone get that wrong?  Would it help to 
name the macro SWAP_VALUES?  Or XCHG? ;)



IMHO the main maintenance gain from René's patch is that you don't have
to specify the type, which means you can never have a memory-overflow
bug due to incorrect types. If we lose that, then I don't see all that
much value in the whole thing.


Size checks could be added to SIMPLE_SWAP as well.

The main benefit of a swap macro is reduced repetition IMHO: Users 
specify the variables to swap once instead of twice in a special order, 
and with SWAP they don't need to declare their type again.  Squeezing 
out redundancy makes the code easier to read and modify.


René


Re: [ANNOUNCE] Git Merge Contributor Summit topic planning

2017-02-01 Thread Junio C Hamano
Jeff King  writes:

> If you _can_ do that latter part, and you take "I only care about
> resumability" to the simplest extreme, you'd probably end up with a
> protocol more like:
>
>   Client: I need a packfile with this want/have
>   Server: OK, here it is; its opaque id is XYZ.
>   ... connection interrupted ...
>   Client: It's me again. I have up to byte N of pack XYZ
>   Server: OK, resuming
>   [or: I don't have XYZ anymore; start from scratch]
>
> Then generating XYZ and generating that bundle are basically the same
> task.

The above allows a simple and naive implementation of generating a
packstream and "tee"ing it to a spool file to be kept while sending
to the first client that asks XYZ.

The story I heard from folks who run git servers at work for Android
and other projects, however, is that they rarely see two requests
with want/have that result in an identical XYZ, unless "have" is an
empty set (aka "clone").  In a busy repository, between two clone
requests relatively close together, somebody would be pushing, so
you'd need many XYZs in your spool even if you want to support only
the "clone" case.

So in the real life, I think that the exchange needs to be more
like this:

C: I need a packfile with this want/have
... C/S negotiate what "have"s are common ...
S: Sorry, but our negitiation indicates that you are way too
   behind.  I'll send you a packfile that brings you up to a
   slightly older set of "want", so pretend that you asked for
   these slightly older "want"s instead.  The opaque id of that
   packfile is XYZ.  After getting XYZ, come back to me with
   your original set of "want"s.  You would give me more recent
   "have" in that request.  
... connection interrupted ...
C: It's me again.  I have up to byte N of pack XYZ
S: OK, resuming (or: I do not have it anymore, start from scratch)
... after 0 or more iterations C fully receives and digests XYZ ...

and then the above will iterate until the server does not have to
say "Sorry but you are way too behind" and returns a packfile
without having to tweak the "want".

That way, you can limit the number of XYZ you would need to keep to
a reasonable number.

The recent proposal by Jonathan Tan also allows the server side to
tweak the final tips the client receives after the protocol exchange
started.  I suspect the above two will become related.


Re: [PATCH v2 3/3] connect: Add the envvar GIT_SSH_VARIANT and ssh.variant config

2017-02-01 Thread Junio C Hamano
Johannes Schindelin  writes:

> On Fri, 27 Jan 2017, Junio C Hamano wrote:
>
>> IOW, I think it is acceptable to always split GIT_SSH_COMMAND into
>> tokens before we realize that the user used the escape hatch and the
>> splitting was a wasted effort.  This is exactly because this thing
>> is an escape hatch that is expected to be rarely used.  Of course,
>> if the "wasted effort" can be eliminated without sacrificing the
>> simplicity of the code, that is fine as well.
>
> Simplicity is retained. Battle-readiness was sacrificed on the way: the
> new code is not tested well enough, and `next` will not help one bit.

Let me make it clear that there is no burning desire to sacrifice
battle-readiness in the above.  If we expect that auto-detection
would be minority, then it makes sense to get the configured value
first and then spend cycles to split and guess only when detection
is needed.  

In this case, because we expect that auto-detection will be used
most of the time, it is good enough to always split first, get the
configured value, and spend cycles to guess, or for that matter it
is perfectly fine to always split and guess first and then override
with the configured value.

If your attempt to optimize for a wrong case ended up causing new
unnecessary bugs, don't blame me.


Re: [PATCH v2 7/7] completion: recognize more long-options

2017-02-01 Thread Cornelius Weig
Hi Gabor,

 thanks for taking a look at these commits.

On 01/31/2017 11:17 PM, SZEDER Gábor wrote:
> On Fri, Jan 27, 2017 at 10:17 PM,   wrote:
>> From: Cornelius Weig 
>>
>> Recognize several new long-options for bash completion in the following
>> commands:
> 
> Adding more long options that git commands learn along the way is
> always an improvement.  However, seeing "_several_ new long options"
> (or "some long options" in one of the other patches in the series)
> makes the reader wonder: are these the only new long options missing
> or are there more?  If there are more, why only these are added?  If
> there aren't any more missing long options left, then please say so,
> e.g. "Add all missing long options, except the potentially
> desctructive ones, for the following commands: "

Personally, I agree with you that
> Adding more long options that git commands learn along the way is
> always an improvement.
However, people may start complaining that their terminal becomes too
cluttered when doing a double-Tab. In my cover letter, I go to length
about this. My assumption was that all options that are mentioned in the
introduction of the command man-page should be important enough to have
them in the completion list. I'll change my commit message accordingly.

>>  - rm: --force
> 
> '--force' is a potentially destructive option, too.

Thanks for spotting this.

Btw, I haven't found that non-destructive options should not be eligible
for completion. To avoid confusion about this in the future, I suggest
to also change the documentation:

index 933bb6e..96f1c7f 100644
--- a/contrib/completion/git-completion.bash
+++ b/contrib/completion/git-completion.bash
@@ -13,7 +13,7 @@
 #*) git email aliases for git-send-email
 #*) tree paths within 'ref:path/to/file' expressions
 #*) file paths within current working directory and index
-#*) common --long-options
+#*) common non-destructive --long-options
 #
 # To use these routines:
 #


I take it you have also looked at the code itself? Then I would gladly
mention you as reviewer in my sign-off.


Re: [ANNOUNCE] Git Merge Contributor Summit topic planning

2017-02-01 Thread Jeff King
On Wed, Feb 01, 2017 at 10:32:12AM +0100, Erik van Zijst wrote:

> Clients performing a full clone get redirected to a CDN where they seed
> their new local repo from a pre-built bundle file, and then pull/fetch
> any remaining changes. Mercurial has had native, built-in support for
> this for a while now.
> 
> I imagine other large code hosts could benefit from this as well and
> I'd love to gauge the group's interest for this. Could this make sense
> for Git? Would it have a chance of landing?
> 
> Our spike implements it as an optional capability during ref
> advertisement. What are your thoughts on this?

I think this is definitely an interesting topic to discuss tomorrow.

Here are a few observations from my past thinking on the issue. I
haven't read the proposal from earlier this week yet, so some of them
may be obsolete.

Seeding from a bundle CDN generally solves two problems: getting the
bulk of the data from someplace with higher bandwidth (the CDN), and
getting the bulk of the data over a protocol that can be resumed (the
bundle).

But we don't necessarily have to solve both problems simultaneously.
And you might not want to. Storing a separate bundle on another server
is complicated to configure, and doubles the amount of disk space you
need (just half of it is on the CDN). Using a bundle means you can't
seed from a non-bundle source.

So for any solution, I'd want to consider how you can put together the
pieces. Can you seed from a non-bundle? Can you seed from yourself and
just get resumability? If so, how hard is it to serve a pseudo-bundle
based on the packfiles you have on disk (i.e., getting resumability
at least in the common cases without paying the disk cost). I.e., saving
enough data that you could reconstruct the bundle byte-for-byte when you
need to.

If you _can_ do that latter part, and you take "I only care about
resumability" to the simplest extreme, you'd probably end up with a
protocol more like:

  Client: I need a packfile with this want/have
  Server: OK, here it is; its opaque id is XYZ.
  ... connection interrupted ...
  Client: It's me again. I have up to byte N of pack XYZ
  Server: OK, resuming
  [or: I don't have XYZ anymore; start from scratch]

Then generating XYZ and generating that bundle are basically the same
task.

All just food for thought. I look forward to digging into it more on the
list and in the in-person discussion.

-Peff


[PATCH v3 3/4] git_connect(): factor out SSH variant handling

2017-02-01 Thread Johannes Schindelin
We handle plink and tortoiseplink as OpenSSH replacements, by passing
the correct command-line options when detecting that they are used.

To let users override that auto-detection (in case Git gets it wrong),
we need to introduce new code to that end.

In preparation for this code, let's factor out the SSH variant handling
into its own function, handle_ssh_variant().

Signed-off-by: Johannes Schindelin 
---
 connect.c | 72 ---
 1 file changed, 46 insertions(+), 26 deletions(-)

diff --git a/connect.c b/connect.c
index 9f750eacb6..2734b9a1ca 100644
--- a/connect.c
+++ b/connect.c
@@ -691,6 +691,44 @@ static const char *get_ssh_command(void)
return NULL;
 }
 
+static int handle_ssh_variant(const char *ssh_command, int is_cmdline,
+ int *port_option, int *needs_batch)
+{
+   const char *variant;
+   char *p = NULL;
+
+   if (!is_cmdline) {
+   p = xstrdup(ssh_command);
+   variant = basename(p);
+   } else {
+   const char **ssh_argv;
+
+   p = xstrdup(ssh_command);
+   if (split_cmdline(p, _argv)) {
+   variant = basename((char *)ssh_argv[0]);
+   /*
+* At this point, variant points into the buffer
+* referenced by p, hence we do not need ssh_argv
+* any longer.
+*/
+   free(ssh_argv);
+   } else
+   return 0;
+   }
+
+   if (!strcasecmp(variant, "plink") ||
+   !strcasecmp(variant, "plink.exe"))
+   *port_option = 'P';
+   else if (!strcasecmp(variant, "tortoiseplink") ||
+!strcasecmp(variant, "tortoiseplink.exe")) {
+   *port_option = 'P';
+   *needs_batch = 1;
+   }
+   free(p);
+
+   return 1;
+}
+
 /*
  * This returns a dummy child_process if the transport protocol does not
  * need fork(2), or a struct child_process object if it does.  Once done,
@@ -773,7 +811,6 @@ struct child_process *git_connect(int fd[2], const char 
*url,
int port_option = 'p';
char *ssh_host = hostandport;
const char *port = NULL;
-   char *ssh_argv0 = NULL;
transport_check_allowed("ssh");
get_host_and_port(_host, );
 
@@ -794,15 +831,10 @@ struct child_process *git_connect(int fd[2], const char 
*url,
}
 
ssh = get_ssh_command();
-   if (ssh) {
-   char *split_ssh = xstrdup(ssh);
-   const char **ssh_argv;
-
-   if (split_cmdline(split_ssh, _argv))
-   ssh_argv0 = xstrdup(ssh_argv[0]);
-   free(split_ssh);
-   free((void *)ssh_argv);
-   } else {
+   if (ssh)
+   handle_ssh_variant(ssh, 1, _option,
+  _batch);
+   else {
/*
 * GIT_SSH is the no-shell version of
 * GIT_SSH_COMMAND (and must remain so for
@@ -813,22 +845,10 @@ struct child_process *git_connect(int fd[2], const char 
*url,
ssh = getenv("GIT_SSH");
if (!ssh)
ssh = "ssh";
-
-   ssh_argv0 = xstrdup(ssh);
-   }
-
-   if (ssh_argv0) {
-   const char *base = basename(ssh_argv0);
-
-   if (!strcasecmp(base, "tortoiseplink") ||
-   !strcasecmp(base, "tortoiseplink.exe")) {
-   port_option = 'P';
-   needs_batch = 1;
-   } else if (!strcasecmp(base, "plink") ||
-  !strcasecmp(base, "plink.exe")) {
-   port_option = 'P';
-   }
-   free(ssh_argv0);
+   else
+   handle_ssh_variant(ssh, 0,
+  _option,
+  _batch);
}
 
argv_array_push(>args, ssh);
-- 
2.11.1.windows.prerelease.2.9.g3014b57




Re: [PATCH v2 3/3] connect: Add the envvar GIT_SSH_VARIANT and ssh.variant config

2017-02-01 Thread Johannes Schindelin
Hi Junio,

On Fri, 27 Jan 2017, Junio C Hamano wrote:

> IOW, I think it is acceptable to always split GIT_SSH_COMMAND into
> tokens before we realize that the user used the escape hatch and the
> splitting was a wasted effort.  This is exactly because this thing
> is an escape hatch that is expected to be rarely used.  Of course,
> if the "wasted effort" can be eliminated without sacrificing the
> simplicity of the code, that is fine as well.

Simplicity is retained. Battle-readiness was sacrificed on the way: the
new code is not tested well enough, and `next` will not help one bit.

Ciao,
Johannes


[PATCH v3 4/4] connect: Add the envvar GIT_SSH_VARIANT and ssh.variant config

2017-02-01 Thread Johannes Schindelin
From: Segev Finer 

This environment variable and configuration value allow to
override the autodetection of plink/tortoiseplink in case that
Git gets it wrong.

[jes: wrapped overly-long lines, factored out and changed
get_ssh_variant() to handle_ssh_variant() to accomodate the
change from the putty/tortoiseplink variables to
port_option/needs_batch, adjusted the documentation, free()d
value obtained from the config.]

Signed-off-by: Segev Finer 
Signed-off-by: Johannes Schindelin 
---
 Documentation/config.txt | 11 +++
 Documentation/git.txt|  6 ++
 connect.c| 11 ---
 t/t5601-clone.sh | 26 ++
 4 files changed, 51 insertions(+), 3 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index af2ae4cc02..b88df57ab6 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -1949,6 +1949,17 @@ Environment variable settings always override any 
matches.  The URLs that are
 matched against are those given directly to Git commands.  This means any URLs
 visited as a result of a redirection do not participate in matching.
 
+ssh.variant::
+   Depending on the value of the environment variables `GIT_SSH` or
+   `GIT_SSH_COMMAND`, or the config setting `core.sshCommand`, Git
+   auto-detects whether to adjust its command-line parameters for use
+   with plink or tortoiseplink, as opposed to the default (OpenSSH).
++
+The config variable `ssh.variant` can be set to override this auto-detection;
+valid values are `ssh`, `plink`, `putty` or `tortoiseplink`. Any other value
+will be treated as normal ssh. This setting can be overridden via the
+environment variable `GIT_SSH_VARIANT`.
+
 i18n.commitEncoding::
Character encoding the commit messages are stored in; Git itself
does not care per se, but this information is necessary e.g. when
diff --git a/Documentation/git.txt b/Documentation/git.txt
index 4f208fab92..a0c6728d1a 100644
--- a/Documentation/git.txt
+++ b/Documentation/git.txt
@@ -1020,6 +1020,12 @@ Usually it is easier to configure any desired options 
through your
 personal `.ssh/config` file.  Please consult your ssh documentation
 for further details.
 
+`GIT_SSH_VARIANT`::
+   If this environment variable is set, it overrides Git's autodetection
+   whether `GIT_SSH`/`GIT_SSH_COMMAND`/`core.sshCommand` refer to OpenSSH,
+   plink or tortoiseplink. This variable overrides the config setting
+   `ssh.variant` that serves the same purpose.
+
 `GIT_ASKPASS`::
If this environment variable is set, then Git commands which need to
acquire passwords or passphrases (e.g. for HTTP or IMAP authentication)
diff --git a/connect.c b/connect.c
index 2734b9a1ca..7f1f802396 100644
--- a/connect.c
+++ b/connect.c
@@ -694,10 +694,14 @@ static const char *get_ssh_command(void)
 static int handle_ssh_variant(const char *ssh_command, int is_cmdline,
  int *port_option, int *needs_batch)
 {
-   const char *variant;
+   const char *variant = getenv("GIT_SSH_VARIANT");
char *p = NULL;
 
-   if (!is_cmdline) {
+   if (variant)
+   ; /* okay, fall through */
+   else if (!git_config_get_string("ssh.variant", ))
+   variant = p;
+   else if (!is_cmdline) {
p = xstrdup(ssh_command);
variant = basename(p);
} else {
@@ -717,7 +721,8 @@ static int handle_ssh_variant(const char *ssh_command, int 
is_cmdline,
}
 
if (!strcasecmp(variant, "plink") ||
-   !strcasecmp(variant, "plink.exe"))
+   !strcasecmp(variant, "plink.exe") ||
+   !strcasecmp(variant, "putty"))
*port_option = 'P';
else if (!strcasecmp(variant, "tortoiseplink") ||
 !strcasecmp(variant, "tortoiseplink.exe")) {
diff --git a/t/t5601-clone.sh b/t/t5601-clone.sh
index 9335e10c2a..b52b8acf98 100755
--- a/t/t5601-clone.sh
+++ b/t/t5601-clone.sh
@@ -401,6 +401,32 @@ test_expect_success 'single quoted plink.exe in 
GIT_SSH_COMMAND' '
expect_ssh "-v -P 123" myhost src
 '
 
+test_expect_success 'GIT_SSH_VARIANT overrides plink detection' '
+   copy_ssh_wrapper_as "$TRASH_DIRECTORY/plink" &&
+   GIT_SSH_VARIANT=ssh \
+   git clone "[myhost:123]:src" ssh-bracket-clone-variant-1 &&
+   expect_ssh "-p 123" myhost src
+'
+
+test_expect_success 'ssh.variant overrides plink detection' '
+   copy_ssh_wrapper_as "$TRASH_DIRECTORY/plink" &&
+   git -c ssh.variant=ssh \
+   clone "[myhost:123]:src" ssh-bracket-clone-variant-2 &&
+   expect_ssh "-p 123" myhost src
+'
+
+test_expect_success 'GIT_SSH_VARIANT overrides plink detection to plink' '
+   GIT_SSH_VARIANT=plink \
+   git clone "[myhost:123]:src" ssh-bracket-clone-variant-3 &&
+   expect_ssh "-P 123" myhost src
+'
+
+test_expect_success 

[PATCH v3 1/4] connect: handle putty/plink also in GIT_SSH_COMMAND

2017-02-01 Thread Johannes Schindelin
From: Segev Finer 

Git for Windows has special support for the popular SSH client PuTTY:
when using PuTTY's non-interactive version ("plink.exe"), we use the -P
option to specify the port rather than OpenSSH's -p option. TortoiseGit
ships with its own, forked version of plink.exe, that adds support for
the -batch option, and for good measure we special-case that, too.

However, this special-casing of PuTTY only covers the case where the
user overrides the SSH command via the environment variable GIT_SSH
(which allows specifying the name of the executable), not
GIT_SSH_COMMAND (which allows specifying a full command, including
additional command-line options).

When users want to pass any additional arguments to (Tortoise-)Plink,
such as setting a private key, they are required to either use a shell
script named plink or tortoiseplink or duplicate the logic that is
already in Git for passing the correct style of command line arguments,
which can be difficult, error prone and annoying to get right.

This patch simply reuses the existing logic and expands it to cover
GIT_SSH_COMMAND, too.

Note: it may look a little heavy-handed to duplicate the entire
command-line and then split it, only to extract the name of the
executable. However, this is not a performance-critical code path, and
the code is much more readable this way.

Signed-off-by: Segev Finer 
Signed-off-by: Johannes Schindelin 
---
 connect.c| 23 ---
 t/t5601-clone.sh | 15 +++
 2 files changed, 31 insertions(+), 7 deletions(-)

diff --git a/connect.c b/connect.c
index 8cb93b0720..c81f77001b 100644
--- a/connect.c
+++ b/connect.c
@@ -772,6 +772,7 @@ struct child_process *git_connect(int fd[2], const char 
*url,
int putty = 0, tortoiseplink = 0;
char *ssh_host = hostandport;
const char *port = NULL;
+   char *ssh_argv0 = NULL;
transport_check_allowed("ssh");
get_host_and_port(_host, );
 
@@ -792,10 +793,15 @@ struct child_process *git_connect(int fd[2], const char 
*url,
}
 
ssh = get_ssh_command();
-   if (!ssh) {
-   const char *base;
-   char *ssh_dup;
-
+   if (ssh) {
+   char *split_ssh = xstrdup(ssh);
+   const char **ssh_argv;
+
+   if (split_cmdline(split_ssh, _argv))
+   ssh_argv0 = xstrdup(ssh_argv[0]);
+   free(split_ssh);
+   free((void *)ssh_argv);
+   } else {
/*
 * GIT_SSH is the no-shell version of
 * GIT_SSH_COMMAND (and must remain so for
@@ -807,8 +813,11 @@ struct child_process *git_connect(int fd[2], const char 
*url,
if (!ssh)
ssh = "ssh";
 
-   ssh_dup = xstrdup(ssh);
-   base = basename(ssh_dup);
+   ssh_argv0 = xstrdup(ssh);
+   }
+
+   if (ssh_argv0) {
+   const char *base = basename(ssh_argv0);
 
tortoiseplink = !strcasecmp(base, 
"tortoiseplink") ||
!strcasecmp(base, "tortoiseplink.exe");
@@ -816,7 +825,7 @@ struct child_process *git_connect(int fd[2], const char 
*url,
!strcasecmp(base, "plink") ||
!strcasecmp(base, "plink.exe");
 
-   free(ssh_dup);
+   free(ssh_argv0);
}
 
argv_array_push(>args, ssh);
diff --git a/t/t5601-clone.sh b/t/t5601-clone.sh
index 4241ea5b32..9335e10c2a 100755
--- a/t/t5601-clone.sh
+++ b/t/t5601-clone.sh
@@ -386,6 +386,21 @@ test_expect_success 'tortoiseplink is like putty, with 
extra arguments' '
expect_ssh "-batch -P 123" myhost src
 '
 
+test_expect_success 'double quoted plink.exe in GIT_SSH_COMMAND' '
+   copy_ssh_wrapper_as "$TRASH_DIRECTORY/plink.exe" &&
+   GIT_SSH_COMMAND="\"$TRASH_DIRECTORY/plink.exe\" -v" \
+   git clone "[myhost:123]:src" ssh-bracket-clone-plink-3 &&
+   expect_ssh "-v -P 123" myhost src
+'
+
+SQ="'"
+test_expect_success 'single quoted plink.exe in GIT_SSH_COMMAND' '
+   copy_ssh_wrapper_as "$TRASH_DIRECTORY/plink.exe" &&
+   GIT_SSH_COMMAND="$SQ$TRASH_DIRECTORY/plink.exe$SQ -v" \
+   git clone "[myhost:123]:src" ssh-bracket-clone-plink-4 &&
+   expect_ssh "-v -P 

[PATCH v3 2/4] connect: rename tortoiseplink and putty variables

2017-02-01 Thread Johannes Schindelin
From: Junio C Hamano 

One of these two may have originally been named after "what exact
SSH implementation do we have" so that we can tweak the command line
options, but these days "putty=1" no longer means "We are using the
plink SSH implementation that comes with PuTTY".  It is set when we
guess that either PuTTY plink or Tortoiseplink is in use.

Rename them after what effect is desired.  The current "putty"
option is about using "-P " when OpenSSH would use "-p ",
so rename it to port_option whose value is either 'p' or 'P".  The
other one is about passing an extra command line option "-batch",
so rename it needs_batch.

[jes: wrapped overly-long line]

Signed-off-by: Junio C Hamano 
Signed-off-by: Johannes Schindelin 
---
 connect.c | 23 +--
 1 file changed, 13 insertions(+), 10 deletions(-)

diff --git a/connect.c b/connect.c
index c81f77001b..9f750eacb6 100644
--- a/connect.c
+++ b/connect.c
@@ -769,7 +769,8 @@ struct child_process *git_connect(int fd[2], const char 
*url,
conn->in = conn->out = -1;
if (protocol == PROTO_SSH) {
const char *ssh;
-   int putty = 0, tortoiseplink = 0;
+   int needs_batch = 0;
+   int port_option = 'p';
char *ssh_host = hostandport;
const char *port = NULL;
char *ssh_argv0 = NULL;
@@ -819,12 +820,14 @@ struct child_process *git_connect(int fd[2], const char 
*url,
if (ssh_argv0) {
const char *base = basename(ssh_argv0);
 
-   tortoiseplink = !strcasecmp(base, 
"tortoiseplink") ||
-   !strcasecmp(base, "tortoiseplink.exe");
-   putty = tortoiseplink ||
-   !strcasecmp(base, "plink") ||
-   !strcasecmp(base, "plink.exe");
-
+   if (!strcasecmp(base, "tortoiseplink") ||
+   !strcasecmp(base, "tortoiseplink.exe")) {
+   port_option = 'P';
+   needs_batch = 1;
+   } else if (!strcasecmp(base, "plink") ||
+  !strcasecmp(base, "plink.exe")) {
+   port_option = 'P';
+   }
free(ssh_argv0);
}
 
@@ -833,11 +836,11 @@ struct child_process *git_connect(int fd[2], const char 
*url,
argv_array_push(>args, "-4");
else if (flags & CONNECT_IPV6)
argv_array_push(>args, "-6");
-   if (tortoiseplink)
+   if (needs_batch)
argv_array_push(>args, "-batch");
if (port) {
-   /* P is for PuTTY, p is for OpenSSH */
-   argv_array_push(>args, putty ? "-P" : 
"-p");
+   argv_array_pushf(>args,
+"-%c", port_option);
argv_array_push(>args, port);
}
argv_array_push(>args, ssh_host);
-- 
2.11.1.windows.prerelease.2.9.g3014b57




[PATCH v3 0/4] Handle PuTTY (plink/tortoiseplink) even in GIT_SSH_COMMAND

2017-02-01 Thread Johannes Schindelin
We already handle PuTTY's plink and TortoiseGit's tortoiseplink in
GIT_SSH by automatically using the -P option to specify ports, and in
tortoiseplink's case by passing the --batch option.

For users who need to pass additional command-line options to plink,
this poses a problem: the only way to do that is to use GIT_SSH_COMMAND,
but Git does not handle that specifically, so those users have to
manually parse the command-line options passed via GIT_SSH_COMMAND and
replace -p (if present) by -P, and add --batch in the case of
tortoiseplink.

This is error-prone and a bad user experience.

To fix this, the changes proposed in this patch series introduce
handling this by splitting the GIT_SSH_COMMAND value and treating the
first parameter with the same grace as GIT_SSH. To counter any possible
misdetection, the user can also specify explicitly via GIT_SSH_VARIANT
or ssh.variant which SSH variant they are using.

Changes relative to v2:

- touched up the documentation for ssh.variant to make it even easier to
  understand

- free()d the config variable

- completely refactored the code to fulfil Junio's burning desire to
  avoid split_cmdline() when unnecessary

It is quite preposterous to call this an "iteration" of the patch
series, because the code is so different now. I say this because I want
to caution that this code has not been tested as thoroughly, by far, as
the first iteration.

The primary purpose of code review is correctness, everything else is
either a consequence of it, or a means to make reviewing easier.

That means that I highly encourage those who pushed for these extensive
changes that make the patch series a lot less robust to balance things
out by at least *rudimentary* testing.


Johannes Schindelin (1):
  git_connect(): factor out SSH variant handling

Junio C Hamano (1):
  connect: rename tortoiseplink and putty variables

Segev Finer (2):
  connect: handle putty/plink also in GIT_SSH_COMMAND
  connect: Add the envvar GIT_SSH_VARIANT and ssh.variant config

 Documentation/config.txt | 11 +++
 Documentation/git.txt|  6 
 connect.c| 75 
 t/t5601-clone.sh | 41 ++
 4 files changed, 114 insertions(+), 19 deletions(-)


base-commit: 8f60064c1f538f06e1c579cbd9840b86b10bcd3d
Published-As: https://github.com/dscho/git/releases/tag/putty-w-args-v3
Fetch-It-Via: git fetch https://github.com/dscho/git putty-w-args-v3

Interdiff vs v2:

 diff --git a/Documentation/config.txt b/Documentation/config.txt
 index f2c210f0a0..b88df57ab6 100644
 --- a/Documentation/config.txt
 +++ b/Documentation/config.txt
 @@ -1950,11 +1950,15 @@ matched against are those given directly to Git 
commands.  This means any URLs
  visited as a result of a redirection do not participate in matching.
  
  ssh.variant::
 -  Override the autodetection of plink/tortoiseplink in the SSH
 -  command that 'git fetch' and 'git push' use. It can be set to
 -  either `ssh`, `plink`, `putty` or `tortoiseplink`. Any other
 -  value will be treated as normal ssh. This is useful in case
 -  that Git gets this wrong.
 +  Depending on the value of the environment variables `GIT_SSH` or
 +  `GIT_SSH_COMMAND`, or the config setting `core.sshCommand`, Git
 +  auto-detects whether to adjust its command-line parameters for use
 +  with plink or tortoiseplink, as opposed to the default (OpenSSH).
 ++
 +The config variable `ssh.variant` can be set to override this auto-detection;
 +valid values are `ssh`, `plink`, `putty` or `tortoiseplink`. Any other value
 +will be treated as normal ssh. This setting can be overridden via the
 +environment variable `GIT_SSH_VARIANT`.
  
  i18n.commitEncoding::
Character encoding the commit messages are stored in; Git itself
 diff --git a/Documentation/git.txt b/Documentation/git.txt
 index c322558aa7..a0c6728d1a 100644
 --- a/Documentation/git.txt
 +++ b/Documentation/git.txt
 @@ -1021,11 +1021,10 @@ personal `.ssh/config` file.  Please consult your ssh 
documentation
  for further details.
  
  `GIT_SSH_VARIANT`::
 -  If this environment variable is set, it overrides the autodetection
 -  of plink/tortoiseplink in the SSH command that 'git fetch' and 'git
 -  push' use. It can be set to either `ssh`, `plink`, `putty` or
 -  `tortoiseplink`. Any other value will be treated as normal ssh. This
 -  is useful in case that Git gets this wrong.
 +  If this environment variable is set, it overrides Git's autodetection
 +  whether `GIT_SSH`/`GIT_SSH_COMMAND`/`core.sshCommand` refer to OpenSSH,
 +  plink or tortoiseplink. This variable overrides the config setting
 +  `ssh.variant` that serves the same purpose.
  
  `GIT_ASKPASS`::
If this environment variable is set, then Git commands which need to
 diff --git a/connect.c b/connect.c
 index 7b4437578b..7f1f802396 100644
 --- a/connect.c
 +++ b/connect.c
 @@ -691,20 +691,45 

Re: [PATCH 1/5] add SWAP macro

2017-02-01 Thread Jeff King
On Wed, Feb 01, 2017 at 12:28:13PM +0100, Johannes Schindelin wrote:

> > One funny thing is that your macro takes address-of itself, behind the
> > scenes. I wonder if it would be more natural for it to take
> > pointers-to-objects, making it look more like a real function (i.e.,
> > SWAP(, ) instead of SWAP(a, b)". And then these funny corner cases
> > become quite obvious in the caller, because the caller is the one who
> > has to type "&".
> 
> But forcing SWAP(, ) would make it even more cumbersome to use, and it
> would also make it harder to optimize, say, by using registers instead of
> addressable memory (think swapping two 32-bit integers: there is *no* need
> to write them into memory just to swap them).

I don't find the register thing compelling at all. I'd expect modern
compilers to optimize *() into just "a" and keep using a register. I'm
sure there are compilers that don't, but I'm also not sure how much we
_care_. If your compiler doesn't do basic micro-optimizations, then you
live with it or get a better compiler.

I'm much more interested in what's readable and maintainable, versus
trying to micro-optimize something that hasn't even been shown to be
measurably interesting.

That said...

> And I think I should repeat my point that this discussion veers towards
> making simple swaps *more* complicated, rather than less complicated. Bad
> direction.

I'm not altogether convinced that SWAP() is an improvement in
readability. I really like that it's shorter than the code it replaces,
but there is a danger with introducing opaque constructs. It's one more
thing for somebody familiar with C but new to the project to learn or
get wrong.

IMHO the main maintenance gain from René's patch is that you don't have
to specify the type, which means you can never have a memory-overflow
bug due to incorrect types. If we lose that, then I don't see all that
much value in the whole thing.

-Peff


Re: [PATCH 1/5] add SWAP macro

2017-02-01 Thread Johannes Schindelin
Hi René,

On Tue, 31 Jan 2017, René Scharfe wrote:

> Am 31.01.2017 um 13:13 schrieb Johannes Schindelin:
>
> > #define SIMPLE_SWAP(T, a, b) do { T tmp_ = a; a = b; b = tmp_; } while (0)
> > ...
> >  uint32_t large;
> >  char nybble;
> >
> >  ...
> >
> >  if (!(large & ~0xf)) {
> >   SIMPLE_SWAP(char, nybble, large);
> >   ...
> >  }
> >
> > i.e. mixing types, when possible.
> >
> > And while I do not necessarily expect that we need anything like this
> > anytime soon, merely the fact that it allows for this flexibility, while
> > being very readable at the same time, would make it a pretty good design
> > in my book.
> 
> Such a skinny macro which only hides repetition is kind of attractive due to
> its simplicity; I can't say the same about the mixed type example above,
> though.
> 
> The fat version isn't that bad either even without inlining, includes a few
> safety checks and doesn't require us to tell the compiler something it
> already knows very well.  I'd rather let the machine do the work.

I am a big fan of letting the machine do the work. But I am not a big fan
of *creating* work for the machine.

So if you asked me what I would think of a script that, given a patch "in
mbox format", automatically fixes all formatting issues that typically
take up a sizable chunk of review time, I would say: yeah, let's get this
done! It would probably take away some fun because then reviewers could
bike-shed less, but I'd think that is a good thing.

If you asked me what my opinion is about a program you wrote that gathers
all the threads and sub threads of code^Wpatch reviews on the Git mailing
list, and cross-references them with public Git branches, and with Junio's
What's Cooking mail, and with the SHA-1s in `pu`: Great! That would
relieve me of a ton of really boring and grueling work, if the machine can
do it, all the better.

And if you ask me about adding a complex macro that adds a bunch of work
for the C compiler just to produce the same assembler code as a one-liner
macro would have produced much easier, I will reply that we could maybe
instead spend that time on letting the machine perform tasks that already
need to be done... :-)

Ciao,
Dscho

Re: [PATCH 1/5] add SWAP macro

2017-02-01 Thread Johannes Schindelin
Hi Peff,

On Tue, 31 Jan 2017, Jeff King wrote:

> On Tue, Jan 31, 2017 at 10:03:01PM +0100, René Scharfe wrote:
> 
> > > Perhaps we could disallow a side-effect operator in the macro.  By
> > > disallow I mean place a comment at the definition to the macro and
> > > hopefully catch something like that in code-review.  We have the
> > > same issue with the `ALLOC_GROW()` macro.
> > 
> > SWAP(a++, ...) is caught by the compiler, SWAP(*a++, ...) works fine.
> > Technically that should be enough. :)  A comment wouldn't hurt, of
> > course.
> 
> One funny thing is that your macro takes address-of itself, behind the
> scenes. I wonder if it would be more natural for it to take
> pointers-to-objects, making it look more like a real function (i.e.,
> SWAP(, ) instead of SWAP(a, b)". And then these funny corner cases
> become quite obvious in the caller, because the caller is the one who
> has to type "&".

But forcing SWAP(, ) would make it even more cumbersome to use, and it
would also make it harder to optimize, say, by using registers instead of
addressable memory (think swapping two 32-bit integers: there is *no* need
to write them into memory just to swap them).

And I think I should repeat my point that this discussion veers towards
making simple swaps *more* complicated, rather than less complicated. Bad
direction.

Ciao,
Dscho

Re: What's cooking in git.git (Jan 2017, #06; Tue, 31)

2017-02-01 Thread Patrick Steinhardt
On Tue, Jan 31, 2017 at 02:45:40PM -0800, Junio C Hamano wrote:
> * ps/urlmatch-wildcard (2017-01-31) 5 commits
>  . urlmatch: allow globbing for the URL host part
>  . urlmatch: include host in urlmatch ranking
>  . urlmatch: split host and port fields in `struct url_info`
>  . urlmatch: enable normalization of URLs with globs
>  . mailmap: add Patrick Steinhardt's work address
> 
>  The  part in "http.." configuration variable
>  can now be spelled with '*' that serves as wildcard.
>  E.g. "http.https://*.example.com.proxy; can be used to specify the
>  proxy used for https://a.example.com, https://b.example.com, etc.,
>  i.e. any host in the example.com domain.
> 
>  With the update it still seems to fail the same t5551#31
>  cf. 

Yeah, you're right. Sorry about this one.

I finally fixed the problem to get t5551 working again, see the
attached patch below. The problem was that I did not honor the
case where multiple configuration entries exist with the same
key. In some cases, this has to lead to the value being
accumulated multiple times, as for example with http.extraheaders
used in t5551. So the fixup below fixes this.

I just tried to write additional tests to exercise this in our
config-tests by using `git config --get-urlmatch` with multiple
`http.extraheader`s set. I've been stumped that it still didn't
work correctly with my patch, only the last value was actually
ever returned. But in fact, current Git (tested with v2.11.0)
also fails to do so correctly and only lists the last value.

As such, I'd argue this is a separate issue. If you're not
opposed, I'd like to fix this in a separate patch series later on
(probably after Git Merge).

--- >8 ---
Subject: [PATCH] fixup! urlmatch: include host in urlmatch ranking

---
 urlmatch.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/urlmatch.c b/urlmatch.c
index 6c12f1a48..739a1a558 100644
--- a/urlmatch.c
+++ b/urlmatch.c
@@ -587,7 +587,7 @@ int urlmatch_config_entry(const char *var, const char 
*value, void *cb)
if (!item->util) {
item->util = xcalloc(1, sizeof(matched));
} else {
-   if (cmp_matches(, item->util) <= 0)
+   if (cmp_matches(, item->util) < 0)
 /*
  * Our match is worse than the old one,
  * we cannot use it.
-- 
2.11.0


signature.asc
Description: PGP signature


Re: [ANNOUNCE] Git Merge Contributor Summit topic planning

2017-02-01 Thread Erik van Zijst
On Tue, Jan 31, 2017 at 01:48:05AM +0100, Jeff King wrote:

> The list of topics is totally open. If you're coming and have something
> you'd like to present or discuss, then propose it here. If you're _not_
> coming, you may still chime in with input on topics, but please don't
> suggest a topic unless somebody who is there will agree to lead the
> discussion.

I would like to talk about the possibility of CDN-aided cloning
operations as mentioned on this list earlier this week:
http://public-inbox.org/git/cadoxlgpfgf7w4xjzt0x+xfjdon6rmffgx_96mo9gpssojdk...@mail.gmail.com/

At Bitbucket we have recently rolled out so-called clonebundle support
for Mercurial repositories.

Full clone operations are rather expensive on the server and are
responsible for a substantial part of our CPU and IO load. CDN-based
clonebundles have allowed us to eliminate most of this load for
Mercurial repos and we've since built a clonebundle spike for Git.

Clients performing a full clone get redirected to a CDN where they seed
their new local repo from a pre-built bundle file, and then pull/fetch
any remaining changes. Mercurial has had native, built-in support for
this for a while now.

I imagine other large code hosts could benefit from this as well and
I'd love to gauge the group's interest for this. Could this make sense
for Git? Would it have a chance of landing?

Our spike implements it as an optional capability during ref
advertisement. What are your thoughts on this?

Cheers,
Erik