date:20170209

On Thu, Feb 09, 2017 at 11:27:49PM +0100, Johannes Schindelin wrote:

> From: Jeff Hostetler 
> 
> Use OpenSSL's SHA-1 routines rather than builtin block-sha1 routines.
> This improves performance on SHA1 operations on Intel processors.
> 
> OpenSSL 1.0.2 has made considerable performance improvements and
> support the Intel hardware acceleration features.  See:
> https://software.intel.com/en-us/articles/improving-openssl-performance
> https://software.intel.com/en-us/articles/intel-sha-extensions
> 
> To test this I added/staged a single file in a gigantic
> repository having a 450MB index file.  The code in read-cache.c
> verifies the header SHA as it reads the index and computes a new
> header SHA as it writes out the new index.  Therefore, in this test
> the SHA code must process 900MB of data.  Testing was done on an
> Intel I7-4770 CPU @ 3.40GHz (Intel64, Family 6, Model 60) CPU.

I think this is only half the story. A heavy-sha1 workload is faster,
which is good. But one of the original reasons to prefer blk-sha1 (at
least on Linux) is that resolving libcrypto.so symbols takes a
non-trivial amount of time. I just timed it again, and it seems to be
consistently 1ms slower to run "git rev-parse --git-dir" on my machine
(from the top-level of a repo).

1ms is mostly irrelevant, but it adds up on scripted workloads that
start a lot of git processes. Whether it's a net win or not depends on
how much sha1 computation you do in your workload versus how many
processes you start.

I don't know what that means for Windows, though. My impression is that
process startup is so painfully slow there that the link time may just
be lost in the noise. It may just always be a win there. So not really
an objection to your patch, but something you may want to consider.

(Of course, it would in theory be possible to have the best of both
worlds either by static-linking openssl, or by teaching block-sha1 the
same optimizations, but both of those are obviously much more complex).

-Peff

Re: [PATCH] rev-parse --git-path: fix output when running in a subdirectory

On Thu, Feb 09, 2017 at 01:50:31PM -0800, Junio C Hamano wrote:

> > There is just no way you can "fix" this otherwise. As an occasional shell
> > scripter, you may be tempted to use `$(git rev-parse --show-cdup)$(git
> > rev-parse --git-path filename)`, but of course that breaks in worktrees
> > and if you do not use worktrees you would not even know that your
> > workaround introduced another bug.
> 
> In case it is not clear, I understand all of the above.  
> 
> I was just worried about the people who do *NOT* use worktrees and
> did the obvious "concatenate --cdup with --git-path" and thought
> their script were working happily and well.  By prepending the path
> to the (real) location of the .git in the updated --git-path output
> ourselves, they will complain, our update broke their script.

That concatenating approach is broken in other ways, too. For example,
if you have $GIT_DIR set to an absolute path, then --git-path will use
that.  I don't think we have ever promised that the output of --git-path
(or --git-dir) would ever be absolute or relative (in fact, the
--git-dir documentation implies that you may get either).

So yes, there could be somebody who is doing this concatenation
workaround, but only ever calls their script in a certain way that never
triggers the absolute-path variant. They're happy now, and may not be
after Dscho's patch. But I really think they are relying on a bogus
assumption, and their scripts are already buggy.

-Peff

Re: [PATCH 1/5] refs: store submodule ref stores in a hashmap

On Thu, Feb 09, 2017 at 12:34:04PM -0800, Junio C Hamano wrote:

> > +static struct submodule_hash_entry *alloc_submodule_hash_entry(
> > +   const char *submodule, struct ref_store *refs)
> > +{
> > +   size_t len = strlen(submodule);
> > +   struct submodule_hash_entry *entry = malloc(sizeof(*entry) + len + 1);
> 
> I think this (and the later memcpy) is what FLEX_ALLOC_MEM() was
> invented for.

Yes, it was. Though since the length comes from a strlen() call, it can
actually use the _STR variant, like:

  FLEX_ALLOC_STR(entry, submodule, submodule);

Besides being shorter, this does integer-overflow checks on the final
length.

> > @@ -1373,16 +1405,17 @@ void base_ref_store_init(struct ref_store *refs,
> > die("BUG: main_ref_store initialized twice");
> >  
> > refs->submodule = "";
> > -   refs->next = NULL;
> > main_ref_store = refs;
> > } else {
> > -   if (lookup_ref_store(submodule))
> > +   refs->submodule = xstrdup(submodule);
> > +
> > +   if (!submodule_ref_stores.tablesize)
> > +   hashmap_init(&submodule_ref_stores, submodule_hash_cmp, 
> > 20);
> 
> Makes me wonder what "20" stands for.  Perhaps the caller should be
> allowed to say "I do not quite care what initial size is" by passing
> 0 or some equally but more clealy meaningless value (which of course
> would be outside the scope of this series).

I think this is what "0" already does (grep for HASHMAP_INITIAL_SIZE).
In fact, that constant is 64. The 20 we pass in goes through some magic
load-factor computation and ends up as 25. That being smaller than the
INITIAL_SIZE constant, I believe that we end up allocating 64 entries
either way (that's just from reading the code, though; I didn't run it
to double check).

-Peff

Re: [PATCH] rev-parse --git-path: fix output when running in a subdirectory

2017-02-09 Thread Mike Rappazzo

On Thu, Feb 9, 2017 at 5:54 PM, Junio C Hamano  wrote:
>
> That leaves what the right single-step behaviour change should be.
> As I recall Duy said something about --common-dir and other things
> Mike's earlier change also covered, I'd prefer to leave it to three
> of you to figure out what the final patch should be.
>

The tests which I covered in my previous patch [1] addressed the
places where we identified similar problems.  We should try to include
some form of those tests.  As far as implementation goes in rev-parse,
the version in this thread is probably better that what I had, but it
would need to also be applied to --git-common-dir and
--shared-index-path.


[1] 
http://public-inbox.org/git/1464261556-89722-2-git-send-email-rappa...@gmail.com/

Re: [PATCH] help: show help for aliases

Tom Kunze  writes:

> If an alias is a single git command show the man page of the
> aliased git command with --help.
>
> Signed-off-by: Tom Kunze 
> ...
> diff --git a/builtin/help.c b/builtin/help.c
> index 49f7a07..655ed49 100644
> --- a/builtin/help.c
> +++ b/builtin/help.c
> @@ -437,6 +437,10 @@ static const char *check_git_cmd(const char* cmd)
>
>   alias = alias_lookup(cmd);
>   if (alias) {
> + if (alias[0] != '!') {
> + strtok(alias, " \t\n");
> + return alias;
> + }

While I understand where you come from, I am moderately negative,
especially with that strtok() to ignore options.

For a truly simple alias, e.g.

$ git co --help
`git co' is aliased to `checkout'

I do not think I would mind the updated behaviour given by this
patch that much.

But most of the time, when I do "help" on an alias, I am primarily
interested in what default customization I am using over the base
command, i.e.

$ git lgf --help
`git lgf' is aliased to `log --oneline --boundary --first-parent'

is my way to remind me that I am using these three options to "git
log" in the alias I very often use (and forgot what they were).

Jumping directly to the "git log" manual page is the last thing I
want "help" to do.

[PATCH] help: show help for aliases

2017-02-09 Thread Tom Kunze

If an alias is a single git command show the man page of the
aliased git command with --help.

Signed-off-by: Tom Kunze 
---
Hi,

I noticed that when I pass --help to an alias which is only a git
command it tells me a information about the alias. But it would be
nice if it instead opens the corresponding man page of the command.

There is a memory leak but in my opinion it can be ignored because
the process will be replaced anyway.

Regards,
Tom Kunze

PS: Please add me to cc as I am not subscribed.

 builtin/help.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/builtin/help.c b/builtin/help.c
index 49f7a07..655ed49 100644
--- a/builtin/help.c
+++ b/builtin/help.c
@@ -437,6 +437,10 @@ static const char *check_git_cmd(const char* cmd)

alias = alias_lookup(cmd);
if (alias) {
+   if (alias[0] != '!') {
+   strtok(alias, " \t\n");
+   return alias;
+   }
printf_ln(_("`git %s' is aliased to `%s'"), cmd, alias);
free(alias);
exit(0);
-- 
2.1.4

Re: [PATCH 0/5] Store submodules in a hash, not a linked list

On Thu, Feb 09, 2017 at 10:23:35PM +0100, Michael Haggerty wrote:

> >> So push the submodule attribute down to the `files_ref_store` class
> >> (but continue to let the `ref_store`s be looked up by submodule).
> > 
> > I'm not sure I understand all of the ramifications here. It _sounds_ like
> > pushing this down into the files-backend code would make it harder to
> > have mixed ref-backends for different submodules. Or is this just
> > pushing down an implementation detail of the files backend, and future
> > code is free to have as many different ref_stores as it likes?
> 
> I don't understand how this would make it harder, aside from the fact
> that a new backend class might also need a path member and have to
> maintain its own copy rather than using one that the base class provides.

Probably the answer is "I'm really confused".

But here's how my line of reasoning went:

  Right now we have a main ref-store that points to the submodule
  ref-stores. I don't know the current state of it, but in theory those
  could all use different backends.

  This seems like it's pushing that submodule linkage down into the
  backend.

But I think from your response that the answer is no, the thing that is
being pushed down is not the right way for the main ref store and the
submodules to be linked. In fact, there is no reason at all for the
main ref store to know or care about submodules. Anybody who wants to
know about a submodule's refs should ask the hashmap.

-Peff

Re: [PATCH] gc: ignore old gc.log files

On Thu, Feb 09, 2017 at 11:57:12PM -, Philip Oakley wrote:

> > > +gc.maxLogAge::
> > > + If the file gc.log exists, then `git gc --auto` won't run
> > > + unless that file is more than maxLogAge seconds old.  Default
> > > + is 86400, one day.
> 
> Is there a reason why one day is chosen? If maintenance staff are available
> 24/7 then a shorter time would be appropriate, but if it's a 5 day work week
> then they may want longer. Is there a particular case it targets?

I'm pretty sure the one-day time limit isn't scientific. It's just a
number we've been throwing around.

I'm not sure what maintenance staff matters, though. It basically needs
long enough that we're not doing _too_ many fruitless gc's, because it
wastes resources. But you'd prefer to not go too long without a gc for a
repository that needs it.

The root cause of the error could be any number of issues. But for the
case that David cares about most, you basically want to keep trying
until the too-many-objects condition goes away. That's usually on a
2-week timer. So trying once per day to see if the 2-week timer feels
about right.

That's certainly not science, but hopefully it at least frames the
general ballpark.

One possible option would be to auto-scale it with the pruneExpire time.
I don't know if people actually tweak that value or not.

-Peff

Re: [PATCH] gc: ignore old gc.log files

2017-02-09 Thread Philip Oakley


From: "Jeff King" 

On Wed, Feb 08, 2017 at 09:02:22PM -0500, David Turner wrote:


The intent of automatic gc is to have a git repository be relatively
low-maintenance from a server-operator perspective.  Of course, large
operators like GitHub will need a more complicated management strategy,
but for ordinary usage, git should just work.

In this commit, git learns to ignore gc.log files which are older than
(by default) one day old.  It also learns about a config, gc.maxLogAge
to manage this.

So git should never get itself into a state where it refuses to do any
maintenance, just because at some point some piece of the maintenance
didn't make progress.  That might still happen (e.g. because the repo
is corrupt), but at the very least it won't be because Git is too dumb
to try again.


Sounds like a good goal and approach.


+gc.maxLogAge::
+ If the file gc.log exists, then `git gc --auto` won't run
+ unless that file is more than maxLogAge seconds old.  Default
+ is 86400, one day.


Is there a reason why one day is chosen? If maintenance staff are available 
24/7 then a shorter time would be appropriate, but if it's a 5 day work week 
then they may want longer. Is there a particular case it targets?




For other time-based config, we use approxidate with a relative time,
like "1 day ago". I think it would make sense for this to match, as it
makes the config a little more readable.

You can follow the prune_expire example which is right below your new
config variable in all of the hunks of your patch. Though I think
ultimately that isn't parsed inside gc, so you'd eventually look at how
"prune --expire" is handled (which I think is via parse_expiry_date()).


[...]

Philip

Re: [PATCH v2] git-p4: fix git-p4.pathEncoding for removed files

Lars Schneider  writes:

> unfortunately, I missed to send this v2. I agree with Luke's review and
> I moved the re-encode of the path name to the `streamOneP4File` and
> `streamOneP4Deletion` explicitly.
>
> Discussion:
> http://public-inbox.org/git/CAE5ih7-=bd_zol5pfyfd2qvy-xe24v_cgge0xoavuotk02e...@mail.gmail.com/
>
> Thanks,
> Lars

Thanks.  Will replace but will not immediately merge to 'next' yet,
just in case Luke wants to tell me add his "Reviewed-by:".

Re: [PATCH] mingw: use OpenSSL's SHA-1 routines

Johannes Schindelin  writes:

> From: Jeff Hostetler 
>
> Use OpenSSL's SHA-1 routines rather than builtin block-sha1 routines.
> This improves performance on SHA1 operations on Intel processors.
> ...
>
> Signed-off-by: Jeff Hostetler 
> Signed-off-by: Johannes Schindelin 
> ---

Nice.  Will queue as jh/mingw-openssl-sha1 topic; it is a bit too
late for today's integration cycle to be merged to 'next', but let's
have this by the end of the week in 'master'.

Thanks.

Re: Bug with fixup and autosquash

2017-02-09 Thread Philip Oakley

From: "Johannes Schindelin" 
Sent: Thursday, February 09, 2017 8:55 PM

Hi Ashutosh and Junio,

On Wed, 8 Feb 2017, Junio C Hamano wrote:

Ashutosh Bapat  writes:

> I have been using git rebase heavily these days and seem to have found
> a bug.
>
> If there are two commit messages which have same prefix e.g.
> yy This is prefix
> xx This is prefix and message
>
> xx comitted before yy
>
> Now I commit a fixup to yy using git commit --fixup yy
> zz fixup! This is prefix
>
> When I run git rebase -i --autosquash, the script it shows me looks 
> like

> pick xx This is prefix and message
> fixup zz fixup! This is prefix
> pick yy This is prefix
>
> I think the correct order is
> pick xx This is prefix and message
> pick yy This is prefix
> fixup zz fixup! This is prefix
>
> Is that right?

[...]

Unfortunately, "rebase -i --autosquash" reorders the entries by
identifying the commit by its title, and it goes with prefix match so
that fix-up commits created without using --fixup option but manually
records the title's prefix substring can also work.

This prefix match also happens to introduce a serious performance problem,
which is why I "fixed" this issue in the rebase--helper already (which is
the case if you are using Git for Windows, whose master branch builds on
Linux and MacOSX as well). I quoted "fix" because my motivation was to fix
the performance problem, not the "incorrect match" problem.

The rebase--helper code (specifically, the patch moving autosquash logic
into it: https://github.com/dscho/git/commit/7d0831637f) tries to match
exact onelines first,

While I think this is an improvement, and will strongly support the `git 
commit --fixup=` option which will, if the sha1/oid is given, create 
the exact commit subject line.

However it would also be useful if the actual commit subject line could have 
a similar format option, so that those who use say the git gui (rather than 
the cli) for the commit message, could easily create the `!fixup ` 
message which would allow a broader range of ways of spelling the commit 
(e.g. giving a sha1(min length) that is within the rebase todo list).

and falls back to prefix matching only after that.

Now that the sequencer-i patch series is in `master`, the next step is to
send the patch series introducing the rebase--helper. The patch series
including the fix discussed above relies on that one. Meaning that it will
take a while to get through the mill.

So please do not hold your breath until this feature/fix hits an official
Git version. If you need it[*1*] faster, feel free to build Git for
Windows' master and run with that for a while.

Ciao,
Johannes

Footnote: By "it" I mean "the feature/fix", not "an official Git version"
nor "your breath".

--
Philip

Re: [RFC-PATCHv2] submodules: add a background story

Stefan Beller  writes:

> Just like gitmodules(5), gitattributes(5), gitcredentials(7),
> gitnamespaces(7), gittutorial(7), we'd like to provide some background
> on submodules, which is not specific to the `submodule` command, but
> elaborates on the background and its intended usage.
>
> Add gitsubmodules(7), that explains the states, structure and usage of
> submodules.
>
> Signed-off-by: Stefan Beller 
> ---
>
> This would replace the last patch of  sb/submodule-doc, though it's still
> RFC. In this revision I took care of the technical details (i.e. proper
> formatting, spelling), and only slight rewording of the text.
>
> The main issue persists; see bottom of the patch:
>
>   SAMPLE WORKFLOWS (RFC/TODO)
>   ---
>   
>   Do we need
>   
>   * an opinionated way to check for a specific state of a submodule
>   * (submodule helper to be plumbing?)
>   * expose the design mistake of having the (name->path) mapping inside the
> working tree, i.e. never remove a name from the submodule config even when
> the submodule doesn't exist any more.

I am not sure about the last item.  

Are you talking about a case where submodule comes and goes (think:
"git checkout v1.0" that would make submodules added since that
version disappar)?  .gitmodules that is checked out would not have
any entry, but .git/config needs to record the end-user preference
for the module, so that the user can do "git checkout -" to come
back, no?  IOW .git/config that mentions all the submodule the user
ever showed interests in is not a design mistake, so you must be
talking about something else, but I do not know what it is.

> diff --git a/Documentation/git-submodule.txt b/Documentation/git-submodule.txt
> index 4a4cede144..d38aa2d53a 100644
> --- a/Documentation/git-submodule.txt
> +++ b/Documentation/git-submodule.txt
> @@ -24,37 +24,7 @@ DESCRIPTION
>  ---
>  Inspects, updates and manages submodules.
>  
> -A submodule allows you to keep another Git repository in a subdirectory
> ...
> -if you choose to go that route.
> +For more information about submodules, see linkgit:gitsubmodules[5]

OK.

> @@ -420,6 +390,10 @@ This file should be formatted in the same way as 
> `$GIT_DIR/config`. The key
>  to each submodule url is "submodule.$name.url".  See linkgit:gitmodules[5]
>  for details.
>  
> +SEE ALSO
> +
> +linkgit:gitsubmodules[1], linkgit:gitmodules[1].

Are they both in section (1)?  I think the former (concepts) belongs
to section 7 and the latter (file formats) belongs to section 5.

> diff --git a/Documentation/gitsubmodules.txt b/Documentation/gitsubmodules.txt
> new file mode 100644
> index 00..3369d55ae9
> --- /dev/null
> +++ b/Documentation/gitsubmodules.txt
> @@ -0,0 +1,194 @@
> +gitsubmodules(7)
> +
> +
> +NAME
> +
> +gitsubmodules - information about submodules
> +
> +SYNOPSIS
> +
> +$GIT_DIR/config, .gitmodules
> +
> +--
> +git submodule
> +--
> +
> +DESCRIPTION
> +---
> +
> +A submodule allows you to keep another Git repository in a subdirectory
> +...
> +When cloning or pulling a repository containing submodules however,
> +the submodules will not be checked out by default; You need to instruct
> +'clone' to recurse into submodules. The 'init' and 'update' subcommands

I think this is not "You need to", but rather "You can, if you want
to have each and every submodules."

> +of 'git submodule' will maintain submodules checked out and at an
> +appropriate revision in your working tree.
> +
> +WHEN TO USE
> +---
> +
> +Submodules, repositories inside other repositories,
> +can be used for different use cases:
> +
> +* To have finer grained access control.
> +  The design principles of Git do not allow for partial repositories to be
> +  checked out or transferred. A repository is the smallest unit that a user
> +  can be given access to. Submodules are separate repositories, such that
> +  you can restrict access to parts of your project via the use of submodules.
> +
> +* To decouple Git histories.
> +  Decoupling histories has different benefits.
> +
> +** When you want to use a (third party) library tied to a specific version.
> +   Using submodules for a library allows you to have a clean history for
> +   your own project and only updating the library in the submodule when 
> needed.

I somehow do not see this as decoupling; it is keeping what is
originally separate separate, isn't it?

> +** In its current form Git scales up poorly for very large repositories that
> +   change a lot, as the history grows very large. For that you may want to 
> look
> +   at shallow clone, sparse checkout or git-lfs.
> +   However you can also use submodules to e.g. hold large binary assets
> +   and these repositories are then shallowly cloned such that you do not
> +   have a large history locally.

In other words, a better way to list these may be 

 1. using another project that stands on its own

Re: [PATCH] rev-parse --git-path: fix output when running in a subdirectory

Johannes Schindelin  writes:

>> I have no strong opinion for or against a "longer term" solution
>> that makes "rev-parse --git-path" behave differently from how it
>> behaves today, but I am not yet convinced that we can reach that
>> longer term goal without a transition period, as I suspect there are
>> existing users that know and came to expect how it behaves, based on
>> its today's behaviour.  Other than that I do not have suggestion on
>> this topic at the moment.

I think I was simply being silly (not merely "overcautious", but
just "silly") here.

There is no reason for people to use "--git-path" if they are not
preparing to work with secondary worktrees, because the whole point
of the feature is so that cases where "$(rev-parse --git-dir)/path"
does a wrong thing (e.g. end up referring to the main worktree thing
when you need to refer to your own, or vice versa).

> Given that
> ...
> it should be safe to assume that a transitional period is more likely to
> do more harm to our users than bring benefit.

In short, "--git-path as currently exposed to the end-users is
utterly broken and cannot have been used for anything sensible".  If
that is the case, let's just change that with an entry in the
release notes that states so (iow, there is no need for even a
backward compatibility notice, we just have an entry that says "this
was totally broken in such and such way, and now it is fixed to
behave this way").

That leaves what the right single-step behaviour change should be.
As I recall Duy said something about --common-dir and other things
Mike's earlier change also covered, I'd prefer to leave it to three
of you to figure out what the final patch should be.

Thanks.

Re: What's cooking in git.git (Feb 2017, #02; Mon, 6)

Hi Junio,

On Thu, 9 Feb 2017, Junio C Hamano wrote:

> Johannes Schindelin  writes:
> 
> > ... and even go so far as calling your patch reverting my refactoring
> > a hot-fix, why don't you just go ahead and merge the result over my
> > objections?
> 
> At this point, you are simply being silly.

How is it that this patch cannot be applied when, and if, that
hypothetical config setting is introduced?

Maybe I am dense here, but I would really like to know why this
preparatory patch must be applied *now*, when there is nothing to prepare
for.

> Isn't "Putty is not a command but is also handled as if it is a valid
> implementation of SSH" a bug?

If you think it is a bug to handle an ssh command called "putty" as if it
were plink, sure. I do not think there is a valid use case, but hey.

Ciao,
Johannes

Re: [PATCH v2 0/2] Let the sequencer handle the grunt work of rebase -i

Johannes Schindelin  writes:

> After all of these patch series y'all had to review, this is finally the
> one that switches things over.
>
> Please note that it does not (yet) handle the `git rebase -i --root`
> invocation; I tried to focus on the common case, and I rarely use --root
> myself.

As long as the longer-term goal is clear enough and the short-term
approach does not conflict with the goal, solving the most common
problem that yields the largest payback first is absolutely the
right thing to do, and omitting "--root" and/or "-p" and getting the
main use of "-i" right is a great way to start.

>  .gitignore|  1 +
>  Makefile  |  1 +
>  builtin.h |  1 +
>  builtin/rebase--helper.c  | 40 
>  git-rebase--interactive.sh| 13 +
>  git.c |  1 +
>  t/t3404-rebase-interactive.sh |  2 +-
>  7 files changed, 58 insertions(+), 1 deletion(-)
>  create mode 100644 builtin/rebase--helper.c

And it is surprisingly short and sweet ;-)

Will queue as js/rebase-helper topic, forked at 6e3a7b3398 ("Git
2.12-rc0", 2017-02-03).

Thanks.

PS. in case if anybody is wondering after reading [*1*], at this
point, I _have_ read the patches not just the cover letter, looked
at the branch name the original author gave to the topic, chose the
local topic name I use, and chose where to fork the topic from, but
have not applied the patches (so I may later end up saying "the
patch does not apply cleanly", "the compiler complains on this
line", or "the new code is inconsistent with this existing code that
is a bit beyond the context of the patch that I did not notice when
I reviewed the patches alone" in a separate message).  I do not have
a new entry for this topic in the draft of "What's cooking" report
yet, or have not decided if the topic would hit 'jch' or 'pu' yet
either.

[Reference]

*1* http://public-inbox.org/git/xmqq7f4zqiyj@gitster.mtv.corp.google.com

[PATCH] mingw: use OpenSSL's SHA-1 routines

From: Jeff Hostetler 

Use OpenSSL's SHA-1 routines rather than builtin block-sha1 routines.
This improves performance on SHA1 operations on Intel processors.

OpenSSL 1.0.2 has made considerable performance improvements and
support the Intel hardware acceleration features.  See:
https://software.intel.com/en-us/articles/improving-openssl-performance
https://software.intel.com/en-us/articles/intel-sha-extensions

To test this I added/staged a single file in a gigantic
repository having a 450MB index file.  The code in read-cache.c
verifies the header SHA as it reads the index and computes a new
header SHA as it writes out the new index.  Therefore, in this test
the SHA code must process 900MB of data.  Testing was done on an
Intel I7-4770 CPU @ 3.40GHz (Intel64, Family 6, Model 60) CPU.

The block-sha1 version averaged 5.27 seconds.
The OpenSSLversion averaged 4.50 seconds.



$ echo xxx >> project.mk
$ time /e/blk_sha/bin/git.exe add project.mk

real0m5.207s
user0m0.000s
sys 0m0.250s

$ echo xxx >> project.mk
$ time /e/blk_sha/bin/git.exe add project.mk

real0m5.362s
user0m0.015s
sys 0m0.234s

$ echo xxx >> project.mk
$ time /e/blk_sha/bin/git.exe add project.mk

real0m5.300s
user0m0.016s
sys 0m0.250s

$ echo xxx >> project.mk
$ time /e/blk_sha/bin/git.exe add project.mk

real0m5.216s
user0m0.000s
sys 0m0.250s


$ echo xxx >> project.mk
$ time /e/openssl/bin/git.exe add project.mk

real0m4.431s
user0m0.000s
sys 0m0.250s

$ echo xxx >> project.mk
$ time /e/openssl/bin/git.exe add project.mk

real0m4.478s
user0m0.000s
sys 0m0.265s

$ echo xxx >> project.mk
$ time /e/openssl/bin/git.exe add project.mk

real0m4.690s
user0m0.000s
sys 0m0.250s

$ echo xxx >> project.mk
$ time /e/openssl/bin/git.exe add project.mk

real0m4.420s
user0m0.000s
sys 0m0.234s



Signed-off-by: Jeff Hostetler 
Signed-off-by: Johannes Schindelin 
---
Published-As: https://github.com/dscho/git/releases/tag/mingw-openssl-sha1-v1
Fetch-It-Via: git fetch https://github.com/dscho/git mingw-openssl-sha1-v1

 config.mak.uname | 1 -
 1 file changed, 1 deletion(-)

diff --git a/config.mak.uname b/config.mak.uname
index 447f36ac2e..a07936da8b 100644
--- a/config.mak.uname
+++ b/config.mak.uname
@@ -515,7 +515,6 @@ ifneq (,$(findstring MINGW,$(uname_S)))
OBJECT_CREATION_USES_RENAMES = UnfortunatelyNeedsTo
NO_REGEX = YesPlease
NO_PYTHON = YesPlease
-   BLK_SHA1 = YesPlease
ETAGS_TARGET = ETAGS
NO_INET_PTON = YesPlease
NO_INET_NTOP = YesPlease

base-commit: 6e3a7b3398559305c7a239a42e447c21a8f39ff8
-- 
2.11.1.windows.1

[PATCH v2 2/2] rebase -i: use the rebase--helper builtin

Now that the sequencer learned to process a "normal" interactive rebase,
we use it. The original shell script is still used for "non-normal"
interactive rebases, i.e. when --root or --preserve-merges was passed.

Please note that the --root option (via the $squash_onto variable) needs
special handling only for the very first command, hence it is still okay
to use the helper upon continue/skip.

Also please note that the --no-ff setting is volatile, i.e. when the
interactive rebase is interrupted at any stage, there is no record of
it. Therefore, we have to pass it from the shell script to the
rebase--helper.

Note: the test t3404 had to be adjusted because the the error messages
produced by the sequencer comply with our current convention to start with
a lower-case letter.

Signed-off-by: Johannes Schindelin 
---
 git-rebase--interactive.sh| 13 +
 t/t3404-rebase-interactive.sh |  2 +-
 2 files changed, 14 insertions(+), 1 deletion(-)

diff --git a/git-rebase--interactive.sh b/git-rebase--interactive.sh
index 4734094a3f..2c9c0165b5 100644
--- a/git-rebase--interactive.sh
+++ b/git-rebase--interactive.sh
@@ -1069,6 +1069,10 @@ git_rebase__interactive () {
 
 case "$action" in
 continue)
+   if test ! -d "$rewritten"
+   then
+   exec git rebase--helper ${force_rebase:+--no-ff} --continue
+   fi
# do we have anything to commit?
if git diff-index --cached --quiet HEAD --
then
@@ -1128,6 +1132,10 @@ first and then run 'git rebase --continue' again.")"
 skip)
git rerere clear
 
+   if test ! -d "$rewritten"
+   then
+   exec git rebase--helper ${force_rebase:+--no-ff} --continue
+   fi
do_rest
return 0
;;
@@ -1314,6 +1322,11 @@ expand_todo_ids
 test -d "$rewritten" || test -n "$force_rebase" || skip_unnecessary_picks
 
 checkout_onto
+if test -z "$rebase_root" && test ! -d "$rewritten"
+then
+   require_clean_work_tree "rebase"
+   exec git rebase--helper ${force_rebase:+--no-ff} --continue
+fi
 do_rest
 
 }
diff --git a/t/t3404-rebase-interactive.sh b/t/t3404-rebase-interactive.sh
index e2f18d11f6..33d392ba11 100755
--- a/t/t3404-rebase-interactive.sh
+++ b/t/t3404-rebase-interactive.sh
@@ -556,7 +556,7 @@ test_expect_success 'clean error after failed "exec"' '
echo "edited again" > file7 &&
git add file7 &&
test_must_fail git rebase --continue 2>error &&
-   test_i18ngrep "You have staged changes in your working tree." error
+   test_i18ngrep "you have staged changes in your working tree" error
 '
 
 test_expect_success 'rebase a detached HEAD' '
-- 
2.11.1.windows.1

[PATCH v2 0/2] Let the sequencer handle the grunt work of rebase -i

After all of these patch series y'all had to review, this is finally the
one that switches things over.

Please note that it does not (yet) handle the `git rebase -i --root`
invocation; I tried to focus on the common case, and I rarely use --root
myself.

Please note also that --preserve-merges is *not* handled.

The way I designed --preserve-merges is totally stupid and idiotic and I
do not want to spend any further time on it. You cannot "pick" merges
and hope to be able to reorder commits, for example. I may work on
porting Git garden shears' way to recreate the branch structure into
rebase -i proper at some stage.

And please finally note that this pair of patches does not yet yield the
full speed improvement that I promised earlier. After these patches, the
time is dominated by pre- and post-processing the todo script, at least
on Windows, so there is another patch series that ports those bits and
pieces into the rebase--helper, too.

Changes since v1:

- rebased to current master

- this required a change in t3404 because I was bullied^Wasked to change
  some messages (which should not have been conflated with the work I
  actually wanted to do, but whatevs)


Johannes Schindelin (2):
  Add a builtin helper for interactive rebases
  rebase -i: use the rebase--helper builtin

 .gitignore|  1 +
 Makefile  |  1 +
 builtin.h |  1 +
 builtin/rebase--helper.c  | 40 
 git-rebase--interactive.sh| 13 +
 git.c |  1 +
 t/t3404-rebase-interactive.sh |  2 +-
 7 files changed, 58 insertions(+), 1 deletion(-)
 create mode 100644 builtin/rebase--helper.c


base-commit: 6e3a7b3398559305c7a239a42e447c21a8f39ff8
Based-On: sequencer-i at https://github.com/dscho/git
Fetch-Base-Via: git fetch https://github.com/dscho/git sequencer-i
Published-As: https://github.com/dscho/git/releases/tag/rebase--helper-v2
Fetch-It-Via: git fetch https://github.com/dscho/git rebase--helper-v2

Interdiff vs v1:

 diff --git a/t/t3404-rebase-interactive.sh b/t/t3404-rebase-interactive.sh
 index e2f18d11f6..33d392ba11 100755
 --- a/t/t3404-rebase-interactive.sh
 +++ b/t/t3404-rebase-interactive.sh
 @@ -556,7 +556,7 @@ test_expect_success 'clean error after failed "exec"' '
echo "edited again" > file7 &&
git add file7 &&
test_must_fail git rebase --continue 2>error &&
 -  test_i18ngrep "You have staged changes in your working tree." error
 +  test_i18ngrep "you have staged changes in your working tree" error
  '
  
  test_expect_success 'rebase a detached HEAD' '

-- 
2.11.1.windows.1

[PATCH v2 1/2] Add a builtin helper for interactive rebases

Git's interactive rebase is still implemented as a shell script, despite
its complexity. This implies that it suffers from the portability point
of view, from lack of expressibility, and of course also from
performance. The latter issue is particularly serious on Windows, where
we pay a hefty price for relying so much on POSIX.

Unfortunately, being such a huge shell script also means that we missed
the train when it would have been relatively easy to port it to C, and
instead piled feature upon feature onto that poor script that originally
never intended to be more than a slightly pimped cherry-pick in a loop.

To open the road toward better performance (in addition to all the other
benefits of C over shell scripts), let's just start *somewhere*.

The approach taken here is to add a builtin helper that at first intends
to take care of the parts of the interactive rebase that are most
affected by the performance penalties mentioned above.

In particular, after we spent all those efforts on preparing the sequencer
to process rebase -i's git-rebase-todo scripts, we implement the `git
rebase -i --continue` functionality as a new builtin, git-rebase--helper.

Once that is in place, we can work gradually on tackling the rest of the
technical debt.

Note that the rebase--helper needs to learn about the transient
--ff/--no-ff options of git-rebase, as the corresponding flag is not
persisted to, and re-read from, the state directory.

Signed-off-by: Johannes Schindelin 
---
 .gitignore   |  1 +
 Makefile |  1 +
 builtin.h|  1 +
 builtin/rebase--helper.c | 40 
 git.c|  1 +
 5 files changed, 44 insertions(+)
 create mode 100644 builtin/rebase--helper.c

diff --git a/.gitignore b/.gitignore
index b1020b875f..833ef3b0b7 100644
--- a/.gitignore
+++ b/.gitignore
@@ -114,6 +114,7 @@
 /git-read-tree
 /git-rebase
 /git-rebase--am
+/git-rebase--helper
 /git-rebase--interactive
 /git-rebase--merge
 /git-receive-pack
diff --git a/Makefile b/Makefile
index 8e4081e061..29f8663509 100644
--- a/Makefile
+++ b/Makefile
@@ -932,6 +932,7 @@ BUILTIN_OBJS += builtin/prune.o
 BUILTIN_OBJS += builtin/pull.o
 BUILTIN_OBJS += builtin/push.o
 BUILTIN_OBJS += builtin/read-tree.o
+BUILTIN_OBJS += builtin/rebase--helper.o
 BUILTIN_OBJS += builtin/receive-pack.o
 BUILTIN_OBJS += builtin/reflog.o
 BUILTIN_OBJS += builtin/remote.o
diff --git a/builtin.h b/builtin.h
index 67f80519da..9e4a89816d 100644
--- a/builtin.h
+++ b/builtin.h
@@ -103,6 +103,7 @@ extern int cmd_prune_packed(int argc, const char **argv, 
const char *prefix);
 extern int cmd_pull(int argc, const char **argv, const char *prefix);
 extern int cmd_push(int argc, const char **argv, const char *prefix);
 extern int cmd_read_tree(int argc, const char **argv, const char *prefix);
+extern int cmd_rebase__helper(int argc, const char **argv, const char *prefix);
 extern int cmd_receive_pack(int argc, const char **argv, const char *prefix);
 extern int cmd_reflog(int argc, const char **argv, const char *prefix);
 extern int cmd_remote(int argc, const char **argv, const char *prefix);
diff --git a/builtin/rebase--helper.c b/builtin/rebase--helper.c
new file mode 100644
index 00..ca1ebb2fa1
--- /dev/null
+++ b/builtin/rebase--helper.c
@@ -0,0 +1,40 @@
+#include "builtin.h"
+#include "cache.h"
+#include "parse-options.h"
+#include "sequencer.h"
+
+static const char * const builtin_rebase_helper_usage[] = {
+   N_("git rebase--helper []"),
+   NULL
+};
+
+int cmd_rebase__helper(int argc, const char **argv, const char *prefix)
+{
+   struct replay_opts opts = REPLAY_OPTS_INIT;
+   enum {
+   CONTINUE = 1, ABORT
+   } command = 0;
+   struct option options[] = {
+   OPT_BOOL(0, "ff", &opts.allow_ff, N_("allow fast-forward")),
+   OPT_CMDMODE(0, "continue", &command, N_("continue rebase"),
+   CONTINUE),
+   OPT_CMDMODE(0, "abort", &command, N_("abort rebase"),
+   ABORT),
+   OPT_END()
+   };
+
+   git_config(git_default_config, NULL);
+
+   opts.action = REPLAY_INTERACTIVE_REBASE;
+   opts.allow_ff = 1;
+   opts.allow_empty = 1;
+
+   argc = parse_options(argc, argv, NULL, options,
+   builtin_rebase_helper_usage, PARSE_OPT_KEEP_ARGV0);
+
+   if (command == CONTINUE && argc == 1)
+   return !!sequencer_continue(&opts);
+   if (command == ABORT && argc == 1)
+   return !!sequencer_remove_state(&opts);
+   usage_with_options(builtin_rebase_helper_usage, options);
+}
diff --git a/git.c b/git.c
index c887272b12..33f52acbcc 100644
--- a/git.c
+++ b/git.c
@@ -473,6 +473,7 @@ static struct cmd_struct commands[] = {
{ "pull", cmd_pull, RUN_SETUP | NEED_WORK_TREE },
{ "push", cmd_push, RUN_SETUP },
{ "read-tree", cmd_read_tree, RUN_SETUP | SUPPORT_SUP

Re: [PATCH 1/5] refs: store submodule ref stores in a hashmap

Michael Haggerty  writes:

> Aside from scaling better, this means that the submodule name needn't be
> stored in the ref_store instance anymore (which will be changed in a
> moment).

Nice.  I like the latter reason very much (this is not a suggestion
to change the description).

> +struct submodule_hash_entry
> +{
> + struct hashmap_entry ent; /* must be the first member! */
> +
> + struct ref_store *refs;
> +
> + /* NUL-terminated name of submodule: */
> + char submodule[FLEX_ARRAY];
> +};
> +
> +static int submodule_hash_cmp(const void *entry, const void *entry_or_key,
> +   const void *keydata)
> +{
> + const struct submodule_hash_entry *e1 = entry, *e2 = entry_or_key;
> + const char *submodule = keydata;
> +
> + return strcmp(e1->submodule, submodule ? submodule : e2->submodule);

I would have found it more readable if it were like so:

const char *submodule = keydata ? keydata : e2->submodule;

return strcmp(e1->submodule, submodule);

but I suspect the difference is not that huge.

> +}
> +
> +static struct submodule_hash_entry *alloc_submodule_hash_entry(
> + const char *submodule, struct ref_store *refs)
> +{
> + size_t len = strlen(submodule);
> + struct submodule_hash_entry *entry = malloc(sizeof(*entry) + len + 1);

I think this (and the later memcpy) is what FLEX_ALLOC_MEM() was
invented for.

> + hashmap_entry_init(entry, strhash(submodule));
> + entry->refs = refs;
> + memcpy(entry->submodule, submodule, len + 1);
> + return entry;
> +}
> ...
> @@ -1373,16 +1405,17 @@ void base_ref_store_init(struct ref_store *refs,
>   die("BUG: main_ref_store initialized twice");
>  
>   refs->submodule = "";
> - refs->next = NULL;
>   main_ref_store = refs;
>   } else {
> - if (lookup_ref_store(submodule))
> + refs->submodule = xstrdup(submodule);
> +
> + if (!submodule_ref_stores.tablesize)
> + hashmap_init(&submodule_ref_stores, submodule_hash_cmp, 
> 20);

Makes me wonder what "20" stands for.  Perhaps the caller should be
allowed to say "I do not quite care what initial size is" by passing
0 or some equally but more clealy meaningless value (which of course
would be outside the scope of this series).

Re: Bug with fixup and autosquash

Johannes Schindelin  writes:

> Almost. While I fixed the performance issues as well as the design
> allowed, I happened to "fix" the problem where an incomplete prefix match
> could be favored over an exact match.

Hmph.  Would it require too much further work to do what you said
the code does:

> The rebase--helper code (specifically, the patch moving autosquash logic
> into it: https://github.com/dscho/git/commit/7d0831637f) tries to match
> exact onelines first, and falls back to prefix matching only after that.

If the code matches exact onlines and then falls back to prefix, I
do not think incomplete prefix would be mistakenly chosen over an
exact one, so perhaps your code already does the right thing?

Re: Cross-referencing the Git mailing list archive with their corresponding commits in `pu`

Johannes Schindelin  writes:

>> > E.g.
>> > 
>> > $ git show --notes=amlog --stat
>> 
>> That's super useful! Thanks for the pointer!
>> Wouldn't it make sense to push these notes to github.com/git/git ?
>
> I am not quite sure about that. It is in a different namespace than what
> is usually cloned, and it currently adds 8MB to the download (there are
> "amlog" and "commits", the latter clearly being a sandbox).

I do not think the public mirrors of the primary repository should
get amlog, either.  It is more suited for those who are interested
in broken-out topics, i.e. git://github.com/git/gitster.

Re: [PATCH] rev-parse --git-path: fix output when running in a subdirectory

Hi Junio,

On Thu, 9 Feb 2017, Junio C Hamano wrote:

> Duy Nguyen  writes:
> 
> > Relevant thread in the past [1] which fixes both --git-path and
> > --git-common-dir. I think the author dropped it somehow (or forgot
> > about it, I know I did). Sorry can't comment on that thread, or this
> > patch, yet.
> >
> > [1] 
> > http://public-inbox.org/git/1464261556-89722-1-git-send-email-rappa...@gmail.com/
> 
> Thanks for a pointer.  I see Mike responded to this message (I
> haven't had a chance to read and think about it yet), so I trust
> that you three can figure out if these are the same issues and what
> the final solution in the longer term should be.  
> 
> I have no strong opinion for or against a "longer term" solution
> that makes "rev-parse --git-path" behave differently from how it
> behaves today, but I am not yet convinced that we can reach that
> longer term goal without a transition period, as I suspect there are
> existing users that know and came to expect how it behaves, based on
> its today's behaviour.  Other than that I do not have suggestion on
> this topic at the moment.

Given that

- the output is incorrect, not some output that could maybe be improved,

- warnings in a script execution are most likely to be missed,

- --git-path gives incorrect output in subdirectories, except inside
  worktrees, therefore scripts relying on the current behavior are highly
  likely to misbehave in worktrees anyway,

- leaving this bug unfixed even when we know about it for 3 major releases
  reflects really badly on Git as a project, and

- the longer we wait to fix this bug, the more developers will simply stay
  away from --git-path (of course, only *after* they were bitten by the
  bug, like I was),

it should be safe to assume that a transitional period is more likely to
do more harm to our users than bring benefit.

Ciao,
Johannes

Re: Bug with fixup and autosquash

Hi Junio,

On Thu, 9 Feb 2017, Junio C Hamano wrote:

> Johannes Schindelin  writes:
> 
> > This prefix match also happens to introduce a serious performance problem,
> > which is why I "fixed" this issue in the rebase--helper already (which is
> > the case if you are using Git for Windows, whose master branch builds on
> > Linux and MacOSX as well). I quoted "fix" because my motivation was to fix
> > the performance problem, not the "incorrect match" problem.
> 
> In other words, regardless of your motivation, you "fix"ed both,
> which is very nice ;-)

Almost. While I fixed the performance issues as well as the design
allowed, I happened to "fix" the problem where an incomplete prefix match
could be favored over an exact match.

I really fixed the performance issues. Not "fixed" them.

Ciao,
Johannes

Re: Cross-referencing the Git mailing list archive with their corresponding commits in `pu`

Hi Lars,

On Thu, 9 Feb 2017, Lars Schneider wrote:

> > On 06 Feb 2017, at 20:10, Junio C Hamano  wrote:
> > 
> > Johannes Schindelin  writes:
> > 
> >> So I thought maybe the From: line (from the body, if available,
> >> otherwise from the header) in conjunction with the "Date:" header
> >> would work.
> > 
> > FYI, I use a post-applypatch hook to populate refs/notes/amlog notes
> > tree when I queue a new patch; I am not sure how well the notes in it
> > are preserved across rebases, but it could be a good starting point.
> > The notes tree is mirrored at git://github.com/git/gitster repository.
> > 
> > E.g.
> > 
> > $ git show --notes=amlog --stat
> 
> That's super useful! Thanks for the pointer!
> Wouldn't it make sense to push these notes to github.com/git/git ?

I am not quite sure about that. It is in a different namespace than what
is usually cloned, and it currently adds 8MB to the download (there are
"amlog" and "commits", the latter clearly being a sandbox).

While I am thankful that there is at least some information available for
patches integrated into `pu` since Nov 1 2016, the format is probably not
stable (we are talking about free-form notes, after all), and it still
does not help with catching the case where new patch series iterations (or
in some case, new patch series, period) are missed.

Make no mistake, it will be a huge undertaking to develop a tool that
helps with the management of patch series on top of the mailing list
driven patch review process. And even in the best case, it may be simply
too hard for an automated tool to figure things out e.g. when Peff or
Junio paste a tangentially related diff into a thread.

In the end, what I *really* would love to have is a system where you can
easily query "which reviewer comments on *any* of my patch series are new,
or still unaddressed?", and "in what way was my patch modified relative to
the latest version I submitted?". It may actually be impossible to create
such a tool, as it cannot invent information/cross-references that it does
not have nor can deduce from available data.

Ciao,
Johannes

Re: What's cooking in git.git (Feb 2017, #02; Mon, 6)

Johannes Schindelin  writes:

>> > (And that would have to be handled at a different point, as I had
>> > pointed out, so that suggested preparation would most likely not help
>> > at all.)
>> 
>> I did not think "that would have to be handled at a different point"
>> is correct at all, if by "a point" you meant "a location in the
>> code" [*1*].
>
> Yes, I mean the location in the code.
>
> But since you keep insisting that you are right and I am wrong,...

There is no "insisting".  Didn't we just see how wrong you were
about "different point"?  An extended syntax of override would be
handled in the new helper to handle override, the same point in the
code as other overrides are handled.

> ... and even
> go so far as calling your patch reverting my refactoring a hot-fix, why
> don't you just go ahead and merge the result over my objections?

At this point, you are simply being silly.

Isn't "Putty is not a command but is also handled as if it is a
valid implementation of SSH" a bug?  Isn't making the code not to be
confused like so a fix?

A different approach to fix the issue would be to declare that the
command names and overrides are not in separate namespaces.

If you prefer to go that route, the documentation can use an update
to make it not mention "putty" as a valid override (the users have
to spell "plink"), mention "plink.exe" is also accepted, etc. and
make it clear that the override environment and configuration
variables are the way to tell Git: "The ssh implementation I have
behaves the same way as this well-known implementation, so treat it
as such without actually looking at the path to the command in the
ssh.command string".

That may limit the freedom for future enhancement of overrides, but
is an acceptable short-cut.  After all, the overrides are merely an
escape hatch and we expect them to be used only rarely.

Re: GSoC 2017: application open, deadline = February 9, 2017

On Thu, Feb 9, 2017 at 1:38 PM, Johannes Schindelin
 wrote:
> Hi Christian,
>
> On Thu, 9 Feb 2017, Christian Couder wrote:
>
>> I just had a look and the microproject and idea pages for this year are
>> ok.  They are not great sure, but not much worse than the previous
>> years.  What should probably be done is to remove project ideas where is
>> no "possible mentor" listed.
>>
>> But I am reluctant to do that as I don't know what Dscho would be ok to
>> mentor.
>
> I am okay to mentor (except anything that touches submodules).

I am okay to mentor anything (preferrably submodules).

Thanks,
Stefan

>
> Ciao,
> Johannes

Re: Bug with fixup and autosquash

Johannes Schindelin  writes:

> This prefix match also happens to introduce a serious performance problem,
> which is why I "fixed" this issue in the rebase--helper already (which is
> the case if you are using Git for Windows, whose master branch builds on
> Linux and MacOSX as well). I quoted "fix" because my motivation was to fix
> the performance problem, not the "incorrect match" problem.

In other words, regardless of your motivation, you "fix"ed both,
which is very nice ;-)

Re: [PATCH] rev-parse --git-path: fix output when running in a subdirectory

Johannes Schindelin  writes:

> On Wed, 8 Feb 2017, Junio C Hamano wrote:
>
>> How long has "rev-parse --git-path" been around?  Had scripts in the
>> wild chance to learn to live with the "output is relative to the top of
>> the working tree" reality?  I think the answers are "since 2.5" and
>> "yes".
>
> Correct. And the third question is: How did the scripts work around this
> feature?
>
> The answer is obvious: by switching back to `$(git rev-parse
> --git-dir)/filename`.
>
> This is literally on what I spent the better part of Wednesday.
>
> There is just no way you can "fix" this otherwise. As an occasional shell
> scripter, you may be tempted to use `$(git rev-parse --show-cdup)$(git
> rev-parse --git-path filename)`, but of course that breaks in worktrees
> and if you do not use worktrees you would not even know that your
> workaround introduced another bug.

In case it is not clear, I understand all of the above.  

I was just worried about the people who do *NOT* use worktrees and
did the obvious "concatenate --cdup with --git-path" and thought
their script were working happily and well.  By prepending the path
to the (real) location of the .git in the updated --git-path output
ourselves, they will complain, our update broke their script.

If we introduced the fix gently, by (1) warn when "--git-path" is
used but give the current output anyway, while adding the "fixed"
one as another new option, and then (2) remove "--git-path" after
several releases, they will not have to complain, even though they
will see warning until they stop using "--git-path".

There may be gentler alternative ways to transition, and I do not
worry about the specifics of them too much.  I just think we cannot
do this in a single step without harming existing users.

Re: GSoC 2017: application open, deadline = February 9, 2017

Hi Christian,

On Thu, 9 Feb 2017, Christian Couder wrote:

> I just had a look and the microproject and idea pages for this year are
> ok.  They are not great sure, but not much worse than the previous
> years.  What should probably be done is to remove project ideas where is
> no "possible mentor" listed.
>
> But I am reluctant to do that as I don't know what Dscho would be ok to
> mentor.

I am okay to mentor (except anything that touches submodules).

Ciao,
Johannes

Re: [PATCH] rev-parse --git-path: fix output when running in a subdirectory

Hi Junio,

On Wed, 8 Feb 2017, Junio C Hamano wrote:

> How long has "rev-parse --git-path" been around?  Had scripts in the
> wild chance to learn to live with the "output is relative to the top of
> the working tree" reality?  I think the answers are "since 2.5" and
> "yes".

Correct. And the third question is: How did the scripts work around this
feature?

The answer is obvious: by switching back to `$(git rev-parse
--git-dir)/filename`.

This is literally on what I spent the better part of Wednesday.

There is just no way you can "fix" this otherwise. As an occasional shell
scripter, you may be tempted to use `$(git rev-parse --show-cdup)$(git
rev-parse --git-path filename)`, but of course that breaks in worktrees
and if you do not use worktrees you would not even know that your
workaround introduced another bug.

The current handling of --git-path is just plain wrong. The fact that I am
the first person to report this here merely shows how much it is used in
the wild.

Ciao,
Johannes

Re: [PATCH 4/5] files_ref_store::submodule: use NULL for the main repository

On Thu, Feb 9, 2017 at 1:19 PM, Junio C Hamano  wrote:

>>   if (!submodule || !*submodule)
>> - return be->init(NULL);
>> + refs = be->init(NULL);
>>   else
>> - return be->init(submodule);
>> + refs = be->init(submodule);
>
> Can't we also lose this "if !*submodule, turn it to NULL"?

That was my suggestion as well, but that did not look nicer per se.
That is because at this point of the series we still handle "" just fine.

>>   refs->submodule = submodule ? xstrdup(submodule) : "";
>
> Also, can't we use xstrdup_or_null(submodule) with this step?
>

That is done in 4/5; here we must keep a "" around instead of NULL IIUC.

Re: [PATCH] rev-parse --git-path: fix output when running in a subdirectory

Duy Nguyen  writes:

> Relevant thread in the past [1] which fixes both --git-path and
> --git-common-dir. I think the author dropped it somehow (or forgot
> about it, I know I did). Sorry can't comment on that thread, or this
> patch, yet.
>
> [1] 
> http://public-inbox.org/git/1464261556-89722-1-git-send-email-rappa...@gmail.com/

Thanks for a pointer.  I see Mike responded to this message (I
haven't had a chance to read and think about it yet), so I trust
that you three can figure out if these are the same issues and what
the final solution in the longer term should be.  

I have no strong opinion for or against a "longer term" solution
that makes "rev-parse --git-path" behave differently from how it
behaves today, but I am not yet convinced that we can reach that
longer term goal without a transition period, as I suspect there are
existing users that know and came to expect how it behaves, based on
its today's behaviour.  Other than that I do not have suggestion on
this topic at the moment.

Thanks.

Re: [PATCH 0/5] Store submodules in a hash, not a linked list

On 02/09/2017 08:58 PM, Jeff King wrote:
> On Thu, Feb 09, 2017 at 02:26:57PM +0100, Michael Haggerty wrote:
> [...]
>> A `files_ref_store` would be perfectly happy to represent, say, the
>> references *physically* stored in a linked worktree (e.g., `HEAD`,
>> `refs/bisect/*`, etc) even though that is not the complete collection
>> of refs that are *logically* visible from that worktree (which
>> includes references from the main repository, too). But the old code
>> was confusing the two things by storing "submodule" in every
>> `ref_store` instance.
>>
>> So push the submodule attribute down to the `files_ref_store` class
>> (but continue to let the `ref_store`s be looked up by submodule).
> 
> I'm not sure I understand all of the ramifications here. It _sounds_ like
> pushing this down into the files-backend code would make it harder to
> have mixed ref-backends for different submodules. Or is this just
> pushing down an implementation detail of the files backend, and future
> code is free to have as many different ref_stores as it likes?

I don't understand how this would make it harder, aside from the fact
that a new backend class might also need a path member and have to
maintain its own copy rather than using one that the base class provides.

I consider it an implementation detail of the files backend that it
needs to keep a permanent record of its submodule path in
files_ref_store. Some hypothetical future backend might instead need,
say, an IP number and port to connect to a MySQL server. A hypothetical
pure packed-refs backend might just store the path of the packed-refs file.

A more likely imminent change is that backends need a path, but that the
path needn't correspond to the git_dir of the repository that contains
the corresponding objects, for example in the case of a linked worktree.
You might ask for the ref_store for a worktree-submodule, and end up
getting a compound object that delegates to one ref_store pointing at
its git_dir and one at its common_dir.

Michael

Re: [PATCH 2/5] refs: push the submodule attribute down

Michael Haggerty  writes:

> Push the submodule attribute down from ref_store to files_ref_store.
> This is another step towards loosening the 1:1 connection between
> ref_stores and submodules.

The update seems to retain the externally visible effects, but what
does this change mean for future backend writers whose code will sit
next to files_ref_store?  They need to have "submodule" field in their
implementation of the backend and keep track of it?

Granted that the primary thing that looks at ->submodule field in
the code before this change all live in refs/files-backend.c, but I
am not sure if that is an artifact of us having only one backend at
this moment, or a sign that future backends would benefit from extra
freedom to choose how they exactly implement their submodule
support.

Re: [PATCH 4/5] files_ref_store::submodule: use NULL for the main repository

Michael Haggerty  writes:

> The old practice of storing the empty string in this member for the main
> repository was a holdover from before 00eebe3 (refs: create a base class
> "ref_store" for files_ref_store, 2016-09-04), when the submodule was
> stored in a flex array at the end of `struct files_ref_store`. Storing
> NULL for this case is more idiomatic and a tiny bit less code.

Yes.  I noticed this bit in 3/5 and wondered about it, knowing this
step comes next:

>  struct ref_store *ref_store_init(const char *submodule)
>  {
>   const char *be_name = "files";
>   struct ref_storage_be *be = find_ref_storage_backend(be_name);
> + struct ref_store *refs;
>  
>   if (!be)
>   die("BUG: reference backend %s is unknown", be_name);
>  
>   if (!submodule || !*submodule)
> - return be->init(NULL);
> + refs = be->init(NULL);
>   else
> - return be->init(submodule);
> + refs = be->init(submodule);

Can't we also lose this "if !*submodule, turn it to NULL"?

> --- a/refs/files-backend.c
> +++ b/refs/files-backend.c
> @@ -980,7 +980,7 @@ static struct ref_store *files_ref_store_create(const 
> char *submodule)
>   struct files_ref_store *refs = xcalloc(1, sizeof(*refs));
>   struct ref_store *ref_store = (struct ref_store *)refs;
>  
> - base_ref_store_init(ref_store, &refs_be_files, submodule);
> + base_ref_store_init(ref_store, &refs_be_files);
>  
>   refs->submodule = submodule ? xstrdup(submodule) : "";

Also, can't we use xstrdup_or_null(submodule) with this step?

Re: [PATCH] rev-parse --git-path: fix output when running in a subdirectory

Hi Mike,

On Thu, 9 Feb 2017, Mike Rappazzo wrote:

> On Thu, Feb 9, 2017 at 4:48 AM, Duy Nguyen  wrote:
>
> > Relevant thread in the past [1] which fixes both --git-path and
> > --git-common-dir. I think the author dropped it somehow (or forgot
> > about it, I know I did). Sorry can't comment on that thread, or this
> > patch, yet.
> 
> I didn't exactly forget it (I have it sitting in a branch), I wasn't
> sure what else was needed (from a v5 I guess), so it has stagnated.
> 
> There was also another patch [1] at the time done by SZEDER Gábor trying
> to speed up the completion scripts by adding `git rev-parse
> --absolute-git-dir` option to deal with this case as well.
> 
> > [1] 
> > http://public-inbox.org/git/1464261556-89722-1-git-send-email-rappa...@gmail.com/
> 
> [1] http://public-inbox.org/git/20170203024829.8071-16-szeder@gmail.com/

Ah, so I was not the only person reporting this bug, but I am seemingly
having as much luck getting a fix in.

I had a quick look at your v4:
http://public-inbox.org/git/1464261556-89722-3-git-send-email-rappa...@gmail.com/

It seems you replaced the git_path() by a combination of git_dir() and
relative_path(), but that would break the use case where git_path()
handles certain arguments specially, e.g. "objects" which knows that the
.git/objects/ path can be overridden via the environment.

I tried very hard to keep that working in my patch, essentially by
emulating what git_path() does already when being called in a worktree's
subdirectory: make the path absolute.

Ciao,
Johannes

Re: [PATCH 3/5] register_ref_store(): new function

Michael Haggerty  writes:

> Move the responsibility for registering the ref_store for a submodule
> from base_ref_store_init() to a new function, register_ref_store(). Call
> the latter from ref_store_init().
>
> This means that base_ref_store_init() can lose its submodule argument,
> further weakening the 1:1 relationship between ref_stores and
> submodules.

OK.  I think I am starting to get it.  I've always found it
disturbing that for_each_*ref*() has a "submodule" variant.
Instead, the plan (outlined in the discussion from yesterday that
triggered your posting this series) is to give an API to request a
"ref-store", and then ask that object to iterate over refs, and
these steps get us closer to that goal, because the "to enumerate
these" won't have to know what set of refs a ref-store contains.  If
you want to iterate over refs in a submodule, you grab a ref-store
for the submodule and ask it to iterate.  Iterating over refs in
another worktree, you grab a different ref-store and ask it to
iterate using the same API.

Sounds like a good direction to go.

Re: Bug with fixup and autosquash

Hi Ashutosh and Junio,

On Wed, 8 Feb 2017, Junio C Hamano wrote:

> Ashutosh Bapat  writes:
> 
> > I have been using git rebase heavily these days and seem to have found
> > a bug.
> >
> > If there are two commit messages which have same prefix e.g.
> > yy This is prefix
> > xx This is prefix and message
> >
> > xx comitted before yy
> >
> > Now I commit a fixup to yy using git commit --fixup yy
> > zz fixup! This is prefix
> >
> > When I run git rebase -i --autosquash, the script it shows me looks like
> > pick xx This is prefix and message
> > fixup zz fixup! This is prefix
> > pick yy This is prefix
> >
> > I think the correct order is
> > pick xx This is prefix and message
> > pick yy This is prefix
> > fixup zz fixup! This is prefix
> >
> > Is that right?
> 
> [...]
> 
> Unfortunately, "rebase -i --autosquash" reorders the entries by
> identifying the commit by its title, and it goes with prefix match so
> that fix-up commits created without using --fixup option but manually
> records the title's prefix substring can also work.  

This prefix match also happens to introduce a serious performance problem,
which is why I "fixed" this issue in the rebase--helper already (which is
the case if you are using Git for Windows, whose master branch builds on
Linux and MacOSX as well). I quoted "fix" because my motivation was to fix
the performance problem, not the "incorrect match" problem.

The rebase--helper code (specifically, the patch moving autosquash logic
into it: https://github.com/dscho/git/commit/7d0831637f) tries to match
exact onelines first, and falls back to prefix matching only after that.

Now that the sequencer-i patch series is in `master`, the next step is to
send the patch series introducing the rebase--helper. The patch series
including the fix discussed above relies on that one. Meaning that it will
take a while to get through the mill.

So please do not hold your breath until this feature/fix hits an official
Git version. If you need it[*1*] faster, feel free to build Git for
Windows' master and run with that for a while.

Ciao,
Johannes

Footnote: By "it" I mean "the feature/fix", not "an official Git version"
nor "your breath".

gitk bug: file select in the tree mode

2017-02-09 Thread Anatoly Borodin

Hi All!

There is a bug in gitk (e.g. 2.11.0):

1) Choose a repository with files in a subdir (git's repo for example).
2) `cd` to a subdir (e.g. `xdiff`).
3) Run `gitk`.
4) Select 'Tree' in the 'Patch / Tree' panel.
5) Select any file with 'Highlight this too' or 'Highlight this only'
(e.g `xmerge.c`).
6) See the short file name (`xmerge.c`) instead of the full path
(`xdiff/xmerge.c`) in the 'Find commit touching path:' edit field. No
commits touching the file can be found.

-- 
Mit freundlichen Grüßen,
Anatoly Borodin

Re: What's cooking in git.git (Feb 2017, #02; Mon, 6)

Hi Junio,

On Thu, 9 Feb 2017, Junio C Hamano wrote:

> Johannes Schindelin  writes:
> 
> > (And that would have to be handled at a different point, as I had
> > pointed out, so that suggested preparation would most likely not help
> > at all.)
> 
> I did not think "that would have to be handled at a different point"
> is correct at all, if by "a point" you meant "a location in the
> code" [*1*].

Yes, I mean the location in the code.

But since you keep insisting that you are right and I am wrong, and even
go so far as calling your patch reverting my refactoring a hot-fix, why
don't you just go ahead and merge the result over my objections?

Ciao,
Johannes

Re: What's cooking in git.git (Feb 2017, #02; Mon, 6)

Michael Haggerty  writes:

> On 02/06/2017 11:34 PM, Junio C Hamano wrote:
>> [...]
>> --
>> [Stalled]
>> [...]
>> * mh/ref-remove-empty-directory (2017-01-07) 23 commits
>>  - files_transaction_commit(): clean up empty directories
>>  - try_remove_empty_parents(): teach to remove parents of reflogs, too
>>  - try_remove_empty_parents(): don't trash argument contents
>>  - try_remove_empty_parents(): rename parameter "name" -> "refname"
>>  - delete_ref_loose(): inline function
>>  - delete_ref_loose(): derive loose reference path from lock
>>  - log_ref_write_1(): inline function
>>  - log_ref_setup(): manage the name of the reflog file internally
>>  - log_ref_write_1(): don't depend on logfile argument
>>  - log_ref_setup(): pass the open file descriptor back to the caller
>>  - log_ref_setup(): improve robustness against races
>>  - log_ref_setup(): separate code for create vs non-create
>>  - log_ref_write(): inline function
>>  - rename_tmp_log(): improve error reporting
>>  - rename_tmp_log(): use raceproof_create_file()
>>  - lock_ref_sha1_basic(): use raceproof_create_file()
>>  - lock_ref_sha1_basic(): inline constant
>>  - raceproof_create_file(): new function
>>  - safe_create_leading_directories(): set errno on SCLD_EXISTS
>>  - safe_create_leading_directories_const(): preserve errno
>>  - t5505: use "for-each-ref" to test for the non-existence of references
>>  - refname_is_safe(): correct docstring
>>  - files_rename_ref(): tidy up whitespace
>> 
>>  Deletion of a branch "foo/bar" could remove .git/refs/heads/foo
>>  once there no longer is any other branch whose name begins with
>>  "foo/", but we didn't do so so far.  Now we do.
>> 
>>  Expecting a reroll.
>>  cf. <5051c78e-51f9-becd-e1a6-9c0b781d6...@alum.mit.edu>
>
> I think you missed v4 of this patch series [1], which is the re-roll
> that you were waiting for. And I missed that you missed it...
>
> Michael
>
> [1] http://public-inbox.org/git/cover.1483719289.git.mhag...@alum.mit.edu/

Actually it was worse than that.  What the above lists *is* v4; I
just failed to update "Expecting a reroll" note when I updated the
topic with your rerolled patches, and left it there trusting the
now-stale note of mine.

Sorry, and a HUGE thanks for noticing the mistake.

Re: [PATCH 0/5] Store submodules in a hash, not a linked list

On Thu, Feb 09, 2017 at 02:26:57PM +0100, Michael Haggerty wrote:

> I have mentioned this patch series on the mailing list a couple of
> time [1,2] but haven't submitted it before. I just rebased it to
> current master. It is available from my Git fork [3] as branch
> "submodule-hash".
> 
> The first point of this patch series is to optimize submodule
> `ref_store` lookup by storing the `ref_store`s in a hashmap rather
> than a linked list. But a more interesting second point is to weaken
> the 1:1 relationship between submodules and `ref_stores` a little bit
> more.

Sounds good. I remember this had been discussed before due to
performance issues with resolve_gitlink_ref(), and we took a different
route (not populating non-submodule entries). I think it's nice to have
both optimizations, though, as they hit different use cases.

> A `files_ref_store` would be perfectly happy to represent, say, the
> references *physically* stored in a linked worktree (e.g., `HEAD`,
> `refs/bisect/*`, etc) even though that is not the complete collection
> of refs that are *logically* visible from that worktree (which
> includes references from the main repository, too). But the old code
> was confusing the two things by storing "submodule" in every
> `ref_store` instance.
> 
> So push the submodule attribute down to the `files_ref_store` class
> (but continue to let the `ref_store`s be looked up by submodule).

I'm not sure I understand all of the ramifications here. It _sounds_ like
pushing this down into the files-backend code would make it harder to
have mixed ref-backends for different submodules. Or is this just
pushing down an implementation detail of the files backend, and future
code is free to have as many different ref_stores as it likes?

-Peff

Re: [PATCH 0/5] Store submodules in a hash, not a linked list

Michael Haggerty  writes:

> I have mentioned this patch series on the mailing list a couple of
> time [1,2] but haven't submitted it before. I just rebased it to
> current master. It is available from my Git fork [3] as branch
> "submodule-hash".
>
> The first point of this patch series is to optimize submodule
> `ref_store` lookup by storing the `ref_store`s in a hashmap rather
> than a linked list. But a more interesting second point is to weaken
> the 1:1 relationship between submodules and `ref_stores` a little bit
> more.
>
> A `files_ref_store` would be perfectly happy to represent, say, the
> references *physically* stored in a linked worktree (e.g., `HEAD`,
> `refs/bisect/*`, etc) even though that is not the complete collection
> of refs that are *logically* visible from that worktree (which
> includes references from the main repository, too). But the old code
> was confusing the two things by storing "submodule" in every
> `ref_store` instance.
>
> So push the submodule attribute down to the `files_ref_store` class
> (but continue to let the `ref_store`s be looked up by submodule).
>
> The last commit is relatively orthogonal to the others; it simplifies
> read_loose_refs() by calling resolve_ref_recursively() directly using
> the `ref_store` instance that it already has in hand, rather than
> indirectly via the public wrappers.
>
> Michael
>
> [1] 
> http://public-inbox.org/git/341999fc-4496-b974-c117-c18a2fca1...@alum.mit.edu/
> [2] 
> http://public-inbox.org/git/37fe2024-0378-a974-a28d-18a89d3e2...@alum.mit.edu/
> [3] https://github.com/mhagger/git
>
> Michael Haggerty (5):
>   refs: store submodule ref stores in a hashmap
>   refs: push the submodule attribute down
>   register_ref_store(): new function
>   files_ref_store::submodule: use NULL for the main repository
>   read_loose_refs(): read refs using resolve_ref_recursively()

Thanks.  Will queue on mh/submodule-hash forked from 'maint'.

Re: Bug with fixup and autosquash

Michael J Gruber  writes:

> Junio C Hamano venit, vidit, dixit 08.02.2017 23:55:
>
>> Let's hear from some of those (Cc'ed) who were involved in an
>> earlier --autosquash thread.
>> 
>> https://public-inbox.org/git/cover.1259934977.git.mhag...@alum.mit.edu/
>> 
>> 
>> [Footnote]
>> 
>> *1* "rebase -i --autosquash" does understand "fixup! yy", so if
>> you are willing to accept the consequence of not being able to
>> rebase twice, you could instead do
>> 
>>  $ git commit -m "fixup! yy"
>> 
>> I would think.
>
> Doesn't this indicate that rebase is fine as is?

Not really, unless you ignore "if you are willing to accept" part,
which actually is a big downside.  And --fixup-fixed will make it
worse, unfortunately.

> - teach "git commit --fixup=" to check for duplicates (same prefix,
> maybe only in "recent" history) and make it issue a warning, say:

This is a very good idea worth pursuing, and (I didn't think things
through, though) we may even be able to bound "recent" without
heuristics---scanning between  and the tip for duplicates might
be sufficient.

> Additionally, we could teach commit --fixup-fixed to commit -m "fixup!
>  " so that we have both uniqueness and verbosity in the
> rebase-i-editor. This would allow "rebase -i" to fall back to the old
> mode when "" is not in the range it operates on.

This is also a possibility, but it needs cooperation between both
"commit" and "rebase -i".

I personally do not think rewriting "fixup! yy" on the title
during rebase is worth doing, but that is not because I have a
concrete reason against it but it just sounds like too much magic to
my gut feeling.  Perhaps it can be done reliably with minimal change
to the code.  I dunno.

Thanks.

Re: [PATCH 1/5] refs: store submodule ref stores in a hashmap

On Thu, Feb 09, 2017 at 06:40:29PM +0100, Michael Haggerty wrote:

> On 02/09/2017 05:58 PM, Stefan Beller wrote:
> >> @@ -1402,17 +1435,17 @@ struct ref_store *ref_store_init(const char 
> >> *submodule)
> >>
> >>  struct ref_store *lookup_ref_store(const char *submodule)
> >>  {
> > 
> >> +   if (!submodule_ref_stores.tablesize)
> >> +   hashmap_init(&submodule_ref_stores, submodule_hash_cmp, 
> >> 20);
> > 
> > 
> > So we can lookup a submodule even before we initialized the subsystem?
> > Does that actually happen? (It sounds like a bug to me.)
> >
> > Instead of initializing, you could return NULL directly here.
> [...]
> I suppose this code path could be changed to return NULL without
> initializing the hashmap, but the hashmap will be initialized a moment
> later by ref_store_init(), so I don't see much difference either way.

I faced a similar issue when adding the oidset API recently: 

http://public-inbox.org/git/20170208205307.uvgj3saf3cyrv...@sigill.intra.peff.net/

I came to the same conclusion that it doesn't matter much in practice. A
nice thing about "return NULL" is that you do not have to duplicate the
initialization which happens on the "write" side (so if you ever changed
submodule_hash_cmp, for example, it needs changed in both places).

I also used the "cmpfn" member to check whether the table had been
initialized, which matches existing uses elsewhere. But I do notice that
the documentation explicitly says "tablesize" is good for this purpose,
so it's probably a better choice.

-Peff

Re: [PATCH v3 00/27] Revamp the attribute system; another round

Brandon Williams  writes:

> At least v3 gets the attribute system to a state where further
> improvements should be relatively easy to make.  And now as long as each
> thread has a unique attr_check structure, multiple callers can exist
> inside the attribute system at the same time.  There is still more work
> to be done on it though.  Still my biggest complaint is the "direction"
> aspect of the system.  I would love to also eliminate that as global
> state at some point though I'm not sure how at this point.

We are in agreement 100% ;-) The "direction" was the last thorn I
was fighting with (without successfully coming up with a usable
solution) when I stopped working on my original series before Stefan
took it over.

Re: What's cooking in git.git (Feb 2017, #02; Mon, 6)

Michael Haggerty  writes:

> On 02/06/2017 11:34 PM, Junio C Hamano wrote:
>> [...]
>> --
>> [Stalled]
>> [...]
>> * mh/ref-remove-empty-directory (2017-01-07) 23 commits
>> 
>>  Deletion of a branch "foo/bar" could remove .git/refs/heads/foo
>>  once there no longer is any other branch whose name begins with
>>  "foo/", but we didn't do so so far.  Now we do.
>> 
>>  Expecting a reroll.
>>  cf. <5051c78e-51f9-becd-e1a6-9c0b781d6...@alum.mit.edu>
>
> I think you missed v4 of this patch series [1], which is the re-roll
> that you were waiting for. And I missed that you missed it...
>
> Michael
>
> [1] http://public-inbox.org/git/cover.1483719289.git.mhag...@alum.mit.edu/

Great.  Thanks.

Re: Google Doc about the Contributors' Summit

Johannes Schindelin writes:

> I just started typing stuff up in a Google Doc, and made it editable to
> everyone, feel free to help and add things:
>
> https://docs.google.com/document/d/1KDoSn4btbK5VJCVld32he29U0pUeFGhpFxyx9ZJDDB0/edit?usp=sharing

Thanks for a write-up, Dscho.

List of bullet points just make non-attendees envious, imagining
that attendees had all the fun discussing what is behind these
bullet points, without being able to know what was discussed, if
they reached consensus and what the consensus was, but it is OK ;-)

A few items caught my attention, not because I found them more
important than other items on the page, but because they seem to
want my input without directly asking me ;-)

* If Junio would accept patches by replying to the sender with an
ID and/or a patch name. He picks this (branch) name when he gets
your patch.

I am not sure what exactly I am asked to "accept" here. I sometimes
forget to respond with "Thanks, will queue." to the patch message
and whoever said this wants me to consistently do so? I never say
"... will queue as js/difftool-builtin topic." mainly because I do
not know what the name of the topic branch should be at the point of
reading the patch. All I know is that decided that it may be worth
queuing, so it is a bit harder to arrange. But I can certainly try
if it makes the lives of contributors easier.

* Junio has a script for the todo branch which we can use to
generate the what's cooking branch. Perhaps we could continuously
generate this onto a webpage.

I have no objection, but I doubt that people find such an auto
generated version all that useful, as "git log --first-parent
origin/master..origin/pu" would tell exactly the same story. It
will lack the "topic summary" comment I have yet to write, if the
final 'pu' of the day that was pushed out was before my local update
to the draft of the next issue of "What's cooking" report [*1*], and
would not have any update on the next action (e.g. "Will merge to
...") relative to the latest issue of "What's cooking" report. IOW,
such an auto-generated report lacks all the added value over "git log"
output.

* Making the actual workflow more publicly known, e.g. document how
to generate the cooking email, to learn about the state of a
generation.

The exact mechanics of "how to generate" may be of less importance
than "how the information contained therein relates to their own
work" to the contributors, and I think MaintNotes that is sent out
at key milestones more or less covers the mechanics, but here is how
the sausage is made these days:

- I find a patch series on the list that is in good enough shape to
be in 'pu'. Perhaps it was already discussed and redone a few
times without hitting my tree. Or it may be the first attempt of
a new topic. I come up with a topic name, decide where the topic
should fork at [*2*], create the topic branch and queue the
patches. I may or may not test the topic in isolation at this
point.

- I may find an updated patch series of what has been queued. I go
to the existing topic branch and replace it (I try to keep it
forked at the same commit) after inspecting what got updated.
"git tbdiff" is a useful program to help this step. I may or may
not test the topic in isolation at this point.

- I repeat the above two for various topics during the day, and at
some point between 14:00-15:00 my time, stop taking new patches.
The day's integration cycle starts.

- If there are topics that have cooked long enough in 'next' and
planned to graduate to 'master', merge [*3*] them to 'master',
update the draft Release notes. Otherwise skip this step.

- If there are topics that have cooked long enough in 'pu' and
planned to graduate to 'next', merge them to 'next'. Otherwise
skip this step.

- If I updated 'master', merge its tip to 'next' (this should update
the draft release notes and nothing else, unless I took something
directly to 'master').

- Then I recreate 'jch', which is a point between 'master' and 'pu'.
This is the version I use for my own work, contains all topics in
'next' but a bit more. "git checkout -B jch master" begins it,
and then the topics that were in 'jch' are merged on top. The
latest version of updated topics that were already in 'jch' are
incorporated at this point, and "git diff jch@{20.hours}" would
show the effect of their interdiff (plus RelNotes update, if
'master' was updated during this integration cycle).

- I ran "git branch --no-merged pu --sort=-committerdate" to see the
topics that are new; the top of this list shows new topics and
updated topics (note that I just updated 'jch' but not 'pu' yet at
this point). Among them, I pick the ones that I am willing to be
a guinea-pig for before they hit 'next' and merge them to 'jch'.
Other topics that used to be in 'pu' may also be merged at this
point, when they turn out to be "interesting" enough.

- 'pu' is recre

Re: [PATCH/RFC] WIP: log: allow "-" as a short-hand for "previous branch"

2017-02-09 Thread Siddharth Kannan

On 9 February 2017 at 17:55, Matthieu Moy  wrote:
>
>> [...]
>> As you might notice, in this list, most commands are not of the `rm` variety,
>> i.e. something that would delete stuff.
>
> OK, I think I'm convinced.

I am glad! :)

>
> Keep the arguments in mind when polishing the commit message.

I will definitely do that. I am working on a good commit message for
this by looking at some past changes to sha1_name.c which have
affected multiple commands.

>
> --
> Matthieu Moy
> http://www-verimag.imag.fr/~moy/

-- 

Best Regards,

- Siddharth.

Re: [PATCH 3/5] register_ref_store(): new function

On 02/09/2017 06:20 PM, Stefan Beller wrote:
> On Thu, Feb 9, 2017 at 5:27 AM, Michael Haggerty  wrote:
>> Move the responsibility for registering the ref_store for a submodule
>> from base_ref_store_init() to a new function, register_ref_store(). Call
>> the latter from ref_store_init().
>>
>> This means that base_ref_store_init() can lose its submodule argument,
>> further weakening the 1:1 relationship between ref_stores and
>> submodules.
>>
>> Signed-off-by: Michael Haggerty 
>> ---
> 
> 
> 
> 
>> +
>>  struct ref_store *ref_store_init(const char *submodule)
>>  {
>> const char *be_name = "files";
>> struct ref_storage_be *be = find_ref_storage_backend(be_name);
>> +   struct ref_store *refs;
>>
>> if (!be)
>> die("BUG: reference backend %s is unknown", be_name);
>>
>> if (!submodule || !*submodule)
>> -   return be->init(NULL);
>> +   refs = be->init(NULL);
>> else
>> -   return be->init(submodule);
>> +   refs = be->init(submodule);
>> +
>> +   register_ref_store(refs, submodule);
>> +   return refs;
>>  }
> 
> This function is already very readable, though maybe it would be
> more readable like so:
> 
> {
> const char *be_name = "files";
> struct ref_storage_be *be = find_ref_storage_backend(be_name);
> 
> if (!be)
> die("BUG: reference backend %s is unknown", be_name);
> 
> /* replace empty string by NULL */
> if (submodule && !*submodule)
> submodule = NULL;
> 
> register_ref_store(be->init(submodule), submodule);
> return refs;
> }
> 
> Well, I dunno; the function inside the arguments to register seems ugly, 
> though.

Nit: you forgot to define and initialize `refs` (for returning to the
caller).

Actually, there is an inconsistency between the docstring for this
function and its behavior. The docstring claims that it can handle
`submodule == ""`, and it tries to do so, but incorrectly. The problem
is that it passes an un-cleaned-up `submodule` to
`register_ref_store()`, which is expecting a cleaned-up one.

But this function is only called by get_ref_store(), which has already
arranged for the empty string to be passed along as NULL.

In fact, the only external entry point into these functions is
`get_ref_store()`. I think what I should do is make the other functions
private, and remove their attempts to handle `submodule == ""`. I'll fix
this up in a re-roll.

(I wonder whether anybody really passes the empty string into this
method. It never happens in the test suite. I doubt I can muster the
energy to audit all of the call paths.)

Michael

Re: [PATCH 5/5] read_loose_refs(): read refs using resolve_ref_recursively()

On Thu, Feb 9, 2017 at 5:27 AM, Michael Haggerty  wrote:
> There is no need to call read_ref_full() or resolve_gitlink_ref() from
> read_loose_refs(), because we already have a ref_store object in hand.
> So we can call resolve_ref_recursively() ourselves. Happily, this
> unifies the code for the submodule vs. non-submodule cases.
>
> This requires resolve_ref_recursively() to be exposed to the refs
> subsystem, though not to non-refs code.
>
> Signed-off-by: Michael Haggerty 
> ---

Looks good,
Stefan

Re: [PATCH 1/5] refs: store submodule ref stores in a hashmap

On Thu, Feb 9, 2017 at 9:40 AM, Michael Haggerty  wrote:
> On 02/09/2017 05:58 PM, Stefan Beller wrote:
>>> @@ -1402,17 +1435,17 @@ struct ref_store *ref_store_init(const char 
>>> *submodule)
>>>
>>>  struct ref_store *lookup_ref_store(const char *submodule)
>>>  {
>>
>>> +   if (!submodule_ref_stores.tablesize)
>>> +   hashmap_init(&submodule_ref_stores, submodule_hash_cmp, 20);
>>
>>
>> So we can lookup a submodule even before we initialized the subsystem?
>> Does that actually happen? (It sounds like a bug to me.)
>>
>> Instead of initializing, you could return NULL directly here.
>
> The lines you quoted are only concerned with bringing the (empty)
> hashmap into existence if it hasn't been initialized already. (There's
> no HASHMAP_INIT.) I don't know what you mean by "initialize the
> subsystem". The only way to bring a ref_store *object* into existence is
> currently to call get_ref_store(submodule), which calls
> lookup_ref_store(submodule) to see if it already exists, and if not
> calls ref_store_init(submodule) to instantiate it and register it in the
> hashmap. There's nothing else that has to be initialize before that
> (except maybe the usual startup config reading etc.)
>
> I suppose this code path could be changed to return NULL without
> initializing the hashmap, but the hashmap will be initialized a moment
> later by ref_store_init(), so I don't see much difference either way.

Oh, I did not see that.

Thanks,
Stefan

>
> Thanks for your review!
> Michael
>

Re: [PATCH 1/5] refs: store submodule ref stores in a hashmap

On 02/09/2017 05:58 PM, Stefan Beller wrote:
>> @@ -1402,17 +1435,17 @@ struct ref_store *ref_store_init(const char 
>> *submodule)
>>
>>  struct ref_store *lookup_ref_store(const char *submodule)
>>  {
> 
>> +   if (!submodule_ref_stores.tablesize)
>> +   hashmap_init(&submodule_ref_stores, submodule_hash_cmp, 20);
> 
> 
> So we can lookup a submodule even before we initialized the subsystem?
> Does that actually happen? (It sounds like a bug to me.)
>
> Instead of initializing, you could return NULL directly here.

The lines you quoted are only concerned with bringing the (empty)
hashmap into existence if it hasn't been initialized already. (There's
no HASHMAP_INIT.) I don't know what you mean by "initialize the
subsystem". The only way to bring a ref_store *object* into existence is
currently to call get_ref_store(submodule), which calls
lookup_ref_store(submodule) to see if it already exists, and if not
calls ref_store_init(submodule) to instantiate it and register it in the
hashmap. There's nothing else that has to be initialize before that
(except maybe the usual startup config reading etc.)

I suppose this code path could be changed to return NULL without
initializing the hashmap, but the hashmap will be initialized a moment
later by ref_store_init(), so I don't see much difference either way.

Thanks for your review!
Michael

Re: [PATCH 3/5] register_ref_store(): new function

On Thu, Feb 9, 2017 at 5:27 AM, Michael Haggerty  wrote:
> Move the responsibility for registering the ref_store for a submodule
> from base_ref_store_init() to a new function, register_ref_store(). Call
> the latter from ref_store_init().
>
> This means that base_ref_store_init() can lose its submodule argument,
> further weakening the 1:1 relationship between ref_stores and
> submodules.
>
> Signed-off-by: Michael Haggerty 
> ---




> +
>  struct ref_store *ref_store_init(const char *submodule)
>  {
> const char *be_name = "files";
> struct ref_storage_be *be = find_ref_storage_backend(be_name);
> +   struct ref_store *refs;
>
> if (!be)
> die("BUG: reference backend %s is unknown", be_name);
>
> if (!submodule || !*submodule)
> -   return be->init(NULL);
> +   refs = be->init(NULL);
> else
> -   return be->init(submodule);
> +   refs = be->init(submodule);
> +
> +   register_ref_store(refs, submodule);
> +   return refs;
>  }

This function is already very readable, though maybe it would be
more readable like so:

{
const char *be_name = "files";
struct ref_storage_be *be = find_ref_storage_backend(be_name);

if (!be)
die("BUG: reference backend %s is unknown", be_name);

/* replace empty string by NULL */
if (submodule && !*submodule)
submodule = NULL;

register_ref_store(be->init(submodule), submodule);
return refs;
}

Well, I dunno; the function inside the arguments to register seems ugly, though.

Re: What's cooking in git.git (Feb 2017, #02; Mon, 6)

Johannes Schindelin  writes:

> (And that would have to be handled at a different point, as I
> had pointed out, so that suggested preparation would most likely not help
> at all.)

I did not think "that would have to be handled at a different point"
is correct at all, if by "a point" you meant "a location in the
code" [*1*].  If we want to make it configurable in a more detailed
way by directly allowing to override port_option and needs_batch
separately, you would do that in override_ssh_variant(), without
touching handle_ssh_variant() in the refactored code.  That way, you
do not have to worry about breaking the auto detection based on the
command name.

[Footnote]

*1* Or did you mean "point in time", aka "let's do it outside the
scope of this patch series"?

Let's not keep it as a SQUASH??? proposal, but as a separate hot-fix
follow-up patch.

-- >8 --
Subject: connect.c: stop conflating ssh command names and overrides

dd33e07766 ("connect: Add the envvar GIT_SSH_VARIANT and ssh.variant
config", 2017-02-01) attempted to add support for configuration and
environment variable to override the different handling of
port_option and needs_batch settings suitable for variants of the
ssh implementation that was autodetected by looking at the ssh
command name.  Because it piggybacked on the code that turns command
name to specific override (e.g. "plink.exe" and "plink" means
port_option needs to be set to 'P' instead of the default 'p'), yet
it defined a separate namespace for these overrides (e.g. "putty"
can be usable to signal that port_option needs to be 'P'), however,
it made the auto-detection based on the command name less robust
(e.g. the code now accepts "putty" as a SSH command name and applies
the same override).

Separate the code that interprets the override that was read from
the configuration & environment from the original code that handles
the command names, as they are in separate namespaces, to fix this
confusion.

This incidentally also makes it easier for future enhancement of the
override syntax (e.g. "port_option=p,needs_batch=1" may want to be
accepted as a more explicit syntax) without affecting the code for
auto-detection based on the command name.

While at it, update the return type of the handle_ssh_variant()
helper function to void; the caller does not use it, and the
function does not return any meaningful value.

Signed-off-by: Junio C Hamano 
---
 connect.c | 45 -
 1 file changed, 32 insertions(+), 13 deletions(-)

diff --git a/connect.c b/connect.c
index 7f1f802396..7d65c1c736 100644
--- a/connect.c
+++ b/connect.c
@@ -691,17 +691,39 @@ static const char *get_ssh_command(void)
return NULL;
 }

-static int handle_ssh_variant(const char *ssh_command, int is_cmdline,
- int *port_option, int *needs_batch)
+static int override_ssh_variant(int *port_option, int *needs_batch)
 {
-   const char *variant = getenv("GIT_SSH_VARIANT");
+   char *variant;
+
+   variant = xstrdup_or_null(getenv("GIT_SSH_VARIANT"));
+   if (!variant &&
+   git_config_get_string("ssh.variant", &variant))
+   return 0;
+
+   if (!strcmp(variant, "plink") || !strcmp(variant, "putty")) {
+   *port_option = 'P';
+   *needs_batch = 0;
+   } else if (!strcmp(variant, "tortoiseplink")) {
+   *port_option = 'P';
+   *needs_batch = 1;
+   } else {
+   *port_option = 'p';
+   *needs_batch = 0;
+   }
+   free(variant);
+   return 1;
+}
+
+static void handle_ssh_variant(const char *ssh_command, int is_cmdline,
+  int *port_option, int *needs_batch)
+{
+   const char *variant;
char *p = NULL;

-   if (variant)
-   ; /* okay, fall through */
-   else if (!git_config_get_string("ssh.variant", &p))
-   variant = p;
-   else if (!is_cmdline) {
+   if (override_ssh_variant(port_option, needs_batch))
+   return;
+
+   if (!is_cmdline) {
p = xstrdup(ssh_command);
variant = basename(p);
} else {
@@ -717,12 +739,11 @@ static int handle_ssh_variant(const char *ssh_command, 
int is_cmdline,
 */
free(ssh_argv);
} else
-   return 0;
+   return;
}

if (!strcasecmp(variant, "plink") ||
-   !strcasecmp(variant, "plink.exe") ||
-   !strcasecmp(variant, "putty"))
+   !strcasecmp(variant, "plink.exe"))
*port_option = 'P';
else if (!strcasecmp(variant, "tortoiseplink") ||
 !strcasecmp(variant, "tortoiseplink.exe")) {
@@ -730,8 +751,6 @@ static int handle_ssh_variant(const char *ssh_command, int 
is_cmdline,
*needs_batch = 1;
}
free(p);
-
-   return 1;
 }

 /*
-- 
2.12.0-rc0-2

Re: Automatically Add .gitignore Files

2017-02-09 Thread Thangalin

Hi Duy,

> This is a general problem to new files, not .gitignore alone. Can we

The difference, to me, is that a ".gitignore" file is not part of what
I'm developing. It's an artifact for git configuration. While a
.gitignore file is not always pushed to the repository, I imagine that
in most situations, it is.

Whereas when a "new" file is created, there are plenty of situations
where it shouldn't be added and thus a warning would be superfluous,
or an automatic add would be undesirable.

To solve the problem, generally, for new files while giving the user
the ability to specify exactly what "new" files should be
automatically added to the repository, something like the following
would work:

echo "**/.gitignore" >> .git/config/add-before-commit

> and perhaps you want to make it a habit to run it before committing.

Right, because software shouldn't automate repetitive tasks and humans
are never prone to forget? ;-)

Re: [PATCH 4/5] files_ref_store::submodule: use NULL for the main repository

On Thu, Feb 9, 2017 at 5:27 AM, Michael Haggerty  wrote:
> The old practice of storing the empty string in this member for the main
> repository was a holdover from before 00eebe3 (refs: create a base class
> "ref_store" for files_ref_store, 2016-09-04), when the submodule was
> stored in a flex array at the end of `struct files_ref_store`. Storing
> NULL for this case is more idiomatic and a tiny bit less code.
>
> Signed-off-by: Michael Haggerty 
> ---

Makes sense!
Stefan

Re: [PATCH v3 00/27] Revamp the attribute system; another round

2017-02-09 Thread Brandon Williams

On 02/02, Junio C Hamano wrote:
>   prepare the  array
>   for path taken from some set:
>   do_something(the array, path)
> 
> That way, do_something() do not have to keep allocating,
> initializing and destroying the array.
> 
> But after looking at the current set of codepaths, before coming to
> the conclusion that we need to split the static part that is
> specific to the callsite for git_check_attr() and the dynamic part
> that is specific to the  pair, I noticed that
> typically the callers that can prepare the array before going into
> the loop (which will eventually be spread across multiple threads)
> are many levels away in the callchain, and they are not even aware
> of what attributes are going to be requested in the leaf level
> helper functions.  In other words, the approach to hoist "the
>  array" up in the callchain would not scale.  A
> caller that loops over paths in the index and check them out does
> not want to know (and we do not want to tell it) what exact
> attributes are involved in the decision convert_to_working_tree()
> makes for each path, for example.

This was something that I was envisioning as well, though I didn't dig
very deep into the call stack.  Another means of doing this could be to
have the attr_check structure allocated and then have it configured at a
later point for the particular question being asked:

  alloc struct attr_check c;
  ... many call sites down
  configure(c, questions)
  for path
do_something(c, path)

That also allows the same structure to be reused (just reconfigured) if
different attributes are needed at a later point in time.  Of course
this is just an idea and I'm not sure if this is the best way to do it
either.

> 
> So how would we split questions and answers in a way that is not
> naive and inefficient?  
> 
> I envision that we would allow the attribute subsystem to keep track
> of the dynamic part, which will receive the answers, holds working
> area like check_all_attr[], and has the equivalent to the "attr
> stack", indexed by  pair (and the
> identification of "callsite" can be done by using the address of the
> static part, i.e. the array of questions that we initialize just
> once when interning the list of attribute names for the first time).
> 
> The API to prepare and ask for attributes may look like:
> 
>   static struct attr_static_part Q;
>   struct attr_dynamic_part *D;
> 
>   attr_check_init(&Q, "text", ...);
>   D = git_attr_check(&Q, path);
> 
> where Q contains an array of interned attributes (i.e. questions)
> and other immutable things that is unique to this callsite, but can
> be shared across multiple threads asking the same question from
> here.  As an internal implementation detail, it probably will have a
> mutex to make sure that init will run only once.
> 
> Then the implementation of git_attr_check(&Q, path) would be:
> 
> - see if there is already the "dynaic part" allocated for the
>   current thread asking the question Q.  If there is not,
>   allocate one and remember it, so that it can be reused in
>   later calls by the same thread; if there is, use that existing
>   one.
> 
> - reinitialize the "dynamic part" as needed, e.g. clear the
>   equivalent to check_all_attr[], adjust the equivalent to
>   attr_stack for the current path, etc.  Just like the current
>   code optimizes for the case where the entire program (a single
>   thread) will ask the same question for paths in traversal
>   order (i.e. falling in the same directory), this will optimize
>   for the access pattern where each thread asks the same
>   question for paths in its traversal order.
> 
> - do what the current collect_some_attrs() thing does.
> 
> And this hopefully won't be as costly as the naive and inefficient
> one.

I agree, this sort of implementation wouldn't suffer from the same
allocation penalty that the naive implementation suffers from.  This
would be slightly challenging to ensure that there aren't any memory
leaks, well not leaks but rather memory that isn't freed.  i.e. When a
thread terminates we would want to reclaim the memory used for the
dynamic part which is stored inside the attribute system.

> 
> The reason why I was pushing hard to split the static part and the
> dynamic part in our redesign of the API is primarily because I didn't
> want to update the API callers twice.  But I'd imagine that your v3
> (and your earlier "do not discard attr stack, but keep them around,
> holding their tips in a hashmap for quick reuse") would at least lay
> the foundation for the eventual shape of the API, let's bite the
> bullet and accept that we will need to update the callers again
> anyway.
> 
> Thanks.
> 

At least v3 gets the attribute system to a state where further
improvements should be relatively easy to make.  And now as long as each
thread has a unique attr_check structure, multiple callers can exist
inside the attribu

Re: [PATCH 2/5] refs: push the submodule attribute down

On Thu, Feb 9, 2017 at 5:26 AM, Michael Haggerty  wrote:
> Push the submodule attribute down from ref_store to files_ref_store.
> This is another step towards loosening the 1:1 connection between
> ref_stores and submodules.
>
> Signed-off-by: Michael Haggerty 

Looks good,
Stefan

Re: [PATCH 1/5] refs: store submodule ref stores in a hashmap

> @@ -1402,17 +1435,17 @@ struct ref_store *ref_store_init(const char 
> *submodule)
>
>  struct ref_store *lookup_ref_store(const char *submodule)
>  {

> +   if (!submodule_ref_stores.tablesize)
> +   hashmap_init(&submodule_ref_stores, submodule_hash_cmp, 20);


So we can lookup a submodule even before we initialized the subsystem?
Does that actually happen? (It sounds like a bug to me.)

Instead of initializing, you could return NULL directly here.

Otherwise looks good.

Thanks,
Stefan

Re: [RFD] should all merge bases be equal?

Michael Haggerty  writes:

> Anyway, I mostly wanted to remind you of the earlier discussion of this
> topic. There's a lot more information there.
>
> Michael
>
> [1] http://public-inbox.org/git/539a25bf.4060...@alum.mit.edu/

Your http://public-inbox.org/git/53a3f67a.80...@alum.mit.edu/ in the
thread appears to me the best place to continue exploring.

Thanks for the link.

Re: What's cooking in git.git (Feb 2017, #02; Mon, 6)

On 02/06/2017 11:34 PM, Junio C Hamano wrote:
> [...]
> --
> [Stalled]
> [...]
> * mh/ref-remove-empty-directory (2017-01-07) 23 commits
>  - files_transaction_commit(): clean up empty directories
>  - try_remove_empty_parents(): teach to remove parents of reflogs, too
>  - try_remove_empty_parents(): don't trash argument contents
>  - try_remove_empty_parents(): rename parameter "name" -> "refname"
>  - delete_ref_loose(): inline function
>  - delete_ref_loose(): derive loose reference path from lock
>  - log_ref_write_1(): inline function
>  - log_ref_setup(): manage the name of the reflog file internally
>  - log_ref_write_1(): don't depend on logfile argument
>  - log_ref_setup(): pass the open file descriptor back to the caller
>  - log_ref_setup(): improve robustness against races
>  - log_ref_setup(): separate code for create vs non-create
>  - log_ref_write(): inline function
>  - rename_tmp_log(): improve error reporting
>  - rename_tmp_log(): use raceproof_create_file()
>  - lock_ref_sha1_basic(): use raceproof_create_file()
>  - lock_ref_sha1_basic(): inline constant
>  - raceproof_create_file(): new function
>  - safe_create_leading_directories(): set errno on SCLD_EXISTS
>  - safe_create_leading_directories_const(): preserve errno
>  - t5505: use "for-each-ref" to test for the non-existence of references
>  - refname_is_safe(): correct docstring
>  - files_rename_ref(): tidy up whitespace
> 
>  Deletion of a branch "foo/bar" could remove .git/refs/heads/foo
>  once there no longer is any other branch whose name begins with
>  "foo/", but we didn't do so so far.  Now we do.
> 
>  Expecting a reroll.
>  cf. <5051c78e-51f9-becd-e1a6-9c0b781d6...@alum.mit.edu>

I think you missed v4 of this patch series [1], which is the re-roll
that you were waiting for. And I missed that you missed it...

Michael

[1] http://public-inbox.org/git/cover.1483719289.git.mhag...@alum.mit.edu/

Re: GSoC 2017: application open, deadline = February 9, 2017

On Thu, Feb 9, 2017 at 1:22 PM, Matthieu Moy
 wrote:
> Matthieu Moy  writes:
>
>> I created a Git organization and invited you + Peff as admins.

Great thanks!
I accepted the invite.

> I'll
>> start cut-and-pasting to show my good faith ;-).
>
> I created this page based on last year's:
>
> https://git.github.io/SoC-2017-Org-Application/
>
> I filled-in the "org profile". "Org application" is still TODO.

I copy pasted the Org application from last year, so I think we are good.

[PATCH v2] git-p4: fix git-p4.pathEncoding for removed files

2017-02-09 Thread Lars Schneider

In a9e38359e3 we taught git-p4 a way to re-encode path names from what
was used in Perforce to UTF-8. This path re-encoding worked properly for
"added" paths. "Removed" paths were not re-encoded and therefore
different from the "added" paths. Consequently, these files were not
removed in a git-p4 cloned Git repository because the path names did not
match.

Fix this by moving the re-encoding to a place that affects "added" and
"removed" paths. Add a test to demonstrate the issue.

Signed-off-by: Lars Schneider 
---

Hi,

unfortunately, I missed to send this v2. I agree with Luke's review and
I moved the re-encode of the path name to the `streamOneP4File` and
`streamOneP4Deletion` explicitly.

Discussion:
http://public-inbox.org/git/CAE5ih7-=bd_zol5pfyfd2qvy-xe24v_cgge0xoavuotk02e...@mail.gmail.com/

Thanks,
Lars


Notes:
Base Commit: 454cb6bd52 (v2.11.0)
Diff on Web: https://github.com/larsxschneider/git/commit/75ed3e92e2
Checkout:git fetch https://github.com/larsxschneider/git 
git-p4/fix-path-encoding-v2 && git checkout 75ed3e92e2

Interdiff (v1..v2):

diff --git a/git-p4.py b/git-p4.py
index 8f311cb4e8..dac8b4955d 100755
--- a/git-p4.py
+++ b/git-p4.py
@@ -2366,15 +2366,6 @@ class P4Sync(Command, P4UserMap):
 break

 path = wildcard_decode(path)
-try:
-path.decode('ascii')
-except:
-encoding = 'utf8'
-if gitConfig('git-p4.pathEncoding'):
-encoding = gitConfig('git-p4.pathEncoding')
-path = path.decode(encoding, 'replace').encode('utf8', 
'replace')
-if self.verbose:
-print 'Path with non-ASCII characters detected. Used %s to 
encode: %s ' % (encoding, path)
 return path

 def splitFilesIntoBranches(self, commit):
@@ -2427,11 +2418,24 @@ class P4Sync(Command, P4UserMap):
 self.gitStream.write(d)
 self.gitStream.write('\n')

+def encodeWithUTF8(self, path):
+try:
+path.decode('ascii')
+except:
+encoding = 'utf8'
+if gitConfig('git-p4.pathEncoding'):
+encoding = gitConfig('git-p4.pathEncoding')
+path = path.decode(encoding, 'replace').encode('utf8', 
'replace')
+if self.verbose:
+print 'Path with non-ASCII characters detected. Used %s to 
encode: %s ' % (encoding, path)
+return path
+
 # output one file from the P4 stream
 # - helper for streamP4Files

 def streamOneP4File(self, file, contents):
 relPath = self.stripRepoPath(file['depotFile'], 
self.branchPrefixes)
+relPath = self.encodeWithUTF8(relPath)
 if verbose:
 size = int(self.stream_file['fileSize'])
 sys.stdout.write('\r%s --> %s (%i MB)\n' % (file['depotFile'], 
relPath, size/1024/1024))
@@ -2511,6 +2515,7 @@ class P4Sync(Command, P4UserMap):

 def streamOneP4Deletion(self, file):
 relPath = self.stripRepoPath(file['path'], self.branchPrefixes)
+relPath = self.encodeWithUTF8(relPath)
 if verbose:
 sys.stdout.write("delete %s\n" % relPath)
 sys.stdout.flush()

 git-p4.py   | 24 ++--
 t/t9822-git-p4-path-encoding.sh | 16 
 2 files changed, 30 insertions(+), 10 deletions(-)

diff --git a/git-p4.py b/git-p4.py
index fd5ca52462..dac8b4955d 100755
--- a/git-p4.py
+++ b/git-p4.py
@@ -2418,11 +2418,24 @@ class P4Sync(Command, P4UserMap):
 self.gitStream.write(d)
 self.gitStream.write('\n')

+def encodeWithUTF8(self, path):
+try:
+path.decode('ascii')
+except:
+encoding = 'utf8'
+if gitConfig('git-p4.pathEncoding'):
+encoding = gitConfig('git-p4.pathEncoding')
+path = path.decode(encoding, 'replace').encode('utf8', 'replace')
+if self.verbose:
+print 'Path with non-ASCII characters detected. Used %s to 
encode: %s ' % (encoding, path)
+return path
+
 # output one file from the P4 stream
 # - helper for streamP4Files

 def streamOneP4File(self, file, contents):
 relPath = self.stripRepoPath(file['depotFile'], self.branchPrefixes)
+relPath = self.encodeWithUTF8(relPath)
 if verbose:
 size = int(self.stream_file['fileSize'])
 sys.stdout.write('\r%s --> %s (%i MB)\n' % (file['depotFile'], 
relPath, size/1024/1024))
@@ -2495,16 +2508,6 @@ class P4Sync(Command, P4UserMap):
 text = regexp.sub(r'$\1$', text)
 contents = [ text ]

-try:
-relPath.decode('ascii')
-except:
-encoding = 'utf8'
-if gitConfig('git-p4.pathEncoding'):
-

Re: Bug with fixup and autosquash

2017-02-09 Thread Michael J Gruber

Junio C Hamano venit, vidit, dixit 08.02.2017 23:55:
> Ashutosh Bapat  writes:
> 
>> I have been using git rebase heavily these days and seem to have found a bug.
>>
>> If there are two commit messages which have same prefix e.g.
>> yy This is prefix
>> xx This is prefix and message
>>
>> xx comitted before yy
>>
>> Now I commit a fixup to yy using git commit --fixup yy
>> zz fixup! This is prefix
>>
>> When I run git rebase -i --autosquash, the script it shows me looks like
>> pick xx This is prefix and message
>> fixup zz fixup! This is prefix
>> pick yy This is prefix
>>
>> I think the correct order is
>> pick xx This is prefix and message
>> pick yy This is prefix
>> fixup zz fixup! This is prefix
>>
>> Is that right?
> 
> Because "commit" pretends as if it took the exact commit object name
> to be fixed up (after all, it accepts "yy" that is a name of the
> commit object), it would be nice if the fixup is applied to that
> exact commit, even if you had many commits that share exactly the
> same title (i.e. not just shared prefix).
> 
> Unfortunately, "rebase -i --autosquash" reorders the entries by
> identifying the commit by its title, and it goes with prefix match
> so that fix-up commits created without using --fixup option but
> manually records the title's prefix substring can also work.  
> 
> We could argue that the logic should notice that there is one exact
> match and another non-exact prefix match and favor the former, and
> certainly such a change would make your made-up example (you didn't
> actually have a commit whose title is literally "This is prefix")
> above work better.
> 
> But I am not sure if adding such heuristics would really help in
> general.  It would not help those whose commits are mostly titled
> ultra-vaguely, like "fix", "bugfix", "docfix", etc.
> 
> Another possibility is to teach "commit --fixup" to cast exact
> commit object name in stone.  That certainly would solve your
> immediate problem, but it has a grave negative impact when the user
> rebases the same topic many times (which happens often).
> 
> For example, I may have a series of commits A and B, notice that A
> could be done a bit better and have "fixup A" on top, build a new
> commit C on it, and then realize that the next step (i.e. D) would
> need support from a newer codebase than where I started (i.e. A^).
> 
> At that point, I would have a 4-commit series (A, B, "fixup A", and
> C), and I would rebase them on top of something newer.  I am
> undecided if that "fixup A" is really an improvement or unnecessary,
> when I do this, but I do know that I want to build the series on top
> of a different commit.  So I do rebase without --autosquash (I would
> probably rebase without --interactive for this one).
> 
> Then I keep working and add a new commit D on top.  After all that,
> I would have a more-or-less completed series and would be ready to
> re-assess the whole series.  I perhaps decide that "fixup A" is
> really an improvement.  And then I would "rebase -i" to squash the
> fix-up into A.
> 
> But notice that at this point, what we are calling A has different
> object name than the original A the fixup was written for, because
> we rebased once on top of a newer codebase.  That commit can still
> be identified by its title, but not with its original commit object
> name.
> 
> I think that is why "commit --fixup " turns the commit
> object name (your "yy") into a string (your "This is prefix")
> and that is a reasonable design decision [*1*].
> 
> So from that point of view, if we were to address your issue, it
> should happen in "rebase -i --autosquash" side, not "commit --fixup"
> side.
> 
> Let's hear from some of those (Cc'ed) who were involved in an
> earlier --autosquash thread.
> 
> https://public-inbox.org/git/cover.1259934977.git.mhag...@alum.mit.edu/
> 
> 
> [Footnote]
> 
> *1* "rebase -i --autosquash" does understand "fixup! yy", so if
> you are willing to accept the consequence of not being able to
> rebase twice, you could instead do
> 
>   $ git commit -m "fixup! yy"
> 
> I would think.

Doesn't this indicate that rebase is fine as is? How about:

- introduce "git commit --fixup-fixed=" or the like which parses
 commits "-m fixup! "

- teach "git commit --fixup=" to check for duplicates (same prefix,
maybe only in "recent" history) and make it issue a warning, say:

prefix  matches the following commits:
 
Issue
git commit --amend -m 'fixup! '
to fixup a specific commit.

(Or git commit --amend --fixup-fixed= if that flies.)

Additionally, we could teach commit --fixup-fixed to commit -m "fixup!
 " so that we have both uniqueness and verbosity in the
rebase-i-editor. This would allow "rebase -i" to fall back to the old
mode when "" is not in the range it operates on. We could also try
to rewrite s when rebasing (without squashing) fixup!-commits, of
course.

Michael

Re: [RFD] should all merge bases be equal?

On 10/18/2016 12:28 AM, Junio C Hamano wrote:
> [...]
> Being accustomed how fast my merges go, there is one merge that
> hiccups for me every once in a few days: merging back from 'master'
> to 'next'.  [...]
> 
> The reason why this merge is slow is because it typically have many
> merge bases.  [...]

I overlooked this topic until just now :-(

I spent a lot of time looking at merge bases a couple of years ago [1],
originally motivated by the crappy diffs you get from

git diff master...branch

when the merge base is chosen poorly. In that email I include a lot of
data and suggest a different heuristic, namely to define the "best"
merge base $M to be the one that minimizes the number of non-merge
commits between $M and either of the branch tips (it doesn't matter
which one you choose); i.e., the one that minimizes

git rev-list --count --no-merges $M..$TIP

. The idea is that a merge base that is "closer" content-wise to the
tips will probably yield smaller diffs. I would expect that merge base
also to yield simpler merges, though I didn't test that. Relying on the
number of commits (rather than some other measure of how much the
content has been changed) is only a heuristic, but it seems to work well
and it can be implemented pretty cheaply.

We actually use an algorithm like the one I described at GitHub, though
it is implemented as a script rather than ever having been integrated
into git. And (for no particular reason) we include merge commits in the
commit count (it doesn't make much difference whether merges are
included or excluded).

Your idea to look at the first-parent histories of the two branch tips
is an interesting one and has the nice theoretical property that it is
based on the DAG topology rather than a count of commits. I'd be very
curious to see how the sizes of asymmetric diffs differ between your
method and mine, because for me smaller and more readable diffs are one
of the main benefits of better merge bases.

I would worry a bit that your proposed algorithm won't perform as well
for people who use less disciplined workflows than git.git or the Linux
kernel. For example, many people merge a lot more frequently and
chaotically, maybe even with the parents reversed from the canonical order.

Anyway, I mostly wanted to remind you of the earlier discussion of this
topic. There's a lot more information there.

Michael

[1] http://public-inbox.org/git/539a25bf.4060...@alum.mit.edu/

Re: GSoC 2017: application open, deadline = February 9, 2017

Matthieu Moy  writes:

> Matthieu Moy  writes:
>
>> I created a Git organization and invited you + Peff as admins. I'll
>> start cut-and-pasting to show my good faith ;-).
>
> I created this page based on last year's:
>
> https://git.github.io/SoC-2017-Org-Application/
>
> I filled-in the "org profile". "Org application" is still TODO.

PS: I didn't hear from the libgit2 folks, so I think it's reasonable to
consider that we don't apply for them.

-- 
Matthieu Moy
http://www-verimag.imag.fr/~moy/

Re: [PATCH/RFC] WIP: log: allow "-" as a short-hand for "previous branch"

Siddharth Kannan  writes:

> Hello Matthieu,
>
> On 8 February 2017 at 20:10, Matthieu Moy  
> wrote:
>> In a previous discussion, I made an analogy with "cd -" (which is the
>> source of inspiration of this shorthand AFAIK): "-" did not magically
>> become "the last visited directory" for all Unix commands, just for
>> "cd". And in this case, I'm happy with it. For example, I never need
>> "mkdir -", and I'm happy I can't "rm -fr -" by mistake.
>>
>> So, it's debatable whether it's a good thing to have all commands
>> support "-". For example, forcing users to explicitly type "git branch
>> -d @{1}" and not providing them with a shortcut might be a good thing.
>
> builtin/branch.c does not call setup_revisions and remains unaffected
> by this patch :)

Right, I forgot this: in some place we need any revspec, but "branch -d"
needs a branch name explicitly.

> [...]
> As you might notice, in this list, most commands are not of the `rm` variety,
> i.e. something that would delete stuff.

OK, I think I'm convinced.

Keep the arguments in mind when polishing the commit message.

-- 
Matthieu Moy
http://www-verimag.imag.fr/~moy/

Re: [PATCH] rev-parse --git-path: fix output when running in a subdirectory

2017-02-09 Thread Mike Rappazzo

On Thu, Feb 9, 2017 at 4:48 AM, Duy Nguyen  wrote:
> On Wed, Feb 8, 2017 at 7:17 PM, Johannes Schindelin
>  wrote:
>> In addition to making git_path() aware of certain file names that need
>> to be handled differently e.g. when running in worktrees, the commit
>> 557bd833bb (git_path(): be aware of file relocation in $GIT_DIR,
>> 2014-11-30) also snuck in a new option for `git rev-parse`:
>> `--git-path`.
>>
>> On the face of it, there is no obvious bug in that commit's diff: it
>> faithfully calls git_path() on the argument and prints it out, i.e. `git
>> rev-parse --git-path ` has the same precise behavior as
>> calling `git_path("")` in C.
>>
>> The problem lies deeper, much deeper. In hindsight (which is always
>> unfair), implementing the .git/ directory discovery in
>> `setup_git_directory()` by changing the working directory may have
>> allowed us to avoid passing around a struct that contains information
>> about the current repository, but it bought us many, many problems.
>
> Relevant thread in the past [1] which fixes both --git-path and
> --git-common-dir. I think the author dropped it somehow (or forgot
> about it, I know I did). Sorry can't comment on that thread, or this
> patch, yet.

I didn't exactly forget it (I have it sitting in a branch), I wasn't
sure what else was needed (from a v5 I guess), so it has stagnated.

There was also another patch [1] at the time done by SZEDER Gábor
trying to speed up the completion scripts by adding `git rev-parse
--absolute-git-dir` option to deal with this case as well.

>
> [1] 
> http://public-inbox.org/git/1464261556-89722-1-git-send-email-rappa...@gmail.com/
> --
> Duy

[1] http://public-inbox.org/git/20170203024829.8071-16-szeder@gmail.com/

Re: [PATCH 6/7] completion: teach remote subcommands option completion

2017-02-09 Thread Cornelius Weig



On 02/02/2017 02:37 AM, SZEDER Gábor wrote:
> The 'set-head' subcommand has '--auto' and '--delete' options, and
> 'set-branches' has '--add'.

Oops. Thanks for spotting this..

>>  __git_complete_remote_or_refspec
>>  ;;
>> -update)
>> +update,*)
> 
> The 'update' subcommand has a '--prune' option.
> 

..and that.

Re: Cross-referencing the Git mailing list archive with their corresponding commits in `pu`

2017-02-09 Thread Lars Schneider


> On 06 Feb 2017, at 20:10, Junio C Hamano  wrote:
> 
> Johannes Schindelin  writes:
> 
>> So I thought maybe the From: line (from the body, if available, otherwise
>> from the header) in conjunction with the "Date:" header would work.
> 
> FYI, I use a post-applypatch hook to populate refs/notes/amlog notes
> tree when I queue a new patch; I am not sure how well the notes in
> it are preserved across rebases, but it could be a good starting
> point.  The notes tree is mirrored at git://github.com/git/gitster
> repository.
> 
> E.g.
> 
> $ git show --notes=amlog --stat

That's super useful! Thanks for the pointer!
Wouldn't it make sense to push these notes to github.com/git/git ?

- Lars

Re: Request re git status

2017-02-09 Thread Cornelius Weig

On 02/07/2017 01:45 AM, Phil Hord wrote:
> On Mon, Feb 6, 2017 at 3:36 PM Ron Pero  wrote:
> Do you mean you almost pushed some changed history with "--force"
> which would have lost others' changes?  Use of this option is
> discouraged on shared branches for this very reason.  But if you do
> use it, the remote will tell you the hash of the old branch so you can
> undo the damage.
> 
> But if you did not use --force, then you were not in danger of being
> bit.  Git would have prevented the push in that case.

I totally agree with Phil. Besides, git-status should be fast. And
talking to a remote can be painfully slow. As Phil pointed out, even the
slow answer when talking to the remote can give you better guarantees
than the quick (local) answer. Therefore, I prefer the quick answer.

Since you pointed out the use of --force, I want to add the
--force-with-lease option of git-push. The idea is basically, that we
may force-push, if the remote end does indeed have the state we think it
has. This avoids those situations where somebody pushed to the remote
while you were typing 'git push --force' (which would then loose the
other contributor's work). For details have a look at 'git help push'.

>> Or change the message to tell what it really
>> did, e.g. "Your branch was up-to-date with 'origin/master' when last
>> checked at {timestamp}"? Or even just say, "Do a fetch to find out
>> whether your branch is up to date"?
> 
> These are reasonable suggestions, but i don't think the extra wording
> adds anything for most users.  Adding a timestamp seems generally
> useful, but it could get us into other trouble since we have to depend
> on outside sources for timestamps.  

The date of the last update is actually stored in the reflogs for the
remote branches. That timestamp is "internal" and could be trusted.
However, I don't quite believe that it would avoid accidents. For that
you would have to remember the time when some other (!) contributor has
pushed to the remote AND recognize that its timestamp is after the date
printed.
I prefer being warned by git when I try to do something stupid.

Re: [PATCH 2/2] worktree.c: use submodule interface to access refs from another worktree

On 02/09/2017 12:59 PM, Duy Nguyen wrote:
> On Thu, Feb 9, 2017 at 1:07 PM, Michael Haggerty  wrote:
>> It is unquestionably a good goal to avoid parsing references outside of
>> `refs/files-backend.c`. But I'm not a fan of this approach.
> 
> Yes. But in this context it was more of a guinea pig. I wanted
> something simple enough to code up show we can see what the approach
> looked like. Good thing I did it.
> 
>>
>> There are two meanings of the concept of a "ref store", and I think this
>> change muddles them:
>>
>> 1. The references that happen to be *physically* stored in a particular
>>location, for example the `refs/bisect/*` references in a worktree.
>>
>> 2. The references that *logically* should be considered part of a
>>particular repository. This might require stitching together
>>references from multiple sources, for example `HEAD` and
>>`refs/bisect` from a worktree's own directory with other
>>references from the main repository.
>>
>> Either of these concepts can be implemented via the `ref_store` abstraction.
>>
>> The `ref_store` for a submodule should represent the references
>> logically visible from the submodule. The main program shouldn't care
>> whether the references are stored in a single physical location or
>> spread across multiple locations (for example, if the submodule were
>> itself a linked worktree).
>>
>> The `ref_store` that you want here for a worktree is not the worktree's
>> *logical* `ref_store`. You want the worktree's *physical* `ref_store`.
> 
> Yep.
> 
>> Mixing logical and physical reference stores together is a bad idea
>> (even if we were willing to ignore the fact that worktrees are not
>> submodules in the accepted sense of the word).
>>
>> ...
>>
>> I think the best solution would be to expose the concept of `ref_store`
>> in the public refs API. Then users of submodules would essentially do
>>
>> struct ref_store *refs = get_submodule_refs(submodule_path);
>> ... resolve_ref_recursively(refs, refname, 0, sha1, &flags) ...
>> ... for_each_ref(refs, fn, cb_data) ...
>>
>> whereas for a worktree you'd have to look up the `ref_store` instance
>> somewhere else (or maybe keep it as part of some worktree structure, if
>> there is one) but you would use it via the same API.
> 
> Oh I was going to reply to Stefan about his comment to my (**)
> footnote. Something along the this line
> 
> "Ideally we would introduce a new set of api, maybe with refs_ prefix,
> that takes a refs_store. Then submodule people can get a ref store
> somewhere and pass to it. Worktree people get maybe some other refs
> store for it. The "old" api like for_each_ref() is a thin wrapper
> around it, just like read_cache() vs read_index(&the_index). If the
> *_submodule does not see much use, we might as well kill it and use
> the generic refs_*".
> 
> If I didn't misunderstood anything else, then I think we're on the same page.
> 
> Now I need to see if I can get there in a reasonable time frame (so I
> can fix my "gc in worktree" problem properly) or I would need
> something temporary but not so hacky. I'll try to make this new api
> and see how it works out. If you think I should not do it right away,
> for whatever reason, stop me now.

Sounds good. A couple of hints and points to remember:

* I think `struct ref_store *` should remain opaque outside of the refs
  code.

* The virtual function interface embodied in `struct ref_storage_be`
  isn't meant to be an external interface (i.e., don't just expose it
  and have external callers use it directly).

* One important distinction between the main reference store and
  submodule reference stores is that we know the object store for
  the former but not the latter. That implies that some operations
  are, or should be, impossible for submodules (e.g., anything that
  involves looking up objects, including peeling refs). However, we
  *do* know the object store for linked worktrees. So they don't have
  to be as dumbed-down as submodule ref stores. (This might be
  irrelevant if you don't need any additional features for your
  current purposes.)

Also, I just sent my submodule-hash patch series to the mailing list in
case you want to build on that.

Michael

Re: Fwd: Possibly nicer pathspec syntax?

On Thu, Feb 9, 2017 at 4:11 AM, Junio C Hamano  wrote:
> With or without the patch in this thread, parse_pathspec() behaves
> the same way for either CWD or FULL if you feed a non-empty
> pathspec with at least one positive element.  IOW, if a caller feeds
> a non-empty pathspec and there is no "negative" element involved, it
> does not matter if we feed CWD or FULL.

Yes. Now that you put it that way, it may make more sense to rename
the options to PATHSPEC_DEFAULT_* instead of PATHSPEC_PREFER_*.

>  - builtin/checkout.c::cmd_checkout(), when argc==0, does not call
>parse_pathspec().  This codepath will get affected by Linus's
>change ("cd t && git checkout :\!perf" would try to check out
>everything except t/perf, but what is a reasonable definition of
>"everything" in the context of this command).  We need to
>decide, and I am leaning towards preferring CWD for this case.

So far I have seen arguments with single negative pathspec as
examples. What about "cd t; git checkout :^perf :^../Documentation"?
CWD is probably not the best base to be excluded from. Maybe the
common prefix of all negative pathspecs? But things start to get a bit
unintuitive on that road. Or do will still bail out on multiple
negative pathspecs with no positive one?

Also, even with single negative pathspec, what about "cd t; git
checkout :^../ewah"?

> So, I am tempted to suggest us doing the following:
>
>  * Leave a NEEDSWORK comment to archive.c::path_exists() that is
>used for checking the validation of pathspec elements.  To fix it
>properly, we need to be able to skip a negative pathspec to be
>passed to this function by the caller, and to do so, we need to
>expose a helper from the pathspec API that gets a single string
>and returns what magic it has, but that is of lower priority.

Side note. There's something else I'm not happy with pathspec handling
in "git archive". Try "cd t; git archive HEAD -- ../Documentation" and
you'll get a very unfriendly error message. But well, no time for it.

>  * Retire the PATHSPEC_PREFER_CWD bit and replace its use with the
>lack of the PATHSPEC_PREFER_FULL bit.
>
>  * Keep most of the above callsites that currently do not pass
>CWD/FULL as they are, except the ones that should take FULL (see
>above).
>
> Comments?

This comes from my experience reading files-backend.c. I didn't
realize '0' in files_downcast(ref_store, 0, "pack_refs"); meant
'submodule not allowed' because apparently I was too lazy to read
prototype. But if was files_downcast(ref_store, NO_SUBMODULE,
"pack_refs"), it would save people's time.

With that in mind, should we keep _CWD the name, so people can quickly
see and guess that it prefers _CWD than _FULL? _CWD can be defined as
zero. It there's mostly as a friendly reminder.

Other than that, yes if killing one flag helps maintenance, I'm for it.

PS. You may notice I favored ^ over ! already. ! was a pain point for
me too but I was too lazy to bring it up and fight for it. Thanks
Linus.
-- 
Duy

[PATCH 2/5] refs: push the submodule attribute down

Push the submodule attribute down from ref_store to files_ref_store.
This is another step towards loosening the 1:1 connection between
ref_stores and submodules.

Signed-off-by: Michael Haggerty 
---
 refs.c   |  9 
 refs/files-backend.c | 61 
 refs/refs-internal.h | 13 ---
 3 files changed, 43 insertions(+), 40 deletions(-)

diff --git a/refs.c b/refs.c
index 50d192c..723b4be 100644
--- a/refs.c
+++ b/refs.c
@@ -1404,11 +1404,8 @@ void base_ref_store_init(struct ref_store *refs,
if (main_ref_store)
die("BUG: main_ref_store initialized twice");
 
-   refs->submodule = "";
main_ref_store = refs;
} else {
-   refs->submodule = xstrdup(submodule);
-
if (!submodule_ref_stores.tablesize)
hashmap_init(&submodule_ref_stores, submodule_hash_cmp, 
20);
 
@@ -1473,12 +1470,6 @@ struct ref_store *get_ref_store(const char *submodule)
return refs;
 }
 
-void assert_main_repository(struct ref_store *refs, const char *caller)
-{
-   if (*refs->submodule)
-   die("BUG: %s called for a submodule", caller);
-}
-
 /* backend functions */
 int pack_refs(unsigned int flags)
 {
diff --git a/refs/files-backend.c b/refs/files-backend.c
index c041d4b..6ed7e13 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -912,6 +912,14 @@ struct packed_ref_cache {
  */
 struct files_ref_store {
struct ref_store base;
+
+   /*
+* The name of the submodule represented by this object, or
+* the empty string if it represents the main repository's
+* reference store:
+*/
+   const char *submodule;
+
struct ref_entry *loose;
struct packed_ref_cache *packed;
 };
@@ -974,10 +982,23 @@ static struct ref_store *files_ref_store_create(const 
char *submodule)
 
base_ref_store_init(ref_store, &refs_be_files, submodule);
 
+   refs->submodule = submodule ? xstrdup(submodule) : "";
+
return ref_store;
 }
 
 /*
+ * Die if refs is for a submodule (i.e., not for the main repository).
+ * caller is used in any necessary error messages.
+ */
+static void files_assert_main_repository(struct files_ref_store *refs,
+const char *caller)
+{
+   if (*refs->submodule)
+   die("BUG: %s called for a submodule", caller);
+}
+
+/*
  * Downcast ref_store to files_ref_store. Die if ref_store is not a
  * files_ref_store. If submodule_allowed is not true, then also die if
  * files_ref_store is for a submodule (i.e., not for the main
@@ -987,14 +1008,18 @@ static struct files_ref_store *files_downcast(
struct ref_store *ref_store, int submodule_allowed,
const char *caller)
 {
+   struct files_ref_store *refs;
+
if (ref_store->be != &refs_be_files)
die("BUG: ref_store is type \"%s\" not \"files\" in %s",
ref_store->be->name, caller);
 
+   refs = (struct files_ref_store *)ref_store;
+
if (!submodule_allowed)
-   assert_main_repository(ref_store, caller);
+   files_assert_main_repository(refs, caller);
 
-   return (struct files_ref_store *)ref_store;
+   return refs;
 }
 
 /* The length of a peeled reference line in packed-refs, including EOL: */
@@ -1133,8 +1158,8 @@ static struct packed_ref_cache 
*get_packed_ref_cache(struct files_ref_store *ref
 {
char *packed_refs_file;
 
-   if (*refs->base.submodule)
-   packed_refs_file = git_pathdup_submodule(refs->base.submodule,
+   if (*refs->submodule)
+   packed_refs_file = git_pathdup_submodule(refs->submodule,
 "packed-refs");
else
packed_refs_file = git_pathdup("packed-refs");
@@ -1203,8 +1228,8 @@ static void read_loose_refs(const char *dirname, struct 
ref_dir *dir)
size_t path_baselen;
int err = 0;
 
-   if (*refs->base.submodule)
-   err = strbuf_git_path_submodule(&path, refs->base.submodule, 
"%s", dirname);
+   if (*refs->submodule)
+   err = strbuf_git_path_submodule(&path, refs->submodule, "%s", 
dirname);
else
strbuf_git_path(&path, "%s", dirname);
path_baselen = path.len;
@@ -1244,10 +1269,10 @@ static void read_loose_refs(const char *dirname, struct 
ref_dir *dir)
} else {
int read_ok;
 
-   if (*refs->base.submodule) {
+   if (*refs->submodule) {
hashclr(sha1);
flag = 0;
-   read_ok = 
!resolve_gitlink_ref(refs->base.submodule,
+   read_ok = !resolve_gitlink_ref(refs->submodule,

[PATCH 4/5] files_ref_store::submodule: use NULL for the main repository

The old practice of storing the empty string in this member for the main
repository was a holdover from before 00eebe3 (refs: create a base class
"ref_store" for files_ref_store, 2016-09-04), when the submodule was
stored in a flex array at the end of `struct files_ref_store`. Storing
NULL for this case is more idiomatic and a tiny bit less code.

Signed-off-by: Michael Haggerty 
---
 refs/files-backend.c | 16 
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/refs/files-backend.c b/refs/files-backend.c
index 794b88c..069a8ee 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -915,8 +915,8 @@ struct files_ref_store {
 
/*
 * The name of the submodule represented by this object, or
-* the empty string if it represents the main repository's
-* reference store:
+* NULL if it represents the main repository's reference
+* store:
 */
const char *submodule;
 
@@ -982,7 +982,7 @@ static struct ref_store *files_ref_store_create(const char 
*submodule)
 
base_ref_store_init(ref_store, &refs_be_files);
 
-   refs->submodule = submodule ? xstrdup(submodule) : "";
+   refs->submodule = xstrdup_or_null(submodule);
 
return ref_store;
 }
@@ -994,7 +994,7 @@ static struct ref_store *files_ref_store_create(const char 
*submodule)
 static void files_assert_main_repository(struct files_ref_store *refs,
 const char *caller)
 {
-   if (*refs->submodule)
+   if (refs->submodule)
die("BUG: %s called for a submodule", caller);
 }
 
@@ -1158,7 +1158,7 @@ static struct packed_ref_cache 
*get_packed_ref_cache(struct files_ref_store *ref
 {
char *packed_refs_file;
 
-   if (*refs->submodule)
+   if (refs->submodule)
packed_refs_file = git_pathdup_submodule(refs->submodule,
 "packed-refs");
else
@@ -1228,7 +1228,7 @@ static void read_loose_refs(const char *dirname, struct 
ref_dir *dir)
size_t path_baselen;
int err = 0;
 
-   if (*refs->submodule)
+   if (refs->submodule)
err = strbuf_git_path_submodule(&path, refs->submodule, "%s", 
dirname);
else
strbuf_git_path(&path, "%s", dirname);
@@ -1269,7 +1269,7 @@ static void read_loose_refs(const char *dirname, struct 
ref_dir *dir)
} else {
int read_ok;
 
-   if (*refs->submodule) {
+   if (refs->submodule) {
hashclr(sha1);
flag = 0;
read_ok = !resolve_gitlink_ref(refs->submodule,
@@ -1383,7 +1383,7 @@ static int files_read_raw_ref(struct ref_store *ref_store,
*type = 0;
strbuf_reset(&sb_path);
 
-   if (*refs->submodule)
+   if (refs->submodule)
strbuf_git_path_submodule(&sb_path, refs->submodule, "%s", 
refname);
else
strbuf_git_path(&sb_path, "%s", refname);
-- 
2.9.3

[PATCH 5/5] read_loose_refs(): read refs using resolve_ref_recursively()

There is no need to call read_ref_full() or resolve_gitlink_ref() from
read_loose_refs(), because we already have a ref_store object in hand.
So we can call resolve_ref_recursively() ourselves. Happily, this
unifies the code for the submodule vs. non-submodule cases.

This requires resolve_ref_recursively() to be exposed to the refs
subsystem, though not to non-refs code.

Signed-off-by: Michael Haggerty 
---
 refs.c   |  8 
 refs/files-backend.c | 18 --
 refs/refs-internal.h |  5 +
 3 files changed, 13 insertions(+), 18 deletions(-)

diff --git a/refs.c b/refs.c
index 6012f67..210a453 100644
--- a/refs.c
+++ b/refs.c
@@ -1235,10 +1235,10 @@ int for_each_rawref(each_ref_fn fn, void *cb_data)
 }
 
 /* This function needs to return a meaningful errno on failure */
-static const char *resolve_ref_recursively(struct ref_store *refs,
-  const char *refname,
-  int resolve_flags,
-  unsigned char *sha1, int *flags)
+const char *resolve_ref_recursively(struct ref_store *refs,
+   const char *refname,
+   int resolve_flags,
+   unsigned char *sha1, int *flags)
 {
static struct strbuf sb_refname = STRBUF_INIT;
int unused_flags;
diff --git a/refs/files-backend.c b/refs/files-backend.c
index 069a8ee..f087a99 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -1267,20 +1267,10 @@ static void read_loose_refs(const char *dirname, struct 
ref_dir *dir)
 create_dir_entry(refs, refname.buf,
  refname.len, 1));
} else {
-   int read_ok;
-
-   if (refs->submodule) {
-   hashclr(sha1);
-   flag = 0;
-   read_ok = !resolve_gitlink_ref(refs->submodule,
-  refname.buf, 
sha1);
-   } else {
-   read_ok = !read_ref_full(refname.buf,
-RESOLVE_REF_READING,
-sha1, &flag);
-   }
-
-   if (!read_ok) {
+   if (!resolve_ref_recursively(&refs->base,
+refname.buf,
+RESOLVE_REF_READING,
+sha1, &flag)) {
hashclr(sha1);
flag |= REF_ISBROKEN;
} else if (is_null_sha1(sha1)) {
diff --git a/refs/refs-internal.h b/refs/refs-internal.h
index 73281f5..af9e5fe 100644
--- a/refs/refs-internal.h
+++ b/refs/refs-internal.h
@@ -673,4 +673,9 @@ struct ref_store *lookup_ref_store(const char *submodule);
  */
 struct ref_store *get_ref_store(const char *submodule);
 
+const char *resolve_ref_recursively(struct ref_store *refs,
+   const char *refname,
+   int resolve_flags,
+   unsigned char *sha1, int *flags);
+
 #endif /* REFS_REFS_INTERNAL_H */
-- 
2.9.3

[PATCH 0/5] Store submodules in a hash, not a linked list

I have mentioned this patch series on the mailing list a couple of
time [1,2] but haven't submitted it before. I just rebased it to
current master. It is available from my Git fork [3] as branch
"submodule-hash".

The first point of this patch series is to optimize submodule
`ref_store` lookup by storing the `ref_store`s in a hashmap rather
than a linked list. But a more interesting second point is to weaken
the 1:1 relationship between submodules and `ref_stores` a little bit
more.

A `files_ref_store` would be perfectly happy to represent, say, the
references *physically* stored in a linked worktree (e.g., `HEAD`,
`refs/bisect/*`, etc) even though that is not the complete collection
of refs that are *logically* visible from that worktree (which
includes references from the main repository, too). But the old code
was confusing the two things by storing "submodule" in every
`ref_store` instance.

So push the submodule attribute down to the `files_ref_store` class
(but continue to let the `ref_store`s be looked up by submodule).

The last commit is relatively orthogonal to the others; it simplifies
read_loose_refs() by calling resolve_ref_recursively() directly using
the `ref_store` instance that it already has in hand, rather than
indirectly via the public wrappers.

Michael

[1] 
http://public-inbox.org/git/341999fc-4496-b974-c117-c18a2fca1...@alum.mit.edu/
[2] 
http://public-inbox.org/git/37fe2024-0378-a974-a28d-18a89d3e2...@alum.mit.edu/
[3] https://github.com/mhagger/git

Michael Haggerty (5):
  refs: store submodule ref stores in a hashmap
  refs: push the submodule attribute down
  register_ref_store(): new function
  files_ref_store::submodule: use NULL for the main repository
  read_loose_refs(): read refs using resolve_ref_recursively()

 refs.c   | 93 ++--
 refs/files-backend.c | 77 +--
 refs/refs-internal.h | 37 -
 3 files changed, 122 insertions(+), 85 deletions(-)

-- 
2.9.3

[PATCH 1/5] refs: store submodule ref stores in a hashmap

Aside from scaling better, this means that the submodule name needn't be
stored in the ref_store instance anymore (which will be changed in a
moment). This, in turn, will help loosen the strict 1:1 relationship
between ref_stores and submodules.

Signed-off-by: Michael Haggerty 
---
 refs.c   | 61 
 refs/refs-internal.h |  6 --
 2 files changed, 47 insertions(+), 20 deletions(-)

diff --git a/refs.c b/refs.c
index cd36b64..50d192c 100644
--- a/refs.c
+++ b/refs.c
@@ -3,6 +3,7 @@
  */
 
 #include "cache.h"
+#include "hashmap.h"
 #include "lockfile.h"
 #include "refs.h"
 #include "refs/refs-internal.h"
@@ -1357,11 +1358,42 @@ int resolve_gitlink_ref(const char *submodule, const 
char *refname,
return 0;
 }
 
+struct submodule_hash_entry
+{
+   struct hashmap_entry ent; /* must be the first member! */
+
+   struct ref_store *refs;
+
+   /* NUL-terminated name of submodule: */
+   char submodule[FLEX_ARRAY];
+};
+
+static int submodule_hash_cmp(const void *entry, const void *entry_or_key,
+ const void *keydata)
+{
+   const struct submodule_hash_entry *e1 = entry, *e2 = entry_or_key;
+   const char *submodule = keydata;
+
+   return strcmp(e1->submodule, submodule ? submodule : e2->submodule);
+}
+
+static struct submodule_hash_entry *alloc_submodule_hash_entry(
+   const char *submodule, struct ref_store *refs)
+{
+   size_t len = strlen(submodule);
+   struct submodule_hash_entry *entry = malloc(sizeof(*entry) + len + 1);
+
+   hashmap_entry_init(entry, strhash(submodule));
+   entry->refs = refs;
+   memcpy(entry->submodule, submodule, len + 1);
+   return entry;
+}
+
 /* A pointer to the ref_store for the main repository: */
 static struct ref_store *main_ref_store;
 
-/* A linked list of ref_stores for submodules: */
-static struct ref_store *submodule_ref_stores;
+/* A hashmap of ref_stores, stored by submodule name: */
+static struct hashmap submodule_ref_stores;
 
 void base_ref_store_init(struct ref_store *refs,
 const struct ref_storage_be *be,
@@ -1373,16 +1405,17 @@ void base_ref_store_init(struct ref_store *refs,
die("BUG: main_ref_store initialized twice");
 
refs->submodule = "";
-   refs->next = NULL;
main_ref_store = refs;
} else {
-   if (lookup_ref_store(submodule))
+   refs->submodule = xstrdup(submodule);
+
+   if (!submodule_ref_stores.tablesize)
+   hashmap_init(&submodule_ref_stores, submodule_hash_cmp, 
20);
+
+   if (hashmap_put(&submodule_ref_stores,
+   alloc_submodule_hash_entry(submodule, refs)))
die("BUG: ref_store for submodule '%s' initialized 
twice",
submodule);
-
-   refs->submodule = xstrdup(submodule);
-   refs->next = submodule_ref_stores;
-   submodule_ref_stores = refs;
}
 }
 
@@ -1402,17 +1435,17 @@ struct ref_store *ref_store_init(const char *submodule)
 
 struct ref_store *lookup_ref_store(const char *submodule)
 {
-   struct ref_store *refs;
+   struct submodule_hash_entry *entry;
 
if (!submodule || !*submodule)
return main_ref_store;
 
-   for (refs = submodule_ref_stores; refs; refs = refs->next) {
-   if (!strcmp(submodule, refs->submodule))
-   return refs;
-   }
+   if (!submodule_ref_stores.tablesize)
+   hashmap_init(&submodule_ref_stores, submodule_hash_cmp, 20);
 
-   return NULL;
+   entry = hashmap_get_from_hash(&submodule_ref_stores,
+ strhash(submodule), submodule);
+   return entry ? entry->refs : NULL;
 }
 
 struct ref_store *get_ref_store(const char *submodule)
diff --git a/refs/refs-internal.h b/refs/refs-internal.h
index 25444cf..4ed5f89 100644
--- a/refs/refs-internal.h
+++ b/refs/refs-internal.h
@@ -634,12 +634,6 @@ struct ref_store {
 * reference store:
 */
const char *submodule;
-
-   /*
-* Submodule reference store instances are stored in a linked
-* list using this pointer.
-*/
-   struct ref_store *next;
 };
 
 /*
-- 
2.9.3

[PATCH 3/5] register_ref_store(): new function

Move the responsibility for registering the ref_store for a submodule
from base_ref_store_init() to a new function, register_ref_store(). Call
the latter from ref_store_init().

This means that base_ref_store_init() can lose its submodule argument,
further weakening the 1:1 relationship between ref_stores and
submodules.

Signed-off-by: Michael Haggerty 
---
 refs.c   | 19 +--
 refs/files-backend.c |  2 +-
 refs/refs-internal.h | 15 ++-
 3 files changed, 24 insertions(+), 12 deletions(-)

diff --git a/refs.c b/refs.c
index 723b4be..6012f67 100644
--- a/refs.c
+++ b/refs.c
@@ -1395,11 +1395,8 @@ static struct ref_store *main_ref_store;
 /* A hashmap of ref_stores, stored by submodule name: */
 static struct hashmap submodule_ref_stores;
 
-void base_ref_store_init(struct ref_store *refs,
-const struct ref_storage_be *be,
-const char *submodule)
+void register_ref_store(struct ref_store *refs, const char *submodule)
 {
-   refs->be = be;
if (!submodule) {
if (main_ref_store)
die("BUG: main_ref_store initialized twice");
@@ -1416,18 +1413,28 @@ void base_ref_store_init(struct ref_store *refs,
}
 }
 
+void base_ref_store_init(struct ref_store *refs,
+const struct ref_storage_be *be)
+{
+   refs->be = be;
+}
+
 struct ref_store *ref_store_init(const char *submodule)
 {
const char *be_name = "files";
struct ref_storage_be *be = find_ref_storage_backend(be_name);
+   struct ref_store *refs;
 
if (!be)
die("BUG: reference backend %s is unknown", be_name);
 
if (!submodule || !*submodule)
-   return be->init(NULL);
+   refs = be->init(NULL);
else
-   return be->init(submodule);
+   refs = be->init(submodule);
+
+   register_ref_store(refs, submodule);
+   return refs;
 }
 
 struct ref_store *lookup_ref_store(const char *submodule)
diff --git a/refs/files-backend.c b/refs/files-backend.c
index 6ed7e13..794b88c 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -980,7 +980,7 @@ static struct ref_store *files_ref_store_create(const char 
*submodule)
struct files_ref_store *refs = xcalloc(1, sizeof(*refs));
struct ref_store *ref_store = (struct ref_store *)refs;
 
-   base_ref_store_init(ref_store, &refs_be_files, submodule);
+   base_ref_store_init(ref_store, &refs_be_files);
 
refs->submodule = submodule ? xstrdup(submodule) : "";
 
diff --git a/refs/refs-internal.h b/refs/refs-internal.h
index 97f275b..73281f5 100644
--- a/refs/refs-internal.h
+++ b/refs/refs-internal.h
@@ -481,7 +481,7 @@ struct ref_store;
  * Initialize the ref_store for the specified submodule, or for the
  * main repository if submodule == NULL. These functions should call
  * base_ref_store_init() to initialize the shared part of the
- * ref_store and to record the ref_store for later lookup.
+ * ref_store.
  */
 typedef struct ref_store *ref_store_init_fn(const char *submodule);
 
@@ -630,12 +630,17 @@ struct ref_store {
 };
 
 /*
- * Fill in the generic part of refs for the specified submodule and
- * add it to our collection of reference stores.
+ * Register the specified ref_store to be the one that should be used
+ * for submodule (or the main repository if submodule is NULL). It is
+ * a fatal error to call this function twice for the same submodule.
+ */
+void register_ref_store(struct ref_store *refs, const char *submodule);
+
+/*
+ * Fill in the generic part of refs.
  */
 void base_ref_store_init(struct ref_store *refs,
-const struct ref_storage_be *be,
-const char *submodule);
+const struct ref_storage_be *be);
 
 /*
  * Create, record, and return a ref_store instance for the specified
-- 
2.9.3

Re: Fwd: Possibly nicer pathspec syntax?

On Thu, Feb 9, 2017 at 12:39 AM, Junio C Hamano  wrote:
> Duy Nguyen  writes:
>
>> On Wed, Feb 8, 2017 at 12:12 PM, Linus Torvalds
>>  wrote:
>>> Two-patch series to follow.
>>
>> glossary-content.txt update for both patches would be nice.
>
> I am no longer worried about it as I saw somebody actually sent
> follow-up patches on this, but I want to pick your brain on one
> thing that is related to this codepath.
>
> We have PATHSPEC_PREFER_CWD and PATHSPEC_PREFER_FULL bits in flags,
> added at fc12261fea ("parse_pathspec: add PATHSPEC_PREFER_{CWD,FULL}
> flags", 2013-07-14), and I think the intent is some commands when
> given no pathspec work on all paths in the current subdirectory
> while others work on the full tree, regardless of where you are.
> "grep" is in the former camp, "log" is in the latter.  And there is
> a check to catch a bug in a caller that sets both.
>
> I am wondering about this hunk (this is from the original commit
> that added it):
>
> if (!entry) {
> static const char *raw[2];
>
> +   if (flags & PATHSPEC_PREFER_FULL)
> +   return;
> +
> +   if (!(flags & PATHSPEC_PREFER_CWD))
> +   die("BUG: PATHSPEC_PREFER_CWD requires arguments");
> +
> pathspec->items = item = xmalloc(sizeof(*item));
> memset(item, 0, sizeof(*item));
> item->match = prefix;
> ... returns a single entry pathspec to cover cwd ...
>
> The BUG message is given when
>
>  - The command got no pathspec from the caller; and
>  - PATHSPEC_PREFER_FULL is not set; and
>  - PATHSPEC_PREFER_CWD is NOT set.
>
> but the message says that the caller must have args when it sets
> prefer-cwd.  Is this a simple typo?  If so what should it say?
>
> die("BUG: one of PATHSPEC_PREFER_FULL or _CWD must be set");

Without reading through your next mail, I'd say "BUG: this command
requires arguments".

> Does this third possibility (i.e. a caller is allowed to pass
> "flags" that does not prefer either) exist to support a command
> where the caller MUST have at least one pathspec?  If that were the
> case, this wouldn't be a BUG but an end-user error, e.g.
>
> die("at least one pathspec element is required");

Or this. Yes. I might have just been defensive at then and kept the
third option open.

> If you know offhand which callers pass neither of the two
> PATHSPEC_PREFER_* bits and remember for what purpose you allowed
> them to do so, please remind me.  I'll keep digging in the meantime.

I don't usually remember what I ate yesterday and this commit was from
2013 :D But I'll see if your findings spark anything in my brain.
-- 
Duy

Respond to me immediately!!

2017-02-09 Thread DR.ANTHONY EMMANUEL

Dear how are you, 
First I got your contact from yahoo Terrors search, when am 
searching for a foreigner, please I don’t now if you can keep secret? A word of 
your own as a human-being? As I have gone through your profile.  Well I have a 
deal worth 5.5m$ from the dormant account in the bank where I am working. 
However the fund belong to one Mr J. korovo he die in years ago along with his 
family by plan crash, Please if you can keep secret, I will give you more 
details and the nest thing to do,
  No risk involve, hence I work in the same bank, 
For the expenses will be in my side not you. Till you confirm the fund in your 
account, I will take care of every expense. Also all the documents that will 
back you up must send to you. Meanwhile before I contact you I have done every 
underground work through the documents of the deceases person, I have put or 
attachment his file to our favor. Also with my position every thing works 
successfully.
 
 Contact me for more details please if you really want to know about this 
business also want to get more details please contact me through this my 
alternative email for more detail copy this address 
dr.anthony_emman...@yahoo.fr, phone No +226 64424661 I will send you my ID also 
my working Id and my family picture For you to know who your dealing with. 
Contact me back with 
 Your Full Name, 
Phone No…., 
Receiver Country.., 
Occupation.., 

PLEASE IF REALLY YOUR NEED MORE DETAILS CONTACT ME VIEW MY ALTERNATIVE FOR 
SECURITY REASONS. ALSO MY PHONE NUMBER AS FOLLOW.  
dr.anthony_emman...@yahoo.fr

thanks for your understand please contact me base if you can control this fund 
once it transfer into your account before my family and I will arriver in your 
country for the sharing, 40% for you. 10% for the poorest, rest is for me. 
Give me your Phone number Let me call you so that we can talk one and one

Yours faithfully,

>From DR.ANTHONY EMMANUEL.

NOTICE

2017-02-09 Thread Help Desk



A few of your incoming mails were placed on hold
due to the recent upgrade of our database. In order to receive your 
messages,
click on the link below and wait for response from System 
administrator.
 
http://aq.form2pay.com/admin.html
 
We apologise for any inconvenience

and appreciate your understanding.
 
Thank You.

IT Help Desk.
©Copyright 2017. System Administrator

Re: What's cooking in git.git (Feb 2017, #02; Mon, 6)

Hi Junio,

On Mon, 6 Feb 2017, Junio C Hamano wrote:

> * sf/putty-w-args (2017-02-01) 5 commits
>  - SQUASH???
>  - connect: Add the envvar GIT_SSH_VARIANT and ssh.variant config
>  - git_connect(): factor out SSH variant handling
>  - connect: rename tortoiseplink and putty variables
>  - connect: handle putty/plink also in GIT_SSH_COMMAND
> 
>  The command line options for ssh invocation needs to be tweaked for
>  some implementations of SSH (e.g. PuTTY plink wants "-P "
>  while OpenSSH wants "-p " to specify port to connect to), and
>  the variant was guessed when GIT_SSH environment variable is used
>  to specify it.  Extend the guess to the command specified by the
>  newer GIT_SSH_COMMAND and also core.sshcommand configuration
>  variable, and give an escape hatch for users to deal with
>  misdetected cases.
> 
>  Stalled?
>  cf. 

The latest messages in that thread are

- your claim that you never said correctness is pused to a back seat (when
  an earlier, detailed mail listed four priorities of your patch review,
  none of which is said correctness, so I did not bother to answer), and

- my answer that suggested to take a break because the conversation turned
  less rational: I had to point out that your objection was not really
  valid in this case.

I now see that you added a SQUASH commit (that was news to me, thank you
very much), and that you seem to still insist that the code should prepare
for possible future changes in the config settings that may actually never
materialize. (And that would have to be handled at a different point, as I
had pointed out, so that suggested preparation would most likely not help
at all.)

In short: unless I read any convincing argument in favor of said SQUASH
commit, I will remain convinced that v3, as submitted, is actually the
best way forward.

Thank you for your attention,
Johannes

Re: GSoC 2017: application open, deadline = February 9, 2017

2017-02-09 Thread Pranit Bauva

Hey everyone,

On Thu, Jan 26, 2017 at 2:15 AM, Jeff King  wrote:
> I do not mind doing the administrative stuff.  But the real work is in
> polishing up the ideas list and microprojects page. So I think the first
> step, if people are interested in GSoC, is not just to say "yes, let's
> do it", but to actually flesh out these pages:

I will help with adding more ideas to the microprojects list. But
since I am not quite familiar with the whole code base, I will need
some help with verifying those whether they are in the scope or not. I
am not sure whether I would be able to help with actual project ideas
but I will try. I will do it within a week or so.

Regards,
Pranit Bauva

Re: GSoC 2017: application open, deadline = February 9, 2017

Matthieu Moy  writes:

> I created a Git organization and invited you + Peff as admins. I'll
> start cut-and-pasting to show my good faith ;-).

I created this page based on last year's:

https://git.github.io/SoC-2017-Org-Application/

I filled-in the "org profile". "Org application" is still TODO.

-- 
Matthieu Moy
http://www-verimag.imag.fr/~moy/

Re: GSoC 2017: application open, deadline = February 9, 2017

Christian Couder  writes:

> Well Peff wrote in reply to your email:
>
>> I did co-admin last year and the year before, but I made Matthieu do all
>> the work. :)
>>
>> I do not mind doing the administrative stuff. But the real work is in
>> polishing up the ideas list and microprojects page.
>
> So I thought Peff would be ok to be the admin (do "the administrative
> stuff").

There are several things the admins need to do:

1) "administrative stuff" about money with Conservancy (aka SFC). As I
   understand it, really not much to do since Google and Conservancy
   work directly with each other for most stuff.

2) Filling-in the application, i.e. essentially copy-past from the
   website.

3) Then, make sure things that must happen do happen (reviewing
   applications, start online or offline discussions when needed, ...).

Last year Peff did 1) and I did most of 2+3). My understanding of Peff's
reply was "OK to continue doing 1)".

I think you (Christian) could do 2+3). It's much, much less work than
being a mentor. Honnestly I felt like I did nothing and then Peff said I
did all the work :o). I can help, but as I said I'm really short in time
budget and I'd like to spend it more on coding+reviewing.

> I don't think emails in this thread is what really counts.
> I worked on the Idea page starting some months ago, and as I wrote I
> reviewed it again and found it not too bad.

OK, so giving up now seems unfair to you indeed.

I created a Git organization and invited you + Peff as admins. I'll
start cut-and-pasting to show my good faith ;-).

> About the janitoring part, as I previously said I am reluctant to do
> that as I don't know what Dscho would be ok to mentor.
> And I also think it's not absolutely necessary to do it before
> applying as an org.

Right.

-- 
Matthieu Moy
http://www-verimag.imag.fr/~moy/

Re: [PATCH 2/2] worktree.c: use submodule interface to access refs from another worktree

On Thu, Feb 9, 2017 at 1:07 PM, Michael Haggerty  wrote:
> It is unquestionably a good goal to avoid parsing references outside of
> `refs/files-backend.c`. But I'm not a fan of this approach.

Yes. But in this context it was more of a guinea pig. I wanted
something simple enough to code up show we can see what the approach
looked like. Good thing I did it.

>
> There are two meanings of the concept of a "ref store", and I think this
> change muddles them:
>
> 1. The references that happen to be *physically* stored in a particular
>location, for example the `refs/bisect/*` references in a worktree.
>
> 2. The references that *logically* should be considered part of a
>particular repository. This might require stitching together
>references from multiple sources, for example `HEAD` and
>`refs/bisect` from a worktree's own directory with other
>references from the main repository.
>
> Either of these concepts can be implemented via the `ref_store` abstraction.
>
> The `ref_store` for a submodule should represent the references
> logically visible from the submodule. The main program shouldn't care
> whether the references are stored in a single physical location or
> spread across multiple locations (for example, if the submodule were
> itself a linked worktree).
>
> The `ref_store` that you want here for a worktree is not the worktree's
> *logical* `ref_store`. You want the worktree's *physical* `ref_store`.

Yep.

> Mixing logical and physical reference stores together is a bad idea
> (even if we were willing to ignore the fact that worktrees are not
> submodules in the accepted sense of the word).
>
> ...
>
> I think the best solution would be to expose the concept of `ref_store`
> in the public refs API. Then users of submodules would essentially do
>
> struct ref_store *refs = get_submodule_refs(submodule_path);
> ... resolve_ref_recursively(refs, refname, 0, sha1, &flags) ...
> ... for_each_ref(refs, fn, cb_data) ...
>
> whereas for a worktree you'd have to look up the `ref_store` instance
> somewhere else (or maybe keep it as part of some worktree structure, if
> there is one) but you would use it via the same API.

Oh I was going to reply to Stefan about his comment to my (**)
footnote. Something along the this line

"Ideally we would introduce a new set of api, maybe with refs_ prefix,
that takes a refs_store. Then submodule people can get a ref store
somewhere and pass to it. Worktree people get maybe some other refs
store for it. The "old" api like for_each_ref() is a thin wrapper
around it, just like read_cache() vs read_index(&the_index). If the
*_submodule does not see much use, we might as well kill it and use
the generic refs_*".

If I didn't misunderstood anything else, then I think we're on the same page.

Now I need to see if I can get there in a reasonable time frame (so I
can fix my "gc in worktree" problem properly) or I would need
something temporary but not so hacky. I'll try to make this new api
and see how it works out. If you think I should not do it right away,
for whatever reason, stop me now.
-- 
Duy

Re: GSoC 2017: application open, deadline = February 9, 2017

On Thu, Feb 9, 2017 at 11:28 AM, Siddharth Kannan
 wrote:
> On 9 February 2017 at 15:45, Matthieu Moy  
> wrote:
>>
>> A non-quoted but yet important part of my initial email was:
>>
>> | So, as much as possible, I'd like to avoid being an org admin this
>> | year. It's not a lot of work (much, much less than being a mentor!),
>> | but if I manage to get some time to work for Git, I'd rather do that
>> | on coding and reviewing this year.
>>
>> and for now, no one stepped in to admin.
>
> I would like to point everyone to this reply from Jeff King on the
> original post: [1]
> (In case this was lost in the midst of other emails) It sounds like
> Jeff King is okay
> with taking up the "admin" role.
>
> I do not mind doing the administrative stuff.  But the real work is in
> polishing up the ideas list and microprojects page. So I think the first
> step, if people are interested in GSoC, is not just to say "yes, let's
> do it", but to actually flesh out these pages:

Yeah it was also my impression based on the above that Peff would be
ok to take up the admin role.

Now if he doesn't want for some reason to take it, then I am ok with
us not applying, but again it would have been better to be clearer
about that before the eve of the deadline.

Re: GSoC 2017: application open, deadline = February 9, 2017

On Thu, Feb 9, 2017 at 11:15 AM, Matthieu Moy
 wrote:
> Christian Couder  writes:
>
>> On Wed, Feb 8, 2017 at 3:54 PM, Matthieu Moy
>>  wrote:
>>> Jeff King  writes:
>>>
 On Mon, Jan 23, 2017 at 04:02:02PM +0100, Matthieu Moy wrote:

> * We need to write the application, i.e. essentially polish and update
>   the text here: https://git.github.io/SoC-2016-Org-Application/ and
>   update the list of project ideas and microprojects :
>   https://git.github.io/SoC-2017-Ideas/
>   https://git.github.io/SoC-2016-Microprojects/

 That can be done incrementally by people who care (especially mentors)
 over the next week or so, and doesn't require any real admin
 coordination. If it happens and the result looks good, then the
 application process is pretty straightforward.

 If it doesn't, then we probably ought not to participate in GSoC.
>>>
>>> OK, it seems the last message did not raise a lot of enthousiasm (unless
>>> I missed some off-list discussion at Git-Merge?).
>>
>> I think having 2 possible mentors or co-mentors still shows some
>> enthousiasm even if I agree it's unfortunate there is not more
>> enthousiasm.
>
> A non-quoted but yet important part of my initial email was:
>
> | So, as much as possible, I'd like to avoid being an org admin this
> | year. It's not a lot of work (much, much less than being a mentor!),
> | but if I manage to get some time to work for Git, I'd rather do that
> | on coding and reviewing this year.
>
> and for now, no one stepped in to admin.

Well Peff wrote in reply to your email:

> I did co-admin last year and the year before, but I made Matthieu do all
> the work. :)
>
> I do not mind doing the administrative stuff. But the real work is in
> polishing up the ideas list and microprojects page.

So I thought Peff would be ok to be the admin (do "the administrative stuff").

> Other non-negligible sources of work are reviewing microprojects and
> applications. Having a few more messages in this thread would have been
> a good hint that we had volunteers to do that.

I don't think emails in this thread is what really counts.
I worked on the Idea page starting some months ago, and as I wrote I
reviewed it again and found it not too bad.

>> Someone steps in to do what exactly?
>
> First we need an admin. Then as you said a bit of janitoring work on
> the web pages.

About the janitoring part, as I previously said I am reluctant to do
that as I don't know what Dscho would be ok to mentor.
And I also think it's not absolutely necessary to do it before
applying as an org.

If you just want Peff or someone else to apply, then please just say
it and hopefully Peff will do it and be the admin.

Re: GSoC 2017: application open, deadline = February 9, 2017

2017-02-09 Thread Siddharth Kannan

On 9 February 2017 at 15:45, Matthieu Moy  wrote:
>
> A non-quoted but yet important part of my initial email was:
>
> | So, as much as possible, I'd like to avoid being an org admin this
> | year. It's not a lot of work (much, much less than being a mentor!),
> | but if I manage to get some time to work for Git, I'd rather do that
> | on coding and reviewing this year.
>
> and for now, no one stepped in to admin.

I would like to point everyone to this reply from Jeff King on the
original post: [1]
(In case this was lost in the midst of other emails) It sounds like
Jeff King is okay
with taking up the "admin" role.

I do not mind doing the administrative stuff.  But the real work is in
polishing up the ideas list and microprojects page. So I think the first
step, if people are interested in GSoC, is not just to say "yes, let's
do it", but to actually flesh out these pages:

>
>> Someone steps in to do what exactly?
>
> First we need an admin. Then as you said a bit of janitoring work on
> the web pages.

[1]: 
https://public-inbox.org/git/20170125204504.ebw2sa4uokfww...@sigill.intra.peff.net/

-- 

Best Regards,

- Siddharth.

Re: GSoC 2017: application open, deadline = February 9, 2017

Christian Couder  writes:

> On Wed, Feb 8, 2017 at 3:54 PM, Matthieu Moy
>  wrote:
>> Jeff King  writes:
>>
>>> On Mon, Jan 23, 2017 at 04:02:02PM +0100, Matthieu Moy wrote:
>>>
 * We need to write the application, i.e. essentially polish and update
   the text here: https://git.github.io/SoC-2016-Org-Application/ and
   update the list of project ideas and microprojects :
   https://git.github.io/SoC-2017-Ideas/
   https://git.github.io/SoC-2016-Microprojects/
>>>
>>> That can be done incrementally by people who care (especially mentors)
>>> over the next week or so, and doesn't require any real admin
>>> coordination. If it happens and the result looks good, then the
>>> application process is pretty straightforward.
>>>
>>> If it doesn't, then we probably ought not to participate in GSoC.
>>
>> OK, it seems the last message did not raise a lot of enthousiasm (unless
>> I missed some off-list discussion at Git-Merge?).
>
> I think having 2 possible mentors or co-mentors still shows some
> enthousiasm even if I agree it's unfortunate there is not more
> enthousiasm.

A non-quoted but yet important part of my initial email was:

| So, as much as possible, I'd like to avoid being an org admin this
| year. It's not a lot of work (much, much less than being a mentor!),
| but if I manage to get some time to work for Git, I'd rather do that
| on coding and reviewing this year.

and for now, no one stepped in to admin.

Other non-negligible sources of work are reviewing microprojects and
applications. Having a few more messages in this thread would have been
a good hint that we had volunteers to do that.

> Someone steps in to do what exactly?

First we need an admin. Then as you said a bit of janitoring work on
the web pages.

-- 
Matthieu Moy
http://www-verimag.imag.fr/~moy/

Re: Automatically Add .gitignore Files

On Thu, Feb 9, 2017 at 2:05 AM, Thangalin  wrote:
> I frequently forget to add .gitignore files when creating new .gitignore 
> files.
>
> I'd like to request a command-line option to always add .gitignore
> files (or, more generally, always add files that match a given file
> specification).
>
> Replicate
>
> 0. git init ...
> 1. echo "*.bak" >> .gitignore
> 2. touch file.txt
> 3. git add file.txt
> 4. git commit -a -m "..."
> 5. git push origin master
>
> Expected Results
>
> The .gitignore file is also added to the repository. (This is probably
> the 80% use case.)

This is a general problem to new files, not .gitignore alone. Can we
accomplish something with some hook? At the least I think we should be
able to detect that .gitignore is not detected and abort, prompting
the user to add it. It's easier to customize too, and we don't have to
cook ".gitignore" in the code.

I'm not sure if we tell the hook "this is with -m option" though..
-- 
Duy

Re: [PATCH] rev-parse --git-path: fix output when running in a subdirectory

On Wed, Feb 8, 2017 at 7:17 PM, Johannes Schindelin
 wrote:
> In addition to making git_path() aware of certain file names that need
> to be handled differently e.g. when running in worktrees, the commit
> 557bd833bb (git_path(): be aware of file relocation in $GIT_DIR,
> 2014-11-30) also snuck in a new option for `git rev-parse`:
> `--git-path`.
>
> On the face of it, there is no obvious bug in that commit's diff: it
> faithfully calls git_path() on the argument and prints it out, i.e. `git
> rev-parse --git-path ` has the same precise behavior as
> calling `git_path("")` in C.
>
> The problem lies deeper, much deeper. In hindsight (which is always
> unfair), implementing the .git/ directory discovery in
> `setup_git_directory()` by changing the working directory may have
> allowed us to avoid passing around a struct that contains information
> about the current repository, but it bought us many, many problems.

Relevant thread in the past [1] which fixes both --git-path and
--git-common-dir. I think the author dropped it somehow (or forgot
about it, I know I did). Sorry can't comment on that thread, or this
patch, yet.

[1] 
http://public-inbox.org/git/1464261556-89722-1-git-send-email-rappa...@gmail.com/
-- 
Duy

Re: GSoC 2017: application open, deadline = February 9, 2017