Re: [PATCH v2 21/25] sequencer: refactor write_message()

2016-10-05 Thread Johannes Schindelin
Hi Junio & Hannes,

On Thu, 15 Sep 2016, Junio C Hamano wrote:

> Johannes Sixt  writes:
> 
> > Am 11.09.2016 um 12:55 schrieb Johannes Schindelin:
> >> -static int write_message(struct strbuf *msgbuf, const char *filename)
> >> +static int write_with_lock_file(const char *filename,
> >> +  const void *buf, size_t len, int append_eol)
> >>  {
> >>static struct lock_file msg_file;
> >>
> >>int msg_fd = hold_lock_file_for_update(&msg_file, filename, 0);
> >>if (msg_fd < 0)
> >>return error_errno(_("Could not lock '%s'"), filename);
> >> -  if (write_in_full(msg_fd, msgbuf->buf, msgbuf->len) < 0)
> >> -  return error_errno(_("Could not write to %s"), filename);
> >> -  strbuf_release(msgbuf);
> >> +  if (write_in_full(msg_fd, buf, len) < 0)
> >> +  return error_errno(_("Could not write to '%s'"), filename);
> >> +  if (append_eol && write(msg_fd, "\n", 1) < 0)
> >> +  return error_errno(_("Could not write eol to '%s"), filename);
> >>if (commit_lock_file(&msg_file) < 0)
> >>return error(_("Error wrapping up %s."), filename);
> >>
> >>return 0;
> >>  }
> >
> > The two error paths in the added lines should both
> >
> > rollback_lock_file(&msg_file);
> >
> > , I think. But I do notice that this is not exactly new, so...
> 
> It may not be new for this step, but overall the series is aiming to
> libify the stuff, so we should fix fd and lockfile leaks like this
> as we notice them.

Makes sense, even for the final commit_lock_file().

Ciao,
Dscho


Re: [musl] Re: Regression: git no longer works with musl libc's regex impl

2016-10-05 Thread Szabolcs Nagy
* Johannes Schindelin  [2016-10-05 13:17:49 +0200]:
> I had a brief look at the source code (you use backtracking... hopefully
> nobody uses musl to parse regular expressions from untrusted, or
> inexperienced, sources [*1*]), and it seems that the regex code might

does git use BRE?

a conforming BRE implementation has to use back tracking
if the pattern has back references.

usually ERE implementations may also use back tracking
since they support back references as an extension.

musl does not support this extension (and many others) so
it never uses back tracking for ERE matches, note however
that match complexity and memory usage of a conforming
ERE implementation is still exponential in pattern
length because of repetition counts.


Re: [PATCH v2 17/25] sequencer: support amending commits

2016-10-05 Thread Johannes Schindelin
Hi Junio,

On Mon, 12 Sep 2016, Junio C Hamano wrote:

> Johannes Schindelin  writes:
> 
> > This teaches the sequencer_commit() function to take an argument that
> > will allow us to implement "todo" commands that need to amend the commit
> > messages ("fixup", "squash" and "reword").
> >
> > Signed-off-by: Johannes Schindelin 
> > ---
> >  sequencer.c | 6 --
> >  sequencer.h | 2 +-
> >  2 files changed, 5 insertions(+), 3 deletions(-)
> >
> > diff --git a/sequencer.c b/sequencer.c
> > index 6e9732c..60b522e 100644
> > --- a/sequencer.c
> > +++ b/sequencer.c
> > @@ -485,7 +485,7 @@ static char **read_author_script(void)
> >   * author metadata.
> >   */
> >  int sequencer_commit(const char *defmsg, struct replay_opts *opts,
> > - int allow_empty, int edit)
> > + int allow_empty, int edit, int amend)
> >  {
> > char **env = NULL;
> > struct argv_array array;
> > @@ -514,6 +514,8 @@ int sequencer_commit(const char *defmsg, struct 
> > replay_opts *opts,
> > argv_array_push(&array, "commit");
> > argv_array_push(&array, "-n");
> >  
> > +   if (amend)
> > +   argv_array_push(&array, "--amend");
> > if (opts->gpg_sign)
> > argv_array_pushf(&array, "-S%s", opts->gpg_sign);
> > if (opts->signoff)
> > @@ -786,7 +788,7 @@ static int do_pick_commit(enum todo_command command, 
> > struct commit *commit,
> > }
> > if (!opts->no_commit)
> > res = sequencer_commit(opts->edit ? NULL : git_path_merge_msg(),
> > -   opts, allow, opts->edit);
> > +   opts, allow, opts->edit, 0);
> >  
> >  leave:
> > free_message(commit, &msg);
> 
> Hmm, this is more about a comment on 18/25, but I suspect that
> "amend" or any opportunity given to the user to futz with the
> contents in the editor invites a wish for the result to be treated
> with stripspace.

The point of separating the cleanup_commit_message from the edit flag is
to allow final fixup commands to stripspace, even without letting the user
edit the message.

So while you are correct that "edit != 0" should imply
"cleanup_commit_message != 0", I would rather keep that explicit.

> No existing callers use "amend" to call this function, so there is
> no change in behaviour, but at the same time, we do not have enough
> information to see if 'amend' should by default toggle cleanup.

It should not. Non-final fixup/squash commands *need* to keep the
comment lines.

Keeping things explicit makes it easier to understand the flow, methinks.

Ciao,
Dscho


Re: [PATCH v2 19/25] sequencer: remember do_recursive_merge()'s return value

2016-10-05 Thread Johannes Schindelin
Hi Junio,

On Mon, 12 Sep 2016, Junio C Hamano wrote:

> Johannes Schindelin  writes:
> 
> > The return value of do_recursive_merge() may be positive (indicating merge
> > conflicts), so let's OR later error conditions so as not to overwrite them
> > with 0.
> 
> Are the untold assumptions as follows?
> 
>  - The caller wants to act on positive (not quite an error), zero
>  (success) or negative (error);
> 
>  - do_recursive_merge() can return positive (success with reservation),
>  zero or negative, and the call to it would return immediately if it got
>  negative;
> 
>  - all other functions that come later may return either zero or
>  negative, and never positive;
> 
>  - Hence the caller can be assured that "res" being positive can only
>  mean do_recursive_merge() succeeded with reservation and everything
>  else succeeded.

More or less. The idea is that there are problems with the merge, and
there are fatal errors.

Alex and I chose to stay close to the original Python implementation when
we decided that merge_trees() should return 1 for success and 0 for
failure. In hindsight, I regret that, as it disagrees with the C
convention to return 0 for success. However, it would have been yet
another mistake to return -1 in case of merge conflicts: the function did
not have a problem (such as out-of-memory or some such).

I can only guess that the do_recursive_merge() function tried to undo the
damage by reversing the logic: 0 for success, 1 for unclean merge. It is
at least more in line with the C convention.

So much so, in fact, that we can still use negative values to indicate
fatal errors that should be handled accordingly.

> This can be extended if the only thing the caller cares about is
> positive/zero/negative and it does not care what exact positive
> value it gets--in such a case, we can loosen the condition on the
> return values from other functions whose return values are OR'ed
> together; they may also return positive to signal the same "not
> quite an error", i.e. updating the latter two points to
> 
>  - all other functions that come later can return positive (success
>with reservation), zero or negative.
> 
>  - Hence the caller can be assured that "res" being positive can
>mean nobody failed with negative return, but it is not an
>unconditional success, which is signalled by value "res" being
>0.
> 
> I cannot quite tell which is the case, especially what is planned in
> the future.  The proposed log message is a good place to explain the
> future course this code will take.

Okay, I will try to come up with a better commit message.

Actually, come to think of it, I will change the patch, as it is too
confusing. What I want is to preserve a positive return value in case of
merge conflicts, and that is exactly what I should do instead of playing
games with the Boolean OR operator.

Ciao,
Dscho


Re: [musl] Re: Regression: git no longer works with musl libc's regex impl

2016-10-05 Thread James B
On Wed, 5 Oct 2016 12:41:50 +0200 (CEST)
Johannes Schindelin  wrote:

> > 
> > Wow, I don't know that Windows is a git's first-tier platform now,
> 
> It is. Git for Windows is maintained by me, and I make as certain as I can
> that it works fine. 
> And yes, we have download numbers to support my claim.
> The latest release is less than 24h old, but I can point you to Git for
> Windows 2.8.1 whose 32-bit installer was downloaded 397,273 times, and
> whose 64-bit installer was downloaded 3,780,079 times.

Number downloads does not make first-tier platform. You know that as well as 
everyone else.

First-tier support is the decision made by the maintainers that the entire 
features of the software must be available on those first tier platforms. So if 
Windows is indeed first-tier platform for git, it means any features that don't 
work on git version of Windows must not be used/developed or even castrated. 
That's a scary thought.

Being a maintainer for "Git for Windows" does not make one automatically as the 
maintainer for "git", although that can happen.

So this decision that "Windows is now a first-tier platform for git" - is your 
own opinion, or is this the collective opinion of *all* the git maintainers? 


> 
> > and Linux/POSIX second.
> 
> This is not at all what I said, so please be careful of what you accuse
> me.

Yes, you did not say that. I said that. And I will say more. Git has 
Linux/POSIX roots. Attempting to "not use common POSIX features because they're 
not available on Windows" *does* make Linux/POSIX feels like second class 
platform for git. The way I see it, it should be *the other way around*.

It's a very sad day for a tool that was developed originally to maintain Linux 
kernel, by the Linux kernel author, now is restricted to avoid use/optimise on 
Linux/POSIX features *because* it has to run on another Windows ...

> 
> What I said is that we never exploited the full POSIX standard, but that
> we made certain to use a subset of POSIX in Git which would be relatively
> easy to emulate using Windows' API.

All this just proves my point above.

And - I notice you use the pronoun "we" - is that a "royal we" (which means the 
entire point is your own or your cohorts position), or is it the official 
position of all git maintainers?

> 
> > Are we talking about the same git that was originally written in Linus
> > Torvalds, and is used to manage Linux kernel?
> 
> It was originally written by (not in) Linus Torvalds, and yes, the Linux
> kernel is one of its many users.

That was a rhetorical question.

> 
> > Are you by any chance employed by Redmond, directly or indirectly?
> 
> I am not exactly employed by Redmond, but by Microsoft (this is what you
> meant, I guess).
> 
> I maintained Git for Windows in my spare time, next to a very demanding
> position in science, for eight years. In 2015, I joined Microsoft and part
> of my role is to maintain Git for Windows, allowing me to do a much better
> job at it.


Well thank you for being honest. I can see now why you responded the way you 
did (and still do). By being employed by Microsoft, and especially paid to work 
on Git for Windows, you have all the incentives to make it work best on 
Windows, and to make it as its first-tier platform within the limitation of 
Windows.

That in itself is not a problem - it only starts to become a problem when you 
try to cut down support for other platforms or stifle improvements on other 
platforms because "hey it makes it too hard to do those things in Windows".

> 
> Of course, I do not only improve Git's Windows support, but contribute
> other patches, too. You might also appreciate the fact that some of my
> colleagues started contributing patches to Git that benefit all Git users.
> 

By "colleagues" I assume other Microsoft employees? 
I don't have a problem with that - thank you and your colleagues for making git 
better.

But that does not give the right to you to control that "if it doesn't work on 
Windows we shouldn't do it". If you do, and at the same time claim that 
musl-libc (which is Linux-only) support is unimportant compared to your 4 
million Git-for-Windows downloads really don't do well for your or your 
employer's image. I don't have to say why - everyone outside Microsoft knows 
why.

In conclusion, I certainly hope that your view is not shared by the other git 
maintainers.

PS: Rich, sorry for the distraction. I have said what I want to say, so I'll 
bow out from this thread.

cheers,
James


Re: [PATCH v2 04/25] sequencer: future-proof remove_sequencer_state()

2016-10-05 Thread Johannes Schindelin
Hi Junio,

On Mon, 12 Sep 2016, Junio C Hamano wrote:

> Johannes Schindelin  writes:
> 
> > +static const char *get_dir(const struct replay_opts *opts)
> > +{
> > +   return git_path_seq_dir();
> > +}
> 
> Presumably this is what "In a couple of commits" meant, i.e. opts
> will be used soonish.

Exactly. The sequencer code was taught to use a state directory different
from rebase -i's (even if it quite clearly imitated rebase -i's approach),
to allow for rebase -i to run at the same time as the sequencer (by
calling it via cherry-pick).

So we need to be able to use different seq_dir locations, depending on the
mode we are running in.

I briefly considered consolidating them and using .git/rebase-merge/ as
state directory also for cherry-pick/revert, but that would cause
problems: I am surely not the only user who cherry-picks commits manually
while running interactive rebases.

Ciao,
Dscho


Re: [PATCH v2 05/25] sequencer: allow the sequencer to take custody of malloc()ed data

2016-10-05 Thread Johannes Schindelin
Hi Junio,

On Mon, 12 Sep 2016, Junio C Hamano wrote:

> Johannes Schindelin  writes:
> 
> > The sequencer is our attempt to lib-ify cherry-pick. Yet it behaves
> > like a one-shot command when it reads its configuration: memory is
> > allocated and released only when the command exits.
> >
> > This is kind of okay for git-cherry-pick, which *is* a one-shot
> > command. All the work to make the sequencer its work horse was
> > done to allow using the functionality as a library function, though,
> > including proper clean-up after use.
> >
> > This patch introduces an API to pass the responsibility of releasing
> > certain memory to the sequencer. Example:
> >
> > const char *label =
> > sequencer_entrust(opts, xstrfmt("From: %s", email));
> 
> I thought we (not just me) were already pretty clear during the last
> round of review that we will not want this entrust() thing.

That does not match my understanding.

The problem is that we are building functionality for libgit.a, not merely
for a builtin that we know will simply exit() and take all allocated
memory with it.

The additional problem is that the sequencer was *already* meant for
libgit.a, yet simply strdup()s data left and right and assigns it to const
fields, purposefully wasting memory.

Sure, I can leave those memory leaks in, but then I also have to introduce
new ones via the rebase -i support.

If you prefer to accept such sloppy work, I will change it of course,
feeling dirty that it has my name on it.

Ciao,
Dscho


Re: git commit -p with file arguments

2016-10-05 Thread Christian Neukirchen
Duy Nguyen  writes:

>> At the least I think we should clarify this in the document.
>
> How about something like this? Would it help?

I think it captures the current behavior well.

-- 
Christian Neukirchenhttp://chneukirchen.org


Re: [PATCH v3 0/6] Pull out require_clean_work_tree() functionality from builtin/pull.c

2016-10-05 Thread Johannes Schindelin
Hi Junio,

On Tue, 4 Oct 2016, Junio C Hamano wrote:

> Johannes Schindelin  writes:
> 
> > This is the 5th last patch series of my work to accelerate interactive
> > rebases in particular on Windows.
> 
> Offtopic, but I am always confused by what you might mean by this
> "nth last patch series".  Is this series 5th from the last and we
> have four more to go?

Yes, we have four more to go:

- the patch series (called "prepare-sequencer" in my fork) preparing the
  sequencer code for the next patch series, such as revamping the parser
  for the edit script (or "todo script" as I used to say all the time, or
  "insn sheet" as sequencer calls it),

- the patch series ("sequencer-i") teaching the sequencer to parse and
  execute the commands of rebase -i's git-rebase-todo file,

- the patch series ("rebase--helper") introducing the builtin, and using
  it from rebase -i, and finally

- the patch series ("rebase-i-extra") that moves more performance critical
  bits and pieces from git-rebase--interactive.sh into the rebase--helper.

I had originally planned to stop at rebase--helper and invite other
developers to join the fun of making rebase -i a true builtin, but the
performance improvement was surprisingly disappointing before the
rebase--helper learned to skip unnecessary picks, to verify that the
script is valid, to expand/collapse the SHA-1s, and to rearrange
fixup!/squash!  lines.

> In any case, after a quick re-read and comparison with the last
> round, I think this is in a good shape.  I'd say that we would wait
> for a few days for others to comment and then merge it to 'next' if
> we missed nothing glaringly wrong.

Perfect!
Dscho


Re: [PATCH v3 3/6] Make the require_clean_work_tree() function reusable

2016-10-05 Thread Johannes Schindelin
Hi Junio,

On Tue, 4 Oct 2016, Junio C Hamano wrote:

> I'd tweak the message while queuing, though.
> 
> wt-status: make the require_clean_work_tree() function reusable
> 
> The function "git pull" uses to stop the user when the working
> tree has changes is useful in other places.

I stumbled over this sentence. How about

The function used by "git pull" to stop [...]

instead?

Thanks,
Dscho


Re: [musl] Re: Regression: git no longer works with musl libc's regex impl

2016-10-05 Thread Johannes Schindelin
Hi Rich,

On Tue, 4 Oct 2016, Rich Felker wrote:

> On Tue, Oct 04, 2016 at 06:08:33PM +0200, Johannes Schindelin wrote:
>
> > And lastly, the best alternative would be to teach musl about
> > REG_STARTEND, as it is rather useful a feature.
> 
> Maybe, but it seems fundamentally costly to support -- it's extra
> state in the inner loops that imposes costly spill/reload on archs
> with too few registers (x86).

It is true that it could cause that.

I had a brief look at the source code (you use backtracking... hopefully
nobody uses musl to parse regular expressions from untrusted, or
inexperienced, sources [*1*]), and it seems that the regex code might
spill unnecessarily already (I see, for example, that the reg_notbol,
reg_noteol and reg_newline flags all use up complete int registers, not
merely bits of a single one).

It seems, specifically, that the *match_end_ofs parameter of the two
regexec backends is always set to point to eo, which is so far not
initialized. You could initialize it to -1 and set it to pmatch[0].rm_eo
if the REG_STARTEND flag is set. The GET_NEXT_WCHAR() macro would then
need to test something like

if (str_byte >= string + *match_end_ofs) {
ret = REG_NOMATCH; goto error_exit;
}

This does not handle non-zero pmatch[0].rm_so, though. I would probably
try to pass another input parameter for that, but I have not verified yet
that a "^" would be handled properly (if pmatch[0].rm_so > 0 and
REG_STARTEND is set, "^" should *not* match).

> I'll look at doing this when we overhaul/replace the regex
> implementation, and I'm happy to do some performance-regression tests
> for adding it now if someone has a simple patch (as was mentioned on the
> musl list).

I'd be interested to be kept in the loop, if you do not mind Cc:ing me.

Ciao,
Johannes

Footnote *1*:
http://stackstatus.net/post/147710624694/outage-postmortem-july-20-2016


RE: [musl] Re: Regression: git no longer works with musl libc's regex impl

2016-10-05 Thread Johannes Schindelin
Hi writeonce,

On Tue, 4 Oct 2016, writeo...@midipix.org wrote:

> < On Tue, 4 Oct 2016, Rich Felker wrote:
> < 
> < > On Tue, Oct 04, 2016 at 11:27:22AM -0400, Jeff King wrote:
> < > > On Tue, Oct 04, 2016 at 11:08:48AM -0400, Rich Felker wrote:
> < > > 
> < > > > 1. is nonzero mod page size, it just works; the remainder of the
> < > > > last page reads as zero bytes when mmapped.
> < > > 
> < > > Is that a portable assumption?
> < > 
> < > Yes.
> < 
> < No, it is not. You quote POSIX, but the matter of the fact is that we
> < use a subset of POSIX in order to be able to keep things running on
> < Windows.
> 
> As far as I can tell (and as the attached program may help demonstrate),
> the above assumption has been valid on all versions of Windows since at
> least Windows 2000.

And since W2K is already past its end of life, it would be safe for
practical considerations.

However, I have to add two comments to that:

- it is *not* guaranteed. The behavior is undefined, even if you see
  consistent behavior so far. Future Windows versions might break that
  assumption freely, though.

- some implementations of the REG_STARTEND feature have the nice property
  that they can read past NUL characters. Granted, not all of them do
  (AFAIU one example is FreeBSD itself, the first platform to sport
  REG_STARTEND), but we at least reap the benefit whenever using a regex
  that *can* read past NUL characters.

> In this context, one thing to remember is that the page-size for the mod
> operation is 4096, whereas the POSIX page-size (for the purpose of mmap
> and mremap) is 65536.

Indeed. A colleague of mine spotted the segfault when diffing a file that
was *exactly* 4,096 bytes.

> Note also that in the case of file-backed mapped sections, using
> kernel32.dll or msvcrt.dll or cygwin/newlib or midipix/musl is of little
> significance, specifically since all invoke ZwCreateSection and
> ZwMapViewOfSection under the hood.

Right. It's all backed by the very same kernel functions.

Ciao,
Johannes


Re: "Purposes, Concepts,Misfits, and a Redesign of Git" (a research paper)

2016-10-05 Thread Duy Nguyen
On Wed, Oct 5, 2016 at 5:14 PM, Jakub Narębski  wrote:
> [git@vger.kernel.org does not accept HTML emails]
>
> I just hope that this email don't get mangled too much...
>
> On 5 October 2016 at 04:55, Santiago Perez De Rosso
>  wrote:
>> On Fri, Sep 30, 2016 at 6:25 PM Jakub Narębski  wrote:
>>> W dniu 30.09.2016 o 18:14, Konstantin Khomoutov pisze:
>>>
 The "It Will Never Work in Theory" blog has just posted a summary of a
 study which tried to identify shortcomings in the design of Git.

 In the hope it might be interesting, I post this summary here.
 URL: http://neverworkintheory.org/2016/09/30/rethinking-git.html
>>>
>>> I will comment on the article itself, not just on the summary.
>>>
>>> | 2.2 Git
>>> [...]
>>> | But tracked files cannot be ignored; to ignore a tracked file
>>> | one has to mark it as “assume unchanged.” This “assume
>>> | unchanged” file will not be recognized by add; to make it
>>> | tracked again this marking has to be removed.
>>>
>>> WRONG!  Git has tracked files, untracked unignored files, and
>>> untracked ignored files (mostly considered unimportant).
>>>
>>> The "assume unchanged" bit is _performance_ optimization. It is not,
>>> and cannot be a 'ignore tracked files' bit - here lies lost work!!!
>>> You can use (imperfectly) "prefer worktree" bit hack instead.
>>>
>>> You can say, if 'ignoring change to tracked files' is motivation,
>>> or purpose, it lacks direct concept.
>>
>>
>> I don't see what's wrong with the paragraph you mention. I am aware of the
>> fact that assumed unchanged is intended to be used as a performance
>> optimization but that doesn't seem to be the way it is used in practice.
>> Users have appropriated the optimization and effectively turned into a
>> concept that serves the purpose of preventing the commit of a file. For
>> example:
>>
>> from 
>> http://gitready.com/intermediate/2009/02/18/temporarily-ignoring-files.html
>>
>>  So, to temporarily ignore changes in a certain file, run:
>>  git update-index --assume-unchanged 
>>  ...
>>
>> from 
>> http://stackoverflow.com/questions/17195861/undo-git-update-index-assume-unchanged-file
>>  The way you git ignore watching/tracking a particular dir/file.
>>  you just run this:
>>  git update-index --assume-unchanged 
>> ...
>>
>>
>> btw, this appropriation suggests that users want to be able to ignore
>> tracked files and they do what they can with what they are given (which
>> in this case means abusing the assumed unchanged bit).
>
> Yes, this is true that users may want to be able to ignore changes to
> tracked files (commit with dirty tree), but using `assume-unchanged` is
> wrong and dangerous solution.  Unfortunately the advice to use it is
> surprisingly pervasive.  I would thank you to not further this error.
> (Well, `skip-worktree` is newer, that's why it is lesser known, perhaps)
>
> To ignore tracked files you need to use `skip-worktree` bit.
>
> You can find the difference between `assume-unchanged` and
> `skip-worktree`, and when use which in:
> http://stackoverflow.com/questions/13630849/git-difference-between-assume-unchanged-and-skip-worktree
> http://fallengamer.livejournal.com/93321.html
> http://blog.stephan-partzsch.de/how-to-ignore-changes-in-tracked-files-with-git/
>
> The difference is that skip-worktree will not overwrite a file that is
> different from the version in the index, but assume-unchanged can.  This
> means that the latter can OVERWRITE YOUR PRECIOUS CHANGES!
>
> Some people started to recommend it
> http://stackoverflow.com/questions/32251037/ignore-changes-to-a-tracked-file
> http://www.virtuouscode.com/2011/05/20/keep-local-modifications-in-git-tracked-files/

And since skip-worktree bits may be set/cleared freely when sparse
checkout mode is on, you should never manipulate these bits directly
if you also use sparse checkout.

>>> | *Detached Head* Suppose you are working on some branch
>>> | and realize that the last few commits you did are wrong, so
>>> | you decide to go back to an old commit to start over again.
>>> | You checkout that old commit and keep working creating
>>> | commits. You might be surprised to discover that these new
>>> | commits you’ve been working on belong to no branch at all.
>>> | To avoid losing them you need to create a new branch or reset
>>> | an existing one to point to the last commit.
>>>
>>> It would be hard to be surprised unless one is in habit of
>>> disregarding multi-line warning from Git... ;-)
>>
>> True if you are an expert user, but I can assure you novices will
>> find that situation baffling, even with the multi-line warnings.

Hmm...  when you switch away from a detached HEAD, you are advised to
do "git branch  blah blah". How is it baffling? Genuine
question, maybe I have been using git for too long I just fail to see
it.

> True, the "detached HEAD" case (aka "unnamed branch") can be puzzling
> for Git users, and it has few uses (e.g. checking out the state of
> tag temporarily, to test it).
>

Re: [musl] Re: Regression: git no longer works with musl libc's regex impl

2016-10-05 Thread Johannes Schindelin
Hi James,

On Wed, 5 Oct 2016, James B wrote:

> On Tue, 4 Oct 2016 18:08:33 +0200 (CEST)
> Johannes Schindelin  wrote:
> 
> > No, it is not. You quote POSIX, but the matter of the fact is that we
> > use a subset of POSIX in order to be able to keep things running on
> > Windows.
> > 
> > And quite honestly, there are lots of reasons to keep things running
> > on Windows, and even to favor Windows support over musl support. Over
> > four million reasons: the Git for Windows users.
> 
> Wow, I don't know that Windows is a git's first-tier platform now,

It is. Git for Windows is maintained by me, and I make as certain as I can
that it works fine. And yes, we have download numbers to support my claim.
The latest release is less than 24h old, but I can point you to Git for
Windows 2.8.1 whose 32-bit installer was downloaded 397,273 times, and
whose 64-bit installer was downloaded 3,780,079 times.

> and Linux/POSIX second.

This is not at all what I said, so please be careful of what you accuse
me.

What I said is that we never exploited the full POSIX standard, but that
we made certain to use a subset of POSIX in Git which would be relatively
easy to emulate using Windows' API.

> Are we talking about the same git that was originally written in Linus
> Torvalds, and is used to manage Linux kernel?

It was originally written by (not in) Linus Torvalds, and yes, the Linux
kernel is one of its many users.

Git is not used only for the Linux kernel, though, and I am certain that
Linus agrees that it should not cater only to the Linux folks. Git is used
very widely in OSS as well as in the industry. So we, the Git developers,
kind of have an obligation to make things work in a much broader
perspective than you suggested.

> Are you by any chance employed by Redmond, directly or indirectly?

I am not exactly employed by Redmond, but by Microsoft (this is what you
meant, I guess).

I maintained Git for Windows in my spare time, next to a very demanding
position in science, for eight years. In 2015, I joined Microsoft and part
of my role is to maintain Git for Windows, allowing me to do a much better
job at it.

Of course, I do not only improve Git's Windows support, but contribute
other patches, too. You might also appreciate the fact that some of my
colleagues started contributing patches to Git that benefit all Git users.

> Sorry - can't help it.

I do not know why you are sorry, and I do not believe that you have to be.

Ciao,
Johannes


Re: git commit -p with file arguments

2016-10-05 Thread Duy Nguyen
On Fri, Sep 09, 2016 at 05:54:30PM +0700, Duy Nguyen wrote:
> On Tue, Sep 6, 2016 at 4:08 AM, Christian Neukirchen
>  wrote:
> > Hi,
> >
> > I noticed the following suprising behavior:
> >
> > % git --version
> > git version 2.10.0
> >
> > % git add bar
> > % git status -s
> > A  bar
> >  M foo
> >
> > % git commit -p foo
> > [stage a hunk]
> > ...
> > # Explicit paths specified without -i or -o; assuming --only paths...
> > # On branch master
> > # Changes to be committed:
> > #   new file:   bar
> > #   modified:   foo
> > #
> >
> > So why does it want to commit bar too, when I explicitly wanted to
> > commit foo only?
> >
> > This is not how "git commit files..." works, and the man page says
> >
> > 3.by listing files as arguments to the commit command, in which
> >case the commit will ignore changes staged in the index, and
> >instead record the current content of the listed files (which 
> > must
> >already be known to Git);
> >
> > I'd expect "git commit -p files..." to work like
> > "git add -p files... && git commit files...".
> 
> ...
> 
> At the least I think we should clarify this in the document.

How about something like this? Would it help?

-- 8< --
Subject: [PATCH] git-commit.txt: clarify --patch mode with pathspec

How pathspec is used, with and without --interactive/--patch, is
different. But this is not clear from the document. These changes hint
the user to keep reading (to option #5) instead of stopping at #2 and
assuming --patch/--interactive behaves the same way.

And since all the options listed here always mention how the index is
involved (or not) in the final commit, add that bit for #5 as well. This
"on top of the index" is implied when you head over git-add(1), but if
you just go straight to the "Interactive mode" and not read what git-add
is for, you may miss it.

Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 Documentation/git-commit.txt | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/Documentation/git-commit.txt b/Documentation/git-commit.txt
index b0a294d..f2ab0ee 100644
--- a/Documentation/git-commit.txt
+++ b/Documentation/git-commit.txt
@@ -29,7 +29,8 @@ The content to be added can be specified in several ways:
 2. by using 'git rm' to remove files from the working tree
and the index, again before using the 'commit' command;
 
-3. by listing files as arguments to the 'commit' command, in which
+3. by listing files as arguments to the 'commit' command
+   (without --interactive or --patch switch), in which
case the commit will ignore changes staged in the index, and instead
record the current content of the listed files (which must already
be known to Git);
@@ -41,7 +42,8 @@ The content to be added can be specified in several ways:
actual commit;
 
 5. by using the --interactive or --patch switches with the 'commit' command
-   to decide one by one which files or hunks should be part of the commit,
+   to decide one by one which files or hunks should be part of the commit
+   in addition to contents in the index,
before finalizing the operation. See the ``Interactive Mode'' section of
linkgit:git-add[1] to learn how to operate these modes.
 
-- 
2.8.2.524.g6ff3d78

-- 8< --
Duy


Re: Feature Request: user defined suffix for temp files created by git-mergetool

2016-10-05 Thread Josef Ridky
Hi David,

thank you very much for your reply.
Today I realized, that my attachment has been cut off, so I sent it in the 
morning [1].

I believe, that most of answer can be find in my previous email from this 
morning.
Only change, that should be done by this request is add possibility to edit 
hard-coded suffix of temporary files. 
I don't think, that change hard-coded suffix to switch between two hard-coded 
suffix of temporary files is suitable solution, but as well it's not bad 
solution.

Please look at the patch [1] and tell me, what do you think about this change.

[1] http://marc.info/?l=git&m=147565363609649&w=2

Regards
Josef

| Sent: Wednesday, October 5, 2016 11:47:06 AM
| 
| On Tue, Oct 04, 2016 at 01:18:47AM -0400, Josef Ridky wrote:
| > Hi Anatoly,
| > 
| > 
| > | Sent: Monday, October 3, 2016 5:18:44 PM
| > | 
| > | Hi Josef,
| > | 
| > | 
| > | On Mon, Oct 3, 2016 at 8:36 AM, Josef Ridky  wrote:
| > | > In several projects, we are using git mergetool for comparing files
| > | > from
| > | > different folders.
| > | > Unfortunately, when we have opened three files  for comparing using
| > | > meld
| > | > tool (e.q. Old_version -- Result -- New_version),
| > | > we can see only name of temporary files created by mergetool in the
| > | > labels
| > | > (e.g. foo_REMOTE -- foo_BASE -- foo_LOCAL)
| > | > and users (and sometime even we) are confused, which of the files
| > | > should
| > | > they edit and save.
| > | 
| > | `git mergetool` just creates temporary files (with some temporary
| > | names) and calls `meld` (or `vimdiff`, etc) with the file names as
| > | parameters. So why wouldn't you call `meld` with the file names you
| > | want?
| > 
| > 
| > Because files, that we want, are temporary files created by
| > git mergetool and we are not able to change their name.
| 
| [I didn't see your original patch, but we actually prefer inline
|  patches in the email, as sent via `git send-email`.
|  Documentation/SubmittingPatches has more details.
| 
|  Please also make sure to add a test to t/t7610-mergetool.sh
|  exercising any new features.]
| 
| Are you proposing support for config variables to control how
| the temporary files are named?
| 
| e.g. something like "mergetool.strings.{local,remote,base}" for
| overriding the hard-coded {LOCAL,REMOTE,BASE} strings?
| 
| I don't want to over-engineer it, but do you want to support
| executing a command to get the name, or is having a replacement
| sufficient?
| 
| Now I'm curious... if replacing the strings is sufficient, what
| do you plan to call them?  I can imagine maybe something like
| OURS, and THEIRS might be helpful since it matches the
| nomenclature already used by Git, e.g. "git merge -s ours".
| 
| Since these are temporary files, changing these names might not
| be entirely out of the question.  This might be a case where
| using the same words as a related Git feature might help reduce
| the mental burden of using mergetool.  OURS and THEIRS are
| probably the only names that fit that category, IMO.
| BASE is already good enough (merge-base).
| 
| The downside of making it configurable is that it can confuse
| users who use mergetool at someone else's desk where they've
| named these strings to "catty", "wombat", and "jimbo".  This
| doesn't seem like the kind of place where we want to allow users
| to be creative, but we do care about having a good default.
| 
| OURS and THEIRS are intuitive names, so switching existing users
| to those would not have much downside IMO, and it's a little
| less "I just merged a REMOTE branch" centric, which is good.
| 
| Do you think these names should be changed?
| If so, did you have those names in mind, or something else
| entirely?
| 
| cheers,
| --
| David
| 


Re: "Purposes, Concepts,Misfits, and a Redesign of Git" (a research paper)

2016-10-05 Thread Jakub Narębski
[git@vger.kernel.org does not accept HTML emails]

I just hope that this email don't get mangled too much...

On 5 October 2016 at 04:55, Santiago Perez De Rosso
 wrote:
> On Fri, Sep 30, 2016 at 6:25 PM Jakub Narębski  wrote:
>> W dniu 30.09.2016 o 18:14, Konstantin Khomoutov pisze:
>>
>>> The "It Will Never Work in Theory" blog has just posted a summary of a
>>> study which tried to identify shortcomings in the design of Git.
>>>
>>> In the hope it might be interesting, I post this summary here.
>>> URL: http://neverworkintheory.org/2016/09/30/rethinking-git.html
>>
>> I will comment on the article itself, not just on the summary.
>>
>> | 2.2 Git
>> [...]
>> | But tracked files cannot be ignored; to ignore a tracked file
>> | one has to mark it as “assume unchanged.” This “assume
>> | unchanged” file will not be recognized by add; to make it
>> | tracked again this marking has to be removed.
>>
>> WRONG!  Git has tracked files, untracked unignored files, and
>> untracked ignored files (mostly considered unimportant).
>>
>> The "assume unchanged" bit is _performance_ optimization. It is not,
>> and cannot be a 'ignore tracked files' bit - here lies lost work!!!
>> You can use (imperfectly) "prefer worktree" bit hack instead.
>>
>> You can say, if 'ignoring change to tracked files' is motivation,
>> or purpose, it lacks direct concept.
>
>
> I don't see what's wrong with the paragraph you mention. I am aware of the
> fact that assumed unchanged is intended to be used as a performance
> optimization but that doesn't seem to be the way it is used in practice.
> Users have appropriated the optimization and effectively turned into a
> concept that serves the purpose of preventing the commit of a file. For
> example:
>
> from 
> http://gitready.com/intermediate/2009/02/18/temporarily-ignoring-files.html
>
>  So, to temporarily ignore changes in a certain file, run:
>  git update-index --assume-unchanged 
>  ...
>
> from 
> http://stackoverflow.com/questions/17195861/undo-git-update-index-assume-unchanged-file
>  The way you git ignore watching/tracking a particular dir/file.
>  you just run this:
>  git update-index --assume-unchanged 
> ...
>
>
> btw, this appropriation suggests that users want to be able to ignore
> tracked files and they do what they can with what they are given (which
> in this case means abusing the assumed unchanged bit).

Yes, this is true that users may want to be able to ignore changes to
tracked files (commit with dirty tree), but using `assume-unchanged` is
wrong and dangerous solution.  Unfortunately the advice to use it is
surprisingly pervasive.  I would thank you to not further this error.
(Well, `skip-worktree` is newer, that's why it is lesser known, perhaps)

To ignore tracked files you need to use `skip-worktree` bit.

You can find the difference between `assume-unchanged` and
`skip-worktree`, and when use which in:
http://stackoverflow.com/questions/13630849/git-difference-between-assume-unchanged-and-skip-worktree
http://fallengamer.livejournal.com/93321.html
http://blog.stephan-partzsch.de/how-to-ignore-changes-in-tracked-files-with-git/

The difference is that skip-worktree will not overwrite a file that is
different from the version in the index, but assume-unchanged can.  This
means that the latter can OVERWRITE YOUR PRECIOUS CHANGES!

Some people started to recommend it
http://stackoverflow.com/questions/32251037/ignore-changes-to-a-tracked-file
http://www.virtuouscode.com/2011/05/20/keep-local-modifications-in-git-tracked-files/


>> | The notion of a “remote branch” must not be confused
>> | with that of an “upstream branch.” An upstream branch is
>> | just a convenience for users: after the user assigns it to some
>> | branch, commands like pull and push default to use that
>> | branch for fetching and pushing changes if no branch is given
>> | as input.
>>
>> Actually "upstream branch" (and related "upstream repository")
>> are a concept, not only a convenience. They denote a branch
>> (usually in remote repository) which is intended to ultimately
>> include changes in given branch. Note that "upstream branch"
>> can be set separately for any given local branch.
>
>
> We never say upstream branch is not a concept. It even appears in the
> table.

Hmmm... I got onfused by words "is just a convenience", which for me
implies that it is not a concept.

>>
>>
>> One thing that can enormously help recovering from errors, and
>> is not covered in the list of concepts is REFLOG.
>
>
> Yes, the analysis is not exhaustive, there are other concepts missing
> too (e.g., "submodule")

I think reflog is something that every user should know about, and
useful for all.  Submodules and subtrees is something situational,
needed only for a subset of users.

On the other hand the concept of "submodules" is something that it is
under active development, and it is assumed to be immature.  So coming
up with a good (re)design for this feature, and/or good set of
concept, would be a very g

Re: Feature Request: user defined suffix for temp files created by git-mergetool

2016-10-05 Thread David Aguilar
On Tue, Oct 04, 2016 at 01:18:47AM -0400, Josef Ridky wrote:
> Hi Anatoly,
> 
> 
> | Sent: Monday, October 3, 2016 5:18:44 PM
> | 
> | Hi Josef,
> | 
> | 
> | On Mon, Oct 3, 2016 at 8:36 AM, Josef Ridky  wrote:
> | > In several projects, we are using git mergetool for comparing files from
> | > different folders.
> | > Unfortunately, when we have opened three files  for comparing using meld
> | > tool (e.q. Old_version -- Result -- New_version),
> | > we can see only name of temporary files created by mergetool in the labels
> | > (e.g. foo_REMOTE -- foo_BASE -- foo_LOCAL)
> | > and users (and sometime even we) are confused, which of the files should
> | > they edit and save.
> | 
> | `git mergetool` just creates temporary files (with some temporary
> | names) and calls `meld` (or `vimdiff`, etc) with the file names as
> | parameters. So why wouldn't you call `meld` with the file names you
> | want?
> 
> 
> Because files, that we want, are temporary files created by
> git mergetool and we are not able to change their name.

[I didn't see your original patch, but we actually prefer inline
 patches in the email, as sent via `git send-email`.
 Documentation/SubmittingPatches has more details.

 Please also make sure to add a test to t/t7610-mergetool.sh
 exercising any new features.]

Are you proposing support for config variables to control how
the temporary files are named?

e.g. something like "mergetool.strings.{local,remote,base}" for
overriding the hard-coded {LOCAL,REMOTE,BASE} strings?

I don't want to over-engineer it, but do you want to support
executing a command to get the name, or is having a replacement
sufficient?

Now I'm curious... if replacing the strings is sufficient, what
do you plan to call them?  I can imagine maybe something like
OURS, and THEIRS might be helpful since it matches the
nomenclature already used by Git, e.g. "git merge -s ours".

Since these are temporary files, changing these names might not
be entirely out of the question.  This might be a case where
using the same words as a related Git feature might help reduce
the mental burden of using mergetool.  OURS and THEIRS are
probably the only names that fit that category, IMO.
BASE is already good enough (merge-base).

The downside of making it configurable is that it can confuse
users who use mergetool at someone else's desk where they've
named these strings to "catty", "wombat", and "jimbo".  This
doesn't seem like the kind of place where we want to allow users
to be creative, but we do care about having a good default.

OURS and THEIRS are intuitive names, so switching existing users
to those would not have much downside IMO, and it's a little
less "I just merged a REMOTE branch" centric, which is good.

Do you think these names should be changed?
If so, did you have those names in mind, or something else
entirely?

cheers,
-- 
David


Re: [PATCH 1/3] Resurrect "diff-lib.c: adjust position of i-t-a entries in diff"

2016-10-05 Thread Duy Nguyen
On Tue, Oct 4, 2016 at 11:15 PM, Junio C Hamano  wrote:
> Duy Nguyen  writes:
>
>> We don't use it internally _yet_. I need to go through all the
>> external diff code and see --shift-ita should be there. The end goal
>> is still changing the default behavior and getting rid of --shift-ita,
>
> I do not agree with that endgame, and quite honestly I do not want
> to waste time reviewing such a series.

I do not believe current "diff" behavior wrt. i-t-a entries is right
either. There's no point in pursuing this series then. Feel free to
revert 3f6d56d (commit: ignore intent-to-add entries instead of
refusing - 2012-02-07) and bring back the old i-t-a behavior. All the
problems would go away.
-- 
Duy


Re: Feature Request: user defined suffix for temp files created by git-mergetool

2016-10-05 Thread Josef Ridky
Hi, 

I have just realize, that my attachment has been cut off from my previous 
message.
Below you can find patch with requested change.

Add support for user defined suffix part of name of temporary files
created by git mergetool
---
 Documentation/git-mergetool.txt | 26 -
 git-mergetool.sh| 52 +
 2 files changed, 72 insertions(+), 6 deletions(-)

diff --git a/Documentation/git-mergetool.txt b/Documentation/git-mergetool.txt
index e846c2e..6cf5935 100644
--- a/Documentation/git-mergetool.txt
+++ b/Documentation/git-mergetool.txt
@@ -8,7 +8,7 @@ git-mergetool - Run merge conflict resolution tools to resolve 
merge conflicts
 SYNOPSIS
 
 [verse]
-'git mergetool' [--tool=] [-y | --[no-]prompt] [...]
+'git mergetool' [--tool=] [-y | --[no-]prompt] [--local=] 
[--remote=] [--backup=] [--base=] [...]
 
 DESCRIPTION
 ---
@@ -79,6 +79,30 @@ success of the resolution after the custom tool has exited.
Prompt before each invocation of the merge resolution program
to give the user a chance to skip the path.
 
+--local=::
+   Use string from  as part of suffix of name of temporary
+   file (local) for merging. If not used or is equal with any
+   other (remote|backup|base), default value is used.
+   Default suffix is LOCAL.
+
+--remote=::
+   Use string from  as part of suffix of name of temporary
+   file (remote) for merging. If not used or is equal with any
+   other (local|backup|base), default value is used.
+   Default suffix is REMOTE.
+
+--backup=::
+   Use string from  as part of suffix of name of temporary
+   file (backup) for merging. If not used or is equal with any
+   other (local|remote|base), default value is used.
+   Default suffix is BACKUP.
+
+--base=::
+   Use string from  as part of suffix of name of temporary
+   file (base) for merging. If not used or is equal with any
+   other (local|remote|backup), default value is used.
+   Default suffix is BASE.
+
 TEMPORARY FILES
 ---
 `git mergetool` creates `*.orig` backup files while resolving merges.
diff --git a/git-mergetool.sh b/git-mergetool.sh
index bf86270..e3433ee 100755
--- a/git-mergetool.sh
+++ b/git-mergetool.sh
@@ -8,11 +8,18 @@
 # at the discretion of Junio C Hamano.
 #
 
-USAGE='[--tool=tool] [--tool-help] [-y|--no-prompt|--prompt] [file to merge] 
...'
+USAGE='[--tool=tool] [--tool-help] [-y|--no-prompt|--prompt] [--local=name] 
[--remote=name] [--backup=name] [--base=name] [file to merge] ...'
 SUBDIRECTORY_OK=Yes
 NONGIT_OK=Yes
 OPTIONS_SPEC=
 TOOL_MODE=merge
+
+#optional name space convention
+local_name=""
+remote_name=""
+base_name=""
+backup_name=""
+
 . git-sh-setup
 . git-mergetool--lib
 
@@ -271,10 +278,33 @@ merge_file () {
BASE=${BASE##*/}
fi
 
-   BACKUP="$MERGETOOL_TMPDIR/${BASE}_BACKUP_$$$ext"
-   LOCAL="$MERGETOOL_TMPDIR/${BASE}_LOCAL_$$$ext"
-   REMOTE="$MERGETOOL_TMPDIR/${BASE}_REMOTE_$$$ext"
-   BASE="$MERGETOOL_TMPDIR/${BASE}_BASE_$$$ext"
+   if [ "$local_name" != "" ]  && [ "$local_name" != "$remote_name" ] && [ 
"$local_name" != "$backup_name" ] && [ "$local_name" != "$base_name" ]
+   then
+   LOCAL="$MERGETOOL_TMPDIR/${BASE}_${local_name}_$$$ext"
+   else
+   LOCAL="$MERGETOOL_TMPDIR/${BASE}_LOCAL_$$$ext"
+   fi
+
+   if [ "$remote_name" != "" ] && [ "$remote_name" != "$local_name" ] && [ 
"$remote_name" != "$backup_name" ] && [ "$remote_name" != "$base_name" ]
+   then
+   REMOTE="$MERGETOOL_TMPDIR/${BASE}_${remote_name}_$$$ext"
+   else
+   REMOTE="$MERGETOOL_TMPDIR/${BASE}_REMOTE_$$$ext"
+   fi
+
+   if [ "$backup_name" != "" ] && [ "$backup_name" != "$local_name" ] && [ 
"$backup_name" != "$remote_name" ] && [ "$backup_name" != "$base_name" ]
+   then
+   BACKUP="$MERGETOOL_TMPDIR/${BASE}_${backup_name}_$$$ext"
+   else
+   BACKUP="$MERGETOOL_TMPDIR/${BASE}_BACKUP_$$$ext"
+   fi
+
+   if [ "$base_name" != "" ] && [ "$base_name" != "$local_name" ] && [ 
"$base_name" != "$remote_name" ] && [ "$base_name" != "$backup_name" ]
+   then
+   BASE="$MERGETOOL_TMPDIR/${BASE}_${base_name}_$$$ext"
+   else
+   BASE="$MERGETOOL_TMPDIR/${BASE}_BASE_$$$ext"
+   fi
 
base_mode=$(git ls-files -u -- "$MERGED" | awk '{if ($3==1) print $1;}')
local_mode=$(git ls-files -u -- "$MERGED" | awk '{if ($3==2) print 
$1;}')
@@ -396,6 +426,18 @@ do
--prompt)
prompt=true
;;
+   --local=*)
+   local_name=${1#--local=}
+   ;;
+   --remote=*)
+   remote_name=${1#--remote=}
+   ;;
+   --base=*)
+   base_name=${1#--base=}
+   ;;
+   --backup=*)
+   backup_name=${1#--backup=}
+   ;;
--)

<    1   2