Re: [PATCH v2 0/1] Make 'git commit' not accidentally lose staged content

2018-09-19 Thread Jeff King
On Wed, Sep 19, 2018 at 06:12:23PM +0200, Duy Nguyen wrote:

> On Wed, Sep 19, 2018 at 1:19 AM Jeff King  wrote:
> >
> > On Tue, Sep 18, 2018 at 12:36:06PM -0700, Jacob Keller wrote:
> >
> > > > I like that, too. It's a little more costly just because it may involve
> > > > object-db writes, but I think in most cases it would be fine. I almost
> > > > always "git stash" away discarded changes these days instead of "git
> > > > reset --hard", because it effectively provides this kind of log.
> > >
> > > Obviously we do eventually turn the index into a tree, which is used
> > > by the commit. Would it be possible to simply somehow store these
> > > trees, and have commands which blow the tree away simply instead, save
> > > it? I'm not sure how costly that is.
> >
> > For an up-to-date index with the cache-tree extension, this should be
> > pretty cheap.  Even for a partially-staged index, where we already have
> > the blob in the object database, it should just incur the tree
> > computation (and parts you didn't touch would hopefully have an
> > up-to-date cache-tree).
> 
> Just FYI. I wanted to get at least pruning support in place before
> posting another RFC. But right now I'm going without cache-tree.
> Whenever a file is updated, the _blob_ hashes are recorded in the
> index log (one line per update, same as reflog format) and the
> description part of the line is the updated path.
> 
> This is simpler to do, and we can still reconstruct a tree/commit in
> memory if we need to (but not whole tree snapshots; but I don't think
> that would be very useful). But maybe cache-tree approach is better.
> We will see.

Oh, interesting. Sort of a hybrid approach. :) That makes sense, and I
don't see any reason it wouldn't work.

-Peff


Re: [PATCH v2 0/1] Make 'git commit' not accidentally lose staged content

2018-09-19 Thread Duy Nguyen
On Wed, Sep 19, 2018 at 1:19 AM Jeff King  wrote:
>
> On Tue, Sep 18, 2018 at 12:36:06PM -0700, Jacob Keller wrote:
>
> > > I like that, too. It's a little more costly just because it may involve
> > > object-db writes, but I think in most cases it would be fine. I almost
> > > always "git stash" away discarded changes these days instead of "git
> > > reset --hard", because it effectively provides this kind of log.
> >
> > Obviously we do eventually turn the index into a tree, which is used
> > by the commit. Would it be possible to simply somehow store these
> > trees, and have commands which blow the tree away simply instead, save
> > it? I'm not sure how costly that is.
>
> For an up-to-date index with the cache-tree extension, this should be
> pretty cheap.  Even for a partially-staged index, where we already have
> the blob in the object database, it should just incur the tree
> computation (and parts you didn't touch would hopefully have an
> up-to-date cache-tree).

Just FYI. I wanted to get at least pruning support in place before
posting another RFC. But right now I'm going without cache-tree.
Whenever a file is updated, the _blob_ hashes are recorded in the
index log (one line per update, same as reflog format) and the
description part of the line is the updated path.

This is simpler to do, and we can still reconstruct a tree/commit in
memory if we need to (but not whole tree snapshots; but I don't think
that would be very useful). But maybe cache-tree approach is better.
We will see.
-- 
Duy


Re: [PATCH v2 0/1] Make 'git commit' not accidentally lose staged content

2018-09-18 Thread Jeff King
On Tue, Sep 18, 2018 at 12:36:06PM -0700, Jacob Keller wrote:

> > I like that, too. It's a little more costly just because it may involve
> > object-db writes, but I think in most cases it would be fine. I almost
> > always "git stash" away discarded changes these days instead of "git
> > reset --hard", because it effectively provides this kind of log.
> 
> Obviously we do eventually turn the index into a tree, which is used
> by the commit. Would it be possible to simply somehow store these
> trees, and have commands which blow the tree away simply instead, save
> it? I'm not sure how costly that is.

For an up-to-date index with the cache-tree extension, this should be
pretty cheap.  Even for a partially-staged index, where we already have
the blob in the object database, it should just incur the tree
computation (and parts you didn't touch would hopefully have an
up-to-date cache-tree).

Where it gets expensive is when you are throwing away working-tree
changes, and you have to re-hash those objects. Conceptually that's not
much worse than just calling `git add`, but there are corner cases
(e.g., imagine you keep a 1GB clone of another project in an ignored
directory, and then `git clean -dx` wants to `git add` all of it).

-Peff


Re: [PATCH v2 0/1] Make 'git commit' not accidentally lose staged content

2018-09-18 Thread Eckhard Maaß
On Tue, Sep 18, 2018 at 12:41:30PM -0700, Jacob Keller wrote:
> I think having both is good. There are a lot of ways to accidentally
> throw away work, and it's pretty frustrating to have it happen. But
> the reflog is also somewhat complicated, and I've definitely seen a
> lot of developers who've never heard of it, and struggle with the
> concept.

It's definitely good to improve on "oh, I screwed up - how can I
recover?" part. 

> I personally think having the nice "it looks like you're about to
> throw away all your changes, are you sure" style of protection using
> something like --clobber-index is useful as a mode, even if we have an
> index log of sorts.

I don't think it's an appealing design. If we have the index reflog
giving us an undo function, then it is (at least for me) irritating to
have additional protection. Furthermore, this protection, to not break
existing workflows, needs to be configurable. This comes with much
baggage on its own. Having then two things which seem to solve the same
problem setting, but somehow different, is in my opinion even more
confusing in the long run.

Greetings,
Eckhard


Re: [PATCH v2 0/1] Make 'git commit' not accidentally lose staged content

2018-09-18 Thread Jacob Keller
On Mon, Sep 17, 2018 at 12:26 PM Junio C Hamano  wrote:
> FWIW, I didn't mean to say that we should give users a way to
> recover.  Your "commit -a" or "commit $path" protection just
> prevents the situation from happening, and I think it is sufficient.
>
> The sole point I wanted to raise by bringing up the above was that
> we should have the same degree of protection against "add $path" or
> "add -u".
>
> Of course, "index log" is interesting and it may even turn out to be
> useful (I was skeptical about "reference log" the same way, but it
> turned out to be useful without burdening the system too heavily),
> and it may even remove the need for the "do not accidentally lose
> information by adding more to the index" protection.  But until that
> happens, if we are to have such a protection, we would wnat to give
> the same degree of protection to "commit" and "add".

I think having both is good. There are a lot of ways to accidentally
throw away work, and it's pretty frustrating to have it happen. But
the reflog is also somewhat complicated, and I've definitely seen a
lot of developers who've never heard of it, and struggle with the
concept.

I personally think having the nice "it looks like you're about to
throw away all your changes, are you sure" style of protection using
something like --clobber-index is useful as a mode, even if we have an
index log of sorts. Having it be default helps new people, even if it
does get in the way of someone who knows what they're doing. Having it
be configurable, to me, sort of defeats the point, since it means
having to tell people to turn this on.

I personally don't mind having to type an extended option to clobber
when I know it's what I want, but I can see that being painful.

However, if we had a reflog for the index, this becomes less of a
problem since recovery is much easier.

Thanks,
Jake


Re: [PATCH v2 0/1] Make 'git commit' not accidentally lose staged content

2018-09-18 Thread Jacob Keller
On Mon, Sep 17, 2018 at 11:15 AM Jeff King  wrote:
>
> On Mon, Sep 17, 2018 at 07:29:26PM +0200, Duy Nguyen wrote:
>
> > > On the other hand, if I am keeping some change that should never be
> > > in a commit in the working tree file, and building the contents in
> > > the index using "add -p" to incrementally, it would be the same
> > > disaster as you are trying to prevent if I by mistake did a whole
> > > path 'add', even if I catch myself doing so before running 'commit'
> > > i.e.
> > >
> > > edit X
> > > git add -p X
> > > git diff --cached X
> > > git diff X
> > > ... repeat the above number of times ...
> > > git add X ;# OOPS!
> > > git add . ;# OOPS! even worse!
> > >
> > > Even though this does not involve "git commit -a" or "git commit X",
> > > an unrecoverable damage that requires redoing the manual work is
> > > already done.
> >
> > I don't see a good way to get to recover this situation. I could go
> > back to the "index log" idea, where we keep a log of index changes (or
> > just "interesting" changes). That way there's no behavior change at
> > all. The user who accidentally updates/deletes something can always
> > retrieve the old content back (assuming that they realize quickly
> > since we can't keep very long log).
>
> FWIW, I like that approach much better, since:
>
>   1. It does not bother or restrict anybody in their workflow; instead,
>  they pay the complexity price only when they know they have made a
>  mistake.
>
>   2. It covers many more cases (e.g., just doing the wrong thing via
>  "add -p").
>

I also think this is a better approach for the same reasons.

> A naive index log would be pretty cheap in CPU, at least for POSIX-ish
> systems. You could just hard link "index" to "index.N" before renaming
> "index.lock" over "index". But I guess if you have a gigantic index,
> that's less appealing. So maybe storing the equivalent of a "--raw" diff
> between the two index states would make more sense (and after all, you
> don't really need the stat-cache or cache-tree). It would cost more to
> reconstruct the index on the fly, but then the point is that you would
> create these logs a lot more than you access them.
>
> > I've been thinking about allowing to undo worktree changes too (e.g.
> > accidental "git reset --hard") and this log can cover it as well.
>
> I like that, too. It's a little more costly just because it may involve
> object-db writes, but I think in most cases it would be fine. I almost
> always "git stash" away discarded changes these days instead of "git
> reset --hard", because it effectively provides this kind of log.
>

Obviously we do eventually turn the index into a tree, which is used
by the commit. Would it be possible to simply somehow store these
trees, and have commands which blow the tree away simply instead, save
it? I'm not sure how costly that is.

Thanks,
Jake


Re: [PATCH v2 0/1] Make 'git commit' not accidentally lose staged content

2018-09-18 Thread Jacob Keller
On Mon, Sep 17, 2018 at 10:09 AM Junio C Hamano  wrote:
>
> It usually is safer (simply because you do not have to think about
> it) to start a behaviour change like this as a strict opt-in to gain
> confidence.

I tend to agree, however.. in this case it could be considered safer
to err on the side of not throwing away the index which could have
crafted changes in it.

> The approach to check if the contents in the index matches that in
> the HEAD per-path (i.e. "The contents we are adding to the index is
> whole working tree contents for that path.  But the index already
> has contents different from HEAD for the path---are we losing
> information by doing this?"), is a very good one.  But for the
> protection to be effective, I think "git commit" and "git add"
> should be covered the same way, ideally with the same code and
> possibly the same configuration knob and/or command line option to
> control the behaviour.

Checking both commit and add makes sense to me.

>
> If the information loss caused by the "add/commit X" or "add
> -u/commit -a" is so serious that this new feature deserves to become
> the default (which I do not yet think it is the case, by the way),
> then we could even forbid "commit X" or "commit -a" when the paths
> involved has difference between the index and the HEAD, without any
> configuration knob or command line override for "commit", and then
> tell the users to use "git add/rm" _with_ the override before coming
> back to "git commit".

I was going to suggest we have some sort of reflog equivalent for the
index, but Duy seems to discuss that in a follow-on mail.

>
> How should this new check intract with paths added with "add -N", by
> the way?


Re: [PATCH v2 0/1] Make 'git commit' not accidentally lose staged content

2018-09-18 Thread Jeff King
On Mon, Sep 17, 2018 at 08:41:36PM +0200, Duy Nguyen wrote:

> > I think the reflog approach has been successful: give these intermediate
> > states a name. So in theory I could do something like:
> >
> >   git checkout -p :@{2.minutes.ago}
> >
> > though it would probably be useful to be able to walk the states, too,
> > just like we have "log --reflog-walk".
> >
> > The syntax above is off-the-cuff (and based on the ":" index
> > syntax). I guess if we had a separate log for "stuff in the worktree
> > that got thrown away by reset" we might need a separate syntax.
> 
> I'm leaning towards reflog too. Not because of the syntax but because
> of reusing the current code base. I don't have to worry about pruning,
> walking, gc-ing... because it's pretty much already there. And the UI
> is not so urgent since reflog file is very readable, early adopters
> can just open the file and get the hash.

Ah, good point on pruning/gc. I had imagined a new "index log" format.
But really if you just turn the result into a tree, then it becomes just
another ref(log). That might be slightly less efficient, but the
flexibility and simplicity are probably worth it. Or maybe with
cache-tree it is not even less efficient.

So that definitely seems like the right direction.

> I'm trying to quickly make something that writes to
> "$GIT_DIR/logs/index" and see how it goes. But yeah I'll probably drop
> this patch. The ":@{2.minutes.ago}" just makes me like this direction
> more, even though I don't know if I could even make that work.

I think "index@{2.minutes.ago}" would almost work, but it would probably
complain about a reflog without a matching ref. That seems like it would
be pretty easy to work around in the reflog code. Or maybe even by
having a stash-like "index log" ref.

-Peff


Re: [PATCH v2 0/1] Make 'git commit' not accidentally lose staged content

2018-09-17 Thread Junio C Hamano
Duy Nguyen  writes:

> On Mon, Sep 17, 2018 at 7:09 PM Junio C Hamano  wrote:
>> ...
>> On the other hand, if I am keeping some change that should never be
>> in a commit in the working tree file, and building the contents in
>> the index using "add -p" to incrementally, it would be the same
>> disaster as you are trying to prevent if I by mistake did a whole
>> path 'add', even if I catch myself doing so before running 'commit'
>> i.e.
>>
>> edit X
>> git add -p X
>> git diff --cached X
>> git diff X
>> ... repeat the above number of times ...
>> git add X ;# OOPS!
>> git add . ;# OOPS! even worse!
>>
>> Even though this does not involve "git commit -a" or "git commit X",
>> an unrecoverable damage that requires redoing the manual work is
>> already done.
>
> I don't see a good way to get to recover this situation. I could go
> back to the "index log" idea,...

FWIW, I didn't mean to say that we should give users a way to
recover.  Your "commit -a" or "commit $path" protection just
prevents the situation from happening, and I think it is sufficient.

The sole point I wanted to raise by bringing up the above was that
we should have the same degree of protection against "add $path" or
"add -u".

Of course, "index log" is interesting and it may even turn out to be
useful (I was skeptical about "reference log" the same way, but it
turned out to be useful without burdening the system too heavily),
and it may even remove the need for the "do not accidentally lose
information by adding more to the index" protection.  But until that
happens, if we are to have such a protection, we would wnat to give
the same degree of protection to "commit" and "add".


Re: [PATCH v2 0/1] Make 'git commit' not accidentally lose staged content

2018-09-17 Thread Duy Nguyen
On Mon, Sep 17, 2018 at 8:15 PM Jeff King  wrote:
> On Mon, Sep 17, 2018 at 07:29:26PM +0200, Duy Nguyen wrote:
> > I don't see a good way to get to recover this situation. I could go
> > back to the "index log" idea, where we keep a log of index changes (or
> > just "interesting" changes). That way there's no behavior change at
> > all. The user who accidentally updates/deletes something can always
> > retrieve the old content back (assuming that they realize quickly
> > since we can't keep very long log).
>
> FWIW, I like that approach much better, since:
>
>   1. It does not bother or restrict anybody in their workflow; instead,
>  they pay the complexity price only when they know they have made a
>  mistake.
>
>   2. It covers many more cases (e.g., just doing the wrong thing via
>  "add -p").
>
> A naive index log would be pretty cheap in CPU, at least for POSIX-ish
> systems. You could just hard link "index" to "index.N" before renaming
> "index.lock" over "index". But I guess if you have a gigantic index,
> that's less appealing. So maybe storing the equivalent of a "--raw" diff
> between the two index states would make more sense (and after all, you
> don't really need the stat-cache or cache-tree). It would cost more to
> reconstruct the index on the fly, but then the point is that you would
> create these logs a lot more than you access them.

Yeah. The problem though is extracting the information out of these
index.N files.

> > I've been thinking about allowing to undo worktree changes too (e.g.
> > accidental "git reset --hard") and this log can cover it as well.
>
> I like that, too. It's a little more costly just because it may involve
> object-db writes, but I think in most cases it would be fine. I almost
> always "git stash" away discarded changes these days instead of "git
> reset --hard", because it effectively provides this kind of log.

Yeah the added cost is pretty much given. You want safety, you pay extra :)

> > The only downside is we need a new command for the UI (or perhaps I
> > can just add "git add --log" or something like that).
>
> I think the reflog approach has been successful: give these intermediate
> states a name. So in theory I could do something like:
>
>   git checkout -p :@{2.minutes.ago}
>
> though it would probably be useful to be able to walk the states, too,
> just like we have "log --reflog-walk".
>
> The syntax above is off-the-cuff (and based on the ":" index
> syntax). I guess if we had a separate log for "stuff in the worktree
> that got thrown away by reset" we might need a separate syntax.

I'm leaning towards reflog too. Not because of the syntax but because
of reusing the current code base. I don't have to worry about pruning,
walking, gc-ing... because it's pretty much already there. And the UI
is not so urgent since reflog file is very readable, early adopters
can just open the file and get the hash.

This also lets me handle logging worktree changes in the future
(keeping track of untracked files is impossible with the "index.N"
approach)

I'm trying to quickly make something that writes to
"$GIT_DIR/logs/index" and see how it goes. But yeah I'll probably drop
this patch. The ":@{2.minutes.ago}" just makes me like this direction
more, even though I don't know if I could even make that work.
-- 
Duy


Re: [PATCH v2 0/1] Make 'git commit' not accidentally lose staged content

2018-09-17 Thread Jeff King
On Mon, Sep 17, 2018 at 07:29:26PM +0200, Duy Nguyen wrote:

> > On the other hand, if I am keeping some change that should never be
> > in a commit in the working tree file, and building the contents in
> > the index using "add -p" to incrementally, it would be the same
> > disaster as you are trying to prevent if I by mistake did a whole
> > path 'add', even if I catch myself doing so before running 'commit'
> > i.e.
> >
> > edit X
> > git add -p X
> > git diff --cached X
> > git diff X
> > ... repeat the above number of times ...
> > git add X ;# OOPS!
> > git add . ;# OOPS! even worse!
> >
> > Even though this does not involve "git commit -a" or "git commit X",
> > an unrecoverable damage that requires redoing the manual work is
> > already done.
> 
> I don't see a good way to get to recover this situation. I could go
> back to the "index log" idea, where we keep a log of index changes (or
> just "interesting" changes). That way there's no behavior change at
> all. The user who accidentally updates/deletes something can always
> retrieve the old content back (assuming that they realize quickly
> since we can't keep very long log).

FWIW, I like that approach much better, since:

  1. It does not bother or restrict anybody in their workflow; instead,
 they pay the complexity price only when they know they have made a
 mistake.

  2. It covers many more cases (e.g., just doing the wrong thing via
 "add -p").

A naive index log would be pretty cheap in CPU, at least for POSIX-ish
systems. You could just hard link "index" to "index.N" before renaming
"index.lock" over "index". But I guess if you have a gigantic index,
that's less appealing. So maybe storing the equivalent of a "--raw" diff
between the two index states would make more sense (and after all, you
don't really need the stat-cache or cache-tree). It would cost more to
reconstruct the index on the fly, but then the point is that you would
create these logs a lot more than you access them.

> I've been thinking about allowing to undo worktree changes too (e.g.
> accidental "git reset --hard") and this log can cover it as well.

I like that, too. It's a little more costly just because it may involve
object-db writes, but I think in most cases it would be fine. I almost
always "git stash" away discarded changes these days instead of "git
reset --hard", because it effectively provides this kind of log.

> The only downside is we need a new command for the UI (or perhaps I
> can just add "git add --log" or something like that).

I think the reflog approach has been successful: give these intermediate
states a name. So in theory I could do something like:

  git checkout -p :@{2.minutes.ago}

though it would probably be useful to be able to walk the states, too,
just like we have "log --reflog-walk".

The syntax above is off-the-cuff (and based on the ":" index
syntax). I guess if we had a separate log for "stuff in the worktree
that got thrown away by reset" we might need a separate syntax.

> Should I just drop this patch and go that direction instead? More
> work, but maybe better end result.

I guess my whole response is a long-winded "yes". ;)

-Peff


Re: [PATCH v2 0/1] Make 'git commit' not accidentally lose staged content

2018-09-17 Thread Duy Nguyen
On Mon, Sep 17, 2018 at 7:09 PM Junio C Hamano  wrote:
>
> Nguyễn Thái Ngọc Duy   writes:
>
> > This is about mixing "git add -p" and "git commit -a" (or "git commit
> > ") where you may accidentally lose staged changes. After the
> > discussion with Jonathan, I'm going with a bit different approach than
> > v1, this behavior now becomes default, and if the user wants the old
> > behavior back, they can use --clobber-index.
> >
> > Another change is "git commit " is covered as well, as pointed
> > out by Jacob.
> >
> > I will need to add some test cases of course, if this direction is
> > still promising. One thing I'm not sure about is whether want to
> > deliberately clobber the index often, then perhaps we should add a
> > config key to bring the old behavior back.
>
> It usually is safer (simply because you do not have to think about
> it) to start a behaviour change like this as a strict opt-in to gain
> confidence.

Heh. The RFC was opt-in. Jonathan suggested changing default behavior
and I went along just to see how far I could push it :)

> As I often see myself futzing with the same file, adding changes to
> it incrementally, so that I can view progress in "diff --cached" and
> "diff" output, it would be a serious usability regression if the
> last step in the following sequence is rejected by default:
>
> edit X
> git add X
> git diff --cached X
> git diff X
> ... repeat the above number of times ...
> git commit X ;# or "git commit -a" to finally conclude

But yes if there's a valid use case where this behavior change becomes
a problem, then opt-in makes more sense.

> On the other hand, if I am keeping some change that should never be
> in a commit in the working tree file, and building the contents in
> the index using "add -p" to incrementally, it would be the same
> disaster as you are trying to prevent if I by mistake did a whole
> path 'add', even if I catch myself doing so before running 'commit'
> i.e.
>
> edit X
> git add -p X
> git diff --cached X
> git diff X
> ... repeat the above number of times ...
> git add X ;# OOPS!
> git add . ;# OOPS! even worse!
>
> Even though this does not involve "git commit -a" or "git commit X",
> an unrecoverable damage that requires redoing the manual work is
> already done.

I don't see a good way to get to recover this situation. I could go
back to the "index log" idea, where we keep a log of index changes (or
just "interesting" changes). That way there's no behavior change at
all. The user who accidentally updates/deletes something can always
retrieve the old content back (assuming that they realize quickly
since we can't keep very long log).

I've been thinking about allowing to undo worktree changes too (e.g.
accidental "git reset --hard") and this log can cover it as well.

The only downside is we need a new command for the UI (or perhaps I
can just add "git add --log" or something like that).

Should I just drop this patch and go that direction instead? More
work, but maybe better end result.

> How should this new check intract with paths added with "add -N", by
> the way?

Good point. The check here is basically "git diff --cached". I would
need to turn it to "git diff --cached --ita-invisible-in-index" to
make ita entries disappear.
-- 
Duy


Re: [PATCH v2 0/1] Make 'git commit' not accidentally lose staged content

2018-09-17 Thread Junio C Hamano
Nguyễn Thái Ngọc Duy   writes:

> This is about mixing "git add -p" and "git commit -a" (or "git commit
> ") where you may accidentally lose staged changes. After the
> discussion with Jonathan, I'm going with a bit different approach than
> v1, this behavior now becomes default, and if the user wants the old
> behavior back, they can use --clobber-index.
>
> Another change is "git commit " is covered as well, as pointed
> out by Jacob.
>
> I will need to add some test cases of course, if this direction is
> still promising. One thing I'm not sure about is whether want to
> deliberately clobber the index often, then perhaps we should add a
> config key to bring the old behavior back.

It usually is safer (simply because you do not have to think about
it) to start a behaviour change like this as a strict opt-in to gain
confidence.

As I often see myself futzing with the same file, adding changes to
it incrementally, so that I can view progress in "diff --cached" and
"diff" output, it would be a serious usability regression if the
last step in the following sequence is rejected by default:

edit X
git add X
git diff --cached X
git diff X
... repeat the above number of times ...
git commit X ;# or "git commit -a" to finally conclude

On the other hand, if I am keeping some change that should never be
in a commit in the working tree file, and building the contents in
the index using "add -p" to incrementally, it would be the same
disaster as you are trying to prevent if I by mistake did a whole
path 'add', even if I catch myself doing so before running 'commit'
i.e.

edit X
git add -p X
git diff --cached X
git diff X
... repeat the above number of times ...
git add X ;# OOPS!
git add . ;# OOPS! even worse!

Even though this does not involve "git commit -a" or "git commit X",
an unrecoverable damage that requires redoing the manual work is
already done.

The approach to check if the contents in the index matches that in
the HEAD per-path (i.e. "The contents we are adding to the index is
whole working tree contents for that path.  But the index already
has contents different from HEAD for the path---are we losing
information by doing this?"), is a very good one.  But for the
protection to be effective, I think "git commit" and "git add"
should be covered the same way, ideally with the same code and
possibly the same configuration knob and/or command line option to
control the behaviour.

If the information loss caused by the "add/commit X" or "add
-u/commit -a" is so serious that this new feature deserves to become
the default (which I do not yet think it is the case, by the way),
then we could even forbid "commit X" or "commit -a" when the paths
involved has difference between the index and the HEAD, without any
configuration knob or command line override for "commit", and then
tell the users to use "git add/rm" _with_ the override before coming
back to "git commit".

How should this new check intract with paths added with "add -N", by
the way?


[PATCH v2 0/1] Make 'git commit' not accidentally lose staged content

2018-09-16 Thread Nguyễn Thái Ngọc Duy
This is about mixing "git add -p" and "git commit -a" (or "git commit
") where you may accidentally lose staged changes. After the
discussion with Jonathan, I'm going with a bit different approach than
v1, this behavior now becomes default, and if the user wants the old
behavior back, they can use --clobber-index.

Another change is "git commit " is covered as well, as pointed
out by Jacob.

I will need to add some test cases of course, if this direction is
still promising. One thing I'm not sure about is whether want to
deliberately clobber the index often, then perhaps we should add a
config key to bring the old behavior back.

Nguyễn Thái Ngọc Duy (1):
  commit: do not clobber the index

 Documentation/git-commit.txt |  11 +++-
 builtin/commit.c | 105 ---
 cache.h  |   1 +
 read-cache.c |  11 
 t/t2201-add-update-typechange.sh |   2 +-
 t/t4015-diff-whitespace.sh   |   2 +-
 t/t7102-reset.sh |   2 +-
 t/t7500-commit.sh|   2 +-
 t/t7502-commit.sh|   4 +-
 t/t9350-fast-export.sh   |   2 +-
 10 files changed, 126 insertions(+), 16 deletions(-)

-- 
2.19.0.rc0.337.ge906d732e7