Re: [RFC PATCH] Introduce "precious" file concept

2018-11-27 Thread Per Lundberg
On 11/28/18 3:21 AM, brian m. carlson wrote:

Thanks for the elaboration, Brian - good to get things down to a 
practical, real-world level.

 > [...]
 >
> I point this out to underscore how fundamental this change is.  People
> overwhelmingly do not read the release notes, so expecting people to
> realize that a change has been made, especially when many people only
> upgrade Git because of a security issue, may result in unexpected
> consequences.

This is one of the more important things of software engineering. _Don't 
mix security fixes with breaking changes_. They are very different 
things and like you say, we can't really expect people to real release 
notes for every little incremental release we do.

That's an important part of the SemVer guarantee: a minor version 
bump/patch level increase means "bug fix" or "added functionality in a 
backwards-compatible way". So: no changing of default behavior or 
semantics, but adding a new behavior which is opt-in is perfectly fine.

> Just because we don't think of this use of Git as normal or desirable > 
> doesn't mean people don't do it and don't expect it to keep working.

In other words, we need to be "bug-by-bug" compatible with previous 
versions. :-) What some people would consider a bug, others would 
consider a feature.

> I think any change we make here has to be opt-in, at least until Git
> 3.0.  A config knob would probably be the right way to go. 

Agree. It's less than optimal but I think it's something that we all 
could live with. Deciding to switching the default (or not) is then 
rightfully postponed to a later time, and we can revisit the pros and 
cons then. The important thing now is to get the functionality 
implemented in a good way, tested and eventually merged.
--
Per Lundberg


Re: [RFC PATCH] Introduce "precious" file concept

2018-11-27 Thread Per Lundberg
On 11/27/18 2:55 PM, Jacob Keller wrote:

> Personally, I would rather err on the side which requires the least
> interaction from users to avoid silently clobbering an ignored file.
> 
 > [...]
> 
> I don't like the idea of precious because it means people have to know
> and remember to opt in, and it's quite possible they will not do so
> until after they've lost real data.

I agree strongly with this personally; if we must choose between "might 
break automation" and "might delete non-garbage files", I would say the 
former is the lesser evil of the two.

But, if I had 10 000 000 servers set up using automated scripts that 
would break because of this, I might think differently. Quite likely so, 
in fact.

What are these automation scenarios _more specifically_? Junio or Brian, 
would you care to elaborate? Is it for build servers where you want "git 
clean -dfx" to always reset the working copy to a pristine state or are 
we talking about some other scenarios?

> I'd only have trashable apply in the case where it was implicit. i.e.
> git clean -fdx would still delete them, as this is an explicit
> operation that (hopefully?) users know will delete data.

This is one of the tougher calls, unfortunately.

If I was a user (which I am), and I was typing "git clean -dfx", what 
would I expect?

The help text (currently) states "-x   remove ignored files, too".

Would it be safe to assume that people would understand that "ignored 
_does not_ mean trashable when doing "git checkout some-ref" BUT it 
_does_ mean trashable in the "git clean -dfx" context"? I'm not so 
certain. It would be one of those perceived inconsistencies that would 
make people scream in anger because they _presumed_ that with the new 
"trashable" concept, "git clean -dfx" would no longer hit them in the leg.

And the other way around: if we change "git clean -dfx" to _not_ treat 
"ignored == trashable", it is likely to "hose automation" as it has been 
previously stated. People who might be using this syntax and _want_ it 
to remove ignored files would be upset, and rightfully so.

So in my POV, it's a tough decision between two, less-than-optimal 
alternatives.

But I would perhaps be able to live with the current semantics for "git 
clean -dfx" _as long as we update the help text_ so that "-x" indicates 
more clearly that non-trashable files can be deleted. It doesn't make 
things _worse_ than they currently are and if this is what it takes to 
get the trashable concept implemented and accepted by the community, 
it's a compromise I'd be willing to make.
--
Per Lundberg



Re: [RFC PATCH] Introduce "precious" file concept

2018-11-27 Thread Per Lundberg
On 11/26/18 5:55 PM, Duy Nguyen wrote:
> On Mon, Nov 26, 2018 at 4:47 PM Ævar Arnfjörð Bjarmason
>  wrote:
>> Some of the solutions overlap with this thing you want, but I think it's
>> worth keeping the distinction between the two in mind.
> 
> On the other hand all use cases should be considered. It's going to be
> a mess to have "trashable" attribute that applies to some commands
> while "precious" to some others (and even worse when they overlap,
> imagine having to define both in .gitattributes)

Agree - I think it would be a very bad idea to have a "mix" of both 
trashable and precious. IMO, we should try to find which one of these 
concepts suits most general use cases best and causes less churn for 
existing scripts/users' existing "semantic expectations", and pick that one.
--
Per Lundberg


Re: [RFC PATCH] Introduce "precious" file concept

2018-11-26 Thread Per Lundberg
On 11/13/18 1:22 AM, brian m. carlson wrote:
> This is going to totally hose automation.  My last job had files which
> might move from tracked to untracked (a file that had become generated),
> and long-running CI and build systems would need to be able to check out
> one status and switch to the other.  Your proposed change will prevent
> those systems from working, whereas they previously did.
> 
> I agree that your proposal would have been a better design originally,
> but breaking the way automated systems currently work is probably going
> to be a dealbreaker.

How about something like this:

1. Introduce a concept with "garbage" files, which git is "permitted to 
delete" without prompting.

2. Retain the current default, i.e. "ignored files are garbage" for now, 
making the new behavior _opt in_ to avoid breaking automated 
systems/existing scripts for anyone. Put the setting for this behind a 
new core.* config flag.

3. In the plan for version 3.0 (a new major version where some breakage 
can be tolerable, according to Semantic Versioning), change the default 
so that "only explicit garbage is garbage". Include very clear notices 
of this in the release notes. The config flag is retained, but its 
default changes from true->false or vice versa. People who dislike the 
new behavior can easily change back to the 2.x semantics.

Would this be a reasonable compromise for everybody?
-- 
Per Lundberg



Re: [RFC PATCH] Introduce "precious" file concept

2018-11-11 Thread Per Lundberg
On 11/11/18 5:41 PM, Duy Nguyen wrote:
> On Sun, Nov 11, 2018 at 1:33 PM Ævar Arnfjörð Bjarmason
>  wrote:
 >
>> That will lose no data, and in the very rare cases where a checkout of
>> tracked files would overwrite an ignored pattern, we can just error out
>> (as we do with the "Ok to overwrite" branch removed) and tell the user
>> to delete the files to proceed. 
> There's also the other side of the coin. If this refuse to overwrite
> triggers too often, it can become an annoyance.

Sure, but doing "git checkout -f some_ref" instead of "git checkout 
some_ref" isn't really _that_ annoying, is it? I think, people (because 
of not having read/studied the .gitignore semantics well enough) having 
their files being overwritten _without realizing it_ is a bigger danger. 
But obviously there is a bit of treading a thin line here.

If we feel thrashable is stretching it too far (which I don't think it 
is), we could add a "core.ignore_files_are_trashable" setting that 
brings back the old semantics, for those who have a strong feeling about it.

It's also quite possible to do it the other way around - i.e. set 
"core.ignore_files_are_trashable" to true by default, and let the "new" 
behavior be opt-in. However, this might "miss the mark" in that those 
people who would really benefit from the new semantics might miss this 
setting, just like they could risk missing the "precious" setting.

(I also think "trashable" sounds better and is more clear & precise than 
"precious", for whatever that is worth.)
--
Per Lundberg


Ignored files being silently overwritten when switching branches

2018-10-15 Thread Per Lundberg
Hi,

Sorry if this question has been asked before; I skimmed through the list 
archives and the FAQ but couldn't immediately find it - please point me 
in the right direction if it has indeed been discussed before.

We were renaming some previously-included configuration files (foo.conf) 
in one of our repos, instead providing a "default" configuration 
(foo.conf.default) that can easily be copied over to foo.conf by 
individual developers. This all works fine, and the *.conf are now added 
to the .gitignore list.

_However_, when switching back to our previous release branches (which 
includes the foo.conf file in the tree), we have noticed that git 
silently overwrites the locally-modified foo.conf file with the upstream 
foo.conf file from that branch. When switching back to master, the file 
contents is therefore perpetually lost, which is a bit unfortunate.

I did a quick repro case here: https://github.com/perlun/git-test, and 
it seems easy to reproduce this behavior using the following steps (also 
documented in that git repo):

$ git init
$ touch foo.txt
$ nano foo.txt
$ git add foo.txt
$ git commit -m 'Add foo.txt'
[master (root-commit) 8ef05cb] Add foo.txt
  1 file changed, 1 insertion(+)
  create mode 100644 foo.txt
$ git checkout -b dev
Switched to a new branch 'dev'
$ git mv foo.txt foo.bar
$ git commit -m "Rename foo.txt -> foo.bar"
[dev 4c55c9b] Rename foo.txt -> foo.bar
  1 file changed, 0 insertions(+), 0 deletions(-)
  rename foo.txt => foo.bar (100%)
$ echo 'my local foo.txt' > foo.txt
$ echo foo.txt > .gitignore
$ git commit -m "Add .gitignore"
[dev 4c16acb] Add .gitignore
  1 file changed, 2 insertions(+)
  create mode 100644 .gitignore
$ git checkout master # This will silently overwrite the local foo.txt

So my question is: is this by design or should this be considered a bug 
in git? Of course, it depends largely on what .gitignore is being used 
for - if we are talking about files which can easily be regenerated 
(build artifacts, node_modules folders etc.) I can totally understand 
the current behavior, but when dealing with more sensitive & important 
content it's a bit inconvenient.


What I would have expected would be for git to complain, with this message:

error: The following untracked working tree files would be overwritten 
by checkout:
foo.txt
Please move or remove them before you switch branches.
Aborting

This is normally the message you get when a _non-ignored_ file is being 
overwritten. But apparently not so when an ignored file is being 
overwritten. If this can be tweaked in the local repo settings somehow, 
please let me know.
--
Best regards,
Per