On Tue, Oct 16 2018, Jeff King wrote:
> On Mon, Oct 15, 2018 at 01:01:50PM +0000, Per Lundberg wrote:
>
>> Sorry if this question has been asked before; I skimmed through the list
>> archives and the FAQ but couldn't immediately find it - please point me
>> in the right direction if it has indeed been discussed before.
>
> It is a frequently asked question, but it doesn't seem to be in any FAQ
> that I could find. The behavior you're seeing is intended. See this
> message (and the rest of the thread) for discussion:
>
>   https://public-inbox.org/git/7viq39avay....@alter.siamese.dyndns.org/
>
>> So my question is: is this by design or should this be considered a bug
>> in git? Of course, it depends largely on what .gitignore is being used
>> for - if we are talking about files which can easily be regenerated
>> (build artifacts, node_modules folders etc.) I can totally understand
>> the current behavior, but when dealing with more sensitive & important
>> content it's a bit inconvenient.
>
> Basically: yes. It would be nice to have that "do not track this, but do
> not trash it either" state for a file, but Git does not currently
> support that.

There are some patches in that thread that could be picked up by someone
interested. I think the approach mentioned by Matthieu Moy here makes
the most sense:
https://public-inbox.org/git/vpqd3t9656k....@bauges.imag.fr/

I don't think the rationale mentioned by Junio in
https://public-inbox.org/git/7v4oepaup7....@alter.siamese.dyndns.org/ is
very convincing.

The question is not whether .gitignore is intended to be used in some
specific way, e.g. only ignoring *.o files, but whether we can
reasonably suspect that users use the combination of the features we
expose in such a way that their precious data gets destroyed. User data
should get the benefit of the doubt.

Off the top of my head, I can imagine many ways in which this'll go
wrong:

1. Even if you're using .gitignore only for "trashable" files, as Junio
   mentions, git not trashing your data depends on everyone who
   modifies .gitignore in your project having enough situational
   awareness not to inadvertently add a glob to the file which
   *accidentally* ignores existing files, and *nothing warns about
   this*.

   Between the caveat noted in "It is not possible to
   re-include[...]" in gitignore(5) and negative pathspecs it can be
   really easy to get this wrong.

   So e.g. in git.git I can add a line with "*" to .gitignore, and
   nothing will complain or look unusual as long as I'm not introducing
   new files, and I'll only find out when some-new-file.c of mine gets
   trashed.

2. Related, the UI "git add <ignored>" presents is just "Use -f if you
   really want to add them". Users who aren't careful will just think
   "oh, I just need -f in this case" and not alter .gitignore, leaving
   a timebomb for future users.

   Those new users will have no way of knowing that they've cloned a
   repo with a broken, overzealous .gitignore, e.g. there's nothing on
   clone that says "you've just cloned a repo with N files, all of
   which are ignored, so git clean etc. will likely wipe out anything
   you have in the checkout".

3. Since we implicitly expose this "you need a one-off action to
   override .gitignore" noted in #2, users can and *do* use this for
   "soft" ignores.

   E.g. in a big work repo there's an ignore for *.png, even though the
   repo has thousands of such files, because it's not considered good
   practice to add them anymore (there's another static repo), and
   someone thought to use .gitignore to enforce that suggestion.

   I have a personal repo where I only want *.gpg files, and due to the
   inability to re-include files recursively (noted in #1) I just
   ignore '*' and use git veeery carefully. I was only worried about
   'git clean' so far, but now I see I need to worry about "checkout"
   as well.
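
To make #1 concrete, here is a small sketch against a scratch
repository. The directory and its contents are made up for
illustration; the only thing borrowed from the text above is the "*"
pattern and the some-new-file.c name. It shows that plain "git status"
stays silent about the new file, and that the file only shows up when
ignored files are asked for explicitly.

```shell
# Scratch repository; the location and contents are hypothetical.
repo=$(mktemp -d)
git init -q "$repo"

# An overzealous catch-all pattern, as in point 1.
echo '*' > "$repo/.gitignore"

# A new file that has not been added yet.
echo 'int main(void) { return 0; }' > "$repo/some-new-file.c"

# Plain status lists nothing as untracked: the pattern swallows the new
# file (and the .gitignore itself), and nothing warns about it.
plain=$(git -C "$repo" status --porcelain)

# Only when ignored files are requested does the file appear, marked "!!".
with_ignored=$(git -C "$repo" status --porcelain --ignored)

printf 'plain status: [%s]\n' "$plain"
printf '%s\n' "$with_ignored"

rm -rf "$repo"
```
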

But maybe the use-cases I'm mentioning are highly unusual and the repos
at work have ended up in some bizarre state and nobody else cares about
this.

It would be interesting if someone at a big git hosting provider (hint:
Jeff :) could provide some numbers about how common it is to have a
repository containing tracked files ignored by a .gitignore the
repository itself carries. This wouldn't cover all of #1-3 above, but
is probably a pretty good proxy metric.

I thought this could be done by:

    git ls-tree -r --name-only HEAD | git check-ignore --no-index --stdin

But I see that e.g. on git.git this goes wrong due to
t/helper/.gitignore. So I don't know how one would answer "does this
repo have .gitignored files tracked?" in a one-liner.
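
One candidate worth trying, offered as a sketch and not as something
established in this thread, is to ask "git ls-files" for cached paths
that match the per-directory ignore files. Caveats: it looks at the
index and the .gitignore files in the working tree, not at HEAD, so it
answers a slightly different question than the ls-tree pipeline. The
helper name and the scratch repository below are hypothetical. The
assumption is that ls-files applies the usual last-match-wins rule, so
a "*" followed by "!*.c" in a subdirectory does not flag the .c files
there.

```shell
# Hypothetical helper: list tracked paths that the repository's own
# .gitignore files would ignore.
#   -c  consider tracked (cached) paths
#   -i  keep only the ones matching an ignore pattern
#   --exclude-per-directory=.gitignore  use only in-tree .gitignore files,
#       not the user's global excludes or .git/info/exclude
tracked_but_ignored() {
    git -C "$1" ls-files -c -i --exclude-per-directory=.gitignore
}

# Scratch repository to exercise the helper; all names are made up.
demo=$(mktemp -d)
git init -q "$demo"
mkdir "$demo/assets" "$demo/t"

# A "soft" ignore at the top level, as in point 3.
echo '*.png' > "$demo/.gitignore"

# A directory that ignores everything and then re-includes some files.
printf '%s\n' '*' '!*.c' '!.gitignore' > "$demo/t/.gitignore"

: > "$demo/main.c"
: > "$demo/assets/logo.png"
: > "$demo/t/helper.c"

# Track everything, including the ignored .png (hence -f).
git -C "$demo" add -f .

result=$(tracked_but_ignored "$demo")
printf '%s\n' "$result"   # prints: assets/logo.png

rm -rf "$demo"
```

Restricting the check to --exclude-per-directory=.gitignore is a
deliberate choice here: it keeps a user's personal excludes out of the
result, which seems closer to "ignored by a .gitignore the repository
itself carries".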