Re: which files are "known to git"?
On Mon, 21 May 2018, Jonathan Nieder wrote:

> Robert P. J. Day wrote:
> > On Mon, 21 May 2018, Elijah Newren wrote:
>
> >> Hi Robert,
> >>
> >> I had always assumed prior to your email that 'known to Git'
> >> meant 'tracked' or 'recorded in the index'...
> >
> > i *know* i've been in this discussion before, but i don't
> > remember where, i *assume* it was on this list, and i recall
> > someone (again, don't remember who) who opined that there are two
> > categories of files that are "known to git":
>
> My understanding was the same as Elijah's.
>
> I would be in favor of a patch that replaces the phrase "known to
> Git" in Git's documentation with something less confusing.

ironically, the 2nd edition of o'reilly's "version control with git"
uses the phrases "known to Git" and "unknown to Git" on p. 378 (and
nowhere else that i can see):

  "Furthermore, for the purposes of this [git clean] command, Git uses
  a slightly more conservative concept of under version control.
  Specifically, the manual page uses the phrase “files that are
  unknown to Git” for a good reason: even files that are mentioned in
  the .gitignore and .git/info/exclude files are actually known to
  Git. They represent files that are not version controlled, but Git
  does know about them. And because those files are called out in the
  .gitignore files, they must have some known (to you) behavior that
  shouldn’t be disturbed by Git. So Git won’t clean out the ignored
  files unless you explicitly request it with the -x option."

that phrase even occurs in git-produced diagnostic messages such as:

  dir.c: error("pathspec '%s' did not match any file(s) known to git.",

in any event, perhaps the phrase "known to Git" has some value, as
long as it's defined very precisely and used consistently, which it
obviously isn't right now.

rday

--
Robert P. J. Day
Ottawa, Ontario, CANADA
http://crashcourse.ca/dokuwiki
Twitter: http://twitter.com/rpjday
LinkedIn: http://ca.linkedin.com/in/rpjday
Re: which files are "known to git"?
Jonathan Nieder writes:

> My understanding was the same as Elijah's.
>
> I would be in favor of a patch that replaces the phrase "known to Git"
> in Git's documentation with something less confusing.

One possible twist I recall was that normally we only pay attention to
the index (i.e. the term "tracked files" means "paths shown in ls-files
(no args) output"), but there was a desire to take a union of paths in
the index and paths in the HEAD (the obvious difference is those that
are removed from the index), and we may have called these "known to
Git" in the discussion to distinguish them from "paths in the index".

Clearly the phrase we are discussing (e.g. the ones used in "git clean"
documentation) has been used _without_ such a desire but merely is used
carelessly and confusingly.

So I am all for finding "known to Git" and replacing it with "tracked"
and/or "added to the index" when the phrase is not used to mean "union
of paths in the index and in the HEAD".

Thanks.
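The distinction drawn above between "paths in the index" and "the union of paths in the index and in the HEAD" can be sketched with two plumbing commands. This is only an illustration, not anything proposed in the thread: the function names are hypothetical, and the sketch assumes it is run at the top of a work tree that has at least one commit and no unusual characters in its path names.

```shell
#!/bin/sh
# Hypothetical helper names, chosen for this illustration only.

# Paths recorded in the index, i.e. the output of "git ls-files" with no args.
paths_in_index() {
    git ls-files
}

# Paths recorded in the tree of the HEAD commit.
paths_in_head() {
    git ls-tree -r --name-only HEAD
}

# Union of the two sets. A path removed from the index with
# "git rm --cached" still shows up here because HEAD still has it.
paths_in_index_or_head() {
    { paths_in_index; paths_in_head; } | sort -u
}
```

After `git rm --cached b`, `paths_in_index` no longer lists `b` while `paths_in_index_or_head` still does, which is the "obvious difference" mentioned in the message.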
Re: which files are "known to git"?
On Mon, 21 May 2018, Elijah Newren wrote:

> > can anyone refresh my memory if that happened here, and whether
> > that was the consensus after the discussion was over?
>
> Perhaps this:
> https://public-inbox.org/git/EEC5BA1D5F274F02AE20FC269868FDEF@PhilipOakley/
> ?

yup, that's it, thanks.

rday
Re: which files are "known to git"?
On Mon, May 21, 2018 at 10:40 AM, Robert P. J. Day wrote:
> On Mon, 21 May 2018, Elijah Newren wrote:
>
>> Hi Robert,
>>
>> I had always assumed prior to your email that 'known to Git' meant
>> 'tracked' or 'recorded in the index'...
>
> i *know* i've been in this discussion before,

https://public-inbox.org/git/alpine.LFD.2.21.1711120430580.30032@localhost.localdomain/

via

    git clone --mirror https://public-inbox.org/git git-ml && cd git-ml
    git log --oneline --author=rpj...@crashcourse.ca
    / known # search for "known" in message subjects

I really value that the public inbox works as a git repo, as then you
can dig through it just as you dig through commits.
Re: which files are "known to git"?
Hi,

Robert P. J. Day wrote:

> i did a quick search for that phrase in the current code base and
> came up with:
>
> builtin/difftool.c: /* The symlink is unknown to Git so read from the filesystem */
> dir.c: error("pathspec '%s' did not match any file(s) known to git.",
> Documentation/git-rm.txt:removes only the paths that are known to Git. Giving the name of
> Documentation/git-commit.txt: be known to Git);
> Documentation/user-manual.txt:error: pathspec '261dfac35cb99d380eb966e102c1197139f7fa24' did not match any file(s) known to git.
> Documentation/gitattributes.txt: Notice all types of potential whitespace errors known to Git.
> Documentation/git-clean.txt:Normally, only files unknown to Git are removed, but if the `-x`
> Documentation/RelNotes/1.8.2.1.txt: * The code to keep track of what directory names are known to Git on
> Documentation/RelNotes/1.8.1.6.txt: * The code to keep track of what directory names are known to Git on
> Documentation/RelNotes/2.9.0.txt: known to Git. They have been taught to do the normalization.
> Documentation/RelNotes/2.8.4.txt: known to Git. They have been taught to do the normalization.
> Documentation/RelNotes/1.8.3.txt: * The code to keep track of what directory names are known to Git on
> t/t3005-ls-files-relative.sh: echo "error: pathspec $sq$f$sq did not match any file(s) known to git."
> t/t3005-ls-files-relative.sh: echo "error: pathspec $sq$f$sq did not match any file(s) known to git."
>
> so it's not like there's a *ton* of that, but still enough to want to
> get it right. should there be a precise definition for the phrase
> "known to git", or should that phrase simply be banned/replaced?

In my opinion: the latter.

It's not like the phrase represents some concept that we don't have
any other name for. They're also known as "tracked files" and that
name is more intuitive.

Thanks,
Jonathan
Re: which files are "known to git"?
On Mon, May 21, 2018 at 10:40 AM, Robert P. J. Day wrote:
> On Mon, 21 May 2018, Elijah Newren wrote:
>
>> I had always assumed prior to your email that 'known to Git' meant
>> 'tracked' or 'recorded in the index'...
>
> i *know* i've been in this discussion before, but i don't remember
> where, i *assume* it was on this list, and i recall someone (again,
> don't remember who) who opined that there are two categories of files
> that are "known to git":
>
> 1) files known in a *positive* sense, those being explicitly tracked
>    files, and
>
> 2) files known in a *negative* sense, as in explicitly ignored files
>
> can anyone refresh my memory if that happened here, and whether that
> was the consensus after the discussion was over?

Perhaps this:
https://public-inbox.org/git/EEC5BA1D5F274F02AE20FC269868FDEF@PhilipOakley/
?

> If that's the
> definition that's being used, then this passage makes sense:
>
> "Normally, only files unknown to Git are removed, but if the -x
> option is specified, ignored files are also removed."
>
> that pretty clearly implies that ignored files are considered "known"
> to git.

Yes, _if_ that's the definition used, then that passage makes sense.
But if that's the definition used, then the other two passages I
pointed out in Documentation/git-commit.txt and
Documentation/git-rm.txt do NOT make sense and need to be rewritten.

Junio has already chimed in elsewhere on this thread and stated pretty
clearly that the intended meaning for 'known to Git' was just (1), not
(2), and even provided a suggested wording fix for
Documentation/git-clean.txt. Putting that into a patch format and
submitting along with an update to Documentation/glossary-content.txt
as Duy suggested look like the two todos to me, though perhaps others
want to discuss ways to just avoid the phrase 'known to Git' (as
suggested by Jonathan).
Re: which files are "known to git"?
On Mon, 21 May 2018, Jonathan Nieder wrote:

> Robert P. J. Day wrote:
> > On Mon, 21 May 2018, Elijah Newren wrote:
>
> >> Hi Robert,
> >>
> >> I had always assumed prior to your email that 'known to Git'
> >> meant 'tracked' or 'recorded in the index'...
> >
> > i *know* i've been in this discussion before, but i don't
> > remember where, i *assume* it was on this list, and i recall
> > someone (again, don't remember who) who opined that there are two
> > categories of files that are "known to git":
>
> My understanding was the same as Elijah's.
>
> I would be in favor of a patch that replaces the phrase "known to
> Git" in Git's documentation with something less confusing.

first, i want to apologize to everyone for opening this apparent can
of worms. (it's victoria day here in canada, and i intended to spend
it just puttering around with git-related minutiae, not encouraging
thought-provoking questions about the fundamental nature of git.)

i did a quick search for that phrase in the current code base and
came up with:

  builtin/difftool.c: /* The symlink is unknown to Git so read from the filesystem */
  dir.c: error("pathspec '%s' did not match any file(s) known to git.",
  Documentation/git-rm.txt:removes only the paths that are known to Git. Giving the name of
  Documentation/git-commit.txt: be known to Git);
  Documentation/user-manual.txt:error: pathspec '261dfac35cb99d380eb966e102c1197139f7fa24' did not match any file(s) known to git.
  Documentation/gitattributes.txt:Notice all types of potential whitespace errors known to Git.
  Documentation/git-clean.txt:Normally, only files unknown to Git are removed, but if the `-x`
  Documentation/RelNotes/1.8.2.1.txt: * The code to keep track of what directory names are known to Git on
  Documentation/RelNotes/1.8.1.6.txt: * The code to keep track of what directory names are known to Git on
  Documentation/RelNotes/2.9.0.txt: known to Git. They have been taught to do the normalization.
  Documentation/RelNotes/2.8.4.txt: known to Git. They have been taught to do the normalization.
  Documentation/RelNotes/1.8.3.txt: * The code to keep track of what directory names are known to Git on
  t/t3005-ls-files-relative.sh: echo "error: pathspec $sq$f$sq did not match any file(s) known to git."
  t/t3005-ls-files-relative.sh: echo "error: pathspec $sq$f$sq did not match any file(s) known to git."

so it's not like there's a *ton* of that, but still enough to want to
get it right. should there be a precise definition for the phrase
"known to git", or should that phrase simply be banned/replaced? i
have no idea, open to suggestions.

rday
Re: which files are "known to git"?
Robert P. J. Day wrote:
> On Mon, 21 May 2018, Elijah Newren wrote:

>> Hi Robert,
>>
>> I had always assumed prior to your email that 'known to Git' meant
>> 'tracked' or 'recorded in the index'...
>
> i *know* i've been in this discussion before, but i don't remember
> where, i *assume* it was on this list, and i recall someone (again,
> don't remember who) who opined that there are two categories of files
> that are "known to git":

My understanding was the same as Elijah's.

I would be in favor of a patch that replaces the phrase "known to Git"
in Git's documentation with something less confusing.

Thanks,
Jonathan
Re: which files are "known to git"?
On Mon, 21 May 2018, Elijah Newren wrote:

> Hi Robert,
>
> I had always assumed prior to your email that 'known to Git' meant
> 'tracked' or 'recorded in the index'...

i *know* i've been in this discussion before, but i don't remember
where, i *assume* it was on this list, and i recall someone (again,
don't remember who) who opined that there are two categories of files
that are "known to git":

  1) files known in a *positive* sense, those being explicitly tracked
     files, and

  2) files known in a *negative* sense, as in explicitly ignored files

can anyone refresh my memory if that happened here, and whether that
was the consensus after the discussion was over? if that's the
definition that's being used, then this passage makes sense:

  "Normally, only files unknown to Git are removed, but if the -x
  option is specified, ignored files are also removed."

that pretty clearly implies that ignored files are considered "known"
to git.

rday
Re: which files are "known to git"?
On Tue, May 22, 2018 at 12:09 AM, Elijah Newren wrote:
>
> I had always assumed prior to your email that 'known to Git' meant
> 'tracked' or 'recorded in the index'.

That's been my intention as well ;-)

> From Documentation/git-clean.txt:
>
>     Normally, only files unknown to Git are removed, but if the `-x`
>     option is specified, ignored files are also removed.

The above makes it sound as if "unknown to Git" is a synonym for "not
marked as ignored via the exclude mechanism", which would incorrectly
imply "known to Git" is "marked as ignored via the exclude mechanism",
which is sheer nonsense.

I think this was written while forgetting that "known to Git" was
already a term with a specific meaning, and used a confusing term
unnecessarily loosely.

"clean" removes files that are not in the index and are not marked as
ignored by default, but with "clean -x", the user can remove all files
that are not in the index, even the ones that are marked as ignored.

In the above version of description, "files that are not in the index"
can be replaced with "untracked files" and we can also say "files
unknown to Git" (if we want to), but the set of files "clean" operates
on by default is narrower than "unknown to Git"--it is "unknown to Git
and not marked as ignored".
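The two sets described in this message (what "clean" considers by default, and what "clean -x" considers) can be approximated with `git ls-files`, the same command used in the test setup elsewhere in the thread. The following is a rough sketch only: the function names are hypothetical, and the lists are an approximation because `git clean` without `-d` does not descend into untracked directories while `git ls-files -o` does.

```shell
#!/bin/sh
# Hypothetical helper names, chosen for this illustration only.

# Files not in the index and not marked as ignored:
# roughly what "git clean" operates on by default.
untracked_not_ignored() {
    git ls-files --others --exclude-standard
}

# All files not in the index, ignored or not:
# roughly what "git clean -x" operates on.
untracked_including_ignored() {
    git ls-files --others
}
```

In a repository where `ignoreme` is listed in `.gitignore` and exists in the work tree, the first function leaves `ignoreme` out and the second includes it, matching the `git clean -n` versus `git clean -nx` results shown in the thread.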
Re: which files are "known to git"?
On Mon, 21 May 2018, Elijah Newren wrote:

> Hi Robert,

> I had always assumed prior to your email that 'known to Git' meant
> 'tracked' or 'recorded in the index'. However, a quick `git grep -i
> known.to.git` shows that we're actually not consistent by what we
> mean with this phrase. A little test setup:
>
>     $ echo ignoreme >>.gitignore
>     $ git add .gitignore
>     $ git commit -m ignoreme
>     $ touch ignoreme
>     $ git ls-files -o
>     ignoreme
>     $ git ls-files -o --exclude-standard
>     $
>
> From Documentation/git-clean.txt:
>
>     Normally, only files unknown to Git are removed, but if the `-x`
>     option is specified, ignored files are also removed.
>
> This implies that ignored files are not 'unknown to Git', or fixing the
> double negative, that ignored files are 'known to Git':
>
>     $ git clean -n
>     $ git clean -nx
>     Would remove ignoreme
>     $

uh oh ... i'm just now remembering a discussion once upon a time
where this wasn't simply a double negative. IIRC (and someone else
help me out here), "known to git" also meant known *not* to be tracked
or something like that (as in, ignored files).

anyone remember that conversation?

rday
Re: which files are "known to git"?
On Mon, 21 May 2018, Elijah Newren wrote:

> Hi Robert,
>
> On Mon, May 21, 2018 at 4:18 AM, Robert P. J. Day wrote:
> >
> > updating my git courseware and, since some man pages refer to files
> > "known to git", i just want to make sure i understand precisely which
> > files those are. AIUI, they would include:
> >
> > * tracked files
> > * ignored files
> > * new files which have been staged but not yet committed
> >
> > is that it? are there others?
>
> Doesn't the first category of yours include the third? I always
> read 'tracked' as 'in the index'.

you're right, i was being redundant.

rday
Re: which files are "known to git"?
On Mon, May 21, 2018 at 5:09 PM, Elijah Newren wrote:
> Robert, since you're working on documentation of sorts anyway, would
> you like to propose some patches to fix things here? I'm not entirely
> sure what to suggest, and we might need a random suggestion to get the
> discussion started before we figure out what we want here, but it'd be
> nice to fix this inconsistency.

Make sure to fix Documentation/glossary-content.txt too, Robert, if
you plan to improve documentation.
--
Duy
Re: which files are "known to git"?
Hi Robert,

On Mon, May 21, 2018 at 4:18 AM, Robert P. J. Day wrote:
>
> updating my git courseware and, since some man pages refer to files
> "known to git", i just want to make sure i understand precisely which
> files those are. AIUI, they would include:
>
> * tracked files
> * ignored files
> * new files which have been staged but not yet committed
>
> is that it? are there others?

Doesn't the first category of yours include the third? I always read
'tracked' as 'in the index'.

I had always assumed prior to your email that 'known to Git' meant
'tracked' or 'recorded in the index'. However, a quick `git grep -i
known.to.git` shows that we're actually not consistent by what we mean
with this phrase. A little test setup:

    $ echo ignoreme >>.gitignore
    $ git add .gitignore
    $ git commit -m ignoreme
    $ touch ignoreme
    $ git ls-files -o
    ignoreme
    $ git ls-files -o --exclude-standard
    $

From Documentation/git-clean.txt:

    Normally, only files unknown to Git are removed, but if the `-x`
    option is specified, ignored files are also removed.

This implies that ignored files are not 'unknown to Git', or fixing the
double negative, that ignored files are 'known to Git':

    $ git clean -n
    $ git clean -nx
    Would remove ignoreme
    $

From Documentation/git-commit.txt:

    3. by listing files as arguments to the 'commit' command (without
       --interactive or --patch switch), in which case the commit will
       ignore changes staged in the index, and instead record the
       current content of the listed files (which must already be
       known to Git);

This implies that only recorded-in-the-index files are known to Git:

    $ git commit -m testing ignoreme
    error: pathspec 'ignoreme' did not match any file(s) known to git.
    $

From Documentation/git-rm.txt:

    The list given to the command can be exact pathnames, file glob
    patterns, or leading directory names. The command removes only the
    paths that are known to Git. Giving the name of a file that you
    have not told Git about does not remove that file.

This also implies that only recorded-in-the-index files are known to Git:

    $ git rm ignoreme
    fatal: pathspec 'ignoreme' did not match any files
    $

I can't see any evidence of usage that suggests any more categories
than tracked and ignored, but whether ignored files are included in
the set of 'files known to Git' appears to depend on which man page
you are reading...which is rather unfortunate.

Robert, since you're working on documentation of sorts anyway, would
you like to propose some patches to fix things here? I'm not entirely
sure what to suggest, and we might need a random suggestion to get the
discussion started before we figure out what we want here, but it'd be
nice to fix this inconsistency.

Elijah
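The categories at issue in this message (tracked, ignored, and everything else) can be checked for a single path with a small helper. This is a sketch under stated assumptions, not something from the thread: `classify_path` is a hypothetical name, "tracked" is taken to mean "recorded in the index" as in the message, and the check for the index comes first because `git check-ignore` by default does not report paths that are already in the index.

```shell
#!/bin/sh
# Hypothetical helper, chosen for this illustration only.
# Prints "tracked", "ignored", or "untracked" for the given path.
classify_path() {
    if git ls-files --error-unmatch -- "$1" >/dev/null 2>&1; then
        # The path is recorded in the index.
        echo tracked
    elif git check-ignore -q -- "$1" 2>/dev/null; then
        # Not in the index, but matched by an exclude pattern.
        echo ignored
    else
        # Neither in the index nor matched by an exclude pattern.
        echo untracked
    fi
}
```

With the test setup from the message, `classify_path .gitignore` prints `tracked` and `classify_path ignoreme` prints `ignored`. Whether the second category counts as "known to Git" is exactly the inconsistency between the man pages.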
RE: which files are "known to git"?
On May 21, 2018 7:19 AM, Robert P. J. Day wrote:
> updating my git courseware and, since some man pages refer to files
> "known to git", i just want to make sure i understand precisely which files
> those are. AIUI, they would include:
>
> * tracked files
> * ignored files
> * new files which have been staged but not yet committed

You might want to consider git metadata/config/attribute files, hooks,
filters, etc., that may not be formally part of a repository, but can
be required to ensure the content is complete.

Cheers,
Randall

-- Brief whoami:
 NonStop developer since approximately 2112884442
 UNIX developer since approximately 421664400
-- In my real life, I talk too much.