Re: how to remove from history just *one* version of a file/dir?
On Fri, Sep 15, 2017 at 07:06:43AM -0400, Robert P. J. Day wrote:

> > I think you want to stick with a --tree-filter (or an
> > --index-filter), but just selectively decide when to do the
> > deletion. For example, if you can tell the difference between the
> > two states based on the presence of some file, then perhaps:
> >
> >   git filter-branch --prune-empty --index-filter '
> >     if git rev-parse --verify :dir/sentinel >/dev/null 2>&1
> >     then
> >       git rm --cached -rf dir
> >     fi
> >   ' HEAD
> >
> > The "--prune-empty" is optional, but will drop commits that become
> > empty because they _only_ touched that directory.
> >
> > We use ":dir/sentinel" to see if the entry is in the index, because
> > the index filter won't have the tree checked out. Likewise, we need
> > to use "rm --cached" to just touch the index.
>
>   got it. one last query -- i note that there is no "else" clause in
> that code for "--index-filter". am i assuming correctly that if i was
> using "--tree-filter" instead, i really would need if/then/else along
> the lines of:
>
>   if blah ; then
>     skip_commit "$@"
>   else
>     git commit-tree "$@"
>   fi
>
>   thank you kindly.

No, I think a tree-filter would just be:

  if test -e dir/sentinel
  then
    rm -rf dir
    git add -u
  fi

(I can't remember if the "add -u" is necessary or not; I rarely use tree
filters).

In other words, for each commit you are just saying "if the bad version
of the directory is present, then get rid of it". You shouldn't need to
deal with commit-tree at all. The filter-branch script will take care of
committing whatever tree state your filter leaves in place.

Do note that I didn't test either of the versions I sent to you, so it's
possible I'm missing some subtle thing. But I'm pretty sure the general
direction is correct.

-Peff
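[Editor's sketch, not part of the original message.] The tree-filter
variant above can be tried in a throwaway repository. The script below is
a minimal sketch under stated assumptions: the file names, commit messages
and temporary directory are made up for illustration, and
FILTER_BRANCH_SQUELCH_WARNING is set only to skip the deprecation pause
that newer Git versions add. It leaves out "git add -u", since the
git-filter-branch documentation says a tree filter's result is used as-is,
with disappeared files removed automatically.

```shell
#!/bin/sh
# Sketch of the sentinel-based --tree-filter in a scratch repository.
# Assumptions (not from the thread): file names, messages, temp dir.
set -e
export FILTER_BRANCH_SQUELCH_WARNING=1   # skip filter-branch's deprecation pause
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q

echo base > README
git add README
git commit -q -m 'base'

# First (unwanted) version of the directory, marked by dir/sentinel.
mkdir -p dir
echo old > dir/sentinel
echo old > dir/old.txt
git add dir
git commit -q -m 'add old dir'

echo other > other.txt
git add other.txt
git commit -q -m 'unrelated change'

git rm -q -r dir
git commit -q -m 'remove old dir'

# Second (wanted) version under the same name, without the sentinel.
mkdir -p dir
echo new > dir/new.txt
git add dir
git commit -q -m 'add new dir'

# The tree filter runs in a checkout of each commit, so plain file
# tests and rm are enough; filter-branch records the resulting tree.
git filter-branch --prune-empty --tree-filter '
  if test -e dir/sentinel
  then
    rm -rf dir
  fi
' HEAD >/dev/null

commit_count=$(git rev-list --count HEAD)
sentinel_commits=$(git rev-list HEAD -- dir/sentinel | wc -l | tr -d ' ')
echo "commits after rewrite: $commit_count"
echo "commits touching dir/sentinel: $sentinel_commits"
git log --format=%s
```

If the filter behaves as described, the rewritten history should keep the
"base", "unrelated change" and "add new dir" commits, with the two commits
that only touched the old directory pruned as empty.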
Re: how to remove from history just *one* version of a file/dir?
On Thu, 14 Sep 2017, Jeff King wrote:

> On Thu, Sep 14, 2017 at 07:32:11AM -0400, Robert P. J. Day wrote:
>
> > [is this the right place to ask questions about git usage? or is
> > there a different forum where one can submit possibly
> > embarrassingly silly questions?]
>
> No, this is the right place for embarrassing questions. :)
>
> > say, early on, one commits a sizable directory of content, call
> > it /mydir. that directory sits there for a while until it becomes
> > obvious it's out of date and worthless and should never have been
> > committed. the obvious solution would seem to be:
> >
> >   $ git filter-branch --tree-filter 'rm -rf /mydir' HEAD
> >
> > correct?
>
> That would work, though note that using an --index-filter would be
> more efficient (since it avoids checking out each tree as it walks
> the history).

  i'm just digging into --index-filter as we speak, i realize it's
noticeably faster.

> > however, say one version of that directory was committed early
> > on, then later tossed for being useless with "git rm", and
> > subsequently replaced by newer content under exactly the same
> > name. now i'd like to go back and delete the history related to
> > that early version of /mydir, but not the second.
>
> Makes sense as a goal.
>
> > obviously, i can't use the above command as it would delete both
> > versions. so it appears the solution would be a trivial
> > application of the "--commit-filter" option:
> >
> >   git filter-branch --commit-filter '
> >     if [ "$GIT_COMMIT" = "" ] ; then
> >       skip_commit "$@";
> >     else
> >       git commit-tree "$@";
> >     fi' HEAD
> >
> > where is the commit that introduced the first version of
> > /mydir. do i have that right? is there a simpler way to do this?
>
> No, this won't work. Filter-branch is not walking the history and
> applying the changes to each commit, like rebase does. It's
> literally operating on each commit object, and recall that each
> commit object points to a tree that is a snapshot of the repository
> contents.
>
> So if you skip a commit, that commit itself goes away. But the
> commit after it (which didn't touch the unwanted contents) will
> still mention those contents in its tree.

  ah, of course, duh.

> I think you want to stick with a --tree-filter (or an
> --index-filter), but just selectively decide when to do the
> deletion. For example, if you can tell the difference between the
> two states based on the presence of some file, then perhaps:
>
>   git filter-branch --prune-empty --index-filter '
>     if git rev-parse --verify :dir/sentinel >/dev/null 2>&1
>     then
>       git rm --cached -rf dir
>     fi
>   ' HEAD
>
> The "--prune-empty" is optional, but will drop commits that become
> empty because they _only_ touched that directory.
>
> We use ":dir/sentinel" to see if the entry is in the index, because
> the index filter won't have the tree checked out. Likewise, we need
> to use "rm --cached" to just touch the index.

  got it. one last query -- i note that there is no "else" clause in
that code for "--index-filter". am i assuming correctly that if i was
using "--tree-filter" instead, i really would need if/then/else along
the lines of:

  if blah ; then
    skip_commit "$@"
  else
    git commit-tree "$@"
  fi

  thank you kindly.

rday

--
Robert P. J. Day                         Ottawa, Ontario, CANADA
http://crashcourse.ca
Twitter: http://twitter.com/rpjday
LinkedIn: http://ca.linkedin.com/in/rpjday
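[Editor's sketch, not part of the original message.] The point conceded
above, that skipping a commit does not remove its contents from later
snapshots, can be checked in a scratch repository. This is only an
illustration under stated assumptions: the file names, commit messages and
the BAD_COMMIT variable are hypothetical, and
FILTER_BRANCH_SQUELCH_WARNING is set only to skip the deprecation pause in
newer Git versions.

```shell
#!/bin/sh
# Sketch: skipping the commit that added a directory does not remove the
# directory from later snapshots. File names and messages are made up.
set -e
export FILTER_BRANCH_SQUELCH_WARNING=1   # skip filter-branch's deprecation pause
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q

echo base > README
git add README
git commit -q -m 'base'

mkdir -p dir
echo old > dir/sentinel
git add dir
git commit -q -m 'add old dir'
BAD_COMMIT=$(git rev-parse HEAD)   # hypothetical name: the commit to skip
export BAD_COMMIT

# A later commit that never touches dir, but whose tree still contains it.
echo other > other.txt
git add other.txt
git commit -q -m 'unrelated change'

# Same shape as the --commit-filter from the original question.
git filter-branch --commit-filter '
  if [ "$GIT_COMMIT" = "$BAD_COMMIT" ]
  then
    skip_commit "$@"
  else
    git commit-tree "$@"
  fi
' HEAD >/dev/null

commit_count=$(git rev-list --count HEAD)
if git cat-file -e HEAD:dir/sentinel 2>/dev/null
then
  still_there=yes
else
  still_there=no
fi
echo "commits after rewrite: $commit_count"
echo "dir/sentinel still in HEAD's tree: $still_there"
```

The skipped commit disappears from the log, but the following commit keeps
its original tree, so the directory simply appears to have been added by
"unrelated change" instead.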
Re: how to remove from history just *one* version of a file/dir?
On Thu, Sep 14, 2017 at 07:32:11AM -0400, Robert P. J. Day wrote:

> [is this the right place to ask questions about git usage? or is
> there a different forum where one can submit possibly embarrassingly
> silly questions?]

No, this is the right place for embarrassing questions. :)

> say, early on, one commits a sizable directory of content, call it
> /mydir. that directory sits there for a while until it becomes obvious
> it's out of date and worthless and should never have been committed.
> the obvious solution would seem to be:
>
>   $ git filter-branch --tree-filter 'rm -rf /mydir' HEAD
>
> correct?

That would work, though note that using an --index-filter would be more
efficient (since it avoids checking out each tree as it walks the
history).

> however, say one version of that directory was committed early on,
> then later tossed for being useless with "git rm", and subsequently
> replaced by newer content under exactly the same name. now i'd like to
> go back and delete the history related to that early version of
> /mydir, but not the second.

Makes sense as a goal.

> obviously, i can't use the above command as it would delete both
> versions. so it appears the solution would be a trivial application of
> the "--commit-filter" option:
>
>   git filter-branch --commit-filter '
>     if [ "$GIT_COMMIT" = "" ] ; then
>       skip_commit "$@";
>     else
>       git commit-tree "$@";
>     fi' HEAD
>
> where is the commit that introduced the first version of
> /mydir. do i have that right? is there a simpler way to do this?

No, this won't work. Filter-branch is not walking the history and
applying the changes to each commit, like rebase does. It's literally
operating on each commit object, and recall that each commit object
points to a tree that is a snapshot of the repository contents.

So if you skip a commit, that commit itself goes away. But the commit
after it (which didn't touch the unwanted contents) will still mention
those contents in its tree.

I think you want to stick with a --tree-filter (or an --index-filter),
but just selectively decide when to do the deletion. For example, if you
can tell the difference between the two states based on the presence of
some file, then perhaps:

  git filter-branch --prune-empty --index-filter '
    if git rev-parse --verify :dir/sentinel >/dev/null 2>&1
    then
      git rm --cached -rf dir
    fi
  ' HEAD

The "--prune-empty" is optional, but will drop commits that become empty
because they _only_ touched that directory.

We use ":dir/sentinel" to see if the entry is in the index, because the
index filter won't have the tree checked out. Likewise, we need to use
"rm --cached" to just touch the index.

-Peff
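[Editor's sketch, not part of the original message.] The index-filter
suggestion above can be exercised end to end in a scratch repository. This
is a minimal sketch, not a tested recipe for real history: the file names,
commit messages and temporary directory are assumptions made for
illustration, and FILTER_BRANCH_SQUELCH_WARNING is set only to skip the
deprecation pause in newer Git versions.

```shell
#!/bin/sh
# Sketch of the sentinel-based --index-filter in a scratch repository.
# Assumptions (not from the thread): file names, messages, temp dir.
set -e
export FILTER_BRANCH_SQUELCH_WARNING=1   # skip filter-branch's deprecation pause
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q

echo base > README
git add README
git commit -q -m 'base'

# First (unwanted) version of the directory, marked by dir/sentinel.
mkdir -p dir
echo old > dir/sentinel
echo old > dir/old.txt
git add dir
git commit -q -m 'add old dir'

echo other > other.txt
git add other.txt
git commit -q -m 'unrelated change'

git rm -q -r dir
git commit -q -m 'remove old dir'

# Second (wanted) version under the same name, without the sentinel.
mkdir -p dir
echo new > dir/new.txt
git add dir
git commit -q -m 'add new dir'

# No tree is checked out here, so the filter inspects and edits the
# index only: ":path" looks up an index entry, "rm --cached" drops it.
git filter-branch --prune-empty --index-filter '
  if git rev-parse --verify :dir/sentinel >/dev/null 2>&1
  then
    git rm --cached -rf dir
  fi
' HEAD >/dev/null

commit_count=$(git rev-list --count HEAD)
sentinel_commits=$(git rev-list HEAD -- dir/sentinel | wc -l | tr -d ' ')
echo "commits after rewrite: $commit_count"
echo "commits touching dir/sentinel: $sentinel_commits"
git log --format=%s
```

In this history the commits that only added or removed the old directory
become empty and are dropped by --prune-empty, while the later version of
dir, which has no sentinel file, is left alone.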
how to remove from history just *one* version of a file/dir?
  [is this the right place to ask questions about git usage? or is
there a different forum where one can submit possibly embarrassingly
silly questions?]

  i've been perusing "git filter-branch", and i'm curious if i have
the right idea about how to very selectively get rid of some useless
history.

  say, early on, one commits a sizable directory of content, call it
/mydir. that directory sits there for a while until it becomes obvious
it's out of date and worthless and should never have been committed.
the obvious solution would seem to be:

  $ git filter-branch --tree-filter 'rm -rf /mydir' HEAD

correct?

  however, say one version of that directory was committed early on,
then later tossed for being useless with "git rm", and subsequently
replaced by newer content under exactly the same name. now i'd like to
go back and delete the history related to that early version of
/mydir, but not the second.

  obviously, i can't use the above command as it would delete both
versions. so it appears the solution would be a trivial application of
the "--commit-filter" option:

  git filter-branch --commit-filter '
    if [ "$GIT_COMMIT" = "" ] ; then
      skip_commit "$@";
    else
      git commit-tree "$@";
    fi' HEAD

where is the commit that introduced the first version of
/mydir. do i have that right? is there a simpler way to do this?

rday

--
Robert P. J. Day                         Ottawa, Ontario, CANADA
http://crashcourse.ca
Twitter: http://twitter.com/rpjday
LinkedIn: http://ca.linkedin.com/in/rpjday