Re: [RFC 0/3] Reflogs for deleted refs: fix breakage and suggest namespace change
On Sat, Aug 18, 2012 at 01:39:41PM -0700, Junio C Hamano wrote:

> mhag...@alum.mit.edu writes:
>
> > Given that a flag day would anyway be required to add a d/f-tolerant system, I could live with a separate graveyard namespace as originally proposed by Jeff. However, I still think that as long as we are making a jump, we could try to land closer to the ultimate destination.
>
> Do we _know_ already what the ultimate destination looks like?
>
> If the answer is yes, then I agree, but otherwise, I doubt it is a good idea to introduce unnecessary complexity to the system that may have to be ripped out and redone.
>
> I didn't get the impression that we know the ultimate destination from the previous discussion, especially if we discount the tangent around having next and next/foo at the same time which was on nobody's wish, but I may be misremembering things.

Sorry for the slow response on this topic; I was traveling all last week and am still catching up with emails.

I don't think we know what the ultimate destination looks like. If I had to choose, it would probably be something like:

  refs/heads/next.ref
  refs/heads/next/foo.ref

which is easy to read and manipulate. But this is not compatible with the current system, because:

  1. It cannot use .ref, as that is allowed in ref names currently.

  2. This can't co-exist with existing, non-tweaked refs, since refs/heads/next would still conflict (you'd have to instead do refs/heads.dir/next.dir/foo).

But making a change like this would involve bumping the repositoryformatversion flag _anyway_, so at that point we don't really have to care about compatibility, and we are free to design what looks good.

So in other words, I do not think any ultimate destination that I find palatable would be achievable without making the full format jump anyway.

If all things were equal, I'd say there is no reason not to get as close as we can. But I find some of the proposals significantly less readable (in particular, the directory-munging is IMHO much harder to read). And it is not as if it is buying us anything; you still have to make a magic translation between a dead log and a live one.

Another option I've considered is simply holding back the graveyard topic, working on the d/f tolerant storage, and then implementing the graveyards on top (which is basically free at that point). But as you note, it is not really a commonly-requested feature. If it were easy, I'd say let's do it. But the idea of bumping repositoryformatversion for the first time in git's history just to add a feature nobody wants is not very appealing to me.

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
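The directory/file (d/f) conflict discussed above can be illustrated with a minimal sketch. It assumes that a loose ref is stored at a path equal to its name, and it models the ".ref" layout from the message as a plain suffix on the leaf file. The function names are hypothetical and are not taken from git's source.

```python
from pathlib import PurePosixPath

def current_path(refname):
    """Current loose-ref layout: the ref name is used as the file path."""
    return PurePosixPath(refname)

def suffixed_path(refname, suffix=".ref"):
    """Hypothetical layout from the message: the leaf file gets a suffix."""
    return PurePosixPath(refname + suffix)

def has_df_conflict(paths):
    """True if some path must be both a file and a directory containing another path."""
    files = set(paths)
    return any(parent in files for p in paths for parent in p.parents)

refs = ["refs/heads/next", "refs/heads/next/foo"]
print(has_df_conflict([current_path(r) for r in refs]))   # True
print(has_df_conflict([suffixed_path(r) for r in refs]))  # False

# Point 1 of the message: ".ref" is a legal part of today's ref names,
# so an existing ref can still collide under the suffixed layout.
legacy = ["refs/heads/next", "refs/heads/next.ref/foo"]
print(has_df_conflict([suffixed_path(r) for r in legacy]))  # True
```

The last case is why the suffix alone does not give a compatible scheme: the file for next (next.ref) has the same path as the directory needed for next.ref/foo.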
Re: [RFC 0/3] Reflogs for deleted refs: fix breakage and suggest namespace change
Jeff King p...@peff.net writes:

> So in other words, I do not think any ultimate destination that I find palatable would be achievable without making the full format jump anyway.
>
> If all things were equal, I'd say there is no reason not to get as close as we can. But I find some of the proposals significantly less readable (in particular, the directory-munging is IMHO much harder to read). And it is not as if it is buying us anything; you still have to make a magic translation between a dead log and a live one.

Yes, that is where the earlier comment of mine on this topic came from.

> Another option I've considered is simply holding back the graveyard topic, working on the d/f tolerant storage, and then implementing the graveyards on top (which is basically free at that point). But as you note, it is not really a commonly-requested feature. If it were easy, I'd say let's do it. But the idea of bumping repositoryformatversion for the first time in git's history just to add a feature nobody wants is not very appealing to me.

Amen.
Re: [RFC 0/3] Reflogs for deleted refs: fix breakage and suggest namespace change
On 20 Aug 2012, at 13:32, Alexey Muranov wrote:

> The problem of mapping branch names to file paths looks to me very similar to the problem of mapping URLs to file paths for static web sites, so I would propose to use the same solution: add a special extension to distinguish a file from a directory, for example .branch and .tag (like .html in the case of URLs). This would allow having both branches next and next/foo with refs stored in files next.branch and next/foo.branch.
>
> This will look very clear and familiar to people who are not Git specialists but are familiar with static web sites.
>
> The only limitation this would introduce is that branch names like foo.branch would need to be forbidden. If the extension is optional, this makes the new rule almost compatible with the current one, except if somebody is currently using branches named like foo.branch or next.branch/foo.

Another possible choice for the extensions: .~br and .~tg (to keep readability of file names and allow all currently allowed branch names).
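The compatibility caveat in this proposal can be made concrete with a small sketch. It assumes the rule would be "no path component of a branch name may end with the reserved extension"; the helper name is hypothetical.

```python
def needs_renaming(branch, ext=".branch"):
    """True if an existing branch name would clash with the proposed extension.

    Assumption: a name is problematic when any '/'-separated component
    ends with the reserved extension.
    """
    return any(part.endswith(ext) for part in branch.split("/"))

for name in ["next", "next/foo", "foo.branch", "next.branch/foo"]:
    print(name, needs_renaming(name))
```

With an extension such as .~br the check would never fire for names that are valid today, because "~" is already forbidden in ref names; that appears to be the motivation for the second suggestion.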
Re: [RFC 0/3] Reflogs for deleted refs: fix breakage and suggest namespace change
On 19 Aug 2012, at 02:02, Junio C Hamano wrote:

> Alexey Muranov alexey.mura...@gmail.com writes:
>
> > I hope my opinion might be useful because I do not know anything about the actual implementation of Git,...
>
> That sounds like contradiction.

I think that the implementation (the code), the model, and the interface are independent. On the top level, for example, one does not need to know how commit storage is optimized; it is enough to understand that each commit is a snapshot of a subtree in a file directory.

> > To just give a quick idea of my ideas, I thought that 'fetching' in Git was an inevitable evil that stands apart from other operations and is necessary only because the computer communication on Earth is not sufficiently developed to keep all Git repositories constantly in sync,...
>
> It is a feature, not a symptom of an insufficiently developed technology, that I do not have to know what random tweaks and experiments are done in repositories of 47 thousands people who clone from me, and I can sync with any one of them only when I know there is something worth looking at when I say git fetch.

Currently, one of the main functions of 'fetch', apart from changing the remote tracking branches, is downloading the remote objects. This is necessary because of an insufficiently developed technology. The other main function is changing the local copies of remote branches (changing the remote tracking branches); this is what I described as taking a snapshot.

I did not understand what you meant by "I do not have to know what random tweaks and experiments are done in repositories of 47 thousands people who clone from me, and I can sync with any one of them only when I know there is something worth looking at when I say git fetch." How is it possible to know and not to know what is going on in the remote repository at the same time?

-Alexey.
Re: [RFC 0/3] Reflogs for deleted refs: fix breakage and suggest namespace change
On 19 Aug 2012, at 02:02, Junio C Hamano wrote:

> Alexey Muranov alexey.mura...@gmail.com writes:
>
> > I hope my opinion might be useful because I do not know anything about the actual implementation of Git,...
>
> That sounds like contradiction.

I meant that I am psychologically not attached to the current behavior, and may provide a naïve viewpoint, if you like.

-Alexey.
Re: [RFC 0/3] Reflogs for deleted refs: fix breakage and suggest namespace change
On 19 Aug 2012, at 02:02, Junio C Hamano wrote:

> Alexey Muranov alexey.mura...@gmail.com writes:
>
> > Excuse me if I miss something again, but I might be willing to discuss the ultimate destination. Could you possibly state in simple terms what the problem with determining the ultimate destination is?
>
> Decide if it makes sense to break backward compatibility of loose ref representation merely to support having a branch next and another branch next/foo in the same repository, and if it does, what the new loose ref representation looks like.

I looked again through this thread and tried to understand the issues better.

1. I vote for moving dead reflogs to logs/graveyard (or to logs/deadlogs).

2. I think that allowing both next and next/foo complicates the mapping from branch names to file paths, and it does not seem necessary if dead reflogs are moved away to the graveyard anyway.

3. There remains the question of what to do with dead reflogs for different branches having the same name. Maybe keep the death date and time under the graveyard directory and not allow the user to delete twice in less than 1 second?

  /logs/graveyard/-mm-dd-hhmmss/refs/heads/next/foo

In a sense this is similar to the git storage model: an atomic destructive operation creates a timestamped commit in the logs/graveyard directory.
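A minimal sketch of the timestamped layout in point 3 follows. The timestamp pattern in the message is incomplete as quoted, so the format used here (four-digit year, then month, day and time of day) is an assumption, as is the helper name.

```python
from datetime import datetime

def timestamped_graveyard_path(refname, deleted_at):
    """Path of a dead reflog, relative to $GIT_DIR, under a per-deletion directory.

    The "%Y-%m-%d-%H%M%S" format is an assumption made for illustration.
    """
    stamp = deleted_at.strftime("%Y-%m-%d-%H%M%S")
    return f"logs/graveyard/{stamp}/{refname}"

# next/foo is deleted first, then next is created and deleted a second later.
first = timestamped_graveyard_path("refs/heads/next/foo", datetime(2012, 8, 19, 12, 0, 0))
second = timestamped_graveyard_path("refs/heads/next", datetime(2012, 8, 19, 12, 0, 1))
print(first)
print(second)
```

Because each deletion gets its own directory, a dead next and a dead next/foo never share a parent path, at the cost of one directory per deletion. With one-second resolution, two deletions in the same second would land in the same directory, which is why the message suggests refusing them.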
Re: [RFC 0/3] Reflogs for deleted refs: fix breakage and suggest namespace change
On 08/18/2012 10:39 PM, Junio C Hamano wrote:

> mhag...@alum.mit.edu writes:
>
> > Given that a flag day would anyway be required to add a d/f-tolerant system, I could live with a separate graveyard namespace as originally proposed by Jeff. However, I still think that as long as we are making a jump, we could try to land closer to the ultimate destination.
>
> Do we _know_ already what the ultimate destination looks like?

No; we can only guess. I just wanted to submit some code so that the existence/absence of code would not prejudice the decision.

> If the answer is yes, then I agree, but otherwise, I doubt it is a good idea to introduce unnecessary complexity to the system that may have to be ripped out and redone.
>
> I didn't get the impression that we know the ultimate destination from the previous discussion, especially if we discount the tangent around having next and next/foo at the same time which was on nobody's wish, but I may be misremembering things.

It's been a wish of mine, but it's pretty low priority.

I've also brainstormed about some other changes that could be connected with a new repo format:

* Allow deleted loose references (for example denoted by value 0{40}) that override packed references with the same name. This would remove the need to rewrite the packed-refs file when a reference is deleted. (A prerequisite for this change would be to allow next and next/foo at the same time.)

* Push HEAD and its friends down out of $GIT_DIR into a reference-specific directory.

* Rename lock files to look less like reference names (e.g., something like refs/foo~lock instead of refs/foo.lock).

* Somehow munge reference names in a way to avoid other filesystem limitations -- e.g., case insensitivity, filenames like com and prn or with multiple dots under Windows.

* ...or maybe a packed-refs file that can (usually) be updated in-place, and get rid of loose references entirely.

Michael

--
Michael Haggerty
mhag...@alum.mit.edu
http://softwareswirl.blogspot.com/
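The first idea in the list (a loose "deleted" marker that overrides a packed entry) can be sketched as a lookup rule. This is a toy model with dictionaries standing in for the loose ref files and the packed-refs file; the marker value is the 0{40} example from the message, and none of this reflects how git resolves refs today.

```python
DELETED = "0" * 40  # marker value floated in the message; illustrative only

def resolve(refname, loose, packed):
    """Return the object name for refname, or None if the ref does not exist.

    Assumed rule: a loose entry always wins over a packed one, and a loose
    entry holding the marker means "deleted", hiding any packed entry.
    """
    if refname in loose:
        value = loose[refname]
        return None if value == DELETED else value
    return packed.get(refname)

packed = {"refs/heads/topic": "a" * 40}
print(resolve("refs/heads/topic", {}, packed))                             # packed value
print(resolve("refs/heads/topic", {"refs/heads/topic": DELETED}, packed))  # None
```

Deleting a ref then only writes one small loose file instead of rewriting packed-refs. The trade-off is that markers accumulate until the next repack.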
Re: [RFC 0/3] Reflogs for deleted refs: fix breakage and suggest namespace change
Michael Haggerty mhag...@alum.mit.edu writes:

> It's been a wish of mine, but it's pretty low priority.
>
> I've also brainstormed about some other changes that could be connected with a new repo format:
>
> * Allow deleted loose references (for example denoted by value 0{40}) that override packed references with the same name. This would remove the need to rewrite the packed-refs file when a reference is deleted. (A prerequisite for this change would be to allow next and next/foo at the same time.)

We would need to think through the performance implications of the approach; it tempts us to accumulate the loose "removed" markers in the hope that it would be an improvement over having to rewrite the packed-refs over and over, and without numbers to back that theory up, we may be worsening the system without knowing.

Having said that, it is an interesting idea. I wouldn't use 0{40} as the sentinel value but rather use letters outside [0-9a-f], though.

> * Push HEAD and its friends down out of $GIT_DIR into a reference-specific directory.

Not going to happen for several years, I am afraid, as I think many casual tools do an equivalent of "test -f $DIR/HEAD" to see if $DIR is a repository; even our own gitweb does so. We should advertise an easy way for scripted Porcelains to directly ask is_git_directory().

> * Rename lock files to look less like reference names (e.g., something like refs/foo~lock instead of refs/foo.lock).

If you do the ~d/~f thing, foo.lock becomes a non-issue, no?

> * Somehow munge reference names in a way to avoid other filesystem limitations -- e.g., case insensitivity, filenames like com and prn or with multiple dots under Windows.

Very interesting. I however am afraid that the users and the projects will learn to avoid the problematic names a lot sooner than such a change will be implemented to make the issue go away (or they have already learned long time ago), and the end result may end up solving a non-issue only to make the output from "find .git/refs" even more unreadable.

> * ...or maybe a packed-refs file that can (usually) be updated in-place, and get rid of loose references entirely.

I find this equally intriguing as your "deleted" one above.
Re: [RFC 0/3] Reflogs for deleted refs: fix breakage and suggest namespace change
Alexey Muranov alexey.mura...@gmail.com writes:

> 2. I think that allowing both next and next/foo complicates the mapping from branch names to file paths, and it does not seem necessary if dead reflogs are moved away to the graveyard anyway.

It is unclear why the first two lines above lead to the conclusion "it does not seem necessary" (but honestly, I do not particularly care).

> 3. There remains the question of what to do with dead reflogs for different branches having the same name. Maybe keep the death date and time under the graveyard directory and not allow the user to delete twice in less than 1 second?
>
>   /logs/graveyard/-mm-dd-hhmmss/refs/heads/next/foo

How would that help us in what way?

When I ask "git log -g next/foo" for the next/foo branch that currently exists, I want to see the update history of its tip since I created it for the last time, and then an entry that says I created it at such and such time. If I used to have the branch before but deleted it, then the output should be followed by another entry that says I deleted it at such and such time, followed by the history of the tip updates.
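The output order described in the last paragraph can be sketched as follows. The sketch assumes that both logs are stored oldest entry first, and that the dead log ends with an entry recording the deletion; the entry strings are placeholders, not real reflog lines.

```python
def combined_reflog(live_entries, dead_entries):
    """Display order for a ref that was deleted and later re-created.

    Both inputs are oldest first (assumed storage order); the result is
    newest first, with the current life of the ref shown before the dead one.
    """
    return list(reversed(dead_entries + live_entries))

dead = ["created (first time)", "update A", "deleted"]
live = ["created (second time)", "update C"]
for entry in combined_reflog(live, dead):
    print(entry)
```

Read from the top, this prints the updates since the latest creation, the creation entry, the earlier deletion, and then the older history, in that order.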
Re: [RFC 0/3] Reflogs for deleted refs: fix breakage and suggest namespace change
On 19 Aug 2012, at 19:38, Junio C Hamano wrote:

> Alexey Muranov alexey.mura...@gmail.com writes:
>
> > 2. I think that allowing both next and next/foo complicates the mapping from branch names to file paths, and it does not seem necessary if dead reflogs are moved away to the graveyard anyway.
>
> It is unclear why the first two lines above lead to the conclusion "it does not seem necessary" (but honestly, I do not particularly care).

I thought that the first reason that allowing next and next/foo seemed necessary was avoiding conflicts with dead reflogs or between dead reflogs. If the dead reflog for next/foo is moved away, it will not conflict with a new one for next.

There remains a problem with a conflict between dead next/foo and dead next. This can be solved as Jeff suggested by adding special escape symbols, or as I suggested below, by keeping reflogs deleted on different occasions in different timestamp directories.

> > 3. There remains the question of what to do with dead reflogs for different branches having the same name. Maybe keep the death date and time under the graveyard directory and not allow the user to delete twice in less than 1 second?
> >
> >   /logs/graveyard/-mm-dd-hhmmss/refs/heads/next/foo
>
> How would that help us in what way?
>
> When I ask "git log -g next/foo" for the next/foo branch that currently exists, I want to see the update history of its tip since I created it for the last time, and then an entry that says I created it at such and such time. If I used to have the branch before but deleted it, then the output should be followed by another entry that says I deleted it at such and such time, followed by the history of the tip updates.

I only suggested how to resolve conflicts between dead reflogs in the graveyard if next and next/foo cannot coexist. For example, if first next/foo was created and deleted, and then next was created and deleted.

It also seems nice to me to have dead reflogs for different identically named branches (created and deleted independently) in separate files.

It is possible to collect the information for "git log -g next/foo" by looking through all timestamp subdirectories in the graveyard.

-Alexey.
Re: [RFC 0/3] Reflogs for deleted refs: fix breakage and suggest namespace change
Alexey Muranov alexey.mura...@gmail.com writes:

> I only suggested how to resolve conflicts between dead reflogs in the graveyard if next and next/foo cannot coexist.

But Jeff's patch series already has the support for a case where you delete next (graveyard gets 'next'), create next/foo and then delete that (graveyard gets 'next/foo', too) anyway (check the list archive before posting). It is a solved problem.

> It is possible to collect the information for "git log -g next/foo" by looking through all timestamp subdirectories in the graveyard.

It is possible if you wrote a new file every time you add one entry to reflog, or if you created a directory with timestamp in its name and wrote a new file there, too.

We are not particularly interested in "it is possible" when many implementations can all trivially allow it to be possible; the question is what a sensible solution is among them, and I didn't find "a directory with timestamp in its name" a particularly sensible way to go.

Either Jeff's "refname $name's log goes to logs/graveyard/$name~" or Michael's "append ~d to each directory component, append ~f to the leaf component", both of which are already proposed, will keep the "one file per name" property that allows us to open once and efficiently read the file through. Why would we want to see an inferior alternative added to the discussion?
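A short sketch of why the "$name~" scheme attributed to Jeff already lets a dead next and a dead next/foo coexist. It assumes $name is the full ref name (for example refs/heads/next) and that paths are relative to $GIT_DIR; the function names are hypothetical.

```python
from pathlib import PurePosixPath

def jeff_graveyard_log(refname):
    """Dead reflog location under the quoted rule: logs/graveyard/<refname>~."""
    return PurePosixPath("logs/graveyard") / (refname + "~")

def has_df_conflict(paths):
    """True if some path must be both a file and a directory containing another path."""
    files = set(paths)
    return any(parent in files for p in paths for parent in p.parents)

dead = [jeff_graveyard_log(r) for r in ("refs/heads/next", "refs/heads/next/foo")]
for path in dead:
    print(path)
print(has_df_conflict(dead))  # False
```

The file for next is named next~, while next/foo~ lives inside a directory named next, so the two never need the same path. Since "~" is not permitted in ref names, no live ref can produce a directory called next~ either.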
Re: [RFC 0/3] Reflogs for deleted refs: fix breakage and suggest namespace change
Junio C Hamano gits...@pobox.com writes:

> Either Jeff's "refname $name's log goes to logs/graveyard/$name~" or Michael's "append ~d to each directory component, append ~f to the leaf component", both of which are already proposed, will keep the "one file per name" property that allows us to open once and efficiently read the file through. Why would we want to see an inferior alternative added to the discussion?

Note that there may be some other advantage I am not seeing in the "directory with timestamp in its name"; if it is a big enough advantage over what has already been proposed, then that would be a valid reason why we may want to see it as an alternative (and at that point, it is no longer "inferior").

That is the reason why I asked "How would that help us in what way?"
[RFC 0/3] Reflogs for deleted refs: fix breakage and suggest namespace change
From: Michael Haggerty mhag...@alum.mit.edu

On 08/17/2012 01:29 AM, Junio C Hamano wrote:

> Junio C Hamano gits...@pobox.com writes:
>
> > I like the general direction. Perhaps a long distant future direction could be to also use the same trick in the ref namespace so that we can have 'next' branch itself, and 'next/foo', 'next/bar' forks that are based on the 'next' branch at the same time (it obviously is a totally unrelated topic)?
>
> I notice that I was responsible for making this topic veer in the wrong direction by bringing up a new feature "having 'next' and 'next/bar' at the same time" which nobody asked. Perhaps we can drop that for now to simplify the scope of the topic, to bring the "log graveyard" back on track?

Given that a flag day would anyway be required to add a d/f-tolerant system, I could live with a separate graveyard namespace as originally proposed by Jeff. However, I still think that as long as we are making a jump, we could try to land closer to the ultimate destination.

So here are some patches that apply on top of Jeff's to show what I mean. (Please also note that I made some technical comments about Jeff's patches in an earlier email.)

The first two patches fix a breakage that I see when I apply Jeff's patches to master.

The third changes the implementation of refname_to_graveyard_reflog() and graveyard_reflog_to_refname() and touches up some test cases. It changes the naming convention for dead references to

  $GIT_DIR/logs/refs~d/heads~d/foo~d/bar~d/baz~f

I.e., the dead reflogs are stored closer to the living.

It is not obvious whether the refs part of the name should be munged to refs~d as I have done, or left unmunged. The argument in favor of munging is that the algorithm is more uniform. On the other hand, extending the same scheme to loose references would produce filenames like

  $GIT_DIR/refs~d/heads~d/foo~d/bar~d/baz~f

or maybe they should be nested inside of the refs directory like

  $GIT_DIR/refs/refs~d/heads~d/foo~d/bar~d/baz~f

(which would also give a better place to store top-level reference names).

I structured the patches to apply on top of Jeff's for presentation purposes, but if they are desired it would of course make more sense to squash his and mine together in the obvious way.

I am a little bit worried that there are other test cases that use "git prune" in the belief that it will remove all commits that were referred to by deleted references. The test suite runs cleanly for me with these patches, but before they are integrated we should audit the places where the test suite calls "git prune" to make sure that they are still testing what they think.

Michael Haggerty (3):
  t9300: format test in modern style prior to modifying it
  Delete reflogs for dead references to allow pruning
  Change naming convention for the reflog graveyard

 refs.c                               | 31 ---
 t/t7701-repack-unpack-unreachable.sh |  4 ++--
 t/t9300-fast-import.sh               | 13 +++--
 3 files changed, 33 insertions(+), 15 deletions(-)

--
1.7.11.3
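The ~d/~f naming convention can be sketched as a pair of inverse mappings. The function names are borrowed from the cover letter, but the bodies below are an illustrative Python reconstruction of the rule as described ("append ~d to each directory component, append ~f to the leaf"), not the C code in refs.c. Paths are relative to $GIT_DIR.

```python
def refname_to_graveyard_reflog(refname):
    """Map a ref name to its dead-reflog path under the ~d/~f convention."""
    parts = refname.split("/")
    munged = [p + "~d" for p in parts[:-1]] + [parts[-1] + "~f"]
    return "logs/" + "/".join(munged)

def graveyard_reflog_to_refname(path):
    """Inverse mapping; raises ValueError if path is not a dead-reflog path."""
    prefix = "logs/"
    if not path.startswith(prefix):
        raise ValueError(f"not under {prefix}: {path}")
    parts = path[len(prefix):].split("/")
    dirs_ok = all(p.endswith("~d") for p in parts[:-1])
    if not (dirs_ok and parts[-1].endswith("~f")):
        raise ValueError(f"not a graveyard reflog path: {path}")
    return "/".join(p[:-2] for p in parts)

path = refname_to_graveyard_reflog("refs/heads/foo/bar/baz")
print(path)                               # logs/refs~d/heads~d/foo~d/bar~d/baz~f
print(graveyard_reflog_to_refname(path))  # refs/heads/foo/bar/baz
```

Since every directory component ends in ~d and every leaf in ~f, a dead foo (foo~f) and a dead foo/bar (foo~d/bar~f) cannot collide, and neither can collide with a live reflog because "~" is not allowed in ref names.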
Re: [RFC 0/3] Reflogs for deleted refs: fix breakage and suggest namespace change
mhag...@alum.mit.edu writes:

> Given that a flag day would anyway be required to add a d/f-tolerant system, I could live with a separate graveyard namespace as originally proposed by Jeff. However, I still think that as long as we are making a jump, we could try to land closer to the ultimate destination.

Do we _know_ already what the ultimate destination looks like?

If the answer is yes, then I agree, but otherwise, I doubt it is a good idea to introduce unnecessary complexity to the system that may have to be ripped out and redone.

I didn't get the impression that we know the ultimate destination from the previous discussion, especially if we discount the tangent around having next and next/foo at the same time which was on nobody's wish, but I may be misremembering things.
Re: [RFC 0/3] Reflogs for deleted refs: fix breakage and suggest namespace change
On 18 Aug 2012, at 22:39, Junio C Hamano wrote:

> Do we _know_ already what the ultimate destination looks like?
>
> If the answer is yes, then I agree, but otherwise, I doubt it is a good idea to introduce unnecessary complexity to the system that may have to be ripped out and redone.
>
> I didn't get the impression that we know the ultimate destination from the previous discussion, especially if we discount the tangent around having next and next/foo at the same time which was on nobody's wish, but I may be misremembering things.

Excuse me if I miss something again, but I might be willing to discuss the ultimate destination. Could you possibly state in simple terms what the problem with determining the ultimate destination is?

I hope my opinion might be useful because I do not know anything about the actual implementation of Git, but for a while I thought I understood its intended mathematical model, until I ran into the (for me) unexpected default behavior of not pruning when fetching.

To just give a quick idea of my ideas, I thought that 'fetching' in Git was an inevitable evil that stands apart from other operations and is necessary only because the computer communication on Earth is not sufficiently developed to keep all Git repositories constantly in sync, and because one might prefer to work with a somewhat dated snapshot of a remote rather than with the constantly changing current version. I thought "snapshot" could be a good alternative name for "fetch".

-Alexey.
Re: [RFC 0/3] Reflogs for deleted refs: fix breakage and suggest namespace change
Alexey Muranov alexey.mura...@gmail.com writes:

> On 18 Aug 2012, at 22:39, Junio C Hamano wrote:
>
> > Do we _know_ already what the ultimate destination looks like?
> >
> > If the answer is yes, then I agree, but otherwise, I doubt it is a good idea to introduce unnecessary complexity to the system that may have to be ripped out and redone.
> >
> > I didn't get the impression that we know the ultimate destination from the previous discussion, especially if we discount the tangent around having next and next/foo at the same time which was on nobody's wish, but I may be misremembering things.
>
> Excuse me if I miss something again, but I might be willing to discuss the ultimate destination. Could you possibly state in simple terms what the problem with determining the ultimate destination is?

Decide if it makes sense to break backward compatibility of loose ref representation merely to support having a branch next and another branch next/foo in the same repository, and if it does, what the new loose ref representation looks like.

> I hope my opinion might be useful because I do not know anything about the actual implementation of Git,...

That sounds like contradiction.

> To just give a quick idea of my ideas, I thought that 'fetching' in Git was an inevitable evil that stands apart from other operations and is necessary only because the computer communication on Earth is not sufficiently developed to keep all Git repositories constantly in sync,...

It is a feature, not a symptom of an insufficiently developed technology, that I do not have to know what random tweaks and experiments are done in repositories of 47 thousands people who clone from me, and I can sync with any one of them only when I know there is something worth looking at when I say "git fetch".