2010/3/26, Jamie Lokier wrote:

> Yang Tse wrote:
> > content is provided in files and files have a history and as
> > such a revision number.
>
> Git history is not linear, and is updated independently at different
> locations without having to be online all the time, so a global
> incrementing version number is fundamentally impossible.

Given a single repo with more than one branch, a given file might have a different life on each branch. The file could perfectly well have a revision number that includes a branch Id for each branch without much problem.

The fun starts when two branches are merged, and with how the merging should be considered conceptually. One option would be to consider that there is a donor branch and a recipient branch. In this case the merged file would continue with the recipient branch Id and the next revision number for that branch.

Another option would be to consider that both branches are donors and that the merging 'creates' a third unified branch. In this case the branch Id would be a new one and the revision number part would start over for the merged file.

With disconnected repos it would be something similar, but a repo Id would also come into the 'file revision number', implying that the revision number of a given file would be a three-part thingy: 'repo_id . branch_id . rev_id'.

Whether those designing a VCS wish to use the above or not is a different matter. But it should be doable.

As a matter of fact, commit time-stamps in a DVCS will most probably be those of the merge point when two repos are merged and not the time-stamp of the commit that took place when in disconnected state. It is a similar problem.

The big fallacy of nearly all so-called 'distributed' VCS is calling them distributed; they should simply be called mostly-disconnected. The longer the repos are disconnected without merging contents, the less likely I would be to call it, as a whole, a VCS. In the end a project needs an authoritative repository.

> > Another thing that shows that git is yet in its infancy is the use of
> > internal keys as external references. GUID's are fine as primary keys
> > in a database,
>
> They aren't internal keys, and they aren't GUIDs!
>
> They are strong crypto hashes of the entire history up to that point.
>
> And yes, they are checked.

OK, my mistake. They are hashes.

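Going back to the three-part 'repo_id . branch_id . rev_id' idea above, here is a rough sketch in Python of how the two merge options could behave. It is only an illustration under my own assumptions: the names FileRevision, merge_into_recipient and merge_into_new_branch are hypothetical, a restarted revision count is assumed to begin at 1, and no existing VCS is claimed to work this way.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileRevision:
    """Hypothetical per-file revision: 'repo_id . branch_id . rev_id'."""
    repo_id: str
    branch_id: str
    rev_id: int

    def __str__(self) -> str:
        return f"{self.repo_id}.{self.branch_id}.{self.rev_id}"

    def next(self) -> "FileRevision":
        # An ordinary commit on the same branch only bumps rev_id.
        return FileRevision(self.repo_id, self.branch_id, self.rev_id + 1)


def merge_into_recipient(donor: FileRevision,
                         recipient: FileRevision) -> FileRevision:
    # Option 1: donor/recipient. The merged file keeps the recipient's
    # repo and branch ids and takes the next revision number there.
    # The donor's own numbering does not affect the result.
    return recipient.next()


def merge_into_new_branch(a: FileRevision, b: FileRevision,
                          repo_id: str, new_branch_id: str) -> FileRevision:
    # Option 2: both branches are donors and the merge 'creates' a third
    # branch, so the revision part starts over (assumed here to be 1).
    # repo_id is the repo in which the merge is performed.
    return FileRevision(repo_id, new_branch_id, 1)


if __name__ == "__main__":
    trunk = FileRevision("repoA", "trunk", 7)
    feature = FileRevision("repoA", "feature", 3)
    print(merge_into_recipient(feature, trunk))
    print(merge_into_new_branch(feature, trunk, "repoA", "unified"))
```

Note that neither option makes the identifier global across repos that never talk to each other; the repo_id part only keeps two disconnected histories from colliding on the same number.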
> > but only for internal purposes. Exposing these as the primary way of
> > identifying elements, no matter if these are code commits, financial
> > transactions or cake recipes makes little sense, except maybe for
> > debugging purposes.
>
> Good luck arguing that one with git users :-)

Nah, it is simply my opinion, mostly for personal consumption. I'm not involved in git design nor anything related.

It might be valuable for some, at some moments, to have very low-level VCS commands available at their fingertips. But most users simply need a high-level interface that lets them perform basic operations, without the real risk of giving every committer the ability to screw up the repository. Obviously, if the project implements a pull-only policy, and the ones doing the pulls have practically written the VCS themselves, the risk of screwing up the authoritative repository diminishes. Anyway, for me, it is a disconnected VCS, not a distributed one.

> What other identifier can you use to identify a point in history
> globally, without requiring every user to be online when they commit
> things locally?
>
> That's a serious question, btw. If you have a good solution I'd be
> interested.

Globally across disconnected and unsynchronized repos, it does not exist, because disconnected and unsynchronized repos are not guaranteed to have anything in common beyond the point at which they forked. When they are merged they can share and agree on whatever they want; at the merge point they can agree on whatever global identifier they wish to share. But as soon as they disconnect again they cannot assume anything about the other disconnected repo: it might stay unchanged, get wiped out or even get corrupted. Each repo has its own life.

> All the VCSs based on things like CVS/SVN two-way replication have a
> worse problem: version numbers increment, but are different at each
> location.
> When someone tells you to look up version x.y.z, you have
> to use the master server holding x.y.z or you can't look it up. What
> if it's down, corrupt, or you're working offline? Then you're stuck.

Yep, that's the reason I said that post-commit hooks could be used as a poor man's replication system. It may work in some setups, depending much on the ability of those deciding how it is deployed. It has shortfalls, and the more manual intervention a single merge requires in a given VCS, the more likely it is to require additional manual administration in a multi-master setup. Single-master with multiple replicas should work more smoothly and be easier to manage.

The fact is that truly distributed VCS is not something widespread, even though it is possible to do. The Paxos and Mencius algorithms are the key to this:
http://en.wikipedia.org/wiki/Paxos_algorithm
Some of the external links on that Wikipedia page are _really_ interesting.

VCS-related systems I've been able to locate are, for example:

Reliable Software's Code Co-op
http://www.relisoft.com

WANdisco's CVS-multisite, SVN-multisite and JIRA-multisite
http://www.wandisco.com

Cheers,
-- 
-=[Yang]=-
-------------------------------------------------------------------
List admin: http://cool.haxx.se/list/listinfo/curl-library
Etiquette:  http://curl.haxx.se/mail/etiquette.html
