Re: Git clonebundles
Hi Junio,

On Tue, 7 Feb 2017, Junio C Hamano wrote:

> Johannes Schindelin writes:
>
> >> If people think it might be useful to have it around to experiment, I
> >> can resurrect and keep that in 'pu' (or rather 'jch'), as long as it
> >> does not overlap and conflict with other topics in flight. Let me try
> >> that in today's integration cycle.
> >
> > I would like to remind you of my suggestion to make this more publicly
> > visible and substantially easier to play with, by adding it as an
> > experimental feature (possibly guarded via an explicit opt-in config
> > setting).
>
> I do not understand why you want to give this topic undue prominence
> over any other random topics that cook in 'pu' [...]

Since you ask so nicely for an explanation: clonebundles got a really lively and active discussion at the Contributors' Summit. So it is not your run-of-the-mill typo fix; the bundle issue is something that clearly receives a lot of interest, in particular from developers who are unfamiliar with the idiosyncrasies of core Git development.

And I got the very distinct impression that Git would benefit a lot from these developers, *in particular* since they come with fresh perspectives.

Now, we can make it hard for them (e.g. expecting them to sift through a few months' worth of What's Cooking mails to find out whether there has been any related work, what the branch name is, if any, and where to find that branch), or we can make it easy for them to help us make Git better.

I would like us to choose the easier route for them. Because it would benefit us.

Ciao,
Johannes
Re: Git clonebundles
Johannes Schindelin writes:

>> If people think it might be useful to have it around to experiment, I
>> can resurrect and keep that in 'pu' (or rather 'jch'), as long as it
>> does not overlap and conflict with other topics in flight. Let me try
>> that in today's integration cycle.
>
> I would like to remind you of my suggestion to make this more publicly
> visible and substantially easier to play with, by adding it as an
> experimental feature (possibly guarded via an explicit opt-in config
> setting).

I do not understand why you want to give this topic undue prominence over any other random topics that cook in 'pu' and are later merged down to 'next' and then 'master' only after they turn out to be useful (or at least harmless).

If there were somebody who is the champion of that topic, advocating that any clone-bundle solution must be based on this topic, it would be different. Even though I am not opposed to the topic myself, I am not that somebody. That is why I kept it around to wait to see if somebody finds it potentially useful and then discarded it after seeing no such person stepped up.

That champion of the topic would spend the necessary engineering effort to document it as experimental, to make sure that there is a reasonable upgrade/transition route if the "v3" format turns out to be not very useful, etc. by rerolling the patches or following up on them to advance it from 'pu' down to 'next' and to 'master' just like any other topic.

Judging from the tone of his message (i.e. "unfortunately" in it), Christian may want to be one, or somebody else may want to be one.
Re: Git clonebundles
On Tue, Feb 7, 2017 at 4:04 AM, Johannes Schindelin wrote:
> Hi Junio,
>
> On Mon, 6 Feb 2017, Junio C Hamano wrote:
>
>> Christian Couder writes:
>>
>> > There is also Junio's work on Bundle v3 that was unfortunately
>> > recently discarded. Look for "jc/bundle" in:
>> >
>> > http://public-inbox.org/git/xmqq4m0cry60@gitster.mtv.corp.google.com/
>> >
>> > and previous "What's cooking in git.git" emails.
>>
>> If people think it might be useful to have it around to experiment, I
>> can resurrect and keep that in 'pu' (or rather 'jch'), as long as it
>> does not overlap and conflict with other topics in flight. Let me try
>> that in today's integration cycle.
>
> I would like to remind you of my suggestion to make this more publicly
> visible and substantially easier to play with, by adding it as an
> experimental feature (possibly guarded via an explicit opt-in config
> setting).
>
> Ciao,
> Johannes

For making this more publicly visible, I want to look into publishing the cooking reports on git-scm.com.

Maybe we can have a "dev" section there that has

* a "getting started" section
  linking to Documentation/SubmittingPatches
  How to set up your Travis
* a "current state of development" section
  e.g. the cooking reports, the release calendar,
  a description of the workflow
  (which branches exist and which purpose each serves)

Most of the static information is already covered quite well in Documentation/, so there is definitely overlap, hence lots of links to the ground truth.

The dynamic information, however (release calendar, cooking reports), is not described well enough in Documentation/, so I think we'd want to focus on that in the dev section.
Re: Git clonebundles
Hi Junio,

On Mon, 6 Feb 2017, Junio C Hamano wrote:

> Christian Couder writes:
>
> > There is also Junio's work on Bundle v3 that was unfortunately
> > recently discarded. Look for "jc/bundle" in:
> >
> > http://public-inbox.org/git/xmqq4m0cry60@gitster.mtv.corp.google.com/
> >
> > and previous "What's cooking in git.git" emails.
>
> If people think it might be useful to have it around to experiment, I
> can resurrect and keep that in 'pu' (or rather 'jch'), as long as it
> does not overlap and conflict with other topics in flight. Let me try
> that in today's integration cycle.

I would like to remind you of my suggestion to make this more publicly visible and substantially easier to play with, by adding it as an experimental feature (possibly guarded via an explicit opt-in config setting).

Ciao,
Johannes
Re: Git clonebundles
Christian Couder writes:

> There is also Junio's work on Bundle v3 that was unfortunately
> recently discarded.
> Look for "jc/bundle" in:
>
> http://public-inbox.org/git/xmqq4m0cry60@gitster.mtv.corp.google.com/
>
> and previous "What's cooking in git.git" emails.

If people think it might be useful to have it around to experiment, I can resurrect and keep that in 'pu' (or rather 'jch'), as long as it does not overlap and conflict with other topics in flight. Let me try that in today's integration cycle.
Re: Git clonebundles
On Sat, Feb 4, 2017 at 6:39 PM, Shawn Pearce wrote:
> On Mon, Jan 30, 2017 at 11:00 PM, Stefan Saasen wrote:
>>
>> Bitbucket recently added support for Mercurial’s clonebundle extension
>> (http://gregoryszorc.com/blog/2015/10/22/cloning-improvements-in-mercurial-3.6/).
>> Mercurial’s clone bundles allow the Mercurial client to seed a repository
>> using a bundle file instead of dynamically generating a bundle for the client.
> ...
>> Prior art
>> ~~~~~~~~~
>>
>> Our proof-of-concept is built on top of ideas that have been
>> circulating for a while. We are aware of a number of proposed changes
>> in this space:
>>
>>
>> * Jeff King's work on network bundles:
>> https://github.com/peff/git/commit/17e2409df37edd0c49ef7d35f47a7695f9608900
>> * Nguyễn Thái Ngọc Duy's work on "[PATCH 0/8] Resumable clone
>> revisited, proof of concept":
>> https://www.spinics.net/lists/git/msg267260.html
>> * Resumable clone work by Kevin Wern:
>> https://public-inbox.org/git/1473984742-12516-1-git-send-email-kevin.m.w...@gmail.com/
>
> I think you missed the most common deployment of prior art, which is
> Android using the git-repo tool[1]. The git-repo tool has had
> clone.bundle support since Sep 2011[2] and the Android Git servers
> have been answering /clone.bundle requests[3] since just before that.
> The bundle files are generated with `git bundle create` on a regular
> schedule by cron.
>
> [1]
> https://gerrit.googlesource.com/git-repo/+/04071c1c72437a930db017bd4c562ad06087986a/project.py#2091
> [2]
> https://gerrit.googlesource.com/git-repo/+/f322b9abb4cadc67b991baf6ba1b9f2fbd5d7812
> [3] https://android.googlesource.com/platform/frameworks/base/clone.bundle

There is also Junio's work on Bundle v3 that was unfortunately recently discarded. Look for "jc/bundle" in:

http://public-inbox.org/git/xmqq4m0cry60@gitster.mtv.corp.google.com/

and previous "What's cooking in git.git" emails.

I am also working on adding external object database support using previous work by Peff: http://public-inbox.org/git/20161130210420.15982-1-chrisc...@tuxfamily.org/ that could be extended to support clone bundles. [...]
Re: Git clonebundles
On Mon, Jan 30, 2017 at 11:00 PM, Stefan Saasen wrote:
>
> Bitbucket recently added support for Mercurial’s clonebundle extension
> (http://gregoryszorc.com/blog/2015/10/22/cloning-improvements-in-mercurial-3.6/).
> Mercurial’s clone bundles allow the Mercurial client to seed a repository
> using a bundle file instead of dynamically generating a bundle for the client.
...
> Prior art
> ~~~~~~~~~
>
> Our proof-of-concept is built on top of ideas that have been
> circulating for a while. We are aware of a number of proposed changes
> in this space:
>
>
> * Jeff King's work on network bundles:
> https://github.com/peff/git/commit/17e2409df37edd0c49ef7d35f47a7695f9608900
> * Nguyễn Thái Ngọc Duy's work on "[PATCH 0/8] Resumable clone
> revisited, proof of concept":
> https://www.spinics.net/lists/git/msg267260.html
> * Resumable clone work by Kevin Wern:
> https://public-inbox.org/git/1473984742-12516-1-git-send-email-kevin.m.w...@gmail.com/

I think you missed the most common deployment of prior art, which is Android using the git-repo tool[1]. The git-repo tool has had clone.bundle support since Sep 2011[2] and the Android Git servers have been answering /clone.bundle requests[3] since just before that. The bundle files are generated with `git bundle create` on a regular schedule by cron.

[1] https://gerrit.googlesource.com/git-repo/+/04071c1c72437a930db017bd4c562ad06087986a/project.py#2091
[2] https://gerrit.googlesource.com/git-repo/+/f322b9abb4cadc67b991baf6ba1b9f2fbd5d7812
[3] https://android.googlesource.com/platform/frameworks/base/clone.bundle

> Whilst the above mentioned proposals/proposed changes are in a similar
> space, I would be interested to understand whether there is any
> consensus on the general idea of supporting static bundle files as a
> mechanism to seed a repository?

I don't think we have a consensus on how to advertise that a bundle file is available, which is why there are so many instances of prior art.

In 2011 I just threw together /clone.bundle on HTTP because it was easy to make the Python wrapper ask for the file and handle 404 gracefully as not found and fall back to `git clone`.
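
The probe-and-fall-back behaviour described above can be sketched roughly as follows. This is only an illustration in Python, not the git-repo code: the function names (`bundle_url`, `try_download_bundle`, `clone`) and the injectable `opener` parameter are made up for this example, and the sequence of git commands used to seed from the bundle is an assumption about one reasonable way to do it.

```python
import shutil
import subprocess
import urllib.error
import urllib.request


def bundle_url(repo_url):
    """Return the conventional bundle location, $URL/clone.bundle."""
    return repo_url.rstrip("/") + "/clone.bundle"


def try_download_bundle(repo_url, dest_path, opener=urllib.request.urlopen):
    """Download $URL/clone.bundle to dest_path.

    Returns True on success and False when the server answers 404, which
    is treated as "no bundle available". Other HTTP errors propagate.
    `opener` is a hypothetical hook so the network call can be replaced.
    """
    try:
        with opener(bundle_url(repo_url)) as response, open(dest_path, "wb") as out:
            shutil.copyfileobj(response, out)
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return False
        raise
    return True


def clone(repo_url, workdir, bundle_path):
    """Seed workdir from a bundle if one exists, else do a plain clone."""
    if try_download_bundle(repo_url, bundle_path):
        # Assumed sequence: clone from the local bundle file, point
        # 'origin' back at the real repository, then fetch the delta.
        subprocess.run(["git", "clone", bundle_path, workdir], check=True)
        subprocess.run(
            ["git", "-C", workdir, "remote", "set-url", "origin", repo_url],
            check=True,
        )
        subprocess.run(["git", "-C", workdir, "fetch", "origin"], check=True)
    else:
        # No bundle advertised at the conventional path: ordinary clone.
        subprocess.run(["git", "clone", repo_url, workdir], check=True)
```

The point of the design is that a server without bundle support needs no changes at all: the 404 is the whole negotiation, at the cost of one extra HTTP request per clone.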
Git clonebundles
Hi all,

Bitbucket recently added support for Mercurial’s clonebundle extension (http://gregoryszorc.com/blog/2015/10/22/cloning-improvements-in-mercurial-3.6/). Mercurial’s clone bundles allow the Mercurial client to seed a repository using a bundle file instead of dynamically generating a bundle for the client.

Mercurial clonebundles?
~~~~~~~~~~~~~~~~~~~~~~~

With Mercurial clonebundles the high level clone sequence looks like this:

1. The command "hg clone URL" attempts to clone the repository at URL.
2. If a bundle file exists for the repository, the existence of the file `clonebundles.manifest` causes the server to advertise the `clonebundle` capability (capabilities lookup is the first command the client issues).
3. In the above case the client then executes the command "clonebundles".
4. The manifest file will be returned.
5. The client then selects a bundle file to download from the list of URLs advertised in the manifest file, to seed the repository.
6. To update the repository the last step involves fetching the latest changes.

Why is this useful?
~~~~~~~~~~~~~~~~~~~

The fact that clone bundles can be distributed as static files enables us to use static file servers for bundle distribution. Users have also reported latency improvements for clone operations of popular Mercurial repositories. Additionally this significantly reduces the resource usage of clone operations, as clone operations are reduced to simpler fetches to resolve the delta between the current repository and the downloaded bundle state.

clonebundles for git?
~~~~~~~~~~~~~~~~~~~~~

We recently looked into how this concept could be translated to git. This is not a new idea and has been discussed before (more on that later) but our success with the Mercurial clonebundle rollout prompted us to revisit this topic.

We believe that bringing a similar concept to git could have the following benefits:

* Improved clone times for users that clone large git repositories, especially if bundle file distribution leverages global CDNs.
* Improved scalability of git for managing large popular repositories, by offloading a significant portion of the clone resource usage to CDNs or static file hosts.

Our current proof-of-concept to explore this space closely follows the approach from Mercurial outlined above.

* An `/info/bundle` path returns a bundle manifest (over HTTP).
* The bundle manifest contains a simple list of URLs with some additional metadata that allows the client to select a suitable bundle download URL.
* The bundle download URL points to a bundle file generated using `git bundle create`, including all the relevant refs, as a self-contained repository seed.
* The client probes the target URL with a `GET` request to $URL/info/bundle and downloads the bundle file if present.
* The repository will be created based on the downloaded bundle (downloading a static file allows resumable downloads or parallel downloads of chunks if the file/web server supports range requests).
* A `git fetch` and the appropriate checkout then updates the "cloned" repository to match the latest upstream state.

The proof-of-concept was built as an external binary `git-clone2` that mimics the behaviour of the `git clone` command, so unfortunately I can't provide any patches to git to demonstrate the behaviour.

Ultimately our proof-of-concept is built around a few core ideas:

* Re-use the existing bundle format as a single-file, self-contained repository representation.
* Introduce a bundle manifest (accessible at `$URL/info/bundle`) that allows the client to resolve a suitable bundle download URL.
* Teach the `git clone` command to accept and prefer seeding a repository using a static bundle file that is advertised in a bundle manifest.
* Re-use as much as possible of the existing commands and in particular the `git bundle` machinery to seed the repository and to create the static bundle file.
* We accept additional storage requirements for the bundle files in addition to the actual repository content in pack-files or loose objects. Hosting providers or system administrators are free to decide how many bundles to advertise and how frequently the bundles are updated.
* It targets the "seed from a bundle file" use case, with resumable clones just being a potential side-effect.

Some of the problems that need to be solved with an approach like this are:

* Bundle advertisement/bundle negotiation: We considered advertising a new capability "clonebundle" as part of the rev advertisement capabilities list. This would allow clients that support clonebundles to abort the clone attempt and resolve a suitable bundle URL from a bundle manifest at `$URL/info/bundle` instead. For HTTP this would amount to an early termination when retrieving the ref-advertisement. Note: We didn't pursue this for our proof-of-concept so we didn't explore whether this is feasible.
* Uniform approach for the supported transports: Our