Re: Git ~unusable on slow lines :,'C
Marcel Partap writes:

>>> Bam, the server kicked me off after taking too long to sync my copy.
>> This is unrelated to git. The HTTP server's configuration is too
>> impatient.
> Yes. How does that mean it is unrelated to git?
>
>>> - git fetch should show the total amount of data it is about to
>>> transfer!
>> It can't, because it doesn't know.
> The server side doesn't know how much the objects *it just repacked
> for transfer* weigh?
> If that truly is the case, wouldn't it make sense to make git a little
> more introspective? e.g.

It sends you more objects than the ones it just repacked in the normal
case. It could tell you, but it would have to keep track of more
information (which would make it take longer for the first bytes to get
to you) for little gain. The only thing you'd be able to do is abort
the transfer immediately, but you can do that anyway, and waiting will
only add more history to download.

>> # git info git://foo.org/bar.git
>> .. [server generating figures] ..
>> URL: git://foo.org/bar.git
>> Created/Earliest commit: ...
>> Last modified/Latest commit: ...
>> Total object count: (..commits, ..files, ..directories)
>> Total repository size (compressed): ... MiB
>> Branches:
>> [git branch -va] + branch size
>
>> The error message doesn't really know whether it is going to overwrite
>> it (the CR comes from the server), though I suppose an extra LF wouldn't
>> hurt there.
> Definitely wouldn't hurt.
>
>>> - would be nice to be able to tell git fetch to get the next chunk of
>>> say 500 commits instead of trying to receive ALL commits, then b0rking
>>> after umpteen percent on server timeout. Not?
>> You asked for the current state of the repository, and that's what it's
>> giving you.
> And instead, I would rather like to ask for the next 500 commits. No way
> to do it.

Do you mean that there are no tags in between your current state and
the one you want to be at?

>> The timeout has nothing to do with git; if you can't
>> convince the admins to increase it, you can try using another transport
>> which doesn't suffer from HTTP, as it's most likely an anti-DoS measure.
> See, I probably can't convince the admins to drop their anti-DoS
> measures. And they (drupal.org admins) probably will not change their
> allowed protocol policies.

Switch to using the raw git protocol, which is much less likely to have
this sort of measure.

> Despite that, I've had timeouts or simply stale connections dying
> before with other repositories and various transport modes.
> The easiest fix would be an option to tell git to not fetch everything...
>
>> If you want to download it bit by bit, you can tell fetch to download
>> particular tags.
> ..without specifying specific commit tags.
> Browsing gitweb sites to find a tag for which the fetch doesn't time out
> is hugely inconvenient, especially on a slow line.

Don't use the web then. Use ls-remote to see what's at the other end.

>> Doing this automatically would be working
>> around a configuration issue for a particular server, which is generally
>> better fixed in other ways.
> It is not only a configuration issue for one particular server. Git in
> general is hardly usable on slow lines because
> - it doesn't show the volume of data that is to be downloaded!

How would showing the amount of data help your connection?

> - it doesn't allow the user to sync up in steps the circumstances will
> allow to succeed.

This is unfortunate in some circumstances, but you haven't shown that
yours is one of them.
cmn
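For illustration, a minimal sketch of the two suggestions above -
inspecting the remote with ls-remote, and switching to the raw git
protocol. The URL is the thread's placeholder example, not a real
server:

    # list the refs at the other end without downloading any objects
    $ git ls-remote git://foo.org/bar.git

    # switch an existing remote from HTTP to the raw git protocol
    $ git remote set-url origin git://foo.org/bar.git
    $ git fetch origin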
Re: Git ~unusable on slow lines :,'C
>>>> - git fetch should show the total amount of data it is about to
>>>> transfer!
>>> It can't, because it doesn't know.
>> The server side doesn't know how much the objects *it just repacked
>> for transfer* weigh?
> Actually it does.

Then, please, make it display it.

> What value is that to you?

The size that is to be transferred, and the total repository size.

> You asked for the repository. If you know its size is going to be ~105
> MiB you have two choices... continue to get the repository you asked
> for, or disconnect and give up.
> Either way the size doesn't help you.

Yes it does - when displayed, one could make an informed choice. But it
doesn't show this, just the object count.. and that is not very
expressive.
It so happened last week that I tried cloning a repository with a
seemingly moderate number of objects and a small code base. However,
full Java RE zips had been checked in and updated multiple times -
suddenly my monthly 3G traffic limit was exhausted. Needless to say,
without a clue how much more data would follow, I aborted the transfer
- and was left with a net result of *zilch* bytes of code, and a line
cut down to a ridiculous speed. Now I can't even sync up my Drupal
copy.

> It would require a protocol modification to send a size estimate down
> to the client before the data in order to give the client a better
> progress meter than the object count (allowing it instead to track by
> bytes received).

Well, if it requires that, so be it. I fail to understand why this
wasn't considered before.

> But this has been seen as not very useful or worthwhile
> since it doesn't really help anyone do anything better.

Huh?

> So why change the protocol?

Sanity? Usability of git on slow lines?

> Git assumes that once it has commit X, all versions
> that predate X are already on the local workstation.

And that's true for all my repositories, since none of them was cloned
--shallow.

> This is a fundamental assumption that the entire protocol relies on.

What about --shallow, --depth?

> It is not trivial to change.

Many changes for the better are not trivial. And still worth it.

> We have been through this many times on the mailing
> list, please search the archives for "resumable clone".

OK - yet that probably doesn't invalidate all arguments in favor of it.

> they should [...] host these bundle files [...]
> and users can download and resume these

Thanks for the tip, I will forward it to the server administrators.
However, this does not help with the huge number of commits to fetch
that piles up within a couple of months.

> This is currently the best way to support resumable clone.

I wasn't even mentioning that, but that'd be nice to have as well^^...

> If bundles are made once per month or after each
> major release it's usually a manageable delta.

While downloading bundle delta files definitely is a plausible
solution - isn't that quite far from user friendly?

> If it did show you, what would you do?

Not try to check out a repository full of JRE zips blindfolded?

> Declare defeat before it even
> starts to download and give up and start a thread about how Git
> requires too much bandwidth?

Kindly ask the author to locally rewrite his history and recreate the
repository with *LINKS* to JRE zips instead? Not for a second did I
doubt the efficiency of git's packing and compression algorithms!
That's why I'm quite amazed at the sheer existence of these issues: not
showing the repository size before downloading (or, IIUC, *anywhere*),
and a protocol that is incapable of resuming or partly fetching a
repository, even though it obviously provides means of negotiation
between server and client.. Just boggles me that within 7+ years of
development this hasn't been addressed (disclaimer: I do not claim to
grok the protocol - not wanting to put blame on anyone here :).

> Have you tried to shallow clone the repository in question?

No - would it allow me to fuse the two repositories afterwards? That'd
actually be quite cool and a good idea to instantly solve my current
problem... gonna try that, thx :)

#Regards!Marcel
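To make the shallow-clone route concrete, a hedged sketch using the
thread's placeholder URL (the depths are illustrative, and the
--unshallow option only exists in Git versions newer than this thread):

    # grab only recent history first - a much smaller initial transfer
    $ git clone --depth 50 git://foo.org/bar.git
    $ cd bar

    # later, deepen the history in steps as the line allows
    $ git fetch --depth 500

    # and finally "fuse" it into a full clone by fetching the rest
    $ git fetch --unshallow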
Re: Git ~unusable on slow lines :,'C
c...@elego.de (Carlos Martín Nieto) writes:

> If you want to download it bit by bit, you can tell fetch to download
> particular tags. Doing this automatically would be working
> around a configuration issue for a particular server, which is generally
> better fixed in other ways.

As part of an upcoming "protocol update" discussion, we may want to
include allowing "upload-pack" to accept a request for a commit that is
not at the tip of any ref. E.g. "want refs/heads/master~*0.1" might ask
"I know your entire history is very big; please give me only the oldest
one tenth of the history during this round" (this is not a suggestion
on how to do this at the UI level).
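Purely as illustration of the idea above, a hypothetical negotiation
using the "want refs/heads/master~*0.1" syntax from the message (no
such capability exists in the protocol; the exchange is invented to
show the intent):

    C: want refs/heads/master~*0.1
    C: done
    S: <packfile containing only the oldest tenth of the history>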
Re: Git ~unusable on slow lines :,'C
On Tue, Oct 9, 2012 at 7:06 AM, Marcel Partap wrote:

>>> Bam, the server kicked me off after taking too long to sync my copy.
>> This is unrelated to git. The HTTP server's configuration is too
>> impatient.
> Yes. How does that mean it is unrelated to git?

It means it's out of our control; we cannot modify the HTTP server's
configuration to have a longer timeout. We can recommend that the
timeout be increased, but as you point out the admins may not do that.

>>> - git fetch should show the total amount of data it is about to
>>> transfer!
>> It can't, because it doesn't know.
> The server side doesn't know how much the objects *it just repacked
> for transfer* weigh?

Actually it does. It's just not used here. What value is that to you?
You asked for the repository. If you know its size is going to be ~105
MiB you have two choices... continue to get the repository you asked
for, or disconnect and give up. Either way the size doesn't help you.

It would require a protocol modification to send a size estimate down
to the client before the data in order to give the client a better
progress meter than the object count (allowing it instead to track by
bytes received). But this has been seen as not very useful or
worthwhile since it doesn't really help anyone do anything better. So
why change the protocol?

>> You asked for the current state of the repository, and that's what it's
>> giving you.
> And instead, I would rather like to ask for the next 500 commits. No way
> to do it.

No, there isn't. Git assumes that once it has commit X, all versions
that predate X are already on the local workstation. This is a
fundamental assumption that the entire protocol relies on. It is not
trivial to change. We have been through this many times on the mailing
list; please search the archives for "resumable clone".

>> The timeout has nothing to do with git, if you can't
>> convince the admins to increase it, you can try using another transport
>> which doesn't suffer from HTTP, as it's most likely an anti-DoS measure.
> See, I probably can't convince the admins to drop their anti-DoS
> measures. And they (drupal.org admins) probably will not change their
> allowed protocol policies.

Then if they are hosting really big repositories that are hard for
their contributors to obtain, they should take the time to write a
script that periodically creates a bundle file for each repository
using `git bundle create repo.bundle --all`. They can host these bundle
files on any file transport service like HTTP or BitTorrent, and users
can download and resume them using normal HTTP download tools. Once you
have a bundle file locally, you can clone from it with modern Git using
`git clone $(pwd)/repo.bundle` to initialize the repository.

This is currently the best way to support resumable clone. The repo
will be stale by whatever time has elapsed since the bundle file was
created. But then Git can do an incremental fetch to catch up, and this
transfer size should be limited to the progress made since the bundle
was made. If bundles are made once per month or after each major
release it's usually a manageable delta.

> It is not only a configuration issue for one particular server. Git in
> general is hardly usable on slow lines because
> - it doesn't show the volume of data that is to be downloaded!

If it did show you, what would you do? Declare defeat before it even
starts to download, give up, and start a thread about how Git requires
too much bandwidth?

Have you tried to shallow clone the repository in question?
> - it doesn't allow the user to sync up in steps the circumstances will
> allow to succeed.

Sadly, this is quite true. :-(
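For reference, a minimal sketch of the bundle workflow described in the
message above (host names and paths are illustrative, not from the
thread):

    # server side, run periodically, e.g. from cron:
    $ cd /srv/git/bar.git
    $ git bundle create /var/www/bundles/bar.bundle --all

    # client side: download resumably, then clone from the bundle
    $ wget -c http://foo.org/bundles/bar.bundle   # -c resumes partial downloads
    $ git clone "$(pwd)/bar.bundle" bar
    $ cd bar

    # point origin at the live repository and catch up incrementally
    $ git remote set-url origin git://foo.org/bar.git
    $ git fetch origin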
Re: Git ~unusable on slow lines :,'C
>> Bam, the server kicked me off after taking too long to sync my copy.
> This is unrelated to git. The HTTP server's configuration is too
> impatient.

Yes. How does that mean it is unrelated to git?

>> - git fetch should show the total amount of data it is about to
>> transfer!
> It can't, because it doesn't know.

The server side doesn't know how much the objects *it just repacked
for transfer* weigh?
If that truly is the case, wouldn't it make sense to make git a little
more introspective? e.g.

> # git info git://foo.org/bar.git
> .. [server generating figures] ..
> URL: git://foo.org/bar.git
> Created/Earliest commit: ...
> Last modified/Latest commit: ...
> Total object count: (..commits, ..files, ..directories)
> Total repository size (compressed): ... MiB
> Branches:
> [git branch -va] + branch size

> The error message doesn't really know whether it is going to overwrite
> it (the CR comes from the server), though I suppose an extra LF wouldn't
> hurt there.

Definitely wouldn't hurt.

>> - would be nice to be able to tell git fetch to get the next chunk of
>> say 500 commits instead of trying to receive ALL commits, then b0rking
>> after umpteen percent on server timeout. Not?
> You asked for the current state of the repository, and that's what it's
> giving you.

And instead, I would rather like to ask for the next 500 commits. No
way to do it.

> The timeout has nothing to do with git; if you can't
> convince the admins to increase it, you can try using another transport
> which doesn't suffer from HTTP, as it's most likely an anti-DoS measure.

See, I probably can't convince the admins to drop their anti-DoS
measures. And they (drupal.org admins) probably will not change their
allowed protocol policies.
Despite that, I've had timeouts or simply stale connections dying
before with other repositories and various transport modes.
The easiest fix would be an option to tell git to not fetch everything...

> If you want to download it bit by bit, you can tell fetch to download
> particular tags.

..without specifying specific commit tags.
Browsing gitweb sites to find a tag for which the fetch doesn't time
out is hugely inconvenient, especially on a slow line.

> Doing this automatically would be working
> around a configuration issue for a particular server, which is generally
> better fixed in other ways.

It is not only a configuration issue for one particular server. Git in
general is hardly usable on slow lines because
- it doesn't show the volume of data that is to be downloaded!
- it doesn't allow the user to sync up in steps the circumstances will
allow to succeed.

#Regards!Marcel.
Re: Git ~unusable on slow lines :,'C
Marcel Partap writes:

> Dear Git Devs,
> I love GIT, but for the past couple of months I've been on 3G, and
> after my traffic limit is exceeded, things slow down to a feeble
> 8KiB/s. Just like back then - things moved somewhat slower. And I'm
> fine with that - as long as things just keep moving.
> Unfortunately, git does not scale down very well, so for ten more days
> I will be unable to get the newest commits onto my machine. Which is
> very, very sad :/
>
>> git fetch --verbose --all
>> Fetching origin
>> POST git-upload-pack (1023 bytes)
>> POST git-upload-pack (gzip 1123 to 614 bytes)
>> POST git-upload-pack (gzip 1973 to 1030 bytes)
>> POST git-upload-pack (gzip 5173 to 2639 bytes)
>> POST git-upload-pack (gzip 7978 to 4042 bytes)
>> remote: Counting objects: 24504, done.
>> remote: Compressing objects: 100% (10705/10705), done.
>> error: RPC failed; result=56, HTTP code = 200iB | 10 KiB/s
>> fatal: The remote end hung up unexpectedly
>> fatal: early EOF
>> fatal: index-pack failed
>> error: Could not fetch origin
>
> Bam, the server kicked me off after taking too long to sync my copy.

This is unrelated to git. The HTTP server's configuration is too
impatient.

> Multiple potential points of action:
> - git fetch should show the total amount of data it is about to
> transfer!

It can't, because it doesn't know.

> - when ab^H^Horting, the cursor should be moved down (tput cud1) to not
> overwrite previous output

The error message doesn't really know whether it is going to overwrite
it (the CR comes from the server), though I suppose an extra LF
wouldn't hurt there.

> - would be nice to be able to tell git fetch to get the next chunk of
> say 500 commits instead of trying to receive ALL commits, then b0rking
> after umpteen percent on server timeout. Not?

You asked for the current state of the repository, and that's what it's
giving you. The timeout has nothing to do with git; if you can't
convince the admins to increase it, you can try using another transport
which doesn't suffer from HTTP, as it's most likely an anti-DoS
measure.

If you want to download it bit by bit, you can tell fetch to download
particular tags. Doing this automatically would be working around a
configuration issue for a particular server, which is generally better
fixed in other ways.

cmn
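As a concrete illustration of fetching bit by bit via tags (the tag
names and URL are placeholders, not from the thread):

    # see which tags the server offers, without downloading any objects
    $ git ls-remote --tags git://foo.org/bar.git

    # fetch one tag's history at a time, oldest first; each fetch only
    # transfers the objects missing since the previous one
    $ git fetch origin tag v1.0
    $ git fetch origin tag v2.0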
Git ~unusable on slow lines :,'C
Dear Git Devs,
I love GIT, but for the past couple of months I've been on 3G, and
after my traffic limit is exceeded, things slow down to a feeble
8KiB/s. Just like back then - things moved somewhat slower. And I'm
fine with that - as long as things just keep moving.
Unfortunately, git does not scale down very well, so for ten more days
I will be unable to get the newest commits onto my machine. Which is
very, very sad :/

> git fetch --verbose --all
> Fetching origin
> POST git-upload-pack (1023 bytes)
> POST git-upload-pack (gzip 1123 to 614 bytes)
> POST git-upload-pack (gzip 1973 to 1030 bytes)
> POST git-upload-pack (gzip 5173 to 2639 bytes)
> POST git-upload-pack (gzip 7978 to 4042 bytes)
> remote: Counting objects: 24504, done.
> remote: Compressing objects: 100% (10705/10705), done.
> error: RPC failed; result=56, HTTP code = 200iB | 10 KiB/s
> fatal: The remote end hung up unexpectedly
> fatal: early EOF
> fatal: index-pack failed
> error: Could not fetch origin

Bam, the server kicked me off after taking too long to sync my copy.

Multiple potential points of action:
- git fetch should show the total amount of data it is about to
transfer!
- when ab^H^Horting, the cursor should be moved down (tput cud1) to not
overwrite previous output
- would be nice to be able to tell git fetch to get the next chunk of
say 500 commits instead of trying to receive ALL commits, then b0rking
after umpteen percent on server timeout. Not?

#Regards!Marcel c: