On Mon, Aug 15, 2011 at 18:40, Russell N. Nelson - rnnelson
<rnnel...@clarkson.edu> wrote:
> The problem is that 1) the files are bulky,

That's expected. :-)

> 2) there are many of them, 3) they are in constant flux,

That is not really a problem: since there are many of them, statistically
they are not in flux.

> and 4) it's likely that your connection would close for whatever reason
> part-way through the download..

I seem not to have forgotten to mention zsync/rsync. ;-)

> Even taking a snapshot of the filenames is dicey. By the time you finish,
> it's likely that there will be new ones, and possible that some will be
> deleted. Probably the best way to make this work is to 1) make a snapshot of
> files periodically,

Since I've been told the files are backed up, such a snapshot should
naturally already exist.

> 2) create an API which returns a tarball using the snapshot of files that
> also implements Range requests.

I would very much prefer a ready-to-use format instead of a tarball, not to
mention that it's pretty resource-consuming to create a tarball just for that.

> Of course, this would result in a 12-terabyte file on the recipient's host.
> That wouldn't work very well. I'm pretty sure that the recipient would need
> an http client which would 1) keep track of the place in the bytestream and
> 2) split out files and write them to disk as separate files. It's possible
> that a program like getbot already implements this.

I'd make a snapshot without tar, especially because partial transfers aren't
possible that way.

-- 
byte-byte,
grin

_______________________________________________
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
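
For illustration only, here is a minimal sketch of the client-side bookkeeping
the quoted proposal describes: given a snapshot that lists file sizes in stream
order, map a byte offset in the concatenated stream back to a file and an
offset within it, then build the Range header to resume from that point. The
function names and the size-only manifest are assumptions made for this sketch;
none of it is an existing Wikimedia API or an existing tool's behaviour.

```python
from bisect import bisect_right
from itertools import accumulate


def resume_point(sizes, offset):
    """Map a stream offset to (file index, offset inside that file).

    `sizes` is a hypothetical snapshot manifest: file sizes in the order
    the files appear in the concatenated byte stream.
    An index equal to len(sizes) means the whole stream was received.
    """
    ends = list(accumulate(sizes))  # cumulative end offset of each file
    total = ends[-1] if ends else 0
    if offset < 0 or offset > total:
        raise ValueError("offset outside the snapshot stream")
    # First file whose end lies beyond the offset; empty files are skipped.
    index = bisect_right(ends, offset)
    start = ends[index - 1] if index else 0
    return index, offset - start


def range_header(offset):
    """HTTP header asking the server to resume at `offset` (open-ended range)."""
    return {"Range": f"bytes={offset}-"}


if __name__ == "__main__":
    manifest = [10, 0, 5]  # three files: 10 bytes, empty, 5 bytes
    print(resume_point(manifest, 12))
    print(range_header(12))
```

The same `range_header` call works per file when each file is fetched
separately, with `offset` set to the size of the partial local copy; in that
case no stream-to-file mapping is needed at all.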