Hi,

Wouldn't a p2p system scale better than any server-based solution? Also with regard to cost...
Pedro.

On Sat, Jul 18, 2015 at 9:42 AM, Philip Hands <p...@hands.com> wrote:

> Troy Benjegerdes <ho...@hozed.org> writes:
>
> > On Fri, Jul 17, 2015 at 09:38:06PM +0200, Jakub Wilk wrote:
> >> * Ole Streicher <oleb...@debian.org>, 2015-07-17, 10:34:
> >> >But: These packages sum up to ~25 GB, with the maximal package
> >> >size of 3.5 GB.
> >>
> >> Well, that's a lot. Just as data points:
> >>
> >> * The biggest binary package currently in the archive,
> >> ns3-doc_3.17+dfsg-1_all.deb, is only ~1GB.
> >>
> >> * The biggest source package, nvidia-cuda-toolkit_6.0.37-5, is only
> >> ~1.5GB.
> >>
> >>
> >> I'm afraid you might need to wait for the advent of data.d.o:
> >> https://lists.debian.org/87tzgm6yee....@vorlon.ganneff.de
> >> (mind the typo: s/2 weeks/10 years/)
> >>
> >
> > My first thought was "well, can all of us science-type users
> > agree to host something like /afs/data.d.o/", and then I saw
> > the following:
> >
> > On Fri, Jul 17, 2015 at 02:03:54AM -0700, Afif Elghraoui wrote:
> >> Package: wnpp
> >> Severity: wishlist
> >> Owner: Afif Elghraoui <a...@ghraoui.name>
> >> X-Debbugs-Cc: debian-devel@lists.debian.org
> >>
> >> * Package name : ori
> >> Version : 0.8.1
> >> Upstream Author : Stanford University <orifs-de...@lists.stanford.edu>
> >> * URL : http://ori.scs.stanford.edu/
> >> * License : ori (MIT-like)
> >> Programming Lang: C++
> >> Description : secure distributed file system
> >>
> >> Ori is a distributed file system built for offline operation and empowers
> >> the user with control over synchronization operations and conflict
> >> resolution.
> >> History is provided through lightweight snapshots and users can verify that
> >> the history has not been tampered with. Through the use of replication,
> >> instances can be resilient and recover damaged data from other nodes.
> >
> > So is there any sort of reasonable internet-scale distributed
> > filesystem in use that might actually work for this?
>
> Git-annex supports Tahoe-LAFS:
>
> https://git-annex.branchable.com/special_remotes/tahoe/
>
> but given that it also supports all of these:
>
> https://git-annex.branchable.com/special_remotes/
>
> I'd guess that the data would quite often reside on resources that are at
> least as reliable as whatever we might set up, so one could just do it
> on a case by case basis.
>
> git-annex allows one to set the number of copies that one wants to exist
> of the data, so one could perhaps insist that data have multiple
> sources, and that could be checked periodically, with some plan to copy
> data elsewhere if and when a source disappears.
>
> The users of the data could be given the option to contribute to the
> checking process, so that it gets done as part of the act of using the
> data.
>
> Any effort required to shift data to new resources when old sources
> disappear could be done by those that benefit from the access to the
> data, in a distributed manner.
>
> Cheers, Phil.
> --
> |)| Philip Hands [+44 (0)20 8530 9560] HANDS.COM Ltd.
> |-| http://www.hands.com/ http://ftp.uk.debian.org/
> |(| Hugo-Klemm-Strasse 34, 21075 Hamburg, GERMANY
>
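
To make Phil's suggestion concrete (require a minimum number of copies, check periodically, copy data elsewhere when a source disappears), here is a rough Python sketch. It is not git-annex's implementation or API, only an illustration of the policy. The names `Dataset`, `plan_repairs`, `reachable`, `spare_hosts` and `min_copies` are all made up for this example, and it assumes some earlier step has already worked out which hosts are reachable.

```python
from dataclasses import dataclass, field


@dataclass
class Dataset:
    """A named data set and the hosts believed to hold a copy (hypothetical model)."""
    name: str
    sources: set[str] = field(default_factory=set)


def plan_repairs(datasets, reachable, spare_hosts, min_copies=2):
    """Return {dataset name: hosts to copy to} for under-replicated data.

    `reachable` is the set of hosts that answered the periodic check.
    `spare_hosts` are hosts assumed to be reachable and willing to take a copy.
    A value of None means no live copy is left, so nothing can be copied.
    """
    plan = {}
    for ds in datasets:
        live = ds.sources & reachable          # copies that still exist
        missing = min_copies - len(live)
        if missing <= 0:
            continue                           # enough copies, nothing to do
        if not live:
            plan[ds.name] = None               # data lost: needs human attention
            continue
        # Pick spare hosts that do not already hold a live copy.
        candidates = sorted(set(spare_hosts) - live)
        plan[ds.name] = candidates[:missing]
    return plan
```

Anyone using the data could run such a check themselves, which is the "distributed manner" Phil mentions: with two copies required, a data set whose only live source is one mirror gets scheduled for one extra copy, while a data set with no live source is flagged rather than silently skipped.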