I second this idea. Large archives should always be available via BitTorrent. I would actually suggest posting magnet links for them rather than .torrent files, though. That way you can leverage the acceptable source (as=) feature of magnet links, which lets the link also point at a direct web download of the same file.
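To make the suggestion concrete, here is a minimal sketch in Python of how such a magnet link could be put together. The function name build_magnet, the placeholder info-hash and the example.org URL are all made up for illustration; only the xt, dn, tr and as parameter names come from the magnet scheme itself. Client support for as varies, so treat this as a sketch rather than a recipe.

```python
from urllib.parse import quote, urlencode


def build_magnet(info_hash_hex, display_name, fallback_url, trackers=()):
    """Assemble a magnet URI that carries a direct-download fallback.

    All argument names are hypothetical; nothing here is taken from the
    dumps infrastructure discussed in this thread.
    """
    # dn = display name, tr = tracker (repeatable), as = acceptable source.
    params = [("dn", display_name)]
    params += [("tr", tracker) for tracker in trackers]
    params.append(("as", fallback_url))
    # xt is written by hand so the colons in the URN stay unescaped;
    # quote_via=quote encodes spaces as %20 rather than '+'.
    return f"magnet:?xt=urn:btih:{info_hash_hex}&" + urlencode(params, quote_via=quote)


if __name__ == "__main__":
    print(build_magnet(
        "0123456789abcdef0123456789abcdef01234567",   # placeholder hash
        "enwiki dump.xml.bz2",                        # placeholder file name
        "https://example.org/dumps/enwiki.xml.bz2",   # placeholder mirror URL
    ))
```

The direct-download URL travels inside the link itself, so nothing beyond the link would need to be published alongside each archive.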
https://en.wikipedia.org/wiki/Magnet_URI_scheme#Web_links_to_the_file

This way we get the best of both worlds: the constant availability of direct downloads, and the reduction in load that p2p filesharing provides.

Thank you,
Derric Atzrott

-----Original Message-----
From: wikitech-l-boun...@lists.wikimedia.org [mailto:wikitech-l-boun...@lists.wikimedia.org] On Behalf Of Oren Bochman
Sent: 05 June 2012 08:44
To: 'Wikimedia developers'
Subject: Re: [Wikitech-l] [Xmldatadumps-l] XML dumps/Media mirrors update

Any chance that these archives can be served via BitTorrent, so that even partial downloaders can become servers, leveraging p2p to reduce overall bandwidth load on the servers and reduce download times?

-----Original Message-----
From: wikitech-l-boun...@lists.wikimedia.org [mailto:wikitech-l-boun...@lists.wikimedia.org] On Behalf Of Mike Dupont
Sent: Saturday, June 02, 2012 1:28 AM
To: Wikimedia developers; wikiteam-disc...@googlegroups.com
Subject: Re: [Wikitech-l] [Xmldatadumps-l] XML dumps/Media mirrors update

I now have cron running the archiving every 30 minutes:
http://ia700802.us.archive.org/34/items/wikipedia-delete-2012-06/
It is amazing how fast stuff gets deleted on Wikipedia.
What about the proposed deletions, are there categories for those?

thanks
mike

On Wed, May 30, 2012 at 6:26 AM, Mike Dupont <jamesmikedup...@googlemail.com> wrote:
> https://github.com/h4ck3rm1k3/wikiteam code here
>
> On Wed, May 30, 2012 at 6:26 AM, Mike Dupont
> <jamesmikedup...@googlemail.com> wrote:
>> Ok, I merged the code from wikiteam and have a full history dump
>> script that uploads to archive.org. The next step is to fix the bucket
>> metadata in the script.
>> mike
>>
>> On Tue, May 29, 2012 at 3:08 AM, Mike Dupont
>> <jamesmikedup...@googlemail.com> wrote:
>>> Well, I have now updated the script to include the xml dump in raw
>>> format. I will have to add more information to the archive.org item, at
>>> least a basic readme.
>>> The other thing is that pywikipediabot does not seem to support the
>>> full history, so I will have to move over to the wikiteam version
>>> and rework it. I just spent 2 hours on this, so I am pretty happy with
>>> the first version.
>>>
>>> mike
>>>
>>> On Tue, May 29, 2012 at 1:52 AM, Hydriz Wikipedia <ad...@alphacorp.tk>
>>> wrote:
>>>> This is quite nice, though the item's metadata is too little :)
>>>>
>>>> On Tue, May 29, 2012 at 3:40 AM, Mike Dupont
>>>> <jamesmikedup...@googlemail.com> wrote:
>>>>
>>>>> The first version of the script is ready. It gets the versions, puts
>>>>> them in a zip and puts that on archive.org:
>>>>> https://github.com/h4ck3rm1k3/pywikipediabot/blob/master/export_deleted.py
>>>>>
>>>>> Here is an example output:
>>>>> http://archive.org/details/wikipedia-delete-2012-05
>>>>>
>>>>> http://ia601203.us.archive.org/24/items/wikipedia-delete-2012-05/archive2012-05-28T21:34:02.302183.zip
>>>>>
>>>>> I will cron this, and it should give a start on saving deleted data.
>>>>> Articles will be exported once a day, even if they were
>>>>> exported yesterday, as long as they are in one of the categories.
>>>>>
>>>>> mike
>>>>>
>>>>> On Mon, May 21, 2012 at 7:21 PM, Mike Dupont
>>>>> <jamesmikedup...@googlemail.com> wrote:
>>>>> > Thanks! And run that once per day; they don't get deleted that quickly.
>>>>> > mike
>>>>> >
>>>>> > On Mon, May 21, 2012 at 9:11 PM, emijrp <emi...@gmail.com> wrote:
>>>>> >> Create a script that makes a request to Special:Export using
>>>>> >> this category as feed:
>>>>> >> https://en.wikipedia.org/wiki/Category:Candidates_for_speedy_deletion
>>>>> >>
>>>>> >> More info:
>>>>> >> https://www.mediawiki.org/wiki/Manual:Parameters_to_Special:Export
>>>>> >>
>>>>> >>
>>>>> >> 2012/5/21 Mike Dupont <jamesmikedup...@googlemail.com>
>>>>> >>>
>>>>> >>> Well, I would be happy with items like this:
>>>>> >>> http://en.wikipedia.org/wiki/Template:Db-a7
>>>>> >>> Would it be possible to extract them easily?
>>>>> >>> mike
>>>>> >>>
>>>>> >>> On Thu, May 17, 2012 at 2:23 PM, Ariel T. Glenn
>>>>> >>> <ar...@wikimedia.org> wrote:
>>>>> >>> > There's a few other reasons articles get deleted: copyright
>>>>> >>> > issues, personal identifying data, etc. This makes
>>>>> >>> > maintaining the sort of mirror you propose problematic, although a
>>>>> >>> > similar mirror is here:
>>>>> >>> > http://deletionpedia.dbatley.com/w/index.php?title=Main_Page
>>>>> >>> >
>>>>> >>> > The dumps contain only data publicly available at the time
>>>>> >>> > of the run, without deleted data.
>>>>> >>> >
>>>>> >>> > The articles aren't permanently deleted, of course. The
>>>>> >>> > revision texts live on in the database, so a query on toolserver,
>>>>> >>> > for example, could be used to get at them, but that would need to
>>>>> >>> > be for research purposes.
>>>>> >>> >
>>>>> >>> > Ariel
>>>>> >>> >
>>>>> >>> > On Thu, 17-05-2012, at 13:30 +0200, Mike Dupont wrote:
>>>>> >>> >> Hi,
>>>>> >>> >> I am thinking about how to collect articles deleted based
>>>>> >>> >> on the "not notable" criteria.
>>>>> >>> >> Is there any way we can extract them from the mysql
>>>>> >>> >> binlogs? How are these mirrors working?
>>>>> >>> >> I would be interested in setting up a mirror of
>>>>> >>> >> deleted data, at least that which is not spam/vandalism
>>>>> >>> >> based on tags.
>>>>> >>> >> mike
>>>>> >>> >>
>>>>> >>> >> On Thu, May 17, 2012 at 1:09 PM, Ariel T. Glenn
>>>>> >>> >> <ar...@wikimedia.org> wrote:
>>>>> >>> >> > We now have three mirror sites, yay! The full list is
>>>>> >>> >> > linked to from http://dumps.wikimedia.org/ and is also
>>>>> >>> >> > available at
>>>>> >>> >> >
>>>>> >>> >> > http://meta.wikimedia.org/wiki/Mirroring_Wikimedia_project_XML_dumps#Current_Mirrors
>>>>> >>> >> >
>>>>> >>> >> > Summarizing, we have:
>>>>> >>> >> >
>>>>> >>> >> > C3L (Brazil) with the last 5 known good dumps,
>>>>> >>> >> > Masaryk University (Czech Republic) with the last 5 known good dumps,
>>>>> >>> >> > Your.org (USA) with the complete archive of dumps, and
>>>>> >>> >> >
>>>>> >>> >> > for the latest version of uploaded media, Your.org with
>>>>> >>> >> > http/ftp/rsync access.
>>>>> >>> >> >
>>>>> >>> >> > Thanks to Carlos, Kevin and Yenya respectively at the
>>>>> >>> >> > above sites for volunteering space, time and effort to
>>>>> >>> >> > make this happen.
>>>>> >>> >> >
>>>>> >>> >> > As people noticed earlier, a series of media tarballs
>>>>> >>> >> > per-project (excluding commons) is being generated. As
>>>>> >>> >> > soon as the first run of these is complete we'll announce
>>>>> >>> >> > its location and start generating them on a semi-regular
>>>>> >>> >> > basis.
>>>>> >>> >> >
>>>>> >>> >> > As we've been getting the bugs out of the mirroring
>>>>> >>> >> > setup, it is getting easier to add new locations. Know
>>>>> >>> >> > anyone interested? Please let us know; we would love to
>>>>> >>> >> > have them.
>>>>> >>> >> >
>>>>> >>> >> > Ariel
>>>>> >>> >> >
>>>>> >>> >> > _______________________________________________
>>>>> >>> >> > Wikitech-l mailing list
>>>>> >>> >> > Wikitech-l@lists.wikimedia.org
>>>>> >>> >> > https://lists.wikimedia.org/mailman/listinfo/wikitech-l
>>>>> >>>
>>>>> >>> --
>>>>> >>> James Michael DuPont
>>>>> >>> Member of Free Libre Open Source Software Kosova http://flossk.org
>>>>> >>> Contributor FOSM, the CC-BY-SA map of the world http://fosm.org
>>>>> >>> Mozilla Rep https://reps.mozilla.org/u/h4ck3rm1k3
>>>>> >>
>>>>> >> --
>>>>> >> Emilio J. Rodríguez-Posada.
>>>>> >> E-mail: emijrp AT gmail DOT com
>>>>> >> Pre-doctoral student at the University of Cádiz (Spain)
>>>>> >> Projects: AVBOT | StatMediaWiki | WikiEvidens | WikiPapers | WikiTeam
>>>>> >> Personal website: https://sites.google.com/site/emijrp/
>>>>> >>
>>>>> >> _______________________________________________
>>>>> >> Xmldatadumps-l mailing list
>>>>> >> xmldatadump...@lists.wikimedia.org
>>>>> >> https://lists.wikimedia.org/mailman/listinfo/xmldatadumps-l
>>>>
>>>> --
>>>> Regards,
>>>> Hydriz
>>>>
>>>> We've created the greatest collection of shared knowledge in
>>>> history. Help protect Wikipedia.
>>>> Donate now: http://donate.wikimedia.org

_______________________________________________
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l