This is quite nice, though the item's metadata is rather sparse :)

On Tue, May 29, 2012 at 3:40 AM, Mike Dupont <jamesmikedup...@googlemail.com> wrote:
> The first version of the script is ready. It gets the versions, puts them
> in a zip, and puts that on archive.org:
> https://github.com/h4ck3rm1k3/pywikipediabot/blob/master/export_deleted.py
>
> Here is an example output:
> http://archive.org/details/wikipedia-delete-2012-05
>
> http://ia601203.us.archive.org/24/items/wikipedia-delete-2012-05/archive2012-05-28T21:34:02.302183.zip
>
> I will cron this, and it should give a start on saving deleted data.
> Articles will be exported once a day, even if they were exported
> yesterday, as long as they are in one of the categories.
>
> mike
>
> On Mon, May 21, 2012 at 7:21 PM, Mike Dupont
> <jamesmikedup...@googlemail.com> wrote:
> > Thanks! And run that once per day; they don't get deleted that quickly.
> > mike
> >
> > On Mon, May 21, 2012 at 9:11 PM, emijrp <emi...@gmail.com> wrote:
> >> Create a script that makes a request to Special:Export using this category
> >> as feed:
> >> https://en.wikipedia.org/wiki/Category:Candidates_for_speedy_deletion
> >>
> >> More info: https://www.mediawiki.org/wiki/Manual:Parameters_to_Special:Export
> >>
> >>
> >> 2012/5/21 Mike Dupont <jamesmikedup...@googlemail.com>
> >>>
> >>> Well, I would be happy with items like this:
> >>> http://en.wikipedia.org/wiki/Template:Db-a7
> >>> Would it be possible to extract them easily?
> >>> mike
> >>>
> >>> On Thu, May 17, 2012 at 2:23 PM, Ariel T. Glenn <ar...@wikimedia.org>
> >>> wrote:
> >>> > There are a few other reasons articles get deleted: copyright issues,
> >>> > personal identifying data, etc. This makes maintaining the sort of
> >>> > mirror you propose problematic, although a similar mirror is here:
> >>> > http://deletionpedia.dbatley.com/w/index.php?title=Main_Page
> >>> >
> >>> > The dumps contain only data publicly available at the time of the run,
> >>> > without deleted data.
> >>> >
> >>> > The articles aren't permanently deleted of course.
> >>> > The revision texts
> >>> > live on in the database, so a query on toolserver, for example, could be
> >>> > used to get at them, but that would need to be for research purposes.
> >>> >
> >>> > Ariel
> >>> >
> >>> > On 17-05-2012, Thu, at 13:30 +0200, Mike Dupont wrote:
> >>> >> Hi,
> >>> >> I am thinking about how to collect articles deleted based on the "not
> >>> >> notable" criteria.
> >>> >> Is there any way we can extract them from the mysql binlogs? How are
> >>> >> these mirrors working? I would be interested in setting up a mirror of
> >>> >> deleted data, at least that which is not spam/vandalism based on tags.
> >>> >> mike
> >>> >>
> >>> >> On Thu, May 17, 2012 at 1:09 PM, Ariel T. Glenn <ar...@wikimedia.org>
> >>> >> wrote:
> >>> >> > We now have three mirror sites, yay! The full list is linked to from
> >>> >> > http://dumps.wikimedia.org/ and is also available at
> >>> >> >
> >>> >> > http://meta.wikimedia.org/wiki/Mirroring_Wikimedia_project_XML_dumps#Current_Mirrors
> >>> >> >
> >>> >> > Summarizing, we have:
> >>> >> >
> >>> >> > C3L (Brazil) with the last 5 known good dumps,
> >>> >> > Masaryk University (Czech Republic) with the last 5 known good dumps,
> >>> >> > Your.org (USA) with the complete archive of dumps, and
> >>> >> >
> >>> >> > for the latest version of uploaded media, Your.org with http/ftp/rsync
> >>> >> > access.
> >>> >> >
> >>> >> > Thanks to Carlos, Kevin and Yenya respectively at the above sites for
> >>> >> > volunteering space, time and effort to make this happen.
> >>> >> >
> >>> >> > As people noticed earlier, a series of media tarballs per-project
> >>> >> > (excluding commons) is being generated. As soon as the first run of
> >>> >> > these is complete we'll announce its location and start generating them
> >>> >> > on a semi-regular basis.
> >>> >> >
> >>> >> > As we've been getting the bugs out of the mirroring setup, it is getting
> >>> >> > easier to add new locations. Know anyone interested? Please let us
> >>> >> > know; we would love to have them.
> >>> >> >
> >>> >> > Ariel
> >>> >> >
> >>> >> > _______________________________________________
> >>> >> > Wikitech-l mailing list
> >>> >> > Wikitech-l@lists.wikimedia.org
> >>> >> > https://lists.wikimedia.org/mailman/listinfo/wikitech-l
> >>> >>
> >>> >
> >>>
> >>> --
> >>> James Michael DuPont
> >>> Member of Free Libre Open Source Software Kosova http://flossk.org
> >>> Contributor FOSM, the CC-BY-SA map of the world http://fosm.org
> >>> Mozilla Rep https://reps.mozilla.org/u/h4ck3rm1k3
> >>
> >> --
> >> Emilio J. Rodríguez-Posada.
> >> E-mail: emijrp AT gmail DOT com
> >> Pre-doctoral student at the University of Cádiz (Spain)
> >> Projects: AVBOT | StatMediaWiki | WikiEvidens | WikiPapers | WikiTeam
> >> Personal website: https://sites.google.com/site/emijrp/
> >>
> >> _______________________________________________
> >> Xmldatadumps-l mailing list
> >> xmldatadump...@lists.wikimedia.org
> >> https://lists.wikimedia.org/mailman/listinfo/xmldatadumps-l
>

--
Regards,
Hydriz

We've created the greatest collection of shared knowledge in history.
Help protect Wikipedia. Donate now: http://donate.wikimedia.org

_______________________________________________
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
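
For anyone following the thread, the approach quoted above (request Special:Export for the pages in the deletion category, zip the result, run it daily) could be sketched roughly as below. This is not Mike's export_deleted.py, only a minimal illustration. The form field names (pages, curonly, action) are as I recall them from the Manual:Parameters_to_Special:Export page linked in the thread and should be checked there. The endpoint constant, the function names, the User-Agent string and the file naming are all hypothetical. Listing the category members and uploading to archive.org are left out.

```python
import io
import urllib.parse
import urllib.request
import zipfile
from datetime import datetime, timezone

# Assumption: index.php endpoint of the wiki to export from.
INDEX_URL = "https://en.wikipedia.org/w/index.php"


def build_export_params(titles, current_only=True):
    """Build the form fields for a Special:Export request.

    Field names are taken from memory of the manual page; verify them there.
    """
    params = {
        "title": "Special:Export",
        "pages": "\n".join(titles),  # one page title per line
        "action": "submit",
    }
    if current_only:
        params["curonly"] = "1"  # latest revision only, no history
    return params


def fetch_export(titles, current_only=True):
    """POST the form and return the raw XML bytes (performs network I/O)."""
    body = urllib.parse.urlencode(build_export_params(titles, current_only))
    request = urllib.request.Request(
        INDEX_URL,
        data=body.encode("utf-8"),
        headers={"User-Agent": "export-sketch/0.1 (hypothetical)"},
    )
    with urllib.request.urlopen(request) as response:
        return response.read()


def zip_export(xml_bytes, when=None):
    """Pack the XML into an in-memory zip named after the export time."""
    when = when or datetime.now(timezone.utc)
    name = "export-" + when.strftime("%Y-%m-%dT%H-%M-%S") + ".xml"
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr(name, xml_bytes)
    return name, buffer.getvalue()
```

A daily run would then call fetch_export() with the current category members and write the bytes returned by zip_export() to disk before uploading. Whether full history can be exported this way depends on the wiki's configuration, so the sketch defaults to current revisions only.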