Ok, thanks for clarifying and this pointer, Aaron.
Bob

On Thu, Jun 25, 2015 at 3:20 PM, Aaron Halfaker <ahalfa...@wikimedia.org> wrote:
> No way of searching the content of deleted pages. You can start with the
> `archive` table. You might find that you can identify edits that add 'hoax'
> templates by performing a regex match on `archive.ar_comment`.
>
> -Aaron
>
> On Thu, Jun 25, 2015 at 5:16 PM, Robert West <robert.bob.w...@gmail.com>
> wrote:
>>
>> Thanks, Aaron!
>>
>> On Thu, Jun 25, 2015 at 3:06 PM, Aaron Halfaker <ahalfa...@wikimedia.org>
>> wrote:
>> > Ahh yes. Sorry for not responding sooner. The best way to get deleted
>> > article text is by getting the appropriate permission with a Wikimedia
>> > user account and then using that account to hit the web API. E.g.
>> >
>> > https://en.wikipedia.org/w/api.php?action=help&modules=query%2Bdeletedrevisions
>>
>> Looking at this page, it seems I need to supply the title, pageid, or
>> revid of the deleted page (or page with deleted revisions) I'm
>> interested in.
>> However, I don't know yet what pages are relevant to me -- I only know
>> this after having done a pass over the text of *all* deleted
>> revisions.
>> More concretely, my query is basically "all deleted revisions that
>> contain the {{hoax}} template", but I don't know yet which deleted
>> pages have such revisions.
>>
>> Is there any way of doing this?
>>
>> Thanks!
>> Bob
>>
>> > The best way to get this permission is to contact Community Advocacy
>> > (pbeaude...@wikimedia.org and jalexan...@wikimedia.org) to request that
>> > they supply you with the "wmf-research" right/group.
>> >
>> > On Thu, Jun 25, 2015 at 4:15 PM, Leila Zia <le...@wikimedia.org> wrote:
>> >>
>> >> Aaron, any chance you know the answer to this question? I have a vague
>> >> memory that we talked about deleted pages and their text some time
>> >> back.
>> >> This data should live somewhere, right? given that deleted pages can be
>> >> restored.
>> >>
>> >> Thanks,
>> >> Leila
>> >>
>> >> On Wed, Jun 24, 2015 at 2:03 PM, Leila Zia <le...@wikimedia.org> wrote:
>> >>>
>> >>> switching to the public list with Bob's permission.
>> >>>
>> >>> On Wed, Jun 24, 2015 at 1:58 PM, Robert West
>> >>> <robert.bob.w...@gmail.com> wrote:
>> >>>>
>> >>>> Hi everyone,
>> >>>>
>> >>>> I'd like to find all enwiki articles that were ever marked with the
>> >>>> {{hoax}} template. Pages with that template mostly end up being
>> >>>> deleted, so they're not available in the public revision dumps.
>> >>>>
>> >>>> Hence my question:
>> >>>> Is there a way of getting access to the full enwiki revision dump
>> >>>> including all deleted pages?
>> >>>> I don't know yet which deleted articles I'm interested in, but will
>> >>>> only know that after having done a pass over the full revision
>> >>>> history.
>> >>>>
>> >>>> I know that viewing deleted content is problematic (hence I'm sending
>> >>>> this request to this internal research list), but I signed an NDA and
>> >>>> have access to data on HDFS via stat1002, so there might be a way for
>> >>>> me to access that data?
>> >>>>
>> >>>> I'm also aware of a list of archived hoaxes, but many shorter-lived
>> >>>> hoaxes that got deleted fast are not included there.
>> >>>>
>> >>>> Thanks -- any pointers welcome!
>> >>>> Bob
>> >>>>
>> >>>> --
>> >>>> Up for a little language game? -- http://www.unfun.me
>> >>>>
>> >>>> _______________________________________________
>> >>>> Research-Internal mailing list
>> >>>> research-inter...@lists.wikimedia.org
>> >>>> https://lists.wikimedia.org/mailman/listinfo/research-internal

--
Up for a little language game? -- http://www.unfun.me

_______________________________________________
Analytics mailing list
Analytics@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/analytics
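
Aaron's suggestion in the thread (a regex match on `archive.ar_comment`) might look roughly like the sketch below. It is only an illustration under stated assumptions: the rows are taken to be (ar_title, ar_comment) pairs already fetched from the `archive` table by whatever database client is available, the helper names `mentions_hoax` and `filter_hoax_rows` are hypothetical, and the pattern is a guess at how editors word such edit summaries, not a tested rule.

```python
import re

# Assumed pattern: the standalone word "hoax", case-insensitive. This also
# matches "{{hoax}}" and "{{hoax|date=...}}", because braces and pipes are
# non-word characters. It does not match "hoaxes".
HOAX_PATTERN = re.compile(r"\bhoax\b", re.IGNORECASE)


def mentions_hoax(comment):
    """Return True if an edit summary (ar_comment) mentions 'hoax'."""
    if comment is None:
        return False
    # Comment columns may come back as bytes, depending on the client.
    if isinstance(comment, bytes):
        comment = comment.decode("utf-8", errors="replace")
    return HOAX_PATTERN.search(comment) is not None


def filter_hoax_rows(rows):
    """Given (ar_title, ar_comment) pairs, return titles whose comment matches."""
    return [title for title, comment in rows if mentions_hoax(comment)]
```

This only inspects edit summaries, so it misses edits that added the template without saying so, and it also picks up summaries such as "not a hoax". The resulting titles would be candidates to pass to the deletedrevisions API mentioned above, not a final list.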