OK, thanks for clarifying and for the pointer, Aaron.
Bob

On Thu, Jun 25, 2015 at 3:20 PM, Aaron Halfaker <ahalfa...@wikimedia.org> wrote:
> There is no way of searching the content of deleted pages.  You can start
> with the `archive` table.  You might find that you can identify edits that
> add 'hoax' templates by performing a regex match on `archive.ar_comment`
> (the edit summary).
>
> -Aaron
>
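A minimal sketch of the regex match suggested above, in Python. It assumes rows of (`ar_title`, `ar_comment`) have already been pulled from the `archive` table; the sample rows, the pattern, and the function name `find_hoax_edits` are hypothetical and not from this thread.

```python
import re

# Assumption: edit summaries that add the template mention "hoax" in some
# form, e.g. "{{hoax}}" or "+hoax". Adjust the pattern to taste.
HOAX_PATTERN = re.compile(r"\bhoax\b", re.IGNORECASE)


def find_hoax_edits(rows):
    """Return the (ar_title, ar_comment) rows whose comment mentions 'hoax'."""
    return [
        (title, comment)
        for title, comment in rows
        if comment and HOAX_PATTERN.search(comment)
    ]


# Hypothetical rows, standing in for the result of something like
#   SELECT ar_title, ar_comment FROM archive WHERE ar_namespace = 0;
sample_rows = [
    ("Example_A", "Tagging page with {{hoax}}"),
    ("Example_B", "fixed typo"),
    ("Example_C", None),
    ("Example_D", "rm Hoax tag, sources found"),
]

matches = find_hoax_edits(sample_rows)
```

Note that a pattern this loose also matches summaries that remove the tag (like the last sample row), so the hits are candidates to review rather than a final list.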
> On Thu, Jun 25, 2015 at 5:16 PM, Robert West <robert.bob.w...@gmail.com>
> wrote:
>>
>> Thanks, Aaron!
>>
>> On Thu, Jun 25, 2015 at 3:06 PM, Aaron Halfaker <ahalfa...@wikimedia.org>
>> wrote:
>> > Ahh yes.  Sorry for not responding sooner.  The best way to get deleted
>> > article text is by getting the appropriate permission with a Wikimedia
>> > user account and then using that account to hit the web API.  E.g.
>> >
>> > https://en.wikipedia.org/w/api.php?action=help&modules=query%2Bdeletedrevisions
>>
>> Looking at this page, it seems I need to supply the title, pageid, or
>> revid of the deleted page (or page with deleted revisions) I'm
>> interested in.
>> However, I don't know yet what pages are relevant to me -- I only know
>> this after having done a pass over the text of *all* deleted
>> revisions.
>> More concretely, my query is basically "all deleted revisions that
>> contain the {{hoax}} template", but I don't know yet which deleted
>> pages have such revisions.
>>
>> Is there any way of doing this?
>>
>> Thanks!
>> Bob
>>
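For pages whose titles are already known, a request to the module linked above might be built roughly as follows. This is only a sketch: the parameter names (`prop=deletedrevisions`, `drvprop`) are my reading of that help page, the helper name and example titles are hypothetical, and the code only constructs the URL. Actually fetching deleted text would require an authenticated session with the right to view deleted revisions.

```python
from urllib.parse import urlencode

API_ENDPOINT = "https://en.wikipedia.org/w/api.php"


def deleted_revisions_url(titles, props=("ids", "timestamp", "comment", "content")):
    """Build a query URL for the deleted revisions of the given page titles."""
    params = {
        "action": "query",
        "prop": "deletedrevisions",
        "titles": "|".join(titles),   # multiple values are pipe-separated
        "drvprop": "|".join(props),   # which fields to return per revision
        "format": "json",
    }
    return API_ENDPOINT + "?" + urlencode(params)


# Hypothetical titles; no request is sent here.
example_url = deleted_revisions_url(["Example_A", "Example_B"])
```

This does not solve the problem of discovering which pages are relevant; it only covers the lookup step once candidate titles are in hand.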
>> > The best way to get this permission is to contact Community Advocacy
>> > (pbeaude...@wikimedia.org and jalexan...@wikimedia.org) to request that
>> > they supply you with the "wmf-research" right/group.
>> >
>> > On Thu, Jun 25, 2015 at 4:15 PM, Leila Zia <le...@wikimedia.org> wrote:
>> >>
>> >> Aaron, any chance you know the answer to this question? I have a vague
>> >> memory that we talked about deleted pages and their text some time
>> >> back. This data should live somewhere, right, given that deleted pages
>> >> can be restored?
>> >>
>> >> Thanks,
>> >> Leila
>> >>
>> >> On Wed, Jun 24, 2015 at 2:03 PM, Leila Zia <le...@wikimedia.org> wrote:
>> >>>
>> >>> Switching to the public list with Bob's permission.
>> >>>
>> >>> On Wed, Jun 24, 2015 at 1:58 PM, Robert West
>> >>> <robert.bob.w...@gmail.com>
>> >>> wrote:
>> >>>>
>> >>>> Hi everyone,
>> >>>>
>> >>>> I'd like to find all enwiki articles that were ever marked with the
>> >>>> {{hoax}} template. Pages with that template mostly end up being
>> >>>> deleted, so they're not available in the public revision dumps.
>> >>>>
>> >>>> Hence my question:
>> >>>> Is there a way of getting access to the full enwiki revision dump
>> >>>> including all deleted pages?
>> >>>> I don't know yet which deleted articles I'm interested in, but will
>> >>>> only know that after having done a pass over the full revision
>> >>>> history.
>> >>>>
>> >>>> I know that viewing deleted content is problematic (hence I'm sending
>> >>>> this request to this internal research list), but I signed an NDA and
>> >>>> have access to data on HDFS via stat1002, so there might be a way for
>> >>>> me to access that data?
>> >>>>
>> >>>> I'm also aware of a list of archived hoaxes, but many shorter-lived
>> >>>> hoaxes that got deleted fast are not included there.
>> >>>>
>> >>>> Thanks -- any pointers welcome!
>> >>>> Bob
>> >>>>
>> >>>>
>> >>>> --
>> >>>> Up for a little language game? -- http://www.unfun.me
>> >>>>
>> >>>> _______________________________________________
>> >>>> Research-Internal mailing list
>> >>>> research-inter...@lists.wikimedia.org
>> >>>> https://lists.wikimedia.org/mailman/listinfo/research-internal
>> >>>>
>> >>>
>> >>
>> >
>> >
>> > _______________________________________________
>> > Analytics mailing list
>> > Analytics@lists.wikimedia.org
>> > https://lists.wikimedia.org/mailman/listinfo/analytics
>> >



