Bertel,

On Mon, Jan 2, 2017 at 7:40 AM, Bertel Teilfeldt Hansen <geilfe...@gmail.com> wrote:

> Hi Gabriel,
>
> The REST API looks promising - thank you!
>
> Having played around with it a bit, I seem to only be able to get one
> revision per request. Is that correct, or am I doing something wrong?
>


This is correct. The requests themselves are quite cheap, and can be
parallelized up to the rate limit set out in the API documentation.
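
As a rough sketch of what parallel fetching could look like in Python, using
only the standard library: the URL layout (/page/html/{title}/{revision}) is my
reading of the getFormatRevision endpoint, and the helper names, the worker
count, and the User-Agent string are placeholders, so please check them against
the API documentation before relying on this.

```python
from concurrent.futures import ThreadPoolExecutor
from urllib.parse import quote
from urllib.request import Request, urlopen

# Assumption: the per-revision HTML endpoint lives at this path.
BASE_URL = "https://en.wikipedia.org/api/rest_v1/page/html"
# Placeholder: replace with a User-Agent that identifies your project.
USER_AGENT = "revision-fetch-sketch/0.1 (contact: you@example.org)"


def revision_url(title, revision):
    """Build the URL for one revision of one page (hypothetical helper)."""
    encoded_title = quote(title.replace(" ", "_"), safe="")
    return f"{BASE_URL}/{encoded_title}/{int(revision)}"


def fetch_html(url):
    """Download one URL and return the body as text."""
    request = Request(url, headers={"User-Agent": USER_AGENT})
    with urlopen(request, timeout=30) as response:
        return response.read().decode("utf-8")


def fetch_revisions(title, revisions, max_workers=8, fetch=fetch_html):
    """Fetch several revisions concurrently, one request per revision.

    max_workers caps the number of requests in flight at once. It is an
    arbitrary value here, not the documented rate limit.
    """
    revisions = list(revisions)
    urls = [revision_url(title, rev) for rev in revisions]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        bodies = list(pool.map(fetch, urls))  # map() preserves input order
    return dict(zip(revisions, bodies))
```

Note that max_workers only bounds concurrency, not requests per second, so a
large batch job would probably want an explicit delay or token bucket on top.
The fetch parameter exists so the download step can be swapped out, for
example with a caching or retrying version.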



> My project requires every revision and its references from a large number
> of articles, so that would make a lot of requests. The regular API allows
> for multiple revisions per request (only with action=query, though).
>


There is a caveat here in that we currently don't store all revisions for
all articles. This means that requests for really old revisions will
trigger a more expensive on-demand parse, just as with the action API. Can
you say more about the number of articles you are targeting, and how this
list is selected? Regarding the selection, I am mainly wondering if you are
targeting especially frequently edited articles.

Thanks,

Gabriel


>
> Thanks!
>
> Bertel
>
> 2016-12-21 17:01 GMT+01:00 Gabriel Wicke <gwi...@wikimedia.org>:
>
>> Bertel, another option is to use the REST API:
>>
>>
>>    - HTML for a specific revision:
>>    https://en.wikipedia.org/api/rest_v1/#!/Page_content/getFormatRevision
>>    - Within this HTML, references are marked up like this:
>>    https://www.mediawiki.org/wiki/Specs/HTML/1.3.0/Extensions/Cite
>>    Any HTML or XML DOM parser can be used to extract this information.
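
To illustrate the extraction step, here is a minimal sketch using Python's
built-in html.parser. It assumes that the text of each reference sits in an
element carrying the class "mw-reference-text", which is how I read the Cite
spec linked above; the sample HTML is hand-written for illustration and is not
real API output, and the class and function names are my own.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class ReferenceTextParser(HTMLParser):
    """Collect the plain text of every element with class mw-reference-text."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.references = []
        self._depth = 0      # nesting depth inside the current reference
        self._buffer = []    # text fragments of the current reference

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self._depth:
            self._depth += 1
            return
        classes = (dict(attrs).get("class") or "").split()
        # Assumption: this class marks the reference body in the HTML.
        if "mw-reference-text" in classes:
            self._depth = 1
            self._buffer = []

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or not self._depth:
            return
        self._depth -= 1
        if self._depth == 0:
            # Join fragments and normalise runs of whitespace.
            self.references.append(" ".join("".join(self._buffer).split()))

    def handle_data(self, data):
        if self._depth:
            self._buffer.append(data)


def extract_references(html):
    """Return the reference texts found in an HTML string, in document order."""
    parser = ReferenceTextParser()
    parser.feed(html)
    parser.close()
    return parser.references


# Hand-written sample, loosely modelled on the markup described in the spec.
SAMPLE = (
    '<ol class="mw-references" typeof="mw:Extension/references">'
    '<li id="cite_note-1"><a href="#cite_ref-1">^</a> '
    '<span id="mw-reference-text-cite_note-1" class="mw-reference-text">'
    'Smith, <i>A Book</i>, 2010.</span></li>'
    '<li id="cite_note-2"><span class="mw-reference-text">Jones 2012</span></li>'
    '</ol>'
)

if __name__ == "__main__":
    for text in extract_references(SAMPLE):
        print(text)
```

This keeps only the visible text of each reference. If you need links or
template data from inside the reference, a full DOM parser is likely the
better fit.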
>>
>> Hope this helps,
>>
>> Gabriel
>>
>> On Wed, Dec 21, 2016 at 3:20 AM, Bertel Teilfeldt Hansen <
>> geilfe...@gmail.com> wrote:
>>
>>> Hi Brad and Gergo,
>>>
>>> Thanks for your responses!
>>>
>>> @Brad: Yeah, that was also my impression, but I wasn't sure. Seemed
>>> strange that the example in the official docs would point to a place where
>>> the feature was disabled. Thank you for clearing that up!
>>>
>>> @Gergo: I've been looking at action=parse, but as far as I understand
>>> it, it is limited to one revision per API request, which makes it quite
>>> slow to get a bunch of older revisions from a large number of articles.
>>> action=query&prop=revisions&rvprop=content omits the references from
>>> the output (just gives the string "{{reflist}}" after "References").
>>> "mwrefs" sounds very promising, though! I will definitely check that out -
>>> thank you!
>>>
>>> Best,
>>>
>>> Bertel
>>>
>>> 2016-12-20 19:51 GMT+01:00 Gergo Tisza <gti...@wikimedia.org>:
>>>
>>>> On Tue, Dec 20, 2016 at 10:18 AM, Bertel Teilfeldt Hansen <
>>>> geilfe...@gmail.com> wrote:
>>>>
>>>>> And is there no way of getting references through the API?
>>>>>
>>>>
>>>> There is no nice way, but you can always get the HTML (or the parse
>>>> tree, depending on whether you want parsed or raw refs) and process it;
>>>> references are not hard to extract. For the wikitext version, there is a
>>>> Python tool: https://github.com/mediawiki-utilities/python-mwrefs
>>>>
>>>> _______________________________________________
>>>> Mediawiki-api mailing list
>>>> Mediawiki-api@lists.wikimedia.org
>>>> https://lists.wikimedia.org/mailman/listinfo/mediawiki-api
>>>>
>>>>
>>>
>>>
>>>
>>
>>
>> --
>> Gabriel Wicke
>> Principal Engineer, Wikimedia Foundation
>>
>>
>>
>
>
>


-- 
Gabriel Wicke
Principal Engineer, Wikimedia Foundation
