Re: [Mediawiki-api] Wikipedia data extraction

Amir E. Aharoni Sat, 03 Mar 2012 11:34:47 -0800

MaxSem on IRC gave a solution that may help you.

Using the following call, you can get section titles, numbers and
offsets from the beginning of the page:
https://en.wikipedia.org/w/api.php?action=parse&page=Pittsburgh&prop=sections
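
For anyone who wants to script the call above, here is a minimal Python sketch
(not Perl, sorry). It assumes you add format=json to the request and that each
entry in the response carries "index", "number", "line" and "byteoffset"
fields. The helper names and the User-Agent string are made up for
illustration.

```python
import json
import urllib.parse
import urllib.request

API_URL = "https://en.wikipedia.org/w/api.php"


def build_sections_url(page):
    """Return the action=parse URL that lists a page's sections as JSON."""
    params = {
        "action": "parse",
        "page": page,
        "prop": "sections",
        "format": "json",  # assumption: JSON output is wanted
    }
    return API_URL + "?" + urllib.parse.urlencode(params)


def extract_sections(response):
    """Return (index, number, title, byte offset) tuples from a decoded response."""
    return [
        (s.get("index"), s.get("number"), s.get("line"), s.get("byteoffset"))
        for s in response["parse"]["sections"]
    ]


def fetch_json(url):
    """Fetch a URL and decode its JSON body (needs network access)."""
    # Placeholder User-Agent; replace it with one that identifies your tool.
    request = urllib.request.Request(
        url, headers={"User-Agent": "section-demo/0.1 (example)"}
    )
    with urllib.request.urlopen(request) as reply:
        return json.loads(reply.read().decode("utf-8"))
```

Calling extract_sections(fetch_json(build_sections_url("Pittsburgh"))) should
then give one tuple per section.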


Using the following call, you can get a section's wikitext by its index (the
"index" value returned by the previous call):
https://en.wikipedia.org/w/api.php?action=parse&page=Pittsburgh&prop=wikitext&section=2
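
A similar hedged Python sketch for the per-section request. It assumes
format=json and that the wikitext comes back either wrapped in a {"*": ...}
object (the older JSON layout) or as a plain string (formatversion=2). The
function names are hypothetical.

```python
import json
import urllib.parse
import urllib.request

API_URL = "https://en.wikipedia.org/w/api.php"


def build_section_url(page, section_index):
    """Return the action=parse URL for the wikitext of one section."""
    params = {
        "action": "parse",
        "page": page,
        "prop": "wikitext",
        "section": str(section_index),
        "format": "json",  # assumption: JSON output is wanted
    }
    return API_URL + "?" + urllib.parse.urlencode(params)


def extract_wikitext(response):
    """Return the section's wikitext from a decoded JSON response."""
    wikitext = response["parse"]["wikitext"]
    # The older JSON layout wraps the text as {"*": "..."}.
    if isinstance(wikitext, dict):
        return wikitext["*"]
    return wikitext


def fetch_section(page, section_index):
    """Fetch one section's wikitext (needs network access)."""
    # Placeholder User-Agent; replace it with one that identifies your tool.
    request = urllib.request.Request(
        build_section_url(page, section_index),
        headers={"User-Agent": "section-demo/0.1 (example)"},
    )
    with urllib.request.urlopen(request) as reply:
        return extract_wikitext(json.loads(reply.read().decode("utf-8")))
```

fetch_section("Pittsburgh", 2) would then return the raw wikitext of that
section, so no regular expressions are needed to cut the page apart.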

You can tweak your calls using the API sandbox:
https://en.wikipedia.org/wiki/Special:ApiSandbox

--
Amir Elisha Aharoni · אָמִיר אֱלִישָׁע אַהֲרוֹנִי
http://aharoni.wordpress.com
‪“We're living in pieces,
I want to live in peace.” – T. Moore‬



2012/3/3 Ashish Mukherjee <ashish.mukher...@gmail.com>:
> Hi,
>
> I am using the following Perl modules to extract data from Wikipedia and
> Wikitravel, respectively:
>
> - WWW::Wikipedia
> - MediaWiki::API
>
> With both of these modules, and from what I can see of the MediaWiki API
> itself, I get the entire page text as one chunk in the web service response.
> To extract individual sections of a wiki entry, I have to rely on pattern
> matching and regular expressions.
>
> Is there a better way to achieve this? Is there some sample code in any
> language (preferably Perl) that anyone can share, or is there a tool that
> does this out of the box?
>
> Any help would be appreciated.
>
> Regards,
> Ashish
>
> _______________________________________________
> Mediawiki-api mailing list
> Mediawiki-api@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/mediawiki-api
>

