[pywikibot] about parsing the dump

Luigi Assom Mon, 18 Jan 2016 10:31:45 -0800

Hello!
A question about the use of pywikibot:
is it possible to use it to parse the XML dump?


I am interested in extracting links from pages (internal and external,
keeping the ones that belong to categories distinct).
I would also like to handle transitive redirects.
I would like to process the dump without accessing the wiki, or else access
the wiki in batches with proper limits.
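
To make the question concrete, here is a rough sketch of what I have in mind,
using only the Python standard library rather than pywikibot. The names
iter_pages, classify_links, redirect_target and resolve_redirect are my own
placeholders, not pywikibot API. It assumes a current-revision dump, and the
regular expressions are simplifications: they ignore templates, nowiki
sections, localized namespace names and bare URLs.

```python
import re
import xml.etree.ElementTree as ET

# Internal links: [[Target]] or [[Target|label]].
WIKILINK = re.compile(r"\[\[([^\[\]|]+)(?:\|[^\[\]]*)?\]\]")
# Bracketed external links: [http://example.org label].
EXTLINK = re.compile(r"\[((?:https?:)?//[^\s\]]+)")
# Redirect pages start with "#REDIRECT [[Target]]".
REDIRECT = re.compile(r"^\s*#REDIRECT\s*:?\s*\[\[([^\[\]|#]+)", re.IGNORECASE)


def iter_pages(source):
    """Yield (title, wikitext) for each <page> in a MediaWiki XML export."""
    for _event, elem in ET.iterparse(source, events=("end",)):
        # Strip the XML namespace, which differs between export versions.
        if elem.tag.rsplit("}", 1)[-1] != "page":
            continue
        title, text = None, ""
        for child in elem.iter():
            name = child.tag.rsplit("}", 1)[-1]
            if name == "title":
                title = child.text
            elif name == "text":
                text = child.text or ""
        yield title, text
        elem.clear()  # drop the page's children to keep memory use low


def classify_links(wikitext):
    """Split the links of one page into internal, category and external."""
    internal, categories = [], []
    for match in WIKILINK.finditer(wikitext):
        target = match.group(1).strip()
        if target.lower().startswith("category:"):
            categories.append(target.split(":", 1)[1].strip())
        else:
            # [[:Category:Foo]] is an ordinary link to the category page.
            internal.append(target.lstrip(":"))
    external = EXTLINK.findall(wikitext)
    return {"internal": internal, "category": categories, "external": external}


def redirect_target(wikitext):
    """Return the redirect target of a page, or None if it is not a redirect."""
    match = REDIRECT.match(wikitext)
    return match.group(1).strip() if match else None


def resolve_redirect(title, redirects, max_hops=10):
    """Follow a redirect chain; return None on a loop or an overlong chain."""
    seen = set()
    while title in redirects:
        if title in seen or len(seen) >= max_hops:
            return None
        seen.add(title)
        title = redirects[title]
    return title
```

The idea would be one pass over the dump to build a title-to-target dict with
redirect_target, then a second pass that runs classify_links on each page and
maps every internal link through resolve_redirect.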

Is there maybe something in the package that already takes care of this?
On https://www.mediawiki.org/wiki/Manual:Pywikibot/Scripts I have seen
a "ghost" script, extracting_links.py.
I wanted to ask before reinventing the wheel, and to check whether pywikibot
is a suitable tool for the purpose.

Thank you,
L.

_______________________________________________
pywikibot mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/pywikibot
