On 18/03/22 14:04, Erik del Toro wrote:
Just wanted to tell you that http://aarddict.org users and dictionary
creators also stumbled over these missing namespaces and are now
suggesting to continue scraping them. So is scraping the expected
approach?
Thanks for mentioning this. I'm not sure what you mean by scraping here,
exactly: if you mean parsing the wikitext, definitely not; if you mean
getting the already-parsed HTML from the REST API, that's acceptable.
https://www.mediawiki.org/wiki/API:REST_API/Reference#Get_HTML
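As a minimal sketch of that approach, the "Get HTML" endpoint documented above can be addressed with a URL of the form `/w/rest.php/v1/page/{title}/html`. The helper below only builds the request URL; the host and page title are illustrative, using the French Wiktionary conjugation page mentioned further down:

```python
from urllib.parse import quote
from urllib.request import urlopen

def rest_html_url(wiki_host, title):
    """Build the MediaWiki REST API 'Get HTML' URL for a page title.

    The title is percent-encoded in full (safe='') so that characters
    like ':' and '/' in namespaced titles survive as part of the path
    segment rather than being treated as URL structure.
    """
    return f"https://{wiki_host}/w/rest.php/v1/page/{quote(title, safe='')}/html"

# Illustrative example: a conjugation page on the French Wiktionary
url = rest_html_url("fr.wiktionary.org", "Conjugaison:espagnol/aumentar")
print(url)

# Fetching is then a one-liner (requires network access, so commented out):
# html = urlopen(url).read().decode("utf-8")
```

For bulk work you would of course rate-limit and set a descriptive User-Agent rather than hammering the endpoint page by page.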
As for HTML dumps, the ZIM files produced by Kiwix for the French
Wiktionary include pages like "Conjugaison:espagnol/aumentar", so that's
another possible avenue for bulk imports. I've checked the latest version:
https://download.kiwix.org/zim/wiktionary/wiktionary_fr_all_nopic_2022-01.zim.torrent
Federico
_______________________________________________
Xmldatadumps-l mailing list -- xmldatadumps-l@lists.wikimedia.org
To unsubscribe send an email to xmldatadumps-l-le...@lists.wikimedia.org