Hi everyone,
I am happy to announce WDumper, a new tool I have been working on for
the last few months.
The tool is available at https://tools.wmflabs.org/wdumps/.
The idea is to provide a user interface to easily generate RDF dumps for
subsets of the data contained in Wikidata.
For example, the tool can generate dumps that contain only English
labels or only a subset of the properties.
The tool is based on Wikidata Toolkit and processes the original JSON
dumps provided by Wikidata.
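To illustrate the kind of filtering described above, here is a minimal
Python sketch that restricts one entity from the JSON dump to selected
label languages and properties. This is not how WDumper is implemented
(the tool itself builds on Wikidata Toolkit); the function name
filter_entity, its parameters and the shortened sample entity are
assumptions made up for this illustration.

```python
import json


def filter_entity(entity, languages=("en",), properties=None):
    """Return a reduced copy of one Wikidata JSON entity.

    Hypothetical helper for illustration only.
    languages:  label languages to keep.
    properties: property IDs to keep, or None to keep all statements.
    """
    labels = entity.get("labels", {})
    claims = entity.get("claims", {})
    return {
        "id": entity["id"],
        # Keep only labels in the requested languages.
        "labels": {
            lang: label for lang, label in labels.items() if lang in languages
        },
        # Keep only statements for the requested properties.
        "claims": {
            pid: statements
            for pid, statements in claims.items()
            if properties is None or pid in properties
        },
    }


# Made-up, heavily shortened entity in the layout of the JSON dumps.
sample = {
    "id": "Q42",
    "labels": {
        "en": {"language": "en", "value": "Douglas Adams"},
        "de": {"language": "de", "value": "Douglas Adams"},
    },
    "claims": {
        "P31": [{"id": "example-statement-1"}],
        "P21": [{"id": "example-statement-2"}],
    },
}

if __name__ == "__main__":
    reduced = filter_entity(sample, languages=("en",), properties={"P31"})
    print(json.dumps(reduced, indent=2))
```

The sketch only covers the filtering step. Converting the reduced
entities to RDF is a separate step that is not shown here.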
When you submit a request to create a dump, it will be added to a queue.
The queue is processed at regular intervals (the maximum wait time in
the queue is 1 hour).
You can view a list of created dumps at
https://tools.wmflabs.org/wdumps/dumps.
The generated dump can either be downloaded directly or uploaded to
Zenodo for archival, which also generates a DOI for easy referencing in
scientific publications.
I want to thank Prof. Dr. Markus Krötzsch for the original idea for this
tool and for his support during its development.
If you have any questions, feel free to ask them by mail or create an
issue on the GitHub page: https://github.com/bennofs/wdumper. The
current version does not have a lot of features yet, so ideas for
extending the tool with additional filters or options that you'd like to
use are valuable feedback as well.
Also, a small word of caution: although I have of course tested the
tool, the Wikidata data model is quite complex and the tool is new, so
bugs are likely. Please always sanity-check the results.
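One possible sanity check for a dump that should contain only English
labels is to count the language tags of the literals in it. The Python
sketch below assumes the dump has been decompressed and is serialized as
N-Triples with one triple per line. That format is an assumption here,
so please check it against your actual download. The function name
count_language_tags and the sample lines are made up for illustration.

```python
import re
from collections import Counter

# Matches a language-tagged literal at the end of an N-Triples line,
# for example: "Douglas Adams"@en .
LANG_TAG = re.compile(r'"@([A-Za-z]+(?:-[A-Za-z0-9]+)*)\s*\.\s*$')


def count_language_tags(lines):
    """Count how often each language tag occurs in the given lines.

    Hypothetical helper: lines without a language-tagged literal
    (IRIs, typed literals, blank lines) are ignored.
    """
    counts = Counter()
    for line in lines:
        match = LANG_TAG.search(line)
        if match:
            counts[match.group(1).lower()] += 1
    return counts


# Made-up sample lines in N-Triples syntax.
sample_lines = [
    '<http://www.wikidata.org/entity/Q42> '
    '<http://www.w3.org/2000/01/rdf-schema#label> "Douglas Adams"@en .',
    '<http://www.wikidata.org/entity/Q42> '
    '<http://www.w3.org/2000/01/rdf-schema#label> "Douglas Adams"@de .',
    '<http://www.wikidata.org/entity/Q42> '
    '<http://www.wikidata.org/prop/direct/P31> '
    '<http://www.wikidata.org/entity/Q5> .',
]

if __name__ == "__main__":
    print(count_language_tags(sample_lines))
```

If the filter worked as intended, only the tag en should show up in the
counts for a real dump. For a real file, the lines can be read lazily
from an open file object instead of a list.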
If you find bugs, please tell me or create an issue on GitHub.
Regards,
Benno Fünfstück
_______________________________________________
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata