If the challenge is downloading large files, you can also get local access
to all of the dumps (Wikidata, Wikipedia, and more) through PAWS
<https://wikitech.wikimedia.org/wiki/PAWS> (Wikimedia-hosted Jupyter
notebooks) and Toolforge
<https://wikitech.wikimedia.org/wiki/Help:Toolforge> (a more general-purpose
Wikimedia hosting environment). From Toolforge, you could run the Wikidata
Toolkit (Java) that Denny mentions. I'm personally more familiar with
Python, so my suggestion is to use Python code to filter the dumps down to
what you need. Below is an example Python notebook that does this on
PAWS. The PAWS environment is not set up for long-running jobs, though, and
will probably die before the process completes, so I'd highly recommend
converting the notebook into a script that can run on Toolforge (see
https://wikitech.wikimedia.org/wiki/Help:Toolforge/Dumps).

PAWS example:
https://paws-public.wmflabs.org/paws-public/User:Isaac_(WMF)/Simplified_Wikidata_Dumps.ipynb
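
For a rough idea of what such a standalone filtering script might look
like, here is a minimal sketch. It is not the code from the notebook above.
It assumes the standard Wikidata JSON dump layout (one big JSON array with
one entity per line), and the names open_dump, iter_entities, simplify and
filter_dump are made up for illustration. It keeps only each entity's ID,
one label, and the statements whose values are other items.

```python
import bz2
import gzip
import json


def open_dump(path):
    """Open a .gz or .bz2 dump as text, or a plain file otherwise."""
    if path.endswith(".gz"):
        return gzip.open(path, "rt", encoding="utf-8")
    if path.endswith(".bz2"):
        return bz2.open(path, "rt", encoding="utf-8")
    return open(path, "rt", encoding="utf-8")


def iter_entities(lines):
    """Yield one entity dict per line of a Wikidata JSON dump.

    The dump is a single JSON array with one entity per line, so the
    "[" and "]" lines are skipped and trailing commas are stripped.
    """
    for line in lines:
        line = line.strip().rstrip(",")
        if not line or line in ("[", "]"):
            continue
        yield json.loads(line)


def simplify(entity, lang="en"):
    """Keep only the ID, one label, and property -> item-ID lists."""
    claims = {}
    for prop, statements in entity.get("claims", {}).items():
        values = []
        for statement in statements:
            # "novalue"/"somevalue" snaks carry no datavalue; skip them.
            datavalue = statement.get("mainsnak", {}).get("datavalue", {})
            value = datavalue.get("value")
            # Only item-valued statements are kept in this sketch.
            if isinstance(value, dict) and "id" in value:
                values.append(value["id"])
        if values:
            claims[prop] = values
    label = entity.get("labels", {}).get(lang, {}).get("value")
    return {"id": entity["id"], "label": label, "claims": claims}


def filter_dump(in_path, out_path, lang="en"):
    """Stream the dump and write one simplified JSON object per line."""
    with open_dump(in_path) as fin, open(out_path, "w", encoding="utf-8") as fout:
        for entity in iter_entities(fin):
            fout.write(json.dumps(simplify(entity, lang)) + "\n")
```

On Toolforge this could be called as
filter_dump("/public/dumps/public/wikidatawiki/entities/latest-all.json.gz",
"simplified.jsonl"), where the input path is an assumption to check against
the Help:Toolforge/Dumps page. Because the file is read line by line, memory
use stays small, but a full pass over the dump will still take many hours.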

Best,
Isaac


On Thu, Apr 30, 2020 at 1:33 AM raffaele messuti <raffa...@docuver.se>
wrote:

> On 27/04/2020 18:02, Kingsley Idehen wrote:
> >> [1] https://w.wiki/PBi <https://w.wiki/PBi>
> >>
> > Do these CONSTRUCT queries return any of the following document
> > content-types?
> >
> > RDF-Turtle, RDF-XML, JSON-LD ?
>
> you can use content negotiation on the sparql endpoint
>
> ~ query="CONSTRUCT { ... }"
> ~ curl -H "Accept: application/rdf+xml" https://query.wikidata.org/sparql \
>     --data-urlencode "query=$query"
> ~ curl -H "Accept: text/turtle" -G https://query.wikidata.org/sparql \
>     --data-urlencode "query=$query"
>
>
>
> --
> raffa...@docuver.se
>
> _______________________________________________
> Wikidata mailing list
> Wikidata@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikidata
>


-- 
Isaac Johnson (he/him/his) -- Research Scientist -- Wikimedia Foundation
