The Wikidata Query Service currently holds some 3.8 billion triples –
you can see the numbers on Grafana [1]. But WDQS “munges” the dump
before importing it – for instance, it merges wdata:… into wd:… and
drops `a wikibase:Item` and `a wikibase:Statement` types; see [2] for
details – so the triple count in the un-munged dump will be somewhat
larger than the triple count in WDQS.
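
For a rough sense of how close this is to the 32-bit limit Jérémie mentions
below, here is a back-of-the-envelope sketch. It assumes the HDT indexes are
unsigned 32-bit integers (the thread only says "32 bit") and uses the rounded
3.8 billion figure from above. The helper names are made up for illustration,
and the un-munged dump's exact triple count is not known here.

```python
# Assumption: a 32-bit *unsigned* index, i.e. 2**32 distinct positions.
UINT32_CAPACITY = 2**32  # 4,294,967,296

# Rounded WDQS triple count quoted in this thread (not an exact figure).
WDQS_TRIPLES = 3_800_000_000


def headroom(triples: int, capacity: int = UINT32_CAPACITY) -> int:
    """Number of further triples that fit before the index space runs out.

    A negative result means the data set already exceeds the capacity.
    """
    return capacity - triples


def fits(triples: int, capacity: int = UINT32_CAPACITY) -> bool:
    """True if every triple can be given a distinct index below `capacity`."""
    return triples <= capacity


if __name__ == "__main__":
    print(f"capacity: {UINT32_CAPACITY:,}")
    print(f"headroom: {headroom(WDQS_TRIPLES):,}")
    print(f"fits:     {fits(WDQS_TRIPLES)}")
```

So the munged WDQS count alone leaves under half a billion triples of headroom;
how far the un-munged dump eats into that depends on how many triples the
munging removes.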

Cheers,
Lucas

[1]:
https://grafana.wikimedia.org/dashboard/db/wikidata-query-service?panelId=7&fullscreen
[2]:
https://www.mediawiki.org/wiki/Wikibase/Indexing/RDF_Dump_Format#WDQS_data_differences


On 07.11.2017 17:09, Laura Morales wrote:
> How many triples does Wikidata have? The old dump from rdfhdt seems to have
> about 2 billion, which means Wikidata doubled its number of triples in less
> than a year?
>
> Sent: Tuesday, November 07, 2017 at 3:24 PM
> From: "Jérémie Roquet" <jroq...@arkanosis.net>
> To: "Discussion list for the Wikidata project." <wikidata@lists.wikimedia.org>
> Subject: Re: [Wikidata] Wikidata HDT dump
> Hi everyone,
>
> I'm afraid the current implementation of HDT is not ready to handle
> more than 4 billion triples, as it is limited to 32-bit indexes. I've
> opened an issue upstream: https://github.com/rdfhdt/hdt-cpp/issues/135
>
> Until this is addressed, don't waste your time trying to convert the
> entire Wikidata to HDT: it can't work.
>
> --
> Jérémie
>
> _______________________________________________
> Wikidata mailing list
> Wikidata@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikidata


