> Wikidata grows like mad. This is something we all experience in the
> really bad response times we are suffering. It is so bad that people
> are asked what kind of updates they are running because it makes a
> difference in the lag times there are.
>
> Given that Wikidata is growing like a weed, ...

As I've delved deeper into Wikidata, I get the feeling it is being
developed on the assumption of infinite resources, and with no strong
guideline on exactly what the scope is (i.e. where you draw the line
between what belongs in Wikidata and what does not).

This (along with concerns about it being open to data vandalism) has
personally made me back off a bit. I'd originally planned to have
Wikidata be the primary data source, but I'm now leaning towards
keeping the data tables and graphs outside it, with scheduled scripts
to import into Wikidata and export from Wikidata.
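
To give an idea of what I mean by the export leg, a rough sketch (the
QID list and the CSV layout here are just placeholders for whatever the
local tables actually hold; the import leg would be a similar script
writing through wbeditentity with an authenticated bot account):

import csv
import requests

API = "https://www.wikidata.org/w/api.php"
QIDS = ["Q42", "Q64"]   # placeholder list of the items mirrored locally

def fetch_entities(qids):
    """Fetch entity JSON for a batch of QIDs via wbgetentities."""
    r = requests.get(API, params={
        "action": "wbgetentities",
        "ids": "|".join(qids),
        "props": "labels|claims",
        "format": "json",
    })
    r.raise_for_status()
    return r.json()["entities"]

def export_labels(path="wikidata_labels.csv"):
    """Write English labels to a local CSV -- the 'export' direction."""
    entities = fetch_entities(QIDS)
    with open(path, "w", newline="") as f:
        w = csv.writer(f)
        w.writerow(["qid", "label_en"])
        for qid, ent in entities.items():
            label = ent.get("labels", {}).get("en", {}).get("value", "")
            w.writerow([qid, label])

if __name__ == "__main__":
    export_labels()

Run from cron on whatever schedule the data can tolerate; nothing
fancier than that.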

> For the technical guys, consider our growth and plan for at least one year.

The 37GB (JSON, bz2) data dump file (it was already 33GB, twice the
size of the English Wikipedia dump, when I grabbed it last November)
is unwieldy. And, as no incremental changes are being published, it is
hard to create a mirror.

Can that dump file be split up in some functional way, I wonder?
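
For my own use I've been considering something along these lines,
streaming the dump and splitting items from properties. This assumes
the one-entity-per-line layout of the JSON dump, and the file names
are just placeholders:

import bz2
import json

def split_dump(dump_path="wikidata-all.json.bz2"):
    # One output file per entity type; items vs properties is just one
    # example of a "functional" split.
    outputs = {
        "item": open("items.ndjson", "w"),
        "property": open("properties.ndjson", "w"),
    }
    # Read the bz2 stream line by line so the whole dump never has to
    # be decompressed on disk.
    with bz2.open(dump_path, "rt", encoding="utf-8") as f:
        for line in f:
            line = line.strip().rstrip(",")
            if not line or line in ("[", "]"):
                continue   # skip blanks and the enclosing array brackets
            entity = json.loads(line)
            out = outputs.get(entity.get("type"))
            if out:
                out.write(json.dumps(entity) + "\n")
    for out in outputs.values():
        out.close()

if __name__ == "__main__":
    split_dump()

The same loop could just as easily split on ID ranges or on sitelinks,
if that turned out to be more useful; having a few such slices
published officially would make mirroring much less painful.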

Darren


-- 
Darren Cook, Software Researcher/Developer
