Hoi,
Thank you for this answer. It helps. It helps to understand / appreciate
the work that is done. Without updates like this, it becomes increasingly
hard to be confident that our future will remain bright.
Thanks,
      GerardM

On Thu, 6 Jun 2019 at 21:33, Guillaume Lederrey <gleder...@wikimedia.org> wrote:

> Hello all!
>
> There have been a number of concerns raised about the performance and
> scaling of Wikidata Query Service. We share those concerns and we are
> doing our best to address them. Here is some info about what is going
> on:
>
> In an ideal world, WDQS should:
>
> * scale in terms of data size
> * scale in terms of number of edits
> * have low update latency
> * expose a SPARQL endpoint for queries
> * allow anyone to run any queries on the public WDQS endpoint
> * provide great query performance
> * provide a high level of availability
>
> Scaling graph databases is a "known hard problem", and we are reaching
> a scale where there are no obvious easy solutions to address all the
> above constraints. At this point, just "throwing hardware at the
> problem" is not an option anymore. We need to go deeper into the
> details and potentially make major changes to the current architecture.
> Some scaling considerations are discussed in [1]. This is going to take
> time.
>
> Reasonably, addressing all of the above constraints is unlikely to
> ever happen. Some of the constraints are non-negotiable: if we can't
> keep up with Wikidata in terms of data size or number of edits, it does
> not make sense to address query performance. On some constraints, we
> will probably need to compromise.
>
> For example, the update process is asynchronous. It is by nature
> expected to lag. In the best case, this lag is measured in minutes,
> but can climb to hours occasionally. This is a case of prioritizing
> stability and correctness (ingesting all edits) over update latency.
> And while we can work to reduce the maximum latency, this will still
> be an asynchronous process and needs to be considered as such.
>
> We currently have one Blazegraph expert working with us to address a
> number of performance and stability issues. We
> are planning to hire an additional engineer to help us support the
> service in the long term. You can follow our current work in Phabricator
> [2].
>
> If anyone has experience with scaling large graph databases, please
> reach out to us, we're always happy to share ideas!
>
> Thanks all for your patience!
>
>    Guillaume
>
> [1]
> https://wikitech.wikimedia.org/wiki/Wikidata_query_service/ScalingStrategy
> [2] https://phabricator.wikimedia.org/project/view/1239/
>
> --
> Guillaume Lederrey
> Engineering Manager, Search Platform
> Wikimedia Foundation
> UTC+2 / CEST
>
> _______________________________________________
> Wikidata mailing list
> Wikidata@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikidata
>