On Sun, 9 June 2019 at 23:18, Amirouche Boubekki <
amirouche.boube...@gmail.com> wrote:

> I made a proposal for a grant at
> https://meta.wikimedia.org/wiki/Grants:Project/WDQS_On_FoundationDB
>
> Mind the fact that this is not about the versioned quadstore. It is about
> a simple triplestore; it is mainly missing bindings for FoundationDB and
> SPARQL syntax.
>
> Also, I will probably need help to interface with the geo and label services.
>
> Feedback welcome!
>
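
To make the "simple triplestore" idea concrete, here is a minimal sketch. It
is not the code from the proposal: the class name TinyTripleStore and the
sample identifiers are hypothetical, and a sorted in-memory list stands in for
an ordered key-value store such as FoundationDB. Each triple is written under
three key orderings so that any lookup pattern becomes a prefix range scan.

```python
from bisect import bisect_left


class TinyTripleStore:
    """Hypothetical sketch: triples kept under three key orderings in one
    sorted list, standing in for an ordered key-value store."""

    # index name -> positions of (subject, predicate, object) inside the key
    ORDERS = {"spo": (0, 1, 2), "pos": (1, 2, 0), "osp": (2, 0, 1)}

    def __init__(self):
        self.keys = []  # sorted list of (index_name, a, b, c) tuples

    def add(self, s, p, o):
        triple = (s, p, o)
        for name, order in self.ORDERS.items():
            key = (name,) + tuple(triple[i] for i in order)
            i = bisect_left(self.keys, key)
            # Skip keys that are already present so adds are idempotent.
            if i == len(self.keys) or self.keys[i] != key:
                self.keys.insert(i, key)

    def match(self, s=None, p=None, o=None):
        """Yield triples matching the pattern; None means 'any value'."""
        pattern = (s, p, o)
        # Pick the index whose key starts with the longest bound prefix.
        best_name, best_order, best_len = None, None, -1
        for name, order in self.ORDERS.items():
            n = 0
            for i in order:
                if pattern[i] is None:
                    break
                n += 1
            if n > best_len:
                best_name, best_order, best_len = name, order, n
        prefix = (best_name,) + tuple(pattern[i] for i in best_order[:best_len])
        # Range scan: start at the first key >= prefix, stop when it differs.
        start = bisect_left(self.keys, prefix)
        for key in self.keys[start:]:
            if key[:len(prefix)] != prefix:
                break
            triple = [None, None, None]
            for pos, i in enumerate(best_order):
                triple[i] = key[1 + pos]
            # Re-check the full pattern as a safety net.
            if all(w is None or w == g for w, g in zip(pattern, triple)):
                yield tuple(triple)


store = TinyTripleStore()
store.add("Q42", "P31", "Q5")      # illustrative identifiers only
store.add("Q42", "P106", "Q36180")
store.add("Q80", "P31", "Q5")
print(sorted(store.match(p="P31", o="Q5")))
```

With a real ordered key-value store, the list insert would become a write and
the scan would become a range read over the same prefix. SPARQL parsing and
query planning are out of scope for this sketch.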

I got "feedback" in other threads on the same topic, which I will quote
and reply to.

> So there needs to be some smarter solution, one that we'd be unlikely to
> develop in-house

Big cat, small fish. As Wikidata continues to grow, it will have specific
needs, and those needs are unlikely to be met by off-the-shelf solutions.

> but one that has already been verified by industry experience and other
> deployments.

FoundationDB is used at Apple (among other companies), and WiredTiger has
been used in MongoDB since 3.2, in deployments all over the world.
WiredTiger is also used at Amazon.

> We also have a plan on improving the throughput of Blazegraph, which
> we're working on now.

What is the Phabricator ticket, please?

> "Evaluation of Metadata Representations in RDF stores"

I don't understand how this is related to the scaling issues.

> [About proprietary version Virtuoso], I dare say [it must have] enormous
> advantage for us to consider running it in production.

That would be vendor lock-in for Wikidata and Wikimedia, along with all the
poor souls who try to interoperate with it.

> This project seems to be still very young.

First commit
<https://github.com/arangodb/arangodb/commit/6577d5417a000c29c9ee7666cbcc3cefae6eee21>
is from 2011.

> ArangoDB seems to be a document database inside.

It has two storage backends: MMFiles (memory-mapped files) and RocksDB.

> While I would be very interested if somebody took on themselves to model
> Wikidata in terms of ArangoDB documents,

It looks like a bounty.

ArangoDB is a multi-model database; it supports:

- Document
- Graph
- Key-Value

> load the whole data and see what the resulting performance would be, I am
> not sure it would be wise for us to invest our team's - very limited
> currently - resources into that.

I am biased, but I would advise against trying ArangoDB. It is another
short-term solution.

> the concept of having single data store is probably not realistic at
> least within foreseeable timeframes.

Incorrect. My solution is achievable in the foreseeable future.

> We use separate data store for search (ElasticSearch) and probably will
> have to have separate one for queries, whatever would be the mechanism.

It would be interesting to read how many "resources" are poured into keeping
all of these synchronized:

- ElasticSearch
- MySQL
- BlazeGraph

Maybe some Redis as well?
_______________________________________________
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata
