Jonas added a comment.

So, if you want to put a dataset into the database, here are the questions to answer:

  1. What kind of queries are we planning to run? Which use cases would they support? I am not sure I understand the use case for "constraint violations for actors that live in Germany and are born before 1945" - what use case would produce such a query?

The use case is similar to the existing maintenance queries: a user wants to keep their domain or project clean and meet the 100% criteria.

  1. Do we need to have the data in WDQS at all? We have the MWAPI gateway; maybe we could just query a suitable API?

Storing it somewhere else would not let us scale as easily while staying flexible with the queries.

  1. Is this data set separate from Wikidata data, or does it need to be in the same namespace (depends on cross-querying needs)?

Yes, cross-querying is needed.

  1. What is the data model (would be nice to have a wiki page describing it)?

@Lucas_Werkmeister_WMDE could you please provide a draft?

  1. How are the data updated - when does an update happen, what triggers it, which data are updated, how soon do we need the updates, etc.? Note that there is no external push write interface to the database, by design, and having one would involve significant security hurdles to clear - to ensure that only authorized clients can modify the data, and only the part of the data they are authorized to modify. As Blazegraph does not have support for users/roles and other access controls, we may have to find some solution for that.

When a constraint check is executed for an Item, the result will be stored and the old result for that Item will be deleted.
Access will only be allowed from within the cluster.
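
To make the replace-on-check behaviour concrete, here is a minimal sketch in Python. It is only an illustration: the real data model is not defined yet, so the class name `ConstraintResultStore`, its methods, and the violation labels are all hypothetical, and an in-memory dict stands in for the actual storage.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ConstraintResultStore:
    """Hypothetical in-memory stand-in for the stored constraint check results."""

    # Maps an Item ID (e.g. "Q42") to the violations found by its latest check.
    results: Dict[str, List[str]] = field(default_factory=dict)

    def record_check(self, item_id: str, violations: List[str]) -> None:
        # Assignment replaces whatever was stored before, so the old result
        # for this Item is dropped rather than merged with the new one.
        self.results[item_id] = list(violations)

    def violations_for(self, item_id: str) -> Optional[List[str]]:
        # None means the Item was never checked: the store only holds
        # results for Items that have been checked so far.
        return self.results.get(item_id)


store = ConstraintResultStore()
store.record_check("Q42", ["single value", "format"])  # illustrative labels
store.record_check("Q42", ["format"])  # a re-check replaces the earlier result
```

After the second call only the latest result for "Q42" remains, and an Item that has never been checked returns `None` instead of an empty list.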

  1. How would these data be imported/reimported if a node is reimaged? Right now WDQS is designed as a secondary data storage - i.e. it does not store any data that lacks a primary source, and it can be cleaned up and restored from external sources.

It will never be imported and it will never be complete.
It is just a snapshot.


TASK DETAIL
https://phabricator.wikimedia.org/T192567

To: Jonas
Cc: Lucas_Werkmeister_WMDE, Gehel, Smalyshev, Jonas, Aklapper, Lahi, Gq86, Darkminds3113, GoranSMilovanovic, QZanden, EBjune, merbst, LawExplorer, Avner, Agabi10, FloNight, Xmlizer, jkroll, Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, Mbch331