MPhamWMF created this task.
MPhamWMF added a project: Wikidata-Query-Service.
Restricted Application added a subscriber: Aklapper.

TASK DESCRIPTION
  As a user, I want to know which Wikidata predicates may put my query at risk 
of timeouts due to their distribution across subgraphs, so that I can minimize 
the chances of my queries failing, and the chances of my query impacting other 
users.
  
  Some predicates are distributed across interconnected subgraphs in such a way 
that queries using them can become unintentionally more expensive: e.g. 
searching for humans who are authors may be unexpectedly expensive because 
author predicates are highly overrepresented in the scholarly articles 
subgraph (this is not a real example). From a user's perspective, seemingly 
simple queries can then time out because WDQS has to traverse large connected 
subgraphs that the user did not expect to be involved.
  
  In T291205 <https://phabricator.wikimedia.org/T291205>, we identified and 
examined specific predicates with these distributional properties. This ticket 
is to identify these properties automatically, and divide them into groups such 
as:
  
  - danger properties: always cause the query to time out
  - properties used across many subgraphs
  - properties used in only 1 or 2 subgraphs
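
  A minimal sketch of how such a grouping might be computed, assuming we 
already have (predicate, subgraph) usage pairs and a per-predicate timeout 
rate. The function name, the input shapes, the group labels, and both 
thresholds are hypothetical and not taken from this task or from T291205; the 
"danger" cutoff of 1.0 simply mirrors "always times out".

```python
from collections import defaultdict

# Hypothetical thresholds: the task does not define them.
FEW_SUBGRAPHS_MAX = 2        # "used only in 1 or 2 subgraphs"
DANGER_TIMEOUT_RATE = 1.0    # "always times out"


def classify_predicates(usage, timeout_rate,
                        few_max=FEW_SUBGRAPHS_MAX,
                        danger_rate=DANGER_TIMEOUT_RATE):
    """Group predicates by timeout behaviour and subgraph spread.

    usage: iterable of (predicate, subgraph) pairs (assumed input shape).
    timeout_rate: dict mapping predicate -> fraction of observed queries
        using it that timed out (assumed input shape).
    """
    # Collect the distinct subgraphs in which each predicate appears.
    subgraphs = defaultdict(set)
    for predicate, subgraph in usage:
        subgraphs[predicate].add(subgraph)

    groups = {"danger": [], "many_subgraphs": [], "few_subgraphs": []}
    for predicate in sorted(subgraphs):
        if timeout_rate.get(predicate, 0.0) >= danger_rate:
            # Timeout behaviour takes priority over subgraph spread.
            groups["danger"].append(predicate)
        elif len(subgraphs[predicate]) <= few_max:
            groups["few_subgraphs"].append(predicate)
        else:
            groups["many_subgraphs"].append(predicate)
    return groups


if __name__ == "__main__":
    # Made-up placeholder data, not real Wikidata statistics.
    usage = [
        ("pred_a", "sg_1"), ("pred_a", "sg_2"), ("pred_a", "sg_3"),
        ("pred_b", "sg_1"),
        ("pred_c", "sg_1"), ("pred_c", "sg_2"),
    ]
    print(classify_predicates(usage, {"pred_b": 1.0}))
```

  In this sketch a predicate lands in exactly one group, with the timeout 
check applied first; whether the groups should instead be allowed to overlap 
is left open by the task.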
  
  Out of scope
  
  - decisions around what to do with identified properties

TASK DETAIL
  https://phabricator.wikimedia.org/T295779

WORKBOARD
  https://phabricator.wikimedia.org/project/board/891/

To: MPhamWMF
Cc: Aklapper, MPhamWMF, CBogen, Namenlos314, Gq86, Lucas_Werkmeister_WMDE, 
EBjune, merbst, Jonas, Xmlizer, jkroll, Wikidata-bugs, Jdouglas, aude, 
Tobias1984, Manybubbles