Multichill created this task.
Herald added a subscriber: Aklapper.
Herald added projects: Wikidata, Discovery.

TASK DESCRIPTION
  Currently we use autolist2 to get a bunch of items in a category, combine 
that with a query and work on the intersection. The category might give several 
hundred items, but the query returns everything with either P31 (instance of) 
or P279 (subclass of), so the result is huge (for nlwiki about 1.8 million 
items). This makes it very slow and heavy, and it times out every once in a 
while. Take for example 
https://tools.wmflabs.org/autolist/?language=nl&project=wikipedia&category=Motorfietstechniek&depth=0&wdq=&pagepile=&wdqs=SELECT%20%3Fitem%20%0AWHERE%0A%7B%0A%09%3Fsitelink%20schema%3Aabout%20%3Fitem%20.%20%3Fsitelink%20schema%3AinLanguage%20%22nl%22%20%0A%20%20%20%20.%20%7B%20%3Fitem%20wdt%3AP31%20%3Fp31%20%7D%20UNION%20%7B%20%3Fitem%20wdt%3AP279%20%3F279%20%7D%0A%7D&statementlist=P&run=Run&mode_manual=or&mode_cat=and&mode_wdq=not&mode_wdqs=not&mode_find=or&chunk_size=10000
  
    Getting pages in category tree... 263 pages found.
    
    Getting corresponding Wikidata items... 263 items found.
    
    Getting WDQS data... 1,871,517 items loaded.
    
    Combining datasets...
    After OR : 0 items.
    After AND : 263 items.
    After NOT : 251 items.
    251 items in combination.
    
    Query took 118.65857410431 seconds. 0.5 MB memory used.
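
  The run above boils down to a set difference: the small category set minus 
the huge set of items that have a P31 or P279 statement. A minimal sketch of 
that step, assuming plain Python sets of item IDs (the function name and the 
sample IDs are made up for illustration; this is not autolist2's actual code):

```python
def items_without_statements(category_items, items_with_statements):
    """Return the category items that the WDQS query did NOT return."""
    # Mirrors mode_cat=and followed by mode_wdqs=not in the URL above.
    return set(category_items) - set(items_with_statements)


# Toy data: four items in the category, two of which have P31/P279.
category_items = {"Q1", "Q2", "Q3", "Q4"}
items_with_statements = {"Q2", "Q4", "Q99"}

remaining = items_without_statements(category_items, items_with_statements)
print(sorted(remaining))  # ['Q1', 'Q3']
```

  The subtraction itself is cheap. The cost comes from having to load the 
whole second set (1,871,517 items here) just to remove 12 of 263 items.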
  
  The other way around would be better: run a query for all items that have a 
sitelink to a given wiki but no statements. At the moment that query times out. 
We discussed this on IRC, and one solution is to add a new triple that stores 
the number of statements (and the number of sitelinks while we're at it). That 
way we can just query for that new triple. For that, 
http://wikiba.se/ontology-1.0.owl needs to be expanded.
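
  To illustrate what such a query could look like once the count is stored, 
here is a rough sketch that builds the SPARQL text in Python. The task does not 
name the new triple, so the predicate wikibase:statements is an assumption used 
for illustration, as is the helper name build_query. The sitelink pattern is 
taken from the query in the example URL, and the prefixes are assumed to be 
predefined by the query service.

```python
import re


def build_query(language: str, statement_count: int = 0) -> str:
    """Build a SPARQL query for items with a sitelink in `language`
    and exactly `statement_count` statements."""
    # Guard against injecting arbitrary text into the query string.
    if not re.fullmatch(r"[a-z][a-z0-9-]*", language):
        raise ValueError(f"unexpected language code: {language!r}")
    if statement_count < 0:
        raise ValueError("statement_count must not be negative")

    # ASSUMPTION: `wikibase:statements` is a placeholder name for the
    # proposed "number of statements" triple.
    return (
        "SELECT ?item WHERE {\n"
        f"  ?item wikibase:statements {statement_count} .\n"
        "  ?sitelink schema:about ?item .\n"
        f'  ?sitelink schema:inLanguage "{language}" .\n'
        "}"
    )


print(build_query("nl"))
```

  With a stored count the service would only need to match a single value per 
item instead of checking for the absence of any P31 or P279 statement.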

TASK DETAIL
  https://phabricator.wikimedia.org/T129037

To: Multichill
Cc: Sjoerddebruin, hoo, Aklapper, Multichill, debt, Gehel, D3r1ck01, FloNight, 
Izno, jkroll, Smalyshev, Wikidata-bugs, Jdouglas, aude, Deskana, Manybubbles, 
Mbch331