dcausse created this task.
dcausse added a project: Wikidata-Query-Service.
Restricted Application added a subscriber: Aklapper.

TASK DESCRIPTION
  As part of the offline evaluation of the WDQS graph split (scholarly articles 
vs. the rest), we want to extract several sets of SPARQL queries.
  (Initially drafted from 
https://docs.google.com/document/d/1QsV96LtpK5lDD2N2jy-6vaF_0d_Yf_HLb8uFARFMxJ8)
  
  1/ From the query logs, identify and extract the sets of queries emitted by a 
set of known sources:
  
  - Listeria
  - Mix-n-match
  - Pywikibot
  - wd/wb-integrator
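
  One possible way to tag log entries with these sources is to match on the 
HTTP User-Agent. This is only a sketch: the task does not say how the sources 
are identified, and the patterns and provenance codes below are hypothetical 
and would need checking against real log entries.

```python
import re
from typing import Optional

# Hypothetical user-agent patterns, one per known source.
# The real user-agent strings must be verified against the query logs.
SOURCE_PATTERNS = {
    "listeria": re.compile(r"listeria", re.IGNORECASE),
    "mix-n-match": re.compile(r"mix[-_ ]?n[-_ ]?match", re.IGNORECASE),
    "pywikibot": re.compile(r"pywikibot", re.IGNORECASE),
    "wd/wb-integrator": re.compile(r"wiki(data|base)[-_ ]?integrator", re.IGNORECASE),
}


def provenance_from_user_agent(user_agent: Optional[str]) -> Optional[str]:
    """Return the provenance code of the first matching source, or None."""
    if not user_agent:
        return None
    for code, pattern in SOURCE_PATTERNS.items():
        if pattern.search(user_agent):
            return code
    return None
```

  Entries for which the function returns None would simply be left out of the 
per-source sets.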
  
  2/ Queries written on Wikidata wiki pages:
  
  - https://observablehq.com/@pac02/hello-sparql-queries-dataset?collection=@pac02/wikidata-tools
  - https://huggingface.co/datasets/htriedman/wiki-sparql/viewer/htriedman--wiki-sparql/test
  
  3/ A set of queries from the query logs, ideally representative of the 
following characteristics:
  
  - Query size
  - Query time
  - Status code (HTTP return status)
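
  A minimal sketch of one way to cover these three characteristics: bucket 
each log entry by query length, query time and status class, then draw the 
same number of queries from every bucket. The field names (`query`, 
`query_time_ms`, `status_code`) and the bucket bounds are assumptions made for 
illustration, not the actual log schema.

```python
import random
from collections import defaultdict

# Hypothetical bucket bounds; real values should come from the observed distributions.
SIZE_BOUNDS = [200, 2000]        # query length in characters
TIME_BOUNDS_MS = [100, 10_000]   # query time in milliseconds


def bucket(value, bounds):
    """Return the index of the first bound that value is below, else len(bounds)."""
    for i, bound in enumerate(bounds):
        if value < bound:
            return i
    return len(bounds)


def stratum(record):
    """Map a record to its (size bucket, time bucket, status class) stratum."""
    return (
        bucket(len(record["query"]), SIZE_BOUNDS),
        bucket(record["query_time_ms"], TIME_BOUNDS_MS),
        record["status_code"] // 100,  # 2 for 2xx, 4 for 4xx, 5 for 5xx
    )


def stratified_sample(records, per_stratum, seed=0):
    """Draw up to per_stratum records from each stratum, reproducibly."""
    rng = random.Random(seed)
    groups = defaultdict(list)
    for record in records:
        groups[stratum(record)].append(record)
    sample = []
    for key in sorted(groups):
        group = groups[key]
        sample.extend(rng.sample(group, min(per_stratum, len(group))))
    return sample
```

  Equal allocation per bucket deliberately over-represents rare cases such as 
slow or failing queries. If the set should instead mirror the log 
distribution, the per-bucket count would have to be proportional to the bucket 
size.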
  
  The output is expected to be a Hive table with 2 columns:
  
  - query: the SPARQL query in plain text
  - provenance: a code identifying the provenance (source) of the query
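
  To illustrate the shape of one row, here is a small sketch that holds the 
two columns in a named tuple and serializes rows as JSON Lines. The 
intermediate format is an assumption; it is used here only because SPARQL 
queries often span several lines, and JSON escaping keeps each row on a single 
line. `QueryRow` and `to_json_lines` are hypothetical names.

```python
import json
from typing import Iterable, NamedTuple


class QueryRow(NamedTuple):
    query: str       # the SPARQL query in plain text
    provenance: str  # a code identifying the source of the query


def to_json_lines(rows: Iterable[QueryRow]) -> str:
    """Serialize rows as one JSON object per line."""
    return "\n".join(json.dumps(row._asdict(), ensure_ascii=False) for row in rows)
```

  How such a file would then be loaded into the table is outside the scope of 
this sketch.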
  
  Note: query logs are available in `events.wdqs_external_sparql_query`.

TASK DETAIL
  https://phabricator.wikimedia.org/T349512

To: dcausse
Cc: dcausse, Aklapper, AWesterinen, Namenlos314, Gq86, Lucas_Werkmeister_WMDE, 
EBjune, merbst, Jonas, Xmlizer, jkroll, Wikidata-bugs, Jdouglas, aude, 
Tobias1984, Manybubbles