dcausse moved this task from In Progress to Needs review on the Discovery-Search (Current work) board. dcausse added a comment.
> percentage, number of WDQS queries per month that involve Lexemes
>
>> percentage, number of the above queries that only involve Lexemes (i.e. doesn't require anything from the larger Wikidata graph)

Using very naive heuristics on a single day of queries, I extracted 529097 queries involving lexemes. 357917 of them seemed to require data from Wikidata, but I would not trust this number too much. Since the language is a Wikidata item, a query requesting labels in a language using its language code rather than its QID falls into the category of queries requiring the Wikidata graph.

I did not run the analysis on the full month because it is rather slow, and given the precision of the heuristics I chose I would not trust these numbers anyway. If we need more precise numbers, the analysis will have to be more involved.

For reference, here is the list of predicates I used to detect a `lexeme` query: `wikibase:lemma`, `ontolex:lexicalForm`, `ontolex:representation`, `ontolex:LexicalEntry`, `ontolex:sense`, `dct:language`, `wikibase:lexicalCategory`, `wikibase:grammaticalFeature`.

> given the current rate of growth of Wikidata, approximately how much time it would take for non-Lexeme Wikidata to grow back to its current size

The lexemes RDF dataset is about 77M triples (0.6% of the total size of the graph). If we were to remove lexemes from the main graph, at the current growth rate it would take ~10 days for Wikidata to grow back to the equivalent size. Note that in the current graph "only" 29316 distinct Wikidata items are referenced from the lexemes.
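For illustration only, the predicate-based detection and the arithmetic behind the figures above could be sketched as follows. This is not the script used for the analysis: the function names (`involves_lexemes`, `share`) and the matching strategy (prefixed names only, no full IRIs) are assumptions, and the check for whether a query also needs the wider Wikidata graph is not described in the comment, so it is left out.

```python
import re

# Predicates listed in the comment as markers of a lexeme query.
LEXEME_PREDICATES = (
    "wikibase:lemma",
    "ontolex:lexicalForm",
    "ontolex:representation",
    "ontolex:LexicalEntry",
    "ontolex:sense",
    "dct:language",
    "wikibase:lexicalCategory",
    "wikibase:grammaticalFeature",
)

# Assumption: match prefixed names only; predicates written as full
# IRIs (<http://...>) or under a renamed prefix are not detected.
_LEXEME_RE = re.compile(
    r"(?<!\w)(?:" + "|".join(re.escape(p) for p in LEXEME_PREDICATES) + r")\b"
)


def involves_lexemes(query: str) -> bool:
    """Naive heuristic: True if the SPARQL text mentions a lexeme predicate."""
    return _LEXEME_RE.search(query) is not None


def share(part: int, total: int) -> float:
    """Percentage of `total` represented by `part`."""
    return 100.0 * part / total


# Figures quoted in the comment (one day of queries).
LEXEME_QUERIES = 529097
NEEDING_WIKIDATA = 357917

needing_pct = share(NEEDING_WIKIDATA, LEXEME_QUERIES)                        # about 67.6
lexeme_only_pct = share(LEXEME_QUERIES - NEEDING_WIKIDATA, LEXEME_QUERIES)  # about 32.4

# Rough figures implied by "77M triples", "0.6%" and "~10 days".
LEXEME_TRIPLES = 77_000_000
implied_total_triples = LEXEME_TRIPLES / 0.006   # about 12.8 billion
implied_daily_growth = LEXEME_TRIPLES / 10       # about 7.7 million per day
```

Under these assumptions, roughly two thirds of the day's lexeme queries fall in the "requires Wikidata" bucket, and the quoted sizes imply a graph of about 12.8 billion triples growing by about 7.7 million triples per day. Both derived figures inherit the imprecision of the heuristics and of the rounded inputs.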
TASK DETAIL
https://phabricator.wikimedia.org/T275068

WORKBOARD
https://phabricator.wikimedia.org/project/board/1227/