Isaac added a comment.
Updated the API to be slightly more robust to instance-of-only edge cases and to return the individual features. Output for https://wikidata-quality.wmcloud.org/api/item-scores?qid=Q67559155:

  {
    "item": "https://www.wikidata.org/wiki/Q67559155",
    "features": {
      "ref-completeness": 0.9055531797461024,
      "claim-completeness": 0.903502532415779,
      "label-desc-completeness": 1.0,
      "num-claims": 11
    },
    "predicted-completeness": "A",
    "predicted-quality": "C"
  }

Details:

- `ref-completeness`: what proportion of expected references does the item have? References that are internal to Wikimedia are given only half credit, while external links / identifiers are given full credit. The expectation is based on what proportion of claims for a given property typically have references on Wikidata. Also takes into account missing statements.
- `claim-completeness`: what proportion of the expected claims does the item have? Data taken from Recoin <https://www.wikidata.org/wiki/Wikidata:Recoin>, where less common properties for a given instance-of are weighted less.
- `label-desc-completeness`: what proportion of expected labels/descriptions are present? Right now the expected labels/descriptions are English plus any language for which the item has a sitelink.
- `num-claims`: the total number of distinct properties the item has, so the name is a misnomer that I'll fix at some point (I don't give more credit for, e.g., having 3 authors instead of 1 author for a scientific paper).
- `predicted-completeness`: E (worst) to A (best), based on the item quality guidelines <https://www.wikidata.org/wiki/Wikidata:Item_quality>. Uses just the proportional `*-completeness` features.
- `predicted-quality`: same classes, but now also includes the more generic `num-claims` feature.

Regarding T332021 <https://phabricator.wikimedia.org/T332021>, I'll have to think about how to count that for the label-desc score.
Probably no change for descriptions, but for labels, perhaps accept it in place of English while still expecting language-specific labels for any languages that have a sitelink? Either way, labels/descriptions are not a major feature, so it won't greatly affect the model.

TASK DETAIL
  https://phabricator.wikimedia.org/T321224
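
As a rough illustration of the label/description expectation described in this comment (English plus any language with a sitelink), here is a minimal Python sketch. It is not the API's actual implementation: the function name, the `english_substitute` flag (standing in for whatever T332021 would provide), and the equal weighting of labels and descriptions are all assumptions made for the example.

```python
def label_desc_completeness(label_langs, desc_langs, sitelink_langs,
                            english_substitute=False):
    """Share of expected labels/descriptions that are present.

    Hypothetical sketch: expected languages are English plus every
    language with a sitelink, as described in the comment. Labels and
    descriptions are weighted equally here, which is an assumption.
    """
    expected = {"en"} | set(sitelink_langs)

    labels = set(label_langs)
    if english_substitute:
        # Proposed rule: the substitute label counts in place of the
        # English label only; sitelink languages still need their own.
        labels.add("en")

    present = len(expected & labels) + len(expected & set(desc_langs))
    return present / (2 * len(expected))


# Item with German and French sitelinks, a German label, an English description.
current = label_desc_completeness({"de"}, {"en"}, {"de", "fr"})
proposed = label_desc_completeness({"de"}, {"en"}, {"de", "fr"},
                                   english_substitute=True)
```

Under these assumptions the substitute raises only the label half of the score, and descriptions are counted the same way in both cases, which matches the "no change for descriptions" idea above.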