Hey Jane,

Yes. Exactly :)

Best
On Tue, Dec 8, 2015 at 9:37 PM Jane Darnell <jane...@gmail.com> wrote:

> Very useful, Amir, thanks! I just ran it for occupation=painter
> (p=P106&q=Q1028181).
> Am I correct in my interpretation that in general painters have fewer
> claims than the entire population of items with the property occupation?
>
> On Tue, Dec 8, 2015 at 6:48 PM, Amir Ladsgroup <ladsgr...@gmail.com>
> wrote:
>
>> Hey,
>> There have been several discussions regarding the quality of information
>> in Wikidata. I wanted to work on the quality of Wikidata, but we don't
>> have any good source of information to see where we are ahead and where
>> we are behind. So I thought the best thing I can do is to make something
>> to show people exactly how well sourced our data is, in detail. So here
>> we have
>> <http://tools.wmflabs.org/wd-analyst/index.php>
>>
>> You can give only a property (let's say P31) and it gives you the four
>> most used values plus an overall analysis of sources and quality (check
>> this out <http://tools.wmflabs.org/wd-analyst/index.php?p=P31>),
>> and then you can see that about 33% of them are sourced, of which 29.1%
>> are based on Wikipedia.
>> You can give a property and multiple values you want. Let's say you want
>> to compare P27:Q183 (country of citizenship: Germany) and P27:Q30 (US).
>> Check this out
>> <http://tools.wmflabs.org/wd-analyst/index.php?p=P27&q=Q30%7CQ183>. And
>> you can see US biographies are more abundant (300K versus 200K) but
>> German biographies are more descriptive (3.8 descriptions per item
>> versus 3.2 descriptions per item).
>>
>> One important note: compare P31:Q5 (a trivial statement), where 46% of
>> the statements are not sourced at all and 49% are based on Wikipedia,
>> *but* get the same statistics for the population property (P1082
>> <http://tools.wmflabs.org/wd-analyst/index.php?p=P1082>). It's not a
>> trivial statement and we need to be careful about it. It turns out there
>> is slightly more than one reference per statement and only 4% of them
>> are based on Wikipedia. So we can relax and enjoy this highly sourced
>> data.
>>
>> Requests:
>>
>> - Please tell me whether you want this tool at all
>> - Please suggest more ways to analyze and catch unsourced material
>>
>> Future plan (if you agree to keep using this tool):
>>
>> - Support more datatypes (e.g. date of birth based on year,
>> coordinates)
>> - Sitelink-based and reference-based analysis (to check how many of the
>> articles of, let's say, Chinese Wikipedia are unsourced)
>> - Free-style analysis: there is a database for this tool that can be
>> used for many more applications. You can get the most unsourced
>> statements of P31 and then go and fix them. I'm trying to build a
>> playground for this kind of task.
>>
>> I hope you like this and rock on!
>> <http://tools.wmflabs.org/wd-analyst/index.php?p=P136&q=Q11399>
>> Best
>>
>> _______________________________________________
>> Wikidata mailing list
>> Wikidata@lists.wikimedia.org
>> https://lists.wikimedia.org/mailman/listinfo/wikidata
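
The links quoted above all follow one pattern: a `p` parameter holding the property ID and an optional `q` parameter holding one or more item IDs joined by `|` (which appears percent-encoded as `%7C`). A minimal sketch of building such links in Python follows. The helper name `build_query_url` is hypothetical and not part of the tool, and the parameter semantics are inferred only from the example URLs in this thread.

```python
from urllib.parse import urlencode

# Base URL of the tool, as given in the thread.
BASE_URL = "http://tools.wmflabs.org/wd-analyst/index.php"


def build_query_url(prop, values=()):
    """Hypothetical helper: build a wd-analyst link for one property.

    prop   -- property ID such as "P27"
    values -- optional item IDs such as ["Q30", "Q183"]; they are joined
              with "|" as in the example links (assumed from the thread).
    """
    params = {"p": prop}
    if values:
        params["q"] = "|".join(values)
    # urlencode percent-encodes "|" as %7C, matching the quoted links.
    return BASE_URL + "?" + urlencode(params)


# The three queries mentioned in the thread.
print(build_query_url("P31"))
print(build_query_url("P27", ["Q30", "Q183"]))
print(build_query_url("P106", ["Q1028181"]))
```

This only constructs the links; whether the tool accepts other parameters or returns machine-readable output is not stated in the thread.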