Very useful, Amir, thanks! I just ran it for occupation=painter
 (p=P106&q=Q1028181)
Am I correct in my interpretation that in general painters have fewer
claims than the entire population of items with the property occupation?
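
In case it helps anyone script these lookups, here is a minimal Python sketch that builds the same kind of URL. It assumes only what the links in Amir's mail show: a `p` parameter for the property and a `q` parameter whose values are joined with `|` (encoded as `%7C`). The helper name `build_query_url` is hypothetical and not part of the tool, and I have not checked how the tool treats other input.

```python
from urllib.parse import urlencode

# Base URL as given in the announcement.
BASE_URL = "http://tools.wmflabs.org/wd-analyst/index.php"


def build_query_url(prop, values=()):
    """Hypothetical helper: build a wd-analyst URL for a property and optional values."""
    params = {"p": prop}
    if values:
        # Multiple values are joined with "|", which urlencode escapes as %7C,
        # matching the example link for P27 with Q30 and Q183.
        params["q"] = "|".join(values)
    return BASE_URL + "?" + urlencode(params)


if __name__ == "__main__":
    print(build_query_url("P31"))                    # property only
    print(build_query_url("P27", ["Q30", "Q183"]))   # US vs. Germany
    print(build_query_url("P106", ["Q1028181"]))     # occupation = painter
```

The second call reproduces the comparison link from the announcement, and the third is the painter query I ran.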

On Tue, Dec 8, 2015 at 6:48 PM, Amir Ladsgroup <ladsgr...@gmail.com> wrote:

> Hey,
> There have been several discussions regarding the quality of information in
> Wikidata. I wanted to work on the quality of Wikidata, but we don't have any
> good source of information to see where we are ahead and where we are
> behind. So I thought the best thing I could do is make something that shows
> people, in detail, how well sourced our data is. So here we have
> http://tools.wmflabs.org/wd-analyst/index.php
>
> You can give only a property (let's say P31) and it gives you the four
> most used values plus an overall analysis of sources and quality (check this
> out <http://tools.wmflabs.org/wd-analyst/index.php?p=P31>),
> and then you can see that about 33% of them are sourced, of which 29.1%
> are based on Wikipedia.
> You can also give a property and as many values as you want. Let's say you
> want to compare P27:Q183 (country of citizenship: Germany) and P27:Q30 (US).
> Check this out
> <http://tools.wmflabs.org/wd-analyst/index.php?p=P27&q=Q30%7CQ183>. You
> can see US biographies are more abundant (300K versus 200K) but German
> biographies are more descriptive (3.8 descriptions per item versus 3.2
> descriptions per item).
>
> One important note: compare P31:Q5 (a trivial statement), where 46% of them
> are not sourced at all and 49% are based on Wikipedia, *but* get
> these statistics for population properties (P1082
> <http://tools.wmflabs.org/wd-analyst/index.php?p=P1082>). That is not a
> trivial statement, and we need to be careful about such statements. It turns
> out there is slightly more than one reference per statement and only 4% of
> them are based on Wikipedia. So we can relax and enjoy this highly sourced data.
>
> Requests:
>
>    - Please tell me whether you want this tool at all
>    - Please suggest more ways to analyze and catch unsourced materials
>
> Future plan (if you agree to keep using this tool):
>
>    - Support more datatypes (e.g. date of birth based on year,
>    coordinates)
>    - Sitelink-based and reference-based analysis (to check how many of
>    the articles of, let's say, Chinese Wikipedia are unsourced)
>    - Free-style analysis: there is a database behind this tool that can be
>    used for many more applications. You can get the most unsourced statements
>    of P31 and then go fix them. I'm trying to build a playground
>    for this kind of task.
>
> I hope you like this and rock on!
> <http://tools.wmflabs.org/wd-analyst/index.php?p=P136&q=Q11399>
> Best
>
> _______________________________________________
> Wikidata mailing list
> Wikidata@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikidata
>
>
