Hey Jane,
Yes. Exactly :)

Best

On Tue, Dec 8, 2015 at 9:37 PM Jane Darnell <jane...@gmail.com> wrote:

> Very useful, Amir, thanks! I just ran it for occupation=painter
>  (p=P106&q=Q1028181)
> Am I correct in my interpretation that in general painters have fewer
> claims than the entire population of items with the property occupation?
>
> On Tue, Dec 8, 2015 at 6:48 PM, Amir Ladsgroup <ladsgr...@gmail.com>
> wrote:
>
>> Hey,
>> There have been several discussions regarding the quality of information in
>> Wikidata. I wanted to work on the quality of Wikidata, but we don't have any
>> good source of information to see where we are ahead and where we are
>> behind. So I thought the best thing I could do is build something that shows
>> people, in detail, how well sourced our data is. So here we have
>> *http://tools.wmflabs.org/wd-analyst/index.php
>> <http://tools.wmflabs.org/wd-analyst/index.php>*
>>
>> You can give just a property (let's say P31) and it gives you the four
>> most used values plus an overall analysis of sources and quality (check this
>> out <http://tools.wmflabs.org/wd-analyst/index.php?p=P31>),
>> and then you can see that about 33% of them are sourced, with 29.1% of them
>> based on Wikipedia.
>> You can also give a property and as many values as you want. Let's say you
>> want to compare P27:Q183 (country of citizenship: Germany) and P27:Q30 (US).
>> Check this out
>> <http://tools.wmflabs.org/wd-analyst/index.php?p=P27&q=Q30%7CQ183>. And
>> you can see US biographies are more abundant (300K versus 200K) but German
>> biographies are more descriptive (3.8 descriptions per item versus 3.2
>> descriptions per item).
>>
>> One important note: take P31:Q5 (a trivial statement). 46% of them are
>> not sourced at all and 49% of them are based on Wikipedia. *But* compare
>> that with the statistics for population properties (P1082
>> <http://tools.wmflabs.org/wd-analyst/index.php?p=P1082>). It's not a
>> trivial statement and we need to be careful about them. It turns out there
>> is slightly more than one reference per statement and only 4% of them are
>> based on Wikipedia. So we can relax and enjoy this highly sourced data.
>>
>> Requests:
>>
>>    - Please tell me whether you want this tool at all
>>    - Please suggest more ways to analyze and catch unsourced materials
>>
>> Future plan (if you agree to keep using this tool):
>>
>>    - Support more datatypes (e.g. date of birth based on year,
>>    coordinates)
>>    - Sitelink-based and reference-based analysis (to check how many of
>>    the articles of, let's say, Chinese Wikipedia are unsourced)
>>    - Free-style analysis: There is a database for this tool that can be
>>    used for many more applications. You can get the most unsourced statements
>>    of P31 and then go fix them. I'm trying to build a playground
>>    for these kinds of tasks.
>>
>> I hope you like this and rock on!
>> <http://tools.wmflabs.org/wd-analyst/index.php?p=P136&q=Q11399>
>> Best
>>
>> _______________________________________________
>> Wikidata mailing list
>> Wikidata@lists.wikimedia.org
>> https://lists.wikimedia.org/mailman/listinfo/wikidata
>>
>>