Re: [Wikidata-l] Statistics

2013-10-19 Thread Gerard Meijssen
Hoi,
This is my analysis of the situation, with several strategies to remedy it.
I am really interested in your reaction. And yes, fallback is in there, but
there has to be something to fall back to, and that is currently missing.
Thanks,
 Gerard

http://ultimategerardm.blogspot.nl/2013/10/wikdata-needs-378000-labels.html



___
Wikidata-l mailing list
Wikidata-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-l


Re: [Wikidata-l] Statistics

2013-10-19 Thread Magnus Manske
<toolspam>
The Terminator [1] can show you the most linked-to (~important) items with
no label (term, hence the name) in major languages.
</toolspam>

[1] http://tools.wmflabs.org/wikidata-terminator/index.php
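
The kind of listing described above can be sketched in a few lines of Python. This is only an illustration of the idea (rank the items that lack a label in one language by how often they are linked to), not the Terminator's actual code; the item ids, link counts, labels and the function name `most_linked_without_label` are all made up for the example.

```python
from typing import Dict, List, Tuple

# Made-up sample data: item id -> (incoming link count, labels by language code).
ITEMS: Dict[str, Tuple[int, Dict[str, str]]] = {
    "item_a": (120, {"en": "alpha", "de": "Alpha"}),
    "item_b": (300, {"en": "beta"}),
    "item_c": (45, {"en": "gamma"}),
    "item_d": (10, {"de": "Delta"}),
}


def most_linked_without_label(
    items: Dict[str, Tuple[int, Dict[str, str]]], lang: str, limit: int = 10
) -> List[Tuple[str, int]]:
    """Return (item id, link count) pairs that have no label in `lang`,
    most linked-to first, cut off at `limit` entries."""
    missing = [
        (item_id, links)
        for item_id, (links, labels) in items.items()
        if lang not in labels
    ]
    # Sort by link count, highest first, as a rough proxy for importance.
    return sorted(missing, key=lambda pair: pair[1], reverse=True)[:limit]


if __name__ == "__main__":
    for item_id, links in most_linked_without_label(ITEMS, "de"):
        print(item_id, links)
```

Sorting by link count is the "~important" heuristic from the message: the more an item is linked to, the more a missing label hurts.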




Re: [Wikidata-l] Statistics

2013-10-19 Thread rupert THURNER
i took one example and am lost already, pegasus, listed on top with 5
labels without description:
http://tools.wmflabs.org/wikidata-terminator/index.php?lang=de&term=Pegasus&doit=1

then i take one with a description, "Sternbild knapp nördlich des
Himmelsäquators" [constellation just north of the celestial equator]:
https://www.wikidata.org/wiki/Q8864

and, i do not see this description, nor can i figure out where this
description came from. where did i make the error?

rupert




Re: [Wikidata-l] Statistics

2013-10-19 Thread Magnus Manske
You used German (de) on the Terminator page. Have you switched your
Wikidata language to de accordingly?




Re: [Wikidata-l] Statistics

2013-10-19 Thread rupert THURNER
maybe the most important question first: is it the goal that human
editors extend / correct this data in wikidata, or is there a feed?

if it is really humans who should enter data:
thanks for the hint magnus, i can see it now, hallelujah. i would
never in my life have had the idea to change the GUI language in preferences
to de in order to change the content language. and it takes 8 clicks on a
smaller screen - enough that i would not do it more than once every
5 years :)

rupert



Re: [Wikidata-l] Statistics

2013-10-18 Thread Lydia Pintscher
On Fri, Oct 18, 2013 at 7:26 AM, Gerard Meijssen
gerard.meijs...@gmail.com wrote:
 Hoi,

 I do not know if you have seen the statistics compiled by Magnus [1]. They
 are up to date and useful.

 I blogged about it [2]. As far as I am concerned, the biggest challenge we
 face is the lack of labels. Given that 280+ languages are represented in
 Wikidata it clearly demonstrates that Wikidata is useless as it is for most
 languages. Please tell me that I am wrong and explain why.

This is correct to a certain degree. However, we have language
fallbacks on the roadmap, which will significantly improve the
situation. Liangent has put a lot of effort into this over the summer
during Google Summer of Code. The other thing is that there are clearly
a number of items which are used more than others. My theory is that
they are also the ones that are more complete. If there is no label in
a small language for a very obscure item, then this is less bad than when
there is none for a much-used item. Not all items are created equal.
We should keep that in mind when interpreting statistics.
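
To make the fallback idea concrete, here is a minimal Python sketch. It is an illustration only and not the code from the Summer of Code project: the fallback chains, the sample labels and the function name `label_with_fallback` are all assumptions invented for the example.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical fallback chains: requested language -> languages to try next.
# Illustrative assumptions only, not Wikidata's real configuration.
FALLBACKS: Dict[str, List[str]] = {
    "de-at": ["de", "en"],
    "de": ["en"],
    "nl": ["en"],
}


def label_with_fallback(
    labels: Dict[str, str], lang: str
) -> Optional[Tuple[str, str]]:
    """Return (label, language code it came from), or None if neither the
    requested language nor any language in its chain has a label."""
    for code in [lang, *FALLBACKS.get(lang, [])]:
        if code in labels:
            return labels[code], code
    return None


# Made-up item that has labels in only two languages.
item_labels = {"en": "example item", "de": "Beispielobjekt"}

if __name__ == "__main__":
    print(label_with_fallback(item_labels, "de-at"))
    print(label_with_fallback(item_labels, "nl"))
    print(label_with_fallback({}, "nl"))
```

The last call shows the limit of the approach: if no language in the chain has a label, fallback has nothing to return.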


Cheers
Lydia

-- 
Lydia Pintscher - http://about.me/lydia.pintscher
Product Manager for Wikidata

Wikimedia Deutschland e.V.
Obentrautstr. 72
10963 Berlin
www.wikimedia.de

Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e. V.

Registered in the register of associations of the Amtsgericht
Berlin-Charlottenburg under number 23855 Nz. Recognized as charitable by the
Finanzamt für Körperschaften I Berlin, tax number 27/681/51985.
