Re: [Wikidata-l] Countries ranked

2013-10-05 Thread Sven Manguard
Katie has a bot that does it. I think there is a script somewhere already
too, but I can't find it.

Sven
___
Wikidata-l mailing list
Wikidata-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-l


Re: [Wikidata-l] Countries ranked

2013-10-05 Thread Jan Dudík
Give me a Python script, and I'll move all coordinates from cs and sk wiki ;-)

So far there are only scripts for importing the "item" and "string" data types.

JAnD

---
Ing. Jan Dudík
projekce dopravních staveb
tel. 777082195




Re: [Wikidata-l] Countries ranked

2013-10-05 Thread Katie Filbert
On Sat, Oct 5, 2013 at 3:49 PM, Maarten Dammers  wrote:

>  Hi Katie,
>
> Op 4-10-2013 23:29, Katie Filbert schreef:
>
>  As many folks enjoy country rankings, I have generated a list of
> countries (Property:P17) ranked by number of coordinates (P625) in
> Wikidata.  Note this data is from the September 22 database dump.
>
> Did you consider doing this as a query on the live database on Tool*?
>

Unfortunately Tool Labs does not have PostgreSQL and PostGIS yet :(  and
those are what I need for some of this work.

It's easy enough to work with dumps, and I can do things on Tool Labs when it
makes sense.


> You have a page for an item with coordinates.
> * Join this page against pagelinks for P625 (coordinate location)
> * Join this page against pagelinks for P17 (country)
> * Join this page against pagelinks for countrypage
> * Join countrypage against pagelinks for Q6256 (country) or Q1763527
> (constituent country)
>
> Group it by country and do some ordering.
>
> We do seem to have quite a few items where country is missing, see
> http://208.80.153.172/wdq/?q=claim[625]_AND_noclaim[17]_AND_noclaim[31] .
> We should probably work on that too.
>

Yep!


>
> I was wondering how many articles have coordinates on the Dutch
> Wikipedia but not on Wikidata. For that I created a tracking category, see
> https://nl.wikipedia.org/wiki/Categorie:Wikipedia:Co%C3%B6rdinaten_niet_op_Wikidata
> . We could probably do some Lua magic to compare coordinates in Wikidata
> with local coordinates and see how far apart they are. Has anyone already
> built something in Lua that could be reused for this?
>

I am not aware of that. Definitely a good idea.

Cheers,
Katie



>
>
> Maarten


-- 
Katie Filbert
filbe...@gmail.com
@filbertkm / @wikimediadc / @wikidata


Re: [Wikidata-l] Countries ranked

2013-10-05 Thread Maarten Dammers

Hi Katie,

Op 4-10-2013 23:29, Katie Filbert schreef:
As many folks enjoy country rankings, I have generated a list of 
countries (Property:P17) ranked by number of coordinates (P625) in 
Wikidata.  Note this data is from the September 22 database dump.

Did you consider doing this as a query on the live database on Tool*?
You have a page for an item with coordinates.
* Join this page against pagelinks for P625 (coordinate location)
* Join this page against pagelinks for P17 (country)
* Join this page against pagelinks for countrypage
* Join countrypage against pagelinks for Q6256 (country) or Q1763527
(constituent country)


Group it by country and do some ordering.
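
A minimal sketch of the joins above, run against an in-memory SQLite
database rather than the live replica. The two-column pagelinks table, the
item names item_a to item_d and the helper rank_countries are assumptions
made for illustration; the real pagelinks schema has more columns and the
query would need adapting.

```python
import sqlite3

# Toy data only: item_a..item_d are hypothetical items, and this two-column
# table is a simplified stand-in for MediaWiki's pagelinks table.
LINKS = [
    ("item_a", "P625"), ("item_a", "P17"), ("item_a", "Q30"),
    ("item_b", "P625"), ("item_b", "P17"), ("item_b", "Q30"),
    ("item_c", "P625"), ("item_c", "P17"), ("item_c", "Q159"),
    ("item_d", "P17"), ("item_d", "Q159"),   # no coordinates, so not counted
    ("Q30", "Q6256"), ("Q159", "Q6256"),     # country pages link to "country"
]

QUERY = """
SELECT c.pl_title AS country, COUNT(DISTINCT i.pl_from) AS n
FROM pagelinks AS i
JOIN pagelinks AS p17 ON p17.pl_from = i.pl_from AND p17.pl_title = 'P17'
JOIN pagelinks AS c   ON c.pl_from = i.pl_from
JOIN pagelinks AS cls ON cls.pl_from = c.pl_title
                     AND cls.pl_title IN ('Q6256', 'Q1763527')
WHERE i.pl_title = 'P625'
GROUP BY c.pl_title
ORDER BY n DESC, country
"""


def rank_countries(links):
    """Return (country, item count) rows, most coordinates first."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE pagelinks (pl_from TEXT, pl_title TEXT)")
    conn.executemany("INSERT INTO pagelinks VALUES (?, ?)", links)
    rows = conn.execute(QUERY).fetchall()
    conn.close()
    return rows


ranking = rank_countries(LINKS)
print(ranking)
```

Because pagelinks only records that a link exists, this counts items that
link to P625, P17 and a country page; it cannot tell which property the
country link belongs to.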

We do seem to have quite a few items where country is missing, see 
http://208.80.153.172/wdq/?q=claim[625]_AND_noclaim[17]_AND_noclaim[31] 
. We should probably work on that too.
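
The same filter can be sketched offline, for example over data extracted
from a dump. The dictionary layout and the item names below are
hypothetical; only the property IDs (P625, P17, P31) come from the query
above.

```python
# Hypothetical extract: item name -> set of property IDs it has claims for.
ITEMS = {
    "item_a": {"P625", "P17"},
    "item_b": {"P625"},
    "item_c": {"P625", "P31"},
    "item_d": {"P17"},
}


def missing_country(items):
    """Items with a coordinate claim but no country and no instance-of claim."""
    return sorted(
        name
        for name, props in items.items()
        if "P625" in props and "P17" not in props and "P31" not in props
    )


todo = missing_country(ITEMS)
print(todo)
```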


I was wondering how many articles have coordinates on the Dutch
Wikipedia but not on Wikidata. For that I created a tracking category,
see
https://nl.wikipedia.org/wiki/Categorie:Wikipedia:Co%C3%B6rdinaten_niet_op_Wikidata
. We could probably do some Lua magic to compare coordinates in Wikidata
with local coordinates and see how far apart they are. Has anyone
already built something in Lua that could be reused for this?
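
As a rough illustration of the comparison, here is the distance check
sketched in Python rather than Lua, using the haversine great-circle
formula. The 1 km threshold, the sample pairs and the function names are
assumptions for the example, not an existing module.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def far_apart(pairs, threshold_km=1.0):
    """Names whose Wikidata and local coordinates differ by more than the threshold."""
    return [
        name
        for name, wikidata, local in pairs
        if haversine_km(*wikidata, *local) > threshold_km
    ]


# Hypothetical (name, Wikidata (lat, lon), local (lat, lon)) triples.
PAIRS = [
    ("example_a", (52.0, 5.0), (52.0, 5.0)),
    ("example_b", (52.0, 5.0), (52.5, 5.0)),
]
suspicious = far_apart(PAIRS)
print(suspicious)
```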


Maarten


Re: [Wikidata-l] Countries ranked

2013-10-04 Thread Magnus Manske
I can offer:

189,742 of 210,616 places (90%) in the U.S. with coordinates
http://208.80.153.172/wdq/?q=tree[30][150][17,131]_AND_claim[625]

55,504 of 92,510 (60%) in Russia:
http://208.80.153.172/wdq/?q=tree[159][150][17,131]_AND_claim[625]

You'll have to do the rest on your own ;-)
(for automated stats, just copy the API call for all countries)
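
A small sketch of how the per-country calls could be generated and the
percentages computed. It only builds URLs and does no network access. It
assumes that the same tree query without the claim[625] part returns the
total number of places, and it copies the query syntax exactly as it
appears in the two links above; wdq_urls and coverage_percent are
hypothetical helper names.

```python
WDQ_BASE = "http://208.80.153.172/wdq/?q="


def wdq_urls(country_id):
    """Return (all places, places with coordinates) query URLs for a country item ID."""
    tree = f"tree[{country_id}][150][17,131]"
    return WDQ_BASE + tree, WDQ_BASE + tree + "_AND_claim[625]"


def coverage_percent(with_coords, total):
    """Share of places that have coordinates, as a whole percentage."""
    return round(100 * with_coords / total)


# Item IDs taken from the two queries above: 30 (U.S.) and 159 (Russia).
for country_id in (30, 159):
    all_url, coords_url = wdq_urls(country_id)
    print(all_url)
    print(coords_url)

print(coverage_percent(189742, 210616))
print(coverage_percent(55504, 92510))
```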






Re: [Wikidata-l] Countries ranked

2013-10-04 Thread Katie Filbert
On Fri, Oct 4, 2013 at 11:34 PM, Sven Manguard wrote:

> This is cool, but do we have any statistics on the number of pages that have
> coordinate locations versus the number of pages that should have
> coordinate locations?
>
Actually I am working on that.

http://tools.wmflabs.org/audetools/coords (note, it is incomplete data and
not as user-friendly yet as it could be)

Click on each Wikipedia link to see per-country stats for that Wikipedia.

Stay tuned for more :)


> This statistic is more a reflection of where the bots have been running
> and where they haven't been run yet. I'd be very interested in seeing the
> top 10 once we've imported all the coords we can.
>
That would be really cool to see at that point.

Cheers,
Katie


-- 
Katie Filbert
filbe...@gmail.com
@filbertkm / @wikimediadc / @wikidata


Re: [Wikidata-l] Countries ranked

2013-10-04 Thread Sven Manguard
This is cool, but do we have any statistics on the number of pages that have
coordinate locations versus the number of pages that should have
coordinate locations? This statistic is more a reflection of where the
bots have been run and where they haven't been run yet. I'd be very
interested in seeing the top 10 once we've imported all the coords we can.

Sven


[Wikidata-l] Countries ranked

2013-10-04 Thread Katie Filbert
As many folks enjoy country rankings, I have generated a list of countries
(Property:P17) ranked by number of coordinates (P625) in Wikidata.  Note
this data is from the September 22 database dump.

There are a total of 737,271 coordinates in Wikidata.

Top countries are

1) US
2) Russia
3) UK
4) China
5) France
6) Ukraine
7) Canada
8) Germany
9) Australia
10) Poland

See the full list (which also has a few items entered for P17 that are not
really countries):

https://www.wikidata.org/wiki/User:Aude/countrystats

Cheers,
Katie (user:aude)

-- 
Katie Filbert
filbe...@gmail.com
@filbertkm / @wikimediadc / @wikidata