Re: [Wikidata-l] Countries ranked
Katie has a bot that does it. I think there is a script somewhere already too, but I can't find it.

Sven

On Oct 5, 2013 3:58 PM, "Jan Dudík" wrote:
> Give me a Python script, and I'll move all coordinates from cs and sk wiki ;-)
>
> So far there are only scripts for importing the types "item" and "string".
>
> JAnD

___
Wikidata-l mailing list
Wikidata-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-l
Re: [Wikidata-l] Countries ranked
Give me a Python script, and I'll move all coordinates from cs and sk wiki ;-)

So far there are only scripts for importing the types "item" and "string".

JAnD

---
Ing. Jan Dudík
transport infrastructure design
tel. 777082195
Re: [Wikidata-l] Countries ranked
On Sat, Oct 5, 2013 at 3:49 PM, Maarten Dammers wrote:

> Did you consider doing this as a query on the live database on Tool*?

Unfortunately, Tool Labs does not have PostgreSQL and PostGIS yet :( which is what I need to do certain stuff.

It's easy enough to work with dumps, and I can do stuff on Tool Labs when it makes sense.

> We do seem to have quite a few items where country is missing, see
> http://208.80.153.172/wdq/?q=claim[625]_AND_noclaim[17]_AND_noclaim[31] .
> We should probably work on that too.

Yep!

> We could probably do some Lua magic to compare coordinates in Wikidata
> with local coordinates and see how far apart they are. Did anyone already
> build something in Lua that might be reused for this?

I am not aware of that. Definitely a good idea.

Cheers,
Katie

--
Katie Filbert
filbe...@gmail.com
@filbertkm / @wikimediadc / @wikidata
Re: [Wikidata-l] Countries ranked
Hi Katie,

On 4-10-2013 23:29, Katie Filbert wrote:
> As many folks enjoy country rankings, I have generated a list of
> countries (Property:P17) ranked by number of coordinates (P625) in
> Wikidata. Note this data is from the September 22 database dump.

Did you consider doing this as a query on the live database on Tool*?

You have a page for an item with coordinates.
 * Join this page against pagelinks for P625 (coordinate location)
 * Join this page against pagelinks for P17 (country)
 * Join this page against pagelinks for countrypage
 * Join countrypage against pagelinks for Q6256 (country) or Q1763527 (constituent country)

Group it by country and do some ordering.

We do seem to have quite a few items where the country is missing, see http://208.80.153.172/wdq/?q=claim[625]_AND_noclaim[17]_AND_noclaim[31] . We should probably work on that too.

I was wondering how many articles have coordinates at the Dutch Wikipedia, but not at Wikidata. For that I created a tracker category, see https://nl.wikipedia.org/wiki/Categorie:Wikipedia:Co%C3%B6rdinaten_niet_op_Wikidata .

We could probably do some Lua magic to compare coordinates in Wikidata with local coordinates and see how far apart they are. Did anyone already build something in Lua that might be reused for this?

Maarten
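For the "how far apart" comparison, here is a minimal sketch of the distance check. It is written in Python rather than Lua, purely as an illustration. The function names, the 1 km threshold, and the mean Earth radius are assumptions made for the example and are not part of any existing module or bot mentioned in this thread.

```python
import math

# Assumption: mean Earth radius in kilometres; good enough for a rough check.
EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def far_apart(wikidata_coord, local_coord, threshold_km=1.0):
    """True when two (lat, lon) pairs differ by more than threshold_km.

    threshold_km is a hypothetical cut-off; pick whatever tolerance suits
    the kind of object being compared.
    """
    return haversine_km(*wikidata_coord, *local_coord) > threshold_km
```

A pair such as `far_apart((52.0, 4.0), (53.0, 4.0))` would be flagged, since one degree of latitude is roughly 111 km. A larger threshold probably makes sense for big objects such as rivers or provinces.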
Re: [Wikidata-l] Countries ranked
I can offer:

189,742 of 210,616 places (90%) in the U.S. with coordinates:
http://208.80.153.172/wdq/?q=tree[30][150][17,131]_AND_claim[625]

55,504 of 92,510 (60%) in Russia:
http://208.80.153.172/wdq/?q=tree[159][150][17,131]_AND_claim[625]

You'll have to do the rest on your own ;-)
(for automated stats, just copy the API call for all countries)

On Fri, Oct 4, 2013 at 11:34 PM, Sven Manguard wrote:
> This is cool, but do we have any statistics on the number of pages have
> coordinate locations versus the number of pages that should have
> coordinated locations?
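The percentages above are just the share of places that carry a coordinate claim. A small sketch of that arithmetic follows, using the two pairs of counts quoted in this message. The helper name `coverage_percent` and the dictionary layout are made up for the illustration.

```python
def coverage_percent(with_coords, total):
    """Share of places that have coordinates, as a percentage."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * with_coords / total


# (places with coordinates, all places), as quoted in the message above.
counts = {
    "U.S.": (189_742, 210_616),
    "Russia": (55_504, 92_510),
}

for country, (with_coords, total) in counts.items():
    print(f"{country}: {coverage_percent(with_coords, total):.0f}%")
```

For other countries, the two counts would have to come from the corresponding queries, which are not reproduced here.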
Re: [Wikidata-l] Countries ranked
On Fri, Oct 4, 2013 at 11:34 PM, Sven Manguard wrote:

> This is cool, but do we have any statistics on the number of pages that have
> coordinate locations versus the number of pages that should have
> coordinate locations?

Actually, I am working on that.

http://tools.wmflabs.org/audetools/coords (note: the data is incomplete, and it is not yet as user-friendly as it could be)

Click on each Wikipedia link to see per-country stats for that Wikipedia.

Stay tuned for more :)

> This statistic is more a reflection of where the bots have been running
> and where they haven't been run yet. I'd be very interested in seeing the
> top 10 once we've imported all the coords we can.

That would be really cool to see at that point.

Cheers,
Katie
Re: [Wikidata-l] Countries ranked
This is cool, but do we have any statistics on the number of pages that have coordinate locations versus the number of pages that should have coordinate locations?

This statistic is more a reflection of where the bots have been running and where they haven't been run yet. I'd be very interested in seeing the top 10 once we've imported all the coords we can.

Sven
[Wikidata-l] Countries ranked
As many folks enjoy country rankings, I have generated a list of countries (Property:P17) ranked by number of coordinates (P625) in Wikidata. Note this data is from the September 22 database dump.

There are a total of 737,271 coordinates in Wikidata.

Top countries are

1) US
2) Russia
3) UK
4) China
5) France
6) Ukraine
7) Canada
8) Germany
9) Australia
10) Poland

See the full list (which also has a few items entered for P17 that are not really countries):

https://www.wikidata.org/wiki/User:Aude/countrystats

Cheers,
Katie (user:aude)
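A ranking like this could be sketched roughly as follows. This is not the script that was actually run against the dump. It assumes the claims have already been extracted into simple `(item, property, value)` tuples, which is a hypothetical format chosen for the example, and it counts each item once per country even if the item has several coordinates.

```python
from collections import Counter


def rank_countries(claims):
    """Rank P17 (country) values by the number of items that also have P625.

    claims: iterable of (item_id, property_id, value) tuples. This layout is
    an assumption for the sketch, not the format of the database dump.
    """
    claims = list(claims)  # the data is scanned twice below
    # Items that carry at least one coordinate location claim.
    with_coords = {item for item, prop, _ in claims if prop == "P625"}
    # Count one per (item, country) pair where the item has coordinates.
    counts = Counter(
        value
        for item, prop, value in claims
        if prop == "P17" and item in with_coords
    )
    return counts.most_common()


sample = [
    ("Q1", "P625", (38.9, -77.0)),
    ("Q1", "P17", "Q30"),
    ("Q2", "P625", (40.7, -74.0)),
    ("Q2", "P17", "Q30"),
    ("Q3", "P625", (55.8, 37.6)),
    ("Q3", "P17", "Q159"),
    ("Q4", "P17", "Q159"),  # no coordinates, so not counted
]
print(rank_countries(sample))
```

Values entered for P17 that are not really countries would still show up in the result, so a filter on Q6256 (country) or Q1763527 (constituent country) would be needed to exclude them.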