[Wikidata] Tool for consuming left-over data from import

2017-08-04 Thread André Costa
Hi all!

As part of the Connected Open Heritage project, Wikimedia Sverige has been
migrating Wiki Loves Monuments datasets from Wikipedias to Wikidata.

In the course of doing this we keep a record of the data which we fail to
migrate. For each of these left-over bits we know which item and which
property it belongs to, as well as the source field and language from the
Wikipedia list. An example would be a "type of building" field where
we could not match the text to an item on Wikidata but know that the target
property is P31.

We have created dumps of these (such as
https://tools.wmflabs.org/coh/_total_se-ship_new.json, don't worry this one
is tiny) but are now looking for an easy way for users to consume them.
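
To give an idea of the plumbing involved, here is a rough Python sketch of a
consumer. The key names used below ("item", "property", "value", "field",
"lang") are placeholders; the actual field names in the dumps may differ.

    import json
    import urllib.request

    DUMP_URL = "https://tools.wmflabs.org/coh/_total_se-ship_new.json"

    # Fetch the dump and present each left-over bit so that a human can
    # decide whether anything should be added to the item by hand.
    with urllib.request.urlopen(DUMP_URL) as response:
        entries = json.load(response)

    for entry in entries:
        print("Item {item}, property {property}: '{value}' "
              "(from field '{field}', language {lang})".format(**entry))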

Does anyone know of a tool which could do this today? The Wikidata game
only allows (AFAIK) for yes/no/skip, whereas here you would want something
like match/invalid/skip. And if not, are there any tools which, with
a bit of forking, could be made to do it?

We have only published a few dumps but there are more to come. I would also
imagine that this, or a similar, format could be useful for other
imports/template harvests where some fields are more easily handled by
humans.

Any thoughts and suggestions are welcome.
Cheers,
André
André Costa | Senior Developer, Wikimedia Sverige | andre.co...@wikimedia.se
| +46 (0)733-964574

Stöd fri kunskap, bli medlem i Wikimedia Sverige.
Läs mer på blimedlem.wikimedia.se
___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] New step towards structured data for Commons is now available: federation

2017-07-06 Thread André Costa
Nice!

Will the connection back to the image be included in the RDF? The /entity/
path was not available, so I couldn't check what is there now.

Cheers,
André



On 6 Jul 2017 15:10, "Léa Lacroix"  wrote:

> Hello all,
>
> As you may know, WMF, WMDE and volunteers are working together on the
> structured data for Commons project.
> We’re currently working on a lot of technical groundwork for this project.
> One big part of that is allowing the use of Wikidata’s items and properties
> to describe media files on Commons. We call this feature federation. We
> have now developed the necessary code for it and you can try it out on a
> test system and give feedback.
>
> We have one test wiki that represents Commons
> (http://structured-commons.wmflabs.org) and another one simulating
> Wikidata (http://federated-wikidata.wmflabs.org). You can see an example
> where the statements use items and properties from the faked Wikidata.
> Feel free to try it by adding statements to some of the files on the test
> system. (You might need to create some items on
> http://federated-wikidata.wmflabs.org if they don’t exist yet. We have
> created a few for testing.)
> If you have any questions or concerns, please let us know.
> Thanks,
>
> --
> Léa Lacroix
> Project Manager Community Communication for Wikidata
>
> Wikimedia Deutschland e.V.
> Tempelhofer Ufer 23-24
> 10963 Berlin
> www.wikimedia.de
>
> Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e. V.
>
> Eingetragen im Vereinsregister des Amtsgerichts Berlin-Charlottenburg
> unter der Nummer 23855 Nz. Als gemeinnützig anerkannt durch das Finanzamt
> für Körperschaften I Berlin, Steuernummer 27/029/42207.
>
> ___
> Wikidata mailing list
> Wikidata@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikidata
>
>
___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] Three folders about Wikidata

2016-02-23 Thread André Costa
The three originals are in Swedish but are in turn based on a single one in
German.

--
André Costa
GLAM Developer
Wikimedia Sverige
On 23 Feb 2016 12:33, "Gerard Meijssen"  wrote:

> Hoi,
> What was the original language. German ?
> Thanks,
>  GerardM
>
> On 23 February 2016 at 09:54, Romaine Wiki  wrote:
>
>> Hi all,
>>
>> Currently I am working on translating three folders about Wikidata
>> (aiming at GLAMs, businesses and research) to English, and later Dutch and
>> French (so they can be used in Belgium, France and the Netherlands).
>>
>> I translated the texts of these folders here:
>> https://be.wikimedia.org/wiki/User:Romaine/Wikidata
>>
>> The texts of the folders are not completely reviewed, but if any native
>> speaker wants to look at them and fix some grammar etc., feel free to
>> do so.
>>
>> Greetings,
>> Romaine
>>
>> ___
>> Wikidata mailing list
>> Wikidata@lists.wikimedia.org
>> https://lists.wikimedia.org/mailman/listinfo/wikidata
>>
>>
>
> ___
> Wikidata mailing list
> Wikidata@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikidata
>
>
___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] Wikidata Propbrowse

2016-02-15 Thread André Costa
Would it be possible to set the language used for searching? Whilst I most
often use English on Wikidata, I'm sure a lot of people don't.

/André
On 14 Feb 2016 22:03, "Markus Kroetzsch" 
wrote:

> On 14.02.2016 18:03, Hay (Husky) wrote:
>
>> On Sun, Feb 14, 2016 at 4:40 PM, Markus Kroetzsch
>>  wrote:
>>
>>>> I suspect that https://query.wikidata.org can count how many times each
>>>> property is used.
>>>
>>>
>>> Amazingly, you can (I was surprised):
>>>
>>>
>>> https://query.wikidata.org/#SELECT%20%3FanyProp%20%28count%28*%29%20as%20%3Fcount%29%0AWHERE%20{%0A%20%20%20%20%3Fpid%20%3FanyProp%20%3FsomeValue%20.%0A}%0AGROUP%20BY%20%3FanyProp%0AORDER%20BY%20DESC%28%3Fcount%29
>>> 
>>>
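
For readability, the URL-encoded query in that link decodes to:

    SELECT ?anyProp (count(*) as ?count)
    WHERE {
        ?pid ?anyProp ?someValue .
    }
    GROUP BY ?anyProp
    ORDER BY DESC(?count)
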
>> That's a really nice find! Any idea how to filter the query so you
>> only get the property statements?
>>
>
> I would just filter this in code; a more complex SPARQL query is just
> getting slower. Here is a little example Python script that gets all the
> data you need:
>
>
> https://github.com/Wikidata/WikidataClassBrowser/blob/master/helpers/python/fetchPropertyStatitsics.py
>
> I intend to use this in our upcoming new class/property browser as well.
> Maybe it would actually make sense to merge the two applications at some
> point (the focus of our tool is classes and their connection to
> properties, as in the existing Miga tool, but a property browser is an
> integral part of this).
>
> Markus
>
>
> ___
> Wikidata mailing list
> Wikidata@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikidata
>
___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] New datatype for mathematical expressions is now available

2016-02-11 Thread André Costa
As does pywikibot :)

--
André Costa
GLAM Developer
Wikimedia Sverige
On 10 Feb 2016 13:37, "Lydia Pintscher" 
wrote:

> Hey everyone :)
>
> The new datatype for mathematical expressions is now enabled here. As soon
> as properties using it are created they'll show up at
> https://www.wikidata.org/wiki/Special:ListProperties?datatype=math. At
> https://en.wikipedia.org/wiki/Help:Displaying_a_formula you can find out
> more about the supported syntax.
> Thanks to Markus the Wikidata Toolkit supports it as well already:
> https://lists.wikimedia.org/pipermail/wikidata/2016-February/008169.html
>
>
> Cheers
> Lydia
> --
> Lydia Pintscher - http://about.me/lydia.pintscher
> Product Manager for Wikidata
>
> Wikimedia Deutschland e.V.
> Tempelhofer Ufer 23-24
> 10963 Berlin
> www.wikimedia.de
>
> Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e. V.
>
> Eingetragen im Vereinsregister des Amtsgerichts Berlin-Charlottenburg
> unter der Nummer 23855 Nz. Als gemeinnützig anerkannt durch das Finanzamt
> für Körperschaften I Berlin, Steuernummer 27/029/42207.
>
> ___
> Wikidata mailing list
> Wikidata@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikidata
>
>
___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] [Wikidata-tech] Technical information about the new "math" and "external-id" data types

2016-02-05 Thread André Costa
I noticed that the `math` data type validates the input (e.g. "/foo" is
refused). Are there any limitations on what `external-id` is allowed to be?

------
André Costa
GLAM Developer
Wikimedia Sverige
On 5 Feb 2016 17:18, "Lydia Pintscher"  wrote:

> On Fri, Feb 5, 2016 at 1:50 PM Markus Kroetzsch <
> markus.kroetz...@tu-dresden.de> wrote:
>
>> I think I already commented on this in other places. Wasn't there a
>> tracker item where the derived values were discussed? Something to keep
>> in mind here is that many properties have multiple URIs and URLs
>> associated. This is no problem in RDF, but your above encoding might not
>> work for this case.
>>
>
> The ticket is https://phabricator.wikimedia.org/T112548 and its blockers
> I assume.
>
> Cheers
> Lydia
> --
> Lydia Pintscher - http://about.me/lydia.pintscher
> Product Manager for Wikidata
>
> Wikimedia Deutschland e.V.
> Tempelhofer Ufer 23-24
> 10963 Berlin
> www.wikimedia.de
>
> Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e. V.
>
> Eingetragen im Vereinsregister des Amtsgerichts Berlin-Charlottenburg
> unter der Nummer 23855 Nz. Als gemeinnützig anerkannt durch das Finanzamt
> für Körperschaften I Berlin, Steuernummer 27/029/42207.
>
> ___
> Wikidata mailing list
> Wikidata@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikidata
>
>
___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] Photographers' Identities Catalog (& WikiData)

2015-12-14 Thread André Costa
I'm planning to bring a few of the datasets into mix'n'match (@Magnus: this
is the one I asked about on Twitter) in January, but not all of them are
suitable, and I believe separating KulturNav into multiple datasets on
mix'n'match makes more sense and makes it more likely that they get matched.

Some of the early adopters of KulturNav have been working with WMSE to
facilitate bi-directional matching. This is done on a dataset-by-dataset
level since different institutions are responsible for different datasets.
My hope is that mix'n'match will help in this area as well, even as a tool
for the institutions' own staff, who are often interested in matching entries
to Wikipedia (which most of the time means Wikidata).

@John: There are processes for matching KulturNav identifiers to Wikidata
entities. Only afterwards are details imported, mainly to source statements
[1] and [2]. There are some (not so user-friendly) stats at [3].

Cheers,
André

[1]
https://www.wikidata.org/wiki/Wikidata:Requests_for_permissions/Bot/L_PBot_2
[2]
https://www.wikidata.org/wiki/Wikidata:Requests_for_permissions/Bot/L_PBot_3
[3] https://tools.wmflabs.org/lp-tools/misc/data/
--
André Costa
GLAM developer
Wikimedia Sverige

Magnus Manske, 13/12/2015 11:24:

>
> Since no one mentioned it, there is a tool to do the matching to WD much
> more efficiently:
> https://tools.wmflabs.org/mix-n-match/

+1

___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata
___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] Photographers' Identities Catalog (& WikiData)

2015-12-09 Thread André Costa
Happy to be of use. There are also ones for:
* Swedish photo studios [1]
* Norwegian photographers [2]
* Norwegian photo studios [3]
I'm less familiar with these though and don't have a timeline for Wikidata
integration.

Cheers,
André

[1] http://kulturnav.org/deb494a0-5457-4e5f-ae9b-e1826e0de681
[2] http://kulturnav.org/508197af-6e36-4e4f-927c-79f8f63654b2
[3] http://kulturnav.org/7d2a01d1-724c-4ad2-a18c-e799880a0241
------
André Costa
GLAM developer
Wikimedia Sverige
On 9 Dec 2015 15:07, "David Lowe"  wrote:

> Thanks, André! I don't know that I've found that before. Great to get
> country (or region) specific lists like this.
> D
>
> On Wednesday, December 9, 2015, André Costa 
> wrote:
>
>> In case you haven't come across it before
>> http://kulturnav.org/1f368832-7649-4386-97b6-ae40cce8752b is the entry
>> point to the Swedish database of (primarily early) photographers curated by
>> the Nordic Museum in Stockholm.
>>
>> It's not that well integrated into Wikidata yet but the plan is to fix
>> that during early 2016. That would also allow a variety of photographs on
>> Wikimedia Commons to be linked to these entries.
>>
>> Cheers,
>> André
>>
>> André Costa | GLAM developer, Wikimedia Sverige |
>> andre.co...@wikimedia.se | +46 (0)733-964574
>>
>> Stöd fri kunskap, bli medlem i Wikimedia Sverige.
>> Läs mer på blimedlem.wikimedia.se
>>
>> On 9 December 2015 at 02:44, David Lowe  wrote:
>>
>>> Thanks, Tom.
>>> I'll have to look at this specific case when I'm back at work tomorrow,
>>> as it does seem you found something in error.
>>> As for my process: with WD, I queried out the label, description &
>>> country of citizenship, dob & dod of of everyone with occupation:
>>> photographer. After some cleaning, I can get the WD data formatted like my
>>> own (Name, Nationality, Dates). I can then do a simple match, where
>>> everything matches exactly. For the remainder, I then match names and
>>> dates- without Nationality, which is often very "soft" information. For
>>> those that pass a smell test (one is "English" the other is "British") I
>>> pass those along, too. For those with greater discrepancies, I look still
>>> closer. For those with still greater discrepancies, I manually,
>>> individually query my database for anyone with the same last name & same
>>> first initial to catch misspellings or different transliterations. I also
>>> occasionally put my entire database into open refine to catch instances
>>> where, for instance, a Chinese name has been given as FamilyName, GivenName
>>> in one source, and GivenName, FamilyName in another.
>>> In short, this is scrupulously- and manually- checked data. I'm not
>>> savvy enough to let an algorithm make my mistakes for me! But let me know
>>> if this seems to be more than bad luck of the draw- finding the conflicting
>>> data you found.
>>> I have also to say, I may suppress the Niepce Museum collection, as it's
>>> from a really crappy list of photographers in their collection which I
>>> found many years ago, and can no longer find. I don't want to blame them
>>> for the discrepancy, but that might be the source. I don't know.
>>> As I start to query out places of birth & death from WD in the next
>>> days, I expect to find more discrepancies. (Just today, I found dozens of
>>> folks whom ULAN gendered one way, and WD another- but were undeniably the
>>> same photographer. )
>>> Thanks,
>>> David
>>>
>>>
>>> On Tuesday, December 8, 2015, Tom Morris  wrote:
>>>
>>>> Can you explain what "indexing" means in this context?  Is there some
>>>> type of matching process?  How are duplicates resolved, if at all? Was the
>>>> Wikidata info extracted from a dump or one of the APIs?
>>>>
>>>> When I looked at the first person I picked at random, Pierre Berdoy
>>>> (ID:269710), I see that both Wikidata and Wikipedia claim that he was born
>>>> in Biarritz while the NYPL database claims he was born in Nashua, NH.  So,
>>>> it would appear that there are either two different people with the same
>>>> name, born in different places, or the birth place is wrong.
>>>>
>>>>
>>>> http://mgiraldo.github.io/pic/?&biography.TermID=2028247&Location=269710|42.7575,-71.4644
>>>> https://www.wikidata.org/wiki/Q3383941
>>>>
>>&

Re: [Wikidata] Wikidata Analyst, a tool to comprehensively analyze quality of Wikidata

2015-12-09 Thread André Costa
Nice tool!

To understand the statistics better: if a claim has two sources, one
Wikipedia and one other, how does that show up in the statistics?

The reason I'm wondering is that I would normally care whether a claim is
sourced or not (but not by how many sources) and whether it is sourced only
by Wikipedias or by anything else.

E.g.
1) an item with 10 claims, each sourced, is "better" than one with 10
claims where a single claim has 10 sources.
2) a statement with a wiki source + another source is "better" than one with
just a wiki source, and just as "good" as one without the wiki source.
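
To make point 2 concrete, the distinction I care about is roughly what this
SPARQL sketch draws (my own sketch, using the population property P1082 that
comes up further down in the thread; "imported from Wikimedia project" is
P143, and a query like this may well time out for very common properties):

    # Population (P1082) statements that have at least one reference which
    # is not a plain "imported from Wikimedia project" (P143) import.
    SELECT (COUNT(DISTINCT ?statement) AS ?count) WHERE {
      ?item p:P1082 ?statement .
      ?statement prov:wasDerivedFrom ?ref .
      MINUS { ?ref pr:P143 [] }
    }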

Also, is a wiki ref/source Wikipedia only, or any Wikimedia project? Whilst
(last I checked) the others were only 70,000 refs compared to the 21
million from Wikipedia, they might be significant for certain domains and
are just as "bad".

Cheers,
André
On 9 Dec 2015 10:37, "Gerard Meijssen"  wrote:

> Hoi,
> What would be nice is to have an option to understand progress from one
> dump to the next like you can with the Statistics by Magnus. Magnus also
> has data on sources but this is more global.
> Thanks,
>  GerardM
>
> On 8 December 2015 at 21:41, Markus Krötzsch <
> mar...@semantic-mediawiki.org> wrote:
>
>> Hi Amir,
>>
>> Very nice, thanks! I like the general approach of having a stand-alone
>> tool for analysing the data, and maybe pointing you to issues. Like a
>> dashboard for Wikidata editors.
>>
>> What backend technology are you using to produce these results? Is this
>> live data or dumped data? One could also get those numbers from the SPARQL
>> endpoint, but performance might be problematic (since you compute averages
>> over all items; a custom approach would of course be much faster but then
>> you have the data update problem).
>>
>> An obvious feature request would be to display entity ids as links to the
>> appropriate page, and maybe with their labels (in a language of your
>> choice).
>>
>> But overall very nice.
>>
>> Regards,
>>
>> Markus
>>
>>
>> On 08.12.2015 18:48, Amir Ladsgroup wrote:
>>
>>> Hey,
>>> There have been several discussions regarding the quality of information in
>>> Wikidata. I wanted to work on the quality of Wikidata but we don't have any
>>> source of good information to see where we are ahead and where we are
>>> behind. So I thought the best thing I can do is to make something to
>>> show people how exactly sourced our data is with details. So here we
>>> have *http://tools.wmflabs.org/wd-analyst/index.php*
>>>
>>> You can give only a property (let's say P31) and it gives you the four
>>> most used values + an analysis of sources and quality overall (check this
>>> out )
>>>   and then you can see about ~33% of them are sourced and 29.1% of
>>> them are based on Wikipedia.
>>> You can give a property and multiple values you want. Let's say you want
>>> to compare P27:Q183 (Country of citizenship: Germany) and P27:Q30 (US)
>>> Check this out
>>> . And
>>> you can see US biographies are more abundant (300K over 200K) but German
>>> biographies are more descriptive (3.8 descriptions per item over 3.2
>>> descriptions per item)
>>>
>>> One important note: Compare P31:Q5 (a trivial statement) 46% of them are
>>> not sourced at all and 49% of them are based on Wikipedia *but* get
>>> these statistics for population properties (P1082
>>> ) It's not a
>>> trivial statement and we need to be careful about them. It turns out
>>> there are slightly more than one reference per statement and only 4% of
>>> them are based on Wikipedia. So we can relax and enjoy these
>>> highly-sourced data.
>>>
>>> Requests:
>>>
>>>   * Please tell me whether you want this tool at all
>>>   * Please suggest more ways to analyze and catch unsourced materials
>>>
>>> Future plan (if you agree to keep using this tool):
>>>
>>>   * Support more datatypes (e.g. date of birth based on year,
>>> coordinates)
>>>   * Sitelink-based and reference-based analysis (to check how much of
>>> articles of, let's say, Chinese Wikipedia are unsourced)
>>>
>>>   * Free-style analysis: There is a database for this tool that can be
>>> used for way more applications. You can get the most unsourced
>>> statements of P31 and then you can go to fix them. I'm trying to
>>> build a playground for this kind of tasks)
>>>
>>> I hope you like this and rock on!
>>> 
>>> Best
>>>
>>>
>>> ___
>>> Wikidata mailing list
>>> Wikidata@lists.wikimedia.org
>>> https://lists.wikimedia.org/mailman/listinfo/wikidata
>>>
>>>
>>
>> ___
>> Wikidata mailing list
>> Wikidata@lists.wikimedia.org
>> https://lists.wikimedia.org/mailman/listinfo/wikidata
>>
>
>
> ___
> Wikidata mailing list

Re: [Wikidata] Photographers' Identities Catalog (& WikiData)

2015-12-09 Thread André Costa
In case you haven't come across it before
http://kulturnav.org/1f368832-7649-4386-97b6-ae40cce8752b is the entry
point to the Swedish database of (primarily early) photographers curated by
the Nordic Museum in Stockholm.

It's not that well integrated into Wikidata yet but the plan is to fix that
during early 2016. That would also allow a variety of photographs on
Wikimedia Commons to be linked to these entries.

Cheers,
André

André Costa | GLAM developer, Wikimedia Sverige | andre.co...@wikimedia.se |
 +46 (0)733-964574

Stöd fri kunskap, bli medlem i Wikimedia Sverige.
Läs mer på blimedlem.wikimedia.se

On 9 December 2015 at 02:44, David Lowe  wrote:

> Thanks, Tom.
> I'll have to look at this specific case when I'm back at work tomorrow, as
> it does seem you found something in error.
> As for my process: with WD, I queried out the label, description & country
> of citizenship, dob & dod of everyone with occupation: photographer.
> After some cleaning, I can get the WD data formatted like my own (Name,
> Nationality, Dates). I can then do a simple match, where everything matches
> exactly. For the remainder, I then match names and dates- without
> Nationality, which is often very "soft" information. For those that pass a
> smell test (one is "English" the other is "British") I pass those along,
> too. For those with greater discrepancies, I look still closer. For those
> with still greater discrepancies, I manually, individually query my
> database for anyone with the same last name & same first initial to catch
> misspellings or different transliterations. I also occasionally put my
> entire database into open refine to catch instances where, for instance, a
> Chinese name has been given as FamilyName, GivenName in one source, and
> GivenName, FamilyName in another.
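
For reference, the Wikidata side of that first step can be approximated with
a query along these lines (the IDs are my guesses at what was used, namely
occupation P106 = photographer Q33231, citizenship P27 and dates of
birth/death P569/P570; a result set this size may be slow to retrieve):

    SELECT ?person ?personLabel ?personDescription ?countryLabel ?dob ?dod
    WHERE {
      ?person wdt:P106 wd:Q33231 .             # occupation: photographer
      OPTIONAL { ?person wdt:P27  ?country . } # country of citizenship
      OPTIONAL { ?person wdt:P569 ?dob . }     # date of birth
      OPTIONAL { ?person wdt:P570 ?dod . }     # date of death
      SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
    }
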
> In short, this is scrupulously- and manually- checked data. I'm not savvy
> enough to let an algorithm make my mistakes for me! But let me know if this
> seems to be more than bad luck of the draw- finding the conflicting data
> you found.
> I have also to say, I may suppress the Niepce Museum collection, as it's
> from a really crappy list of photographers in their collection which I
> found many years ago, and can no longer find. I don't want to blame them
> for the discrepancy, but that might be the source. I don't know.
> As I start to query out places of birth & death from WD in the next days,
> I expect to find more discrepancies. (Just today, I found dozens of folks
> whom ULAN gendered one way, and WD another- but were undeniably the same
> photographer. )
> Thanks,
> David
>
>
> On Tuesday, December 8, 2015, Tom Morris  wrote:
>
>> Can you explain what "indexing" means in this context?  Is there some
>> type of matching process?  How are duplicates resolved, if at all? Was the
>> Wikidata info extracted from a dump or one of the APIs?
>>
>> When I looked at the first person I picked at random, Pierre Berdoy
>> (ID:269710), I see that both Wikidata and Wikipedia claim that he was born
>> in Biarritz while the NYPL database claims he was born in Nashua, NH.  So,
>> it would appear that there are either two different people with the same
>> name, born in different places, or the birth place is wrong.
>>
>>
>> http://mgiraldo.github.io/pic/?&biography.TermID=2028247&Location=269710|42.7575,-71.4644
>> https://www.wikidata.org/wiki/Q3383941
>>
>> Tom
>>
>>
>>
>>
>> On Tue, Dec 8, 2015 at 7:10 PM, David Lowe  wrote:
>>
>>> Hello all,
>>> The Photographers' Identities Catalog (PIC) is an ongoing project of
>>> visualizing photo history through the lives of photographers and photo
>>> studios. I have information on 115,000 photographers and studios as of
>>> tonight. It is still under construction, but as I've almost completed an
>>> initial indexing of the ~12,000 photographers in WikiData, I thought I'd
>>> share it with you. We (the New York Public Library) hope to launch it
>>> officially in mid to late January. This represents about 12 years worth of
>>> my work of researching in NYPL's photography collection, censuses and
>>> business directories, and scraping or indexing trusted websites, databases,
>>> and published biographical dictionaries pertaining to photo history.
>>> Again, please bear in mind that our programmer is still hard at work
>>> (and I continue to refine and add to the data*), but we welcome your
>>> feedback, questions, critiques, etc. To see the WikiData photographers,
>>> select WikiData from the Source dropdown. Have fu

Re: [Wikidata] Source statistics

2015-09-08 Thread André Costa
Hi,

Many thanks for the many answers.

I went for the SPARQL solution in the end since I only have a short list of
Q-numbers which can be sources. With the new query service it also has the
benefit of being something I can just turn into a URL and then give to the
GLAM to look up on the fly =)

Now I also have an excuse to learn some more SPARQL to play around with
this =)
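
For the record, the query boils down to something along these lines (a
sketch; Q216047, which Stas mentions below, stands in here for whichever
item the statements are "stated in", and the publication date part of the
reference is ignored on purpose since it may change):

    # Count statements carrying a reference that cites ("stated in", P248)
    # the given source item.
    SELECT (COUNT(DISTINCT ?statement) AS ?count) WHERE {
      ?statement prov:wasDerivedFrom ?ref .
      ?ref pr:P248 wd:Q216047 .
    }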

Cheers,
André

André Costa | GLAM-tekniker, Wikimedia Sverige | andre.co...@wikimedia.se |
+46 (0)733-964574

Stöd fri kunskap, bli medlem i Wikimedia Sverige.
Läs mer på blimedlem.wikimedia.se

On 7 September 2015 at 22:29, Stas Malyshev  wrote:

> Hi!
>
> > A small fix though: I think you should better use count(?statement)
> > rather than count(?ref), right?
>
> Yes, of course, my mistake - I modified it from different query and
> forgot to change it.
>
> > I have tried a similar query on the public test endpoint on labs
> > earlier, but it timed out for me (I was using a very common reference
> > though ;-). For rarer references, live queries are definitely the better
> > approach.
>
> Works for me for Q216047, didn't check others though. For popular
> references, the labs one may be too slow, indeed. A faster one is coming
> "real soon now" :)
>
> --
> Stas Malyshev
> smalys...@wikimedia.org
>
> ___
> Wikidata mailing list
> Wikidata@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikidata
>
___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


[Wikidata] Source statistics

2015-09-07 Thread André Costa
Hi all!

I'm wondering if there is a way (SQL, API, tool or otherwise) of finding
out how often a particular source is used on Wikidata.

The background is a collaboration with two GLAMs where we have used their
open (and CC0) datasets to add and/or source statements on Wikidata for
items on which they can be considered an authority. Now I figured it would
be nice to give them back a number for just how big the impact was.

While I can find out how many items should be affected, I couldn't find an
easy way, short of analysing each of these, to find out how many statements
were affected.

Any suggestions would be welcome.

Some details: Each reference is a P248 claim + P577 claim (where the latter
may change)

Cheers,
André / Lokal_Profil
André Costa | GLAM-tekniker, Wikimedia Sverige | andre.co...@wikimedia.se |
+46 (0)733-964574

Stöd fri kunskap, bli medlem i Wikimedia Sverige.
Läs mer på blimedlem.wikimedia.se
___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] Wiktionary-Wikidata meetup at Wikimania

2015-07-17 Thread André Costa
We are in the lobby!

 André Costa | GLAM-tekniker, Wikimedia Sverige | andre.co...@wikimedia.se |
 +46 (0)733-964574

Stöd fri kunskap, bli medlem i Wikimedia Sverige.
Läs mer på blimedlem.wikimedia.se

On 9 July 2015 at 08:18, André Costa  wrote:

> Hi all
>
> We are planning on holding a meetup[1
> <https://wikimania2015.wikimedia.org/wiki/Wiktionary-Wikidata_Meetup>] at
> Wikimania for discussing Wiktionary-Wikidata/Wikibase and its place in the
> Linked Open Data world.
>
> The recent proposal[2
> <https://www.wikidata.org/wiki/Wikidata:Wiktionary/Development/Proposals/2015-05>]
> has again made this a current topic and we believe a meetup at Wikimania
> would be a great opportunity to discuss this and related issues. It would
> therefore be especially interesting if people from both projects would want
> to show up.
>
> In addition to looking at how Wiktionary will be integrated into Wikidata
> we are also interested in seeing how a structured Wiktionary fits into the
> greater world of open data. Our personal background for this is
> a collaboration between Wikimedia Sverige and the Swedish Centre for
> Terminology[3 <http://tnc.se/the-swedish-centre-for-terminology.html>]
> who are looking at making their resources available as Linked Open Data and
> thinking about how Wikidata can fit as a node in this and also about
> whether Wikibase might be a suitable platform. That said we would be
> surprised if that was the only example out there. If you've had similar
> thoughts or know of other connections then we would love to hear about them.
>
> The meetup will take place on Friday 17 July 17:30-20:00. Location TBD.
> Sign up on the meetup page[1
> <https://wikimania2015.wikimedia.org/wiki/Wiktionary-Wikidata_Meetup>] if
> you are interested in attending and post a note on the discussion page if
> there is something specific you have been thinking about.
>
> Regards,
> André Costa / Lokal_Profil
>
> P.S. Sorry for the short notice
>
> [1] https://wikimania2015.wikimedia.org/wiki/Wiktionary-Wikidata_Meetup
> [2]
> https://www.wikidata.org/wiki/Wikidata:Wiktionary/Development/Proposals/2015-05
> [3] http://tnc.se/the-swedish-centre-for-terminology.html
>  André Costa | GLAM-tekniker, Wikimedia Sverige | andre.co...@wikimedia.se
> | +46 (0)733-964574
>
> Stöd fri kunskap, bli medlem i Wikimedia Sverige.
> Läs mer på blimedlem.wikimedia.se
>
>
___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


[Wikidata] Wiktionary-Wikidata meetup at Wikimania

2015-07-09 Thread André Costa
Hi all

We are planning on holding a meetup[1
<https://wikimania2015.wikimedia.org/wiki/Wiktionary-Wikidata_Meetup>] at
Wikimania for discussing Wiktionary-Wikidata/Wikibase and its place in the
Linked Open Data world.

The recent proposal[2
<https://www.wikidata.org/wiki/Wikidata:Wiktionary/Development/Proposals/2015-05>]
has again made this a current topic and we believe a meetup at Wikimania
would be a great opportunity to discuss this and related issues. It would
therefore be especially interesting if people from both projects would want
to show up.

In addition to looking at how Wiktionary will be integrated into Wikidata
we are also interested in seeing how a structured Wiktionary fits into the
greater world of open data. Our personal background for this is
a collaboration between Wikimedia Sverige and the Swedish Centre for
Terminology[3 <http://tnc.se/the-swedish-centre-for-terminology.html>] who
are looking at making their resources available as Linked Open Data and
thinking about how Wikidata can fit as a node in this and also about
whether Wikibase might be a suitable platform. That said we would be
surprised if that was the only example out there. If you've had similar
thoughts or know of other connections then we would love to hear about them.

The meetup will take place on Friday 17 July 17:30-20:00. Location TBD.
Sign up on the meetup page[1
<https://wikimania2015.wikimedia.org/wiki/Wiktionary-Wikidata_Meetup>] if
you are interested in attending and post a note on the discussion page if
there is something specific you have been thinking about.

Regards,
André Costa / Lokal_Profil

P.S. Sorry for the short notice

[1] https://wikimania2015.wikimedia.org/wiki/Wiktionary-Wikidata_Meetup
[2]
https://www.wikidata.org/wiki/Wikidata:Wiktionary/Development/Proposals/2015-05
[3] http://tnc.se/the-swedish-centre-for-terminology.html
 André Costa | GLAM-tekniker, Wikimedia Sverige | andre.co...@wikimedia.se |
 +46 (0)733-964574

Stöd fri kunskap, bli medlem i Wikimedia Sverige.
Läs mer på blimedlem.wikimedia.se
___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] calendar model screwup

2015-07-06 Thread André Costa
There also seems to be an issue where a year could be recorded as either
time="+000-00-00T00:00:00Z" or time="+000-01-01T00:00:00Z"
(with precision=9). At least I spotted that my earlier bot runs were doing
this.

These display the same, but if you compare the claims they show up as
different. I also guess only the latter is correct. Is there a way of
checking how many of the first kind there are, and possibly converting these
to the latter?
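
In case it helps, here is a small Python sketch of the kind of check and
conversion I have in mind. It only rewrites the time string; a real clean-up
would of course go through the API or pywikibot, and whether "-01-01" really
is the canonical form is exactly the question above.

    import re

    def normalize_year_precision(time_value, precision):
        # For year precision (9) or coarser, rewrite a "-00-00" month/day to
        # "-01-01" so that otherwise identical values compare as equal.
        # Sketch only; assumes the string keeps Wikibase's own layout.
        if precision <= 9:
            return re.sub(r'-00-00T', '-01-01T', time_value, count=1)
        return time_value

    # Example with the (truncated) values quoted above:
    print(normalize_year_precision("+000-00-00T00:00:00Z", 9))
    # -> +000-01-01T00:00:00Z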

 André Costa | GLAM-tekniker, Wikimedia Sverige | andre.co...@wikimedia.se |
 +46 (0)733-964574

Stöd fri kunskap, bli medlem i Wikimedia Sverige.
Läs mer på blimedlem.wikimedia.se

On 2 July 2015 at 20:48, Pierpaolo Bernardi  wrote:

> On Thu, Jul 2, 2015 at 4:16 PM, Neil Harris 
> wrote:
> > On 01/07/15 15:00, Pierpaolo Bernardi wrote:
> >>
> > Just for future reference, if anyone's interested, THE book on this
> topic is
> > "Calendrical Calculations".
> >
> > Alas, their code is closed-source, but the book is still the best
> reference
> > I know of.
>
> True. HOWEVER, the authors of the book are the same people who wrote
> the Emacs calendar, which is substantially the same code as the one in
> the book, and the code in Emacs is GPL, of course.  One can look at
> the book for explanations, and at Emacs calendar for code, if one
> needs free code.
>
> Cheers
> P.
>
> ___
> Wikidata mailing list
> Wikidata@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikidata
>
___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata