Re: [Wikidata-l] Wikidata for Wiktionary
Daniel's answer fits exactly with the proposal (which is unsurprising, because he reviewed and certainly influenced it). To make it clear again: the proposal at https://www.wikidata.org/wiki/Wikidata:Wiktionary/Development/Proposals/2015-05 is a proposal for the tasks that need to be performed. Your questions are mostly about the data model, which was discussed earlier in the following proposal: https://www.wikidata.org/wiki/Wikidata:Wiktionary/Development/Proposals/2013-08

Since I am not sure which questions remain open, I will try to address them here again, at the risk of repeating what has been said before. Unfortunately you seem not to use the terminology as defined in the second proposal linked above, which makes the discussion unnecessarily harder than it needs to be. If you prefer another terminology, I would be happy if you linked to a one-pager describing it, so that we can communicate effectively.

> How do we go from a spelled form of a lexeme at Wiktionary to an identifier on Wikidata?

If by "spelled form of a lexeme at Wiktionary" you mean a Form as per the proposal, then the answer is: Forms have statements, and statements may point to Items, Forms, Senses, Lexemes, etc. The exact properties to be used in these statements are up to the community. If by "spelled form of a lexeme at Wiktionary" you mean a Lexeme as per the proposal, then the answer is: Lexemes have statements, and statements may point to Items, Forms, Senses, Lexemes, etc. The exact properties to be used in these statements are up to the community. This is already stated in the second link above.

> And how do we go from one Sense to another synonym Sense?

A Sense has a set of statements, and statements may point to other Senses. The exact properties used are up to the community. So a statement with the property 'synonym' stated on a Sense could point to another Sense.

> Do we use statements?

Yes.

> But then only the L-identifiers can be used, so we will link them at the Lexeme level.

No.
As the second link above says, Senses and Forms also have statements. It is not only Lexemes that have statements.

> Wiktionary is organized around homonyms while Wikipedia is organized around synonyms, especially across languages, and I think this difference creates some of the problems.

Yes, that is why Tasks 1, 2, 9 and 10 in the proposal for the task breakdown, the first link above, deal with exactly this question. Since Gerard stated that his question was subsumed by the above list, I hope that his question is also answered?

I am afraid that I could not write a new proposal which is significantly clearer than the current one, but I can keep answering questions. All the questions you have asked seem to be explicitly answered in the two links given above. Since I know you are smart, I am wondering what is not working in the communication right now. Did you miss the first link? Without it, it is indeed hard to fully understand the second link (but the first link is already given in the second link). So, please, keep asking questions. And everyone else too. I would like to continue improving the proposals based on your questions and suggestions.

On Sat, May 16, 2015 at 3:46 PM John Erling Blad jeb...@gmail.com wrote: Your description is pretty far from what's in the proposal right now. The proposal is not clear at all, so I would say update it and resubmit it for a new discussion.

On Sat, May 16, 2015 at 12:21 PM, Daniel Kinzler daniel.kinz...@wikimedia.de wrote: On 15.05.2015 at 01:11, John Erling Blad wrote:

> How do we go from a spelled form of a lexeme at Wiktionary to an identifier on Wikidata?

What do you mean by "go to"? And what do you mean by "identifier on Wikidata" - Items, Lexemes, Senses, or Forms? Generally, Wiktionary currently combines words with the same rendering from different languages on a single page. So a single Wiktionary page would correspond to several Lexeme entries on Wikidata, since Lexemes on Wikidata would be split per language.
I suppose a Lexeme entry could be linked back to the corresponding pages on the various Wiktionaries, but I don't really see the value of that, and sitelinks are currently not planned for Lexeme entries. It probably makes more sense for the Wiktionary pages to explicitly reference the Wikidata Lexeme that corresponds to each language section on the page.

> And how do we go from one Sense to another synonym Sense? Do we use statements? But then only the L-identifiers can be used, so we will link them at the Lexeme level.

Why can only L-identifiers be used? Senses (and Forms) are entities and have identifiers. They wouldn't have a wiki page of their own, but that's not a problem. The intention is that it's possible for one Sense to have a statement referring directly to another Sense (of the same or a different Lexeme).

> Wiktionary is organized around homonyms while Wikipedia is organized around
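The model discussed in this thread (every entity type - Lexeme, Form, Sense - carries its own statements, and a statement on a Sense can point directly at another Sense) can be sketched roughly as follows. Note this is a minimal illustrative sketch: the entity IDs such as "L1-S1" and the 'synonym' property name are assumptions, since no identifier scheme or property set had been fixed at the time of this thread.

```python
class Entity:
    """Any entity (Lexeme, Form, Sense, Item) carries its own statements."""

    def __init__(self, entity_id):
        self.id = entity_id
        self.statements = []  # list of (property, target entity) pairs

    def add_statement(self, prop, target):
        self.statements.append((prop, target))


# A Lexeme groups Forms and Senses; all three can hold statements.
lexeme = Entity("L1")          # e.g. the English lexeme "big"
form = Entity("L1-F1")         # a Form: e.g. the comparative "bigger"
sense = Entity("L1-S1")        # a Sense: e.g. "of great size"
other_sense = Entity("L2-S1")  # a Sense of a different Lexeme, e.g. "large"

# A statement with a (hypothetical) 'synonym' property on a Sense points
# directly at another Sense -- no detour via the Lexeme level is needed.
sense.add_statement("synonym", other_sense)

assert sense.statements[0][1].id == "L2-S1"
```

The point of the sketch is just that statements attach to Senses and Forms themselves, not only to Lexemes, which is what the "only L-identifiers can be used" objection assumed.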
Re: [Wikidata-l] Wikidata for Wiktionary
John, sorry, I guess I was too slow - as far as I understand you have now re-read the 13-08 proposal, which has made my last email redundant. https://www.wikidata.org/w/index.php?title=Wikidata_talk:Wiktionary/Development/Proposals/2015-05&diff=216035102&oldid=216029531

I hope that the model is clear now. Thanks for your engagement! Denny

On Sun, May 17, 2015 at 12:20 PM Denny Vrandečić vrande...@gmail.com wrote: Daniel's answer fits exactly with the proposal (which is unsurprising, because he reviewed and certainly influenced it). To make it clear again: the proposal at https://www.wikidata.org/wiki/Wikidata:Wiktionary/Development/Proposals/2015-05 is a proposal for the tasks that need to be performed. Your questions are mostly about the data model, which was discussed earlier in the following proposal: https://www.wikidata.org/wiki/Wikidata:Wiktionary/Development/Proposals/2013-08 Since I am not sure which questions remain open, I will try to address them here again, at the risk of repeating what has been said before. Unfortunately you seem not to use the terminology as defined in the second proposal linked above, which makes the discussion unnecessarily harder than it needs to be. If you prefer another terminology, I would be happy if you linked to a one-pager describing it, so that we can communicate effectively.

> How do we go from a spelled form of a lexeme at Wiktionary to an identifier on Wikidata?

If by "spelled form of a lexeme at Wiktionary" you mean a Form as per the proposal, then the answer is: Forms have statements, and statements may point to Items, Forms, Senses, Lexemes, etc. The exact properties to be used in these statements are up to the community. If by "spelled form of a lexeme at Wiktionary" you mean a Lexeme as per the proposal, then the answer is: Lexemes have statements, and statements may point to Items, Forms, Senses, Lexemes, etc. The exact properties to be used in these statements are up to the community.
This is already stated in the second link above.

> And how do we go from one Sense to another synonym Sense?

A Sense has a set of statements, and statements may point to other Senses. The exact properties used are up to the community. So a statement with the property 'synonym' stated on a Sense could point to another Sense.

> Do we use statements?

Yes.

> But then only the L-identifiers can be used, so we will link them at the Lexeme level.

No. As the second link above says, Senses and Forms also have statements. It is not only Lexemes that have statements.

> Wiktionary is organized around homonyms while Wikipedia is organized around synonyms, especially across languages, and I think this difference creates some of the problems.

Yes, that is why Tasks 1, 2, 9 and 10 in the proposal for the task breakdown, the first link above, deal with exactly this question. Since Gerard stated that his question was subsumed by the above list, I hope that his question is also answered?

I am afraid that I could not write a new proposal which is significantly clearer than the current one, but I can keep answering questions. All the questions you have asked seem to be explicitly answered in the two links given above. Since I know you are smart, I am wondering what is not working in the communication right now. Did you miss the first link? Without it, it is indeed hard to fully understand the second link (but the first link is already given in the second link). So, please, keep asking questions. And everyone else too. I would like to continue improving the proposals based on your questions and suggestions.

On Sat, May 16, 2015 at 3:46 PM John Erling Blad jeb...@gmail.com wrote: Your description is pretty far from what's in the proposal right now. The proposal is not clear at all, so I would say update it and resubmit it for a new discussion.
On Sat, May 16, 2015 at 12:21 PM, Daniel Kinzler daniel.kinz...@wikimedia.de wrote: On 15.05.2015 at 01:11, John Erling Blad wrote:

> How do we go from a spelled form of a lexeme at Wiktionary to an identifier on Wikidata?

What do you mean by "go to"? And what do you mean by "identifier on Wikidata" - Items, Lexemes, Senses, or Forms? Generally, Wiktionary currently combines words with the same rendering from different languages on a single page. So a single Wiktionary page would correspond to several Lexeme entries on Wikidata, since Lexemes on Wikidata would be split per language. I suppose a Lexeme entry could be linked back to the corresponding pages on the various Wiktionaries, but I don't really see the value of that, and sitelinks are currently not planned for Lexeme entries. It probably makes more sense for the Wiktionary pages to explicitly reference the Wikidata Lexeme that corresponds to each language section on the page.

> And how do we go from one Sense to another synonym Sense? Do we use statements? But then only
Re: [Wikidata-l] Wikidata for Wiktionary
I very much appreciate OmegaWiki - it has been a trailblazer for many of the ideas in Wikidata, and as you say, it is the granddaddy in many ways. OmegaWiki has been looked into extensively, and the results of that analysis have flowed directly into the current proposal. The write-up of that analysis can be found here: https://www.wikidata.org/wiki/Wikidata:Comparison_of_Projects_and_Proposals_for_Wiktionary

On Fri, May 8, 2015 at 11:46 AM Gerard Meijssen gerard.meijs...@gmail.com wrote: Hoi, Please do appreciate that OmegaWiki, originally WiktionaryZ, really wants to be considered in all this. It is the granddaddy of Wikidata and it does combine everything you would want as far as lexical data is concerned. Thanks, GerardM

On 8 May 2015 at 18:18, Denny Vrandečić vrande...@gmail.com wrote: I very much agree with Lydia and Nemo that there should not be a separate Wikibase instance for Wiktionary data. Having a single community in a single project - not having to vote for admins here and there, keep two different watchlists, repeat documentation, rediscuss policies, etc. - sounds like a smart move. Also, the Item data and the lexical data would be much more tightly connected than with any other project, and queries should be able to work seamlessly between them. The only reason Commons is proposed to have its own instance is because the actual multimedia files are there, and the community caring about those files is there and should work in one place. If there were only a single Wiktionary project, it might also be worth considering having the structured data there - but since there are more than 150 editions of Wiktionary, a centralized place makes more sense. And since we already have Wikidata for that, I don't see the advantage of splitting the potential communities.

On Fri, May 8, 2015 at 8:35 AM Luca Martinelli martinellil...@gmail.com wrote: 2015-05-08 15:33 GMT+02:00 Federico Leva (Nemo) nemow...@gmail.com: +1.
> The Wikimedia community has long been able to think of all the Wikimedia projects as an organic whole. Software, on the other hand, has too often forced unnatural divisions. Wiktionary, Wikipedia, Commons and Wikiquote (to name the main cases) link to each other all the time in a constructive division of labour. It makes no sense to make connections between them harder.

I start from here, since Nemo got the point IMHO: the fact that every project has its own scope doesn't imply that the whole of the community works on different scopes - we just decided to split up our duties among ourselves. But it's not just that.

TL;DR: Wikidata and Wiktionary deal with the same things (concepts), and are therefore well suited for each other, given some needed adaptations. Structured Data and Structured Wikiquote deal with different things (objects), and are therefore not good examples to follow.

Long version: In theory, one might just agree that a separate instance of Wikibase might be the best solution for Wiktionary, but Structured Data and Structured Wikiquote are different from a theoretical Structured Wiktionary, because they respectively deal with images, quotes and words. Images and quotes are describable *objects*, as the Wiki* articles/pages are, and there are billions and billions of those objects out there. This is the main, if not the only, reason why we *have* to put up a separate instance of Wikibase to deal with them: thinking that Wikidata might deal with such an infinite task is just nuts. Words, on the other hand, are describable *concepts*, not objects. They can be linked to one another by relation, they have synonyms and opposites, they can be regrouped or separated, etcetera, which is exactly what we're currently doing with Wikidata items.
I know, words are even more numerous than images and quotes, so it would be even more nuts to think of dealing with this just with Wikidata - but Wikidata is *already* structured for dealing with concepts, making it the best choice for integrating data from Wiktionary. In other words, Wikidata and Wiktionary both work with *concepts*, while all the other projects work with *objects*.

From a more practical point of view, why should I have a Wikidata item about, say, present tense[1] *AND* a completely similar item on Structured Wiktionary? It's the same concept; why should I have it in two different-yet-linked databases, belonging to and maintained by the very same community? Why can't we work something out to keep all information in just one database? This is why I think that setting up a separate Wikibase for Wiktionary might end up doubling our efforts and splitting our communities, which is exactly the opposite of what we need to do (halving the efforts and doubling the community).[2] Sorry for the long post. :)

[1] https://www.wikidata.org/wiki/Q192613
[2] Not sure if I have to remark this, but please, PLEASE, note this is just
Re: [Wikidata-l] Wikidata for Wiktionary
I am not sure I understand what you are saying. The lexical data in Wikidata does allow for statements on Lexemes and Forms, as the proposal states explicitly.

On Thu, May 7, 2015 at 9:25 PM Gerard Meijssen gerard.meijs...@gmail.com wrote: Hoi, Given the opposition to having statements on the level of the label, it does not make sense to have Wiktionary included in Wikidata. Thanks, GerardM

On 8 May 2015 at 06:19, Denny Vrandečić vrande...@gmail.com wrote: I would disagree with requiring the Wiktionary communities to change their ways. Instead we should adapt our plans to fit into the way they are set up. Even if the English Wiktionary community would change to have per-language pages instead of the current system, it would be rather unlikely that all other language editions of Wiktionary would follow in a timely manner. I would prefer to leave this decision to the autonomy of the projects, and instead adapt to them (which is, by the way, what the proposal does).

Yair, as Daniel said, the current Wiktionary pages would not be mapped to Q-Items. Since this was unclear, I tried to update the text to make it clearer. Let me know if it is still confusing.

I do not think a separate Wikibase instance would be needed to provide the data for Wiktionary. I think this can and should be done on Wikidata. But as said by Milos and pointed out by Gerard, lexical knowledge does indeed require a different data schema. This is why the proposal introduces new entity types for lexemes, forms, and senses. The data model is mostly based on lexical ontologies that we surveyed, like LEMON and others.

On Thu, May 7, 2015 at 2:26 PM Federico Leva (Nemo) nemow...@gmail.com wrote: Andy Mabbett, 07/05/2015 22:53:

> The Wiktionary communities tend to strongly disagree that splitting entries per language would be easier for either editors or readers. How many languages are currently used? How will this scale to ~300 languages?

Hm?
Last time I counted, the English Wiktionary alone used way more than 300 languages.

Nemo

_______________________________________________
Wikidata-l mailing list
Wikidata-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-l
Re: [Wikidata-l] Wikidata for Wiktionary
I mean, the lexical data in Wikidata according to the proposal would allow for statements on Lexemes and Forms. I slipped into the future for a moment ;)

On Thu, May 7, 2015 at 9:32 PM Denny Vrandečić vrande...@gmail.com wrote: I am not sure I understand what you are saying. The lexical data in Wikidata does allow for statements on Lexemes and Forms, as the proposal states explicitly.

On Thu, May 7, 2015 at 9:25 PM Gerard Meijssen gerard.meijs...@gmail.com wrote: Hoi, Given the opposition to having statements on the level of the label, it does not make sense to have Wiktionary included in Wikidata. Thanks, GerardM

On 8 May 2015 at 06:19, Denny Vrandečić vrande...@gmail.com wrote: I would disagree with requiring the Wiktionary communities to change their ways. Instead we should adapt our plans to fit into the way they are set up. Even if the English Wiktionary community would change to have per-language pages instead of the current system, it would be rather unlikely that all other language editions of Wiktionary would follow in a timely manner. I would prefer to leave this decision to the autonomy of the projects, and instead adapt to them (which is, by the way, what the proposal does).

Yair, as Daniel said, the current Wiktionary pages would not be mapped to Q-Items. Since this was unclear, I tried to update the text to make it clearer. Let me know if it is still confusing.

I do not think a separate Wikibase instance would be needed to provide the data for Wiktionary. I think this can and should be done on Wikidata. But as said by Milos and pointed out by Gerard, lexical knowledge does indeed require a different data schema. This is why the proposal introduces new entity types for lexemes, forms, and senses. The data model is mostly based on lexical ontologies that we surveyed, like LEMON and others.
On Thu, May 7, 2015 at 2:26 PM Federico Leva (Nemo) nemow...@gmail.com wrote: Andy Mabbett, 07/05/2015 22:53:

> The Wiktionary communities tend to strongly disagree that splitting entries per language would be easier for either editors or readers. How many languages are currently used? How will this scale to ~300 languages?

Hm? Last time I counted, the English Wiktionary alone used way more than 300 languages.

Nemo
Re: [Wikidata-l] Wikidata for Wiktionary
I would disagree with requiring the Wiktionary communities to change their ways. Instead we should adapt our plans to fit into the way they are set up. Even if the English Wiktionary community would change to have per-language pages instead of the current system, it would be rather unlikely that all other language editions of Wiktionary would follow in a timely manner. I would prefer to leave this decision to the autonomy of the projects, and instead adapt to them (which is, by the way, what the proposal does).

Yair, as Daniel said, the current Wiktionary pages would not be mapped to Q-Items. Since this was unclear, I tried to update the text to make it clearer. Let me know if it is still confusing.

I do not think a separate Wikibase instance would be needed to provide the data for Wiktionary. I think this can and should be done on Wikidata. But as said by Milos and pointed out by Gerard, lexical knowledge does indeed require a different data schema. This is why the proposal introduces new entity types for lexemes, forms, and senses. The data model is mostly based on lexical ontologies that we surveyed, like LEMON and others.

On Thu, May 7, 2015 at 2:26 PM Federico Leva (Nemo) nemow...@gmail.com wrote: Andy Mabbett, 07/05/2015 22:53:

> The Wiktionary communities tend to strongly disagree that splitting entries per language would be easier for either editors or readers. How many languages are currently used? How will this scale to ~300 languages?

Hm? Last time I counted, the English Wiktionary alone used way more than 300 languages.

Nemo
Re: [Wikidata-l] Wikidata for Wiktionary
The work on queries and arbitrary access is well on its way, and the new UI is also continually being developed and deployed. I don't think that it is too early to think about, and gather consensus on, what the steps for Wiktionary could look like. I am certainly not proposing to stop the current work on queries, but merely to create realistic tasks for the Wiktionary phase of Wikidata.

On Wed, May 6, 2015, 21:54 Gerard Meijssen gerard.meijs...@gmail.com wrote: Hoi, Would it not make sense to FIRST finish a few things... like Commons and Query? Thanks, GerardM

On 7 May 2015 at 04:54, Denny Vrandečić vrande...@gmail.com wrote: It is rather clear that everyone wants Wikidata to also support Wiktionary, and there have been plenty of proposals in the last few years. I think that the latest proposals are sufficiently similar to go for the next step: a breakdown of the tasks needed to get this done. Currently, the idea of having Wikidata support Wiktionary is stalled because it is regarded as one large monolithic task, and as such it is hard to plan and commit to. I tried to come up with a task breakdown, discussed it with Lydia and Daniel, and now, as said in the last office hour, here it is for discussion and community input: https://www.wikidata.org/wiki/Wikidata:Wiktionary/Development/Proposals/2015-05

I think it would be really awesome if we started moving in this direction. Wiktionary supported by Wikidata could quickly become one of the crucial pieces of infrastructure for the Web as a whole, and in particular for Wikipedia and its future development. Cheers, Denny
Re: [Wikidata-l] novalue in qualifiers or references
Actually, I think that having "no value" for the end date qualifier probably means that it has not ended yet. There is no other way to express whether this information is currently merely incomplete (i.e. it has ended, but no one bothered to fill it in) or not (i.e. it has not ended yet). This is pretty much the same use case as for normal claims. Another qualifier where an explicit "no value" would make sense is P678, I guess. In references it might make sense to state explicitly that the source does not have an issue number or an ISSN, etc., in order, for example, to allow cleanup of references and to distinguish the cases where a reference genuinely does not have a given value from those where it is merely incomplete. I don't have super-strong arguments, as you see (I would have much stronger arguments for "unknown value"), but I would prefer not to explicitly forbid "no value" in those cases, because it might be useful and it is already there.

[1] https://www.wikidata.org/wiki/Special:WhatLinksHere/Q18615010

On Thu, Apr 23, 2015 at 1:27 PM, Stas Malyshev smalys...@wikimedia.org wrote: Hi! I was lately looking into the use of "novalue" in Wikidata, specifically in qualifiers and references. While the use of "novalue" in property values is pretty clear to me, I am not sure it is as useful in qualifiers and refs. Example: https://www.wikidata.org/wiki/Q62#P6 As we can see, Edwin Mah Lee is the mayor of San Francisco, with the end date set to "novalue". I wonder how useful this is - most entries like this just omit the end date, and if we query this in SPARQL, for example, we would do something like FILTER NOT EXISTS { ?statement q:P582 ?enddate }. Inconsistently having "novalue" there makes it harder to process both visually (instead of just looking for entries with no end date, we need to look for either no end date or an end date with the specific "novalue") and automatically. And in the overwhelming majority of cases I feel that "novalue" and the absence of a value model exactly the same fact - it is a current event, etc.
Is there any useful case for using "novalue" there? Another example: https://www.wikidata.org/wiki/Q2866#P569 Here we have a reference with "stated in: no value". I don't think I understand what it means - not stated anywhere? How would we know to make such a claim? Is it a lie? Why would we keep confirmed lies in the data? Does it not have a confirmed source that we know of? Many things don't; why would we have "stated in" in this particular case? In summary, it is unclear to me that "novalue" in references is ever useful.

To quantify this: we do not have a lot of such things. In the partial dump I'm working with for WDQS (which contains at least half of the DB) there are 14 "novalue" refs and 13 properties using "novalue" as a qualifier, the leader being P582 with 200+ uses, and 422 uses overall. So volume-wise it's not a big deal, but I'd like to figure out what the right thing to do here is and establish some guidelines.

Thanks,
-- Stas Malyshev smalys...@wikimedia.org
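The three-way distinction this thread is debating - an explicit "novalue" (we know there is no end date: the event is ongoing), a plain value (it ended), and an absent qualifier (nobody filled it in) - can be made concrete with a small sketch. The representation below is purely illustrative and not the actual Wikibase data format or API.

```python
# Sentinel for an explicit "novalue" snak: an asserted absence,
# distinct from the qualifier simply not being present at all.
NOVALUE = object()

def end_date_state(qualifiers):
    """Classify a statement's P582 (end date) qualifier into the three
    states discussed in the thread. `qualifiers` maps property IDs to
    values; this dict shape is an illustrative assumption."""
    if "P582" not in qualifiers:
        return "incomplete"  # absent: maybe ended, maybe not -- unknown
    if qualifiers["P582"] is NOVALUE:
        return "ongoing"     # explicitly asserted: there is no end date
    return "ended"           # a concrete end date is recorded

assert end_date_state({}) == "incomplete"
assert end_date_state({"P582": NOVALUE}) == "ongoing"
assert end_date_state({"P582": "2011-01-11"}) == "ended"
```

This is Denny's argument in code form: if "novalue" were forbidden in qualifiers, the "ongoing" and "incomplete" cases would collapse into one, and the data could no longer say which is which.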
Re: [Wikidata-l] World's largest cities with a female mayor :-)
This is seriously awesome! Thank you!

On Mon, Apr 20, 2015 at 1:18 PM Markus Krötzsch mar...@semantic-mediawiki.org wrote: Hi all, For many years, Denny and I have been giving talks about why we need to improve the data management in Wikipedia. To explain and motivate this, we have often asked the simple question: What are the world's largest cities with a female mayor? The information to answer this is clearly in Wikipedia, but it would be painfully hard to get the result by reading articles. I recently had the occasion of actually phrasing this in SPARQL, so that an answer can now, finally, be given. The query to run at http://milenio.dcc.uchile.cl/sparql is as follows (with some explaining comments inline):

PREFIX : <http://www.wikidata.org/entity/>
SELECT DISTINCT ?city ?citylabel ?mayorlabel
WHERE {
  ?city :P31c/:P279c* :Q515 .  # find instances of subclasses of city
  ?city :P6s ?statement .      # with a P6 (head of government) statement
  ?statement :P6v ?mayor .     # ... that has the value ?mayor
  ?mayor :P21c :Q6581072 .     # ... where the ?mayor has P21 (sex or gender) female
  FILTER NOT EXISTS { ?statement :P582q ?x }  # ... but the statement has no P582 (end date) qualifier
  # Now select the population value of the ?city
  # (the number is reached through a chain of three properties)
  ?city :P1082s/:P1082v/<http://www.wikidata.org/ontology#numericValue> ?population .
  # Optionally, find English labels for city and mayor:
  OPTIONAL { ?city rdfs:label ?citylabel . FILTER ( LANG(?citylabel) = "en" ) }
  OPTIONAL { ?mayor rdfs:label ?mayorlabel . FILTER ( LANG(?mayorlabel) = "en" ) }
}
ORDER BY DESC(?population)
LIMIT 100

To see the results, just paste this into the box at http://milenio.dcc.uchile.cl/sparql and press "Run query". The query does not filter the most recent population but relies on Virtuoso to pick the biggest value for DESC sorting, and on the world to have (mostly) cities with increasing population numbers over time.
This is also the reason why the population is not printed (it would give you more than one match per city then, even with DISTINCT). Picking the current population will become easier once ranks are used more widely to mark it. There might also be some inaccuracies in cases where a past mayor does not have an end date set in Wikidata (Madrid has a suspiciously large number of current mayors ...), but a query can only ever be as good as its input data.

I hope this is inspiring to some of you. One could also look for the world's youngest or oldest current mayors with similar queries, for example. Cheers, Markus
[Wikidata-l] Initial release of the primary sources tool
I am happy to let you know about the initial release of the primary sources tool. More info is available here: https://www.wikidata.org/wiki/Wikidata:Primary_sources_tool

The release is meant to facilitate your feedback. There are probably plenty of things that should be fixed before the tool gets widely used. Please report the issues: https://github.com/google/primarysources/issues Even better are pull requests!

A huge shoutout to Sebastian Schaffert (backend) and Thomas Steiner (frontend), who worked on the tool in their 20% time.
Re: [Wikidata-l] ViziData
Any time property, or the birth date property specifically?

On Tue Feb 24 2015 at 10:58:09 AM Maximilian Klein isa...@gmail.com wrote: Next research question: {q | ∃i: instance_of(i, q) ∧ has_time_property(i) ∧ has_geo_property(i)}. In this case we know humans (Q5) are things that have time properties and geo properties. What are all the types of things that have time and geo properties? Make a great day, Max Klein ‽ http://notconfusing.com/

On Mon, Feb 23, 2015 at 10:57 AM, Georg Wild georg.w...@mailbox.tu-dresden.de wrote: On 23.02.2015 19:26, Maximilian Klein wrote:

> Georg, Nice viz! In your example you show births and deaths, so is your dataset {items with date of birth} ∩ {items with place of birth}? In general, are you thinking about visualising items that have both a geo-coordinate and a time-coordinate?

Yes, that is correct - or to be more precise, all items that are humans (instance of Q5) and have both a time- and a geo-coordinate. So if, for example, an item had a birth date specified but no place of birth (or the other way around), it would not be included in the extracted births dataset. Oh, and something that I forgot to mention earlier: the application is quite performance-intensive and is best viewed in Chromium (or Chrome) if possible.

Make a great day, Max Klein ‽ http://notconfusing.com/

On Thu, Feb 19, 2015 at 8:59 AM, Lydia Pintscher lydia.pintsc...@wikimedia.de wrote: On Thu, Feb 19, 2015 at 2:46 PM, Georg Wild georg.w...@mailbox.tu-dresden.de wrote: Hello Wikidatans, I'd like to quickly introduce you to ViziData [1], a data visualization app that I wrote as part of my bachelor's thesis last year and will work on improving in the coming months. It displays the geographical and temporal location of events (currently only births and deaths of humans are available in the prototype). The data is extracted from Wikidata with Wikidata Toolkit.
The tool is meant to show an interesting use of the data in Wikidata (especially larger amounts) and can also give an impression of the quality and completeness of the collected data on a larger scale. Planned improvements include: * other datasets to display * a more efficient and useful timeline widget * an embedded tile map for orientation * canvas rendering for performance * information about events (e.g. listing persons who were born at a selected point) * code quality :S The source is available on GitHub under the MIT license [2] and the corresponding paper can be read online [3] (in German only, though). If you have any questions or concerns about this project, feel free to contact me :] Very nice, Georg! Looking forward to more data sets to display. Cheers Lydia -- Lydia Pintscher - http://about.me/lydia.pintscher Product Manager for Wikidata Wikimedia Deutschland e.V. Tempelhofer Ufer 23-24 10963 Berlin www.wikimedia.de Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e. V. Eingetragen im Vereinsregister des Amtsgerichts Berlin-Charlottenburg unter der Nummer 23855 Nz. Als gemeinnützig anerkannt durch das Finanzamt für Körperschaften I Berlin, Steuernummer 27/681/51985. -- ☘ excellentiā excelsiōre ☘ ___ Wikidata-l mailing list Wikidata-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-l
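Max's research question above — find the classes q such that instances i of q carry both a time property and a geo property — can be sketched in a few lines. This is a toy illustration only: the item records and the sets of "time" and "geo" property IDs are assumptions for the example, and a real answer would scan the Wikidata dumps (e.g. with Wikidata Toolkit) rather than an in-memory dict.

```python
# Toy data: a handful of items with their classes and the property IDs
# they use. The property groupings below are illustrative assumptions.
items = {
    "Q1339":  {"instance_of": ["Q5"],    "props": {"P569", "P19"}},   # a human: birth date + birthplace
    "Q64":    {"instance_of": ["Q515"],  "props": {"P571", "P625"}},  # a city: inception + coordinates
    "Q11500": {"instance_of": ["Q8502"], "props": {"P625"}},          # a mountain: geo only
}

TIME_PROPS = {"P569", "P570", "P571"}  # birth date, death date, inception
GEO_PROPS = {"P19", "P20", "P625"}     # birthplace, deathplace, coordinates

def classes_with_time_and_geo(items):
    """Collect the classes whose instances have at least one time and one geo property."""
    result = set()
    for item in items.values():
        if item["props"] & TIME_PROPS and item["props"] & GEO_PROPS:
            result.update(item["instance_of"])
    return result

print(classes_with_time_and_geo(items))  # the human and the city qualify; the mountain does not
```

On real data the same set-intersection logic applies, just streamed over millions of items instead of three.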
Re: [Wikidata-l] Call for development openness
Also, Gerard - you are one to quickly chide others for not being constructive in their criticism, and I very much appreciate you doing so. I would like to ask you to reconsider whether your contribution to this thread meets your own threshold for being constructive. Can we please stop being hurtful and dismissive of each other? We have a great project, riding an amazing wave, and there's too much for each one of us to do to afford to hurt each other and make this a place less nice than it could be. On Fri Feb 20 2015 at 1:44:53 PM Denny Vrandečić vrande...@google.com wrote: Regarding Paul's comment: I first heard about Wikidata at SemTech in San Francisco and I was told very directly that they were not interested in working with anybody who was experienced with putting data from generic database in front of users because they had worked so hard to get academic positions and get a grant from the Allen Institute and it is more cost-effective and more compatible with academic advancement to hire a bunch of young people who don't know anything but will follow orders. I am, frankly, baffled by this story. It very likely was me, presenting Wikidata at SemTech in SF, so it probably was me you have been talking with, but I have no recollection of a conversation going the way you describe it. If I remember the timing correctly, I didn't have an academic position at the time of SemTech. Actually, I gave up my academic position to move to Berlin and work on Wikidata. The donors on Wikidata never exercised any influence on the projects, beyond requiring reports on the progress. I cannot imagine that I would ever have said that we were not interested in working with anybody who was experienced with putting data from generic database in front of users, because, really, that would make no sense to say. I also do not remember having gotten an application from you. 
Regarding the team that we wanted and eventually did hire, I would sternly disagree with the description of a bunch of young people who don't know anything but will follow orders - from the applications we got we chose the most suitable team we could pull together. And considering the discussions we had in the following months, following orders was neither their strength nor the qualification they were chosen for. Nor did they consist only of young people. Instead, it turned out, they were exactly the kind of independent thinkers with dedication to the goal and quality that we were aiming for. Fortunately, for the project. Maybe the conversation went differently than you are remembering it. E.g. I would have insisted on building Wikidata on top of MediaWiki (for operational reasons). E.g. I would have insisted that everyone working on Wikidata move to Berlin (because I thought it would be the only possibility to get the project to an acceptable state in the original timeframe, so that we can ensure its future sustainability). E.g. I would have disagreed that RDF/SPARQL backends could, back then, be used out of the box as Wikidata's backend (but I would have been open for anyone showing me that I was wrong, and indeed very happy, because, seriously, I have an unreasonable fondness for SPARQL and RDF). E.g. I would have disagreed that our job as Wikimedia is to spend too many resources on pretty frontends (because that is something the community can do, and as we see, is doing very well - I think Wikimedia should really concentrate on those pieces of work that cannot and are not being done by the community). E.g. I would have insisted on not outsourcing any major part of the development effort to an external service provider. E.g. it could be that we already had all positions filled, and simply no money for more people (really depends on the timing).
So there are plenty of points we might have disagreed on, and which, maybe misunderstood, maybe subtly altered by the passage of time in a fallible memory, have led to the recollection of our conversation that you presented, but, for the reasons mentioned above, I think that your recollection is incorrect. On Fri Feb 20 2015 at 12:42:44 PM Daniel Kinzler daniel.kinz...@wikimedia.de wrote: Hi Paul! I understand your frustration, but let me put a few things into perspective. For reference: I'm employed by WMDE and work on wikibase/wikidata. I have been working on MediaWiki since 2005, and have been paid for it since 2008. On 20.02.2015 at 19:14, Paul Houle wrote: I am not an academic. The people behind Wikidata are. To the extent that most of us have some college degree. The only full academic involved is Markus Krötzsch, who together with Denny Vrandecic developed many of the concepts behind Wikidata. He acts as an advisor to the Wikidata project, but doesn't have any formal position. Oh, we also have a group of students working on their bachelor project with us. I first heard about Wikidata
Re: [Wikidata-l] Call for development openness
Regarding Paul's comment: I first heard about Wikidata at SemTech in San Francisco and I was told very directly that they were not interested in working with anybody who was experienced with putting data from generic database in front of users because they had worked so hard to get academic positions and get a grant from the Allen Institute and it is more cost-effective and more compatible with academic advancement to hire a bunch of young people who don't know anything but will follow orders. I am, frankly, baffled by this story. It very likely was me, presenting Wikidata at SemTech in SF, so it probably was me you have been talking with, but I have no recollection of a conversation going the way you describe it. If I remember the timing correctly, I didn't have an academic position at the time of SemTech. Actually, I gave up my academic position to move to Berlin and work on Wikidata. The donors to Wikidata never exercised any influence on the project, beyond requiring reports on the progress. I cannot imagine that I would ever have said that we were not interested in working with anybody who was experienced with putting data from a generic database in front of users, because, really, that would make no sense to say. I also do not remember having gotten an application from you. Regarding the team that we wanted and eventually did hire, I would sternly disagree with the description of a bunch of young people who don't know anything but will follow orders - from the applications we got we chose the most suitable team we could pull together. And considering the discussions we had in the following months, following orders was neither their strength nor the qualification they were chosen for. Nor did they consist only of young people. Instead, it turned out, they were exactly the kind of independent thinkers with dedication to the goal and quality that we were aiming for. Fortunately, for the project.
Maybe the conversation went differently than you are remembering it. E.g. I would have insisted on building Wikidata on top of MediaWiki (for operational reasons). E.g. I would have insisted that everyone working on Wikidata move to Berlin (because I thought it would be the only possibility to get the project to an acceptable state in the original timeframe, so that we can ensure its future sustainability). E.g. I would have disagreed that RDF/SPARQL backends could, back then, be used out of the box as Wikidata's backend (but I would have been open for anyone showing me that I was wrong, and indeed very happy, because, seriously, I have an unreasonable fondness for SPARQL and RDF). E.g. I would have disagreed that our job as Wikimedia is to spend too many resources on pretty frontends (because that is something the community can do, and as we see, is doing very well - I think Wikimedia should really concentrate on those pieces of work that cannot and are not being done by the community). E.g. I would have insisted on not outsourcing any major part of the development effort to an external service provider. E.g. it could be that we already had all positions filled, and simply no money for more people (really depends on the timing). So there are plenty of points we might have disagreed on, and which, maybe misunderstood, maybe subtly altered by the passage of time in a fallible memory, have led to the recollection of our conversation that you presented, but, for the reasons mentioned above, I think that your recollection is incorrect. On Fri Feb 20 2015 at 12:42:44 PM Daniel Kinzler daniel.kinz...@wikimedia.de wrote: Hi Paul! I understand your frustration, but let me put a few things into perspective. For reference: I'm employed by WMDE and work on wikibase/wikidata. I have been working on MediaWiki since 2005, and have been paid for it since 2008. On 20.02.2015 at 19:14, Paul Houle wrote: I am not an academic. The people behind Wikidata are.
To the extent that most of us have some college degree. The only full academic involved is Markus Krötzsch, who together with Denny Vrandecic developed many of the concepts behind Wikidata. He acts as an advisor to the Wikidata project, but doesn't have any formal position. Oh, we also have a group of students working on their bachelor project with us. I first heard about Wikidata at SemTech in San Francisco and I was told very directly that they were not interested in working with anybody who was experienced with putting data from generic database in front of users because they had worked so hard to get academic positions and get a grant from the Allen Institute and it is more cost-effective and more compatible with academic advancement to hire a bunch of young people who don't know anything but will follow orders. Ouch. Working with such people would be a drag. Luckily, we have an awesome team of full-blooded programmers. Not that we get everything right, or done in time... RDF* and SPARQL* do not
Re: [Wikidata-l] Call for development openness
Also, the problem most SPARQL backend developers worried about was not Wikidata's size, but its dynamicity. Not the number of triples, but the frequency of edits. And we did talk to many of those people. On Thu, Feb 19, 2015, 07:05 Markus Krötzsch mar...@semantic-mediawiki.org wrote: Hi Paul, Re RDF*/SPARQL*: could you send a link? Someone has really made an effort to find the least googleable terminology here ;-) Re relying on standards: I think this argument is missing the point. If you look at what developers in Wikidata are concerned with, it is +90% interface and internal data workflow. This would be exactly the same no matter which data standard you would use. All the challenges of providing a usable UI and a stable API would remain the same, since a data encoding standard does not help with any of this. If you have followed some of the recent discussion on the DBpedia mailing list about the UIs they have there, you can see that Wikidata is already in a very good position in comparison when it comes to exposing data to humans (thanks to Magnus, of course ;-). RDF is great but there are many problems that it does not even try to solve (rightly so). These problems seem to be dominant in the Wikidata world right now. This said, we are in a great position to adopt new standards as they come along. I agree with you on the obvious relationships between Wikidata statements and the property graph model. We are well aware of this. Graph databases are being considered for providing query solutions to Wikidata, and we are considering setting up a SPARQL endpoint for our existing RDF as well. Overall, I don't see a reason why we should not embrace all of these technologies as they suit our purpose, even if they were not available yet when Wikidata was first conceived. Re It is also exciting that vendors are getting on board with this and we are going to be seeing some stuff that is crazy scalable (way past 10^12 facts on commodity hardware) very soon. [which vendors?]
[citation needed] ;-) We would be very interested in learning about such technologies. After the recent end of Titan, the discussion of query answering backends is still ongoing. Cheers, Markus On 18.02.2015 21:25, Paul Houle wrote: What bugs me about it is that Wikidata has gone down the same road as Freebase and Neo4J in the sense of developing something ad hoc that is not well understood. I understand the motivations that lead there, because there are requirements to meet that standards don't necessarily satisfy, plus Wikidata really is doing ambitious things in the sense of capturing provenance information. Perhaps it has come a little too late to help with Wikidata, but it seems to me that RDF* and SPARQL* have a lot to offer for data wikis in that you can view data as plain ordinary RDF and query with SPARQL, but you can also attach provenance and other metadata in a sane way, with sweet syntax for writing it in Turtle or querying it in other ways. Another way of thinking about it is that RDF* is formalizing the property graph model, which has always been ad hoc in products like Neo4J. I can say that knowing what the algebra is you are implementing helps a lot in getting the tools to work right. So you not only have SPARQL queries as a possibility but also languages like Gremlin and Cypher, and this is all pretty exciting. It is also exciting that vendors are getting on board with this and we are going to be seeing some stuff that is crazy scalable (way past 10^12 facts on commodity hardware) very soon. On Tue, Feb 17, 2015 at 12:20 PM, Jeroen De Dauw jeroended...@gmail.com wrote: Hey, As Lydia mentioned, we obviously do not actively discourage outside contributions, and will gladly listen to suggestions on how we can do better. That being said, we are actively taking steps to make it easier for developers not already part of the community to start contributing.
For instance, we created a website about our software itself [0], which lists the MediaWiki extensions and the different libraries [1] we created. For most of our libraries, you can just clone the code and run composer install, and then you're all set. You can make changes, run the tests and submit them back. A different workflow than what you as a MediaWiki developer are used to, perhaps, though quite a bit simpler. Furthermore, we've been quite progressive in adopting practices and tools from the wider PHP community. I definitely do not disagree with you that some things could, and should, be improved. Like you, I'd like to see the Wikibase git repository and the naming of the extensions be aligned more, since it indeed is confusing. Increased API stability, especially of the JavaScript one, is something else on my wish-list, amongst a lot of other
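Paul's point in the thread above — that a Wikidata statement is not a bare triple but a triple plus qualifiers and references, which is exactly what RDF* and the property graph model capture — can be made concrete with a tiny data structure. This is a sketch with illustrative values only (the reference URL is made up), not the actual Wikibase data model.

```python
# A Wikidata-style statement as an edge with attached metadata,
# in the spirit of the property graph model Paul describes.

def make_statement(subject, predicate, obj, qualifiers=None, references=None):
    """Bundle a triple with its qualifiers and references."""
    return {
        "triple": (subject, predicate, obj),
        "qualifiers": qualifiers or {},
        "references": references or [],
    }

# "Douglas Adams (Q42) was educated at (P69) St John's College (Q691283)",
# qualified with an end date and backed by a (made-up) source URL.
stmt = make_statement(
    "Q42", "P69", "Q691283",
    qualifiers={"P582": "1974"},                          # end time
    references=[{"P854": "http://example.org/source"}],   # reference URL, illustrative
)

# In RDF* the same information nests the triple as a subject, roughly:
#   << wd:Q42 wdt:P69 wd:Q691283 >> pq:P582 "1974" .
# whereas plain RDF would need a reification node for the statement.
print(stmt["triple"], stmt["qualifiers"])
```

The point of the comparison: the metadata hangs off the edge itself, not off either endpoint, which is what plain triples cannot express directly.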
Re: [Wikidata-l] Wikidata CACM article (Was: Conflict of Interest policy for Wikidata)
Yes, CC-BY is great. On Thu Jan 08 2015 at 7:01:12 AM Markus Krötzsch mar...@semantic-mediawiki.org wrote: On 08.01.2015 15:10, ja...@j1w.xyz wrote: Prior to viewing Markus Krötzsch's Wikidata page, I was unaware of the Wikidata: A Free Collaborative Knowledgebase article [1] written by Denny Vrandečić and Markus Krötzsch. This is a very helpful article that in my opinion should be featured on the Wikidata main page. Glad you liked it. Checking the Wikidata item, I notice that it is actually Open Access and not all rights reserved. It is available for free (forever) from the ACM [1], but it seems they do not define any license. However, as we have retained all the rights, we can do what we like there. Denny, shall we use CC-BY? Markus [1] http://cacm.acm.org/magazines/2014/10/178785-wikidata/fulltext Regards, James Weaver On Wed, Jan 7, 2015, at 05:14 PM, Markus Krötzsch wrote: Irrespective of the general policy discussion, I have now been bold and changed my item and user page to record that relationship as per my earlier suggestion (as copied below): https://www.wikidata.org/wiki/Q18618630 I was wondering if, given that we have single sign-on, 'website account on' should point to Wikidata or to Wikimedia or something else. But besides this minor point, this seems to be a nice way to have COI declarations in the data (it would also be interesting to know which living people have official Wikimedia accounts). Cheers, Markus On 07.01.2015 15:25, Markus Krötzsch wrote: ... In addition, there should be a template that one can use on one's user page to disclose that one is the person described in a certain item. Conversely, we should also use our 'website account on' property (P553) to connect living people to their Wikidata user account, so the COI is recorded in the data.
One could further disclose other COIs on one's user page in some standard format, but maybe with Wikidata we could actually derive such COIs automatically (your family members, the companies you founded, the university you graduated from, etc. can all be specified in data). Cheers, Markus ___ Wikidata-l mailing list Wikidata-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-l
Re: [Wikidata-l] Conflict of Interest policy for Wikidata
<completely-self-serving>Yay! I would love to see it featured on the Wikidata main page! Let's slashdot ACM :)</completely-self-serving> On Thu Jan 08 2015 at 6:11:57 AM ja...@j1w.xyz wrote: Prior to viewing Markus Krötzsch's Wikidata page, I was unaware of the Wikidata: A Free Collaborative Knowledgebase article [1] written by Denny Vrandečić and Markus Krötzsch. This is a very helpful article that in my opinion should be featured on the Wikidata main page. [1] http://cacm.acm.org/magazines/2014/10/178785-wikidata/fulltext Regards, James Weaver On Wed, Jan 7, 2015, at 05:14 PM, Markus Krötzsch wrote: Irrespective of the general policy discussion, I have now been bold and changed my item and user page to record that relationship as per my earlier suggestion (as copied below): https://www.wikidata.org/wiki/Q18618630 I was wondering if, given that we have single sign-on, 'website account on' should point to Wikidata or to Wikimedia or something else. But besides this minor point, this seems to be a nice way to have COI declarations in the data (it would also be interesting to know which living people have official Wikimedia accounts). Cheers, Markus On 07.01.2015 15:25, Markus Krötzsch wrote: ... In addition, there should be a template that one can use on one's user page to disclose that one is the person described in a certain item. Conversely, we should also use our 'website account on' property (P553) to connect living people to their Wikidata user account, so the COI is recorded in the data. One could further disclose other COIs on one's user page in some standard format, but maybe with Wikidata we could actually derive such COIs automatically (your family members, the companies you founded, the university you graduated from, etc. can all be specified in data).
Cheers, Markus ___ Wikidata-l mailing list Wikidata-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-l
Re: [Wikidata-l] [Mediawiki-api] Freebase like API with an OUTPUT feature ?
Actually, since Wikidata now allows properties on properties, one might easily create an item 'Disambiguating property' and then make a claim instance of - Disambiguating property on the relevant property. There is no need for any extra implementation work. On Wed Jan 07 2015 at 9:48:32 AM Thad Guidry thadgui...@gmail.com wrote: Hi Lydia, It's more than that. I can get labels just fine with props=labels. Ideally there would be a number 3, a reconcile service, or an API that can be USED as a reconcile service. Given a search string of Paris, let's say... 1. Return some disambiguating properties and their labels and values. For reconciling purposes, you don't want to deal with codes like P12345 but instead a human-understandable description of the property. a. Allow the output of the information returned to be expanded or reduced by some parameter values that I mentioned as OUTPUT. b. Allow the use of a (disambiguator) parameter to output only the disambiguating properties. (Disambiguating properties are those that are most important when comparing A = B and given a type.) In the Freebase API, we had the option of this as shown here: http://freebase-search.freebaseapps.com/?query=Texas&output=(disambiguator)&limit=100&scoring=entity&lang=en The current disambiguator with Wikidata is actually the descriptions. Wikidata does not flag or mark properties like P856 (official site) as a disambiguating property, an important property. Freebase does, however. It would be nice for Wikidata to begin work on having a disambiguating property flag (boolean Y/N) like Freebase does. The closest starting point for a Reconcile API with the current API structure that I can see is hacking a bit on this one: https://www.wikidata.org/w/api.php?action=wbgetentities&sites=enwiki&titles=Paris&languages=en&props=descriptions|claims Btw, that closest starting point only outputs 1 entity for Paris in the enwiki... where's Paris, Texas?
Thad +ThadGuidry https://www.google.com/+ThadGuidry ___ Wikidata-l mailing list Wikidata-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-l
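The reconcile service Thad describes — rank candidate entities for a search string using "disambiguating" properties plus caller context — can be sketched with toy data. Everything here is an assumption for illustration: the candidate records, the set of flagged properties, and the scoring are made up, since (as Thad notes) Wikidata has no such flag today.

```python
# Hypothetical set of properties flagged as "disambiguating":
# P17 = country, P856 = official website.
DISAMBIGUATING = {"P856", "P17"}

# Toy candidates for the search string "Paris".
candidates = [
    {"id": "Q90",    "label": "Paris", "claims": {"P17": "Q142", "P856": "paris.fr"}},
    {"id": "Q16559", "label": "Paris", "claims": {"P17": "Q30"}},  # Paris, Texas
]

def reconcile(query, context, candidates):
    """Score label matches, then break ties on disambiguating-property matches."""
    scored = []
    for c in candidates:
        score = 1.0 if c["label"].lower() == query.lower() else 0.0
        for prop in DISAMBIGUATING:
            if c["claims"].get(prop) == context.get(prop):
                score += 1.0
        scored.append((score, c["id"]))
    return [cid for _, cid in sorted(scored, reverse=True)]

# A caller who knows the entity is in the USA (Q30) gets Paris, Texas first.
print(reconcile("Paris", {"P17": "Q30"}, candidates))
```

The point of the sketch is the role of the flag: without a machine-readable notion of which properties disambiguate, the service would have to fall back on free-text descriptions, which is exactly the limitation being discussed.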
[Wikidata-l] Conflict of Interest policy for Wikidata
I found out the other day that there's an item about myself, and I wanted to edit it, and got a weird feeling about it. So I raised the question on the project chat https://www.wikidata.org/wiki/Wikidata:Project_chat#COI_and_editing and got told that an RFC would be a good idea. So I tried one. I don't think it has caused problems yet, though - but it might be easier to discuss these things before they cause problems. https://www.wikidata.org/wiki/Wikidata:Requests_for_comment/Conflict_of_Interest Input is highly appreciated. ___ Wikidata-l mailing list Wikidata-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-l
Re: [Wikidata-l] How to declare a property is transitive, etc.
In OWL this is done through instance of (i.e. rdf:type) pointing at a Transitive Property class (owl:TransitiveProperty). So the most similar representation of that in Wikidata would be to have an item for transitive property, and make an instance of: transitive property statement on the respective property. Obvious caveat: for now this is just a syntactic marker, the system does not do anything special with it. But a SPARQL endpoint with an OWL regime, or an OWL reasoner, could make the inferences if this statement is appropriately translated. Hope that helps, Denny On Thu Dec 18 2014 at 7:45:43 PM Emw emw.w...@gmail.com wrote: Hi all, Could those knowledgeable about OWL or intending to use Wikidata's RDF / OWL exports please weigh in at https://www.wikidata.org/wiki/Wikidata:Property_proposal/Property_metadata#How_should_we_declare_that_a_property_is_transitive ? [1] Being able to declare certain properties of properties is an essential building block for querying and inference. However, the way to declare that a property is, say, transitive in OWL does not have a clear analog in Wikidata syntax. We could certainly shoehorn such a statement into our existing model (and it looks like we'll need to), but it is important to do so in a way that complicates things as little as possible for downstream users, e.g. outside researchers or developers using the RDF exports and assuming standard OWL semantics. Please make any comments on this on-wiki at the location linked above. That way we can keep the discussion centralized. Other discussions on that page could also benefit from input by people knowledgeable about Semantic Web vocabulary. Thanks, Eric https://www.wikidata.org/wiki/User:Emw 1.
Discussion permalink: https://www.wikidata.org/w/index.php?title=Wikidata:Property_proposal/Property_metadata&oldid=182088235#How_should_we_declare_that_a_property_is_transitive ___ Wikidata-l mailing list Wikidata-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-l
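Denny's caveat above — the "instance of: transitive property" statement is only a syntactic marker until a reasoner acts on it — can be illustrated by the inference such a reasoner would draw. Below is a minimal transitive-closure sketch; the "part of" facts are illustrative, not real Wikidata statements.

```python
def transitive_closure(pairs):
    """Infer every (a, c) entailed by chains a->b->c under a transitive property."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        for (a, b) in list(closure):
            for (c, d) in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure

# "Mitte part-of Berlin" and "Berlin part-of Germany"
# entail "Mitte part-of Germany" -- the triple an OWL reasoner would add.
facts = {("Mitte", "Berlin"), ("Berlin", "Germany")}
print(transitive_closure(facts))
```

An OWL reasoner does this (and much more) generically for any property typed as owl:TransitiveProperty, which is why the translation of the Wikidata statement into that vocabulary matters for downstream users.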
Re: [Wikidata-l] Tool for adding references and data to Wikidata
Hi Gerard, I very much agree. It would be very good to have a discussion on which kind of data can be integrated in which way. One way or the other, one of the most frequent criticisms of Wikidata is a lack of references, which this tool will tackle along the way as well. And at the same time it will allow for a human curation step, which I think is crucial for the Wikidata community to gain ownership of the data. Just dumping everything into Wikidata is, in my opinion, not a sustainable solution. But since the data will be released freely, well, the community can decide to do it otherwise, obviously. Cheers, Denny On Wed Dec 17 2014 at 8:40:25 AM Lydia Pintscher lydia.pintsc...@wikimedia.de wrote: On Wed, Dec 17, 2014 at 4:04 PM, Lane Rasberry l...@bluerasberry.com wrote: Hello, Where is the appropriate place on Wikidata to discuss this? This is big enough for its own WikiProject. Does it already have one somewhere? Should I make one? Actually I just did. https://www.wikidata.org/wiki/Wikidata:WikiProject_Freebase I hardly know what the implications are of this, but it seems big enough to have a dedicated place for discussion. Thanks to Denny for whatever role you had in getting access to well-developed data collected by another project. I do not understand what is happening here, but it seems like really good news, and I hope someone explains it more. Thanks for starting the project, Lane. Will you announce it on the Project chat? That way most people on-wiki will see it and can jump in. Once it has a bit more content we can announce it more widely. Cheers Lydia -- Lydia Pintscher - http://about.me/lydia.pintscher Product Manager for Wikidata Wikimedia Deutschland e.V. Tempelhofer Ufer 23-24 10963 Berlin www.wikimedia.de Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e. V. Eingetragen im Vereinsregister des Amtsgerichts Berlin-Charlottenburg unter der Nummer 23855 Nz.
Als gemeinnützig anerkannt durch das Finanzamt für Körperschaften I Berlin, Steuernummer 27/681/51985. ___ Wikidata-l mailing list Wikidata-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-l
Re: [Wikidata-l] sneak peek time - checking against 3rd party databases
Woohoo, that's pretty awesome! Congrats. Are they going to use the soon-to-be-available property mapping properties? On Tue Dec 02 2014 at 1:33:38 PM Lydia Pintscher lydia.pintsc...@wikimedia.de wrote: Hey folks :) The student team working on data quality and trust is hard at work and just showed me a first demo. I wanted to share that with you as well. One part of the team is working on checking Wikidata's data against other databases. Attached is a screenshot of their first demo showing checking of our data against MusicBrainz. In the end this will be nicely integrated on Wikidata, probably as part of the constraint violation reports. This is already working way better than I expected it ever would. So hats off to the students. I'm sure they'll kick ass over the next months. Cheers Lydia -- Lydia Pintscher - http://about.me/lydia.pintscher Product Manager for Wikidata Wikimedia Deutschland e.V. Tempelhofer Ufer 23-24 10963 Berlin www.wikimedia.de Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e. V. Eingetragen im Vereinsregister des Amtsgerichts Berlin-Charlottenburg unter der Nummer 23855 Nz. Als gemeinnützig anerkannt durch das Finanzamt für Körperschaften I Berlin, Steuernummer 27/681/51985. ___ Wikidata-l mailing list Wikidata-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-l
Re: [Wikidata-l] Various questions
On Tue Nov 11 2014 at 1:51:08 PM Denny Vrandečić vrande...@google.com wrote: +1 for removing the blacklist from the code. On Tue Nov 11 2014 at 12:28:05 AM John Erling Blad jeb...@gmail.com wrote: What did I say, etc, etc, etc... It feels good to be right. I was right. Me. I and myself. Some stuff always bites you, even if it was quite fun! ;) On Tue, Nov 11, 2014 at 9:09 AM, Jeroen De Dauw jeroended...@gmail.com wrote: Hey, I was looking through the configuration trying to debug my issues from my last email and noticed the list of blacklisted IDs. They appear to be numbers with special meaning. I was curious about two things: why are they blacklisted, and what is the meaning of the remaining number? * 1: I imagine that this just refers to #1 * 23: Probably refers to the 23 enigma * 42: Life, the universe and everything * 1337: leet * 9001: ISO 9001, which deals with quality assurance * 31337: Elite I guess we probably ought to delete those default values. They were added for something easter-egg-like in the Wikidata project, and might well get in the way for third-party users. This is also not the list of actual IDs that got blacklisted on Wikidata.org, which was a bit more extensive, and for instance had Q2013, the year in which Wikidata launched. I submitted a removal of these blacklisted IDs from the default config in https://gerrit.wikimedia.org/r/#/c/172504/ The only number that left me lost was 720101010. I couldn't figure this one out. 720101010 is 1337 for trolololo :) Cheers -- Jeroen De Dauw - http://www.bn2vs.com Software craftsmanship advocate Evil software architect at Wikimedia Germany ~=[,,_,,]:3 ___ Wikidata-l mailing list Wikidata-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-l
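The effect of the blacklist config being discussed can be sketched as follows. Note this is an assumption about the mechanism, not the actual Wikibase implementation: the idea is that reserved numbers are skipped during automatic ID allocation so items like Q42 can be assigned deliberately.

```python
# The blacklisted numbers mentioned in the thread.
BLACKLISTED_IDS = {1, 23, 42, 1337, 9001, 31337, 720101010}

def next_entity_id(last_id, blacklist=BLACKLISTED_IDS):
    """Return the next usable numeric entity ID, skipping reserved numbers.

    Hypothetical helper for illustration; Wikibase's real allocator
    lives in the ID-generation code, not in a function like this.
    """
    candidate = last_id + 1
    while candidate in blacklist:
        candidate += 1
    return candidate

print(next_entity_id(41))   # 42 is reserved, so 43
print(next_entity_id(100))  # 101, nothing to skip
```

This also shows why shipping the list as a default value is awkward for third-party Wikibase installs: they inherit someone else's easter eggs unless they override the config, which is exactly the motivation for the removal patch linked above.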
[Wikidata-l] Birthday gift: Missing Wikipedia links (was Re: Wikidata turns two!)
Folks, as you know, many Googlers are huge fans of Wikipedia. So here’s a little gift for Wikidata’s second birthday. Some of my smart colleagues at Google have run a few heuristics and algorithms in order to discover Wikipedia articles in different languages about the same topic which are missing language links between the articles. The results contain more than 35,000 missing links that these algorithms consider high-confidence. We estimate a precision of about 92+% (i.e. we assume that less than 8% of those are wrong, based on our evaluation). The dataset covers 60 Wikipedia language editions. Here are the missing links, available for download from the WMF labs servers: https://tools.wmflabs.org/yichengtry/merge_candidate.20141028.csv The data is published under CC-0. What can you do with the data? Since it is CC-0, you can do anything you want, obviously, but here are a few suggestions: There’s a small tool on WMF labs that you can use to verify the links (it displays the articles side by side for a language pair you select, and then you can confirm or contradict the merge): https://tools.wmflabs.org/yichengtry The tool does not make the change on Wikidata itself, though (we thought it would be too invasive if we did that). Instead, the results of the human evaluation are saved on WMF labs. You are welcome to take the tool and extend it with the possibility to upload the change directly to Wikidata, if you so wish, or, once the data is verified, to upload the results. Also, Magnus Manske is already busy uploading the data to the Wikidata game, so you can very soon also play the merge game on the data directly. He is also creating the missing items on Wikidata. Thanks, Magnus, for a very pleasant cooperation! I want to call out to my colleagues at Google who created the dataset - Jiang Bian and Si Li - and to Yicheng Huang, the intern who developed the tool on labs. 
I hope that this small data release can help a little with further improving the quality of Wikidata and Wikipedia! Thank you all, you are awesome! Cheers, Denny On Wed Oct 29 2014 at 10:52:05 AM Lydia Pintscher lydia.pintsc...@wikimedia.de wrote: Hey folks :) Today Wikidata is turning two. It amazes me what we've achieved in just 2 years. We've built an incredible project that is set out to change the world. Thank you everyone who has been a part of this so far. We've put together some notes and opinions. And there are presents as well! Check them out and leave your birthday wishes: https://www.wikidata.org/wiki/Wikidata:Second_Birthday Cheers Lydia -- Lydia Pintscher - http://about.me/lydia.pintscher Product Manager for Wikidata Wikimedia Deutschland e.V. Tempelhofer Ufer 23-24 10963 Berlin www.wikimedia.de Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e. V. Eingetragen im Vereinsregister des Amtsgerichts Berlin-Charlottenburg unter der Nummer 23855 Nz. Als gemeinnützig anerkannt durch das Finanzamt für Körperschaften I Berlin, Steuernummer 27/681/51985.
Re: [Wikidata-l] Birthday gift: Missing Wikipedia links (was Re: Wikidata turns two!)
Sure, you can keep all your todos with Google ;) https://www.gmail.com/mail/help/tasks/ Cheers, Denny On Wed Oct 29 2014 at 2:58:03 PM Jeroen De Dauw jeroended...@gmail.com wrote: Hey, Does this mean we can also shoot a TODO list in the direction of Google? :) Cheers -- Jeroen De Dauw - http://www.bn2vs.com Software craftsmanship advocate Evil software architect at Wikimedia Germany ~=[,,_,,]:3
Re: [Wikidata-l] Super Lachaise, a mobile app based on Wikidata
That's a great idea! Just curious, for such a specific use case, why did you go for an app instead of a website? On Tue Oct 28 2014 at 7:29:22 AM Sjoerd de Bruin sjoerddebr...@me.com wrote: Not available in the Dutch iTunes Store... On 28 Oct 2014 at 15:26, Pierre-Yves Beaudouin pierre.beaudo...@gmail.com wrote: I'm happy to announce the release of Super Lachaise on the App Store. It's a free mobile app that helps you during a visit to the Père Lachaise cemetery. This is probably one of the first mobile apps to use Wikidata ;) http://www.superlachaise.fr/ https://itunes.apple.com/fr/app/super-lachaise/id918263934 Pyb
Re: [Wikidata-l] Open Data Awards
Yay! Congratulations! On Mon Oct 27 2014 at 4:55:51 PM John Lewis johnflewi...@gmail.com wrote: Hi everyone, Some exciting news here. The Open Data Awards' finalist lists were recently published on their website. Wikidata has been listed as a finalist in two different categories, the Open Data Innovation Award and the Open Data Publisher Award. Lydia http://www.wikidata.org/wiki/User:Lydia_Pintscher_(WMDE) and Magnus http://www.wikidata.org/wiki/User:Magnus_Manske will be representing Wikidata at the gala dinner where the winner of each category will be announced live. I will be standing in as a backup should Lydia be unable to attend the award dinner, but let's wish Lydia and Magnus a good time and keep our fingers crossed that Wikidata will win at least one of the two categories we've been nominated for. As Lydia would say - the entire community is awesome for working to help build Wikidata to where it is, and this is as much all of our work as it is the development team's for helping build and innovate the way free knowledge is shared within the mission of the Wikimedia Foundation. Thanks, John Lewis -- John Lewis
Re: [Wikidata-l] all human genes are now wikidata items
Wow! That's pretty cool work! Do you have any plans to keep the data fresh? On Mon Oct 06 2014 at 1:22:12 PM Benjamin Good ben.mcgee.g...@gmail.com wrote: I thought folks might like to know that every human gene (according to the United States National Center for Biotechnology Information) now has a representative entity on Wikidata. I hope that these are the seeds for some amazing applications in biology and medicine. Well done Andra and ProteinBoxBot! For example: Here is one (of approximately 40,000) called spinocerebellar ataxia 37 https://www.wikidata.org/wiki/Q18081265 -Ben
Re: [Wikidata-l] How can I increase the throughput of ProteinBoxBot?
That's very cool! To get an idea, how big is your dataset? On Tue Sep 30 2014 at 12:06:56 PM Daniel Kinzler daniel.kinz...@wikimedia.de wrote: What makes it so slow? Note that you can use wbeditentity to perform complex edits with a single API call. It's not as straightforward to use as, say, wbcreateclaim, but much more powerful and efficient. -- daniel On 30.09.2014 at 19:00, Andra Waagmeester wrote: Hi All, I have joined the development team of the ProteinBoxBot (https://www.wikidata.org/wiki/User:ProteinBoxBot). Our goal is to make Wikidata the canonical resource for referencing and translating identifiers for genes and proteins from different species. Currently, adding all genes from the human genome and their related identifiers to Wikidata takes more than a month to complete. With the objective of adding other species, as well as having frequent updates for each of the genomes, it would be convenient if we could increase this throughput. Would it be accepted if we increase the throughput by running multiple instances of ProteinBoxBot in parallel? If so, what would be an accepted number of parallel instances of a bot to run? We can run multiple instances from different geographical locations if necessary. Kind regards, Andra -- Daniel Kinzler Senior Software Developer Wikimedia Deutschland Gesellschaft zur Förderung Freien Wissens e.V.
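Daniel's suggestion can be sketched as follows: a minimal Python sketch (no actual HTTP request is made here; the property IDs and values are illustrative, and a real edit would also need authentication and an edit token) of how wbeditentity lets a bot bundle many claims into a single API call instead of one call per statement.

```python
import json

def make_string_claim(prop_id, value):
    """Build the JSON structure of one string-valued claim."""
    return {
        "mainsnak": {
            "snaktype": "value",
            "property": prop_id,
            "datavalue": {"value": value, "type": "string"},
        },
        "type": "statement",
        "rank": "normal",
    }

def build_edit_params(item_id, claims):
    """Parameters for a single wbeditentity call that adds every
    claim in one edit (token and login omitted in this sketch)."""
    return {
        "action": "wbeditentity",
        "id": item_id,
        "data": json.dumps({"claims": claims}),
        "format": "json",
    }

# Two hypothetical identifier claims bundled into one edit, instead
# of one wbcreateclaim request per statement:
claims = [make_string_claim("P351", "6331"),    # illustrative gene ID
          make_string_claim("P353", "SCN5A")]   # illustrative gene symbol
params = build_edit_params("Q18081265", claims)
```

Batching like this trades API round-trips for a larger single request, which is usually the cheaper side of that trade for a bot importing tens of thousands of items.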
Re: [Wikidata-l] Item both subclass and instance?
Fully agree with Markus' beautifully written explanation, although I am not completely convinced of the level theory - but it seems to work in the given examples, and a few other examples I was thinking through. Note that Porsche 356 could very much be an instance of car model - but not of car. All the rules that Markus has mentioned would stay intact in this case. We often don't make the difference between car and car model in our day to day speech, which is a common source of confusion (i.e. the Porsche 356 is a beautiful car vs the Porsche 356 is a beautiful car model - both would be acceptable in natural language, but alas, not in Wikidata). On Thu, Sep 25, 2014 at 3:53 PM, Markus Krötzsch mar...@semantic-mediawiki.org wrote: Hi, I fully agree with Thomas and the other replies given here. Let me give some other views on these topics (partly overlapping with what was said before). It's important to understand these things to get the subclass of/instance of thing right -- and it would be extremely useful if we could get this right in our data :-) What is a class and what is an item is often a matter of perspective, and it is certainly accepted in the ontology modelling community that one thing may need to be both. The important thing is that subclass of is a relation between *similar* things (usually of the same type): * sports car subclass of car * Porsche Carrera subclass of sports car * Porsche 356 subclass of Porsche Carrera Use A subclass of B if it makes sense to say all A's are also B's as in all Porsche Carreras are sports cars. In contrast, instance of is between things that are very *different* in nature: * Douglas Adams instance of human * human instance of species Subclass naturally forms chains, like in my example. 
You can leave out some part of the chain and the result is still meaningful: * Porsche Carrera subclass of car [makes sense] For instance of, this does not work: * Douglas Adams instance of species [bogus] So if you want to organise things in a hierarchy (specific to general), then you need subclass of. If you just describe the type of one thing, then you need instance of. It is perfectly possible that one thing participates in both types of relationships. In addition to these general guidelines, I would say that a well-modelled ontology should be organised in levels: whenever you use instance of, you go to a higher level; if you use subclass of, you stay on your current level. Each thing should belong to only one level. Here is an example where this is violated: * Porsche Carrera subclass of sports car * Porsche 356 subclass of Porsche Carrera * Porsche 356 instance of sports car Each of these makes sense individually, but the combination is weird. We should make up our mind whether we want to treat Porsche 356 as a class (on the same level as sports car) or as an instance (on a lower level than sports car), but not do both at the same time. I think subclass of usually should be preferred in such a case (because if it is possible to use subclass of, then it is usually also quite likely that more specific items occur later [Porsche 356 v1 or whatever], and we really will need subclass of to build a hierarchy then). Cheers, Markus On 25.09.2014 20:10, Thomas Douillard wrote: Hi, this is a long discussion :) This is allowed by the OWL 2 notion called punning. The rationale is that hydrogen is a chemical element, and that chemical element is not a subclass of atom. Rather, a chemical element is a type of atom, so chemical element is a metaclass: a class of classes of atoms. 
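Markus' level rule can be made concrete with a small sketch (the property IDs P31/P279 and the helper are illustrative, not anything from Wikibase itself): following subclass-of keeps the level, following instance-of raises it by one, and an item that ends up on two different levels violates the rule.

```python
# Illustrative property IDs: P31 = instance of, P279 = subclass of.
INSTANCE_OF, SUBCLASS_OF = "P31", "P279"

def infer_levels(edges, root, root_level=0):
    """Assign a level to every node reachable from root, following the
    rule: subclass-of keeps the level, instance-of raises it by one.
    Returns None if any node would get two different levels."""
    levels = {root: root_level}
    changed = True
    while changed:
        changed = False
        for child, relation, parent in edges:
            if child in levels:
                target = levels[child] + (1 if relation == INSTANCE_OF else 0)
                if parent not in levels:
                    levels[parent] = target
                    changed = True
                elif levels[parent] != target:
                    return None  # inconsistent: node sits on two levels
    return levels

# Markus' problematic combination:
edges = [
    ("Porsche Carrera", SUBCLASS_OF, "sports car"),
    ("Porsche 356", SUBCLASS_OF, "Porsche Carrera"),
    ("Porsche 356", INSTANCE_OF, "sports car"),  # the violating edge
]
print(infer_levels(edges, "Porsche 356"))  # → None (level conflict)
```

Dropping the last edge makes the hierarchy consistent again, with all three items on one level, which matches the advice to prefer subclass of in such cases.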
Re: [Wikidata-l] policy toward using non-CC0 licensed external databases a
On Sep 13, 2014 3:20 PM, P. Blissenbach pu...@web.de wrote: Regarding purely factual data comprising a less than significant portion of a database - which is certainly true for all ISBNs in Google's database Btw., if a statement about an ISBN is sourced, among others, with Source: Google, that does not imply having it from Google. It only states the fact: Google has it, too. Purodha That's also why it is actually called reference and not source.
Re: [Wikidata-l] Language mappings between different wikipedia pages
Hey Marieke, You can either use the Wikidata Toolkit by Markus Krötzsch, if you want to work on the dump, or the Wikidata web API, if you only need a few such mappings at a time. On Jul 17, 2014 9:24 AM, Erp, M.G.J. van marieke.van@vu.nl wrote: Hi there, I was wondering how to get the language mappings between different Wikipedia pages. This information seems to be available on Wikidata, as I can find it through browsing different pages on Wikidata such as http://www.wikidata.org/wiki/Q213710, and https://www.mediawiki.org/wiki/Manual:Langlinks_table mentions a langlinks table, but I can't figure out how to get a dump. The Wiki interlanguage link records at http://dumps.wikimedia.org/wikidatawiki/20140705/ looked promising, but that seems to contain user information if I'm not mistaken. For example, select count(*), ll_title from langlinks group by 2 order by 1 desc limit 20; results in:
+----------+------------------------+
| count(*) | ll_title               |
+----------+------------------------+
|      284 | User:تفکر              |
|      272 | user:OffsBlink         |
|      215 | User:YourEyesOnly      |
|      179 | User:MoiraMoira        |
|       65 | User:AvocatoBot        |
|       35 | User:Shikai shaw       |
|       35 | user:Shuaib-bot        |
|       33 | user:לערי ריינהארט     |
|       33 | User:Leyo              |
|       27 | user:Лобачев Владимир  |
|       20 | User:Wagino 20100516   |
|       18 | user:Gangleri          |
|       17 | user:I18n              |
|       16 | user:Meursault2004     |
|       12 | User:Labant            |
|       11 | User:Stryn             |
|       11 | User:angelia2041       |
|       10 | user:Kelvin            |
|       10 | User:JCIV              |
|        9 | Template:Mbox          |
+----------+------------------------+
I checked out the #mediawiki IRC channel and someone recommended the interwiki link tracking records, but those seem to also contain all sorts of other links, and I don't see a way to filter out the in other languages links. It would be great if you could help me out. Thanks! 
Marieke van Erp -- Computational Lexicology Terminology Lab (CLTL) The Network Institute, VU University Amsterdam De Boelelaan 1105 1081 HV Amsterdam, The Netherlands http://www.mariekevanerp.com http://www.newsreader-project.eu
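For the web API route mentioned above, here is a minimal Python sketch (no network access; the sample response is hypothetical and heavily truncated, though it follows the real wbgetentities JSON shape) of fetching an item's sitelinks, which carry exactly the interlanguage mappings being asked about.

```python
from urllib.parse import urlencode

API = "https://www.wikidata.org/w/api.php"

def sitelinks_url(qid):
    """URL of a wbgetentities request returning all sitelinks
    (i.e. the interlanguage links) of one item."""
    return API + "?" + urlencode({"action": "wbgetentities", "ids": qid,
                                  "props": "sitelinks", "format": "json"})

def extract_langlinks(response, qid):
    """Map site codes like 'enwiki' to article titles from a parsed
    wbgetentities response."""
    sitelinks = response["entities"][qid]["sitelinks"]
    return {site: link["title"] for site, link in sitelinks.items()}

# Hypothetical, heavily truncated response illustrating the structure:
sample = {"entities": {"Q213710": {"sitelinks": {
    "enwiki": {"site": "enwiki", "title": "Example"},
    "nlwiki": {"site": "nlwiki", "title": "Voorbeeld"}}}}}
links = extract_langlinks(sample, "Q213710")
```

Fetching `sitelinks_url("Q213710")` and feeding the parsed JSON to `extract_langlinks` would give the live mapping; for all items at once, the dump plus Wikidata Toolkit is the better fit.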
Re: [Wikidata-l] Wikidata Toolkit 0.1.0 released
Hi Markus, On Wed Apr 09 2014 at 4:18:50 AM, Markus Krötzsch mar...@semantic-mediawiki.org wrote: Change to the directory of the example module (wdtk-examples), then run: mvn exec:java -Dexec.mainClass=org.wikidata.wdtk.examples.DumpProcessingExample Thanks, that is exactly what I needed! :) I understand that WDTK is a library to be used in your own applications, but I am often not patient enough to actually go and code up a whole app myself in a new dev environment before I actually see that the thing is running. So being able to actually start and run the example application is super useful for my motivation, because now I can go ahead and tinker with it while it is running, and iteratively change it into what I want. Thanks again for the prompt and useful answer! It works like a charm now! Cheers, Denny
[Wikidata-l] Wikidata Toolkit 0.1.0 released
I was trying to use this, but my Java is a bit rusty. How do I run the DumpProcessingExample? I did the following steps: git clone https://github.com/Wikidata/Wikidata-Toolkit cd Wikidata-Toolkit mvn install mvn test Now, how do I start DumpProcessingExample? Sorry for being a bit dense here. Cheers, Denny On Mon Mar 31 2014 at 6:47:21 AM, Markus Krötzsch mar...@semantic-mediawiki.org wrote: Dear all, I am happy to announce the very first release of Wikidata Toolkit [1], the Java library for programming with Wikidata and Wikibase. This initial release can download and parse Wikidata dump files for you, so as to process all Wikidata content in a streaming fashion. An example program is provided [2]. The library can also be used with MediaWiki dumps generated by other Wikibase installations (if you happen to work in EAGLE ;-). Maven users can get the library directly from Maven Central (see [1]); this is the preferred method of installation. There is also an all-in-one JAR at github [3] and of course the sources [4]. Version 0.1.0 is of course alpha, but the code that we have is already well-tested and well-documented. Improvements that are planned for the next release include: * Faster and more robust loading of Wikibase dumps * Support for various serialization formats, such as JSON and RDF * Initial support for Wikibase API access Nevertheless, you can already give it a try now. In later releases, it is also planned to support more advanced processing after loading, especially for storing and querying the data. Feedback is welcome. Developers are also invited to contribute via github. 
Cheers, Markus [1] https://www.mediawiki.org/wiki/Wikidata_Toolkit [2] https://github.com/Wikidata/Wikidata-Toolkit/blob/v0.1.0/wdtk-examples/src/main/java/org/wikidata/wdtk/examples/DumpProcessingExample.java [3] https://github.com/Wikidata/Wikidata-Toolkit/releases (you'll also need to install the third-party dependencies manually when using this) [4] https://github.com/Wikidata/Wikidata-Toolkit/
Re: [Wikidata-l] qLabel
That's a toughie. Looking forward to seeing that one resolved :) On Wed, Apr 2, 2014 at 2:14 AM, Andy Mabbett a...@pigsonthewing.org.uk wrote: On 1 April 2014 20:01, Denny Vrandečić vrande...@google.com wrote: a bug on the github project I've raised another, about the use of adjectives and adverbs: https://github.com/googleknowledge/qlabel/issues/2 -- Andy Mabbett @pigsonthewing http://pigsonthewing.org.uk
Re: [Wikidata-l] qLabel
Yes. That is why qLabel has a mechanism to implement your own loaders. The Wikidata and Freebase loaders are much more efficient than the generic RDF loader. Using SPARQL, RDF loading can be made more effective, but the LOD protocols break down with regard to that. Such a basic thing as labels on the Web of Data is a largely unsolved problem, and there is plenty of space for improvement. I hope that qLabel will incite the small number of changes required to change this situation. Cheers, Denny On Wed, Apr 2, 2014 at 8:44 AM, Paul Houle ontolo...@gmail.com wrote: I've been thinking about this kind of problem in my own systems. Name and link generation from entities is a cross-cutting concern that's best separated from other queries in your application. With SPARQL and multiple languages, each with multiple rdf:label, it is awkward to write queries that bring labels back with identifiers, particularly if you want to apply rules that amount to: if a ?lang label doesn't exist for a topic, show a label from a language that uses the same alphabet as ?lang in preference to any others. Another issue too is that the design and business people might have some desire for certain kinds of labels, and it's good to be able to change that without changing your queries. Anyway, a lot of people live on the other end of internet connections with 50ms, 2000ms or more latency to the network core, plus sometimes the network has a really bad day or even a bad few seconds. For every hundred or so TCP packets you send across the modern internet, you lose one. The fewer packets you send per interaction, the less likely the user is going to experience this. If 20 names are looked up sequentially and somebody is on 3G cellular with 300ms latency, the user needs to wait six seconds for this data to load on top of the actual time moving the data and waiting for the server to get out of its own way. 
This is using jQuery, so it's very likely the page has other JavaScript geegaws that work OK for the developer who lives in Kansas City, but ordinary folks in Peoria might not have the patience to wait until your page is fully loaded. Batch queries give users performance they can feel, even if they demand more of your server. In my system I am looking at having a name lookup server that is stupidly simple and looks up precomputed names in a key-value store, everything really stripped down and efficient with no factors of two left on the floor. I'm looking at putting a pretty ordinary servlet that writes HTML in front of it, but a key thing is that the front of the back end runs queries in parallel to fight latency, which is the scourge of our times. (It's the difference between GitHub and Atlassian.) On Wed, Apr 2, 2014 at 4:36 AM, Daniel Kinzler daniel.kinz...@wikimedia.de wrote: Hey Denny! Awesome tool! It's so awesome, we are already wondering about how to handle the load this may generate. As far as I can see, qLabel uses the wbgetentities API module. This has the advantage of allowing the labels for all relevant entities to be fetched with a single query, but it has the disadvantage of not being cacheable. If qLabel used the .../entity/Q12345.json URLs to get entity data, that would be covered by the web caches (squid/varnish). But it would mean one request per entity, and would also return the full entity data, not just the labels in one language. So, a lot more traffic. If this becomes big, we should probably offer a dedicated web interface for fetching labels of many entities in a given language, using nice, cacheable URLs. This would mean a new cache entry per language per combination of entities - potentially, a large number. However, the combination of entities requested is determined by the page being localized - that is, all visitors of a given page in a given language would hit the same cache entry. That seems workable. 
Anyway, we are not there quite yet, just something to ponder :) -- daniel On 01.04.2014 at 20:14, Denny Vrandečić wrote: I just published qLabel, an Open Source jQuery plugin that allows you to annotate HTML elements with Wikidata Q-IDs (or Freebase IDs, or, technically, any other Semantic Web / Linked Data URI), and then grabs the labels and displays them in the selected language of the user. Put differently, it allows for the easy creation of multilingual structured websites. And it is one more way in which Wikidata data can be used, by anyone. Contributors and users are more than welcome! http://google-opensource.blogspot.com/2014/04/qlabel-multilingual-content-without.html -- Daniel Kinzler Senior Software Developer Wikimedia Deutschland Gesellschaft
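The trade-off Daniel describes can be sketched in Python (URL shapes only, no requests are made; Special:EntityData stands in here for the .../entity/Q12345.json form, as an assumption about the cacheable endpoint meant): one uncacheable batched call versus one cacheable URL per entity, where, at the 300 ms latency Paul mentions, twenty sequential per-entity fetches would cost the six seconds he computes.

```python
from urllib.parse import urlencode

def batched_labels_url(qids, lang):
    """One API call fetching only the labels of many items; compact,
    but the URL varies per page, so web caches can't reuse it well."""
    return ("https://www.wikidata.org/w/api.php?" +
            urlencode({"action": "wbgetentities", "ids": "|".join(qids),
                       "props": "labels", "languages": lang,
                       "format": "json"}))

def cacheable_entity_url(qid):
    """One squid/varnish-cacheable URL per entity; returns the full
    entity data (labels included), at the cost of one request each."""
    return "https://www.wikidata.org/wiki/Special:EntityData/%s.json" % qid
```

The dedicated labels interface Daniel proposes would sit between these two: one cacheable URL per (language, entity set) pair, keeping both the request count and the payload small.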
[Wikidata-l] Wikidata for organizing 1000s of extensions in mediawiki.org?
I would very strongly recommend using Semantic MediaWiki for this use case. It is more powerful, we use SMW in other WMF contexts already, and keeping the data inside Meta (instead of inside Wikidata and then transcluding it) also allows us to generate workflows in Meta involving local user accounts, etc., and reduces complexity, since the data is saved in one place and you don't have to switch between Meta and Wikidata to update the data for an extension. It also frees Wikidata from having to extend its policies to support this specific use case (would MediaWiki extension developers all get their own item per the notability policy, etc.?). Also, SMW already supports the use cases you are asking about right now. I understand that SMW was already suggested in Bugzilla. I understand Wikidata looks more sexy right now, but I think it is not the most appropriate tool for this use case. Just my 2 cents. On Wed Mar 19 2014 at 10:09:34 PM, Quim Gil q...@wikimedia.org wrote: Organize MediaWiki's catalog of 1000s of extensions using Wikidata. Is this a sensible idea? Reality checks and other opinions are welcome here or at https://bugzilla.wikimedia.org/show_bug.cgi?id=46704#c33 Pasting the relevant part for convenience: Has anybody discussed the possibility of creating Wikidata items for extensions, after defining a set of properties to describe them? Linking those Wikidata items to mediawiki.org extension pages, and then playing with templates and whatnot to keep the semantic data up to date (version number, last release, dependencies, compatible with MediaWiki releases...)? Then play with templates, queries and visualizations to create all kinds of useful output, from structured extension pages to a proper and robust map of extensions. 
-- Quim Gil Engineering Community Manager @ Wikimedia Foundation http://www.mediawiki.org/wiki/User:Qgil
Re: [Wikidata-l] Queries - can they be stored as statements in Category/List items?
Micru, thank you for the explanation. I understand better now what you mean. I still disagree - let me explain why. I think that trying to express a query definition in a single statement is very hard. Having a specific Query namespace allows us to create a completely new UI for them, allows us to use a different data model for Queries than for Items, and allows us to treat Query pages very differently (e.g. for caching) than, e.g., Item pages. For example, the different data model would allow us to restrict the number of queries on a page. If they were just a statement, what would stop a contributor from creating several such statements on one page? What happens when someone removes the same as query statement? What happens if someone adds it to the page for USA (e.g. same as query instance of-country, continent-North America, population-300M)? Would this page suddenly be treated differently? Also, you already show in your mock-up that the same as query statement requires plenty of special code (e.g. for the different visualizations, etc.). One option would be to have them as Item pages, but then treat them differently throughout. This would mean more and more exceptions and special casing in the code. I think that Queries and Items are sufficiently different to deserve their own treatment. In my personal opinion, this provides sufficient reason for Query pages and Item pages being distinct. What would be the advantage of having Queries expressed in the Items? Fewer entities? Less confusion about what these list of- and category-items mean? I don't find either reason sufficiently enticing to change my opinion on this. 
Cheers, Denny On Fri Mar 07 2014 at 5:00:19 AM, David Cuenca dacu...@gmail.com wrote: Denny, sorry for the confusion, it is a complex topic, or it could also be that I am terribly bad at explaining :) Based on that item page I have made a mock-up which perhaps makes things easier: http://i.imgur.com/1dSfrqx.png The reasoning for this being: 1) there is a well-defined set of queries that are equivalent to categories/lists, so there is no need to have independent query pages 2) if a wikipedia wants to include query results on a page, it is quite probable that the query already exists as a list/category 3) and if it doesn't, then it will be *very* specific to that language wikipedia. In that case there is no need to define a query page on wikidata, but on the wikipedia page itself as an inclusion syntax command or another similar module You are right that it might be a bit preliminary, as there are not even simple queries yet, but since this kind of decision might have an impact on later design, I think it is worth starting to present the concepts/options now. Besides, ideas and a common understanding take time to develop, and the RFC was started, so I thought it was worth giving it some attention. Cheers, Micru On Fri, Mar 7, 2014 at 12:18 AM, Denny Vrandečić vrande...@gmail.com wrote: Since I am obviously bad at guessing what you mean, can you please explicate what you mean with replicate that functionality on Wikidata? Sorry, I am too dense to understand it. What do you want to happen, explicitly? I go to http://www.wikidata.org/wiki/Q6573995 - how should it be different from what it displays today? Do you want the item pages to have the feature to directly embed query results, instead of having a one-click distance to the actual query page and its results? Or is there more to it? On Thu Mar 06 2014 at 3:10:44 PM, David Cuenca dacu...@gmail.com wrote: I'm not saying that the results yielded by Category:Books by Jean-Paul Sartre or Category:Books by J.R.R. 
Tolkien are or should be the same as the result yielded by a corresponding Wikidata query, but the concepts they represent are the same. Ditto for lists. (As a further clarification, I didn't mention anything about changing Wikipedia categories or Wikipedia lists either.) My question was regarding the functionality of WD items associated with Wikipedia categories and Wikipedia lists. Conceptually those items represent (or can represent) queries. WDQ, the tool by Magnus, can already interpret certain statements as queries [1]. Would it make sense to replicate that functionality on Wikidata? Cheers, Micru [1] http://tools.wmflabs.org/reasonator/?q=6573995 On Thu, Mar 6, 2014 at 11:16 PM, Denny Vrandečić vrande...@gmail.com wrote: But that's simply not the case. The Category:Books by Jean-Paul Sartre [1] or Category:Books by J.R.R. Tolkien [2] neither are a complete list of books by those authors (e.g. Sartre's fictional books are missing, Tolkien's non-fictional *and* Middle-earth books are missing), nor do they include only books by those authors (e.g. they also include templates and other categories, which are likely not written by Sartre or Tolkien). If the plan is to change the way categories are used in Wikipedia and the other Wikimedia wikis
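The mock-up's "same as query: instance of-country, continent-North America" idea can be made concrete with a small sketch. This is an editor's illustration of how a category-item's statements could be interpreted as a query filter, in the spirit of what WDQ does; the property names and the flattened data layout are invented for the example and are not Wikidata's actual API.

```python
# Hypothetical sketch: treat a category-item's "same as query" conditions as
# a filter over items. Property names and the claim layout are illustrative.

def matches(claims, conditions):
    """Return True if an item's claims satisfy every (property, value) pair."""
    return all(claims.get(prop) == value for prop, value in conditions.items())

def run_query(items, conditions):
    """Return the IDs of all items matching the category-item's conditions."""
    return [qid for qid, claims in items.items() if matches(claims, conditions)]

# Toy data: claims reduced to plain property -> value pairs.
items = {
    "Q30":  {"instance of": "country", "continent": "North America"},
    "Q145": {"instance of": "country", "continent": "Europe"},
    "Q60":  {"instance of": "city", "continent": "North America"},
}

# The mock-up's example: instance of-country, continent-North America.
conditions = {"instance of": "country", "continent": "North America"}
print(run_query(items, conditions))  # -> ['Q30']
```

The sketch also makes Denny's objection visible: nothing in this shape prevents two conflicting condition sets from being attached to the same page, which is part of the argument for a dedicated Query namespace.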
Re: [Wikidata-l] rank related changes
Wikidata labels are simple. This is due to the necessities of the project. We need one single label to display. Having Wikidata labels with ranks, qualifiers, sources, etc. simply would not work in the UI. Labels and names in reality are indeed extremely complex. But as already pointed out, this kind of information can be expressed with Statements, and we already have properties to do so and will probably get more such properties when the multi- and monolingual text properties get developed. So, yes, Gerard, Daniel would be wrong if he said that labels are simple in the world. But that is not what he said. He was simply referring to labels as they are already implemented in Wikidata, and that serve a very specific purpose - and for these, he is absolutely right to say that ranks do not apply to them. The only purpose of labels and descriptions is to provide identifying information and to provide something to display for an item. The only purpose of aliases is to increase recall for search. I would consider having an alias containing a frequent typo absolutely OK, if it helps people find that item. They don't have to be right. They don't have to be sourced. They have to be useful. Statements, on the other hand, contain the actual content of Wikidata. And those have ranks, qualifiers, sources, etc. Statements can contain historical names of cities, and say from when to when they were used. Queries can then some day use this information and display it within the context of a specific query. But that is not what Wikidata labels are there for. I hope that makes sense. On Fri Mar 07 2014 at 12:01:32 AM, Gerard Meijssen gerard.meijs...@gmail.com wrote: Hoi, The name was Batavia at that time in any language. The issue is that when you fudge information in this way, you cannot have proper queries. This is why Daniel is wrong and the notion that labels are simple needs to be revisited. It is not rare at all and it exists in many domains. 
This is why it is wrong, wrong, wrong. Thanks, GerardM On 6 March 2014 19:31, Joe Filceolaire filceola...@gmail.com wrote: Use 'Birth name (P513)' (string datatype) for Cassius Clay or 'Official name' (proposed property with monolingual text datatype) for Batavia - with date qualifiers. Joe On Thu, Mar 6, 2014 at 4:12 PM, Gerard Meijssen gerard.meijs...@gmail.com wrote: Hoi, So how do I indicate that up to a particular date Jakarta was called Batavia? Muhammad Ali was called Cassius Clay? There is no discussion about it. All there is is a (potentially perceived) inability to use appropriate labels at will. Labels are not simple. Thanks, Gerard On 6 March 2014 17:07, Daniel Kinzler daniel.kinz...@wikimedia.de wrote: Am 06.03.2014 16:27, schrieb Gerard Meijssen: Hoi, I hope this will be revisited. Many items change their name and, depending on a date, are called differently. If the name is something that is changed, debated, or otherwise a subject of discussion, create a statement using an appropriate property. The point of having labels is precisely that they are simple. -- daniel -- Daniel Kinzler Senior Software Developer Wikimedia Deutschland Gesellschaft zur Förderung Freien Wissens e.V. ___ Wikidata-l mailing list Wikidata-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-l
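The statement-based approach Joe and Denny describe - official names carried as statements with start/end date qualifiers, while the label stays simple - can be sketched in a few lines. This is an editor's illustration, not Wikidata code; the tuple layout stands in for qualifiers, and the Batavia/Jakarta dates are approximate, for illustration only.

```python
# Sketch: a query picking the official name valid at a given date from
# date-qualified statements, leaving the plain label untouched.
from datetime import date

# (name, start, end) - end None means "still current". Dates illustrative.
official_names = [
    ("Batavia", date(1619, 1, 1), date(1942, 3, 8)),
    ("Jakarta", date(1942, 3, 8), None),
]

def name_at(statements, when):
    """Return the name whose qualifier interval contains the given date."""
    for name, start, end in statements:
        if start <= when and (end is None or when < end):
            return name
    return None

print(name_at(official_names, date(1800, 6, 1)))  # -> Batavia
print(name_at(official_names, date(2014, 3, 7)))  # -> Jakarta
```

This is exactly the division of labour in the thread: the label answers "what do we display?", while the statements answer "what was it called in 1800?".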
[Wikidata-l] Status PropertySuggester
Welcome to Wikidata! I am very much looking forward to seeing the results of your work. The demo looks very promising, and the results are already so much better than what we currently have. The answers are also very fast, which is promising. Awesome work, and welcome! On Thu Jan 09 2014 at 8:42:45 AM, Weidhaas, Virginia virginia.weidh...@student.hpi.uni-potsdam.de wrote: Hi, first of all we would like to introduce ourselves. We are students from the wikidata.lib bachelor project at the chair of Prof. Dr. Naumann (Information Systems). We will work on Wikidata from October 2013 until June 2014. Our mentor is Anja Jentzsch. Every member is a student of the Hasso Plattner Institute in the fifth semester of their Bachelor's. We aim to provide an extension for Wikidata that simplifies adding statements by suggesting properties which fit the item. Bugzilla ticket we relate to: https://bugzilla.wikimedia.org/show_bug.cgi?id=46555 Our project documentation: https://github.com/Wikidata-lib/Wikidata.lib/wiki/Intelligent-Forms Status: At the moment we have API functionality to return property suggestions ranked by correlation with an item or a set of property ids. You can try it out yourself here: http://suggester.wmflabs.org/wiki/index.php/Spezial:PropertySuggester Alternatively you can use the underlying API module wbsgetsuggestions. The next step will be to integrate that functionality with the entityselector input field when adding statements to an item. We are looking forward to your feedback and ideas. Moritz, Christian, Virginia, Felix
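The announcement names the API module (wbsgetsuggestions) without showing a request. As an editor's sketch, here is how a request URL for such a MediaWiki API module could be assembled; the parameter names other than the module name follow general MediaWiki API conventions and are assumptions, not documented parameters of this module.

```python
# Sketch: building a request URL for the suggester's API module mentioned
# above. Only "wbsgetsuggestions" comes from the announcement; "entity" and
# "format" are assumed, MediaWiki-style parameter names.
from urllib.parse import urlencode

def suggestion_url(base, entity, fmt="json"):
    """Assemble an api.php URL asking for property suggestions for an item."""
    params = {"action": "wbsgetsuggestions", "entity": entity, "format": fmt}
    return base + "?" + urlencode(params)

url = suggestion_url("https://www.wikidata.org/w/api.php", "Q42")
print(url)
```

Fetching this URL (e.g. with urllib.request) would then return the ranked property suggestions for the item, if the module is deployed under these parameters.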
Re: [Wikidata-l] How are queries doing?
The main reason why Queries are not done yet is that in the beginning of 2013 I deprioritized them compared to the original plan. Only a single developer kept working on them, instead of a major part of the team, as was originally planned. I made this decision because it became clear to me that we would likely be able to continue the Wikidata development beyond the original 12-month plan (as was indeed the case) and that, in the medium run, rushing this functionality would only hurt the project. I thus decided to increase the priority of tasks which had a higher short-term benefit and were more immediate: many smaller things, but also more datatypes, ranks, and clean-ups, as well as reactions to the roll-outs which had begun back then. This made us highly responsive to the current needs of the community, and led to sustained growth of Wikidata. If it were needed, queries could be rushed. But that would have a negative impact on the long-term sustainability of the project. If it were deemed a higher priority, the development of queries could be sped up. But this comes at a sacrifice regarding other functionality. Thus yes, more resources would lead to a faster development of queries (if it were decided that this would be the appropriate priority). The latter especially means that a sustained contribution from external developers can also lead to a faster development of the query functionality. We have seen with the sustained support of Benestar for the Badges functionality that this is feasible and possible. So instead of simply expressing complaints about features not being developed fast enough, how about actually helping to make them real? It is Open Source, after all. Or at least simply make a case for the importance of this functionality? The development team keeps listening to the community like no other that I know of, and prioritizes its effort with respect to that. So, in short, blame me. 
Cheers, Denny On Tue, Jan 7, 2014 at 2:08 PM, Jan Kučera kozuc...@gmail.com wrote: Hm, nice to read all the reasoning why queries are still not possible, but I think we live in 2014 and not 1914 actually... seems like the problem is too small a budget or bad management... cannot really think of another reason. How much do you think it would cost to make queries a reality for production at Wikidata? Regards, Jan 2013/11/29 Gerard Meijssen gerard.meijs...@gmail.com Hoi, Please understand that providing functionality like query is something that has to fit into a continuously live environment. This is an environment where the Wikidata functionality is used all the time and where some of the underlying functionality is changed as well. The Wikidata development is not happening in a vacuum. Given that we hope to get a new type of property next week, it should be obvious that Wikidata is not feature complete. When you add extra functionality like a query engine, you add extra complications while the work is ongoing to get to the stage where Wikidata is feature complete for the data types. Another aspect is that it is NOT for the Wikidata team to decide what goes into production on Wikipedia projects. The Ask 1.0 functionality, for instance, is at its release level. It is now for other people to determine if they want to include it. They have their own road maps, and it is not obvious for an observer what the rationales are. NB Ask 1.0 is also used in Semantic MediaWiki and it provides a query kind of functionality. Query does require some performance <grin>and what is too much</grin>. So in one aspect there is query functionality to be used in Wikipedia e.a. What the query functionality that is still being built will deliver is not clear to me. On another note, there are other projects that have lingered before they were implemented. Nothing new here. There have been other projects that had to change because of external pressures. Nothing new here. 
If you want query functionality on the existing data now, there is a hack that works quite nicely. It makes use of data replicated to the Labs environment. The replication is broken and, given the holidays, it has not been picked up for three days now, so the data is three days old. Thanks, GerardM On 29 November 2013 14:03, Martynas Jusevičius marty...@graphity.org wrote: Jan, my suspicion is that my predictions from last year hold true: it is a far more complex task to design a scalable and performant data model, query language and/or query engine solely for Wikidata than the designers of this project anticipated - unless they did anticipate, and now knowingly fail to deliver. You can check some threads from December last year, and they relate to even older ones: http://www.mail-archive.com/wikidata-l@lists.wikimedia.org/msg01415.html Martynas On Fri, Nov 29, 2013 at 1:47 PM, Jan Kučera kozuc...@gmail.com wrote: Ok. One is a bit disappointed seeing
Re: [Wikidata-l] [Wikisource-l] DNB 11M bibliographic records as CC0
Thanks for reviving this thread, Luiz. I also wanted to ask whether we should be updating parts of DNB and similar data. Maybe not create new entries, but for those that we already have, add some of the available data and point to the DNB dataset? On Fri, Dec 6, 2013 at 3:24 PM, Luiz Augusto lugu...@gmail.com wrote: Just found this thread while browsing my email archives (I'm/was inactive on Wikimedia for at least 2 years). IMHO it would be very helpful if a central place hosting metadata from digitized works were created. In my past experience, I've found lots of PD-old books in languages like French, Spanish and English in repositories from Brazil and Portugal, with the UI mostly in Portuguese (i.e., with a very low probability of being found by volunteers from the subdomains of those languages), for example. I particularly love validating metadata more than proofreading books. Perhaps a tool/place like this creates new ways to contribute to Wikisource and helps with user retention (based on some Wikipedians that have fun making good articles but also sometimes love to simply make trivial changes in their spare time)? I know that the thread was focused on general metadata from all kinds and ages of books, but I had this idea while reading it. [[:m:User:555]] On Mon, Aug 26, 2013 at 10:42 AM, Thomas Douillard thomas.douill...@gmail.com wrote: I know, I started a discussion about porting the bot to Wikidata in the scientific journal WikiProject. One answer I got: the bot owner had other things to do in his life than running the bot and was not around very often any more. Having everything in Wikidata already will be a lot more reliable and lazier: no tool that works one day but not the other, no effort to tell the newbies that they should go to another website, no significant problem. 
Maybe one objection would be that the data could be vandalised easily, but maybe we should find a way to deal with imported sourced data which has no real reason to be modified, just marked deprecated or updated by another import from the same source. 2013/8/26 David Cuenca dacu...@gmail.com If the problem is to automate bibliographic data importing, one solution is what you propose, to import everything. Another one is to have an import tool to automatically import the data for the item that needs it. In WP they do that: there is a tool to import book/journal info by ISBN/DOI. The same can be done in WD. Micru On Mon, Aug 26, 2013 at 9:23 AM, Thomas Douillard thomas.douill...@gmail.com wrote: If Wikidata has the ambition to be a really reliable database, we should do everything we can to make it easy for users to use any source they want. In this perspective, if we get data with guaranteed high quality, it makes it easy for Wikidatians to find and use these references. Entering a reference in the database seems to me a highly tedious, boring, and easily automated task. With that in mind, any reference that the user will not have to enter by hand is something good, and importing high-quality source data should pass every Wikidata community barrier easily. If there is no problem for the software to handle that much information, I say we really have no reason not to do the imports. 
Tom -- Etiamsi omnes, ego non
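Micru mentions importing book metadata by ISBN. As an editor's sketch of the kind of sanity check such an import tool would run before fetching anything, here is the standard ISBN-13 check-digit validation; this is the public algorithm, not code from any particular Wikidata or Wikipedia tool.

```python
# Sketch: validate an ISBN-13 before using it to fetch bibliographic data.
# Standard algorithm: alternating weights 1 and 3 over the 13 digits; the
# weighted sum must be divisible by 10.

def isbn13_valid(isbn):
    """Return True if the hyphen-insensitive string is a valid ISBN-13."""
    digits = [int(c) for c in isbn.replace("-", "") if c.isdigit()]
    if len(digits) != 13:
        return False
    total = sum(d * (3 if i % 2 else 1) for i, d in enumerate(digits))
    return total % 10 == 0

print(isbn13_valid("978-3-16-148410-0"))  # -> True  (the classic example ISBN)
print(isbn13_valid("978-3-16-148410-5"))  # -> False (wrong check digit)
```

Rejecting malformed identifiers before the import keeps obviously broken references out of the database without restricting what sources contributors may use.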
Re: [Wikidata-l] quantities datatype available for testing
It is either obvious that they should be entering only integers or positive numbers, in which case such feedback isn't helpful, or it might end up being too restrictive again. Who is to say that a system like this won't get used to force cities to have a population of an integer bigger than 10,000? I understand the wish and desire to restrict user input, but I would like to remind everyone that Wikidata comes from the wiki side, which adheres more to the 'let's gather input and then verify it' side than the 'let's make everyone give us correct input in the first place' side. On Fri, Nov 22, 2013 at 11:24 AM, Helder . helder.w...@gmail.com wrote: On Fri, Nov 22, 2013 at 4:11 PM, Lukas Benedix bene...@zedat.fu-berlin.de wrote: The problem I see with this practice is that a user doesn't get any feedback that he is entering 'invalid' values. +1 Helder
Re: [Wikidata-l] quantities datatype available for testing
So instead it is better to limit your freedom to express yourself in the first place? I'd take the bot. At least in the history of the article it is recorded that someone tried to enter 123.45 for a population, and we can later figure out what was happening. Why not wait and see if this is really a problem? I wonder how many such mistakes will ever be entered, besides jokes and vandalism. And the latter is easier to catch if we don't require the pranksters to use data that sounds correct. Do we have any indication that contributors are being supported by a system that doesn't let them enter negative numbers for populations? On Fri, Nov 22, 2013 at 1:46 PM, Lukas Benedix bene...@zedat.fu-berlin.de wrote: I don't want to feel like John Connor... hunted by a bot that comes after my edits and reverts them only because I entered 123.45 for a property that should be an integer. Am Fr 22.11.2013 21:56, schrieb Denny Vrandečić: It is either obvious that they should be entering only integers or positive numbers, in which case such feedback isn't helpful, or it might end up being too restrictive again. Who tells me that a system like this won't get used in order to force cities to have a population of an integer bigger than 10,000? I understand the wish and desire to restrict user input, but I would like to remind everyone that Wikidata comes from the wiki side, which adheres more to the 'let's gather input and then verify it' than the 'let's make everyone give us correct input in the first place' side. On Fri, Nov 22, 2013 at 11:24 AM, Helder . helder.w...@gmail.com wrote: On Fri, Nov 22, 2013 at 4:11 PM, Lukas Benedix bene...@zedat.fu-berlin.de wrote: The problem I see with this practice is that a user doesn't get any feedback that he is entering 'invalid' values. 
+1 Helder
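The middle ground between Lukas's wish for feedback and Denny's "gather input, then verify" stance can be sketched as soft validation: accept the edit always, but show the contributor a warning. This is an editor's illustration of the idea, not Wikidata code; the constraint table and function names are hypothetical.

```python
# Sketch: warn-but-accept validation. The edit is never rejected, so the
# 123.45 population still lands in the history, but the contributor gets
# immediate feedback. CONSTRAINTS is a hypothetical per-property rule table.

CONSTRAINTS = {"population": {"integer": True, "min": 0}}

def save_quantity(prop, value):
    """Store the value unconditionally; return it with any warnings raised."""
    warnings = []
    rules = CONSTRAINTS.get(prop, {})
    if rules.get("integer") and value != int(value):
        warnings.append(f"{prop} is usually an integer, got {value}")
    if "min" in rules and value < rules["min"]:
        warnings.append(f"{prop} is usually >= {rules['min']}, got {value}")
    return value, warnings

value, warns = save_quantity("population", 123.45)
print(value)   # the edit is stored regardless
print(warns)   # but the contributor sees the feedback Lukas asked for
```

Because nothing is blocked, the community constraint cannot be abused to enforce, say, populations above 10,000: it can only annotate, never reject.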
Re: [Wikidata-l] Questions about statement qualifiers
Hello Antoine, just to add to what was already said: a qualifier in Wikidata is not a statement about a statement. In RDF terms, the pattern that we follow is not the reification of the triple and then making triples with the reified triple as a subject, as per http://www.w3.org/TR/rdf-mt/#ReifAndCont , but rather the pattern of n-ary relations per http://www.w3.org/TR/swbp-n-aryRelations/ . The use cases there visualize very nicely how Wikidata maps to RDF: http://www.w3.org/TR/swbp-n-aryRelations/#useCase1 This is also what Wikidata's mapping-to-RDF document explains and motivates: https://meta.wikimedia.org/wiki/Wikidata/Development/RDF I hope this helps, Denny On Oct 31, 2013 3:40 AM, Antoine Zimmermann antoine.zimmerm...@emse.fr wrote: Hello, I have a few questions about how statement qualifiers should be used. First, my understanding of qualifiers is that they define statements about statements. So, if I have the statement: Q17 (Japan) P6 (head of government) Q132345 (Shinzō Abe) with the qualifier: P39 (office held) Q274948 (Prime Minister of Japan), it means that the statement holds an office, right? It seems to me that this is incorrect and that this qualifier should in fact be a statement about Shinzō Abe. Can you confirm this? Second, concerning temporal qualifiers: what does it mean that the start or end is 'no value'? I can imagine two interpretations: 1. the statement is true forever (a person is a dead person from the moment of their death till the end of the universe); 2. (for end date) the statement is still true, and we cannot predict when it is going to end. For me, case number 2 should rather be marked as 'unknown value' rather than 'no value'. But again, what does 'unknown value' mean in comparison to having no indicated value? Third, what if a statement is temporarily true (say, X held office from T1 to T2), then becomes false, and becomes true again (like X held the same office from T3 to T4 with T3 > T2)? 
The situation exists for Q35171 (Grover Cleveland), who has the following statement: Q35171 P39 (position held) Q11696 (President of the United States of America) with qualifiers, and a second occurrence of the same statement with different qualifiers. The Wikidata user interface makes it clear that there are two occurrences of the statement with different qualifiers, but how does the Wikidata data model allow me to distinguish between these two occurrences? How do I know that: P580 (start date) March 4 1885 only applies to the first occurrence of the statement, while: P580 (start date) March 4 1893 only applies to the second occurrence of the statement? I could have a heuristic that says if two start dates are given, then assume that they are the starting points of two disjoint intervals. But can I always guarantee this? Best, AZ -- Antoine Zimmermann ISCOD / LSTI - Institut Henri Fayol École Nationale Supérieure des Mines de Saint-Étienne 158 cours Fauriel 42023 Saint-Étienne Cedex 2 France Tél: +33(0)4 77 42 66 03 Fax: +33(0)4 77 42 66 66 http://zimmer.aprilfoolsreview.com/
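Denny's n-ary-relations answer resolves Antoine's Grover Cleveland puzzle: in the data model each statement is a separate object carrying its own qualifier set, so a start date can never float free of its occurrence. The following is an editor's sketch of that shape; the statement IDs and the simplified layout are illustrative, not the exact Wikibase JSON.

```python
# Sketch: two occurrences of the same (property, value) claim are two
# distinct statement objects, each with its own qualifiers - the n-ary
# relation pattern, not RDF reification. IDs here are illustrative.

statements = [
    {"id": "Q35171$term-1", "property": "P39", "value": "Q11696",
     "qualifiers": {"P580": "1885-03-04", "P582": "1889-03-04"}},
    {"id": "Q35171$term-2", "property": "P39", "value": "Q11696",
     "qualifiers": {"P580": "1893-03-04", "P582": "1897-03-04"}},
]

# Each start date is attached to exactly one statement object, so no
# heuristic about disjoint intervals is ever needed.
starts = {s["id"]: s["qualifiers"]["P580"] for s in statements}
print(starts)
```

Under reification, both start dates would hang off reifications of the same triple and become indistinguishable; with one node per statement occurrence, the association is unambiguous by construction.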
Re: [Wikidata-l] creator template, wikimedia commons
This is completely up to the community: whether they want this data and the necessary structures for it really depends on the scope of the dataset. But here it is the same: there is no way to use this data in the short term for the metadata in Commons. This will be possible in a few months, if you are willing to wait that long. The best way is probably to describe the dataset (in scope, i.e. what it covers, and depth, i.e. what it says about the covered items) on Wikidata or here, and see if there are opinions about adding this data. Cheers, Denny 2013/9/25 Antoine Isaac ais...@few.vu.nl Hello Denny, I think we in Europeana had the same problem in the GLAMwiki toolset project [1]. We wanted to submit the metadata we had for Europeana objects to be uploaded to Commons, but that was not fully possible... So we'd have to think of an alternative. Do you think it could happen via Wikidata? Best, Antoine [1] It is a bit early for that question right now. In the long run we plan to have metadata about Commons media files (the current state of the discussion is here: https://commons.wikimedia.org/wiki/Commons:Wikidata_for_media_info) - but this is planned for 2014. For now, the Creator template in Commons cannot be replaced with Wikidata. There is no way to integrate data from Wikidata in an arbitrary Commons page (which is what you would need in order to replace the Creator template). We have a bug for this (https://bugzilla.wikimedia.org/show_bug.cgi?id=47930) and aim for this to be completed this fall / early winter. My assumption would be to, for the long term, create the Creators as items in Wikidata (if they do not exist) and add data to them. But this will not yield any short-term visible results, which is frustrating. 
So you might want to have a double strategy: add them to Wikidata and add them in the Creator namespace, just as you have done so far. Does anyone have another opinion on this? Sorry for not having better news right now, Denny 2013/9/21 rupert THURNER rupert.thur...@gmail.com hi, we are currently experimenting to have, after ZB Zürich earlier in the year [1], a second museum from Switzerland uploading full-quality images (i.e. TIF format) [2]. i was wondering what is the most wikidata-compatible way of adding creator information to an image like this one: https://commons.wikimedia.org/wiki/File:Soleure_Aa_0012.tif the painter winterlin is in the category, has a template including personal information, just like the wikipedia article, and the wikidata entry. rupert. [1] https://commons.wikimedia.org/wiki/Commons:Zentralbibliothek_Z%C3%BCrich [2] https://commons.wikimedia.org/wiki/Commons:Zentralbibliothek_Solothurn -- Project director Wikidata Wikimedia Deutschland e.V. | Obentrautstr. 72 | 10963 Berlin Tel. +49-30-219 158 26-0 | http://wikimedia.de Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e.V. Eingetragen im Vereinsregister des Amtsgerichts Berlin-Charlottenburg unter der Nummer 23855 B. Als gemeinnützig anerkannt durch das Finanzamt für Körperschaften I Berlin, Steuernummer 27/681/51985. 
Re: [Wikidata-l] Counting sitelinks of subclasses.
I would be surprised if that theory held true. I expect that both very abstract (fruit) and extremely specific (golden delicious) items would have a lower sitelink count than the golden layer of most useful terms (apple) in the hierarchy (I am reminded of the theory of word length and term frequency in linguistics). But I would assume that the subclass hierarchy that Wikidata will eventually exhibit would indeed have such a golden layer (and that these terms are not randomly distributed over the hierarchy). Would be fun to examine :) Cheers, Denny 2013/9/24 Klein, Max kle...@oclc.org Hello All, It struck me that one interesting way to see if subclasses are useful was to test this hypothesis. Let QID_a and QID_b be two Wikidata items. Conjecture: if QID_b is a subclass of QID_a, then count_sitelinks(QID_b) <= count_sitelinks(QID_a). Has anyone investigated this problem, or can think of an efficient way to test it? Or can tell me why it ought not to be true? Maximilian Klein Wikipedian in Residence, OCLC +17074787023
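Max's conjecture is mechanically checkable: walk the subclass edges and look for pairs where the subclass has more sitelinks than its superclass. Here is an editor's toy harness with invented counts chosen to illustrate Denny's "golden layer" objection; a real test would read the counts and `subclass of` (P279) relations from Wikidata dumps.

```python
# Sketch: search for counterexamples to the conjecture
#   subclass_of(b, a)  =>  count_sitelinks(b) <= count_sitelinks(a)
# over toy data. The counts below are invented for illustration.

sitelinks = {"fruit": 120, "apple": 180, "golden delicious": 15}
subclass_of = [("apple", "fruit"), ("golden delicious", "apple")]

def counterexamples(edges, counts):
    """Return the (sub, super) pairs where the subclass has MORE sitelinks."""
    return [(b, a) for b, a in edges if counts[b] > counts[a]]

print(counterexamples(subclass_of, sitelinks))
```

With these numbers the mid-level term violates the conjecture against its abstract superclass (apple > fruit) while the very specific term does not, which is exactly the non-monotonic "golden layer" shape Denny predicts.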
Re: [Wikidata-l] test.wikidata.org offers URL datatype now
in the case of MediaWiki wikis, I guess http://en.wikipedia.org/wiki/Technical_University_of_Denmark should be preferred, but that is beside the point. Yes, spaces are not escaped. That was intentional, as they are also not escaped when entering the URL in wiki syntax. Should we behave differently? Thank you for the tests! 2013/9/6 Finn Årup Nielsen f...@imm.dtu.dk It is apparently not possible to enter a URL with spaces and have it automatically escaped, e.g., http://en.wikipedia.org/wiki/Technical University of Denmark should be entered as: http://en.wikipedia.org/wiki/Technical%20University%20of%20Denmark On the other hand, http://en.wikipedia.org/wiki/København works OK for http://en.wikipedia.org/wiki/K%C3%B8benhavn Also http://københavn.dk (http://xn--kbenhavn-54a.dk) works. See https://test.wikidata.org/wiki/Q132 and https://test.wikidata.org/wiki/Q133 cheers Finn Årup Nielsen On 09/06/2013 12:10 PM, Denny Vrandečić wrote: Hello all, in preparation of next week's deployment to Wikidata.org, test.wikidata.org now has the new datatype URL deployed. If you have the time, we would appreciate it if you tested it and let us know about errors and problems. The URL datatype should be a big step in allowing us to introduce better sourcing and reliability of the content of Wikidata. Cheers, Denny
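Finn's two observations correspond to two standard transformations, which this editor's sketch reproduces with the Python standard library: percent-encoding spaces in a URL path, and IDNA-encoding a non-ASCII hostname. It shows what "automatic escaping" would produce for his examples; it is not Wikidata's actual URL handling.

```python
# Sketch: the two escapings behind Finn's test cases.
# 1) percent-encoding spaces in a path segment
# 2) IDNA (punycode) encoding of a non-ASCII hostname
from urllib.parse import quote

path = quote("Technical University of Denmark")
print(path)  # Technical%20University%20of%20Denmark

host = "københavn.dk".encode("idna").decode("ascii")
print(host)  # xn--kbenhavn-54a.dk
```

Note the asymmetry Finn observed: the space case needs escaping in the path, while the København case works because browsers apply the IDNA transformation to the hostname themselves.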
Re: [Wikidata-l] The Day the Knowledge Graph Exploded
Just a few corrections to the historical dates given by Tom. 2013/8/23 Tom Morris tfmor...@gmail.com In a word, no. Google acquired Metaweb, the company that built Freebase, which forms the core of the Knowledge Graph, in 2010. Metaweb was founded in 2005 (interesting Google search: Metaweb founding) and started extracting information from Wikipedia into Freebase in 2006. https://www.freebase.com/m/0gw0?links&lang=en&historical=true The first DBpedia release was in 2007. Semantic information nets go back to the 60s. TBL coined the term semantic web in 2006. TBL coined the term semantic web at the latest in 1994, probably even before (I don't have Weaving the Web at hand, but here are TBL's slides from the WWW conference in 1994: http://www.w3.org/Talks/WWW94Tim/) Wikidata is a great project, but this progress has been building, excruciatingly slowly, over decades. One could even make the argument that Wikidata is the result of the Knowledge Graph and its antecedents rather than the other way around. Wikidata is influenced by RDF (1999), OWL (2004), Semantic MediaWiki (2005), Freebase (2006), DBpedia (2007), Semantic Forms (2007), and many, many other technologies that are less visible or don't have such a strong brand (and Michael is very aware of that history; he's been around for years, working on the technologies these are based on). I understand Michael's question to be much more concrete: does the progress in Wikidata have anything to do with the changes in the Knowledge Graph's visibility in Google's searches that happened last month? Cheers, Denny Tom
Als gemeinnützig anerkannt durch das Finanzamt für Körperschaften I Berlin, Steuernummer 27/681/51985. ___ Wikidata-l mailing list Wikidata-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-l
Re: [Wikidata-l] The Day the Knowledge Graph Exploded
Oh, that's a clear and loud "I have no idea" :) 2013/8/23 Tom Morris tfmor...@gmail.com: On Fri, Aug 23, 2013 at 10:10 AM, Denny Vrandečić denny.vrande...@wikimedia.de wrote: I understand Michael's question to be much more concrete: does the progress in Wikidata have anything to do with the changes in the Knowledge Graph's visibility in Google's searches that happened last month? So, what's your opinion? Tom
Re: [Wikidata-l] Make Commons a wikidata client
Hi Maarten, thanks. That's the best proposal I have seen so far for how to proceed with Phase 1 on Commons. I had usually pushed Commons support further to the back, but with this I think we would indeed create some real value with a small change. I will bounce Commons Phase 1 client support up on my list. I guess we should disallow sitelinks to the File: namespace, in order to avoid people trying to add metadata about the media files themselves? Cheers, Denny 2013/8/10 Maarten Dammers maar...@mdammers.nl: Hi everyone, At Wikimania we had several discussions about the future of Wikidata and Commons. Some broader feedback would be nice. Now we have a property Commons category (https://www.wikidata.org/wiki/Property:P373). This is a string and an intermediate solution. In the long run Commons should probably be a wikibase instance in its own right (structured metadata stored at Commons) integrated with Wikidata.org; see https://www.wikidata.org/wiki/Wikidata:Wikimedia_Commons for more info. In the meantime we should make Commons a Wikidata client like Wikipedia and Wikivoyage. How would that work? We have an item (https://www.wikidata.org/wiki/Q9920) for the city Haarlem. It links to the Wikipedia article Haarlem and the Wikivoyage article Haarlem. It should link to the Commons gallery Haarlem (https://commons.wikimedia.org/wiki/Haarlem). We have an item (https://www.wikidata.org/wiki/Q7427769) for the category Haarlem. It links to the Wikipedia category Haarlem. It should link to the Commons category Haarlem (https://commons.wikimedia.org/wiki/Category:Haarlem). 
The category item (Q7427769) links to the article item (Q9920) using the property category's main topic (https://www.wikidata.org/wiki/Property:P301). We would need to make an inverse property of P301 to make the backlink. Some reasons why this is helpful: * Wikidata takes care of a lot of things like page moves, deletions, etc. Now with P373 (Commons category) it's all manual. * Having Wikidata on Commons means that you can automatically get backlinks to Wikipedia, have intros for categories, etc. * It's a step in the right direction. It makes it easier to do the next steps. Small change, lots of benefits! Maarten
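Maarten's proposal amounts to adding Commons sitelinks next to the existing Wikipedia and Wikivoyage ones. Here is a minimal sketch of what the sitelink data on the two Haarlem items could then look like; the structure mirrors the sitelinks Wikibase already exposes for Wikipedia clients, but the commonswiki entries are hypothetical, since Commons is not a client yet:

```python
# Sketch: sitelinks of the two Haarlem items if Commons became a Wikidata
# client. The "commonswiki" entries are hypothetical; the structure
# mirrors the existing Wikipedia/Wikivoyage sitelinks.
article_item = {
    "id": "Q9920",  # the city Haarlem
    "sitelinks": {
        "enwiki": {"site": "enwiki", "title": "Haarlem"},
        "enwikivoyage": {"site": "enwikivoyage", "title": "Haarlem"},
        "commonswiki": {"site": "commonswiki", "title": "Haarlem"},  # gallery page
    },
}
category_item = {
    "id": "Q7427769",  # the category Haarlem
    "sitelinks": {
        "enwiki": {"site": "enwiki", "title": "Category:Haarlem"},
        "commonswiki": {"site": "commonswiki", "title": "Category:Haarlem"},
    },
}
```

Since sitelinks are unique per site and per item, the gallery and the category on Commons naturally land on the two different items, just as the article and category pages on Wikipedia do today.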
Re: [Wikidata-l] Help on adding a not exact date
This doesn't really work yet in the UI. Basically, you could only enter something like 6th c. BC, which in this case would not be correct. 1st Millennium BC would be possible and correct, but it is a bit too wide. That's the only thing supported right now. We will be working on improving this situation. Cheers, Denny 2013/8/22 Mathieu Stumpf psychosl...@culture-libre.org: Hello, I'm beginning with Wikidata and didn't find how to add a date which is not exact. In fact I found [1], which would gain clarity with an example, but the given format is refused when I try to use it. For example, I want to add birth and death dates for Pittacus of Mytilene [2], for which I know no accurate value, but it is something like -650 to -570, with something like a quarter-century accuracy. Actually, looking at the English Wikipedia, -640 to -568 is given, but without sources, unfortunately. Well, for the sake of the example let's ignore that; what input should I provide to the birth date property to match my previous description? [1] https://meta.wikimedia.org/wiki/Wikidata/Data_model#Dates_and_times [2] https://www.wikidata.org/wiki/Q311835 -- Association Culture-Libre http://www.culture-libre.org/
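Under the hood, the data model linked above represents imprecise dates as a timestamp plus a precision level (6 = millennium, 7 = century, 8 = decade, 9 = year, and so on). A minimal sketch of how a circa-650 BC birth date could be encoded as a Wikidata-style time value; the field names follow the published data model, but the helper function itself is hypothetical:

```python
# Sketch: building a Wikidata-style time value for an imprecise date.
# Precision codes from the Wikidata data model: 6 = millennium,
# 7 = century, 8 = decade, 9 = year. The helper is illustrative only.
PRECISION = {"millennium": 6, "century": 7, "decade": 8, "year": 9}

def time_value(year, precision):
    """Encode a (possibly negative) year with a given precision level."""
    sign = "-" if year < 0 else "+"
    return {
        "time": f"{sign}{abs(year):04d}-00-00T00:00:00Z",
        "timezone": 0,
        "before": 0,
        "after": 0,
        "precision": PRECISION[precision],
        "calendarmodel": "http://www.wikidata.org/entity/Q1985727",  # proleptic Gregorian
    }

# Pittacus of Mytilene: born around 650 BC, known only to about a century.
birth = time_value(-650, "century")
```

A century-level precision (7) matches "somewhere around -650" much better than the millennium-level value that the current UI would force.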
Re: [Wikidata-l] Phase #3 deadline
Hi Jan, we currently assume that we will have a first querying capability available this fall. The implementation has progressed very well in the last few months and weeks, including special pages to access it, API modules, etc. Indeed, querying will be available later than originally anticipated, since we had reprioritized it and therefore had far fewer people working on this functionality (for a while, only one person was working on it), while other tasks were moved to a higher priority, such as more data types, better history support, allowing arbitrary access to items in the clients, support for other sister projects, etc. By the way, we mostly dropped the idea of speaking about development goals in terms of phases, as it doesn't really fit the current development plan, but that's just a naming issue. So expect some simple querying capability (give me all items with a specific value on this property) to be deployed within the next month or three, but don't be mad if we slip by a few weeks due to some unexpected deployment issue. Cheers, Denny 2013/8/21 Jan Kučera kozuc...@gmail.com: Hi there, how is the development of phase #3 (lists) going? Is it due soon? Sub-question: I guess a sorting feature in lists will be implemented by default, as lists without sorting would be a bad idea? Thanks for the answer. Cheers, Kozuch
Re: [Wikidata-l] [Wikimedia-l] Meeting about the support of Wiktionary in Wikidata
[Sorry for cross-posting] Yes, I agree that the OmegaWiki community should be involved in the discussions, and I pointed GerardM to our proposals and discussions, using him as a liaison. We also looked and keep looking at the OmegaWiki data model to see what we are missing. Our latest proposal is different from OmegaWiki in two major points: * Our primary goal is to provide support for structured data in the Wiktionaries. We do not plan to be the main resource ourselves, where readers come in order to look up something; we merely provide structured data that a Wiktionary may or may not use. This parallels the role Wikidata has with regard to Wikipedia. This also highlights the difference between Wikidata and OmegaWiki, since OmegaWiki's goal is to create a dictionary of all words of all languages, including lexical, terminological and ontological information. * A smaller difference is the data model. Wikidata's latest proposal to support Wiktionary is centered around lexemes, and we do not assume that there is such a thing as a language-independent defined meaning. But no matter what model we end up with, it is important to ensure that the bulk of the data can flow freely between the projects, and even though we might disagree on this issue in the modeling, it is ensured that the exchange of data is widely possible. We tried to keep notes on the discussion we had today: http://epl.wikimedia.org/p/WiktionaryAndWikidata My major take-home messages are: * the proposal needs more visual elements, especially a mock-up or sketch of what it would look like and how it could be used on the Wiktionaries * there is no generally accepted place for a discussion that involves all Wiktionary projects. Still, my initial decision to have the discussion on the Wikidata wiki was not a good one, and it should and will be moved to Meta. 
Having said that, the current proposal for the data model of how to support Wiktionary with Wikidata seems to have garnered a lot of support so far. So this is what I will continue building upon. Further comments are extremely welcome. You can find it here: http://www.wikidata.org/wiki/Wikidata:Wiktionary As said, it will be moved to Meta as soon as the requested mockups and extensions are done. Cheers, Denny 2013/8/10 Samuel Klein meta...@gmail.com: Hello, On Fri, Aug 9, 2013 at 6:13 PM, JP Béland lebo.bel...@gmail.com wrote: I agree. We also need to include the Omegawiki community. Agreed. On Fri, Aug 9, 2013 at 12:22 PM, Laura Hale la...@fanhistory.com wrote: Why? The question of moving them into the WMF fold was pretty much no, because the project has an overlapping purpose with Wiktionary. This is not actually the case. There was overwhelming community support for adopting OmegaWiki - at least simply providing hosting. It stalled because the code needed a security and style review, and Kip (the lead developer) was going to put some time into that. The OW editors and dev were very interested in finding a way forward that involved Wikidata and led to a combined project with a single repository of terms, meanings, definitions and translations. Recap: The page describing the OmegaWiki project satisfies all of the criteria for requesting WMF adoption. * It is well-defined on Meta: http://meta.wikimedia.org/wiki/Omegawiki * It describes an interesting idea clearly aligned with expanding the scope of free knowledge. * It is not a 'competing' project to the Wiktionaries; it is an idea that grew out of the Wiktionary community, has been developed for years alongside it, and shares many active contributors and linguaphiles. * It started an RfC which garnered 85% support for adoption. 
http://meta.wikimedia.org/wiki/Requests_for_comment/Adopt_OmegaWiki Even if the current OW code is not used at all for a future Wiktionary update -- and this idea was proposed and taken seriously by the OW devs -- their community of contributors should be part of discussions about how to solve the Wiktionary problem that they were the first to dedicate themselves to. Regards, Sam. ___ Wikimedia-l mailing list wikimedi...@lists.wikimedia.org Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/wikimedia-l, mailto:wikimedia-l-requ...@lists.wikimedia.org?subject=unsubscribe
[Wikidata-l] Some Wiktionary data in Wikidata - Updated proposal
Following numerous discussions and after input from many people, we are happy to present the new version of the proposal that would lead to Wikidata supporting structured data for the Wiktionaries. http://www.wikidata.org/wiki/Wikidata:Wiktionary I am very thankful to all those who provided input, and am also happy to be able to send this out before Wikimania, and thus potentially have a good discussion there. But obviously, everyone should feel free to chime in beyond that. I would be glad if I could again ask the community to spread the word to the Wiktionary communities, in order to gain as much feedback from them as possible, and hopefully their support. Cheers, Denny
Re: [Wikidata-l] All interwikis from nl.wikivoyage have been moved to Wikidata
That's amazing! And so fast! Any idea how many links there were? (Just curious.) Thanks for reporting! Denny 2013/7/29 Romaine Wiki romaine_w...@yahoo.com: Hello all, I am happy to announce that all interwikis from all articles, templates, and project pages (except some archive pages) have been moved to Wikidata. This includes the removal of all local interwikis. While doing this, I roughly checked all pages to see whether they are connected to the right article on Wikidata. I solved a lot of interwiki conflicts, often with disambiguation pages. I also made sure that every article has an item on Wikidata. The Dutch Wikivoyage is the first Wikivoyage that has fully switched to Wikidata. Greetings, Romaine
Re: [Wikidata-l] Wikidata Map Interface
Hi Jacobo, I hope you don't mind that I share the answer with the list; I think the answer to this question might be of general interest. The JavaScript creating the visualization in the browser is here: https://dl.dropboxusercontent.com/u/172199972/map/map.js As you can see, it is just a simple usage of the HTML5 canvas. It requires two data files such as these (careful, large): https://dl.dropboxusercontent.com/u/172199972/map/wdlabel.js https://dl.dropboxusercontent.com/u/172199972/map/graph.js The first contains all items, their latitude/longitude, and their label. The second contains the graph, i.e. the way items are connected to each other. These two files are created by the following Python scripts, in two steps. First, you need to create the knowledge base. This can be done with the scripts here: https://github.com/mkroetzsch/wda There, use the script https://github.com/mkroetzsch/wda/blob/master/wda-analyze-edits-and-write-kb.py Be careful when you run it: it will download all Wikidata dumps. This might need a few gigabytes of free space and a decent internet connection. Now, you should have the file kb.txt.gz, containing the knowledge base. 
By the way, you can also download the knowledge base as it is created nightly by us here: https://dl.dropboxusercontent.com/u/172199972/kb.txt.gz Finally, you will need a few scripts from here: https://github.com/vrandezo/wikidata-analytics Run them in the following order: geolabel.py - extracts a list of all locations and their label from the knowledge base https://github.com/vrandezo/wikidata-analytics/blob/master/geolabel.py geolabel2wdlabel.py - transforms the list to JavaScript for ready consumption by the Wikidata Map Interface https://github.com/vrandezo/wikidata-analytics/blob/master/geolabel2wdlabel.py geo.py - extracts a list of all locations from the knowledge base https://github.com/vrandezo/wikidata-analytics/blob/master/geo.py graph.py - extracts the simple knowledge graph from the knowledge base https://github.com/vrandezo/wikidata-analytics/blob/master/graph.py geograph.py - extracts the part of the simple knowledge graph that connects geographical items with each other (needs geo and graph) https://github.com/vrandezo/wikidata-analytics/blob/master/geograph.py geograph2geojs.py - transforms the geograph to JavaScript for ready consumption by the Wikidata Map Interface https://github.com/vrandezo/wikidata-analytics/blob/master/geograph2geojs.py This should give you the two files wdlabel.js and graph.js, which will be called by the Wikidata Map Interface (see its HTML source in order to see how). This process is run nightly on a machine we have standing here in the office. I am planning to set this up on Labs, but didn't find the time yet. I hope this helps, Denny 2013/7/29 Jacobo Nájera jac...@metahumano.org: Hi Denny, I am interested in the Wikidata Map Interface. Where can I see and download the code? I want to experiment with it and document it. Thanks, Jacobo -- Wikimedia México
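The nightly build described above can be summarized as a small pipeline. This is a sketch only: the script names are the ones listed in the mail, run in the stated order, but the interpreter invocation and working directory are assumptions.

```python
# Sketch: the nightly map-data pipeline described above, in the stated
# order. Script names come from the mail; how they are invoked (plain
# "python script.py" in the checkout directory) is an assumption.
import subprocess

STEPS = [
    "geolabel.py",          # locations + labels from the knowledge base
    "geolabel2wdlabel.py",  # -> wdlabel.js for the map interface
    "geo.py",               # list of all locations
    "graph.py",             # the simple knowledge graph
    "geograph.py",          # graph restricted to geo items (needs geo + graph)
    "geograph2geojs.py",    # -> graph.js for the map interface
]

def run_pipeline():
    """Run each step, aborting the pipeline if any script fails."""
    for script in STEPS:
        subprocess.run(["python", script], check=True)
```

The two label steps and the four graph steps are independent chains, so in principle they could run in parallel; the sequential order above is simply the one given in the mail.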
[Wikidata-l] Watson receives Feigenbaum Prize - Money donated to Wikimedia and Wikidata
The AAAI has awarded the Feigenbaum Prize to the Watson team, which decided to donate the prize money to the Wikimedia Foundation, explicitly listing Wikidata as a reason. When asked for a comment, Wikidata said: Q2013 P3 Q12253. Congratulations to the Watson team and their stunning results! More info: http://blog.wikimedia.org/2013/07/16/ibm-research-watson-aaai-prize-wikimedia-foundation/ Deutsch: http://blog.wikimedia.de/2013/07/16/ibm-research-spendet-preisgeld-des-aaai-feigenbaum-preises-fur-watson-an-die-wikimedia-foundation/
[Wikidata-l] 239 Million language links removed from the Wikipedias
In June 2012 I ran an analysis to discover how many language links there were on Wikipedia. Last week, I reran the analysis - and the results are stunning. Of the 240 million language links, 239.2 million have been removed so far. This is an amazing result by the community. Congratulations! Last year, 4.9 GB of text was required to represent the language links. These have almost completely gone. And whereas last year, for the smaller Wikipedias, the language links made up a substantial part of their content, they have now almost completely disappeared. Congratulations! Let's get ready for having the same positive effect on Wikivoyage, starting next week! (Note that the deployment might happen on a Tuesday for a change, as Monday will be blocked for a few other deployments.) Here is the full data: 2013 analysis: http://simia.net/languagelinks/2013.html 2012 analysis: http://simia.net/languagelinks/index.html Addshore is currently working on getting some actionable analytics out of the dumps, in order to deal with the last remaining language links. Cheers, Denny
[Wikidata-l] Propagation of changes to the Wikipedias currently lagging
copied from http://www.wikidata.org/wiki/Wikidata:Project_chat#Propagation_of_changes_to_the_Wikipedias_currently_lagging Changes to Wikidata are currently propagated to the Wikipedias with a lag of several hours, but this should be fixed during the next few hours. The Dispatcher, which is responsible for pushing the edits from Wikidata to the individual Wikipedias, choked yesterday on some edit. We did not notice until the morning (thanks to the community for reporting in various places). We got the Dispatcher running again. The backlog then was about 19 hours, and it is now going down again, seemingly at a rate of about two hours per hour, so it should have caught up in about half a day. You can see the [[Special:DispatchStats|current status on wiki]]. We currently do not know why the Dispatcher stalled, nor on which edit exactly. We simply skipped a few edits, and it started working again. We will continue investigating. Because of that, it might happen again at any time. We keep watching the stats. A detailed description of our current status can be found on the [http://article.gmane.org/gmane.org.wikimedia.wikidata.technical/117 Wikidata-tech mailing list]. --[[User:Denny|Denny]] ([[User talk:Denny|talk]]) 11:32, 13 July 2013 (UTC)
[Wikidata-l] A personal note, and a secret
I am truly and deeply amazed by the Wikidata community. A bit more than a year ago, I moved to Berlin and assembled a fantastic team of people to help realize a vision. Today, we have collected millions of statements, geographical locations, points in time, persons and their connections, creative works, and species - and every single minute, hundreds of edits are improving and changing this knowledge base that anyone can edit and that anyone can use for free. So much more is left to do, and the further we go, the more opportunities open up. More datatypes - links are on the horizon, and quantities will be a major step. I can hardly wait to see Wikidata answer queries. And there are so many questions unanswered - what does the community need in order to maintain Wikidata best? Which tools, reports, and special pages are needed? What is the right balance between automation and flexibility? Besides Wikipedia, Wikidata can be used in many other places. We just started the conversations about sister projects, but external projects are also expected to become smarter thanks to Wikidata. I expect tools and libraries and patterns for these types of uses will emerge in the next few months, and applications will become more intelligent and act more informed, powered by Wikidata. A project like Wikidata needs in its early days a strong, sometimes stubborn leader in order to accelerate its growth. But at some point a project gathers sufficient momentum, and the community moves faster than any single leader could lead; suddenly the leader might become a bottleneck, and instead of accelerating the project they might be stalling it. Wikidata has reached the point where it is time for me to step down. The Wikidata development team in Berlin will, in the upcoming weeks and months, set up processes that allow the community, which I have learned to trust even more during this year, to take over the reins. 
I will stay with the team until the end of September, and then become again what I have been for the last decade - a normal and proud member of the Wikimedia communities. I would also like to use this chance to reveal a secret. Wikidata items are identified by a Q followed by a number, Wikidata properties by a P followed by a number. Whereas it is obvious that the P stands for property, some of you have asked - why Q? My answer was that Q not only looks cool, but also makes for great identifiers, and hopefully a certain set of people will some day associate a number like Q9036 with something they can look up in Wikidata. But the true reason is that Q is the first letter of the name of the woman I love. We married last year, amid all that Wikidata craziness, and I am thankful to her for the patience she had while I was discussing whether to show wiki identifiers or language keys, which bugs to prioritize when, and which calendar systems were used in Sweden. I will continue to be a community member of Wikidata. My new day job, though, will be at Google, and from there I hope to continue to effectively further our goals towards a world where everyone has access to the sum of all knowledge. Sincerely, Denny Vrandečić
Re: [Wikidata-l] Wikidata Properties Search
Hi Hady, use the MediaWiki API, like this: http://www.wikidata.org/w/api.php?action=query&list=allpages&format=json&apnamespace=120&aplimit=10 You can page through all the results using http://www.wikidata.org/w/api.php?action=query&list=allpages&format=json&apnamespace=120&aplimit=10&apcontinue=P110 etc. 2013/7/10 Hady elsahar hadyelsa...@gmail.com: Hi, I'm playing around with creating some mappings between DBpedia properties and Wikidata ones. In order to visualize the properties and make an easy search, I wanted to create a simple file containing the property URIs and their labels in English. I tried to scrape those URIs from 1 to 200, for example http://www.wikidata.org/wiki/Special:EntityData/P164.nt I noticed that Wikidata properties are not in numerical order like the entities; lots of properties are not there. Is there a better practice, or should I just ignore empty properties? Which number intervals should I use? Thanks. Regards - Hady El-Sahar Research Assistant Center of Informatics Sciences | Nile University http://nileuniversity.edu.eg/ email: hadyelsa...@gmail.com Phone: +2-01220887311 http://hadyelsahar.me/ http://www.linkedin.com/in/hadyelsahar
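The continuation scheme Denny describes can be sketched in a few lines of Python. The `action=query&list=allpages` module and its `apnamespace`/`aplimit`/`apcontinue` parameters are the ones from the mail; the helper function name is illustrative.

```python
# Sketch: paging through all property pages (namespace 120) on Wikidata
# with the MediaWiki allpages API, as described above. The helper name
# is illustrative; the API parameters are the ones from the mail.
from urllib.parse import urlencode

API = "http://www.wikidata.org/w/api.php"

def allpages_url(apcontinue=None, limit=10):
    """Build the query URL; pass the previous batch's continue value to page on."""
    params = {
        "action": "query",
        "list": "allpages",
        "format": "json",
        "apnamespace": 120,  # the Property: namespace
        "aplimit": limit,
    }
    if apcontinue:
        params["apcontinue"] = apcontinue
    return API + "?" + urlencode(params)

first = allpages_url()                    # first batch
second = allpages_url(apcontinue="P110")  # next batch, as in the mail
```

Fetching each URL with any HTTP client returns a JSON batch of existing property pages; repeating with the continue value from each response walks the whole namespace and naturally skips deleted or never-created property numbers, which answers the "empty properties" question.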
Re: [Wikidata-l] Update nl-wiki request for bot
I just wanted to say thank you! That's truly amazing work. As far as I can tell, more than 200 million lines of wikitext have so far been removed from the Wikipedias. That's 200 million lines that do not have to be maintained anymore. (I have not run the actual analysis yet; I have been waiting for the bots to finish their job, but maybe I should, as it is pretty much exactly a year since I ran the analysis on the pre-Wikidata-age Wikipedia dumps.) You are amazing! Cheers, Denny 2013/7/8 addshorewiki addshorew...@gmail.com: For the bot removing interwiki links that are redirects etc., my new code should be ready by this weekend (I hope), and this should give the lists I have a big clear-out! :) Addshore On 8 Jul 2013 04:32, Romaine Wiki romaine_w...@yahoo.com wrote: Today we reached at nl-wiki the situation that more than 64% of the interwiki conflicts have been solved. A lot of this work has been done by the Dutch community, but a lot of work is also done by users from other projects - thank you very much for the help! I have checked the complete template namespace and category namespace for local interwikis and all are removed from these pages, so these namespaces are now clean on nl-wiki. If users from especially the smaller Wikipedias want to know on what pages of their wiki local interwikis are left, you can use AWB, download the latest database dump and run a query on that dump. If you want to know what query you need exactly, e-mail me personally, as the string of the query is a bit long. But it is easy to do, even for newcomers to bots and code. (I can also do it for you.) While doing all this solving of interwiki conflicts, we came across several things: * A lot of biological conflicts are in our list of interwiki conflicts. Certain genera have only one species under them, which makes some Wikipedias combine the two into one article, while others want two articles, as they are two layers in the taxonomical tree. 
One article on the English Wikipedia that created hundreds of interwiki conflicts was a list to which many redirects were linking, and these redirects were used for interwikis. All have been removed with a bot. * Another thing we noticed is that a lot of renamings of articles to make room for a disambiguation page haven't been properly executed, as on Wikidata, in an item of a group of articles, one of the links was to a disambiguation page. (It would be nice if a bot could check for disambiguation pages (based on the presence of a template from [[MediaWiki:Disambiguationspage]] on that wiki) so that we know where we need to fix this.) * Another thing we see is that a lot of interwikis are still local because the local interwiki links to a page that is a redirect because the page was renamed, while this wasn't changed by a bot. Most interwiki bots do not recognize that the redirect is the same page as the one added to Wikidata. So we need a bot to remove all interwikis that link to a redirect pointing to a page that is in the same item as the page where the local interwikis are. Let's clean this mess up! Romaine --- http://www.wikidata.org/wiki/User:Romaine
Re: [Wikidata-l] Some Wiktionary data in Wikidata
Done this change.

2013/6/20 Denny Vrandečić denny.vrande...@wikimedia.de wrote: Thinking about it again, and discussing it internally, maybe we should replace "word" with "expression" and "meaning" with "sense"? Any +1s or differing opinions?
Re: [Wikidata-l] Some Wiktionary data in Wikidata
It was never intended to create a Wiktionary database separate from Wikidata, but to have it be a part of Wikidata.

2013/6/21 Gerard Meijssen gerard.meijs...@gmail.com wrote: Hoi, Denny, when you look at the data currently in Wikidata, you find what is in essence more than a basis for a translation dictionary. The notion that we need something separate is a notion you should reassess. What we need is some clean-up of the labels currently in use. What we also need are more definitions. We do not need another Wikidata for Wiktionary. Thanks, GerardM
Re: [Wikidata-l] Some Wiktionary data in Wikidata
Thank you, Sundar!

2013/6/20 BalaSundaraRaman sundarbe...@yahoo.com wrote: Hi Denny, I've left a message at the Tamil Wiktionary Village Pump. http://ta.wiktionary.org/w/index.php?title=%E0%AE%B5%E0%AE%BF%E0%AE%95%E0%AF%8D%E0%AE%9A%E0%AE%A9%E0%AE%B0%E0%AE%BF:%E0%AE%86%E0%AE%B2%E0%AE%AE%E0%AE%B0%E0%AE%A4%E0%AF%8D%E0%AE%A4%E0%AE%9F%E0%AE%BF&diff=1194066&oldid=1194039 Cheers, Sundar

"That language is an instrument of human reason, and not merely a medium for the expression of thought, is a truth generally admitted." - George Boole, quoted in Iverson's Turing Award Lecture
Re: [Wikidata-l] Some Wiktionary data in Wikidata
Thanks, I did, and did so now again. As far as I can tell, it seems compatible (and would even be compatible with the simpler current Wikidata model, actually). Cheers, Denny

2013/6/19 Tom Morris tfmor...@gmail.com wrote: If you haven't already, it might be worth looking at the Freebase schema for WordNet, especially how it connects synsets to Freebase topics: https://www.freebase.com/base/wordnet/synset?schema= Tom
Re: [Wikidata-l] Some Wiktionary data in Wikidata
The current proposal does not cover grammar rules explicitly. If at all, I would regard that as a later extension once the lexical information is in place. Also, my limited understanding of the topic does not even allow for coming up with a data model to cover grammar rules, or for knowing whether there is a sufficiently widely accepted model for representing grammar, or whether there are still discussions about whether Chomsky or Systemic Functional Grammars or whatever else would make the cut...

Regarding "word" vs. "expression": I do not care much about the actual term, and both seem valid. With the suggested change from "meaning" to "word sense", though, it might make more sense to keep "word" here. But as said, no strong opinion. I definitely see that saying that http://en.wiktionary.org/wiki/carry_coals_to_Newcastle is a "word" is kind of weird; "expression" would fix that. Any further opinions? Cheers, Denny

2013/6/19 David Cuenca dacu...@gmail.com wrote: Hi Denny, Thank you very much for this fantastic update about the intentions of supporting a semantic dictionary in Wikidata :) Just a minor correction: I think instead of "word" it should be "expression", because some languages don't follow the same logic. On the other hand, do you think it would be possible to accommodate grammar rules too? I have added some people from Apertium who might have some insights about it. Cheers, Micru -- Etiamsi omnes, ego non
Re: [Wikidata-l] Some Wiktionary data in Wikidata
Thinking about it again, and discussing it internally, maybe we should replace "word" with "expression" and "meaning" with "sense"? Any +1s or differing opinions?
[Wikidata-l] Some Wiktionary data in Wikidata
Hello, I would like everyone interested in the interaction of Wikidata and Wiktionary to take a look at the following proposal. It tries to serve all use cases mentioned so far while still remaining fairly simple to implement. http://www.wikidata.org/wiki/Wikidata:Wiktionary

To the best of our knowledge, we have checked all discussions on this topic, as well as related work like OmegaWiki, WordNet, etc., and are building on top of that. I would greatly appreciate it if some liaison editors could reach out to the Wiktionaries in order to get a wider discussion base. We are currently reading more of the related work and trying to improve the proposal. It would be great if we could keep the discussion on the discussion page on the wiki, so as to bundle it a bit, or at least have pointers there. http://www.wikidata.org/wiki/Wikidata_talk:Wiktionary

Note that we are giving this proposal early. Implementation has not started yet (obviously, otherwise the discussion would be a bit moot), and this is more of a mid-term commitment (i.e., if the discussion goes smoothly, it might be implemented and deployed by the end of the year or so, although this obviously depends on the results of the discussion). Cheers, Denny

-- Project director Wikidata Wikimedia Deutschland e.V. | Obentrautstr. 72 | 10963 Berlin Tel. +49-30-219 158 26-0 | http://wikimedia.de Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e.V. Eingetragen im Vereinsregister des Amtsgerichts Berlin-Charlottenburg unter der Nummer 23855 B. Als gemeinnützig anerkannt durch das Finanzamt für Körperschaften I Berlin, Steuernummer 27/681/51985. ___ Wikidata-l mailing list Wikidata-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-l
Re: [Wikidata-l] Visualisations of The Most Unique Wikipedias According to Wikidata
Can I have a statement about how much easier it would have been with Wikidata? :)

2013/6/13 Brent Hecht bhe...@cs.umn.edu Hi all, In my (recently finished) thesis, I looked at a lot of different properties (e.g. topic, centrality, popularity via pageviews) of common and unique concepts across multilingual Wikipedia. It's all in Chapter 3: http://www-users.cs.umn.edu/~bhecht/publications/bhecht_thesis_final.pdf. A lot of these questions were addressed in the pre-Wikidata era :-) - Brent

Brent Hecht, Ph.D. Assistant Professor Department of Computer Science and Engineering University of Minnesota e: bhe...@cs.umn.edu t: @bhecht w: http://www-users.cs.umn.edu/~bhecht/

On Jun 13, 2013, at 12:33 PM, Klein,Max kle...@oclc.org wrote: That's an excellent recommendation. I will attempt to research the common properties of the least unique Wikidata items. Maximilian Klein Wikipedian in Residence, OCLC +17074787023

From: wikidata-l-boun...@lists.wikimedia.org on behalf of Paul A. Houle Sent: Thursday, June 13, 2013 6:57 AM To: Discussion list for the Wikidata project. Subject: Re: [Wikidata-l] Visualisations of The Most Unique Wikipedias According to Wikidata

I think Poland may do better than average because Polish people, out of national pride, have made a special effort to be well documented in the English Wikipedia and to represent a Polish point of view on topics like the city of Gdansk. One fascinating thing about Wikidata is that it provides access to all of the wonderful concepts shared in the Wikiverse, so now sites like Ookaboo can collect pictures of many beautiful places that don't exist in en Wikipedia. On the other hand, I'm also interested in the other end of the curve: those elite concepts which are represented widely across the Wikipedias. Surely this is connected with subjective importance, with some flavor of global appeal, whatever that would turn out to mean. Any chance you could run a report on those?
-Original Message- From: Mathieu Stumpf Sent: Thursday, June 13, 2013 4:51 AM To: wikidata-l@lists.wikimedia.org Subject: Re: [Wikidata-l] Visualisations of The Most Unique Wikipedias According to Wikidata

On 2013-06-12 22:22, Klein,Max wrote: Hello Wikidatians, I made a few visualizations of the distributions of language links in Wikidata items. You can also use these stats to see which items represent Wikipedia articles that are unique to a language, and compare the uniqueness of all languages. I also investigate all the items with just two language links, to look at Wikipedia pairs. See the full analysis: http://notconfusing.com/the-most-unique-wikipedias-according-to-wikidata/ [1]

Interesting! Could you also create that kind of visualisation by topic: how much uniqueness comes from biographies of local football people, compared with historical events or abstract concepts? Also, on a completely unrelated topic, you may explain to me in private what you mean by "Create a communal house to live in", which is on your public todo list; it sounds interesting. :P -- Association Culture-Libre http://www.culture-libre.org/

___ Wikidata-l mailing list Wikidata-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-l -- Project director Wikidata Wikimedia Deutschland e.V. | Obentrautstr. 72 | 10963 Berlin Tel. +49-30-219 158 26-0 | http://wikimedia.de Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e.V. Eingetragen im Vereinsregister des Amtsgerichts Berlin-Charlottenburg unter der Nummer 23855 B. Als gemeinnützig anerkannt durch das Finanzamt für Körperschaften I Berlin, Steuernummer 27/681/51985.
Re: [Wikidata-l] Some Wiktionary data in Wikidata
It's not about the actual content, but rather about the data model.

2013/6/19 Neil Harris n...@tonal.clara.co.uk On 19/06/13 15:03, Tom Morris wrote: If you haven't already, it might be worth looking at the Freebase schema for WordNet, especially how it connects synsets to Freebase topics: https://www.freebase.com/base/wordnet/synset?schema= Tom

WordNet does not seem to be under a free license -- see http://wordnet.princeton.edu/wordnet/license/ Since Wikidata's CC0 licensing allows commercial use, surely integrating any kind of data from WordNet risks conflict with WordNet's license? Neil
[Wikidata-l] Usage of Wikidata: the brilliance of Wikipedians
I am completely amazed by a particularly brilliant way that Wikipedia uses Wikidata. Instead of simply displaying the data from Wikidata and removing the local data, a template and workflow is proposed which:

* grabs the relevant data from Wikidata
* compares it with the data given locally in the Wikipedia
* displays the Wikipedia data
* adds a maintenance category in case the data is different

This allows both communities to check the maintenance category, provides a safety net against vandal changes, still notices if some data has changed, etc., and lets the local data be phased out over time when the community gets comfortable and if it wants to. It is a balance of maintenance effort and data quality. I am not saying this is the right solution for every use case, every topic, or every language. But it is a perfect example of how the community will surprise us by coming up with ingenious solutions if they get enough flexibility, powerful tools, and enough trust. Yay, Wikipedia!

The workflow is described here: http://en.wikipedia.org/wiki/Template_talk:Commons_category#Edit_request_on_24_April_2013:_Check_Wikidata_errors

There is an RFC currently going on about whether and how to use Wikidata data in the English Wikipedia, coming out of the discussion that was here a few days ago. If you are an English Wikipedian, you might be interested: http://en.wikipedia.org/wiki/Wikipedia:Requests_for_comment/Wikidata_Phase_2

-- Project director Wikidata Wikimedia Deutschland e.V. | Obentrautstr. 72 | 10963 Berlin Tel. +49-30-219 158 26-0 | http://wikimedia.de Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e.V. Eingetragen im Vereinsregister des Amtsgerichts Berlin-Charlottenburg unter der Nummer 23855 B. Als gemeinnützig anerkannt durch das Finanzamt für Körperschaften I Berlin, Steuernummer 27/681/51985. ___ Wikidata-l mailing list Wikidata-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-l
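The four-step workflow boils down to a small piece of compare-and-flag logic. Here is a minimal sketch in Python; the actual template is written in wikitext, and the function and category names below are illustrative, not the ones used on enwiki.

```python
def render_with_wikidata_check(local_value, wikidata_value):
    """Sketch of the compare-and-flag workflow: display the local value,
    but flag a maintenance category when Wikidata disagrees.
    Names are illustrative, not the actual enwiki template."""
    maintenance_categories = []
    if local_value is not None:
        display = local_value  # the Wikipedia data wins for display
        if wikidata_value is not None and local_value != wikidata_value:
            # differing data is flagged for human review, never overwritten
            maintenance_categories.append(
                "Category:Commons category differs from Wikidata")
    else:
        display = wikidata_value  # no local data: fall back to Wikidata
    return display, maintenance_categories
```

The key design choice is that a mismatch never silently changes what readers see; it only queues the page for a human to reconcile, which is exactly the safety net described above.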
Re: [Wikidata-l] How does wikidata handle topic redirect/merge/split
2013/4/7 Jianyong Zhang zhjy...@gmail.com On Tue, Apr 2, 2013 at 9:54 PM, Denny Vrandečić denny.vrande...@wikimedia.de wrote: 2013/4/1 Jianyong Zhang zhjy...@gmail.com 1) It becomes a redirect to another article; will Qx be changed in this scenario? I expect that if a Wikipedia article gets moved, this will be updated on the Wikidata item manually. Otherwise the language links that were displayed on the original article would not show up. If an article gets turned into a redirect to an already existing article, this would be a merge (see Question 4). Thanks for the detailed reply. I still have some questions on redirects. See, if the article A redirects to the article B, will we have 2 items, or only one for the final target and all its redirects? From my point of view, if a redirect talks about the same topic as its final target, then it makes sense to only have 1 item. Such as http://en.wikipedia.org/wiki/Obama and http://en.wikipedia.org/wiki/Barack_Obama. But in many cases, a redirect talks about a related but different topic from its final target. Such as http://en.wikipedia.org/wiki/Social_activist and http://en.wikipedia.org/wiki/Activism. How will wikidata handle such redirects? And back to the original question: if an article becomes a redirect, for the above 2 different scenarios, what will we do? In general, you can't point to a redirect from Wikidata. When entering one, Wikidata tries to resolve it to the target article and tries to save that - and if there is already an item linked to that article, the save will fail. If someone turns an article into a redirect, we don't notice that automatically. I hope that bots will clean that up over time, but I would consider that an issue with the data. Any data on that item cannot be used on the Wikipedia article, since it is a redirect. I hope that helps, Cheers, Denny -- Project director Wikidata Wikimedia Deutschland e.V.
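The resolve-then-fail behavior Denny describes can be sketched like this; the data structures are hypothetical toy stand-ins, not the actual Wikibase implementation.

```python
# Sketch of the sitelink-saving behavior described above: redirects are
# resolved to their target, and a conflicting link makes the save fail.
# All data structures here are illustrative, not the real Wikibase ones.
def save_sitelink(items, item_id, title, redirects):
    """Attach a Wikipedia article title to an item."""
    title = redirects.get(title, title)  # resolve a redirect to its target
    for other_id, links in items.items():
        if other_id != item_id and title in links:
            # Another item already links to this article: the save fails.
            raise ValueError(f"{title!r} is already linked from {other_id}")
    items.setdefault(item_id, set()).add(title)
    return title

items = {"Q64": {"Berlin"}}
# A redirect is resolved to its target article before saving.
assert save_sitelink(items, "Q76", "Obama", {"Obama": "Barack Obama"}) == "Barack Obama"
# Linking the same article from a second item fails.
try:
    save_sitelink(items, "Q99", "Berlin", {})
    raise AssertionError("expected a conflict")
except ValueError:
    pass
```

Note that, as the message says, this check only runs at save time: an article turned into a redirect *after* the link was saved is not detected automatically.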
[Wikidata-l] Our log table is exploding
Hey all, I just got a warning from Ops that our log table is growing extremely fast. A write-up of this is here: https://bugzilla.wikimedia.org/show_bug.cgi?id=47415 Basically, the vast majority of edits on Wikidata are written to the log table as they are autopatrolled. And since we have a lot of edits, this makes the table grow very quickly. We would like to:
* stop logging so many edits
* drop those logs that are already there about patrolling
We want to understand how that influences your workflows and what we can do about that. Please speak up if this change would be an issue. Cheers, Denny -- Project director Wikidata Wikimedia Deutschland e.V.
Re: [Wikidata-l] Qualifiers, bug fixes, improved search - all in one night!
Hey Dario, there is one simple fix we want to apply sooner rather than later, which is to use the number of language links for ranking. This should work rather well. The thing is that this is kinda hard to implement in MySQL, I figured, and that we would need to use something Lucene-based (probably Solr) for that. The Solr extension is quite far along, but we currently are not working on getting it deployed. In short, it's all in the pipes, and it just takes a bit... Cheers, Denny 2013/4/19 Dario Taraborelli dtarabore...@wikimedia.org Hi Lydia and all, great to hear about this deployment, I am particularly excited about qualifier support (as per my previous post). Since you also mention improvements to search, I was wondering whether you had specific plans for work on search functionality. Unless I use the Items by title page, if I type Berlin into a regular search form the item I am actually looking for (Q64) is ranked #34 in the search results (i.e. three clicks away on the more link). I'd be curious to hear the team's thoughts on how to make search more effective and user-friendly. Dario On Apr 18, 2013, at 2:26 PM, Lydia Pintscher lydia.pintsc...@wikimedia.de wrote: Heya folks :) We have just deployed qualifiers ( http://meta.wikimedia.org/wiki/Wikidata/Notes/Data_model_primer#Qualifiers ) and bug fixes on wikidata.org. Qualifiers! The bug fixes are especially for Internet Explorer 8. Please let me know how it is working for you now if you're using IE8 and if there are still any major problems when using Wikidata with it. In addition, the script we ran on the database to make search case-insensitive has finished running. This should be another huge step towards a nice search here. (This change also affects the autocomplete for items and properties.) As usual, please let me know what you think and tell me if there are any issues. Cheers Lydia -- Lydia Pintscher - http://about.me/lydia.pintscher Community Communications for Wikidata Wikimedia Deutschland e.V. Obentrautstr. 72 | 10963 Berlin | www.wikimedia.de
Re: [Wikidata-l] Page history and properties
This is in my opinion an upstream issue for MediaWiki proper. I do not think that templates and images from Commons are that different. Take this image for example: https://en.wikipedia.org/wiki/File:Treaty_of_Accession_2011_Ratification_Map.svg It always reflects the current state of ratification. Take the templates that display the conservation status of species in Wikipedia. They encode a whole lot of knowledge about different preservation status systems, and if they change, this is also not preserved anywhere in the history. https://en.wikipedia.org/wiki/Wikipedia:Conservation_status I agree that this is an issue. But our solution is consistent with the way it is done in other parts of Wikipedia, and a solution should not address Wikidata in isolation but Wikipedia as a whole. One way would be to create HTML dumps of Wikipedia at regular intervals, as, e.g., the Internet Archive does. A much more thorough discussion of this issue can be found in a RENDER deliverable I co-authored in 2010: http://render-project.eu/wp-content/uploads/2010/05/D1.1.2.pdf Cheers, Denny 2013/4/4 Gregor Hagedorn g.m.haged...@gmail.com when templates (or, in the case of wikidata, properties) get deleted or renamed. Nobody has come up with a good solution yet. I think we did discuss a simple, working solution: saving the value together with the Wikipedia page. The major argument against that was: it is a waste of storage to create a new Wikipedia page (perhaps daily) when property values included in a page are changed in Wikidata. I personally value trust and documentation of change much higher than disk storage, but even then, there are ways to balance this. So perhaps a modified proposal that matches the current development stage: if an editor saves a page with {{#property:population}}, the parser looks up the current value and changes this to {{#property:population|current value=2348732}}, and stores this wikitext version in the Wikipedia.
The same would apply to updating: saving {{#property:population|current value=2348732}} may result in {{#property:population|current value=2348700}} being saved. This would mean no additional waste of storage for articles that are regularly changed. For those that are not, one could imagine a bot-based monthly update check to make past knowledge transparent. I realize that this would require a pattern where the Wikidata-derived values would remain editable on the topic/article pages, i.e. the property function would have to be inserted in the template call, rather than in the template definition. Those wikidata properties automatically called inside templates, with a dynamic item decided by the current template call, would not be preserved. However, both editing patterns would be available, and it would be up to the community of each Wikipedia to choose the preferred one. (As I said previously: although similar to the issue of Commons images and templates, the issue at stake for Wikidata is different. Because of the problems in preserving a transparent editing history, updates to Commons images are generally restricted to truly minor improvements (contrast, cropping, better resolution, etc.). I am not aware of cases where Commons images are regularly replaced with updated content that is different in substance and thus automatically changes all Wikipedia pages, representing different knowledge. I don't want to exclude this, but even for changing company logos the usual solution is to create a new name, preserving the old logo. Similarly, templates may fail to work in old versions (big problem!), but I am not aware that a template would render out-of-time information when viewing a past revision. Thus, the problem of Wikidata with respect to endangering the trust basis of Wikipedia, the version system, is related, but different.)
Gregor
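Gregor's proposed parser transformation could be sketched as below. The `current value=` parameter syntax comes from his message; the regex and the lookup function are illustrative only, and a real implementation would live in the MediaWiki parser, not in Python.

```python
import re

# Sketch of the proposal above: on every save, the parser caches the
# current Wikidata value inside the property invocation itself, so the
# stored wikitext revision documents what the reader actually saw.
def cache_property_values(wikitext, lookup):
    def repl(match):
        name = match.group(1)
        return "{{#property:%s|current value=%s}}" % (name, lookup(name))
    # Matches both bare invocations and previously cached ones.
    pattern = r"\{\{#property:(\w+)(?:\|current value=[^}]*)?\}\}"
    return re.sub(pattern, repl, wikitext)

text = "Berlin has {{#property:population}} inhabitants."
cached = cache_property_values(text, lambda name: "3500000")
assert cached == "Berlin has {{#property:population|current value=3500000}} inhabitants."
# A later save refreshes the cached value in place instead of appending:
assert cache_property_values(cached, lambda name: "3501000").count("3501000") == 1
```

Because an unchanged value produces identical wikitext, this only costs a new revision when the Wikidata value actually differs, which is the storage trade-off discussed in the message.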
Re: [Wikidata-l] [Wikitech-l] Wikidata queries
2013/3/28 Petr Onderka gsv...@gmail.com How will the queries be formatted? Do I understand it correctly that a QueryConcept is a JSON object? Not decided yet. Probably it will be a JSON object, though, and edited through a UI. Have you considered using something more in line with the format of action=query queries? Yes, but it didn't seem a good fit, especially because the query module works on the page metadata and the queries we discuss here work on the item data. Though I guess what you need is much more complicated and trying to fit it into the action=query model wouldn't end well. I am not even sure it is much more complicated. But I am very worried it is too different. Cheers, Denny Petr Onderka [[en:User:Svick]] 2013/3/28 Denny Vrandečić denny.vrande...@wikimedia.de: We have a first write-up of how we plan to support queries in Wikidata. Comments on our errors and requests for clarifications are more than welcome. https://meta.wikimedia.org/wiki/Wikidata/Development/Queries Cheers, Denny P.S.: unfortunately, no easter eggs inside.
Re: [Wikidata-l] How does wikidata handle topic redirect/merge/split
Hi Alex, the current implementation of Wikidata supports the same level of history as MediaWiki itself, i.e. templates, images from Commons, and data from Wikidata have their own versioning schemes, and all information about their history is retained -- but when a page is rendered from a previous version, then the current templates, images, and data are displayed. While I know many people who share your opinion, I also think it is important to be consistent in this case. Cheers, Denny 2013/4/2 tanch...@mskcc.org Thank you Michael (and apologies to Denny for being addressed as Danny). We all know change is a constant, and we need to design information technology with that in mind. As you noted, population is a great example of something that is constantly changing. Including a date with population makes sense in certain situations. However, there may be cases when the data related to a specific article changes, where recent changes, the watchlist, and perhaps the history should also be updated. This can only happen when there is a two-way reference from the article to the data and back. Also, preserving the context of a page at that point in time can be valuable. Anyway, perhaps coding for the date as in the population example will be sufficient in most cases. I do somehow feel that we should make it easy for humans to create the articles and let the machine record the hard references, and allow humans some means to recognize that the associated data in the articles they are subscribed to has changed. Thanks again. Best Regards, Alex On Apr 2, 2013, at 12:15 PM, Michael Hale hale.michael...@live.com wrote: Well, you can still view the revision history of an item on Wikidata, as you'd expect. I view the information as being more tied to a specific reference than to a specific revision of the item. I don't think the notion of orphaned data is as big of a deal in a database as it is in an encyclopedia.
We can monitor the creation of new items the same way that new articles are monitored on the encyclopedia. Especially with historical data, it might not be currently included in any sites that we know, but it should still be there for when people want to make historical charts for reports, school projects, etc. The two methods we have under development to improve the situation are ranks and qualifiers. Ranks let you differentiate between multiple claims about a property as to which one is preferred (likely the one with the most reputable reference), and qualifiers are that extra bit of information that lets you differentiate multiple claims in a way that is appropriate for the property (perhaps a date for population values). Do you think these methods will be satisfactory for your concerns? -- From: tanch...@mskcc.org To: wikidata-l@lists.wikimedia.org Date: Tue, 2 Apr 2013 14:23:13 + Subject: Re: [Wikidata-l] How does wikidata handle topic redirect/merge/split Hi Danny, I've been on the distribution list since the development of wikidata started, and I think what everyone has set out to do and accomplished so far is amazing and will have a profound impact, just as Wikipedia has. I've been quietly on the sidelines absorbing some (I have to admit I can't follow all) of the intellectual discussions among the participants. I do have a thought about this issue of referential integrity and orphaned data that I'd like to share. Mediawiki has what links here for an article, at least for information residing on the same site. It also maintains what a page looks like at a point in time. Since data referenced on a specific edition/revision of an article can now reside outside of that article, the intent of the information in the article will be lost if it is not tied to the revision of the associated data when that information changes.
One way that this can probably be handled in some future implementation, if not already done, is to also carry within the reference the timestamp of the referenced data, as well as a backward reference from the data. It will be difficult and cumbersome for humans to do this, but as the link is stored in the mediawiki site, code can be added to make the reference. In that process, it can also inform the host of the data to add it to the what links here, so there is a backward reference. To prevent spam and other issues such as performance, only approved sites (such as wikipedia sites) can be added to what links here. Feel free to include back the distribution list in your reply if you see merit in this suggestion. Best Regards, Alex On Apr 2, 2013, at 9:54 AM, Denny Vrandečić denny.vrande...@wikimedia.de wrote: Hi Janyong, as Michael said, Wikidata does not automatically get updated in any case. We are planning to improve the experience with moving a page in the Wikipedias a bit, but it won't become automatic. Mostly because these issues are in general
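Michael's description of ranks and qualifiers earlier in this thread can be sketched with toy structures. These are illustrative only and not the actual Wikibase data model; the rank names are the three mentioned in the Wikidata design ("preferred", "normal", "deprecated").

```python
from dataclasses import dataclass, field

# Toy sketch of ranks and qualifiers as described in the thread above
# (illustrative structures, not the real Wikibase data model).
@dataclass
class Claim:
    prop: str
    value: str
    rank: str = "normal"  # "preferred", "normal", or "deprecated"
    qualifiers: dict = field(default_factory=dict)  # e.g. {"point in time": "2013"}

def best_claims(claims):
    """Prefer 'preferred'-ranked claims; otherwise fall back to 'normal' ones."""
    preferred = [c for c in claims if c.rank == "preferred"]
    return preferred or [c for c in claims if c.rank == "normal"]

claims = [
    Claim("population", "3400000", qualifiers={"point in time": "2010"}),
    Claim("population", "3500000", rank="preferred",
          qualifiers={"point in time": "2013"}),
]
assert [c.value for c in best_claims(claims)] == ["3500000"]
```

This shows why, as Dario argues later in this file, outdated values need not be deleted: the old population stays as a normal-ranked claim with its own date qualifier, while consumers pick the preferred one.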
[Wikidata-l] Wikidata queries
We have a first write up of how we plan to support queries in Wikidata. Comments on our errors and requests for clarifications are more than welcome. https://meta.wikimedia.org/wiki/Wikidata/Development/Queries Cheers, Denny P.S.: unfortunately, no easter eggs inside. -- Project director Wikidata Wikimedia Deutschland e.V.
Re: [Wikidata-l] A data model for Roman forts (castra)
Oh, I would ask you to please wait another week or two, for us to have qualifiers. Maybe they can deal with some of these cases. We just got them demoed today, and they really look neat, so I am very convinced they will be there with the next update. 2013/3/27 Michael Hale hale.michael...@live.com I think you can use either in other contexts. I see the tradeoff being that the inclusion syntax and templates for referring to properties of another item might be slightly longer. For the example of the children of Charles Dickens you definitely want to have each child be their own item, as it currently is: http://www.wikidata.org/wiki/Q5686. The question for construction phases of Roman forts is whether or not each phase has enough information to justify being a complete item. Although if you are ready to go right now, then it isn't really a question, because qualifiers aren't implemented yet. I suggest creating the extra properties you need and creating separate items for the construction phases, even if they only have a few properties each; then later on, if downgrading them to just qualified values of properties for the main fort item simplifies some of your queries, inclusion syntax usage, and template boxes, then you can certainly do that. -- Date: Wed, 27 Mar 2013 20:44:39 +0200 From: saturn...@gmx.com To: wikidata-l@lists.wikimedia.org Subject: Re: [Wikidata-l] A data model for Roman forts (castra) Thanks Denny and Michael, it really helps. It is indeed a major difference between string enumerations of XSD and values that should link to other items, resulting in a knowledge network. Choosing between lists of values and an own item as value, I would usually prefer an own item, because it could be used in other contexts, e.g. a sentence. A good example could be The children of Charles Dickens - Q were younger than On 03/27/2013 04:53 PM, Michael Hale wrote: Regarding the construction phases complex type that you want, you have a couple of options.
Properties support lists of values, so you could split the type into multiple properties and give a list of values for each. Then, when qualifiers are added, you could add a date range qualifier to each value to specify the phase, or just use string qualifiers that say phase 1, phase 2, etc., depending on how detailed the information is. You can also group the properties in their own item. So you could create an item called Potaissa phase 1 construction details and then have construction phase just be a list of those specific items, which themselves contain the information. -- Date: Wed, 27 Mar 2013 12:55:18 +0100 From: denny.vrande...@wikimedia.de To: wikidata-l@lists.wikimedia.org Subject: Re: [Wikidata-l] A data model for Roman forts (castra) I would say this is a good starting point. The Wikidata data model is described in full detail here [1], and an introductory primer is given here [2]. Qualifiers are not implemented yet, but will be there soon, followed by more datatypes (like time, geo, etc.). The major difference is that values like stone for material or opus-quadratum for technique should not be strings - this does not translate well. They should point to items, e.g. Q8063 instead of stone and Q2631941 instead of opus-quadratum. The other thing is that Wikidata does not really intend to enable constraints in the very strong sense that your schema chooses. So if someone wants to add a value for material that you did not preconceive, like Q40861 (Marble), Wikidata-as-a-software will not stop them from doing so (just as Wikipedia-as-a-software does not stop you from entering that, either) (see also [3]).
I hope this helps, Denny [1] http://meta.wikimedia.org/wiki/Wikidata/Data_model [2] http://meta.wikimedia.org/wiki/Wikidata/Notes/Data_model_primer [3] http://blog.wikimedia.de/2013/02/22/restricting-the-world/ 2013/3/27 Flaviu fla...@gmx.com I fully agree with you that the XSD model cannot be precisely integrated into Wikidata, and I also know Wikidata development is in progress. I think I could deal with simple properties like material, but I'm not sure how to deal with complex properties like construction phases. Even if there is no implementation yet, how could these complex properties be defined? On 03/26/2013 11:53 PM, Michael Hale wrote: You can't integrate the XSD model precisely as it is defined, because Wikidata doesn't allow all of the constraints that XSD allows. Specifically, you'll notice that you can't force an item to have a specific property (like the document or epigraphic reference in your model), and enumerations aren't currently supported. Wikidata has a global collection of properties, and any item can use any arbitrary subset of them. The list is here: http://www.wikidata.org/wiki/Wikidata:List_of_properties. Some of the ones you want already exist, like
Re: [Wikidata-l] Expiration date for data
We do have strong types, but only a few of them: item, commons media, string, time, geo, URL. Government leader would not be a supported type. The exact list and details are here: http://meta.wikimedia.org/wiki/Wikidata/Data_model#Datatypes_and_their_Values Cheers, Denny 2013/3/21 Michael Hale hale.michael...@live.com That seems better, to constrain the overall type of a qualifier to any property. It still doesn't feel exactly right, but I'm not sure what would. Now that I think about it more, for the case of heads of government it doesn't seem appropriate to me to use a qualifier at all. It would just be a list of items, which are presumably people. Each of those items would then have a single date or list of dates for start of head of government and end of head of government. The qualifier would be redundant. It seems the downside to having everything be strongly typed, like in Freebase, is that you end up with really weird and specific entity types like government leadership timespan to try to capture all of the details that you want, and the downside to semi-weakly typed items in Wikidata is that you might end up with different items representing the same information with different properties or qualifiers. But I have faith that Wikidata will ultimately work and achieve stability and convergence for the most common types, just like how template boxes naturally emerged on Wikipedia. And I think the key advantage of Wikidata is that it will achieve growth, stability, and convergence without suffocating from having too many weird and specific item types to try to bridge and glue different types of information together. -- Date: Thu, 21 Mar 2013 15:40:39 +0100 From: denny.vrande...@wikimedia.de To: wikidata-l@lists.wikimedia.org Subject: Re: [Wikidata-l] Expiration date for data We will have a time datatype, and every property is strongly typed. This is also true for properties used as qualifiers. Regarding the priority of qualifiers: very high.
They are the next major UI feature to be deployed, and as far as I can tell from the progress of the team it looks like they will be deployed in April. Cheers, Denny 2013/3/20 Dario Taraborelli dtarabore...@wikimedia.org I disagree, and fully concur with Tom: a generic string type for a datetime qualifier defies the purpose of making wikidata statements well-formed and machine-readable. I don't think we should enforce typing for *all* qualifiers and I second the general organic growth approach, but datetime qualifiers strike me as a fundamental exception. Would you represent geocoordinates as a generic string and wait for organic growth to determine the appropriate datatype? I appreciate the overheads of adding datatype support, but this decision will have a major impact on the shape of collaborative work on wikidata. Denny – on a related note, I wanted to ask you what is the priority of qualifier support relative to the other items you mentioned in your list. As I noted in my previous post, the only way for an editor to correct an outdated statement is to remove information (e.g. Lombardy: head of local government: -Roberto Formigoni +Roberto Maroni ): this information will then be lost forever in an item's revision history. The sooner we introduce basic support for qualifiers, the sooner we can avoid removing valuable information from wikidata entries just for the sake of keeping them up-to-date. Dario On Mar 15, 2013, at 10:09 AM, Michael Hale hale.michael...@live.com wrote: For most of the scenarios I can think of, parsing the dates out of strings that are in a standard format by convention will be much easier. The number of ways people will want to use qualifiers will increase like the number of properties and items. So the way I see it, we have to support string-based qualifiers at the minimum. Then I think we should only support strongly typed qualifiers if performance requires it. 
By setting an update polling frequency on templates that use the information I don't think we'll run into performance issues for most scenarios. Even with this example the qualifier type is a date range, not just a date. So do we want them to have to choose from a large, fixed list of qualifier types or just look at a similar example and set a string to something similar and then gradually enforce types on the most popular uses that we see. I think this type of organic growth as opposed to trying to guess the qualifier types in advance is exactly in the spirit of Wikipedia. -- Date: Fri, 15 Mar 2013 09:58:38 -0400 From: tfmor...@gmail.com To: wikidata-l@lists.wikimedia.org Subject: Re: [Wikidata-l] Expiration date for data On Fri, Mar 15, 2013 at 1:49 AM, Michael Hale hale.michael...@live.com wrote: Yes, I think once qualifiers are enabled you would just have something like: ... Property(head of local government) ...
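The property-level typing Denny describes (every property has one of a fixed set of datatypes, and values are checked against it) can be illustrated with a toy validator. The property-to-datatype assignments and the validation rules here are hypothetical; the authoritative list is on the Data model page linked in Denny's message.

```python
# Toy sketch of strongly typed properties as described above.
# Datatype names follow Denny's list; the checks themselves are illustrative.
DATATYPES = {"item", "commons media", "string", "time", "geo", "url"}

# Hypothetical property definitions for the example.
PROPERTIES = {"head of government": "item", "motto": "string"}

def check_value(prop, value):
    """Reject a value that does not fit the property's declared datatype."""
    dtype = PROPERTIES[prop]
    assert dtype in DATATYPES, f"unknown datatype {dtype!r}"
    if dtype == "item" and not (value.startswith("Q") and value[1:].isdigit()):
        raise TypeError(f"{prop} expects an item ID like Q64, got {value!r}")
    return True

assert check_value("head of government", "Q567")
assert check_value("motto", "Einigkeit und Recht und Freiheit")
try:
    check_value("head of government", "Angela Merkel")  # a string, not an item
    raise AssertionError("expected a type error")
except TypeError:
    pass
```

This is the distinction made in the thread: the *datatype* of a property is enforced by the software, while *which* properties an item carries is left entirely to editors.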
Re: [Wikidata-l] Expiration date for data
It really depends on your definitions :) Items are strongly typed as items. Any item can have any property. And only items can have properties. Time or geocoordinates, e.g., cannot have properties. But yes, there is no forcing of properties onto any item, nor any restriction of the usage of any property. See also here: http://blog.wikimedia.de/2013/02/22/restricting-the-world/ Cheers, denny 2013/3/21 Michael Hale hale.michael...@live.com Yes, I just meant that items aren't forced to have a specific set of properties by the software, so they are essentially weakly typed, right? -- Date: Thu, 21 Mar 2013 16:09:58 +0100 From: denny.vrande...@wikimedia.de To: wikidata-l@lists.wikimedia.org Subject: Re: [Wikidata-l] Expiration date for data We do have strong types, but only a few of them: item, commons media, string, time, geo, URL. Government leader would not be a supported type. The exact list and details are here: http://meta.wikimedia.org/wiki/Wikidata/Data_model#Datatypes_and_their_Values Cheers, Denny 2013/3/21 Michael Hale hale.michael...@live.com That seems better, to constrain the overall type of a qualifier to any property. It still doesn't feel exactly right, but I'm not sure what would. Now that I think about it more, for the case of heads of government it doesn't seem appropriate to me to use a qualifier at all. It would just be a list of items, which are presumably people. Each of those items would then have a single date or list of dates for start of head of government and end of head of government. The qualifier would be redundant. It seems the downside to having everything be strongly typed, like in Freebase, is that you end up with really weird and specific entity types like government leadership timespan to try to capture all of the details that you want, and the downside to semi-weakly typed items in Wikidata is that you might end up with different items representing the same information with different properties or qualifiers.
But I have faith that Wikidata will ultimately work and achieve stability and convergence for the most common types just like how template boxes naturally emerged on Wikipedia. And I think the key advantage of Wikidata is that it will achieve growth, stability, and convergence without suffocating from having too many weird and specific item types to try to bridge and glue different types of information together. -- Date: Thu, 21 Mar 2013 15:40:39 +0100 From: denny.vrande...@wikimedia.de To: wikidata-l@lists.wikimedia.org Subject: Re: [Wikidata-l] Expiration date for data We will have a time datatype, and every property is strongly typed. This is also true for properties used as qualifiers. Regarding the priority of qualifiers: very high. They are the next major UI feature to be deployed, and as far as I can tell from the progress of the team it looks like they will be deployed in April. Cheers, Denny 2013/3/20 Dario Taraborelli dtarabore...@wikimedia.org I disagree, and fully concur with Tom: a generic string type for a datetime qualifier defies the purpose of making wikidata statements well-formed and machine-readable. I don't think we should enforce typing for *all* qualifiers and I second the general organic growth approach, but datetime qualifiers strike me as a fundamental exception. Would you represent geocoordinates as a generic string and wait for organic growth to determine the appropriate datatype? I appreciate the overheads of adding datatype support, but this decision will have a major impact on the shape of collaborative work on wikidata. Denny – on a related note, I wanted to ask you what is the priority of qualifier support relative to the other items you mentioned in your list. As I noted in my previous post, the only way for an editor to correct an outdated statement is to remove information (e.g. Lombardy: head of local government: -Roberto Formigoni +Roberto Maroni ): this information will then be lost forever in an item's revision history. 
The sooner we introduce basic support for qualifiers, the sooner we can avoid removing valuable information from wikidata entries just for the sake of keeping them up-to-date. Dario On Mar 15, 2013, at 10:09 AM, Michael Hale hale.michael...@live.com wrote: For most of the scenarios I can think of, parsing the dates out of strings that are in a standard format by convention will be much easier. The number of ways people will want to use qualifiers will increase like the number of properties and items. So the way I see it, we have to support string-based qualifiers at the minimum. Then I think we should only support strongly typed qualifiers if performance requires it. By setting an update polling frequency on templates that use the information I don't think we'll run into performance issues for most scenarios. Even with this example the qualifier type is a
Re: [Wikidata-l] Expiration date for data
Hi Dario, two or three features are still missing to enable that (sorted in the order we are probably going to deploy them): * qualifiers * the time datatype * statement ranks As soon as they are available, this can be modeled in a way that it can be useful for projects accessing the data. So, progress, yes, but it's not there yet :) Cheers, Denny 2013/3/14 Dario Taraborelli dtarabore...@wikimedia.org Has there been any progress on time-based qualifiers since this thread? If so, can someone point me to relevant discussions/proposals? Thanks Dario On Oct 11, 2012, at 8:28 AM, Marco Fleckinger marco.fleckin...@gmail.com wrote: Hi, On 11.10.2012 16:12, Lydia Pintscher wrote: On Thu, Oct 11, 2012 at 11:13 AM, bene...@zedat.fu-berlin.de wrote: Is there something like VALID_FROM and VALID_TO in your Database? LB This is basically what the qualifiers do. http://meta.wikimedia.org/wiki/Wikidata/Notes/Data_model_primer has more details. Hm, sorry I didn't remember this. Thank you for the reminder! Marco ___ Wikidata-l mailing list Wikidata-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-l -- Project director Wikidata Wikimedia Deutschland e.V. | Obentrautstr. 72 | 10963 Berlin Tel. +49-30-219 158 26-0 | http://wikimedia.de Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e.V. Eingetragen im Vereinsregister des Amtsgerichts Berlin-Charlottenburg unter der Nummer 23855 B. Als gemeinnützig anerkannt durch das Finanzamt für Körperschaften I Berlin, Steuernummer 27/681/51985.
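To make the qualifier idea from this thread concrete: instead of deleting an outdated value (and losing it to revision history), a statement can keep its old value with start/end time qualifiers and a rank. The sketch below is a toy dict model only — the qualifier names and date strings are invented for illustration, not the real data model serialization:

```python
# Toy model of one statement plus its successor: the outdated value is kept,
# bounded by time qualifiers, rather than deleted (illustrative shape only).
statement = {
    "property": "head of government",
    "value": "Roberto Formigoni",
    "qualifiers": {"start time": "2005-04-28", "end time": "2013-03-15"},
    "rank": "normal",
}
successor = {
    "property": "head of government",
    "value": "Roberto Maroni",
    "qualifiers": {"start time": "2013-03-15"},
    "rank": "preferred",
}

def current_value(statements):
    """Pick the statement with no 'end time' qualifier, i.e. still valid."""
    for s in statements:
        if "end time" not in s["qualifiers"]:
            return s["value"]

print(current_value([statement, successor]))  # prints: Roberto Maroni
```

This is exactly the Lombardy example from Dario's mail: both heads of government survive in the data, and consumers choose the current one by qualifier (or by rank).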
Re: [Wikidata-l] question about Inclusion policy discussion
That is a tough question. We are pretty sure that we technically scale quite well, and there is no reason that the community should restrict itself for technical reasons. If the number of items suddenly increases by one or two orders of magnitude, we would probably meet a few hiccups on the way, but the architecture should be able to deal with that. What I am much more worried about, though, is the scaling of the community. One of my statements from my Wikidata talks is: we do not want to become the biggest data heap out there, but rather aim for an organic community that is strong and resilient enough to maintain the data that is being collected. See also Wikidata requirement #6 http://meta.wikimedia.org/wiki/Wikidata/Notes/Requirements (a page worth re-reading). Sometimes it might make sense for Wikidata to bridge and connect to external data sources that have their own way of maintenance and curation. Should the dataset really be merged into Wikidata? Is the data wikilike? Is it used in the Wikimedia projects? Or could it also be provided as a linked open dataset, which is referenced from Wikidata? Just to give an example: sure, one could theoretically start to collect temperature data of a city in hourly measurements*, but it could make more sense to point to an external site that collects this data in a more efficient format, provide the mapping identifiers, and allow for a bot to go there and discover the data. Wikidata in turn could provide an aggregation of the data, which indeed would be used on e.g. Wikipedia and Wikivoyage, but leave the full dataset on the external site. (Which, by the way, would also be a viable solution for datasets which have incompatible licenses.) I hope this makes sense, Cheers, Denny * Actually, this kind of data would probably kill us faster than creating many items, as it would make a single item ginormous. We do not scale that well in that direction. 
2013/3/14 Benjamin Good ben.mcgee.g...@gmail.com I've been struggling to understand what should go into wikidata and what should not. I see that this is because it hasn't been decided yet ;) http://www.wikidata.org/wiki/Wikidata_talk:Notability In helping the community to make this decision I think it would be really helpful for the developers to weigh in on the technical capacity of the envisioned/realized wikidata infrastructure. If we know how big the system could realistically be and continue to work well technically, it might help discussions about how much and what kind of content we should put into it. If the plan is to cope with only a few tens of millions of subjects that is quite different than if the plan allows for the potential creation of billions of items. (Suggesting less inclusive versus more inclusive policies). ? -Ben ___ Wikidata-l mailing list Wikidata-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-l -- Project director Wikidata Wikimedia Deutschland e.V. | Obentrautstr. 72 | 10963 Berlin Tel. +49-30-219 158 26-0 | http://wikimedia.de Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e.V. Eingetragen im Vereinsregister des Amtsgerichts Berlin-Charlottenburg unter der Nummer 23855 B. Als gemeinnützig anerkannt durch das Finanzamt für Körperschaften I Berlin, Steuernummer 27/681/51985. ___ Wikidata-l mailing list Wikidata-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-l
Re: [Wikidata-l] [Wikimedia-l] Are there plans for interactions between wikidata and wiktionaries ?
There are currently a number of things going on regarding the future of Wiktionary. There is, for example, the suggestion to adopt OmegaWiki, which could potentially complicate a Wikibase solution in the future (but then again, structured data is often rather easy to transform): http://meta.wikimedia.org/wiki/Requests_for_comment/Adopt_OmegaWiki There is this grant proposal for elaborating the future of Wiktionary, which I consider a potentially smarter first step: http://meta.wikimedia.org/wiki/Grants:IEG/Elaborate_Wikisource_strategic_vision There's this discussion on Wikidata itself: https://www.wikidata.org/wiki/Wikidata:Wiktionary And I know that Daniel K. is very interested in working in this direction. Personally, I regard Wiktionary as the third priority, following Wikipedia and Commons. A lot of the other projects -- like Wikivoyage or Wikisource -- can be served with only small changes to Wikidata as it is, but both Commons and Wiktionary would require a bit of thought (and here again, Commons much less than Wiktionary). I would appreciate a discussion with the Wiktionary communities, and also to make them more aware of the OmegaWiki proposal, the potential of Wikidata for Wiktionary, etc. Just to give a comparison: it took a few months to write the original Wikidata proposal, and it was up for discussion for several months before it was decided and acted upon. I would strongly advise to again choose slow and careful planning over hastened decisions. Cheers, Denny 2013/3/9 Mathieu Stumpf psychosl...@culture-libre.org Hello, First, congratulations for all the already achieved great work on the wikidata project. Now I would be interested to know more about future development, especially on interactions with wiktionaries. I think wikidata could help to improve wiktionaries drastically, by unifying not only interlang links, but also definitions and translations. 
More accurately, what I mean is that currently, attached to one wiki article, you usually have several definitions for each language where the word is used. But often when I seek a non-french word in the french wiktionary, looking at the native wiktionary will bring more definitions than what you can find on the french article. I saw that on the english wiktionary, the interface added a quick add feature, which asks the user to fill in a translation for each meaning. That's great and I wish it would be added in all chapters. And I think that we could add even more "hey, what about translating just this little thing" features across all dictionaries by centralizing entries, so that each word is associated with one or several meanings per language. Then all meanings could be redistributed to all wiktionaries, even when no translation is available for a given meaning in the local chapter. In this case we could have an information box that would say: this word has another meaning which hasn't yet been translated into ${local_language}; if you know one of the languages in which a translation is available, please help us to improve the wiktionary. What do you think about such a project, could it work with wikidata? kind regards, mathieu ___ Wikimedia-l mailing list wikimedi...@lists.wikimedia.org Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/wikimedia-l
Re: [Wikidata-l] One entity per page question
Yes, every Wikidata page is about one and exactly one entity. There cannot be two entities on one page. Bonnie and Clyde is one entity, designating the pair of people. Bonnie and Clyde each might also be one entity, and there could be relevant connections between the three entities Bonnie, Clyde, and Bonnie and Clyde. 2013/3/6 Yuri Astrakhan yuriastrak...@gmail.com Sk!d, of course if you ask for multiple items, you get multiple items. My question is the difference between the MediaWiki concept of a page, and the wikidata concept of an entity, specifically relating to items and properties (not queries). Are these concepts interchangeable? Is one MediaWiki page the same as one wikidata item, with a one-to-one mapping? On Wed, Mar 6, 2013 at 9:41 AM, swuensch swuen...@gmail.com wrote: It is wbgetentitie*s*; requests like https://www.wikidata.org/w/api.php?action=wbgetentities&ids=Q219937|Q42&format=jsonfm give you two entities, Q219937 and Q42. Sk!d On Wed, Mar 6, 2013 at 3:35 PM, Yuri Astrakhan yuriastrak...@gmail.com wrote: During an IRC discussion, I was told that a page in namespace 0 like Q219937 http://www.wikidata.org/wiki/Q219937 does not necessarily have a one-to-one relationship with an entity like Bonnie and Clyde. A wbgetentities (http://www.wikidata.org/w/api.php?action=wbgetentities&ids=Q219937&format=jsonfm) API call gives this: entities: { q219937: { pageid: 214789, ns: 0, title: Q219937, lastrevid: 7969610, modified: 2013-02-27T09:17:25Z, id: q219937, type: item, aliases: { .. How is it possible to have more than one entity in one wiki page titled Q219937, if the entity id is the same as the page title? In what cases would it be used? Is that a needed extra complexity? In the case of Bonnie and Clyde (one wikipage in language A vs two wikipages in B), wikidata can have three entities with links to static redirects, apparently solving the need of one-to-many. 
I am only considering item entities (ns:0), since query pages will obviously have more than one entity associated with them. Thanks! ___ Wikidata-l mailing list Wikidata-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-l -- *Severin Wünsch* ___ Wikidata-l mailing list Wikidata-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-l ___ Wikidata-l mailing list Wikidata-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-l -- Project director Wikidata Wikimedia Deutschland e.V. | Obentrautstr. 72 | 10963 Berlin Tel. +49-30-219 158 26-0 | http://wikimedia.de Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e.V. Eingetragen im Vereinsregister des Amtsgerichts Berlin-Charlottenburg unter der Nummer 23855 B. Als gemeinnützig anerkannt durch das Finanzamt für Körperschaften I Berlin, Steuernummer 27/681/51985. ___ Wikidata-l mailing list Wikidata-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-l
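For readers unfamiliar with the API shape quoted in this thread, here is a minimal sketch of reading a wbgetentities-style response and checking the one-page-one-entity invariant Denny describes. The JSON is abridged from the response quoted above; no network access is involved, and the check itself is just illustrative:

```python
import json

# A trimmed wbgetentities-style response (shape as quoted in the thread,
# values abridged). Each namespace-0 page maps to exactly one entity.
response = json.loads("""
{"entities": {"q219937": {"pageid": 214789, "ns": 0, "title": "Q219937",
              "id": "q219937", "type": "item"}}}
""")

for key, entity in response["entities"].items():
    # The entity id equals the page title (modulo case): one page, one entity.
    assert entity["id"].upper() == entity["title"].upper()
    print(entity["title"], "->", entity["type"])  # prints: Q219937 -> item
```

Query pages aside (as Yuri notes), this is why the page title and the entity id can be used interchangeably in practice.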
Re: [Wikidata-l] deployment on enwp delayed
We are still working on a postmortem. As of now, it seems there have been some serious memcached failures and some interplay with another software deployment. 2013/2/13 Jan Kučera kozuc...@gmail.com What were the issues in detail? 2013/2/12 Lydia Pintscher lydia.pintsc...@wikimedia.de On Tue, Feb 12, 2013 at 12:57 PM, Lydia Pintscher lydia.pintsc...@wikimedia.de wrote: We'll do another attempt later today. There were unfortunately too many other issues unrelated to Wikidata so we also had to call off this one. Sorry. Cheers Lydia -- Lydia Pintscher - http://about.me/lydia.pintscher Community Communications for Wikidata Wikimedia Deutschland e.V. Obentrautstr. 72 10963 Berlin www.wikimedia.de Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e. V. Eingetragen im Vereinsregister des Amtsgerichts Berlin-Charlottenburg unter der Nummer 23855 Nz. Als gemeinnützig anerkannt durch das Finanzamt für Körperschaften I Berlin, Steuernummer 27/681/51985.
Re: [Wikidata-l] phase 1 live on the English Wikipedia
You have examples of that? Did not happen to my edits (so far). 2013/2/13 Denny Vrandečić denny.vrande...@wikimedia.de Block them until they behave? 2013/2/13 Katie Chan k...@ktchan.info On 13/02/2013 21:01, Lydia Pintscher wrote: Heya :) Third time's a charm, right? We're live on the English Wikipedia with phase 1 now \o/ Details are in this blog post: http://blog.wikimedia.de/2013/02/13/wikidata-live-on-the-english-wikipedia An FAQ is being worked on at http://meta.wikimedia.org/wiki/Wikidata/Deployment_Questions Thanks everyone who helped! :) Now if only those interwiki bots would stop adding links back... KTC -- Experience is a good school but the fees are high. - Heinrich Heine ___ Wikidata-l mailing list Wikidata-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-l
Re: [Wikidata-l] Complaint about the partial Phase II deployment
Sven, thank you for your honest opinion, and I know that you are not alone with it - but I also heard a lot of people express excitement and joy about the deployment, and based on the activity it seems that a lot of people like it. We consider ourselves happy to be part of an intelligent and critical, and at the same time sympathetic community. I fully agree that the first deployment of Phase II functionality was early. A lot of features are missing, as we have repeatedly communicated. Was the deployment too early? That I disagree with. It is widely accepted wisdom for software projects to release early, release often. I think that is very valid advice, and I decided to follow it with the project. This allows us to see if some of our basic assumptions work. This allows us to test a few things before we fully commit to them and spend big amounts of effort without any reality check. The project plan as a whole was planned like this. Basically language links are just a test-run for some of the technology we will need in order to implement Phase II. Many back-end features -- the propagation of data from a central repository to the Wikipedias, the way recent changes deals with this, the scalability of some of our assumptions -- are equivalent in Phase I and II. Phase I was always implemented with Phase II in mind. We are doing the same thing now. We implement some features -- namely statements per se, references for statements -- but with a limited set of data sets and with some major limitations. But these will get expanded over time. Compare it with another project like the Visual Editor. It is deployed to the English Wikipedia. Plenty of features are still missing. But only with their current deployment schedule can the VE team gather crucial data for their further development. The main difference is that VE is an opt-in feature -- statements in Wikidata are not, they are there, in your face. I regard a project like Wikidata not as a software development project. 
It is a growing, living socio-technical system, and in this case actually it is one embedded in an even bigger such system, the Wikimedia movement as a whole. We are developing technical features that we think will lead all of us towards our common goals, and then we watch how the communities adapt to them, which social rules they build on them, which technical developments of their own are built on top of ours. We (as the development team) are part of this ecosystem, and we (as all of us Wikimedians) are growing together. Technical possibilities shape the rules the Wikidata editor community agrees on, and the actual usage of the system and your feedback shapes and prioritizes the future technical development that we plan and undergo as the development team. I also see that some decisions of the community are based on the currently available features, but i do not think that this is problematic -- because I am very confident that future new features will continue to shape new rules and that the existing ones will be revisited and updated accordingly. The timing of the deployment of phase II to wikidata.org has nothing to do with the deployment of phase I to the English Wikipedia, which is currently scheduled for Monday. We simply deployed features when they are deemed ready. We do not plan features ahead with the intention to keep interest high, or in order to win editors from other communities, etc. Also, we regard phase II only as sufficiently finished when it is actually deployed to the Wikipedias. And this, obviously, still requires a much better support for references and a bigger number of data types. Also, so far there is no reason to believe that there will be any major problems revealed once we deploy to the English Wikipedia. I regard your feedback as very important, and I am thankful for it. I understand that everyone would like to have all the features immediately. But I disagree with you on the point that we should not have deployed last Monday. 
We were working very hard in order to be able to deploy on Monday, and not wait even longer, and we are very proud with how smoothly it went - fully conscious of the limitations of the current state. The situation will soon improve, and we would like to stay a project for now that deploys new features in a comparably quick succession. After this explanation I hope that I have the support of the Wikidata community to continue in this spirit. Cheers, Denny 2013/2/8 Sven Manguard svenmangu...@gmail.com Hello there. I have been an active and vocal supporter of Wikidata since almost the day it went live, and after giving Phase II a legitimate chance, I have to say that in my opinion the decision to deploy Phase II with only a small number of the expected features has been a massive mistake. Yes, I understand that the project was losing momentum and that several people commented that they felt that there was nothing to do on the project before Phase II hit, however the partial
Re: [Wikidata-l] Bug Report for Q17 Japan - Cannot remove the extra language
I assume you mean the inability to remove the second Yen. This is... interesting. We have right now no idea what is going on. Investigating. Thank you for reporting, Denny 2013/2/6 Napoleon Tan napoleon@gmail.com I think the wikidata Item entry for Japan is corrupted. I cannot remove the extra language no matter what. I think the file is somehow corrupted for this page. http://www.wikidata.org/wiki/Q17 ___ Wikidata-l mailing list Wikidata-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-l
Re: [Wikidata-l] first parts of phase 2 live on wikidata.org now
It is a feature. The reason for that is that in the Wikipedias, when you access the data, you can't use the property names otherwise - they have to be unique. So in order to be able to write {{#property:capital}}, the capital property needs to be unique (and a {{#property:p25}} we considered to be unacceptable usability-wise). That is why we decided that property labels need to be unique (per language). If there are better ways to solve that problem, we are all ears. Cheers, Denny 2013/2/6 Dennis Tobar dennis.to...@gmail.com Hi: As we know, two properties may have the same name. In Spanish we use género for two topics: gender (P21) and genus (P74). The first is related to sex and the second to a taxonomic category. So, if we call both género, the site doesn't allow it (Edit not allowed: Otra propiedad (21) ya tiene la etiqueta género asociada con el código de idioma es - i.e., another property (21) already has the label género associated with the language code es). Is this a bug or a feature? Regards On Wed, Feb 6, 2013 at 5:47 AM, Mathieu Stumpf psychosl...@culture-libre.org wrote: On 2013-02-05 15:58, Lydia Pintscher wrote: On Tue, Feb 5, 2013 at 3:26 PM, Nicholas Humfrey nicholas.humf...@bbc.co.uk wrote: This is fantastic :) you are making amazingly fast progress! Thank you! I have been trying to assign the 'is a' property to David Cameron: http://www.wikidata.org/wiki/Q192 And make him a Politician: http://www.wikidata.org/wiki/Q82955 But it doesn't seem to let me select 'Politician' in the value field. How is the list of allowed values defined? I just set this. This is possible. There is no list of allowed values. All existing items are allowed if it is a property of type item. About relation names, 'is a' is vague, isn't it? I mean, Mr. Cameron may have political activities today, and do something else tomorrow, as he may have done something else before. 
So wouldn't it be interesting to give more accurate information, like he has been UK prime minister since 11 May 2010 (and adding information on the end date of phenomena when possible, which is not the case here)? And then you may put prime minister in a political role category. Now it all depends on the granularity wikidata is aiming for. Cheers Mathieu -- Association Culture-Libre http://www.culture-libre.org/ ___ Wikidata-l mailing list Wikidata-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-l -- Dennis Tobar Calderón Ingeniero en Informática UTEM
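The per-language label-uniqueness rule Denny describes above can be illustrated with a toy registry. This is a sketch of the constraint only, not the actual Wikibase implementation, and the class and method names are invented:

```python
# Toy illustration of the rule "property labels must be unique per language":
# assigning a label already held by a different property must be rejected.
class LabelRegistry:
    def __init__(self):
        self._labels = {}  # (language, label) -> property id

    def set_label(self, pid, language, label):
        key = (language, label)
        # Re-setting a property's own label is fine; claiming another
        # property's label in the same language is not.
        if self._labels.get(key, pid) != pid:
            raise ValueError(
                f"property {self._labels[key]} already has label {label!r} "
                f"in language {language!r}")
        self._labels[key] = pid

reg = LabelRegistry()
reg.set_label("P21", "es", "género")      # gender: accepted
try:
    reg.set_label("P74", "es", "género")  # genus: rejected, label taken
except ValueError as e:
    print("rejected:", e)
```

This is precisely the "Edit not allowed" error Dennis ran into; the uniqueness makes `{{#property:capital}}`-style lookups by label unambiguous.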
Re: [Wikidata-l] Reification in Wikidata serialisation
We are reluctant, but open, to renaming it. But not to Fact. Statement has the nice ambiguous quality regarding its correctness which Fact lacks. On the other hand, the similarity to rdf:Statement is not merely syntactic, so I do not see too much of an issue here. 2013/2/1 Nicholas Humfrey nicholas.humf...@bbc.co.uk Hello, My colleague Yves Raimond and myself were just having a quick chat about the Wikidata RDF serialisation plans. http://meta.wikimedia.org/wiki/Wikidata/Development/RDF While the reification makes sense, we thought that it looked a bit too much like rdf:Statement. w:Berlin s:Population Berlin:Statement1 . Berlin:Statement1 rdf:type o:Statement . Perhaps you could rename o:Statement to o:Fact instead? nick. - http://www.bbc.co.uk This e-mail (and any attachments) is confidential and may contain personal views which are not the views of the BBC unless specifically stated. If you have received it in error, please delete it from your system. Do not use, copy or disclose the information in any way nor act in reliance on it and notify the sender immediately. Please note that the BBC monitors e-mails sent or received. Further communication will signify your consent to this. - ___ Wikidata-l mailing list Wikidata-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-l -- Project director Wikidata Wikimedia Deutschland e.V. | Obentrautstr. 72 | 10963 Berlin Tel. +49-30-219 158 26-0 | http://wikimedia.de Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e.V. Eingetragen im Vereinsregister des Amtsgerichts Berlin-Charlottenburg unter der Nummer 23855 B. Als gemeinnützig anerkannt durch das Finanzamt für Körperschaften I Berlin, Steuernummer 27/681/51985. ___ Wikidata-l mailing list Wikidata-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-l
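For readers who want to see the pattern under discussion spelled out: the reified statement from the RDF plan can be sketched as below. Only the first two triples appear in the thread; the `o:value` triple and the helper function are my own additions for illustration, and the prefixes are shortened as in Nicholas's example.

```python
# Sketch of the reified-statement pattern from the thread, emitted as
# N-Triples-style lines (prefixes abbreviated; illustrative only).
def reify(subject, prop, statement_id, value):
    """Return the triples linking a subject to a value via a Statement node."""
    return [
        f"{subject} {prop} {statement_id} .",          # as in the thread
        f"{statement_id} rdf:type o:Statement .",      # as in the thread
        f"{statement_id} o:value {value} .",           # assumed for illustration
    ]

for line in reify("w:Berlin", "s:Population", "Berlin:Statement1", '"3500000"'):
    print(line)
```

The indirection through the Statement node is what lets references, qualifiers, and ranks attach to the claim itself, which is why the similarity to rdf:Statement is, as Denny says, more than syntactic.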
Re: [Wikidata-l] wikidata stats (was: getting some stats for the Hungarian Wikipedia)
No, not by design. The design would be to have http://en.wikidata.org/wiki/Breck be an alias for http://www.wikidata.org/wiki/Q123803 but we didn't have yet the time to set this up properly. If anyone knows Apache well and has some time on their hands, please ping me on IRC. Right now, they are all identical, but that's not as planned. Only the one without the language code is supposed to be canonical. Re swuensch: it does not have to do with the viewerlanguage. Re first mail: is there not a plain wd as well (for www.wikidata?) or a www.wd? Should check myself :) Cheers, Denny 2013/2/1 Ed Summers e...@pobox.com Yes, I'm just noticing that there are, for example: http://de.wikidata.org/wiki/Q123803 http://en.wikidata.org/wiki/Q123803 http://www.wikidata.org/wiki/Q123803 Which are identical. Is this by design? //Ed On Fri, Feb 1, 2013 at 5:15 AM, swuensch swuen...@gmail.com wrote: maybe the statistic is splitted up by the viewerlanguage. On Fri, Feb 1, 2013 at 11:11 AM, Ed Summers e...@pobox.com wrote: Diederek van Liere over on the analytics list [1] let me know that webstatscollector was updated to start collecting wikidata stats as of Feb 1st UTC 0:00. Yay. I took a look, and I was a bit confused by the language prefixes, for example: de.wd Wikidata:Hauptseite 14 369565 de.wd Wikidata:Introduction/de 1 48357 de.wd Wikidata:Labels_and_descriptions_task_force/de 1 39804 de.wd Wikidata:Translation_administrators/ca 1 11902 dsb.wd Q54919 1 12281 dsb.wd Special:SetLabel/q54919/en 1 dsb.wd Special:WhatLinksHere/Q54919 1 9607 el.wd File:Wikidata_item_creation_progress.png 1 13545 el.wd Wikidata:Community_portal 1 11816 en.wd Q123801 1 10801 en.wd Q123803 1 10908 en.wd Q123843 1 10872 en.wd Q124027 1 11312 en.wd Q124345 1 11156 en.wd Q14217 1 11485 Does that make sense to anyone? I thought there was just www.wikidata.org? If it doesn't make sense let me know and I will follow up with Diederek. 
//Ed [1] http://lists.wikimedia.org/pipermail/analytics/2013-February/000388.html ___ Wikidata-l mailing list Wikidata-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-l -- Severin Wünsch ___ Wikidata-l mailing list Wikidata-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-l ___ Wikidata-l mailing list Wikidata-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-l -- Project director Wikidata Wikimedia Deutschland e.V. | Obentrautstr. 72 | 10963 Berlin Tel. +49-30-219 158 26-0 | http://wikimedia.de Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e.V. Eingetragen im Vereinsregister des Amtsgerichts Berlin-Charlottenburg unter der Nummer 23855 B. Als gemeinnützig anerkannt durch das Finanzamt für Körperschaften I Berlin, Steuernummer 27/681/51985. ___ Wikidata-l mailing list Wikidata-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-l
[Wikidata-l] Fwd: Todos for RDF export
2013/1/25 Daniel Kinzler daniel.kinz...@wikimedia.de Hi! I thought about the RDF export a bit, and I think we should break this up into several steps for better tracking. Here is what I think needs to be done: Daniel, I am answering on Wikidata-l, and adding Tpt (since he started working on something similar), hoping to get more input on the open list. I especially hope that Markus and maybe Jeroen can provide insight from the experience with Semantic MediaWiki. Just to reiterate internally: in my opinion we should learn from the experience that SMW made here, but we should not immediately try to create common code for this case. First step should be to create something that works for Wikibase, and then analyze if we can refactor some code on both Wikibase and SMW and then have a common library that both build on. This will give us two running systems that can be tested against while refactoring. But starting the other way around -- designing a common library, developing it for both Wikibase and SMW, while keeping SMW's constraints in mind -- will be much more expensive in terms of resources. I guess we agree on the end result -- share as much code as possible. But please let us not *start* with that goal, but rather aim first at the goal Get an RDF export for Wikidata. (This is especially true because of the fact that Wikibase is basically reified all the way through, something SMW does not have to deal with). In Semantic MediaWiki, the relevant parts of the code are (if I get it right):
SMWSemanticData is roughly what we call Wikibase::Entity
includes/export/SMW_ExportController.php - SMWExportController - main object responsible for creating serializations. Used for configuration; it then calls the SMWExporter on the relevant data (which it collects itself) and applies the defined SMWSerializer on the returned SMWExpData. 
includes/export/SMW_Exporter.php - SMWExporter - takes an SMWSemanticData object and returns an SMWExpData object, which is optimized for export
includes/export/SMW_Exp_Data.php - SMWExpData - holds the data needed for export
includes/export/SMW_Exp_Element.php - several classes used to represent the data in SMWExpData. Note that there is some interesting interplay happening with DataItems and DataValues here.
includes/export/SMW_Serializer.php - SMWSerializer - abstract class for different serializers
includes/export/SMW_Serializer_RDFXML.php - SMWRDFXMLSerializer - responsible for creating the RDF/XML serialization
includes/export/SMW_Serializer_Turtle.php - SMWTurtleSerializer - responsible for creating the Turtle serialization
special/URIResolver/SMW_SpecialURIResolver.php - SMWURIResolver - special page that deals with content negotiation
special/Export/SMW_SpecialOWLExport.php - SMWSpecialOWLExport - special page that serializes a single item
maintenance/SMW_dumpRDF.php - calls the serialization code to create a dump of the whole wiki, or of certain entity types; basically configures an SMWExportController and lets it do its job
There are some smart ideas in the way the ExportController and Exporter are called by both the dump script and the single-item serializer, which allow it to scale to almost any size. Remember that unlike SMW, Wikibase contains mostly reified knowledge. Here is the spec of how to translate the internal Wikibase representation to RDF: http://meta.wikimedia.org/wiki/Wikidata/Development/RDF The other major influence is obviously the MediaWiki API, with its (almost) clean separation of results and serialization formats. While we can also draw inspiration here, the issue is that RDF is a graph-based model and the MediaWiki API is really built for a tree. Therefore I am afraid that we cannot reuse much here.
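The controller/exporter/serializer split in the file list above can be sketched as a generic pattern. This is a minimal illustration only: the class and method names are invented stand-ins, not SMW's actual API, and the "entity" is just a plain dict.

```python
# Minimal sketch of the ExportController -> Exporter -> Serializer split
# described above. All names are illustrative stand-ins, not SMW's real API.

class Exporter:
    """Turns an internal entity (here: a plain dict) into triples."""
    def export(self, entity):
        subject = entity["id"]
        return [(subject, prop, value)
                for prop, value in entity.items() if prop != "id"]

class TurtleSerializer:
    """One of several interchangeable serializers."""
    def serialize(self, triples):
        return "\n".join(f'{s} {p} "{o}" .' for s, p, o in triples)

class ExportController:
    """Collects entities, runs the exporter, applies the serializer."""
    def __init__(self, exporter, serializer):
        self.exporter = exporter
        self.serializer = serializer

    def run(self, entities):
        return "\n".join(self.serializer.serialize(self.exporter.export(e))
                         for e in entities)

controller = ExportController(Exporter(), TurtleSerializer())
output = controller.run([{"id": "wd:Q1", "rdfs:label": "universe"}])
```

The point of the split is that the same controller can drive a full dump or a single-item export simply by swapping the serializer or the entity source.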
Note that this does not mean that the API cannot be used to access the data about entities, but merely that the API answers with tree-based objects, most prominently the JSON objects described here: http://meta.wikimedia.org/wiki/Wikidata/Data_model/JSON So, after this lengthy prelude, let's get to the todos that Daniel suggests: * A low-level serializer for RDF triples, with namespace support. Would be nice if it had support for different forms of output (XML, N3, etc.). I suppose we can just use an existing one, but it needs to be found and tried. Re reuse: the thing is that, to the best of my knowledge, PHP RDF packages are quite heavyweight (because they also contain parsers, not just serializers, and often enough SPARQL processors, support for blank nodes, etc.), and it is rare that they support the kind of high-throughput streaming that we would require for the complete dump (i.e. there is obviously no point in first putting all triples into a graph model and then calling the model's serialize() method; this needs too much memory). There are also some optimizations that we can apply (reordering of triples, use of namespaces, some assumptions about the whole dump, etc.). I will ask the Semantic Web
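The streaming concern raised here (never materializing the whole graph before serializing) can be illustrated with a small sketch. The prefix table and the entity generator are made-up examples; the point is only that each triple is written as soon as it is produced:

```python
# Illustrative sketch only: emit each triple as it is produced instead of
# building a full in-memory graph first, so memory use stays constant in
# the number of triples. Prefixes and data are invented examples.
import io

PREFIXES = {
    "wd": "http://www.wikidata.org/entity/",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
}

def write_turtle(out, triples):
    """Stream Turtle: prefix declarations once, then one line per triple."""
    for prefix, uri in PREFIXES.items():
        out.write(f"@prefix {prefix}: <{uri}> .\n")
    count = 0
    for s, p, o in triples:            # any iterable, e.g. a generator
        out.write(f"{s} {p} {o} .\n")  # written immediately, not buffered
        count += 1
    return count

def entity_triples():
    # Stand-in for iterating over a whole dump without loading it at once.
    yield ("wd:Q64", "rdfs:label", '"Berlin"@en')
    yield ("wd:Q64", "rdfs:label", '"Berlin"@de')

buffer = io.StringIO()
triple_count = write_turtle(buffer, entity_triples())
```

Because `triples` can be a generator over the whole database, the same function serves both the single-item special page and the full-dump script.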
Re: [Wikidata-l] Coordinate datatype -- update
Exactly. This is about the backend and the API. The user will rather use a widget, maybe similar to this one: http://localhost/~denny_WMDE/valueparser/time.html 2013/1/17 Luca Martinelli martinellil...@gmail.com 2013/1/17 Denny Vrandečić denny.vrande...@wikimedia.de: Based on the feedback so far I have frozen the datatype for time [1] and updated the datatype for coordinates. ***CUT*** Sorry if my question appears silly, but I'll take the risk. I assume this deals with how the system recognizes the data we put in, and not with how the user puts the data into the system, am I right? In other words, will I/we be forced to use THIS way of inserting data, or will we put it in the way we know/can and then the system will recalculate it this way? -- Luca Sannita Martinelli http://it.wikipedia.org/wiki/Utente:Sannita
Re: [Wikidata-l] Coordinate datatype -- update
I heard this URL might be better than the previous on most setups: http://simia.net/valueparser/time.html 2013/1/17 Denny Vrandečić denny.vrande...@wikimedia.de Exactly. This is about the backend and the API. The user will rather use a Widget maybe similar to this one: http://localhost/~denny_WMDE/valueparser/time.html
[Wikidata-l] Is the inclusion syntax powerful enough?
Hi all, We did yet another version of the inclusion syntax (admittedly, the last one is a few months old). We decided to simplify the syntax, and its capabilities, considerably, and to rely on Lua for anything more complex than what will be possible with that syntax. I am expecting that a lot of people will not like this idea. Therefore I will start two threads: first this one, where we discuss whether the inclusion syntax is actually sufficient or whether it lacks absolutely essential features that should not depend on Lua. http://meta.wikimedia.org/wiki/Wikidata/Notes/Inclusion_syntax_v0.3 And a second thread where we discuss the details and merits of the proposal as it is, and whether there are issues with it. Cheers, Denny P.S.: yes, I learned that we should not have too many topics per thread starter. Let's see if this gets any feedback at all :)
Re: [Wikidata-l] Update to time and space model
2013/1/8 Gregor Hagedorn g.m.haged...@gmail.com ON COORDINATES: a) What you describe is more specific than a geolocation (which may be expressed by means other than coordinates). I suggest giving the data type the more specific name geocoordinates. Yep, agreed. Or just coordinates. b) With respect to precision: I don't understand the reasoning for sticking to degrees. Since we are describing locations on an ellipsoid, the longitude-to-distance and latitude-to-distance conversions are different, and they differ for different points on Earth. See the example on en.wikipedia: a minute at the equator is 1843 versus 1855 m. The model defines it as using the arc distance on the given equator. In practice the potential location error will be given as a distance. You want to convert it to degrees in a highly complex conversion. Why? The back conversion will usually be unambiguous (since the back conversion will always describe an ellipse rather than a circle). In practice the value will be given as 44°15'. Then we know it is given by the minute - and not that it is given by a nautical mile. I am not making a highly complex conversion -- I am just looking at the number and saying oh yeah, this seems to be given by the minute, and not by the second or by the degree. The reason why I prefer degrees on a given equator to meters is that it makes more sense on varying globes, like the Earth, Moon, Sun, Jupiter, and Phobos. What we need is the possibility to understand that 44°15' should not be displayed as 44°15'00.001 the next time the value is displayed. Saying it is precise to the minute allows us to do so. Making the statement in meters would actually require us to make that complex calculation, based on the given geodetic system -- which is much more complicated than the current suggestion.
c) Furthermore, as before, I believe that precision and accuracy will usually both contribute to the error you are interested in, which is typically described in geolocations by a +/- addition. I suggest replacing precision with errorradius or uncertaintyradius or uncertaintyInMeters, which would be the great-circle distance. To somewhat simplify, the unit could be fixed to m. I think precision is actually what I mean here for geocoordinates: with how much precision is the coordinate given? How many 0s after the dot need to be written? Is the minute specified or not? Is the second specified? This can be used for transforming from one geodetic system to the other, or, simpler, from degrees/minutes/seconds to decimal degrees. But then again, I don't mind calling it uncertainty or uncertaintyRadius. Here is some work done in our area (biodiversity): http://code.google.com/p/darwincore/wiki/Location The term there is http://terms.gbif.org/wiki/dwc:coordinateUncertaintyInMeters Yep, pretty much what I meant, just that I am suggesting not to use meters but something that is easier to translate into degrees. d) The correct name for globe is geodetic datum or geodetic system (which is more than the globe). See http://en.wikipedia.org/wiki/Geodetic_system or http://terms.gbif.org/wiki/dwc:geodeticDatum. WGS 84 (as a Wikidata item) is a valid geodetic datum or system. Both terms are equally correct. Globe is not correct. OK. Gregor
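Denny's "just looking at the number" heuristic for deciding whether a coordinate was given to the degree, minute, or second can be sketched as follows. The tolerance and the candidate steps are illustrative choices, not the actual Wikibase rules:

```python
# Rough sketch of the precision-guessing heuristic discussed above:
# check whether a decimal coordinate is a whole number of degrees,
# minutes, or seconds. Threshold and fallback are illustrative choices.

def guess_precision(degrees: float, eps: float = 1e-6) -> float:
    """Return the guessed precision in degrees: 1, 1/60, or 1/3600."""
    for step in (1.0, 1 / 60, 1 / 3600):  # coarsest candidate first
        units = degrees / step
        if abs(units - round(units)) < eps:
            return step
    return 1 / 3600  # finer than a second: fall back to the finest step

# 44°15' == 44.25°, a whole number of minutes but not of degrees:
minute = guess_precision(44.25)
degree = guess_precision(44.0)
```

Storing the guessed step alongside the value is what lets 44°15' be redisplayed as 44°15' rather than 44°15'00.001.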
Re: [Wikidata-l] Update to time and space model
Thanks for the pointer, Katie. I had meant to look into Max's work for a while, but failed. Now I did, and asked him many questions :) So the biggest difference is that Max uses dim to represent what we mean here by precision. And dim is related to precision in a globe-dependent way (which is fine) -- or, put differently, dim is what Gregor would prefer here, since it is measured in meters. Otherwise it looks pretty much the same. I would still prefer an arc degree of the equator of the given globe over meters, as it allows measuring any globe without knowing too many details about the globe. But otherwise it seems like the same thing. (And they can be transformed from one to the other using a simple factor.) 2013/1/8 Katie Filbert katie.filb...@wikimedia.de On Tue, Jan 8, 2013 at 1:54 PM, Nikola Smolenski smole...@eunet.rs wrote: On 08/01/13 12:36, Denny Vrandečić wrote: Location: https://meta.wikimedia.org/wiki/Wikidata/Development/Representing_values#Geolocation I'm not sure if we should be going that far, but there may be cases where longitude and latitude are known with different degrees of accuracy, so multiple precisions might be needed. I think it's worth taking a look at what MaxSem has done with the GeoData extension, which is used for mobile apps, etc.: https://www.mediawiki.org/wiki/Extension:GeoData GeoData uses globe, as that's consistent with how coordinate templates are done now on Wikipedia. I think starting simple and consistent with GeoData and the coordinate templates is good. If no globe parameter is specified in the coordinate template, then Earth is assumed (and lat/long -- WGS84). For the Moon, selenographic coordinates are assumed, and there are other reference globes for other planets and moons. http://en.wikipedia.org/wiki/Selenographic_coordinate Perhaps things can get more complex later, and having WGS84 coordinates wouldn't interfere with that.
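The "simple factor" mentioned above (converting between a precision in arc degrees and a distance in meters along a given globe's equator) can be sketched like this. The radii are approximate example values, not authoritative constants:

```python
# Illustrative sketch of converting between a precision expressed as arc
# degrees on a globe's equator and one expressed in meters. The radii
# below are approximate example values.
import math

EQUATORIAL_RADIUS_M = {
    "earth": 6_378_137.0,  # approx. WGS 84 equatorial radius
    "moon": 1_738_100.0,
}

def degrees_to_meters(degrees: float, globe: str = "earth") -> float:
    """Arc length on the globe's equator subtended by `degrees`."""
    return math.radians(degrees) * EQUATORIAL_RADIUS_M[globe]

def meters_to_degrees(meters: float, globe: str = "earth") -> float:
    """Inverse conversion: equatorial arc length back to degrees."""
    return math.degrees(meters / EQUATORIAL_RADIUS_M[globe])

# One arc minute on Earth's equator is roughly a nautical mile (~1855 m):
one_minute_m = degrees_to_meters(1 / 60)
```

The same degree value maps to a very different distance on the Moon, which is the point of keeping the stored precision in degrees rather than meters.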
Cheers, Katie -- Katie Filbert Wikidata Developer Wikimedia Germany e.V. | NEW: Obentrautstr. 72 | 10963 Berlin Phone (030) 219 158 26-0 http://wikimedia.de