Re: [Wikidata-l] Complete Datamodel in WON

2012-09-17 Thread Jeroen De Dauw
Hey,

As explained in the text, the aliases are not distinguished from other
 property values in the data model right now. This was the status of the
 discussion when we last talked about this, but we can also re-introduce
 aliases as a special field (I see why this would be useful). Daniel had an
 argument against this, saying that many other property values could also
 work as aliases in certain domains (e.g. binomial names of biological
 species). So the special status of the alias in the data model was
 questioned.


Right, that makes sense to implement at some point if there really is
demand for this. This is rather harder to implement than what we're
currently doing and is blocked by phase 2 stuff and probably phase 3 stuff,
while we want to have aliases in phase 1 already.

A while back we also had a related discussion where Daniel took the
position that we should also not have special labels and descriptions. The
conclusion of that was that we will have them but that we will make them
accessible via the same interface as regular properties (at least for read
ops).

 if two items have the same description, can one of them use an alias that
is the title of the other?

Good question. Right now this is not enforced. Then again, right now
aliases are not used anywhere for lookups except in the fulltext search
thing, where this restriction is not really relevant. Denny, Daniel, any
thoughts on this?

 This is also based on a preliminary decision made a while back: the idea
was that properties, while not having Wikipedia articles, will still need
unique string identifiers that can be used in wikitext (e.g. queries) where
one does not want to address properties by ID or by label+description
pairs.

This seems odd to me - you sure the term TitleRecord is being used
consistently through the data model and this thread? I'm using it as
GlobalSiteId PageName.

I do agree you would probably not want to put label and description in
wikitext, and that just the label might or might not be sufficient, even if
they are unique per language. If you need an id that really is always
unique you can just use the p12345 thing. Since most of the editing of
these will happen via GUIs (right?) this seems to be quite acceptable. Or
does anybody see a better approach? In any case, why would you resort to
GlobalSiteId PageName rather than label description? What makes it so
odd is that the GlobalSiteId PageName is meant to indicate equivalence of
items across sites, which is rather different from using it to identify
properties in wikitext.

 It seems that a property could at best have a list of PropertyValueSnaks
(no auxiliary Snaks, no references, no statement rank).

Why not have a list of claims?

Cheers

--
Jeroen De Dauw
http://www.bn2vs.com
Don't panic. Don't be evil.
--
___
Wikidata-l mailing list
Wikidata-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-l


Re: [Wikidata-l] Complete Datamodel in WON

2012-09-17 Thread Denny Vrandečić
2012/9/17 Jeroen De Dauw jeroended...@gmail.com:
 if two items have the same description, can one of them use an alias that
 is the title of the other?

 Good question. Right now this is not enforced. Then again, right now aliases
 are not used anywhere for lookups except in the fulltext search thing, where
 this restriction is not really relevant. Denny, Daniel, any thoughts on
 this?

It technically can happen that, of two items with the same description
but different labels, one has an alias equal to the label of the other.

If the item selection widget did not display the label, this could
lead to confusion.

I always assumed that if an alias is used for lookup, the canonical
label would still be displayed (probably in addition to the alias in
that case). Thus the confusion cannot happen.

So no constraints on the aliases will be introduced for now. They are
not needed.
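
The lookup behaviour assumed above could be sketched roughly as follows. This is only an illustration, not the actual Wikibase code: the `Item` class, the `lookup` and `render` helpers, and the `q1`/`q2` IDs are all hypothetical names made up for this example.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Item:
    # Hypothetical, simplified item: one label and description, plus aliases.
    item_id: str
    label: str
    description: str
    aliases: List[str] = field(default_factory=list)


# A hit is (item ID, canonical label, matched alias or None).
Hit = Tuple[str, str, Optional[str]]


def lookup(items: List[Item], term: str) -> List[Hit]:
    """Find items whose label or one of whose aliases equals the term."""
    needle = term.casefold()
    hits: List[Hit] = []
    for item in items:
        if item.label.casefold() == needle:
            hits.append((item.item_id, item.label, None))
            continue
        for alias in item.aliases:
            if alias.casefold() == needle:
                hits.append((item.item_id, item.label, alias))
                break
    return hits


def render(hit: Hit) -> str:
    """Always show the canonical label; mention the alias only if it matched."""
    _item_id, label, alias = hit
    return label if alias is None else f"{label} (alias: {alias})"
```

Because every hit carries the canonical label, two items with the same description stay distinguishable in the selection widget even when an alias of one equals the label of the other.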

Cheers,
Denny



Re: [Wikidata-l] Complete Datamodel in WON

2012-09-17 Thread Denny Vrandečić
Re: keys for properties

For now the following solution seems to be the simplest:

* make labels for properties be unique for a given language

In that case they can be used as keys. Every wiki has one (and exactly
one) site language. If the label is unique, a property can be
addressed by languagecode + label, and the languagecode could be used
by default inside a wiki. No extra keys per site would be needed,
which would have otherwise been provided by the sitelink data (which
would be more appropriately named sitekey instead of sitelinks in that
case). The description is not used to identify properties by the
machine.

This means that two different properties cannot have the same name in
one language. We will need to figure out if this is a problem during
usage, and if it is, change it later.
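
To make the constraint concrete, here is a minimal sketch of a registry that keeps property labels unique per language and resolves languagecode + label to a property. The class name `PropertyRegistry`, its methods, and the example IDs are assumptions for illustration only, not the real implementation.

```python
from typing import Dict, Optional, Tuple


class PropertyRegistry:
    """Hypothetical store that enforces one property per (language, label)."""

    def __init__(self) -> None:
        # (language code, label) -> property ID
        self._by_key: Dict[Tuple[str, str], str] = {}

    def set_label(self, property_id: str, language: str, label: str) -> None:
        key = (language, label)
        owner = self._by_key.get(key)
        if owner is not None and owner != property_id:
            raise ValueError(
                f"label {label!r} is already used by {owner} in {language!r}"
            )
        # A property has at most one label per language: drop the old one.
        for old_key, pid in list(self._by_key.items()):
            if pid == property_id and old_key[0] == language:
                del self._by_key[old_key]
        self._by_key[key] = property_id

    def resolve(self, label: str, language: str) -> Optional[str]:
        """Return the property ID for languagecode + label, or None."""
        return self._by_key.get((language, label))
```

Inside a wiki, the caller would pass that wiki's site language as `language`, so wikitext would only need to mention the label.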

I hope this makes sense,
Denny


2012/9/17 Markus Krötzsch mar...@semantic-mediawiki.org:
 On 17/09/12 12:54, Jeroen De Dauw wrote:

 Hey,

 As explained in the text, the aliases are not distinguished from
 other property values in the data model right now. This was the
 status of the discussion when we last talked about this, but we can
 also re-introduce aliases as a special field (I see why this would
 be useful). Daniel had an argument against this, saying that many
 other property values could also work as aliases in certain domains
 (e.g. binomial names of biological species). So the special status
 of the alias in the data model was questioned.


 Right, that makes sense to implement at some point if there really is
 demand for this. This is rather harder to implement than what we're
 currently doing and is blocked by phase 2 stuff and probably phase 3
 stuff, while we want to have it in phase 1 already.

 A while back we also had a related discussion where Daniel took the
 position that we should also not have special labels and descriptions.
 The conclusion of that was that we will have them but that we will make
 them accessible via the same interface as regular properties (at least
 for read ops).


 Ok, I agree with that. I will change the model to have explicit aliases
 somewhere.



   if two items have the same description, can one of them use an alias
 that is the title of the other?

 Good question. Right now this is not enforced. Then again, right now
 aliases are not used anywhere for lookups except in the fulltext search
 thing, where this restriction is not really relevant. Denny, Daniel, any
 thoughts on this?

   This is also based on a preliminary decision made a while back: the
 idea was that properties, while not having Wikipedia articles, will
 still need unique string identifiers that can be used in wikitext (e.g.
 queries) where one does not want to address properties by ID or by
 label+description pairs.

 This seems odd to me - you sure the term TitleRecord is being used
 consistently through the data model and this thread? I'm using it as
 GlobalSiteId PageName.


 Yes, this is what I mean. But PageName is just a string, and does not need
 to refer to an actual page (or be displayed as a link). It can still be used
 as a string key to refer to the property on a certain site.



 I do agree you would probably not want to put label and description in
 wikitext, and that just the label might or might not be sufficient, even
 if they are unique per language. If you need an id that really is always
 unique you can just use the p12345 thing. Since most of the editing of
 these will happen via GUIs (right?) this seems to be quite acceptable.
 Or does anybody see a better approach?


 Well, the above. It allows you to assign a human-readable key to each
 property that you can use instead of p12345 and that is still unique for
 each site. Moreover, this can be done with code that is similar to what we
 already have for site links in Items (but without linking and thus also
 without auto completion).


 In any case, why would you resort
 to GlobalSiteId PageName rather than label description?


 Because it is easier. First of all, label description is not enough: you
 need to say which language you talk about to make it a key (this can be
 guessed from the site, but this is still not a unique selection). Second,
 you do not need to mention the GlobalSiteId if you are on a site and want to
 use its own ID. So one addressing method requires one string key (PageName),
 the other requires three string keys (language, label, description). The
 former seems easier.
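
The site-key addressing argued for here could look roughly like the sketch below, where a (GlobalSiteId, PageName) pair maps to a property and the site can be left out when addressing from the local wiki. `SiteKeyIndex`, its methods, and the site IDs `enwiki`/`dewiki` are hypothetical names chosen for the example.

```python
from typing import Dict, Optional, Tuple


class SiteKeyIndex:
    """Hypothetical index from (GlobalSiteId, PageName) to a property ID."""

    def __init__(self, local_site: str) -> None:
        self.local_site = local_site
        self._index: Dict[Tuple[str, str], str] = {}

    def assign(self, site: str, page_name: str, property_id: str) -> None:
        key = (site, page_name)
        # A key must identify exactly one property on a given site.
        if self._index.get(key, property_id) != property_id:
            raise ValueError(f"{key} already refers to {self._index[key]}")
        self._index[key] = property_id

    def resolve(self, page_name: str, site: Optional[str] = None) -> Optional[str]:
        """Resolve a key; the site defaults to the wiki we are on."""
        return self._index.get((site or self.local_site, page_name))
```

With this shape, wikitext on the local wiki needs a single string (the PageName), while the label-based alternative needs language, label and description together.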


 What makes
 it so odd is that the GlobalSiteId PageName is meant to indicate
 equivalence of items across sites, which is rather different from using
 it to identify properties in wikitext.


 What you are saying (equivalence across sites) is only another way to say
 that GlobalSiteId PageName is a key for entities on Wikidata. Such keys
 can always be used to define equivalence classes (of keys that refer to the
 same thing); how is that a problem?



   It seems that a property could at best have a 

Re: [Wikidata-l] Entities, statements and claims

2012-09-17 Thread Denny Vrandečić
That sounds very good to me.
+1 (or am I allowed to say +2 ? ;)

2012/9/16 Jeroen De Dauw jeroended...@gmail.com:
 Hey,

 There is some disagreement regarding the interface to access statements in
 various entities.

 == Current implementation ==

 This is not fully implemented yet, but it's what the interface implies will
 be done.

 Entities contain a list of statements. In other words, all items, properties
 and queries can have statements.

 Obvious objection: statements for properties and queries should not have
 associated rank and references. They could just have a list of claims.

 Original reason to go with this approach anyway: the alternative is to put
 statement handling in Item and add claim handling in either both Property
 and Query or in a common base. This would result in duplication and loss of
 common interface for the Entities.

 == New proposal ==

 I think we can accommodate all the concerns listed above as follows:

 * All Entities provide access to a list of claims.
 * Properties and Queries contain a list of claims.
 * Items contain a list of statements to which they provide access both as
 list of statements and list of claims. The apparent list of claims would
 correspond to a filter on rank=primary followed by a map from statement to
 its claim on the list of statements.
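
A rough sketch of how the proposal above might look in code, assuming simplified `Claim` and `Statement` types. All class and method names here are illustrative, and the rank value "secondary" used in the usage example is an assumption; only "primary" appears in the proposal.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Claim:
    property_id: str
    value: str


@dataclass
class Statement:
    # A statement wraps a claim and adds rank and references.
    claim: Claim
    rank: str = "primary"
    references: List[str] = field(default_factory=list)


class Entity:
    """Common interface: every entity provides access to a list of claims."""

    def get_claims(self) -> List[Claim]:
        raise NotImplementedError


class Property(Entity):
    """Properties (and, analogously, queries) store plain claims."""

    def __init__(self, claims: List[Claim]) -> None:
        self._claims = list(claims)

    def get_claims(self) -> List[Claim]:
        return list(self._claims)


class Item(Entity):
    """Items store statements and derive their claims from them."""

    def __init__(self, statements: List[Statement]) -> None:
        self._statements = list(statements)

    def get_statements(self) -> List[Statement]:
        return list(self._statements)

    def get_claims(self) -> List[Claim]:
        # Filter on rank=primary, then map each statement to its claim.
        return [s.claim for s in self._statements if s.rank == "primary"]
```

For example, an `Item` holding one primary and one "secondary" statement would expose two statements but only one claim, while a `Property` exposes its claims directly.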

 Any objections to implementing it like that?

 Cheers





-- 
Project director Wikidata
Wikimedia Deutschland e.V. | Obentrautstr. 72 | 10963 Berlin
Tel. +49-30-219 158 26-0 | http://wikimedia.de

Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e.V.
Registered in the register of associations of the Amtsgericht
Berlin-Charlottenburg under number 23855 B. Recognized as a charitable
organization by the Finanzamt für Körperschaften I Berlin, tax number
27/681/51985.
