Hi,
On 24/08/13 19:45, Jeroen De Dauw wrote:
> Hey,
>
>> The situation with commonsMedia is a bit bad because it should be a
>> URL rather than a string. What I do in wda is effectively a type
>> conversion from string to URI in this particular case. Maybe we can
>> fix this somehow in the future when URIs are supported as a value
>> datatype.
>
> Ok, this makes me somewhat concerned. We do have an IriValue DV [0],
> which we've had for nearly a year. It is indeed not used for
> commonsMedia, not sure why. What concerns me is that we are now
> introducing a "url" data type, which will also just use the string DV,
> rather than the IRI DV. I'm not very happy with this, though it is what
> most of the team wants. If there is a problem with this approach, it
> should be outlined _soon_, since this is something not far from
> deployment if I understand it correctly.

If we have an IRI DV, considering that URLs are special IRIs, it seems
clear that IRI would be the best way of storing them. For any Web-based
format (esp. OWL and RDF), there is a big difference between "some
arbitrary string" and an IRI. Similarly, many tools that use data will
naturally treat URLs in a different way than other strings when
displaying them to users. If this difference is not captured in the
data, then applications have to look it up, use some kind of hard-coded
handling for certain properties, or apply heuristics to decide which
strings are supposed to be URLs. Using IRI DVs would solve the problem
in a cleaner way with less effort.
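
To make the difference concrete, here is a minimal Python sketch of what a
data consumer has to do in each case. The dictionary shapes and the type
names "string" and "iri" are assumptions made up for illustration; they are
not the actual Wikibase serialization, and looks_like_url is a hypothetical
helper.

```python
from urllib.parse import urlparse


def looks_like_url(text):
    """Heuristic guess: a hypothetical helper a consumer might write."""
    parts = urlparse(text)
    return parts.scheme in ("http", "https", "ftp") and bool(parts.netloc)


def is_link_heuristic(datavalue):
    """Consumer side when every value is stored as a plain string."""
    return datavalue["type"] == "string" and looks_like_url(datavalue["value"])


def is_link_typed(datavalue):
    """Consumer side when the data itself says the value is an IRI."""
    return datavalue["type"] == "iri"


# Illustrative values only (assumed shapes, not real Wikidata content).
url_as_string = {"type": "string", "value": "http://example.org/page"}
url_as_iri = {"type": "iri", "value": "http://example.org/page"}
mail_as_string = {"type": "string", "value": "mailto:someone@example.org"}
mail_as_iri = {"type": "iri", "value": "mailto:someone@example.org"}

for dv in (url_as_string, url_as_iri, mail_as_string, mail_as_iri):
    print(dv["type"], dv["value"], is_link_heuristic(dv), is_link_typed(dv))
```

In this sketch the heuristic misses the mailto: value because it has no
network location, and it would equally accept any free-text string that
happens to look like a URL. The typed check needs no guessing at all, which
is the point about capturing the difference in the data.
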
Of course, you could just use "string" for all types of datavalue
without losing datavalue information. However, this would make the
Wikidata data model inadequate for some important uses. The exported RDF
will fix this in a sense, so people using this will get the important
information from there. However, RDF has other problems that make it
difficult to use as a primary data dump format (esp. heavy
normalisation), and it is not available from Wikibase yet. Therefore, I
think it would be problematic if the Wikidata data model is simplified
to such an extent that practically important information is no longer
easy to get for external users.
I appreciate that there might be split opinions about this among the
developers (who see the immediate technical consequences, esp. for their
piece of work). However, this decision has important long-term
consequences beyond current engineering aspects. Luckily, Wikidata has a
recognized expert in Web data technologies as its technical director ;-)
-- the team should trust his judgement here.
Cheers,
Markus