Coming back to the question of P's and Q's (sorry, it's been a busy few
weeks).
I read people saying "Don't worry because prefixes", but with respect I
don't agree.
IMO "Don't worry because prefixes" may make sense as a response if one
interacts with Wikidata primarily via RDF dumps, or SPARQL, or perhaps
writing system code -- environments where those prefixes may be
generally present and used.
But for anyone actually working first-hand with the data, whose work
involves any substantial checking and/or manual editing of data through
the wikibase user interface, I think it fails to ring true. Extensive
hand-editing in this way tends to be an unavoidable aspect when curating
a dataset in wikibase -- eg investigating anomalies revealed by query
reports, perhaps after a large data upload or data matching procedure,
and then identifying and making appropriate edits to resolve them.
For people making a lot of hand-edits like that (a process which, as I
have said, I think is inevitable when actively curating datasets),
certain property identifiers are encountered and used so often that
they become deeply ingrained, essentially second nature -- eg P18 for
image, P373 for commonscat, P131 for located in administrative
territorial entity, etc, the precise properties depending on the kind
of data and items one is working with. The same goes for many item
identifiers, eg Q5 for human.
If one's doing a lot of editing and looking-up through the interface,
these identifications become very familiar -- as internalised,
unconscious and automatic as breathing.
So I do think that reusing the same identifiers for quite different
meanings in a different wikibase (but with essentially exactly the same
editing interface) is to create a cognitive dissonance which (IMO) is
significant, unnecessary, unfortunate, and (I believe) ought to be
avoidable.
A second issue is Daniel's scenarios 2 to 4, where external repos want
to use and reference some or all of Wikidata's items and properties,
with the same identifiers as Wikidata, plus some additional properties
and items of their own, defined locally.
That's not straightforward, if they all have to be placed in the same
shared numerical sequences following the same restricted set of initial
letters.
I do take the point that it is useful to be able to use the initial
letter to distinguish different kinds of Wikibase object -- ie
Properties (P), Items (Q), Lexemes (L), MediaInfo items (M).
One solution might be to allow Wikibase instances to use additional
characters in the identifier for the local properties, items etc
specific to that Wikibase -- so that the Wikibase could have
property identifiers like Px50 or Pz50 or Pm50 to distinguish them from
Wikidata's P50, or identifiers like Qx5000 or Qz5000 or Qosm5000 to
distinguish them from Wikidata's Q5000.
This would straightforwardly allow Wikidata and local items and
properties to exist side by side, and avoid confusion and dissonance
with internalised learnt identifier codes from the items and properties
on Wikidata itself.
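To make the idea a bit more concrete, here is a rough sketch of how
software might tell the two kinds of identifier apart. It is only an
illustration: the names (ID_PATTERN, EntityId, parse_entity_id) are
made up for this email, and the rule that the extra characters are
lowercase letters placed between the type letter and the number is my
assumption based on the examples above, not anything Wikibase
implements today.

```python
import re
from typing import NamedTuple

# Hypothetical pattern: one type letter (P, Q, L or M), an optional
# run of lowercase letters marking a local Wikibase (assumed form),
# then the numeric part.
ID_PATTERN = re.compile(r"^([PQLM])([a-z]*)([1-9][0-9]*)$")


class EntityId(NamedTuple):
    kind: str         # "P", "Q", "L" or "M"
    local_infix: str  # "" for a plain Wikidata-style identifier
    number: int

    @property
    def is_wikidata(self) -> bool:
        # Assumption: identifiers without extra characters are Wikidata's.
        return self.local_infix == ""


def parse_entity_id(text: str) -> EntityId:
    """Split an identifier such as 'P50' or 'Qosm5000' into its parts."""
    match = ID_PATTERN.match(text)
    if match is None:
        raise ValueError(f"not a recognised identifier: {text!r}")
    kind, infix, number = match.groups()
    return EntityId(kind, infix, int(number))
```

With something like this, parse_entity_id("P50") and
parse_entity_id("Px50") both come out as properties numbered 50, but
only the first is treated as Wikidata's, so the two can sit side by
side without clashing.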
Best regards,
James.