I'm hoping for a bit of advice and, rather than talk in the usual generic
terms, I'll use the actual example I'm working on.
I want to define the best way to record a person's sex (this is related
to the W3C GLD WG's forthcoming spec on describing a Person [1]). To
encourage interoperability, we want people to use a controlled
vocabulary and there are several that cover this topic.
ISO 5218 has:
0 = not known;
1 = male;
2 = female;
9 = not applicable.
and Eurostat offers
F = female
M = male
OTH = other
UNK = unknown
NAP = not applicable
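To show why the vocabulary has to be stated alongside the code, here is a
rough Python sketch. It only restates the two code lists above; the
dictionary keys ("iso5218", "eurostat-scl-sex") and the function name are
labels made up for illustration, not official identifiers.

```python
# Code lists copied from the two vocabularies quoted above.
# The dictionary keys are made-up labels, not official identifiers.
CODE_LISTS = {
    "iso5218": {
        "0": "not known",
        "1": "male",
        "2": "female",
        "9": "not applicable",
    },
    "eurostat-scl-sex": {
        "F": "female",
        "M": "male",
        "OTH": "other",
        "UNK": "unknown",
        "NAP": "not applicable",
    },
}


def decode(vocabulary: str, code: str) -> str:
    """Return the label for a code, given the vocabulary it comes from."""
    try:
        return CODE_LISTS[vocabulary][code]
    except KeyError:
        raise ValueError(f"{code!r} is not a code in {vocabulary!r}") from None


print(decode("iso5218", "1"))
print(decode("eurostat-scl-sex", "F"))
```

A bare "1" decodes under one list and is rejected under the other, so a
consumer cannot interpret the value without knowing which list was used.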
IMO, the spec should not dictate which one to use (there are others too
of course). What I *do* want to do though is to encourage publishers to
state which vocabulary they're using. Sounds like a job for a datatype -
and for that you need a URI for the vocabulary. Something like:
schema:gender "1"^^<http://iso.org/5218/> .
Except I made that iso.org URI up. The actual URI for it is
http://www.iso.org/iso/iso_catalogue/catalogue_tc/catalogue_detail.htm?csnumber=36266
(or rather, that's the page about the spec but that's a side issue for
now).
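For concreteness, this is roughly what a publisher would emit. A minimal
Python sketch that writes one N-Triples line with a datatyped literal; the
subject URI is hypothetical, the datatype URI is the made-up one from the
example above, and the function name is invented for illustration.

```python
def typed_literal_triple(subject: str, predicate: str,
                         value: str, datatype: str) -> str:
    """Serialise one triple with a datatyped literal as an N-Triples line."""
    # Escape backslashes and double quotes; enough for short code values.
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'<{subject}> <{predicate}> "{escaped}"^^<{datatype}> .'


print(typed_literal_triple(
    "http://example.org/person/1",  # hypothetical subject
    "http://schema.org/gender",     # schema:gender written out in full
    "1",
    "http://iso.org/5218/",         # made-up datatype URI, as noted above
))
```

Whatever identifier ends up naming the vocabulary would simply take the
place of the last argument.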
That URI is just horrible and certainly not a 'cool URI'. The Eurostat
one is no better.
Does the datatype URI have to resolve to anything? (In theory no, but in
practice?) Would a URN be appropriate?
Given that the identifier for the ISO standard is "ISO/IEC 5218:2004"
how about urn:iso/iec:5218:2004?
For Eurostat, the internal identifier for the vocabulary is "SCL - Sex"
(standard code list) so would urn:eurostat:scl:sex be appropriate?
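One thing that can be checked mechanically is whether candidate URNs like
these are syntactically plausible. A minimal Python sketch, assuming the
namespace identifier rule from RFC 8141 (2 to 32 letters, digits or
hyphens, beginning and ending with a letter or digit). The function name
is made up, and the check says nothing about whether a namespace is
actually registered.

```python
import re

# Namespace identifier (NID) shape per RFC 8141: 2-32 characters drawn from
# letters, digits and hyphen, starting and ending with a letter or digit.
_NID = re.compile(r"[A-Za-z0-9][A-Za-z0-9-]{0,30}[A-Za-z0-9]")


def urn_nid_is_well_formed(urn: str) -> bool:
    """Check the 'urn:' prefix, the NID shape and a non-empty remainder."""
    parts = urn.split(":", 2)
    if len(parts) != 3 or parts[0].lower() != "urn" or not parts[2]:
        return False
    return _NID.fullmatch(parts[1]) is not None


# The ISO candidate, using the year from the standard's identifier.
print(urn_nid_is_well_formed("urn:iso/iec:5218:2004"))
print(urn_nid_is_well_formed("urn:eurostat:scl:sex"))
```

The slash in "iso/iec" falls outside the characters allowed in a namespace
identifier, so the first candidate fails this check, while the Eurostat
one passes on syntax alone.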
Has anyone done anything like this in the real world?
All advice gratefully received.
Thank you
Phil.
[1] https://dvcs.w3.org/hg/gld/raw-file/default/people/index.html
--
Phil Archer
W3C eGovernment
http://www.w3.org/egov/
http://philarcher.org
@philarcher1