martin wrote:
Dear friends,

I attach a study done by a student in my group about what identifiers for
CIDOC CRM instances in RDF could look like. Before publishing it, your feedback
would be much appreciated, in particular if you have knowledge of related work
and other approaches.

So this is about CRM instances in XML (not RDF). If we were
dealing with RDF, then the first step towards disambiguation
would have to address the CRM itself by defining a namespace
URI for the RDF schema definition.

Namespace prefixes can also be useful in XML and I wonder why
this mechanism is not even mentioned in the report. The
examination of cataloguing rules remains vague and does not touch
the more relevant parts such as the UNIMARC 8xx fields which
are defined with concrete recommendations for naming the
sources of authority records.

Prefixing the contents of CRM property elements with names of
authority documents has been proposed earlier. A major
disadvantage of this approach is that XML parsers cannot separate
prefixes from actual instance names without the help of external
string-fiddling routines. The proposal in this paper would make
the situation even worse by introducing further sub-element
divisions that remain outside the scope of XML syntax.
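
To illustrate with a made-up sketch (the colon separator is my
own placeholder, not necessarily the syntax used in the report),
given

  <P72F_has_language>iso639-1:EL</P72F_has_language>

an XML parser delivers "iso639-1:EL" as a single text node, and
splitting it into authority and value is left to application
code.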

Subdividing XML elements using separator characters is generally
a bad idea. After all, why do we have XML elements and attributes?
Hierarchies of identifiers can be expressed as nested or
sequential XML elements, making every single identifier component
accessible to an XML parser. Moreover, since authority references
can vary between levels of the hierarchy, using attributes for
source references would give us a clean data model that would just
have to be added to the DTD. Using an example from the report
(p.19), we could get something like

  <P72F_has_language>
     <crm-instance class="Language">
        <val authority="iso639-1">EL</val>
     </crm-instance>
     <P3F_has_note>medieval Greek</P3F_has_note>
  </P72F_has_language>

Exchanging <crm-instance> for <crm:instance> using XML
namespaces would give us a simple extension mechanism, e.g. for
instantiating subclasses derived from CRM classes such as
<myns:instance class="HorselessCarriage">...</myns:instance>
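
As a rough sketch of how the prefixes might be bound (both
namespace URIs are placeholders I made up for illustration,
not published CRM or project namespaces):

  <P72F_has_language
        xmlns:crm="http://example.org/crm#"
        xmlns:myns="http://example.org/myns#">
     <crm:instance class="Language">
        <val authority="iso639-1">EL</val>
     </crm:instance>
  </P72F_has_language>

A namespace-aware parser would then resolve crm:instance and
myns:instance to different qualified names, so local
extensions cannot be confused with the CRM's own elements.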

Multi-level disambiguation could be done in various ways
such as nesting or sequencing the disambiguating elements.
Finding the most suitable representation with respect to
processing (e.g. through XSLT) would be a worthwhile area
of study.
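
Just to sketch the two options (the <level> element, the
"my-gazetteer" authority and the place values are invented
for illustration and do not come from the report), a nested
form might look like

  <crm-instance class="Place">
     <level>
        <val authority="iso3166-1">GR</val>
        <level>
           <val authority="my-gazetteer">Heraklion</val>
        </level>
     </level>
  </crm-instance>

and a sequential form like

  <crm-instance class="Place">
     <val authority="iso3166-1">GR</val>
     <val authority="my-gazetteer">Heraklion</val>
  </crm-instance>

In the sequential form, an XPath expression such as
val[@authority='iso3166-1'] would select a component directly,
whereas the nested form makes the hierarchy explicit at the
cost of deeper paths.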

Best regards,
Detlev
