Hey Scott. Thanks for the detailed reply. See my responses inline below.
On Monday, June 20, 2011 at 4:15 PM, M. Scott Marshall wrote: 
> Hi Chime,
> 
> The main reason is that when semantics and natural language are
> inserted into identifiers, some identifiers are doomed to become stale
> as thinking evolves or changes about the semantic representation. 
Ok. But that seems like throwing the baby out with the bath water. I understand 
that inserting meaning into identifiers can open the door for a disconnect 
between the (possibly evolving) meaning of a term and what the natural language 
sense of the identifier suggests is the meaning. However, it seems much less 
draconian to (instead) promulgate best practices that emphasize that the 
meaning of the term comes from the definitions in the ontology rather than the 
natural language sense of the identifier. After all, the web architecture axiom 
that URIs should be opaque is a best practice about not deriving meaning *from* 
them rather than a mandate that they are truly opaque, and it addresses the 
same issue. 

The use of ontology annotations (puns?) is also well suited to this kind of 
issue (where some meta-information is needed about the terms). Consider SKOS, 
for example. In addition (and this is perhaps a bit off topic), 
ontologies should strive to ensure a large proportion of the terms in them have 
full definitions (or at least, non-empty definitions - like so many in the 
biomedical domain do), so users of the ontology do not feel compelled to 
discern the meaning of the terms from their name. 

My main issue with this approach is the tremendous impact on readability of 
ontology content. As an example of the impact on readability, consider the 
rendering of terms via a well-engineered ontology syntax such as the Manchester 
OWL syntax. I have been repeatedly amazed at how readable 
automatically generated Manchester OWL syntax can be, and the restriction we 
are discussing almost rules out such a quick win. 

> Or when a new 'name brand' is created for that namespace: I think that
> the best example of this was provided by Jonathan Rees for Shared
> Names - ever heard of 'locuslink' identifiers? I believe that Entrez
> Gene occupies the name branding of that space now. This is precisely
> the sort of problem that Shared Names would like to avoid by serving
> (non-ontological) identifiers from a 'neutral namespace'. 
But how is this different from the general problem of identifiers changing over 
time? And can't you use semantic identifiers in 'neutral namespaces' such as 
purl.org?
> In
> ontologies, the same principle applies (I see that Helena has supplied
> a good example).
> 
> I agree with Mark about proper tooling - the tools should
> automatically display labels. 
It seems to me that this only shifts the complexity (and burden) brought about 
by the mandate onto tools that already have their work cut out for them in 
making interactions with semantic web content user friendly and intuitive. If 
there were broad community agreement on which properties provide authoritative 
human-readable labels, then such tools could easily show these labels. 
Vocabularies such as SKOS have gone a long way in this direction. However, the 
decision to rule out the use of semantic identifiers seems to make this 
capability a prerequisite to using them in a user-friendly way.
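
To make concrete what such a tool would have to do, here is a minimal sketch 
in Python. It assumes a hypothetical preference order (skos:prefLabel first, 
then rdfs:label), which is exactly the kind of agreement that does not yet 
exist community-wide. The example.org URIs, the label text, and the function 
name display_label are all made up for illustration.

```python
SKOS_PREF_LABEL = "http://www.w3.org/2004/02/skos/core#prefLabel"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"

# Assumed preference order; a real tool would need a community-agreed list.
LABEL_PROPERTIES = [SKOS_PREF_LABEL, RDFS_LABEL]


def display_label(term, triples, label_properties=LABEL_PROPERTIES):
    """Return the first label found for `term`, or the URI itself if none exists."""
    for prop in label_properties:
        for s, p, o in triples:
            if s == term and p == prop:
                return o
    # No label available: the user is left looking at the opaque identifier.
    return term


# Made-up data: one opaque identifier that carries a label.
triples = [
    ("http://example.org/onto/T0000123", RDFS_LABEL, "myocardial infarction"),
]

print(display_label("http://example.org/onto/T0000123", triples))
print(display_label("http://example.org/onto/T0000456", triples))
```

The second call falls back to the bare URI, which is the situation a user ends 
up in whenever a term has an opaque identifier and no label in a property the 
tool knows to look for.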
> It's true that I don't know of a SPARQL
> editor that does this to a satisfying degree yet, (except for one:
> SPARQL Assist Language-Neutral Query Composer from McCarty et al,
> shown at SWAT4LS in Berlin :) See Mark's post.) but that is not a
> reason to create identifiers and your knowledge representation in a
> way that won't stand the test of time.
I think this is part of the larger 'ontology evolution problem', and requiring 
semantic-free identifiers not only fails to solve the general problem but 
also introduces difficulty in the one area where the SW cannot afford any more 
challenges: accessibility and user friendliness.
> Shouldn't we consider RDF to be the bytecode of knowledge? 
> 

Yes, but that shouldn't require that we limit the form of the identifiers. 
Besides, it already behaves in that way since reasoners already treat 
identifiers opaquely. The problem being addressed only applies to humans, which 
is why I'm inclined to think that what is needed is best practices. 
> Although I
> understand the difficulty of dealing with non-human readable
> identifiers in SPARQL and RDF, I believe that we are now looking at
> bytecode and complaining that it isn't human readable.
You can think of it that way, but consider the earlier comment about 
readability of computer programs. Regardless of whether or not they all end up 
as machine code, readability *is* still a distinguishing factor between 
computer programming languages.
> 
> 

>  It's true that,
> until the tools are available, it is difficult to write SPARQL
> queries. But if we applied the same logic to gene accession numbers,
> where would we be now? 
I certainly agree that there are domains where the use of human readable 
identifiers exacerbates the ontology evolution problem (and gene data is the 
prime example), but I do not think this is completely representative of all 
uses of human readable identifiers, and thus I think the approach of educating 
developers and consumers about the nuances that make one scenario more of an 
issue than another is much more appropriate.
> 
> 

General observation: This seems like another neat vs. scruffy thread, and there 
seem to be many of these playing out in the various semantic web communities at 
this time: httpRange-14 vs. ontology-determined meaning of resources, 
dereferenceability of RDF URIs, etc.


-- 
Chime Ogbuji