Re: [Crm-sig] Issue: Solution for Dualism of E41 Appellation and rdfs:label

Martin Doerr Thu, 13 Sep 2018 22:57:51 +0300

Dear Richard,

On 12/09/2018 14:55, Martin Doerr wrote:
Dear Richard,
I basically agree with your comments. Specifically, however, I indeed wanted to say that the official definition of rdfs:label makes it exactly a subproperty of P1 (or a shortcut of it) in any correct use of RDFS. If we want to mix RDFS models, we should have an opinion about their compatibility. Otherwise, we would have to regard them as alternatives that cannot be compared with the CRM.
OK: noted. My concern is simply that we should not include assertions which mean that 'CRM RDF' fails to play nicely with other RDF frameworks. I would welcome the thoughts of others on this issue.
I am not happy with adding rdfs:label to instances of Appellation, because this would mean it is a name for a name and not the name. I would sympathize with George using rdf:value, if it had the respective semantics.
Yes, we're in full agreement on this.
What we need, in my opinion, is a property of Symbolic Object, which we may call "has symbolic content" or "has symbolic content inline" or anything better, which defines that the symbolic content *is identical to* the Literal, *abstracted* to the "level of symbolic specificity" that the Literal implies and that conforms to the identity condition of the Symbolic Object, i.e., characters of a certain script, or whatever. That would make the meaning of the "value" unambiguous.
Again, I'm in complete agreement with this line of thought. One decision we should make is whether this property forms part of the generic CRM framework, or if it is to be an implementation-specific property which only appears in our RDF implementation of the CRM. My instinct is for it to go into the CRM proper: the treatment of Symbolic Object and its subclasses would, I think, be made clearer by the addition of this property.

For CRM proper!

It's worth bearing in mind that RDF strings have a built-in mechanism for specifying the language of the string. This would allow us to express, for example, a place name in multiple languages by simply having one 'has symbolic content' property per language, each with an associated string.
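
A minimal sketch of the per-language idea, assuming triples are modelled as plain Python tuples and a language-tagged literal as a (text, language) pair. The identifiers ex:place1_name and crm:has_symbolic_content are hypothetical; the property has only been proposed in this thread and has no agreed name.

```python
# Hypothetical property name; the thread only proposes "has symbolic content".
HAS_CONTENT = "crm:has_symbolic_content"

# Triples as (subject, predicate, object); a language-tagged literal
# is modelled as a (text, language_tag) tuple.
triples = [
    ("ex:place1_name", "rdf:type", "crm:E41_Appellation"),
    ("ex:place1_name", HAS_CONTENT, ("Wien", "de")),
    ("ex:place1_name", HAS_CONTENT, ("Vienna", "en")),
    ("ex:place1_name", HAS_CONTENT, ("Vienne", "fr")),
]


def content_in(subject, lang):
    """Return the symbolic-content strings of `subject` tagged with `lang`."""
    return [
        obj[0]
        for s, p, obj in triples
        if s == subject
        and p == HAS_CONTENT
        and isinstance(obj, tuple)
        and obj[1] == lang
    ]


print(content_in("ex:place1_name", "en"))
```

With this shape, one Appellation node carries one string per language, and a consumer selects by language tag rather than by a separate node per language.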
We may need to add another property, such as "is contained in" or so, pointing to a URL actually holding an instance of its content, again abstracted to the "level of symbolic specificity" that the file instance implies and that conforms to the identity condition of the Symbolic Object.
I think that we would benefit from some use cases which demonstrate the practical need for this property. My own instinct is that if we are really just recording a string value, then it is overkill to assign it a URL and put it somewhere

I made a jump here. This is for things like a (standardized) text of Aristotle in an MS Word document and in an .html file. If I mean the text attributed to Aristotle, I obviously do not mean the typeface in MS Word to belong to Aristotle's text, nor the html layout instructions. This means that both contain precisely the same text, but are themselves different, because they are richer in information, being modern renderings. All three, the standardized text of Aristotle, the MS Word representation and the html representation, are different Symbolic Objects, but one is contained in the other two.
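
A rough sketch of the containment idea, assuming "abstracting to the level of symbolic specificity" can be approximated by discarding markup and keeping only the character sequence. The sample sentence and the helper names are illustrative only, and real documents would need far more care (whitespace, encodings, entities).

```python
from html.parser import HTMLParser


class TextOnly(HTMLParser):
    """Collect character data, ignoring all tags (layout instructions)."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def symbolic_content(html):
    """Abstract an HTML rendering to its bare character sequence."""
    parser = TextOnly()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts)


# Illustrative sample: a plain text and a richer HTML rendering of it.
plain_text = "All men by nature desire to know."
html_rendering = "<p><b>All men</b> by nature desire to know.</p>"

# The HTML file is a different Symbolic Object, but it contains the plain text.
print(symbolic_content(html_rendering) == plain_text)
```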

else. If it's more than just a string value, in what way is it more? Is it an instance of some other class, which we should be defining (or have already identified)?
My suggestion is that we define the "has symbolic content" property, and then put our energy into agreeing on one or more subproperties of rdf:value which meet the known recording requirements for cultural heritage information. By doing this, I suggest that we will have solved the main problem which confronts implementors who want to express CRM in RDF.

Yep, subproperty of rdf:value is not bad.

Whereas the shortcut interpretation is attractive, it is not exactly the same. Using a shortcut, we say that the intermediate node is of different, independent nature from the terminal node. Here, we do not say "Appellation" is related to something called "Literal". We say "this Appellation IS itself what is in this Literal". That may or may not be a reason to reject this interpretation.
True. At least two respondents in this conversation have said that they prefer the fully-worked-out paths. Let's sort out an initial strategy for RDF based on the current CRM; then we can form a view as to whether further shortcuts are still required.
We also have to distinguish Appellations and other Symbolic Objects which have multiple symbolic forms, i.e. spelling variants, versions etc., from those *being one* symbolic form. rdf:value has no means to express that. I believe we need yet another property, "has symbolic content variant". In that case, the URI is necessary, in my opinion.
There may be a need for such a property; an analogy would be in SKOS, which has skos:prefLabel (one per language) and skos:altLabel. However, I wonder if there is value in being able to express, in an open world situation, that one symbolic form is the "right" one and the others are variants. I would welcome some concrete examples to inform our discussion.

Well, I did not mean that there is a "right" form: "Martin", "Martinus", "Martijn", "Marty"... If you go back in history, there is often no standard for one language either.

From your explanations, I am getting a mental picture of an Appellation which has been the subject of much study, where you want to record, in a condensed way, all the possible forms which that Appellation might take. For example, the sort of entry you might find in an encyclopaedia or a biographical authority. I think that a more typical scenario might be where the 'same' name (e.g. the name of a known individual) occurs in a number of sources, but varies between them.
Also, I don't see how introducing a URL helps with this problem. If you have an Appellation node in your graph, there are various statements which you can make about it.

Sure. It does not make sense for E41. Names are small enough to keep them in a Literal. Other Symbolic Objects may not be.

If instead you invent a URL to represent that Appellation, you are in exactly the same situation as before, in terms of the statements you can make. In fact, you have taken one step backwards, because you now have to begin by declaring explicitly that this node represents an Appellation: <myURL> rdf:type crm:E41_Appellation.

I think the polymorphism we describe here, well studied in object-oriented languages, is in the nature of Appellations. The problem for me is that the respective KR models have NOT THOUGHT of the case that such polymorphisms can occur. Nevertheless, RDFS is tolerant enough to accept the superproperty statement, but not to create a class which is either a URI or an *inline expanded* object.
This polymorphism occurs EXCLUSIVELY for Symbolic Objects with symbol sets a certain machine supports. This is another reason not to use rdf:value, because it does not give credit to the fact that only Symbolic Objects can have such a "value".
I'm afraid you have lost me here. It would be very helpful to me (and might encourage others to join in the conversation) if you could post one or two concrete examples of what you mean.

OK, in simple words: there are names which have an identity based on a certain sequence of characters. There are others, historically interesting, which have a phonetic identity, and even that may vary. We collaborate with historians who deal with family names in the Aegean area around 1800, which have no standard spelling at all, not even a preferred one. The different spelling variants later evolved into distinct family names. But in order to match instances in the documents, we need both concepts of identity.

Even my ancestors used "Derr" instead of "Dörr". Since the local dialect does not distinguish "e" and "ö", it is unclear whether it is a spelling variant of the same phonetics or whether the "ö" is an etymological misinterpretation, because "Dörr" has a linguistic meaning and the "e" in "Derr" may have another semantic root, but this is not widely accepted.

So, the names that are not identical to a Literal must be represented using a URI. That is what I mean by polymorphism. Also, if we want to talk about the name itself as a historical fact, we need a distinct identity. All these cases are needed but rare for names. For texts, it is the opposite. They are more often in files than in literals.
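
The two concepts of identity can be sketched as follows, assuming tuple-based triples and a hypothetical property identifier crm:has_symbolic_content_variant for the "has symbolic content variant" property proposed earlier in this thread. The node ex:family_name_1 is likewise invented for illustration.

```python
# Hypothetical property name for the proposed "has symbolic content variant".
VARIANT = "crm:has_symbolic_content_variant"

# A name without a single standard spelling gets its own URI node,
# with each attested spelling attached as a variant.
triples = [
    ("ex:family_name_1", "rdf:type", "crm:E41_Appellation"),
    ("ex:family_name_1", VARIANT, "Dörr"),
    ("ex:family_name_1", VARIANT, "Derr"),
]


def same_name(a, b):
    """Match by character identity, or by shared membership in the
    variant set of one Appellation node."""
    if a == b:  # identity based on the exact character sequence
        return True
    forms_by_node = {}
    for s, p, o in triples:
        if p == VARIANT:
            forms_by_node.setdefault(s, set()).add(o)
    return any(a in forms and b in forms for forms in forms_by_node.values())


print(same_name("Derr", "Dörr"))
```

A name that is fully determined by its characters needs no node of its own; only the variant case requires the URI.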

On the other hand, only Symbolic Objects can "reside" both on computers and outside. Therefore the "punning" problem only occurs in connection with Symbolic Objects. Only these can have a "value" in the machine, whereas rdf:value may be about anything.



Best,

Martin

Best wishes,

Richard
I agree that we may over-think the point. As I mentioned, the superproperty statement I propose has no other effect than that I can get E41's and labels back by querying P1 only.
Opinions?

Best,

Martin

On 9/12/2018 9:56 AM, Richard Light wrote:
On 11/09/2018 20:02, Martin Doerr wrote:
Dear All,
Firstly, apologies, the RDF was wrong; it was intended to be P1 is superproperty of rdfs:label.
I'm not sure that this is something we need to state at all, and I worry that - if it is included in our RDFS Schema - it may bring unwanted side-effects. Isn't this saying that any instance of rdfs:label is to be treated as an instance of P1? Bear in mind that CRM data may co-exist in triple stores in company with other RDF data, which may well use rdfs:label for its own purposes. This assertion that 'all rdfs:labels are P1 relationships' would then be applied to this other data as well. This might well result in incorrect/spurious results when SPARQL queries are applied to the data.
In general, I suggest that we are ok to define sub-classes/properties of standard RDFS types, but we shouldn't define super-classes/properties of them. (I would welcome comments on the validity of this suggestion from someone who understands RDF better than me.)
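
The side effect described here can be illustrated with a toy version of the RDFS subproperty entailment rule (rdfs7: if p is a subproperty of q and s p o holds, then s q o holds). This is a sketch over tuple triples, not a real reasoner; the ex: and other: identifiers are invented for illustration.

```python
SUBPROP = "rdfs:subPropertyOf"
P1 = "crm:P1_is_identified_by"


def infer_subproperties(triples):
    """Add (s, q, o) for every (s, p, o) where p is a subproperty of q,
    repeating until nothing new can be derived."""
    inferred = set(triples)
    changed = True
    while changed:
        changed = False
        supers = {(s, o) for s, p, o in inferred if p == SUBPROP}
        for s, p, o in list(inferred):
            for sub, sup in supers:
                if p == sub and (s, sup, o) not in inferred:
                    inferred.add((s, sup, o))
                    changed = True
    return inferred


store = {
    ("rdfs:label", SUBPROP, P1),                     # the proposed statement
    ("ex:vase42", "rdfs:label", "Red-figure vase"),  # CRM data
    ("other:Widget", "rdfs:label", "Widget class"),  # unrelated data, same store
}

closure = infer_subproperties(store)
p1_subjects = sorted(s for s, p, o in closure if p == P1)
print(p1_subjects)
```

Under these assumptions the unrelated resource also acquires a P1 statement, because the rule applies to every rdfs:label triple in the store, not only to those in CRM data.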
Semantically, the range of rdfs:label, when used, is ontologically an Appellation in the sense of the CRM.
Agreed (see my reply from yesterday). The conclusion I draw from this is that we can validly say:
E1 rdfs:label "string value" is a shortcut for the path 'E1 CRM Entity' 'P1 is identified by' 'E41 Appellation' ...
in exactly the same spirit as the similarly-worded note which we find in the definition of P1 itself. (Obviously, by using this shortcut, we lose the information that this string value is an Appellation, but that's the nature of short-cuts.)
I agree with George that all RDF nodes should have a human-readable label. They name the thing, even if it is a technical node. I would find it confusing to say that labels are not to be queried, only to be read, and that the "real" names must have a URI,
regardless of whether I have more to say about it.

I am really not a fan of punning, we definitely forbid it in the CRM.
The point with Appellations is that some, the simple ones, can directly be represented in the machine, or be outside. The solution to assign a URI in all cases, and then a value or label, does not make the world easier. It is extremely bad performance. We talk here about implementation, not about ontology. You simply get a useless explosion of the graph for the purpose of theoretical purity.
Agreed. What we need to do is to propose a simple way of expressing simple Appellations in RDF. That is why my shortcut definition above ends with '...': I don't think we have yet decided how to do this.
I've just been looking over the draft document we are trying to write, and it currently says that a fully-worked-out path will use 'P3 has note -> E62 string' to express the value of an E41 Appellation. This (i.e. the suggestion to use P3) comes from the definition of the superclass E90 Symbolic Object. A comment in our draft RDF document questions whether this is sufficiently precise, since P3 is simply "a container for all informal descriptions about an object that have not been expressed in terms of CRM constructs". I suggest that we need either to use rdf:value to hold the string value, or (better) to define a CRM-specific subproperty of rdf:value and use that. (This subproperty could be part of the published CRM, or it could just form part of the 'RDF implementation' guidelines.) I don't think that we should use rdfs:label here.
I don't think we should concern ourselves with URLs in our RDF guidance document. Any implementer of our RDF solutions can choose to assign a URL to represent any node in the structure, but it won't change the logic of the resulting RDF, or how it responds to SPARQL queries.
Those claiming confusion should be more precise. Has someone looked at query benchmarks? Has someone looked at graphical representations of RDF graphs? Do they really look better?
So either we ignore the issue, and write queries that collect names either via P1, URI and a value/label, or via a label, because this is where names appear in RDF; we make no punning, but our queries implement exactly this meaning. So, we are not better, but act as if we didn't know.
Or, we describe the fact by punning, have one superproperty for all cases, which we can query, and thereby stop the discussion whether labels are allowed or not, and how they relate to appellations. The punning comes in because the range of the superproperty must comprise the ranges of the subproperties. We can play a bit more, make the punning with a superproperty of P1, and have both P1 and rdfs:label as subproperties of it, if this is preferred. The solution I describe is just a logical representation of the situation, not creating a different situation. It just says that names can be complex objects or simple literals.
As I said yesterday, I don't see how any punning strategy can make differently-structured RDF equivalent for the purposes of querying. Therefore, I think we will have to accept that if we allow more than one way of representing a given statement in CRM RDF, we will have to construct queries which look explicitly for each of the possible patterns.
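
What "looking explicitly for each of the possible patterns" amounts to can be sketched as below, assuming tuple-based triples, the shortcut via rdfs:label, and a full path via P1 plus a hypothetical crm:has_symbolic_content property on the Appellation node. All ex: identifiers are invented.

```python
P1 = "crm:P1_is_identified_by"
CONTENT = "crm:has_symbolic_content"  # hypothetical property name
LABEL = "rdfs:label"

triples = [
    # Fully-worked-out path: entity -> Appellation node -> string
    ("ex:person1", P1, "ex:name1"),
    ("ex:name1", "rdf:type", "crm:E41_Appellation"),
    ("ex:name1", CONTENT, "Martin"),
    # Shortcut: entity -> string
    ("ex:person2", LABEL, "Richard"),
]


def names_of(subject):
    """Collect name strings by checking both patterns explicitly."""
    shortcut = {o for s, p, o in triples if s == subject and p == LABEL}
    nodes = {o for s, p, o in triples if s == subject and p == P1}
    full_path = {o for s, p, o in triples if s in nodes and p == CONTENT}
    return shortcut | full_path


print(names_of("ex:person1"), names_of("ex:person2"))
```

The union in names_of plays the role that a UNION of two graph patterns would play in a SPARQL query: the two shapes stay structurally different and the query has to name both.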
The problem is that the RDF literals do have meaning beyond being symbol sequences.
Insofar as they have such meaning, I would argue that we define it (i.e. that meaning) by the CRM context in which we place the string/literal value. I think there is a danger that we could over-think this problem.
Richard
The punning does not introduce the problem. With or without, the queries have to cope with names in either form. This holds similarly for space primitives and large geometry files, for short texts and equivalent files etc.
Opinions?

Best

Martin
--
*Richard Light*


_______________________________________________
Crm-sig mailing list
Crm-sig@ics.forth.gr
http://lists.ics.forth.gr/mailman/listinfo/crm-sig
--
--------------------------------------------------------------
  Dr. Martin Doerr              |  Vox:+30(2810)391625        |
  Research Director             |  Fax:+30(2810)391638        |
                                |  Email:mar...@ics.forth.gr  |
                                                              |
                Center for Cultural Informatics               |
                Information Systems Laboratory                |
                 Institute of Computer Science                |
    Foundation for Research and Technology - Hellas (FORTH)   |
                                                              |
                N.Plastira 100, Vassilika Vouton,             |
                 GR70013 Heraklion,Crete,Greece               |
                                                              |
              Web-site:http://www.ics.forth.gr/isl            |
--------------------------------------------------------------

