Re: [Crm-sig] ISSUE: representing compound name strings
I think it's potentially helpful to encode compound data such as personal names using XML literals in an RDF graph, for display purposes, but not for SPARQL querying. For efficient querying, I don't see any good alternative to providing separate literals for the individual components of the name, such as with "foreName", "surname", etc. properties in separate RDF triples. I suggest that the RDF encoding guidelines could recommend adopting both practices (i.e. redundant representation both as parts and also as a whole).

On Fri, 23 Nov 2018 at 03:53, Martin Doerr wrote:

> Dear Richard, Robert,
>
> It is simply wrong that encoding structured data into an rdfs:Literal
> makes it invisible to SPARQL. It is exactly what xsd:dateTime does. The
> year, month, etc., is available to querying individually in SPARQL, not by
> magic but by a standard extension mechanism.

The date functions in SPARQL that allow an xsd:dateTime literal to be parsed into months, days, etc. are not really an extension to SPARQL; they are part of the SPARQL language standard: <https://www.w3.org/TR/2013/REC-sparql11-query-20130321/#func-date-time>. Because xsd:dateTime is a standard data type in SPARQL, a SPARQL processor can achieve efficiencies by normalizing the values (to a standard time zone) and using the normalized form in comparisons.

The SPARQL specification does allow for SPARQL implementations to have "extension" functions, though, and to extend the operation of built-in SPARQL operators such as "<" or "=", so hypothetically a SPARQL store might offer XPath-evaluation functions to query inside XML literals, analogously to the way that the REGEX and REPLACE functions do with string literals. This kind of hybrid RDF graph/XML tree model could be supported effectively by a SPARQL store which maintained indices of the tree structure of the XML literal objects it contained.

I believe Virtuoso actually has such a feature, and there may well be other SPARQL engines with a similar feature, but I personally think it would be unhelpful for the CRM to suggest an approach that depends on such a non-standardised extension.

> It is a question to IT experts to tell us how to upload into the SPARQL
> code the respective string functions for other compounds.

The standard SPARQL string functions (including regular expressions) can be used to parse "compound" string literals, though not to parse XML literals in general, since XML is not a regular language. Of course the CIDOC CRM could suggest "regular" XML encodings for particular types of compound literals; for example a "persName" data type could be defined and constrained with a regular expression to require that it begins with "" and ends with "", optionally containing child elements beginning with "" and ending with "", and even for these elements to have attributes (such as 'type') drawn from a particular value space. They could be queried using SPARQL string functions, e.g. like so:

    SELECT ?person WHERE { ?person tei:persName ?persName. FILTER(CONTAINS(?persName, 'Richard')) }

However, relying on SPARQL FILTER and string-parsing would be grossly inefficient in terms of query performance, compared to querying individual properties, e.g.

    SELECT ?person WHERE { ?person tei:foreName 'Richard'. }

If the "compound" XML literals are not intended for fine-grained querying, they can still be valuable for display purposes, but I don't see much value in constraining them beyond the general "XML literal" datatype. An information system that understands XML literals can examine the XML and process it appropriately based on its namespace.

--
Conal Tuohy
http://conaltuohy.com/
@conal_tuohy
+61-466-324297
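The "both as parts and as a whole" suggestion in the message above can be sketched roughly as follows. This is only an illustration in plain Python, not an RDF library: triples are modelled as tuples, and the XML element names, the `ex:` identifier and the surname "Example" are assumptions made up for the sketch rather than anything the CRM or TEI prescribes.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML literal for a personal name; the element names are
# assumptions for illustration only.
PERS_NAME_XML = (
    "<persName><foreName>Richard</foreName> "
    "<surname>Example</surname></persName>"
)


def name_triples(person, xml_literal):
    """Return triples that hold the name both as a whole and as parts."""
    root = ET.fromstring(xml_literal)
    # The whole XML literal is kept, e.g. for display purposes.
    triples = [(person, "tei:persName", xml_literal)]
    # Each child element also becomes its own plain literal, so that a query
    # can match it directly without FILTER or string parsing.
    for child in root:
        triples.append((person, "tei:" + child.tag, (child.text or "").strip()))
    return triples


def people_with(triples, predicate, value):
    """Exact-match lookup, analogous to a single SPARQL triple pattern."""
    return [s for (s, p, o) in triples if p == predicate and o == value]


triples = name_triples("ex:person1", PERS_NAME_XML)
print(people_with(triples, "tei:foreName", "Richard"))
```

The lookup on `tei:foreName` corresponds to the second, cheaper query in the message; the stored `tei:persName` literal is still there for any system that wants to display the name as a whole.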
Re: [Crm-sig] new technical paper
> > #E5: I think we point into an XML-RDF file.
>
> Yes, that's the intention, but it will only work if there is an element in
> that file with attribute xml:id="E5" to act as the target for the link.
> This strategy works fine for the links into the HTML expression of the RDF,
> but not for the RDF/XML expression.

I don't think that's correct, actually. The issue of how to resolve URI fragment identifiers is a complicated one, since a URI as a whole refers to a resource (in the Web Architecture sense), while the interpretation of the fragment identifier part of the URI is dependent on the media type of the *representation* of the resource, and of course a single HTTP URI may correspond to many different media types, if the web server supports content negotiation for that resource.

So while it's true that fragment identifiers used with XML documents in general (i.e. the "application/xml" media type) refer to elements by the value of their xml:id attribute (or other attribute whose type is ID), this interpretation is overridden in the case of RDF/XML (the "application/rdf+xml" media type). The registration document for "application/rdf+xml" says that the fragment identifier should be interpreted as corresponding to the rdf:about and rdf:ID attributes. https://tools.ietf.org/html/rfc3870#page-4

On the other hand, it certainly is true that redirecting from http://www.cidoc-crm.org/cidoc-crm/E5_Event to http://www.cidoc-crm.org/html/5.0.4/cidoc-crm.html#E5 is a mistake. The redirect location should be http://www.cidoc-crm.org/html/5.0.4/cidoc-crm.html#E5_Event (since "E5_Event" is the value of the rdf:about attribute in the RDF/XML, not "E5").

http://conaltuohy.com/
@conal_tuohy
+61-466-324297
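As a rough illustration of the point about fragment identifiers in RDF/XML, the sketch below looks up the element that describes the resource named by a fragment, using rdf:ID and rdf:about rather than xml:id. The document, the base URI on example.org and the function name are all made up for the example, and xml:base handling is deliberately left out to keep it short.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Placeholder base URI and document; neither is the real CRM schema file.
BASE = "http://example.org/schema.rdf"
RDF_XML = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">
  <rdfs:Class rdf:ID="E5_Event"/>
  <rdfs:Class rdf:about="#E7_Activity"/>
</rdf:RDF>"""


def find_by_fragment(rdf_xml, base_uri, fragment):
    """Return the element describing base_uri#fragment, or None."""
    target = base_uri + "#" + fragment
    for element in ET.fromstring(rdf_xml).iter():
        rdf_id = element.get("{%s}ID" % RDF_NS)
        about = element.get("{%s}about" % RDF_NS)
        # rdf:ID="x" names the resource base#x.
        if rdf_id is not None and base_uri + "#" + rdf_id == target:
            return element
        # rdf:about holds a URI reference, resolved against the base URI.
        if about is not None and urljoin(base_uri, about) == target:
            return element
    return None


print(find_by_fragment(RDF_XML, BASE, "E5_Event") is not None)
print(find_by_fragment(RDF_XML, BASE, "E5") is not None)
```

In this made-up document the fragment "E5_Event" finds a target while "E5" does not, which mirrors why the shorter fragment in the redirect fails.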
Re: [Crm-sig] ISSUE: E74 Group (from LRMoo discussions)
It's the citizenry as a whole which decides who that government should be. It's not the decision of an individual elector, but of the citizenry as a group.

On 9 May 2018 at 16:40, Pat Riva wrote:

> Each individual in a democratic nation can choose to vote (or not).
>
> Then the elected government (an LRM-E8 Collective agent) takes actions and
> is responsible, not all those people holding that citizenship.
>
> Pat Riva
> Associate University Librarian, Collection Services
> Concordia University
>
> From: Conal Tuohy
> Sent: May 9, 2018 1:43 AM
> To: Pat Riva
> Cc: CRM-SIG
> Subject: Re: [Crm-sig] ISSUE: E74 Group (from LRMoo discussions)
>
> On 7 May 2018 at 14:27, Pat Riva wrote:
>
>> Propose to modify the scope note of E74 Group so that it clearly
>> corresponds to LRM-E8 Collective Agent. To do this any groups of people not
>> having agency, such as national, religious, cultural, ethnic groups, must
>> be excluded from the scope of E74.
>
> This strikes me as odd! Is it really true that the citizenry of a nation
> is entirely lacking in agency? Can they not take political decisions, for
> instance?

--
Conal Tuohy
http://conaltuohy.com/
@conal_tuohy
+61-466-324297
Re: [Crm-sig] ISSUE: E74 Group (from LRMoo discussions)
On 7 May 2018 at 14:27, Pat Riva wrote:

> Propose to modify the scope note of E74 Group so that it clearly
> corresponds to LRM-E8 Collective Agent. To do this any groups of people not
> having agency, such as national, religious, cultural, ethnic groups, must
> be excluded from the scope of E74.

This strikes me as odd! Is it really true that the citizenry of a nation is entirely lacking in agency? Can they not take political decisions, for instance?

--
Conal Tuohy
http://conaltuohy.com/
@conal_tuohy
+61-466-324297
Re: [Crm-sig] Properties of properties in RDF
Dear Martin

I'm not sure what you meant by "partially declared subproperties" there (the ambiguity of the term "subproperty" in this discussion doesn't help). I think I understood the rest of what you were saying, though.

To be clear, all I was saying was that I would prefer not to publish RDF that directly uses those generic RDF predicates P01_has_domain and P02_has_range, but instead to use a set of more specific predicates (which could be defined to be (RDFS) subproperties of those two predicates). So each distinct type of CRM property which had been reified as an RDFS class (e.g. PC14_carried_out_by) would have its own pair of RDF properties for linking to instances of its domain and range.

My rationale for that preference is that it would be more meaningful to users to use RDF predicates called Pxxx_has_actor (with a domain of PC14_carried_out_by and a range of E39_Actor) and Pxxx_has_activity (with a domain of PC14_carried_out_by and a range of E7_Activity), rather than the generic predicates P01_has_domain and P02_has_range. Plus it would give us more type-safety.

It would be a trivial extension to the existing RDFS to add those extra RDFS subproperties (about 60 of them, including the inverses).

Regards

Conal

On 15 March 2018 at 20:37, Martin Doerr wrote:

> Dear Conal,
>
> There is no conflict with adding subproperties. Once we have defined in
> FOL the logic of properties of properties, each PC class implies its base
> property. Hence, logically, the subproperty and any added ".1" will hold
> for the instances declared and imply the same base property. If, at any
> time we wish to connect term hierarchies of roles for the .1 properties
> with partially declared subproperties, we need a straight-forward extension
> of the CRM. Any subproperty, e.g., may refine domain and range.
>
> All the best,
>
> Martin
Re: [Crm-sig] Properties of properties in RDF
Thanks Martin, for the link to http://www.cidoc-crm.org/sites/default/files/CRMpc_v1.1_0.rdfs

This is actually very close to (and compatible with) the approach I suggested in my earlier email, and I'm embarrassed to say I wasn't aware of it at all.

I've managed to find some background material (though I had to use Google to find it!)

http://www.cidoc-crm.org/Issue/ID-266-reified-association-vs-sub-event is an archive of a relevant discussion.

http://www.cidoc-crm.org/sites/default/files/Roles.pdf presents a few slides showing options for modelling properties of properties, including the "Property Class" approach.

These slides include a nice illustration of the approach defined in the RDFS: http://www.cidoc-crm.org/sites/default/files/20160802PropertiesOfProperties.pptx

I think I'd be very happy with this "Property Class" approach, although rather than using the generic properties P01_has_domain, P02_has_range, and their inverses, I would still want to define specific subproperties. For example, in the case of actors playing a specific role in the performance of an activity, I would prefer to link the performance (i.e. the instance of PC14 carried out by) to the actor and the activity using domain-specific properties such as has_actor and has_activity.

Conal

On 15 March 2018 at 04:25, Martin Doerr wrote:

> Dear All,
>
> Please see: http://www.cidoc-crm.org/sites/default/files/CRMpc_v1.1_0.rdfs
> on page http://www.cidoc-crm.org/versions-of-the-cidoc-crm, plus the
> issues discussing the solution for version 6.2 (I'll look for all
> references).
>
> Best,
>
> martin
[Crm-sig] Properties of properties in RDF
On 8 March 2018 at 18:02, Richard Light wrote:

> I was thinking last night that maybe we should focus our RDF efforts on
> exactly this issue: the representation of the CRM primitive classes E60,
> E61 and E62 in RDF. The current RDF document is becoming quite
> wide-ranging in its scope, and (for example) you have questioned whether
> certain sections belong in it. If we concentrate on this single aspect of
> the broader RDF issue, I think we can produce something which is of
> practical value relatively quickly. In particular, I would like to devote
> time to this during the Lyon meeting.

I applaud the idea of focusing narrowly on something so as to produce something of practical value quickly!

But I do hope that the other issues raised in that document will not be set aside too long, or lost.

In particular, it seems to me that the mapping from the CRM's "properties of properties" to RDF is actually a more serious gap.

In the CRM, there are a number of properties which are themselves the domain of properties. In RDF, however, a property does not have properties of its own. Incidentally, I remember years ago being able to model this directly in ISO Topic Maps, but practical considerations of interoperability and community dictate that RDF, despite its simpler model, is the technology of choice today.

One example of the issue is how to model the role that individuals play in events. If a concert performance X was P14 carried out by person Y, then this maps naturally to an RDF triple in which the predicate is crm:P14_carried_out_by. However, if the person carried out that activity in a particular role (e.g. as a saxophonist) then things are more difficult. In the CRM, the P14_carried_out_by itself has the property P14.1_in_the_role_of, whose value could be an instance of E55_Type: Saxophonist. This is pleasingly consistent with how the CRM handles taxonomies in other parts of the model, but it is not workable in RDF because the P14_carried_out_by property cannot itself have a property.

There are a number of "work-arounds" to this issue, such as simply ignoring the problem and "dumbing down" the data, or moving the locus of classification from the property to the property value (e.g. in this case that would mean classifying the person rather than their role; that doesn't work very well because people may have many distinct roles, but it works better for other cases).

The existing guidance would suggest defining a new "saxophone-played-by" property to be an rdfs:subPropertyOf P14_carried_out_by. This can certainly work, but it's actually a poor expression of the CRM's model. It negates the practical benefits of having external taxonomies for this kind of classification. This guidance, in my opinion, makes too much of the apparent similarity between the CRM's properties and RDF properties. They are not in fact the same kind of thing, and a property which itself bears properties is more closely approximated in RDF not as a property but reified as a subject resource in its own right.

A more faithful mapping of the CRM's abstract model to RDF would introduce a new RDFS class corresponding to the performance of the activity. We could then say that concert performance X was P14a_performed_in Performance Z; that Performance Z was P14b_carried_out_by person Y, and that Performance Z was P14.1_in_the_role_of Saxophonist.

That's just one example of the general problem; there are a number of others, which are listed here in the context of the Linked Art project: https://github.com/linked-art/linked.art/issues/55 along with a variety of options for dealing with the issue.

In my opinion the current situation with respect to properties of properties (in RDF) is really quite unsatisfactory and could be substantially improved by a more consistent treatment across the entire schema.

--
Conal Tuohy
http://conaltuohy.com/
@conal_tuohy
+61-466-324297
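The reified "performance" mapping proposed in the message above can be sketched with plain tuples standing in for RDF triples. The names P14a_performed_in and P14b_carried_out_by are the illustrative ones used in the message, placed here under a hypothetical `ex:` prefix because they are not CRM properties; the resource identifiers are likewise invented.

```python
# (subject, predicate, object) tuples standing in for RDF triples.
TRIPLES = [
    ("ex:concert_X", "ex:P14a_performed_in", "ex:performance_Z"),
    ("ex:performance_Z", "ex:P14b_carried_out_by", "ex:person_Y"),
    ("ex:performance_Z", "crm:P14.1_in_the_role_of", "ex:saxophonist"),
]


def objects(triples, subject, predicate):
    """All objects of triples matching the given subject and predicate."""
    return [o for (s, p, o) in triples if s == subject and p == predicate]


def roles_in(triples, activity, actor):
    """Roles the actor played in the activity, read off the reified node."""
    roles = []
    for performance in objects(triples, activity, "ex:P14a_performed_in"):
        if actor in objects(triples, performance, "ex:P14b_carried_out_by"):
            roles.extend(objects(triples, performance, "crm:P14.1_in_the_role_of"))
    return roles


def shortcut_triples(triples):
    """Derive the plain P14_carried_out_by triples implied by the reified form."""
    derived = []
    for (activity, p, performance) in triples:
        if p == "ex:P14a_performed_in":
            for actor in objects(triples, performance, "ex:P14b_carried_out_by"):
                derived.append((activity, "crm:P14_carried_out_by", actor))
    return derived


print(roles_in(TRIPLES, "ex:concert_X", "ex:person_Y"))
print(shortcut_triples(TRIPLES))
```

Because the role hangs off the intermediate node, the simple P14_carried_out_by statement can still be derived from it, while the role value stays a reference into an external taxonomy instead of being baked into a property name.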
Re: [Crm-sig] P90 etc.
On 9 March 2018 at 04:39, Martin Doerr wrote:

> I recommend NOT to recommend rdf:value, because RDFS 1.1 defines:
> "5.4.3 rdf:value
> rdf:value is an instance of rdf:Property
> <https://www.w3.org/TR/rdf-schema/#ch_property> that may be used in
> describing structured values. rdf:value has no meaning on its own."
>
> As CRM-SIG, we cannot recommend a property without meaning. We do ontology
> here, so there must be a minimal ontological commitment. Are there other
> opinions?

My opinion is that the real value of rdf:value is that it effectively negates one of the weaknesses in the expressiveness of RDF, with respect to the CRM.

In RDF, a literal value is a second-class citizen: it has no identifier, which makes it ineligible to appear as the subject of a triple, so it can't have properties of its own. It can't be woven into the "Web of Data". It can't effectively function as an "access point" (in the library science sense) without some additional context.

As Linked Data practitioners, we generally have literals like "Conal Tuohy" as our source data for e.g. Appellations (and it's worth noting that all of the formal examples of E41 Appellation are given as string literals), but it's highly undesirable to encode an E41 Appellation merely as a literal; such an encoding would make it impossible, either for us or for third parties, to annotate that name with properties of its own ("A name of Irish origin ..."). So we must mint an identifier, either a local ("blank node") identifier -- or better still, an HTTP URI -- for that name (e.g. "_conal_tuohy"), so that we can then attach other properties to that identifier.

We are left, finally, with the residual problem of how to associate the literal name value itself ("Conal Tuohy") with that identifier. This is where rdf:value plays a valuable role, effectively just equating the literal with the identifier; it is described as having "no meaning on its own" precisely because it really plays only a syntactical role.

This is why I think it would be a mistake to critique the use of rdf:value on the basis of it "lacking meaning of its own"; it would be equivalent to criticising a relational database for having an Appellation table with a column named "value".

Regards

"Conal Tuohy"

> Taken the above definition in RDFS 1.1, I question both, the precise use
> and the emerging good practice,
> until better evidence:-).
> Do you have better evidence?
>
> It is up to crm-sig to decide, I present only my opinion here.
>
> Best,
>
> martin
>
> On 3/8/2018 6:28 PM, Robert Sanderson wrote:
>
> Martin,
>
> Could you clarify why you have changed your mind about rdf:value?
>
> > I recommend NOT to recommend rdf:value
>
> In particular, in the last week you said:
>
> “CRM-SIG normally works reactively: When a good community practice
> emerges, this is taken up.”
>
> and
>
> “Whatever the vast majority is and rdf:value does the job, I have no
> objections to its use.
> Just define precisely what you use it for. We can add that to our
> guidelines. It is already standard rdf.”
>
> Thanks,
>
> Rob

--
Conal Tuohy
http://conaltuohy.com/
@conal_tuohy
+61-466-324297
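The argument in the message above, that a name needs its own identifier before anything can be said about it and that rdf:value then ties the string to that identifier, can be sketched as follows. Plain tuples stand in for triples; the `ex:` identifiers and the small `Literal` helper class are assumptions made for the sketch, not part of any RDF library.

```python
from typing import NamedTuple


class Literal(NamedTuple):
    """A literal value: it carries text but has no identifier of its own."""
    text: str


TRIPLES = [
    ("ex:person_1", "crm:P1_is_identified_by", "ex:_conal_tuohy"),
    ("ex:_conal_tuohy", "rdf:type", "crm:E41_Appellation"),
    # rdf:value attaches the string itself to the minted identifier.
    ("ex:_conal_tuohy", "rdf:value", Literal("Conal Tuohy")),
    # A statement *about* the name, possible only because the name is a node.
    ("ex:_conal_tuohy", "crm:P3_has_note", Literal("A name of Irish origin")),
]


def literals(triples, subject, predicate):
    """Text of every literal attached to the subject by the predicate."""
    return [o.text for (s, p, o) in triples
            if s == subject and p == predicate and isinstance(o, Literal)]


def subjects(triples):
    """Every term that appears in subject position."""
    return {s for (s, _, _) in triples}


print(literals(TRIPLES, "ex:_conal_tuohy", "rdf:value"))
print(literals(TRIPLES, "ex:_conal_tuohy", "crm:P3_has_note"))
```

No `Literal` ever appears in subject position here, which is the "second-class citizen" point; the string of the name and a note about the name remain distinguishable because they hang off different predicates.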
Re: [Crm-sig] Domain and range of P90
One of the "gaps" which puzzles me most is the example you give of encoding the string value of an Appellation. I understand the recommended practice is to attach the string value of a person's name using P3_has_note, or actually, using a custom subproperty of P3_has_note. The semantics of P3_has_note itself are weak; a note is simply an "informal description" of something, so if I have a particular name (an RDF resource) which P3_has_note the literal string "Conal Tuohy", then I should really define subproperties so as to be able to distinguish that string value from a note which really is nothing more than an "informal description" of that name, e.g. "A very uncommon name of Irish origin".

What puzzles me most about this "gap" in the RDFS specification is that the distinction between a note ABOUT a name, and the actual textual representation OF a name, is somehow considered out of scope of the CRM in RDFS. It's puzzling, because the string value of a name is something which really must be encoded in a standard fashion, to achieve interoperability (as an aside, my personal view is that the string literal "Conal Tuohy" could be attached to an Appellation using the rdf:value or rdfs:label predicate defined in the RDFS spec). But the important thing is that the RDFS schema should stipulate how to attach this literal data rather than leave it as an open question.

In general these are the kinds of issues which puzzle many people who approach the CRM from a position of having already worked with other RDF ontologies in the cultural heritage space, and find themselves wondering how they are supposed to make these details of the CRM work in RDF in an interoperable way, without having to pick and choose from a variety of techniques for "finessing" the gaps. These kinds of gaps are serious barriers to interoperability in the Linked Open Data cloud, and they need to be addressed by agreeing on some encoding procedures that can be used consistently by different projects on the web.

It would be helpful to CRM adopters in the Linked Data community if these gaps could be filled in a manner which is clear and simple and interoperable. I am not in favour of just offering a menu of possible approaches, especially where individual projects would have to make local customisations to their schema. If there is some particular value in multiple approaches, then they could be published as different "profiles" that encoders could simply adopt, as a whole.

I think the recent effort by Richard Light (and other contributors) to collate guidelines on RDF encoding is a great initiative! <https://docs.google.com/document/d/1zCGZ4iBzekcEYo4Dy0hI8CrZ7dTkMD2rJaxavtEOET0/edit> It deserves more input and I hope it will continue to be discussed on the list. I also think the Linked Art project http://linked.art/ with its "profile" of the CRM is another really good way forward.

Regards

Conal

On 22 February 2018 at 19:46, George Bruseker wrote:

> Dear Phil et al.,
>
> I think this is a case of interpreting the label of the property rather
> than its intention. CRM ‘has value’ isn’t supposed to cover all possible
> meanings of the natural language interpretation of has value. Rather it has
> a very restricted use. It is meant to give the quantitive number value
> associated to a dimension. Dimension is a class that should be used to
> store information that results from a measurement activity. The measurement
> activity is specified as some procedural event that has the intentional
> objective of producing quantitative data. It is an activity of interacting
> with the world with the intention of producing a quantitive result.
>
> So it would be nonsensical to say 'this paragraph (E73) has dimension
> (E54 defined as a quantitive result from a measuring procedure) has value
> “the characters in this paragraph” (E59 primitive value).
> The definition of
> E54 forbids it because a string is not a quantity (though of course it may
> have a quantity… that would have to be measured).
>
> That of course sounds irritating. It would be nice to have a property that
> could store all values. But then of course that property would mean
> everything and nothing and the ontology wouldn’t work for getting specific
> information, like the quantitative results of measurement activities
> separate from any other value ‘good’ ‘bad’ ‘ugly’ ‘monogamy’ ‘world peace’
> ‘all the characters in this present string’.
>
> That’s the ontological argument. The practical question is why you are
> looking to expand the scope. I’m guessing that the reason is because you
> want a unique place to store a data value (this is a guess, so please do
> correct my presumption if I’m wrong).
>
> This seems to me to get back to the encoding issue and having a standard
> strategy. I th
Re: [Crm-sig] Domain and range of P90
I have used rdf:value for this purpose. https://www.w3.org/TR/rdf-schema/#ch_value

The CRM's origin was outside of the RDF space, and it is still considered to be something more abstract than any concrete expression in RDFS or OWL. This is why, I think, there remains a puzzling gap between RDF resources which are instances of CRM classes and their literal values which must be expressed using primitive RDF data types. The point of rdf:value, as I understand it, is to fill in gaps like these.

On 22 February 2018 at 02:04, Carlisle, Philip <philip.carli...@historicengland.org.uk> wrote:

> Dear all,
>
> Naïve question.
>
> Is there any reason why P90 has value could not/should not change its
> domain and range from:
>
>   Domain          Range
>   E54 Dimension   E60 Number
>
> to
>
>   E1 CRM Entity   E59 Primitive Value
>
> I look forward to your answers
>
> Phil
>
> Phil Carlisle
> Knowledge Organization Specialist
> Listing Group, Historic England

--
Conal Tuohy
http://conaltuohy.com/
@conal_tuohy
+61-466-324297
Re: [Crm-sig] Associative relationship mapping
Hi Philip!

I very much like Stephen's suggestion of modelling generic relationships by reifying subsets of the museum's database records as a set of E73 Information Objects, each of which *P67 refers to* a set of "generically related" objects. The nice thing about an "Information Object" is that the semantics it carries are not required to be expressed in terms of the CIDOC-CRM, so it doesn't matter that the exact semantics aren't known. This technique seems to me like it could be very generally useful for representing information from legacy systems with under-specified semantics.

I was confronted with the same issue when I was experimenting with building a CIDOC-CRM interpretation of the data exposed by Museum Victoria's Collections API. Actually, I wish I'd thought of Stephen's approach, now, rather than the approach I took, which I wrote up on my blog: <http://conaltuohy.com/blog/bridging-conceptual-gap-api-cidoc-crm/>

Of particular relevance to your question is the section about how to model these "generic relations" between collection items: <http://conaltuohy.com/blog/bridging-conceptual-gap-api-cidoc-crm/#relationships>. The problem is that MV's underlying database records did not document any detailed semantics for this relationship, and that a number of different types of relationships might have been represented using the same data structure.

If you are aiming to model the relationship as something general enough to subsume all the instances of the relationship, then owl:topObjectProperty would certainly work for this purpose, but you might perhaps find something semantically stronger. Empirical investigation might allow you to use one of the CIDOC-CRM's properties (though it might show that the general relationships are actually too heterogeneous for that).
In the case of my experiment, my reading of MV's data led me to believe that the actual relationships could legitimately be encoded as relationships of similarity, and represented with P130_shows_features_of (in the symmetrical, non-directed sense of that relationship), though this was controversial, as you can see by the comments on the post: <http://conaltuohy.com/blog/bridging-conceptual-gap-api-cidoc-crm/#comments>

The other approach would be to try to guess at a more particular meaning for each of these "general relationships", using clues from other available data. You might find that the "general relationship" between photographs and other items was one of depiction, for instance, and be able to automate that inference in your mapping. But that's an empirical question, and potentially a lot of work.

Regards

Conal

On 15 September 2016 at 20:16, Carlisle, Philip <philip.carli...@historicengland.org.uk> wrote:

> Hi all,
>
> The Arches project moves on apace and is in the process of modifying the
> graphs for version 4.
>
> In the original graphs we used a British Museum extension property
> (PXX_is_related_to) as a workaround to allow us to represent the general
> association relationship we had in legacy datasets, e.g. this telephone
> kiosk has a general association with this telephone exchange.
>
> We now want to continue to be able to model a general association but the
> only property available, P69 has association with (is associated with), is
> restricted in its domain and range to E29 Design or Procedure.
>
> How do we model the 'If you're interested in that you might be interested
> in this' nature of the general association between two physical man made
> things?
>
> All thoughts appreciated.
>
> Phil
>
> Phil Carlisle
> Knowledge Organization Specialist
> Listing Group, Historic England
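The E73 / P67 suggestion discussed in this thread can be sketched roughly as follows. This is only an illustration in Python (standard library only) that emits N-Triples for one legacy association record; the example.org URIs, the function name and the choice of the cidoc-crm.org namespace are assumptions made for the example, not something taken from the Arches graphs or from Stephen's message.

```python
# Minimal sketch: reify a legacy "general association" record as an
# E73 Information Object that P67_refers_to each associated item.
# The namespace below is an assumption; substitute whichever CRM
# encoding (e.g. the Erlangen CRM) your data actually uses.
CRM = "http://www.cidoc-crm.org/cidoc-crm/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def association_triples(record_uri, related_item_uris):
    """Return N-Triples lines describing one legacy association record."""
    lines = [f"<{record_uri}> <{RDF_TYPE}> <{CRM}E73_Information_Object> ."]
    for item_uri in related_item_uris:
        # One P67_refers_to statement per "generically related" item.
        lines.append(f"<{record_uri}> <{CRM}P67_refers_to> <{item_uri}> .")
    return lines


if __name__ == "__main__":
    # Hypothetical URIs for the kiosk/exchange example from the question.
    for line in association_triples(
        "http://example.org/association/42",
        [
            "http://example.org/item/telephone-kiosk",
            "http://example.org/item/telephone-exchange",
        ],
    ):
        print(line)
```

Nothing in this sketch asserts what the association means; the record merely refers to both items, which is the point of the approach when the legacy semantics are unknown.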
[Crm-sig] Blog post on mapping a museum web API to CIDOC CRM
I've written a third blog post in a series, exploring an experimental Linked Data publishing technique in which a transforming web proxy is used to transform and expose a museum collection JSON web API as RDF graphs (using the Erlangen CRM). This post deals with the challenges of mapping from the "record oriented" data model exposed by the API, to the more ramified CRM, with examples. Any feedback very gratefully received! http://conaltuohy.com/blog/bridging-conceptual-gap-api-cidoc-crm/
Re: [Crm-sig] Fixity Hash in CRM Addendum
This might also be a good time to dip into FRBRoo.

On 11 September 2015 at 13:14, daniel riley wrote:

> Hello folks,
>
> I'm adding a bit to this question since I think it's relevant to anyone in
> digital preservation. If anyone finds it off-topic, let me know.
>
> So, where we left off was that perhaps E38_Image wasn't the best entity to
> express a digital image of an artwork, since E38_Image doesn't specify a
> concrete manifestation of that image. However, the scope note for
> P138_represents explicitly states:
>
> "This property is also used for the relationship between an original and
> a digitisation of the original by the use of techniques such as digital
> photography, flatbed or infrared scanning."
>
> So it seems like the property is correct for specifying a digital version
> of the work but perhaps the Range entity is incorrect. Should I simply be
> using the superclass E73_Information_Object rather than E38_Image as the
> range, if I want to specify a digital image file with a specific set of
> bytes?
>
> Thanks,
> Daniel Riley
>
> On Wed, Sep 9, 2015 at 6:07 PM, daniel riley wrote:
>
>> Hi Simon,
>>
>> That makes sense. For instance, one image could have multiple sizes. We
>> would think about them as the same image but their hashes would be
>> completely different. I am not as familiar with FRBRoo, but I took a look
>> at F4 Manifestation Singleton, and I'm not sure if its intention is
>> something like this.
>>
>> One thing that is confusing is that in many cases, like in the British
>> Museum example here:
>>
>> http://collection.britishmuseum.org/resource?uri=http%3A%2F%2Fwww.britishmuseum.org%2Fcollectionimages%2FAN00037%2FAN00037369_001_l.jpg
>>
>> the resource is a specific digital version of an image with a specific
>> asset id and a specific filename. So it would seem that if I added a
>> property about that resource it would be about the specific binary data,
>> and not about all possible versions of that image.
>>
>> If anyone knows of an example implementation that addresses fixity it
>> would be a great help.
>>
>> Thanks,
>> Dan
>>
>> P.S. I was using British Museum's linked data as a guide for most of my
>> work:
>>
>> On Wed, Sep 9, 2015 at 5:23 PM, Simon Spero wrote:
>>
>>> Another problem with this is that a hash of a bit string does not
>>> identify an Image (even if the hash is 1:1).
>>>
>>> An Image is abstract and conceptual, and has an identity that is
>>> preserved across transformations that would generate different bit
>>> strings.
>>>
>>> Going the other way, I believe that CIDOC does require that the same
>>> bit string not correspond to multiple images. For example, an imaging
>>> sensor might capture an image with the shutter closed at the start of a
>>> series of measurements - such an image could be used for calibration.
>>> Many such images might have identical bit strings, but would be
>>> conceptually different works under some stances. However, since they have
>>> indistinguishable appearances, they are the same Image.
>>>
>>> Fixity hashes might be better treated as properties of a FRBRoo
>>> Manifestation; such properties are intrinsic to the Manifestation*; they
>>> are not externally assigned in the same way that a URI, accession number,
>>> etc. are.
>>>
>>> Simon
>>>
>>> * or as the value of a property that must be the same for every item
>>> that is an instance of that Manifestation
>>>
>>> On Sep 9, 2015 4:15 PM, "daniel riley" wrote:
>>>
>>>> Hello all,
>>>>
>>>> I wanted to get confirmation on the correct application of the
>>>> CIDOC CRM in the case of checksum hashes (i.e. fixity values).
>>>>
>>>> For instance if the hash of a digital image file computes to:
>>>> 6b8dca09e851a987050463c9c60603e9ad797ba09117056fc2e0c07bcac66e43
>>>>
>>>> My first thought would be to use:
>>>>
>>>> E38_Image - P1_is_identified_by - E42_Identifier (hash value)
>>>> E42_Identifier - P2_has_type - "SHA256 HASH"
>>>>
>>>> However, the scope note for E42_Identifier explicitly states:
>>>> The class E42 Identifier is not normally used for machine-generated
>>>> identifiers
>>>>
>>>> A hash is definitely machine generated, so what are the other options
>>>> here? Should I use a different ontology for this case?
>>>>
>>>> Thanks,
>>>> Daniel Riley
>>>> Verisart

--
Conal Tuohy
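The point made in this thread, that a fixity value belongs to a particular bit string rather than to the abstract Image, can be illustrated with a short Python sketch using the standard hashlib module. The two byte strings are hypothetical stand-ins for two renditions of the same image, and the function name is an assumption for the example.

```python
import hashlib


def sha256_fixity(data: bytes) -> str:
    """Return the SHA-256 digest of a byte string as lowercase hex."""
    return hashlib.sha256(data).hexdigest()


if __name__ == "__main__":
    # Hypothetical stand-ins for two renditions (say, full size and
    # thumbnail) of what a curator would call "the same image".
    full_size = b"stand-in bytes for the full-size rendition"
    thumbnail = b"stand-in bytes for the thumbnail rendition"

    print(sha256_fixity(full_size))
    print(sha256_fixity(thumbnail))
    # Different bit strings give different digests, so the digest can
    # only characterise one concrete file, not the conceptual Image.
    print(sha256_fixity(full_size) != sha256_fixity(thumbnail))
```

Read this way, the hash is a property of whatever entity stands for the specific file (a manifestation-level resource in Simon's suggestion), which is why attaching it to E38_Image looks doubtful.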
Re: [Crm-sig] How to represent the textual content of documents about museum objects?
On 8 September 2015 at 19:27, Dominic Oldman wrote:

> I think there are various approaches you can take depending upon what your
> objectives are.
>
> 1. Identify (describe) the document and provide access to it. Using CRM
> this would harmonise with other CRM data.

This is really all I'm aiming to do, though I had to step outside of the CIDOC CRM (and use FRBRoo) to encode the relationship between the E31 Document and the associated HTML content. I'm slightly dissatisfied with that, but perhaps it's to be expected. I'm open to other options!

> 2. Identify particular fragments of the text (using FRBRoo).
> 3. Tag particular things in the text
>
> In terms of 3 there is TEI but also the option of using CRM in RDFa tags
> to identify entities and relationships in the text that would have
> correspondence in the data. This is an approach we have used at the BM.
> RDFa tags can be used to identify people, places, subjects etc, and can
> link these entities using CRM properties. These can operate on their own as
> an extension to the RDF store or be harvested into the RDF store.

In other projects I have used TEI as a source for RDF, with a workflow which harvests RDF from TEI documents and stores it in a SPARQL graph store. It's a powerful technique for aggregating data across a corpus of texts. I would be very interested to read more about how you have used TEI (or RDFa) in this way at the British Museum!

But in this particular project I'm trying out a workflow that doesn't involve an RDF store at all. I don't control the source of the data (I don't work for Museum Victoria); I am merely querying it and re-formatting it to produce RDF on the fly (i.e. as requested by a Linked Data client). Their API is not natively RDF, and I'm not harvesting or even caching the RDF data I generate, so there's actually no "RDF store" involved at all.

It's been an interesting experiment for me. The weakness in the approach is that any actual aggregation you need to do has to be quick enough to perform on the fly. The Linked Data resources (RDF graphs) my software produces are all based on 1 or at most 2 queries to the Museum's API, and possibly 1 to dbpedia. On the positive side, the lack of caching and harvesting makes the whole thing very simple.

Cheers!

Conal
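The "re-format on the fly, no store, no cache" idea described here can be sketched as a pure function from one JSON record to RDF. This is only an illustration in Python: the JSON field names ("id", "title"), the base URI, the CRM namespace and the choice of E22_Man-Made_Object are all assumptions, and the actual proxy described in the message is not reproduced here.

```python
import json

# All of these constants are assumptions made for the illustration.
CRM = "http://erlangen-crm.org/current/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
BASE = "http://example.org/resource/"


def escape_literal(text):
    """Escape a string for use inside a double-quoted N-Triples literal."""
    return (
        text.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def record_to_ntriples(json_text):
    """Translate one hypothetical API record into N-Triples lines.

    Nothing is stored or cached: the output depends only on the input,
    so it can be computed each time a Linked Data client asks for it.
    """
    record = json.loads(json_text)
    subject = f"<{BASE}{record['id']}>"
    label = escape_literal(record["title"])
    return [
        f"{subject} <{RDF_TYPE}> <{CRM}E22_Man-Made_Object> .",
        f'{subject} <{RDFS_LABEL}> "{label}" .',
    ]


if __name__ == "__main__":
    sample = '{"id": "items/1", "title": "Telephone kiosk"}'
    print("\n".join(record_to_ntriples(sample)))
```

Because the function is stateless, the cost of each response is bounded by the upstream API calls it needs, which matches the constraint noted in the message.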
Re: [Crm-sig] How to represent the textual content of documents about museum objects?
Thanks again Richard!

I have taken your advice and avoided XML Literals. I would appreciate any comments or criticisms of my new alternative.

To recap, there are "articles" which I have modelled as instances of E31 Document. These used to have rdf:value properties which were XML Literals containing the HTML text of the article. Now I've discarded those properties, and instead I'm minting a distinct URI for each of those HTML documents, and I decided to use the FRBRoo ontology to link each E31 Document to its own HTML resource, using the FRBRoo R4_carriers_provided_by predicate. This means that implicitly the HTML document containing the article text is an F3 Manifestation Product Type, and the "article" resource is now an F2 Expression as well as an E31 Document.

For example, here is one such "article":
<http://conaltuohy.com/xproc-z/museum-victoria/data/articles/1201>
and here's the text of that article:
<http://conaltuohy.com/xproc-z/museum-victoria/data/html/articles/1201>

I'm not that clear how best to consider this in terms of FRBRoo. It seemed to me that my "article text" resources are essentially HTML in nature, and mono-lingual (in English), hence Expressions rather than Works. The pipeline which returns the HTML HTTP response is a factory of identical (or near enough) bitstreams, hence Manifestation Product Type.

On 9 September 2015 at 19:46, rich...@light.demon.co.uk <rich...@light.demon.co.uk> wrote:

> I don't think that many Linked Data clients will be set up to work with
> XML literals. I would go for a simple wrapper to create a well formed
> document. RDF is not at its best when dealing with string values - witness
> all definitions in dbpedia and SKOS resources, which ought to have
> structure but can't.
>
> Richard
>
> Richard Light
> Sent from my phone

--
Conal Tuohy
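The linking pattern described in this message (a separate URI for the HTML, tied to the E31 Document with R4_carriers_provided_by) can be sketched as a handful of triples. The two resource URIs are the ones quoted in the message; the CRM and FRBRoo namespaces and the function name are assumptions for the example, and the actual output of the proxy may differ.

```python
# Assumed namespaces (Erlangen CRM and its FRBRoo encoding); adjust to
# whatever encoding your data uses.
CRM = "http://erlangen-crm.org/current/"
FRBROO = "http://erlangen-crm.org/efrbroo/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def article_triples(article_uri, html_uri):
    """Return N-Triples linking an article to its separately served HTML."""
    return [
        f"<{article_uri}> <{RDF_TYPE}> <{CRM}E31_Document> .",
        f"<{article_uri}> <{RDF_TYPE}> <{FRBROO}F2_Expression> .",
        f"<{article_uri}> <{FRBROO}R4_carriers_provided_by> <{html_uri}> .",
        f"<{html_uri}> <{RDF_TYPE}> <{FRBROO}F3_Manifestation_Product_Type> .",
    ]


if __name__ == "__main__":
    for line in article_triples(
        "http://conaltuohy.com/xproc-z/museum-victoria/data/articles/1201",
        "http://conaltuohy.com/xproc-z/museum-victoria/data/html/articles/1201",
    ):
        print(line)
```

The two rdf:type statements from FRBRoo are written out explicitly here; as the message notes, a reasoner could infer them from the domain and range of R4.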
Re: [Crm-sig] How to represent the textual content of documents about museum objects?
On 8 September 2015 at 19:05, Richard Light wrote:

> Your approach seems perfectly reasonable to me, in the context of an
> RDF/XML serialization. Presumably it might present problems in other
> serializations, e.g. Turtle, when you get to the point of offering more.

Thanks Richard! I hadn't even considered the possibility that the XML literal might be a problem in other RDF serializations. I will look into that.

> Another way of doing it might be to treat the article as a free-standing
> information resource, mint a URL for it, and create RDF metadata which
> describes this resource. Your proxy software would have to resolve the URL
> and serve up the HTML when requested, but I assume that wouldn't be hard.

Yes, that is the other option I considered, and as you say, it would not be hard.

In the JSON which the Museum API provides, these HTML fragments are not complete HTML documents, or even well-formed documents; they are just a sequence of elements. I think any real user interface would want to integrate them into a larger page, with a title, images, etc; that's at least partly why I chose to encode them just as literal fragments, rather than to promote them into being resources in their own right.

But it's difficult to get a picture of which might actually be a useful approach for a Linked Data client.

--
Conal Tuohy
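If the fragments were promoted to resources in their own right, the proxy would need to turn "a sequence of elements" into a complete page before serving it. A minimal sketch of that step in Python follows; the function name and page skeleton are assumptions for illustration, and the fragment is passed through untouched, so any markup problems inside it remain.

```python
from html import escape


def wrap_fragment(title, fragment):
    """Wrap an HTML fragment (a bare sequence of elements) in a minimal page.

    The title is escaped because it is plain text; the fragment is
    inserted as-is because it is already markup.
    """
    return (
        "<!DOCTYPE html>\n"
        '<html lang="en">\n'
        f'<head><meta charset="utf-8"><title>{escape(title)}</title></head>\n'
        f"<body>\n{fragment}\n</body>\n"
        "</html>\n"
    )


if __name__ == "__main__":
    # Hypothetical fragment of the kind described: sibling elements, no root.
    print(wrap_fragment("Article 1201", "<p>First paragraph.</p><p>Second.</p>"))
```

A real user interface would presumably add images and navigation around the fragment; this only shows the minimum needed to serve it as a stand-alone document.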
[Crm-sig] How to represent the textual content of documents about museum objects?
I have recently made an experimental software application to generate a Linked Data expression of museum data from the public collection API of Museum Victoria (Melbourne, Australia).

The Museum Victoria API is a custom-built web application which returns custom JSON data. My experimental software is a proxy which translates their JSON into RDF/XML using the Erlangen OWL version of the CIDOC CRM. More details are available here:
http://conaltuohy.com/blog/lod-from-custom-web-api/

The Museum Victoria database contains a number of "articles" which each describe one or more objects in their collection. I have modelled each of these as an "E31 Document", and related them to the corresponding collection items using "P70 documents".

My question is how to express the text of the actual articles (which the Museum Victoria API provides as an HTML fragment embedded in its JSON response). At the moment I have simply used rdf:value to attach the HTML fragment as an XML literal to the E31 Document instance. Is this the recommended practice?

Here is an example of one of these "articles":
http://graphite.ecs.soton.ac.uk/browser/?uri=http%3A%2F%2Fconaltuohy.com%2Fxproc-z%2Fmuseum-victoria%2Fresource%2Farticles%2F1201

Regards

Conal
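For readers unfamiliar with XML literals in RDF/XML, the modelling described in this message (an E31 Document that P70 documents an item and carries its text via rdf:value) might look roughly like the output of the Python sketch below, which uses only xml.etree.ElementTree. The Erlangen namespace URI, the example.org URIs and the function name are assumptions; the sketch also assumes the fragment parses as XML once wrapped in a temporary element, which will not hold for HTML that uses named entities such as &nbsp; or unclosed tags.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
CRM = "http://erlangen-crm.org/current/"  # assumed Erlangen CRM namespace

ET.register_namespace("rdf", RDF)
ET.register_namespace("crm", CRM)


def article_rdfxml(article_uri, item_uri, html_fragment):
    """Build RDF/XML for an E31 Document whose text is an XML literal."""
    root = ET.Element(f"{{{RDF}}}RDF")
    doc = ET.SubElement(
        root, f"{{{CRM}}}E31_Document", {f"{{{RDF}}}about": article_uri}
    )
    ET.SubElement(doc, f"{{{CRM}}}P70_documents", {f"{{{RDF}}}resource": item_uri})

    # rdf:parseType="Literal" marks the element content as an XML literal.
    value = ET.SubElement(doc, f"{{{RDF}}}value", {f"{{{RDF}}}parseType": "Literal"})

    # The fragment has no single root, so parse it inside a temporary
    # wrapper and move its content across. Raises ParseError if the
    # fragment is not well-formed XML.
    wrapper = ET.fromstring(f"<wrapper>{html_fragment}</wrapper>")
    value.text = wrapper.text
    value.extend(list(wrapper))

    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(
        article_rdfxml(
            "http://example.org/articles/1201",
            "http://example.org/items/1",
            "<p>First paragraph.</p><p>Second <em>paragraph</em>.</p>",
        )
    )
```

The rdf:parseType="Literal" mechanism is specific to the RDF/XML syntax; other serializations have to carry the same value as a string typed rdf:XMLLiteral, which is presumably the kind of difficulty a Turtle-based client might run into.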