Re: [Crm-sig] ISSUE: representing compound name strings

2018-11-25 Thread Conal Tuohy
I think it's potentially helpful to encode compound data such as personal
names using XML literals in an RDF graph, for display purposes, but not for
SPARQL querying. For efficient querying, I don't see any good alternative
to providing separate literals for the individual components of the name,
such as with "foreName", "surname", etc properties in separate RDF triples.
I suggest that RDF encoding guidelines could recommend adopting both
practices (i.e. redundant representation both as parts and also as a whole).

On Fri, 23 Nov 2018 at 03:53, Martin Doerr  wrote:

> Dear Richard, Robert,
>
> It is simply wrong that encoding structured data into an rdfs:Literal
> makes it invisible to SPARQL. It is exactly what xsd:dateTime does. The
> year, month, etc., is available to querying individually in SPARQL, not by
> magic but by a standard extension mechanism.
>

The date functions in SPARQL that allow an xsd:dateTime literal to be
parsed into months, days, etc, are not really an extension to SPARQL; they
are part of the SPARQL language standard:
<https://www.w3.org/TR/2013/REC-sparql11-query-20130321/#func-date-time>.
Because xsd:dateTime is a standard data type in SPARQL, a SPARQL processor
can achieve efficiencies by normalizing the values (to a standard time zone)
and using the normalized form in comparisons.
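
As a rough illustration of the normalization point (a sketch only: Python's
ISO 8601 parser stands in here for a SPARQL processor's handling of
xsd:dateTime, and the two timestamps are made up), two lexical forms written
in different time zones reduce to the same value once converted to UTC, and
the individual components can then be read off directly:

```python
from datetime import datetime, timezone


def normalize(lexical: str) -> datetime:
    """Parse a dateTime lexical form with an explicit offset and convert it to UTC."""
    return datetime.fromisoformat(lexical).astimezone(timezone.utc)


# Two lexical forms, written in different time zones, denoting one instant.
first = normalize("2018-11-25T10:00:00+11:00")
second = normalize("2018-11-25T01:00:00+02:00")

# After normalization the values compare equal and share the same components.
same_instant = first == second
components = (first.year, first.month, first.day, first.hour)

print(same_instant, components)
```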

The SPARQL specification does allow for SPARQL implementations to have
"extension" functions, though, and to extend the operation of built-in
SPARQL operators such as "<" or "=", so hypothetically a SPARQL store might
offer XPath-evaluation functions to query inside XML literals, analogously
to the way that the REGEX and REPLACE functions do with string literals.
This kind of hybrid RDF graph/XML tree model could be supported effectively
by a SPARQL store which maintained indices of the tree structure of the XML
literal objects it contained. I believe Virtuoso actually has such a
feature, and there may well be other SPARQL engines with a similar feature,
but I personally think it would be unhelpful for the CRM to suggest an
approach that depends on such a non-standardised extension.


> It is a question to IT experts to tell us how to upload into the SPARQL
> code the respective string functions for other compounds.
>

The standard SPARQL string functions (including regular expression) can be
used to parse "compound" string literals, though not to parse XML literals,
in general, since XML is not a regular language. Of course the CIDOC CRM
could suggest "regular" XML encodings for particular types of compound
literals; for example a "persName" data type could be defined and
constrained with a regular expression to require that it begins with a
"persName" start tag and ends with the matching end tag, optionally
containing child elements each delimited by their own start and end
tags, and even for these elements to
have attributes (such as 'type') drawn from a particular value space. They
could be queried using SPARQL string functions e.g. like so:

SELECT ?person
WHERE {
   ?person tei:persName ?persName.
   FILTER(CONTAINS(?persName, 'Richard'))
}

However, relying on SPARQL FILTER and string-parsing would be grossly
inefficient in terms of query performance, compared to querying individual
properties, e.g.

SELECT ?person
WHERE {
   ?person tei:foreName 'Richard'.
}
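
To make the contrast between the two queries above concrete, here is a toy
sketch in Python. It is not how any particular triple store works: the data,
the "ex:" identifiers and the dictionary used as an "index" are all
assumptions for illustration. The substring test has to visit every persName
literal, whereas the property match is a single keyed lookup.

```python
from collections import defaultdict

# Hypothetical toy graph: (subject, predicate, object) triples.
TRIPLES = [
    ("ex:p1", "tei:persName",
     "<persName><foreName>Richard</foreName> <surname>Light</surname></persName>"),
    ("ex:p1", "tei:foreName", "Richard"),
    ("ex:p2", "tei:persName",
     "<persName><foreName>Anne</foreName> <surname>Richardson</surname></persName>"),
    ("ex:p2", "tei:foreName", "Anne"),
]


def find_by_scan(needle):
    """Like FILTER(CONTAINS(...)): every persName literal has to be inspected."""
    return [s for s, p, o in TRIPLES if p == "tei:persName" and needle in o]


# A lookup table keyed on (predicate, object), standing in for a store's index.
INDEX = defaultdict(set)
for s, p, o in TRIPLES:
    INDEX[(p, o)].add(s)


def find_by_property(forename):
    """Like matching ?person tei:foreName 'Richard': a single keyed lookup."""
    return sorted(INDEX.get(("tei:foreName", forename), set()))


print(find_by_scan("Richard"))      # also picks up the surname "Richardson"
print(find_by_property("Richard"))
```

In this toy data the substring search also matches the surname "Richardson",
so string-parsing can cost precision as well as speed.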

If the "compound" XML literals are not intended for fine-grained querying,
they can still be valuable for display purposes, but I don't see much value
in constraining them beyond the general "XML literal" datatype. An
information system that understands XML literals can examine the XML and
process it appropriately based on its namespace.
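
A minimal sketch of that last point, and of the "redundant representation"
idea: a consumer checks the namespace of the XML literal and, if it
recognises it, derives the separate component literals from the whole. The
element names "foreName" and "surname" follow this message rather than any
checked schema, and the "ex:" and "tei:" identifiers are placeholders.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
PARTS = ("foreName", "surname")  # names taken from the message, not checked against TEI


def split_tag(tag):
    """Split ElementTree's '{namespace}local' tag form into (namespace, local)."""
    if tag.startswith("{"):
        namespace, _, local = tag[1:].partition("}")
        return namespace, local
    return "", tag


def component_triples(subject, xml_literal):
    """Derive one triple per recognised name part from a persName XML literal."""
    root = ET.fromstring(xml_literal)
    if split_tag(root.tag) != (TEI_NS, "persName"):
        return []  # not a vocabulary this sketch understands
    triples = []
    for child in root:
        namespace, local = split_tag(child.tag)
        if namespace == TEI_NS and local in PARTS:
            triples.append((subject, "tei:" + local, (child.text or "").strip()))
    return triples


LITERAL = (
    '<persName xmlns="http://www.tei-c.org/ns/1.0">'
    "<foreName>Richard</foreName> <surname>Light</surname>"
    "</persName>"
)
print(component_triples("ex:p1", LITERAL))
```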


-- 
Conal Tuohy
http://conaltuohy.com/
@conal_tuohy
+61-466-324297


Re: [Crm-sig] new technical paper

2018-11-25 Thread Conal Tuohy
>
> #E5: I think we point into an XML-RDF file.
>
> Yes, that's the intention, but it will only work if there is an element in
> that file with attribute xml:id="E5" to act as the target for the link.
> This strategy works fine for the links into the HTML expression of the RDF,
> but not for the RDF/XML expression.
>

I don't think that's correct, actually.

The issue of how to resolve URI fragment identifiers is a complicated one,
since a URI as a whole refers to a resource (in the Web Architecture
sense), while the interpretation of the fragment identifier part of the URI
is dependent on the media type of the *representation* of the resource, and
of course a single HTTP URI may correspond to many different media types,
if the web server supports content negotiation for that resource.

So while it's true that fragment identifiers used with XML documents in
general (i.e. "application/xml" media type) refer to elements by the value
of their xml:id attribute (or other attribute whose type is ID), this
interpretation is over-ridden in the case of RDF/XML (the
"application/rdf+xml" media type). The registration document for
"application/rdf+xml" says that the fragment identifier should be
interpreted as corresponding to the rdf:about and rdf:ID attributes.

https://tools.ietf.org/html/rfc3870#page-4
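
The difference between the two interpretations can be sketched as follows.
This is a deliberate simplification: the sample document is hypothetical (it
is not the CRM's own RDF/XML file), and the lookup ignores xml:base and
absolute rdf:about values, which a real RDF parser would resolve.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
XML = "http://www.w3.org/XML/1998/namespace"

# Hypothetical RDF/XML document; note that it contains no xml:id attributes.
SAMPLE = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">
  <rdfs:Class rdf:ID="E5_Event"/>
  <rdfs:Class rdf:about="#E7_Activity"/>
</rdf:RDF>"""


def find_by_xml_id(root, fragment):
    """Generic XML reading: the fragment names an element by its xml:id."""
    return [el for el in root.iter() if el.get(f"{{{XML}}}id") == fragment]


def find_by_rdf_name(root, fragment):
    """RDF/XML reading: the fragment is matched against rdf:ID and rdf:about."""
    return [
        el for el in root.iter()
        if el.get(f"{{{RDF}}}ID") == fragment
        or el.get(f"{{{RDF}}}about") == "#" + fragment
    ]


root = ET.fromstring(SAMPLE)
print(len(find_by_xml_id(root, "E5_Event")))    # nothing carries xml:id
print(len(find_by_rdf_name(root, "E5_Event")))  # found via rdf:ID
print(len(find_by_rdf_name(root, "E5")))        # the shortened name finds nothing
```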

On the other hand, it certainly is true that redirecting from
http://www.cidoc-crm.org/cidoc-crm/E5_Event to
http://www.cidoc-crm.org/html/5.0.4/cidoc-crm.html#E5 is a mistake. The
redirect location should be
http://www.cidoc-crm.org/html/5.0.4/cidoc-crm.html#E5_Event (since
"E5_Event" is the value of the rdf:about attribute in the RDF/XML, not
"E5").






http://conaltuohy.com/
@conal_tuohy
+61-466-324297


Re: [Crm-sig] ISSUE: E74 Group (from LRMoo discussions)

2018-05-09 Thread Conal Tuohy
It's the citizenry as a whole which decides who that government should be.
It's not the decision of an individual elector, but of the citizenry as a
group.

On 9 May 2018 at 16:40, Pat Riva  wrote:

> Each individual in a democratic nation can choose to vote (or not).
>
> Then the elected government (an LRM-E8 Collective agent) takes actions and
> is responsible, not all those people holding that citizenship.
>
>
> Pat Riva
>
> Associate University Librarian, Collection Services
>
> Concordia University
>
>
>
> Vanier Library (VL-301-61)
>
> 7141 Sherbrooke Street West
>
> Montreal, QC H4B 1R6
>
> Canada
>
> +1-514-848-2424 ext. 5255
>
> pat.r...@concordia.ca
> --
> *From:* Conal Tuohy 
> *Sent:* May 9, 2018 1:43 AM
> *To:* Pat Riva
> *Cc:* CRM-SIG
> *Subject:* Re: [Crm-sig] ISSUE: E74 Group (from LRMoo discussions)
>
>
>
> On 7 May 2018 at 14:27, Pat Riva  wrote:
>
>> Propose to modify the scope note of E74 Group so that it clearly
>> corresponds to LRM-E8 Collective Agent. To do this any groups of people not
>> having agency, such as national, religious, cultural, ethnic groups, must
>> be excluded from the scope of E74.
>>
> This strikes me as odd! Is it really true that the citizenry of a nation
> is entirely lacking in agency? Can they not take political decisions, for
> instance?
>
>
> --
> Conal Tuohy
> http://conaltuohy.com/
> @conal_tuohy
> +61-466-324297
>



-- 
Conal Tuohy
http://conaltuohy.com/
@conal_tuohy
+61-466-324297


Re: [Crm-sig] ISSUE: E74 Group (from LRMoo discussions)

2018-05-09 Thread Conal Tuohy
On 7 May 2018 at 14:27, Pat Riva  wrote:

> Propose to modify the scope note of E74 Group so that it clearly
> corresponds to LRM-E8 Collective Agent. To do this any groups of people not
> having agency, such as national, religious, cultural, ethnic groups, must
> be excluded from the scope of E74.
>
This strikes me as odd! Is it really true that the citizenry of a nation is
entirely lacking in agency? Can they not take political decisions, for
instance?


-- 
Conal Tuohy
http://conaltuohy.com/
@conal_tuohy
+61-466-324297


Re: [Crm-sig] Properties of properties in RDF

2018-03-15 Thread Conal Tuohy
Dear Martin

I'm not sure what you meant by "partially declared subproperties" there (the
ambiguity of the term "subproperty" in this discussion doesn't help). I think
I understood the rest of what you were saying, though.

To be clear, all I was saying was that I would prefer not to publish RDF
that directly uses those generic RDF predicates P01_has_domain and
P02_has_range, but instead to use a set of more specific predicates (which
could be defined to be (RDFS) subproperties of those two predicates). So
each distinct type of CRM property which had been reified as an RDFS class
(e.g. PC14_carried_out_by) would have its own pair of RDF properties for
linking to instances of its domain and range.

My rationale for that preference is that it would be more meaningful to
users to make use of an RDF predicate called Pxxx_has_actor (with a domain
of PC14_carried_out_by and a range of E39_Actor) and Pxxx_has_activity
(with domain PC14_carried_out_by and range E7_Activity), rather than using
generic predicates P01_has_domain and P02_has_range. Plus it would give us
more type-safety. It would be a trivial extension to that existing RDFS to
add those extra RDFS subproperties (about 60 of them, including the
inverses).
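
To illustrate what I mean by type-safety, here is a small sketch. The
"Pxxx_" names are the placeholders used above; which generic predicate each
one specialises, the "ex:" instances, and the exact-match type check (no
subclass reasoning) are my assumptions for the sake of the example.

```python
# Hypothetical declarations:
# specific property -> (generic super-property, domain, range).
SPECIFIC = {
    "Pxxx_has_activity": ("P01_has_domain", "PC14_carried_out_by", "E7_Activity"),
    "Pxxx_has_actor": ("P02_has_range", "PC14_carried_out_by", "E39_Actor"),
}

# Hypothetical instance typing.
TYPES = {
    "ex:performanceZ": "PC14_carried_out_by",
    "ex:concertX": "E7_Activity",
    "ex:personY": "E39_Actor",
}


def is_well_typed(subject, prop, obj):
    """Check a triple against the declared domain and range of its property."""
    _, domain, range_ = SPECIFIC[prop]
    return TYPES[subject] == domain and TYPES[obj] == range_


def generalise(subject, prop, obj):
    """Subproperty inference: the same triple restated with the generic predicate."""
    return (subject, SPECIFIC[prop][0], obj)


print(is_well_typed("ex:performanceZ", "Pxxx_has_actor", "ex:personY"))
print(is_well_typed("ex:performanceZ", "Pxxx_has_actor", "ex:concertX"))
print(generalise("ex:performanceZ", "Pxxx_has_actor", "ex:personY"))
```

With only the generic predicates, the second check could not be expressed,
since their range has to admit any kind of entity.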

Regards

Conal

On 15 March 2018 at 20:37, Martin Doerr  wrote:

> Dear Conal,
>
> There is no conflict with adding subproperties. Once we have defined in
> FOL the logic of properties of properties, each PC class implies its base
> property. Hence, logically, the subproperty and any added ".1" will hold
> for the instances declared and imply the same base property. If, at any
> time we wish to connect term hierarchies of roles for the .1 properties
> with partially declared subproperties, we need a straight-forward extension
> of the CRM. Any subproperty, e.g., may refine domain and range.
>
> All the best,
>
> Martin
>
>
> On 3/15/2018 6:28 AM, Conal Tuohy wrote:
>
> Thanks Martin, for the link to
> http://www.cidoc-crm.org/sites/default/files/CRMpc_v1.1_0.rdfs
>
> This is actually very close to (and compatible with) the approach I
> suggested in my earlier email, and I'm embarrassed to say I wasn't aware of
> it at all.
>
> I've managed to find some background material (though I had to use Google
> to find it!)
>
> http://www.cidoc-crm.org/Issue/ID-266-reified-association-vs-sub-event is
> an archive of a relevant discussion.
>
> http://www.cidoc-crm.org/sites/default/files/Roles.pdf presents a few
> slides showing options for modelling properties of properties, including
> the "Property Class" approach.
>
> These slides include a nice illustration of the approach defined in the
> RDFS:
>
> http://www.cidoc-crm.org/sites/default/files/20160802PropertiesOfProperties.pptx
>
> I think I'd be very happy with this "Property Class" approach, although
> rather than using the generic properties P01_has_domain, P02_has_range, and
> their inverses, I would still want to define specific subproperties, e.g.
> for the case of actors playing a specific role in the performance of an
> activity, I would prefer to link the performance (i.e. the instance of PC14
> carried out by) to the actor and the activity using domain-specific
> properties such as has_actor and has_activity.
>
> Conal
>
> On 15 March 2018 at 04:25, Martin Doerr  wrote:
>
>> Dear All,
>>
>> Please see: http://www.cidoc-crm.org/sites/default/files/CRMpc_v1.1_0.rdfs
>> on page http://www.cidoc-crm.org/versions-of-the-cidoc-crm, plus the
>> issues discussing the solution for version 6.2 (I'll look for all
>> references).
>>
>> Best,
>>
>> martin
>>
>>
>> On 3/14/2018 12:49 PM, Conal Tuohy wrote:
>>
>>
>>
>> On 8 March 2018 at 18:02, Richard Light 
>> wrote:
>>
>>> I was thinking last night that maybe we should focus our RDF efforts on
>>> exactly this issue: the representation of the CRM primitive classes E60,
>>> E61 and E62 in RDF.  The current RDF document is becoming quite
>>> wide-ranging in its scope, and (for example) you have questioned whether
>>> certain sections belong in it.  If we concentrate on this single aspect of
>>> the broader RDF issue, I think we can produce something which is of
>>> practical value relatively quickly.  In particular, I would like to devote
>>> time to this during the Lyon meeting.
>>>
>> I applaud the idea of focusing narrowly on something so as to produce
>> something of practical value quickly!
>>
>> But I do hope that the other issues raised in that document will not be
>> set aside too long, or lost.
>>
>> In particular, it seems to me that the mapping from the C

Re: [Crm-sig] Properties of properties in RDF

2018-03-15 Thread Conal Tuohy
Thanks Martin, for the link to
http://www.cidoc-crm.org/sites/default/files/CRMpc_v1.1_0.rdfs

This is actually very close to (and compatible with) the approach I
suggested in my earlier email, and I'm embarrassed to say I wasn't aware of
it at all.

I've managed to find some background material (though I had to use Google
to find it!)

http://www.cidoc-crm.org/Issue/ID-266-reified-association-vs-sub-event is
an archive of a relevant discussion.

http://www.cidoc-crm.org/sites/default/files/Roles.pdf presents a few
slides showing options for modelling properties of properties, including
the "Property Class" approach.

These slides include a nice illustration of the approach defined in the
RDFS:

http://www.cidoc-crm.org/sites/default/files/20160802PropertiesOfProperties.pptx

I think I'd be very happy with this "Property Class" approach, although
rather than using the generic properties P01_has_domain, P02_has_range, and
their inverses, I would still want to define specific subproperties, e.g.
for the case of actors playing a specific role in the performance of an
activity, I would prefer to link the performance (i.e. the instance of PC14
carried out by) to the actor and the activity using domain-specific
properties such as has_actor and has_activity.

Conal

On 15 March 2018 at 04:25, Martin Doerr  wrote:

> Dear All,
>
> Please see: http://www.cidoc-crm.org/sites/default/files/CRMpc_v1.1_0.rdfs
> on page http://www.cidoc-crm.org/versions-of-the-cidoc-crm, plus the
> issues discussing the solution for version 6.2 (I'll look for all
> references).
>
> Best,
>
> martin
>
>
> On 3/14/2018 12:49 PM, Conal Tuohy wrote:
>
>
>
> On 8 March 2018 at 18:02, Richard Light  wrote:
>
>> I was thinking last night that maybe we should focus our RDF efforts on
>> exactly this issue: the representation of the CRM primitive classes E60,
>> E61 and E62 in RDF.  The current RDF document is becoming quite
>> wide-ranging in its scope, and (for example) you have questioned whether
>> certain sections belong in it.  If we concentrate on this single aspect of
>> the broader RDF issue, I think we can produce something which is of
>> practical value relatively quickly.  In particular, I would like to devote
>> time to this during the Lyon meeting.
>>
> I applaud the idea of focusing narrowly on something so as to produce
> something of practical value quickly!
>
> But I do hope that the other issues raised in that document will not be
> set aside too long, or lost.
>
> In particular, it seems to me that the mapping from the CRM's "properties
> of properties" to RDF is actually a more serious gap.
>
> In the CRM, there are a number of properties which are themselves the
> domain of properties. In RDF, however, a property does not have properties
> of its own. Incidentally, I remember years ago being able to model this
> directly in ISO Topic Maps, but practical considerations of
> interoperability and community dictate that RDF, despite its simpler model,
> is the technology of choice today.
>
> One example of the issue is how to model the role that individuals play in
> events. If a concert performance X was P14 carried out by person Y, then
> this maps naturally to an RDF triple in which the predicate is
> crm:P14_carried_out_by. However, if the person carried out that activity in
> a particular role (e.g. as a saxophonist) then things are more difficult.
> In the CRM, the P14_carried_out_by itself has the property
> P14.1_in_the_role_of, whose value could be an instance of E55_Type:
> Saxophonist. This is pleasingly consistent with how the CRM handles
> taxonomies in other parts of the model, but it is not workable in RDF
> because the P14_carried_out_by property cannot itself have a property.
>
> There are a number of "work-arounds" to this issue, such as simply
> ignoring the problem and "dumbing down" the data, or moving the locus of
> classification from the property to the property value (e.g. in this case
> that would mean classifying the person rather than their role; that doesn't
> work very well because people may have many distinct roles, but it works
> better for other cases).
>
> The existing guidance would suggest defining a new "saxophone-played-by"
> property to be a rdfs:subPropertyOf P14_carried_out_by. This can certainly
> work, but it's actually a poor expression of the CRM's model. It negates
> the practical benefits of having external taxonomies for this kind of
> classification. This guidance, in my opinion, makes too much of the
> apparent similarity between the CRM's properties and RDF properties. They
> are not in fact the same kind of thing, and a property which itself bears
> properties is more closely approximated

[Crm-sig] Properties of properties in RDF

2018-03-14 Thread Conal Tuohy
On 8 March 2018 at 18:02, Richard Light  wrote:

> I was thinking last night that maybe we should focus our RDF efforts on
> exactly this issue: the representation of the CRM primitive classes E60,
> E61 and E62 in RDF.  The current RDF document is becoming quite
> wide-ranging in its scope, and (for example) you have questioned whether
> certain sections belong in it.  If we concentrate on this single aspect of
> the broader RDF issue, I think we can produce something which is of
> practical value relatively quickly.  In particular, I would like to devote
> time to this during the Lyon meeting.
>
I applaud the idea of focusing narrowly on something so as to produce
something of practical value quickly!

But I do hope that the other issues raised in that document will not be set
aside too long, or lost.

In particular, it seems to me that the mapping from the CRM's "properties
of properties" to RDF is actually a more serious gap.

In the CRM, there are a number of properties which are themselves the
domain of properties. In RDF, however, a property does not have properties
of its own. Incidentally, I remember years ago being able to model this
directly in ISO Topic Maps, but practical considerations of
interoperability and community dictate that RDF, despite its simpler model,
is the technology of choice today.

One example of the issue is how to model the role that individuals play in
events. If a concert performance X was P14 carried out by person Y, then
this maps naturally to an RDF triple in which the predicate is
crm:P14_carried_out_by. However, if the person carried out that activity in
a particular role (e.g. as a saxophonist) then things are more difficult.
In the CRM, the P14_carried_out_by itself has the property
P14.1_in_the_role_of, whose value could be an instance of E55_Type:
Saxophonist. This is pleasingly consistent with how the CRM handles
taxonomies in other parts of the model, but it is not workable in RDF
because the P14_carried_out_by property cannot itself have a property.

There are a number of "work-arounds" to this issue, such as simply
ignoring the problem and "dumbing down" the data, or moving the locus of
classification from the property to the property value (e.g. in this case
that would mean classifying the person rather than their role; that doesn't
work very well because people may have many distinct roles, but it works
better for other cases).

The existing guidance would suggest defining a new "saxophone-played-by"
property to be a rdfs:subPropertyOf P14_carried_out_by. This can certainly
work, but it's actually a poor expression of the CRM's model. It negates
the practical benefits of having external taxonomies for this kind of
classification. This guidance, in my opinion, makes too much of the
apparent similarity between the CRM's properties and RDF properties. They
are not in fact the same kind of thing, and a property which itself bears
properties is more closely approximated in RDF not as a property but
reified as a subject resource in its own right. A more faithful mapping of
the CRM's abstract model to RDF would introduce a new RDFS class
corresponding to the performance of the activity. We could then say that
concert performance X was P14a_performed_in Performance Z; that Performance
Z was P14b_carried_out_by person Y, and that Performance Z was
P14.1_in_the_role_of Saxophonist.
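
Written out as data, the reified pattern looks something like this. It is a
sketch only: the "ex:" identifiers are placeholders, and a second performer
(person W, a pianist) is added purely to show that each role stays attached
to its own performance rather than to the person.

```python
# Hypothetical triples following the P14a / P14b / P14.1 pattern described above.
TRIPLES = [
    ("ex:concertX", "P14a_performed_in", "ex:performanceZ"),
    ("ex:performanceZ", "P14b_carried_out_by", "ex:personY"),
    ("ex:performanceZ", "P14.1_in_the_role_of", "ex:Saxophonist"),
    ("ex:concertX", "P14a_performed_in", "ex:performanceW"),
    ("ex:performanceW", "P14b_carried_out_by", "ex:personW"),
    ("ex:performanceW", "P14.1_in_the_role_of", "ex:Pianist"),
]


def objects_of(subject, predicate):
    """All objects of triples with the given subject and predicate."""
    return [o for s, p, o in TRIPLES if s == subject and p == predicate]


def roles_in(activity):
    """Join activity -> performance -> (actor, role) through the reified node."""
    pairs = []
    for performance in objects_of(activity, "P14a_performed_in"):
        actors = objects_of(performance, "P14b_carried_out_by")
        roles = objects_of(performance, "P14.1_in_the_role_of")
        pairs.extend((actor, role) for actor in actors for role in roles)
    return pairs


print(roles_in("ex:concertX"))
```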

That's just one example of the general problem; there are a number of
others, which are listed here in the context of the Linked Art project:
https://github.com/linked-art/linked.art/issues/55 along with a variety of
options for dealing with the issue.

In my opinion the current situation with respect to properties of
properties (in RDF) is really quite unsatisfactory and could be
substantially improved by a more consistent treatment across the entire
schema.




-- 
Conal Tuohy
http://conaltuohy.com/
@conal_tuohy
+61-466-324297


Re: [Crm-sig] P90 etc.

2018-03-14 Thread Conal Tuohy
On 9 March 2018 at 04:39, Martin Doerr  wrote:

>
>
> I recommend NOT to recommend rdf:value, because RDFS 1.1 defines:
> "5.4.3 rdf:value rdf:value is an instance of rdf:Property
> <https://www.w3.org/TR/rdf-schema/#ch_property> that may be used in
> describing structured values. rdf:value has no meaning on its own. "
>
> As CRM-SIG, we cannot recommend a property without meaning. We do ontology
> here, so there must be a minimal ontological commitment. Are there other
> opinions?
>

My opinion is that the real value of rdf:value is that it effectively
negates one of the weaknesses in the expressiveness of RDF, with respect to
the CRM.

In RDF, a literal value is a second-class citizen: it has no identifier,
which makes it ineligible to appear as the subject of a triple, so it can't
have properties of its own. It can't be woven into the "Web of Data". It
can't effectively function as an "access point" (in the library science
sense) without some additional context.

As Linked Data practitioners, we generally have literals like "Conal Tuohy"
as our source data for e.g. Appellations (and it's worth noting that all of
the formal examples of E41 Appellation are given as string literals), but
it's highly undesirable to encode an E41 Appellation merely as a literal;
such an encoding would make it impossible, either for us, or for third
parties, to annotate that name with properties of its own ("A name of Irish
origin ...").

So we must mint an identifier, either a local ("blank node") identifier --
or better still, an HTTP URI -- for that name (e.g. "_conal_tuohy"), so
that we can then attach other properties to that identifier. We are left,
finally, with the residual problem of how to associate the literal name
value itself ("Conal Tuohy") with that identifier. This is where rdf:value
plays a valuable role of effectively just equating the literal with the
identifier; it is described as having "no meaning on its own" precisely
because it really plays only a syntactical role. This is why I think it
would be a mistake to critique the use of rdf:value on the basis of it
"lacking meaning of its own"; it would be equivalent to criticising a
relational database for having an Appellation table with a column named
"value".

Regards

"Conal Tuohy"




> Given the above definition in RDFS 1.1, I question both the precise use
> and the emerging good practice, until better evidence :-).
> Do you have better evidence?
>
> It is up to crm-sig to decide, I present only my opinion here.
>
> Best,
>
> martin
>
>
>
> On 3/8/2018 6:28 PM, Robert Sanderson wrote:
>
>
>
> Martin,
>
>
>
> Could you clarify why you have changed your mind about rdf:value?
>
>
>
> > I recommend NOT to recommend rdf:value
>
>
>
> In particular, in the last week you said:
>
>
>
> “CRM-SIG normally works reactively: When a good community practice
> emerges, this is taken up.”
>
>
>
> and
>
>
>
> “Whatever the vast majority is  and rdf:value does the job, I have no
> objections to its use.
> Just define precisely what you use it for. We can add that to our
> guidelines. It is already standard rdf.”
>
>
>
> Thanks,
>
>
>
> Rob
>
>
>
>
>
>
> --
> Dr. Martin Doerr, Research Director
> Vox: +30(2810)391625 | Fax: +30(2810)391638
> Email: mar...@ics.forth.gr
>
> Center for Cultural Informatics
> Information Systems Laboratory
> Institute of Computer Science
> Foundation for Research and Technology - Hellas (FORTH)
>
> N.Plastira 100, Vassilika Vouton,
> GR70013 Heraklion, Crete, Greece
>
> Web-site: http://www.ics.forth.gr/isl
> --
>
>
> ___
> Crm-sig mailing list
> Crm-sig@ics.forth.gr
> http://lists.ics.forth.gr/mailman/listinfo/crm-sig
>
>


-- 
Conal Tuohy
http://conaltuohy.com/
@conal_tuohy
+61-466-324297


Re: [Crm-sig] Domain and range of P90

2018-03-01 Thread Conal Tuohy
One of the "gaps" which puzzles me most is the example you give of encoding
the string value of an Appellation. I understand the recommended practice
is to attach the string value of a person's name using P3_has_note, or
actually, using a custom subproperty of P3_has_note. The semantics of
P3_has_note itself are weak; a note is simply an "informal description" of
something, so if I have a particular name (an RDF resource) which
P3_has_note the literal string "Conal Tuohy", then I should really define
subproperties so as to be able to distinguish that string value from a note
which really is nothing more than an "informal description" of that name
e.g. "A very uncommon name of Irish origin". What puzzles me most about
this "gap" in the RDFS specification is that the distinction between a note
ABOUT a name, and the actual textual representation OF a name is somehow
considered out of scope of the CRM in RDFS. It's puzzling, because the
string value of a name is something which really must be encoded in a
standard fashion, to achieve interoperability (as an aside, my personal
view is that the string literal "Conal Tuohy" could be attached to an
Appellation using the rdf:value or rdfs:label predicate defined in the RDFS
spec). But the important thing is that the RDFS schema should stipulate how
to attach this literal data rather than leave it as an open question. In
general these are the kinds of issues which puzzle many people who approach
the CRM from a position of having already worked with other RDF ontologies
in the cultural heritage space, and find themselves wondering how they are
supposed to make these details of the CRM work in RDF in an interoperable way,
without having to pick and choose from a variety of techniques for
"finessing" the gaps.

These kinds of gaps are serious barriers to interoperability in the Linked
Open Data cloud, and they need to be addressed by agreeing on some encoding
procedures that can be used consistently by different projects on the web.
It would be helpful to CRM adopters in the Linked Data community if these
gaps could be filled in a manner which is clear and simple and
interoperable. I am not in favour of just offering a menu of possible
approaches, especially where individual projects would have to make local
customisations to their schema. If there is some particular value in
multiple approaches, then they could be published as different "profiles"
that encoders could simply adopt, as a whole. I think the recent effort by
Richard Light (and other contributors) to collate guidelines on RDF
encoding is a great initiative!
<https://docs.google.com/document/d/1zCGZ4iBzekcEYo4Dy0hI8CrZ7dTkMD2rJaxavtEOET0/edit>
It deserves more
input and I hope it will continue to be discussed on the list. I also think
the Linked Art project http://linked.art/ with its "profile" of the CRM is
another really good way forward.

Regards

Conal


On 22 February 2018 at 19:46, George Bruseker  wrote:

> Dear Phil et al.,
>
> I think this is a case of interpreting the label of the property rather
> than its intention. CRM ‘has value’ isn’t supposed to cover all possible
> meanings of the natural language interpretation of has value. Rather it has
> a very restricted use. It is meant to give the quantitative number value
> associated to a dimension. Dimension is a class that should be used to
> store information that results from a measurement activity. The measurement
> activity is specified as some procedural event that has the intentional
> objective of producing quantitative data. It is an activity of interacting
> with the world with the intention of producing a quantitative result.
>
> So it would be nonsensical to say 'this paragraph (E73) has dimension
> (E54 defined as a quantitative result from a measuring procedure) has value
> “the characters in this paragraph” (E59 primitive value)'. The definition of
> E54 forbids it because a string is not a quantity (though of course it may
> have a quantity… that would have to be measured).
>
> That of course sounds irritating. It would be nice to have a property that
> could store all values. But then of course that property would mean
> everything and nothing and the ontology wouldn’t work for getting specific
> information, like the quantitative results of measurement activities
> separate from any other value ‘good’ ‘bad’ ‘ugly’ ‘monogamy’ ‘world peace’
> ‘all the characters in this present string’.
>
> That’s the ontological argument. The practical question is why you are
> looking to expand the scope. I’m guessing that the reason is because you
> want a unique place to store a data value (this is a guess, so please do
> correct my presumption if I’m wrong).
>
> This seems to me to get back to the encoding issue and having a standard
> strategy. I th

Re: [Crm-sig] Domain and range of P90

2018-02-28 Thread Conal Tuohy
I have used rdf:value for this purpose.
https://www.w3.org/TR/rdf-schema/#ch_value

The CRM's origin was outside of the RDF space, and it is still considered
to be something more abstract than any concrete expression in RDFS or OWL.
This is why, I think, there remains a puzzling gap between RDF resources
which are instances of CRM classes and their literal values which must be
expressed using primitive RDF data types. The point of rdf:value, as I
understand it, is to fill in gaps like these.
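
A minimal sketch of the pattern, taking a personal name as the example. The
"ex:" identifiers are placeholders, and the choice of P1_is_identified_by
and P3_has_note around the appellation is my assumption about how the rest
of the graph might look; only the rdf:value triple is the point here.

```python
# Hypothetical graph: the appellation is a resource, so it can carry both
# its literal value and further statements about itself.
TRIPLES = [
    ("ex:person1", "crm:P1_is_identified_by", "ex:name1"),
    ("ex:name1", "rdf:type", "crm:E41_Appellation"),
    ("ex:name1", "rdf:value", "Conal Tuohy"),
    ("ex:name1", "crm:P3_has_note", "A very uncommon name of Irish origin"),
]


def value_of(resource):
    """Return the literal attached to a resource with rdf:value, if any."""
    for s, p, o in TRIPLES:
        if s == resource and p == "rdf:value":
            return o
    return None


def names_of(person):
    """Follow person -> appellation -> literal value."""
    return [
        value_of(o) for s, p, o in TRIPLES
        if s == person and p == "crm:P1_is_identified_by"
    ]


print(names_of("ex:person1"))
print(value_of("ex:person1"))  # the person itself has no rdf:value
```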

On 22 February 2018 at 02:04, Carlisle, Philip <
philip.carli...@historicengland.org.uk> wrote:

> Dear all,
> Naïve question.
>
>
>
> Is there any reason why P90 has value could not/should not change its
> domain and range from:
>
>
>
> Domain: E54 Dimension
> Range: E60 Number
>
> to
>
> Domain: E1 CRM Entity
> Range: E59 Primitive Value
>
>
>
> I look forward to your answers
>
>
>
> Phil
>
>
>
>
>
>
>
> *Phil Carlisle*
>
> Knowledge Organization Specialist
>
> Listing Group, Historic England
>
> Direct Dial: +44 (0)1793 414824
>
>
>
> http://thesaurus.historicengland.org.uk/
>
> http://www.heritagedata.org/blog/


-- 
Conal Tuohy
http://conaltuohy.com/
@conal_tuohy
+61-466-324297


Re: [Crm-sig] Associative relationship mapping

2016-09-23 Thread Conal Tuohy
Hi Philip!

I very much like Stephen's suggestion of modelling generic relationships by
reifying subsets of the museum's database records as a set of E73
Information Objects each of which *P67 refers to* a set of "generically
related" objects. The nice thing about an "Information Object" is that the
semantics it carries are not required to be expressed in terms of the
CIDOC-CRM, so it doesn't matter that the exact semantics aren't known. This
technique seems to me like it could be useful very generally for
representing information from legacy systems with under-specified semantics.

I was confronted with the same issue when I was experimenting with building
a CIDOC-CRM interpretation of the data exposed by Museum Victoria's
Collections API.

Actually, I wish I'd thought of Stephen's approach, now, rather than the
approach I took, which I wrote up on my blog: <
http://conaltuohy.com/blog/bridging-conceptual-gap-api-cidoc-crm/>

Of particular relevance to your question is the section about how to model
these "generic relations" between collection items: <
http://conaltuohy.com/blog/bridging-conceptual-gap-api-cidoc-crm/#relationships>.
The problem is that MV's underlying database records did not document any
detailed semantics for this relationship, and that a number of different
types of relationships might have been represented using the same data
structure.

If you are aiming to model the relationship as something general enough to
subsume all the instances of the relationship, the owl:topObjectProperty
would certainly work for this purpose, but you might perhaps find something
semantically stronger. Empirical investigation might allow you to use one
of the CIDOC-CRM's properties (though it might show that the general
relationships are actually too heterogeneous for that). In the case of my
experiment, my reading of MV's data led me to believe that the actual
relationships could legitimately be encoded as relationships of similarity,
and represented with P130_shows_features_of (in the symmetrical,
non-directed sense of that relationship), though this was controversial, as
you can see by the comments on the post: <
http://conaltuohy.com/blog/bridging-conceptual-gap-api-cidoc-crm/#comments>

The other approach would be to try to guess at a more particular meaning
for each of these "general relationships", using clues from other available
data. You might find that the "general relationship" between photographs
and other items was one of depiction, for instance, and be able to automate
that inference in your mapping. But that's an empirical question, and
potentially a lot of work.
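
As a rough sketch of what automating that inference could look like: the record structure, the "type" values and the rule table below are all hypothetical, and real rules would have to come out of the empirical investigation. Anything without a matching rule falls back to the similarity property.

```python
# Hypothetical record types and rule table; real rules would come from
# studying the data. Unmatched pairs fall back to a similarity property.
DEFAULT_PROPERTY = "P130_shows_features_of"
RULES = {
    ("photograph", "item"): "P62_depicts",
}


def refine_relation(source_type, target_type):
    """Pick a CRM property for a generic relation between two record types."""
    return RULES.get((source_type, target_type), DEFAULT_PROPERTY)


def map_relations(records):
    """Yield (subject id, property, object id) for every generic relation."""
    by_id = {record["id"]: record for record in records}
    for record in records:
        for related_id in record.get("related", []):
            target = by_id.get(related_id)
            if target is None:
                continue  # related record not available; skip it
            prop = refine_relation(record["type"], target["type"])
            yield (record["id"], prop, related_id)


records = [
    {"id": "photo/1", "type": "photograph", "related": ["item/7"]},
    {"id": "item/7", "type": "item", "related": ["item/8"]},
    {"id": "item/8", "type": "item", "related": []},
]
triples = list(map_relations(records))
for triple in triples:
    print(triple)
```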

Regards

Conal

On 15 September 2016 at 20:16, Carlisle, Philip <
philip.carli...@historicengland.org.uk> wrote:

> Hi all,
>
>
>
> The Arches project moves on apace and is in the process of modifying the
> graphs for version 4.
>
>
>
> In the original graphs we used a British Museum extension property
> (PXX_is_related_to) as a workaround to allow us to represent the general
> association relationship we had in legacy datasets, e.g. this telephone
> kiosk has a general association with this telephone exchange.
>
>
>
> We now want to continue to be able to model a general association, but the
> only property available, P69 has association with (is associated with), is
> restricted in its domain and range to E29 Design or Procedure.
>
>
>
> How do we model the ‘If you’re interested in that you might be interested
> in this’ nature of the general association between two physical man-made
> things?
>
>
>
> All thoughts appreciated.
>
>
>
> Phil
>
>
>
> *Phil Carlisle*
>
> Knowledge Organization Specialist
>
> Listing Group, Historic England
>
> Direct Dial: +44 (0)1793 414824
>
>
>
> http://thesaurus.historicengland.org.uk/
>
> http://www.heritagedata.org/blog/




[Crm-sig] Blog post on mapping a museum web API to CIDOC CRM

2015-10-21 Thread Conal Tuohy
I've written a third blog post in a series, exploring an experimental
Linked Data publishing technique in which a transforming web proxy is used
to transform and expose a museum collection JSON web API as RDF graphs
(using the Erlangen CRM).

This post deals with the challenges of mapping from the "record oriented"
data model exposed by the API, to the more ramified CRM, with examples.

Any feedback very gratefully received!

http://conaltuohy.com/blog/bridging-conceptual-gap-api-cidoc-crm/


Re: [Crm-sig] Fixity Hash in CRM Addendum

2015-09-11 Thread Conal Tuohy
This might also be a good time to dip into FRBRoo.

On 11 September 2015 at 13:14, daniel riley  wrote:

> Hello folks,
>
> I'm adding a bit to this question since I think it's relevant to anyone in
> digital preservation. If anyone finds it off-topic, let me know.
>
> So, where we left off was that perhaps E38_Image wasn't the best entity to
> express a digital image of an artwork since E38_Image doesn't specify a
> concrete manifestation of that image.  However, in the scope notes for
> P138_represents, it explicitly states:
>
> "This property is also used for the relationship between an original and
> a digitisation of the original by the use of techniques such as digital
> photography, flatbed or infrared scanning."
>
> So it seems like the property is correct for specifying a digital version
> of the work but perhaps the Range entity is incorrect. Should I simply be
> using the superclass E73_Information_Object rather than E38_Image as the
> range, if I want to specify a digital image file with a specific set of
> bytes?
>
> Thanks,
> Daniel Riley
>
> On Wed, Sep 9, 2015 at 6:07 PM, daniel riley  wrote:
>
>> Hi Simon,
>>
>> That makes sense. For instance, one image could have multiple sizes. We
>> would think about them as the same image but their hashes would be
>> completely different.  I am not as familiar with FRBRoo, but I took a look
>> at F4 Manifestation Singleton, and I'm not sure if its intention is
>> something like this.
>>
>> One thing that is confusing is that in many cases, as in the British
>> Museum example here:
>>
>> http://collection.britishmuseum.org/resource?uri=http%3A%2F%2Fwww.britishmuseum.org%2Fcollectionimages%2FAN00037%2FAN00037369_001_l.jpg
>>
>> The resource is a specific digital version of an image with a specific
>> asset id and a specific filename. So it would seem that if I added a
>> property about that resource it would be about the specific binary data,
>> and not about all possible versions of that image.
>>
>> If anyone knows of an example implementation that addresses fixity it
>> would be a great help.
>>
>> Thanks,
>> Dan
>>
>> P.S. I was using British Museum's linked data as a guide for most of my
>> work:
>>
>> On Wed, Sep 9, 2015 at 5:23 PM, Simon Spero  wrote:
>>
>>> Another problem with this is that a hash of a bit string does not
>>> identify an Image (even if the hash is 1:1).
>>>
>>> An Image is abstract and conceptual, and has an identity that is preserved
>>> across transformations that would generate different bit strings.
>>>
>>> Going the other way,  I believe that CIDOC does require that the same
>>> bit string not correspond to multiple images. For example, an imaging
>>> sensor might capture an image with the shutter closed at the start of a
>>> series of measurements - such an image could be used for calibration.
>>> Many such images might have identical bit strings, but would be
>>> conceptually different works under some stances. However,  since they have
>>> indistinguishable appearances, they are the same Image.
>>>
>>> Fixity hashes might be better treated as properties of a FRBRoo
>>> Manifestation; such properties are intrinsic to the Manifestation*; they
>>> are not externally assigned in the same way that a URI, accession number,
>>> etc are.
>>>
>>> Simon
>>> * or as the value of a property that must be the same for every item
>>> that is an instance of that Manifestation
>>> On Sep 9, 2015 4:15 PM, "daniel riley"  wrote:
>>>
>>>> Hello all,
>>>>
>>>> I wanted to get confirmation on the correct application of the
>>>> Cidoc-crm in the case of checksum hashes (i.e. fixity values).
>>>>
>>>> For instance if the hash of a digital image file computes to:
>>>> 6b8dca09e851a987050463c9c60603e9ad797ba09117056fc2e0c07bcac66e43
>>>>
>>>> My first thought would be to use:
>>>>
>>>> E38_Image - P1_is_identified_by - E42_Identifier (hash value)
>>>> E42_Identifier - P2_has_type - "SHA256 HASH"
>>>>
>>>> However, the scope notes for E42_Identifier explicitly state:
>>>> The class E42 Identifier is not normally used for machine-generated
>>>> identifiers
>>>>
>>>> A hash is definitely machine generated, so what are the other options
>>>> here? Should I use a different ontology for this case?
>>>>
>>>> Thanks,
>>>> Daniel Riley
>>>> Verisart




Re: [Crm-sig] How to represent the textual content of documents about museum objects?

2015-09-10 Thread Conal Tuohy
On 8 September 2015 at 19:27, Dominic Oldman  wrote:

>
> I think there are various approaches you can take depending upon what your
> objectives are.
>
> 1. Identify (describe) the document and provide access to it. Using CRM
> this would harmonise with other CRM data.
>

This is really all I'm aiming to do, though I had to step outside of the
CIDOC CRM (and use FRBRoo) to encode the relationship between the E31
Document and the associated HTML content. I'm slightly dissatisfied with
that, but perhaps it's to be expected. I'm open to other options!


> 2. Identify particular fragments of the text (using FRBRoo).
> 3. Tag particular things in the text
>
> In terms of 3 there is TEI but also the option of using CRM in RDFa tags
> to identify entities and relationships in the text that would have
> correspondence in the data. This is an approach we have used at the BM.
> RDFa tags can be used to identify people, places, subjects etc, and can
> link these entities using CRM properties. These can operate on their own as
> an extension to the RDF store or be harvested into the RDF store.
>

In other projects I have used TEI as a source for RDF, with a workflow
which harvests RDF from TEI documents and stores them in a SPARQL graph
store. It's a powerful technique for aggregating data across a corpus of
texts. I would be very interested to read more about how you have used TEI
(or RDFa) in this way at the British Museum!

But in this particular project I'm trying out a workflow that doesn't
involve an RDF store at all. I don't control the source of the data (I
don't work for Museum Victoria); I am merely querying it and re-formatting
it to produce RDF on the fly (i.e. as requested by a Linked Data client).
Their API is not natively RDF, and I'm not harvesting or even caching the
RDF data I generate so there's actually no "RDF store" involved at all.
It's been an interesting experiment for me; the weakness in the approach
is that any actual aggregation you need to do has to be quick enough to
perform on the fly. The Linked Data resources (RDF graphs) my software
produces are all based on 1 or at most 2 queries to the Museum's API, and
possibly 1 to dbpedia. On the positive side, the lack of caching and
harvesting makes the whole thing very simple.
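
To make the "on the fly" idea concrete, here is a minimal sketch of the translation step, not the actual proxy. The JSON field names ("id", "title", "relatedItemIds"), the base URI and the use of rdfs:label are all assumptions for illustration; the real API's structure may differ. The function takes one JSON response and returns triples, with nothing stored or cached.

```python
import json

BASE = "http://example.org/data/"  # made-up base URI


def article_to_triples(json_text):
    """Translate one (hypothetical) article record into a list of triples."""
    record = json.loads(json_text)
    subject = f"{BASE}articles/{record['id']}"
    triples = [
        (subject, "rdf:type", "crm:E31_Document"),
        (subject, "rdfs:label", record["title"]),
    ]
    # Each related collection item is documented by the article.
    for item_id in record.get("relatedItemIds", []):
        triples.append((subject, "crm:P70_documents", f"{BASE}items/{item_id}"))
    return triples


# A made-up API response, standing in for a live request.
json_text = '{"id": 1201, "title": "Example article", "relatedItemIds": [5, 6]}'
triples = article_to_triples(json_text)
for triple in triples:
    print(triple)
```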

Cheers!

Conal




Re: [Crm-sig] How to represent the textual content of documents about museum objects?

2015-09-10 Thread Conal Tuohy
Thanks again Richard!

I have taken your advice and avoided XML Literals. I would appreciate any
comments or criticisms of my new alternative.

To recap, there are "articles" which I have modelled as instances of E31
Document. These used to have rdf:value properties which were XML Literals
containing the HTML text of the article.

Now I've discarded those properties, and instead I'm minting a distinct URI
for those HTML documents, and I decided to utilise the FRBRoo ontology to
link each E31 Document to its own HTML resource, using the FRBRoo
R4_carriers_provided_by predicate. This means that implicitly the HTML
document containing the article text is an F3 Manifestation Product Type,
and the "article" resource is now an F2 Expression as well as an E31
Document.

For example, here is one such "article": <
http://conaltuohy.com/xproc-z/museum-victoria/data/articles/1201> and
here's the text of that article: <
http://conaltuohy.com/xproc-z/museum-victoria/data/html/articles/1201>

I'm not that clear how best to consider this in terms of FRBRoo. It seemed
to me that my "article text" resources are essentially HTML in nature, and
mono-lingual (in English), hence Expressions rather than Works. The
pipeline which returns the HTML HTTP response is a factory of identical (or
near enough) bitstreams, hence Manifestation Product Type.
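
A small sketch of the linking described above, using the two URI patterns from the example. The "crm:" and "frbroo:" prefixes and the helper names are assumptions for illustration, and the explicit class triples are optional, since those types are only implied by the predicate.

```python
# URI patterns taken from the example article; prefixes are assumed.
BASE = "http://conaltuohy.com/xproc-z/museum-victoria/data/"


def article_uris(article_id):
    """Return (document URI, HTML resource URI) for one article."""
    return (f"{BASE}articles/{article_id}", f"{BASE}html/articles/{article_id}")


def carrier_triples(article_id):
    """Link the E31 Document to the HTML resource that carries its text."""
    document, html = article_uris(article_id)
    return [
        (document, "rdf:type", "crm:E31_Document"),
        (document, "rdf:type", "frbroo:F2_Expression"),
        (document, "frbroo:R4_carriers_provided_by", html),
        (html, "rdf:type", "frbroo:F3_Manifestation_Product_Type"),
    ]


triples = carrier_triples(1201)
for triple in triples:
    print(triple)
```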


On 9 September 2015 at 19:46, rich...@light.demon.co.uk <
rich...@light.demon.co.uk> wrote:

> I don't think that many Linked Data clients will be set up to work with
> XML literals. I would go for a simple wrapper to create a well formed
> document. RDF is not at its best when dealing with string values - witness
> all definitions in dbpedia and SKOS resources, which ought to have
> structure but can't.
>
> Richard
>
> Richard Light
> Sent from my phone
>
> - Reply message -
> From: "Conal Tuohy" 
> To: "Richard Light" 
> Cc: "CRM SIG" 
> Subject: [Crm-sig] How to represent the textual content of documents about
> museum objects?
> Date: Wed, Sep 9, 2015 05:53
>
>
>
> On 8 September 2015 at 19:05, Richard Light 
> wrote:
>
>> Your approach seems perfectly reasonable to me, in the context of an
>> RDF/XML serialization.  Presumably it might present problems in other
>> serializations, e.g. Turtle, when you get to the point of offering more.
>>
>
> Thanks Richard! I hadn't even considered the possibility that the XML
> literal might be a problem in other RDF serializations. I will look into
> that.
>
>>
>> Another way of doing it might be to treat the article as a free-standing
>> information resource, mint a URL for it, and create RDF metadata which
>> describes this resource.  Your proxy software would have to resolve the URL
>> and serve up the HTML when requested, but I assume that wouldn't be hard.
>>
>
> Yes that is the other option I considered, and as you say, it would not be
> hard.
>
> In the JSON which the Museum API provides these HTML fragments are not
> even complete HTML documents; or even well-formed documents; they are just
> a sequence of  elements. I think any real user interface would want to
> integrate them into a larger page, with a title, images, etc; that's at
> least partly why I chose to encode them just as literal fragments, rather
> than to promote them into being resources in their own right.
>
> But it's difficult to get a picture of which might actually be a useful
> approach for a Linked Data client.





Re: [Crm-sig] How to represent the textual content of documents about museum objects?

2015-09-09 Thread Conal Tuohy
On 8 September 2015 at 19:05, Richard Light 
wrote:

> Your approach seems perfectly reasonable to me, in the context of an
> RDF/XML serialization.  Presumably it might present problems in other
> serializations, e.g. Turtle, when you get to the point of offering more.
>

Thanks Richard! I hadn't even considered the possibility that the XML
literal might be a problem in other RDF serializations. I will look into
that.

>
> Another way of doing it might be to treat the article as a free-standing
> information resource, mint a URL for it, and create RDF metadata which
> describes this resource.  Your proxy software would have to resolve the URL
> and serve up the HTML when requested, but I assume that wouldn't be hard.
>

Yes that is the other option I considered, and as you say, it would not be
hard.

In the JSON which the Museum API provides these HTML fragments are not even
complete HTML documents; or even well-formed documents; they are just a
sequence of  elements. I think any real user interface would want to
integrate them into a larger page, with a title, images, etc; that's at
least partly why I chose to encode them just as literal fragments, rather
than to promote them into being resources in their own right.
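
For comparison, if the fragments were promoted to resources, the wrapping step might look something like the sketch below. The function name and page template are hypothetical, and note that it only wraps the fragment: it does not repair markup that is not well-formed.

```python
from html import escape


def wrap_fragment(fragment, title):
    """Wrap an HTML fragment in a minimal page; the fragment is not repaired."""
    return (
        "<!DOCTYPE html>\n"
        '<html lang="en">\n'
        f'<head><meta charset="utf-8"><title>{escape(title)}</title></head>\n'
        f"<body>\n{fragment}\n</body>\n"
        "</html>\n"
    )


page = wrap_fragment("<p>First paragraph.</p><p>Second.</p>", "Ships & shipping")
print(page)
```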

But it's difficult to get a picture of which might actually be a useful
approach for a Linked Data client.


[Crm-sig] How to represent the textual content of documents about museum objects?

2015-09-08 Thread Conal Tuohy
I have recently made an experimental software application to generate a
Linked Data expression of Museum Data from the public collection API of
Museum Victoria (Melbourne, Australia).

The Museum Victoria API is a custom-built web application which returns
custom JSON data. My experimental software is a proxy which translates
their JSON into RDF/XML using the Erlangen OWL version of the CIDOC CRM.
More details available here:
http://conaltuohy.com/blog/lod-from-custom-web-api/

The Museum Victoria database contains a number of "articles" which each
describe one or more objects in their collection. I have modelled each of
these as an "E31 Document", and related them to the corresponding
collection items using "P70 documents".

My question is how to express the text of the actual articles (which the
Museum Victoria API provides as an HTML fragment embedded in its JSON
response). At the moment I have simply used rdf:value to attach the HTML
fragment as an XML literal to the E31 Document instance. Is this the
recommended practice?

Here is an example of one of these "articles":
http://graphite.ecs.soton.ac.uk/browser/?uri=http%3A%2F%2Fconaltuohy.com%2Fxproc-z%2Fmuseum-victoria%2Fresource%2Farticles%2F1201

Regards

Conal



