Re: [Crm-sig] Domain and range of P90

2018-03-01 Thread Martin Doerr

Dear Doug,

On 3/1/2018 3:14 PM, Douglas Tudhope wrote:
We'd also support this direction and the move to standardising RDF 
guidelines. The need for standardisation of technical representations 
and encodings of the CRM (particularly RDF) has been a recurrent theme 
in CRM workshops and papers oriented to implementation issues over the 
last 10 years (*). We can't really achieve interoperability without 
this. Encouraging particular communities with shared use cases (as in 
linked.art) to adopt a set of consistent patterns for mapping instance 
data to the CRM is also a necessary ingredient of practical 
interoperability.
That is exactly my proposal. CRM-SIG normally works reactively: When a 
good community practice emerges, this is taken up.


Best,

Martin


Doug Tudhope and Ceri Binding

Hypermedia Research Group, University of South Wales

(*) eg workshops on CRM implementation issues

http://tpdl2013.upatras.gr/ws-crmex.php

https://www.topoi.org/wp-content/uploads/2011/02/Programme_Datenwelten.pdf 



PS just seen Martin's concurrent reply to this thread and this is not 
taking issue with that, rather arguing that we can make significant 
progress by some basic implementation (RDF) extensions and mapping 
patterns, even though they will not solve all the problems


*From:* Crm-sig [crm-sig-boun...@ics.forth.gr] on behalf of Conal 
Tuohy [conal.tu...@gmail.com]

*Sent:* 01 March 2018 03:02
*To:* George Bruseker
*Cc:* crm-sig (Crm-sig@ics.forth.gr)
*Subject:* Re: [Crm-sig] Domain and range of P90

One of the "gaps" which puzzles me most is the example you give of 
encoding the string value of an Appellation. I understand the 
recommended practice is to attach the string value of a person's name 
using P3_has_note, or actually, using a custom subproperty of 
P3_has_note. The semantics of P3_has_note itself are weak; a note is 
simply an "informal description" of something, so if I have a 
particular name (an RDF resource) which P3_has_note the literal string 
"Conal Tuohy", then I should really define subproperties so as to be 
able to distinguish that string value from a note which really is 
nothing more than an "informal description" of that name e.g. "A very 
uncommon name of Irish origin". What puzzles me most about this "gap" 
in the RDFS specification is that the distinction between a note ABOUT 
a name, and the actual textual representation OF a name is somehow 
considered out of scope of the CRM in RDFS. It's puzzling, because the 
string value of a name is something which really must be encoded in a 
standard fashion, to achieve interoperability (as an aside, my 
personal view is that the string literal "Conal Tuohy" could be 
attached to an Appellation using the rdf:value or rdfs:label predicate 
defined in the RDFS spec). But the important thing is that the RDFS 
schema should stipulate how to attach this literal data rather than 
leave it as an open question. In general these are the kinds of issues 
which puzzle many people who approach the CRM from a position of 
having already worked with other RDF ontologies in the cultural 
heritage space, and find themselves wondering how they are supposed to 
make these details of the CRM work in RDF in an interoperable way, without 
having to pick and choose from a variety of techniques for "finessing" 
the gaps.


These kinds of gaps are serious barriers to interoperability in the 
Linked Open Data cloud, and they need to be addressed by agreeing on 
some encoding procedures that can be used consistently by different 
projects on the web. It would be helpful to CRM adopters in the Linked 
Data community if these gaps could be filled in a manner which is 
clear and simple and interoperable. I am not in favour of just 
offering a menu of possible approaches, especially where individual 
projects would have to make local customisations to their schema. If 
there is some particular value in multiple approaches, then they could 
be published as different "profiles" that encoders could simply adopt, 
as a whole. I think the recent effort by Richard Light (and other 
contributors) to collate guidelines on RDF encoding is a great 
initiative! 
It deserves more input and I hope it will continue to be discussed on 
the list. I also think the Linked Art project http://linked.art/ with 
its "profile" of the CRM is another really good way forward.


Regards

Conal



___
Crm-sig mailing list
Crm-sig@ics.forth.gr
http://lists.ics.forth.gr/mailman/listinfo/crm-sig



--
--
 Dr. Martin Doerr  |  Vox:+30(2810)391625|
 Research Director |  Fax:+30(2810)391638|
   

Re: [Crm-sig] Domain and range of P90

2018-03-01 Thread Martin Doerr

Dear Rob,

On 3/1/2018 3:54 PM, Robert Sanderson wrote:


Let me try and summarize your position, to see if I understand correctly…

    It is theoretically impossible to fulfil all of the complex 
possibilities,



Yes.


so the vast majority of pragmatic cases that just need a value 
associated with the resource also cannot be fulfilled.


Whatever the vast majority is, if rdf:value does the job, I have no 
objections to its use.
Just define precisely what you use it for. We can add that to our 
guidelines. It is already standard rdf.


If it is about persons, a good practice is, for instance, to use a URI from 
a resource such as ULAN, which defines all the name values. Using 
rdf:value to fill in a name is as good as using rdfs:label.


If it is the content of a digital object, we should make this an ISSUE, 
because it has a certain complexity and needs harmonization with FRBRoo.


Best,

Martin


For the rest of us, I think we should agree in this community to use 
rdf:value, per Conal’s email.


Rob

*From: *Crm-sig  on behalf of Martin 
Doerr 

*Date: *Thursday, March 1, 2018 at 7:19 AM
*To: *"crm-sig@ics.forth.gr" 
*Subject: *Re: [Crm-sig] Domain and range of P90

Dear Conal Tuohy,

Your comments are well taken. First, again, a general note: for 22 
years we have tried carefully to make standards where standards are 
possible, and to take care that the most relevant semantics for 
integrating data are modelled, as long as they can be modelled at all 
with these means. Standards-making in the first place means following 
good practice around the world, understanding where a consensus 
appears, and interfacing with, rather than redoing the work of, those 
communities that have effective competence in certain fields.


So: "these gaps could be filled in a manner which is clear and simple 
and interoperable". How much we would love to do that! The point is 
that the encoding of names is immensely complex. The most 
comprehensive and experienced standards come from the library world. 
If you believe there is a clear and simple solution, please try to 
extract it from the library cataloguing rules (AACR2, RDA 
https://www.oclc.org/en/rda/about.html), but there is also EAC-CPF 
(http://eac.staatsbibliothek-berlin.de/) and FOAF. FOAF works badly 
for historical data, as I was informed.


The names of people are indeed an issue of interoperability. However, 
if we have a particular person described with events etc., the exact 
name itself has no further links to other kinds of facts than instance 
matching with other occurrences of the same person (not talking about 
families here). Therefore the damage to global reasoning of having 
representation problems is relatively limited. ULAN, for instance, 
registers an average of two names per known artist. All practice of 
instance matching shows that, even when names are encoded well, the 
identity question is not settled. Instance matching is a science, and 
it says everything about the reasoning needed and the effectiveness of 
name standards.


In each case in which an unambiguous formulation of some properties 
cannot be achieved because the world is more complex, we can only rely 
on mapping.


More comments below:

On 3/1/2018 5:02 AM, Conal Tuohy wrote:

One of the "gaps" which puzzles me most is the example you give of
encoding the string value of an Appellation. I understand the
recommended practice is to attach the string value of a person's
name using P3_has_note, or actually, using a custom subproperty of
P3_has_note. The semantics of P3_has_note itself are weak; a note
is simply an "informal description" of something, so if I have a
particular name (an RDF resource) which P3_has_note the literal
string "Conal Tuohy", then I should really define subproperties so
as to be able to distinguish that string value from a note which
really is nothing more than an "informal description" of that name
e.g. "A very uncommon name of Irish origin". What puzzles me most
about this "gap" in the RDFS specification is that the distinction
between a note ABOUT a name, and the actual textual representation
OF a name is somehow considered out of scope of the CRM in RDFS.
It's puzzling, because the string value of a name is something
which really must be encoded in a standard fashion, to achieve
interoperability (as an aside, my personal view is that the string
literal "Conal Tuohy" could be attached to an Appellation using
the rdf:value or rdfs:label predicate defined in the RDFS spec).

I can only repeat that the instruction for using the CRM, and even the 
RDFS, is to make your own extensions. The weak semantics of P3 ensure 
that information is reached, but not that it is specifically 
interpreted. Since there are no comprehensive encoding schemes for 
personal names worldwide, you can, for instance, reuse FOAF properties 
as subproperties of CRM properties, or reuse MARC-encoded strings. 
Both represent a good prac

Re: [Crm-sig] Domain and range of P90

2018-03-01 Thread Robert Sanderson

Let me try and summarize your position, to see if I understand correctly…
It is theoretically impossible to fulfil all of the complex possibilities, 
so the vast majority of pragmatic cases that just need a value associated with 
the resource also cannot be fulfilled.

For the rest of us, I think we should agree in this community to use rdf:value, 
per Conal’s email.

Rob


From: Crm-sig  on behalf of Martin Doerr 

Date: Thursday, March 1, 2018 at 7:19 AM
To: "crm-sig@ics.forth.gr" 
Subject: Re: [Crm-sig] Domain and range of P90

Dear Conal Tuohy,

Your comments are well taken. First, again, a general note: for 22 years we 
have tried carefully to make standards where standards are possible, and to 
take care that the most relevant semantics for integrating data are modelled, 
as long as they can be modelled at all with these means. Standards-making in 
the first place means following good practice around the world, understanding 
where a consensus appears, and interfacing with, rather than redoing the work 
of, those communities that have effective competence in certain fields.

So: "these gaps could be filled in a manner which is clear and simple and 
interoperable". How much we would love to do that! The point is that the 
encoding of names is immensely complex. The most comprehensive and 
experienced standards come from the library world. If you believe there is a 
clear and simple solution, please try to extract it from the library 
cataloguing rules (AACR2, RDA https://www.oclc.org/en/rda/about.html), but 
there is also EAC-CPF (http://eac.staatsbibliothek-berlin.de/) and FOAF. FOAF 
works badly for historical data, as I was informed.

The names of people are indeed an issue of interoperability. However, if we 
have a particular person described with events etc., the exact name itself has 
no further links to other kinds of facts than instance matching with other 
occurrences of the same person (not talking about families here). Therefore the 
damage to global reasoning of having representation problems is relatively 
limited. ULAN, for instance, registers an average of two names per known 
artist. All practice of instance matching shows that, even when names are 
encoded well, the identity question is not settled. Instance matching is a 
science, and it says everything about the reasoning needed and the 
effectiveness of name standards.

In each case in which an unambiguous formulation of some properties cannot be 
achieved because the world is more complex, we can only rely on mapping.

More comments below:

On 3/1/2018 5:02 AM, Conal Tuohy wrote:
One of the "gaps" which puzzles me most is the example you give of encoding the 
string value of an Appellation. I understand the recommended practice is to 
attach the string value of a person's name using P3_has_note, or actually, 
using a custom subproperty of P3_has_note. The semantics of P3_has_note itself 
are weak; a note is simply an "informal description" of something, so if I have 
a particular name (an RDF resource) which P3_has_note the literal string "Conal 
Tuohy", then I should really define subproperties so as to be able to 
distinguish that string value from a note which really is nothing more than an 
"informal description" of that name e.g. "A very uncommon name of Irish 
origin". What puzzles me most about this "gap" in the RDFS specification is 
that the distinction between a note ABOUT a name, and the actual textual 
representation OF a name is somehow considered out of scope of the CRM in RDFS. 
It's puzzling, because the string value of a name is something which really 
must be encoded in a standard fashion, to achieve interoperability (as an 
aside, my personal view is that the string literal "Conal Tuohy" could be 
attached to an Appellation using the rdf:value or rdfs:label predicate defined 
in the RDFS spec).
I can only repeat that the instruction for using the CRM, and even the RDFS, is 
to make your own extensions. The weak semantics of P3 ensure that information 
is reached, but not that it is specifically interpreted. Since there are no 
comprehensive encoding schemes for personal names worldwide, you can, for 
instance, reuse FOAF properties as subproperties of CRM properties, or reuse 
MARC-encoded strings. Both represent good practice and are well defined. As long 
as an LOD system has this information, it can map between them, run instance 
matching algorithms, and display the information.

Using rdfs:label can be a solution, as can rdf:value; both should be 
discussed as possible recommendations. Also, instantiating Appellation with a 
URI in its own right is not necessary if rdfs:label is sufficient. The problem 
is that rdfs:label creates overlapping semantics with any ontology dealing 
with names. We can only register this fact by admitting that there is more 
than one representation, depending on the case.

But the important thing is that the RDFS schema should stipulate how to attach 
this literal data rather than leav

Re: [Crm-sig] Domain and range of P90

2018-03-01 Thread Douglas Tudhope
We'd also support this direction and the move to standardising RDF guidelines. 
The need for standardisation of technical representations and encodings of the 
CRM (particularly RDF) has been a recurrent theme in CRM workshops and papers 
oriented to implementation issues over the last 10 years (*). We can't really 
achieve interoperability without this. Encouraging particular communities with 
shared use cases (as in linked.art) to adopt a set of consistent patterns for 
mapping instance data to the CRM is also a necessary ingredient of practical 
interoperability.

Doug Tudhope and Ceri Binding
Hypermedia Research Group, University of South Wales

(*) eg workshops on CRM implementation issues
http://tpdl2013.upatras.gr/ws-crmex.php
https://www.topoi.org/wp-content/uploads/2011/02/Programme_Datenwelten.pdf

PS just seen Martin's concurrent reply to this thread and this is not taking 
issue with that, rather arguing that we can make significant progress by some 
basic implementation (RDF) extensions and mapping patterns, even though they 
will not solve all the problems

From: Crm-sig [crm-sig-boun...@ics.forth.gr] on behalf of Conal Tuohy 
[conal.tu...@gmail.com]
Sent: 01 March 2018 03:02
To: George Bruseker
Cc: crm-sig (Crm-sig@ics.forth.gr)
Subject: Re: [Crm-sig] Domain and range of P90

One of the "gaps" which puzzles me most is the example you give of encoding the 
string value of an Appellation. I understand the recommended practice is to 
attach the string value of a person's name using P3_has_note, or actually, 
using a custom subproperty of P3_has_note. The semantics of P3_has_note itself 
are weak; a note is simply an "informal description" of something, so if I have 
a particular name (an RDF resource) which P3_has_note the literal string "Conal 
Tuohy", then I should really define subproperties so as to be able to 
distinguish that string value from a note which really is nothing more than an 
"informal description" of that name e.g. "A very uncommon name of Irish 
origin". What puzzles me most about this "gap" in the RDFS specification is 
that the distinction between a note ABOUT a name, and the actual textual 
representation OF a name is somehow considered out of scope of the CRM in RDFS. 
It's puzzling, because the string value of a name is something which really 
must be encoded in a standard fashion, to achieve interoperability (as an 
aside, my personal view is that the string literal "Conal Tuohy" could be 
attached to an Appellation using the rdf:value or rdfs:label predicate defined 
in the RDFS spec). But the important thing is that the RDFS schema should 
stipulate how to attach this literal data rather than leave it as an open 
question. In general these are the kinds of issues which puzzle many people who 
approach the CRM from a position of having already worked with other RDF 
ontologies in the cultural heritage space, and find themselves wondering how 
they are supposed to make these details of the CRM work in RDF in an interoperable 
way, without having to pick and choose from a variety of techniques for 
"finessing" the gaps.

These kinds of gaps are serious barriers to interoperability in the Linked Open 
Data cloud, and they need to be addressed by agreeing on some encoding 
procedures that can be used consistently by different projects on the web. It 
would be helpful to CRM adopters in the Linked Data community if these gaps 
could be filled in a manner which is clear and simple and interoperable. I am 
not in favour of just offering a menu of possible approaches, especially where 
individual projects would have to make local customisations to their schema. If 
there is some particular value in multiple approaches, then they could be 
published as different "profiles" that encoders could simply adopt, as a whole. 
I think the recent effort by Richard Light (and other contributors) to collate 
guidelines on RDF encoding is a great initiative! 

 It deserves more input and I hope it will continue to be discussed on the 
list. I also think the Linked Art project http://linked.art/ with its "profile" 
of the CRM is another really good way forward.

Regards

Conal



Re: [Crm-sig] Domain and range of P90

2018-03-01 Thread Martin Doerr

Dear Conal Tuohy,

Your comments are well taken. First, again, a general note: for 22 
years we have tried carefully to make standards where standards are 
possible, and to take care that the most relevant semantics for 
integrating data are modelled, as long as they can be modelled at all 
with these means. Standards-making in the first place means following 
good practice around the world, understanding where a consensus 
appears, and interfacing with, rather than redoing the work of, those 
communities that have effective competence in certain fields.


So: "these gaps could be filled in a manner which is clear and simple 
and interoperable". How much we would love to do that! The point is that 
the encoding of names is immensely complex. The most comprehensive 
and experienced standards come from the library world. If you believe 
there is a clear and simple solution, please try to extract it from the 
library cataloguing rules (AACR2, RDA 
https://www.oclc.org/en/rda/about.html), but there is also EAC-CPF 
(http://eac.staatsbibliothek-berlin.de/) and FOAF. FOAF works badly for 
historical data, as I was informed.


The names of people are indeed an issue of interoperability. However, if 
we have a particular person described with events etc., the exact name 
itself has no further links to other kinds of facts than instance 
matching with other occurrences of the same person (not talking about 
families here). Therefore the damage to global reasoning of having 
representation problems is relatively limited. ULAN, for instance, 
registers an average of two names per known artist. All practice of 
instance matching shows that, even when names are encoded well, the 
identity question is not settled. Instance matching is a science, and 
it says everything about the reasoning needed and the effectiveness of 
name standards.


In each case in which an unambiguous formulation of some properties 
cannot be achieved because the world is more complex, we can only rely on 
mapping.


More comments below:

On 3/1/2018 5:02 AM, Conal Tuohy wrote:
One of the "gaps" which puzzles me most is the example you give of 
encoding the string value of an Appellation. I understand the 
recommended practice is to attach the string value of a person's name 
using P3_has_note, or actually, using a custom subproperty of 
P3_has_note. The semantics of P3_has_note itself are weak; a note is 
simply an "informal description" of something, so if I have a 
particular name (an RDF resource) which P3_has_note the literal string 
"Conal Tuohy", then I should really define subproperties so as to be 
able to distinguish that string value from a note which really is 
nothing more than an "informal description" of that name e.g. "A very 
uncommon name of Irish origin". What puzzles me most about this "gap" 
in the RDFS specification is that the distinction between a note ABOUT 
a name, and the actual textual representation OF a name is somehow 
considered out of scope of the CRM in RDFS. It's puzzling, because the 
string value of a name is something which really must be encoded in a 
standard fashion, to achieve interoperability (as an aside, my 
personal view is that the string literal "Conal Tuohy" could be 
attached to an Appellation using the rdf:value or rdfs:label predicate 
defined in the RDFS spec).
I can only repeat that the instruction for using the CRM, and even the 
RDFS, is to make your own extensions. The weak semantics of P3 ensure 
that information is reached, but not that it is specifically 
interpreted. Since there are no comprehensive encoding schemes for 
personal names worldwide, you can, for instance, reuse FOAF properties 
as subproperties of CRM properties, or reuse MARC-encoded strings. Both 
represent good practice and are well defined. As long as an LOD system 
has this information, it can map between them, run instance matching 
algorithms, and display the information.
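
The point about weak semantics can be illustrated with a small sketch. It is not taken from the CRM documentation; it assumes a hypothetical local property, ex:name_string, declared as a subproperty of P3_has_note, and it models triples as plain Python tuples instead of using an RDF library. The example.org namespace and the sample data are assumptions made for illustration only.

```python
# Minimal sketch, standard library only: triples are (subject, predicate, object).
CRM = "http://www.cidoc-crm.org/cidoc-crm/"
SUBPROP = "http://www.w3.org/2000/01/rdf-schema#subPropertyOf"
EX = "http://example.org/"  # hypothetical namespace

P3 = CRM + "P3_has_note"
NAME_STRING = EX + "name_string"  # hypothetical local subproperty of P3_has_note
name = EX + "appellation/1"       # hypothetical URI of one name resource

# Local schema extension: declares the custom property under P3_has_note.
schema = {(NAME_STRING, SUBPROP, P3)}

data = {
    (name, NAME_STRING, "Conal Tuohy"),
    (name, P3, "A very uncommon name of Irish origin"),
}

def subproperties_of(prop):
    """Return prop plus every property declared, directly or indirectly, under it."""
    found = {prop}
    frontier = [prop]
    while frontier:
        current = frontier.pop()
        for s, p, o in schema:
            if p == SUBPROP and o == current and s not in found:
                found.add(s)
                frontier.append(s)
    return found

def values(subject, prop):
    """Return all literals reachable through prop or any of its subproperties."""
    props = subproperties_of(prop)
    return sorted(o for s, p, o in data if s == subject and p in props)

# Querying the weak property reaches everything; the subproperty is specific.
all_notes = values(name, P3)
only_names = values(name, NAME_STRING)
```

A query on P3_has_note returns both literals, so the information is reached, while only a consumer that knows the local subproperty can tell which literal is the name itself.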


Using rdfs:label can be a solution, as can rdf:value; both should 
be discussed as possible recommendations. Also, instantiating 
Appellation with a URI in its own right is not necessary 
if rdfs:label is sufficient. The problem is that rdfs:label creates 
overlapping semantics with any ontology dealing with names. We can only 
register this fact by admitting that there is more than one 
representation, depending on the case.
But the important thing is that the RDFS schema should stipulate how 
to attach this literal data rather than leave it as an open question. 
In general these are the kinds of issues which puzzle many people who 
approach the CRM from a position of having already worked with other 
RDF ontologies in the cultural heritage space, and find themselves 
wondering how they are supposed to make these details of the CRM work in RDF 
in an interoperable way, without having to pick and choose from a 
variety of techniques for "finessing" the gaps.

Yes, but only if it is feasible at all, see above.


These kinds of gaps are serious barriers to interopera

Re: [Crm-sig] Domain and range of P90

2018-03-01 Thread Conal Tuohy
One of the "gaps" which puzzles me most is the example you give of encoding
the string value of an Appellation. I understand the recommended practice
is to attach the string value of a person's name using P3_has_note, or
actually, using a custom subproperty of P3_has_note. The semantics of
P3_has_note itself are weak; a note is simply an "informal description" of
something, so if I have a particular name (an RDF resource) which
P3_has_note the literal string "Conal Tuohy", then I should really define
subproperties so as to be able to distinguish that string value from a note
which really is nothing more than an "informal description" of that name
e.g. "A very uncommon name of Irish origin". What puzzles me most about
this "gap" in the RDFS specification is that the distinction between a note
ABOUT a name, and the actual textual representation OF a name is somehow
considered out of scope of the CRM in RDFS. It's puzzling, because the
string value of a name is something which really must be encoded in a
standard fashion, to achieve interoperability (as an aside, my personal
view is that the string literal "Conal Tuohy" could be attached to an
Appellation using the rdf:value or rdfs:label predicate defined in the RDFS
spec). But the important thing is that the RDFS schema should stipulate how
to attach this literal data rather than leave it as an open question. In
general these are the kinds of issues which puzzle many people who approach
the CRM from a position of having already worked with other RDF ontologies
in the cultural heritage space, and find themselves wondering how they are
supposed to make these details of the CRM work in RDF in an interoperable way,
without having to pick and choose from a variety of techniques for
"finessing" the gaps.
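
As an illustration of the distinction drawn above, here is a minimal sketch of one of the options mentioned, attaching the name string with rdf:value and keeping P3_has_note for the informal description. It is only a sketch under stated assumptions: triples are modelled as plain Python tuples with no RDF library, the example.org namespace and the appellation URI are hypothetical, and the choice of rdf:value is one proposal from this thread, not an agreed CRM recommendation.

```python
# Minimal sketch, standard library only: triples are (subject, predicate, object).
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
CRM = "http://www.cidoc-crm.org/cidoc-crm/"
EX = "http://example.org/"  # hypothetical namespace for the sample data

name = EX + "appellation/1"  # hypothetical URI of one name resource

triples = {
    (name, RDF + "type", CRM + "E41_Appellation"),
    # The textual representation OF the name.
    (name, RDF + "value", "Conal Tuohy"),
    # A note ABOUT the name.
    (name, CRM + "P3_has_note", "A very uncommon name of Irish origin"),
}

def objects(subject, predicate):
    """Return every object asserted for the given subject and predicate."""
    return sorted(o for s, p, o in triples if s == subject and p == predicate)

name_strings = objects(name, RDF + "value")
notes = objects(name, CRM + "P3_has_note")
```

With the two predicates kept apart, a consumer can ask for the name string without also receiving the descriptive note, and without needing a project-specific subproperty.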

These kinds of gaps are serious barriers to interoperability in the Linked
Open Data cloud, and they need to be addressed by agreeing on some encoding
procedures that can be used consistently by different projects on the web.
It would be helpful to CRM adopters in the Linked Data community if these
gaps could be filled in a manner which is clear and simple and
interoperable. I am not in favour of just offering a menu of possible
approaches, especially where individual projects would have to make local
customisations to their schema. If there is some particular value in
multiple approaches, then they could be published as different "profiles"
that encoders could simply adopt, as a whole. I think the recent effort by
Richard Light (and other contributors) to collate guidelines on RDF
encoding is a great initiative!  It deserves more
input and I hope it will continue to be discussed on the list. I also think
the Linked Art project http://linked.art/ with its "profile" of the CRM is
another really good way forward.

Regards

Conal


On 22 February 2018 at 19:46, George Bruseker  wrote:

> Dear Phil et al.,
>
> I think this is a case of interpreting the label of the property rather
> than its intention. CRM ‘has value’ isn’t supposed to cover all possible
> meanings of the natural language interpretation of has value. Rather it has
> a very restricted use. It is meant to give the quantitative numeric value
> associated with a dimension. Dimension is a class that should be used to
> store information that results from a measurement activity. The measurement
> activity is specified as some procedural event that has the intentional
> objective of producing quantitative data. It is an activity of interacting
> with the world with the intention of producing a quantitive result.
>
> So it would be nonsensical to say 'this paragraph (E73) has dimension
> (E54, defined as a quantitative result from a measuring procedure) has value
> “the characters in this paragraph” (E59 primitive value)'. The definition of
> E54 forbids it because a string is not a quantity (though of course it may
> have a quantity… that would have to be measured).
>
> That of course sounds irritating. It would be nice to have a property that
> could store all values. But then of course that property would mean
> everything and nothing and the ontology wouldn’t work for getting specific
> information, like the quantitative results of measurement activities
> separate from any other value ‘good’ ‘bad’ ‘ugly’ ‘monogamy’ ‘world peace’
> ‘all the characters in this present string’.
>
> That’s the ontological argument. The practical question is why you are
> looking to expand the scope. I’m guessing that the reason is because you
> want a unique place to store a data value (this is a guess, so please do
> correct my presumption if I’m wrong).
>
> This seems to me to get back to the encoding issue and having a standard
> strategy. I think that a usual suggestion could be to throw it into string
> via P3 via note. Another suggestion would be to put it in label and, as I
> recall, there is rdf has value which could hold the actual data point