[Crm-sig] Fwd: New Issue: dimension intervals

2018-11-09 Thread Richard Light
[reply trimmed to fit the list's size limits]

 Forwarded Message 
Subject:Re: [Crm-sig] New Issue: dimension intervals
Date:   Fri, 9 Nov 2018 18:01:59 +
From:   Richard Light 

To: crm-sig@ics.forth.gr



On 08/11/2018 20:00, Martin Doerr wrote:
Dear Richard,

It requires a sort of datatype or encoding.

Assume unit = "ft&inches"
   value = <3,6>

would that make sense?

In the xsd datatypes everything is in the value already.

The XSD datatypes all resolve to single values, so they give no clear steer 
as to how to deal with the 'multiple units' issue.

I can see what you're saying as regards a 'complex' datatype, but I can't find 
examples on the Web of how the value would actually be encoded as an RDF value 
which software agents could do anything useful with.

The best I have come up with is this document from 2002:

http://infolab.stanford.edu/~melnik/rdf/datatyping/

which has some heavy hitters associated with it.  Is this the sort of approach 
you are proposing?

A slightly more complex example would be a geographical coordinate expressed as 
latitude and longitude (both expressed as degrees, minutes and seconds).
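
To make the "ft&inches" / <3,6> idea concrete, here is a minimal Python sketch, assuming a compound unit is described as an ordered list of subunits with conversion factors to one base unit. The class name CompoundUnit, its fields, and the unit label "deg-min-sec" are hypothetical and not part of the CRM or of any RDF datatype; the same structure also covers the degrees, minutes and seconds case.

```python
from dataclasses import dataclass
from fractions import Fraction
from typing import Sequence, Tuple


@dataclass(frozen=True)
class CompoundUnit:
    """A unit made of ordered subunits, e.g. feet and inches (hypothetical model)."""
    name: str
    subunits: Tuple[str, ...]      # labels, in the order the value lists them
    factors: Tuple[Fraction, ...]  # size of each subunit, expressed in the base unit
    base: str

    def to_base(self, value: Sequence[int]) -> Fraction:
        """Collapse a tuple such as (3, 6) into one number in the base unit."""
        if len(value) != len(self.subunits):
            raise ValueError(f"{self.name} expects {len(self.subunits)} components")
        return sum((Fraction(v) * f for v, f in zip(value, self.factors)), Fraction(0))


FT_INCHES = CompoundUnit("ft&inches", ("ft", "in"),
                         (Fraction(12), Fraction(1)), "inch")
DMS = CompoundUnit("deg-min-sec", ("deg", "min", "sec"),
                   (Fraction(1), Fraction(1, 60), Fraction(1, 3600)), "degree")

print(FT_INCHES.to_base((3, 6)))          # 42 (inches)
print(float(DMS.to_base((35, 20, 24))))   # 35.34 (decimal degrees)
```

The point of the sketch is only that the unit definition, not the value, carries the knowledge of which position means what; a software agent needs that definition to do anything useful with <3,6>.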

Thanks,

Richard

--
Richard Light


Re: [Crm-sig] Issue 383 Homework

2018-11-09 Thread Martin Doerr

Dear Robert,

On 11/6/2018 9:00 PM, Robert Sanderson wrote:


Thank you for pushing this forward, Martin!

Quantification wise, I would be in favor of 0,1 : 0,1.


I prefer 0,1:0,n or 0,n:0,n


If the structure of the set of symbols changed, then it would be a 
different symbolic object according to my understanding of E90:


>  … identifiable symbols and any aggregation of symbols …  that have an 
objectively recognizable structure and that are documented as single units.

Correct. The question is what to do if we encounter different representations: 
for instance, one giving the text "hello world" in Latin-1 and another in 
ASCII, while the E90 instance is of type Latin characters only; or if you 
write my name DOERR or DÖRR, both regarded by German authorities as 
identical variants representing the "Umlaut" as OE or Ö. Of course, in 
that case, having both representations would be redundant. In that case, 
0,n is more tolerant.
Another opinion is that one string is enough to define the E90. 
Then, 0,1.


Similarly, if the same string was used by different Symbolic Objects, 
then it seems like they would actually be the same symbolic object (or 
you would instead use two strings with the same data).


This is a long-debated question. In most cases this appears 
reasonable, but we do have cases in which the identity of the E90, seen 
as a message in the sense of Claude Shannon, is bound to the "sender". 
Discussing the sense of E35 Title, it appears that we cannot take the 
identity of the Title detached from the thing it was given to. This 
creates a precedent for the latter interpretation.


As a general principle, a 1:1 dependency is a thing subject to the 
suspicion of a hidden identity. To be on the safe side, I would rather 
not identify the E90 with the content model.


Treating two strings with the same data as different is a (good) 
implementation choice of RDF, which assigns the identity to the link 
rather than to the string, precisely in order to distinguish where the 
message comes from. If two strings with the same data are regarded as 
different, then we actually have a 0,x:0,n model in the ontology.
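
As an illustration of what the competing quantifications would allow, here is a small Python sketch. It assumes the usual reading of the notation, in which the first pair bounds the number of strings per symbolic object and the second pair bounds the number of symbolic objects per string. The identifiers (ex:name_doerr, the two titles) are made up for the example, and the property label stands for the still unnumbered Pxxx.

```python
from collections import defaultdict

# Hypothetical statements: (symbolic object, property, string)
triples = [
    ("ex:name_doerr", "Pxxx_has_symbolic_content", "DOERR"),
    ("ex:name_doerr", "Pxxx_has_symbolic_content", "DÖRR"),
    ("ex:title_of_book_A", "Pxxx_has_symbolic_content", "Collected Poems"),
    ("ex:title_of_book_B", "Pxxx_has_symbolic_content", "Collected Poems"),
]

strings_per_object = defaultdict(set)
objects_per_string = defaultdict(set)
for subject, _prop, literal in triples:
    strings_per_object[subject].add(literal)
    objects_per_string[literal].add(subject)


def fits(max_per_object, max_per_string):
    """Check the data against upper bounds; None means unbounded ('n')."""
    ok_objects = max_per_object is None or all(
        len(s) <= max_per_object for s in strings_per_object.values())
    ok_strings = max_per_string is None or all(
        len(o) <= max_per_string for o in objects_per_string.values())
    return ok_objects and ok_strings


print(fits(1, 1))        # False: violates 0,1:0,1
print(fits(1, None))     # False: the name has two spellings, violates 0,1:0,n
print(fits(None, None))  # True: 0,n:0,n accepts both situations
```

Only 0,n:0,n accepts both the spelling variants and two titles that happen to share one string.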


(And in the RDF projection this makes no difference, as literal values 
do not have their own separate identity)


For the examples, I would replace the Little Red Riding Hood example 
with one that is complete, to avoid confusion with the scope note 
requirement of being represented completely.


How about:

>  The Accession Number (E42) of the J. Paul Getty Museum’s “Abduction of 
Europa” (E22) _/has symbolic content/_ “95.PB.7“



Good!


And for the file question, do you mean that the symbolic object is the 
MS Word file, which has a representable set of (binary) symbols,



No


or that the symbolic object is text which is incorporated within the 
file, but not verbatim (as the characters in the (e.g.) paragraph are 
likely to be represented in the file using a very different structure).



Right.

Best,

martin


Rob

*From: *Crm-sig  on behalf of Martin 
Doerr 

*Date: *Tuesday, November 6, 2018 at 6:46 AM
*To: *"crm-sig@ics.forth.gr" , Chrysoula Bekiari 


*Subject: *[Crm-sig] Issue 383 Homework

Dear All,

I had sent the below as new issue, but it is indeed the answer to 
Issue 383.


The question is how to deal with a file that is more specific in 
content, such as an MS Word file, but represents the character sequence 
that defines the content of the respective E90. Is it "is incorporated 
in", or a subproperty of it?


On 9/19/2018 11:09 PM, Martin Doerr wrote:

Here is my scope note:


  Pxxx has symbolic content

Domain:  E90 Symbolic Object

Range: E62 String

Quantification: many to many (0,n:0,n) ??

In CRM RDFS subproperty of: rdfs:value

Scope note:  This property associates an instance of E90
Symbolic Object with a complete, identifying representation of its
content in the form of an instance of E62 String. This property
only applies to instances of E90 Symbolic Object that can be
represented completely in this form. The representation may be
more specific than the symbolic level defining the identity
condition of the represented object. This depends on the type of the
symbolic object represented. For instance, if a name has type
"Modern Greek character sequence", it may be represented in a
loss-free Latin transcription, meaning however the sequence of
Greek letters. As another example, if the represented object has
type "English words sequence", American English or British English
spelling variants may be chosen to represent the English word
"colour" without defining a different symbolic object. If a name
has type "European traditional name", no particular string may
define its content.

Examples:

* The materials description (E33) of the painting (E22)  _/has
symbolic content/_ “Oil, French Watercolors on Paper, Graphite and
Ink on 

Re: [Crm-sig] New Issue: dimension intervals

2018-11-09 Thread Martin Doerr

On 11/9/2018 12:14 AM, Franco Niccolucci wrote:

Martin,

I agree with you, E60 Number is a jack-of-all-trades and can be a couple, a 
triple, whatever numeric value or set of values as long as it is clear what is 
what.

So for ancient/nonstandard/local units such as ft & inches or Roman cubitus I 
would add:

E58 Measurement Unit “ft&inches” P70 is documented in E31 Document “F.W Clarke, 
Weights Measures and Money of all Nations. Appleton & C. New York 1888”.

Incidentally, Prof. Clarke (from the U. of Cincinnati) wrote in the 
introduction “Our three sets of weights, our three different gallons, and our 
two dissimilar bushels, all unrelated to each other, or to the units of length, 
must soon give way before the simplicity and elegance of the metric system. 
That this event may soon happen [...] is the sincere wish and hope of the 
writer.” 130 years have passed since then, to no avail.

Thus, I would at least regard any such unit (system) as local or historical, 
and therefore needing a reference description: otherwise for me - and for any 
scientist - that value of 3 ft 6 inches could equally well be the distance of 
Alpha Centauri from the Earth, or the size of a bacterium.

Best

Franco

By the way, the reference to ISO 1000:1992 in the E58 scope note should be updated to 
ISO 80000-1:2009, which supersedes ISO 1000 and has been in force for some 10 years now; probably also 
referencing the so-called "BIPM SI Brochure" would be OK.

We'll update!

Best,

martin


Removing all reference to non-SI units from the scope note description would 
also be desirable: there is no such thing as “internationally recognized non-SI 
terms”, who gives this “international recognition” if not the BIPM?
Of course they may remain in the examples, together with the recommendation of 
preserving archaic measurement units.

F.

Prof. Franco Niccolucci
Director, VAST-LAB
PIN - U. of Florence
Scientific Coordinator
ARIADNEplus - PARTHENOS

Editor-in-Chief
ACM Journal of Computing and Cultural Heritage (JOCCH)

Piazza Ciardi 25
59100 Prato, Italy



On 8 Nov 2018, at 21:00, Martin Doerr wrote:

Dear Richard,

It requires a sort of datatype or encoding.

Assume unit = "ft&inches"
value = <3,6>

would that make sense?

In the xsd datatypes everything is in the value already.

best,

martin

On 11/8/2018 8:00 PM, Richard Light wrote:

While we're looking at this area, I would be grateful if we could also look at 
Value and Unit.

I have never understood how P90 and P91 are actually meant to be used together. I can see 
how a single E54 can be represented by a single P90 and a single P91, but how do we 
represent anything more complex?  An example would be "3 ft 6 inches".  Can 
that be an E54 Dimension, and if so how do you know which unit applies to which value?

Thanks,

Richard


On 07/11/2018 16:10, Martin Doerr wrote:

Dear All.

Continuing issue 363,

I propose the following:

"Whereas the CRM regards intervals of primitive values as primitive values in 
themselves, there is currently no corresponding practice in RDF. Therefore, in analogy to 
the properties of E52 Time-Span, we define in CRM RDFS two more subproperties of P90 has 
value: “P90a_has_lower_value_limit” and “P90b_has_upper_value_limit”. The precise 
guidelines for using these properties are to be given."
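
A minimal Python sketch of how the three properties might sit together on one E54 Dimension. Since the guidelines are still to be given, the behaviour here is an assumption: a missing limit falls back to the point value, and a lower limit above the upper limit is rejected. The class and field names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Dimension:
    unit: str                       # P91 has unit
    value: Optional[float] = None   # P90 has value
    lower: Optional[float] = None   # proposed P90a_has_lower_value_limit
    upper: Optional[float] = None   # proposed P90b_has_upper_value_limit

    def __post_init__(self):
        if self.lower is not None and self.upper is not None and self.lower > self.upper:
            raise ValueError("lower value limit exceeds upper value limit")

    def bounds(self) -> Tuple[Optional[float], Optional[float]]:
        """Return (low, high); assumed fallback to the point value if a limit is absent."""
        low = self.lower if self.lower is not None else self.value
        high = self.upper if self.upper is not None else self.value
        return low, high

    def contains(self, x: float) -> bool:
        low, high = self.bounds()
        return (low is None or low <= x) and (high is None or x <= high)


height = Dimension(unit="cm", lower=104.0, upper=108.0)
print(height.contains(106.7))   # True
print(height.contains(110.0))   # False
```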

Sensor arrays, increasingly in use, pose the issue of a single measurement 
resulting in an array of numbers which altogether form one quantitative 
statement about the observed. We can describe such structures easily as one 
complex type of unit (and define an IRI for it), and then regard the value as a 
matrix of numbers, in which each position obeys subunits as defined in the 
complex unit type.

Even if we regard complex matrices of numbers as one value for an instance of 
E54 Dimension, such as an RGB image, we can argue that minimal and maximal values 
exist as two separate matrices of the same structure.
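
The matrix case can be sketched in the same spirit: a value matrix with a lower and an upper matrix of identical structure, compared position by position. This is only an illustration in plain Python; the function names and the pixel readings are made up.

```python
from typing import List

Matrix = List[List[float]]


def same_shape(a: Matrix, b: Matrix) -> bool:
    """True if both matrices have the same number of rows and the same row lengths."""
    return len(a) == len(b) and all(len(ra) == len(rb) for ra, rb in zip(a, b))


def within_limits(value: Matrix, lower: Matrix, upper: Matrix) -> bool:
    """Check lower <= value <= upper at every position of the shared structure."""
    if not (same_shape(value, lower) and same_shape(value, upper)):
        raise ValueError("value and limits must share one structure")
    return all(
        lo <= v <= hi
        for row_v, row_lo, row_hi in zip(value, lower, upper)
        for v, lo, hi in zip(row_v, row_lo, row_hi)
    )


# One pixel per row; columns are R, G, B in 0..255 (made-up readings)
value = [[120, 64, 200], [10, 10, 10]]
lower = [[118, 60, 195], [8, 9, 9]]
upper = [[122, 68, 205], [12, 11, 11]]
print(within_limits(value, lower, upper))   # True
```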

Consequently I propose to deprecate P83 and P84, because they compete with an 
interval interpretation of P90, and:

Introduce instead Pxxx had duration, Domain: E52 Time-Span, Range: E54 
Dimension
and use P90, P90a, P90b as appropriate

or introduce an Exxx Temporal Duration, subclass of E54 Dimension, and define 
subproperties in RDFS ending in xsd:duration.
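
For the xsd:duration option, a short Python sketch of what the literal could look like. It formats a non-negative timedelta using only the day and time designators of the xsd:duration lexical form and ignores microseconds; the function name is hypothetical.

```python
from datetime import timedelta


def to_xsd_duration(delta: timedelta) -> str:
    """Format a non-negative timedelta as an xsd:duration string (days and time only)."""
    if delta < timedelta(0):
        raise ValueError("negative durations are not handled in this sketch")
    hours, remainder = divmod(delta.seconds, 3600)
    minutes, seconds = divmod(remainder, 60)

    date_part = f"{delta.days}D" if delta.days else ""
    time_part = ""
    if hours:
        time_part += f"{hours}H"
    if minutes:
        time_part += f"{minutes}M"
    if seconds:
        time_part += f"{seconds}S"

    if not date_part and not time_part:
        return "PT0S"  # zero-length duration
    return "P" + date_part + ("T" + time_part if time_part else "")


print(to_xsd_duration(timedelta(days=1)))      # P1D
print(to_xsd_duration(timedelta(hours=36)))    # P1DT12H
```

A minimum duration of one day, as in the Battle of Issos example, would then be the single literal "P1D" rather than a value and unit pair.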

See:
P83 had at least duration (was minimum duration of)

Domain: E52 Time-Span

Range: E54 Dimension

Quantification: one to one (1,1:1,1)

Scope note: This property describes the minimum length of time covered by an E52 Time-Span.

It allows an E52 Time-Span to be associated with an E54 Dimension representing its minimum duration (i.e. its inner boundary) independent from the actual beginning and end.


Examples:

* the time span of the Battle of Issos 333 B.C.E. (E52) had at least duration 
Battle of Issos minimum duration (E54) has unit (P91) day (E58) has value (P90) 
1 (E60)

  
In First Order Logic:


  

Re: [Crm-sig] New Issue: dimension intervals

2018-11-09 Thread Franco Niccolucci
Martin,

I agree with you, E60 Number is a jack-of-all-trades and can be a couple, a 
triple, whatever numeric value or set of values as long as it is clear what is 
what.

So for ancient/nonstandard/local units such as ft & inches or Roman cubitus I 
would add: 

E58 Measurement Unit “ft&inches” P70 is documented in E31 Document “F.W Clarke, 
Weights Measures and Money of all Nations. Appleton & C. New York 1888”.

Incidentally, Prof. Clarke (from the U. of Cincinnati) wrote in the 
introduction “Our three sets of weights, our three different gallons, and our 
two dissimilar bushels, all unrelated to each other, or to the units of length, 
must soon give way before the simplicity and elegance of the metric system. 
That this event may soon happen [...] is the sincere wish and hope of the 
writer.” 130 years have passed since then, to no avail. 

Thus, I would at least regard any such unit (system) as local or historical, 
and therefore needing a reference description: otherwise for me - and for any 
scientist - that value of 3 ft 6 inches could equally well be the distance of 
Alpha Centauri from the Earth, or the size of a bacterium.

Best

Franco

By the way, the reference to ISO 1000:1992 in the E58 scope note should be updated 
to ISO 80000-1:2009, which supersedes ISO 1000 and has been in force for some 10 years now; 
probably also referencing the so-called "BIPM SI Brochure" would be OK. 

Removing all reference to non-SI units from the scope note description would 
also be desirable: there is no such thing as “internationally recognized non-SI 
terms”, who gives this “international recognition” if not the BIPM?
Of course they may remain in the examples, together with the recommendation of 
preserving archaic measurement units.

F.

Prof. Franco Niccolucci
Director, VAST-LAB
PIN - U. of Florence
Scientific Coordinator
ARIADNEplus - PARTHENOS

Editor-in-Chief
ACM Journal of Computing and Cultural Heritage (JOCCH) 

Piazza Ciardi 25
59100 Prato, Italy

