[Crm-sig] Issue 490: how to model a file [HW reminder]

2023-09-15 Thread Martin Doerr via Crm-sig

Dear All,

Let me summarize the discussion about issue 490 between George, 
Christian-Emil and me, to be discussed in the next meeting:


"How to model a file" may be too vague.

There are three aspects:

A) What constructs are needed in the CRM ontologically to refer to the 
unique content of a file.


B) What constructs are needed to refer unambiguously to a resource that 
changes content. This is modeled in CRMpem as "Volatile Dataset", and 
will not be discussed in this issue.


C) How to connect in a knowledge base to a materialized content description.

About A):

We take a file (see also Persistent Dataset in CRMpem) in the sense of 
an immaterial E73 Information Object as a unique sequence of symbols 
that can be machine-encoded, regardless of which groups of bits 
constitute one of the symbols of interest in this object.

   in the KB: The intended identity can be represented by a URI.

We take a file in the sense of a material copy on a digital medium as a 
kind of "E25 Human-Made Feature", regardless of whether it is on a 
*local* installation, in a "*cloud*" cluster of machines, a *LOCKSS* 
federation of copies, or on a *removable* carrier.


    in the KB: We may refer to the material copy by an *external URL*, 
or create an *E62 String* in a KB or within an RDF file, or use a 
platform-internal "*BLOB mechanism*" with whatever kind of identifier 
the platform uses to refer to the local copy.


Ontologically, it is irrelevant for the intended immaterial content 
whether the copy is printed or scribbled on paper or stored on a digital 
medium (or even a Morse sound track), as long as the material form is 
unambiguous with respect to the intended content. Both paper and digital 
media can have errors. The CIDOC CRM v7.1 can be printed on paper and in 
principle be reentered manually into a file loss-free.


   in the KB: We may refer to a paper copy or a removable medium by an 
archival identifier.
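
As a minimal sketch of the distinction made under A), the immaterial 
content and its material copies can be kept apart as separate nodes in a 
KB. The example below uses plain Python tuples as triples. All example.org 
identifiers are hypothetical, and the class and property local names 
(E73_Information_Object, E25_Human-Made_Feature, P128_carries) are 
assumed to follow the usual CRM RDF naming. The class of the paper copy 
is deliberately left unstated, since the discussion above does not name 
one.

```python
# Sketch only: a tiny in-memory "knowledge base" of (subject, predicate, object) triples.
CRM = "http://www.cidoc-crm.org/cidoc-crm/"  # assumed CRM namespace
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Hypothetical identifiers, invented for this illustration.
content = "https://example.org/id/content/crm-7.1"      # URI for the immaterial content
digital_copy = "https://example.org/files/crm-7.1.pdf"  # external URL of a digital copy
paper_copy = "https://example.org/archive/item/A-1234"  # archival identifier of a paper copy

triples = {
    (content, RDF_TYPE, CRM + "E73_Information_Object"),
    (digital_copy, RDF_TYPE, CRM + "E25_Human-Made_Feature"),
    # Both copies carry the same immaterial content.
    (digital_copy, CRM + "P128_carries", content),
    (paper_copy, CRM + "P128_carries", content),
}


def carriers_of(content_uri, graph):
    """Return, in sorted order, every node stated to carry the given content."""
    return sorted(
        s for (s, p, o) in graph
        if p == CRM + "P128_carries" and o == content_uri
    )


if __name__ == "__main__":
    for carrier in carriers_of(content, triples):
        print(carrier)
```

The content URI never points at a file itself; only the carriers do, 
which is the separation the three "in the KB" notes describe.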


About C)

Whether we use an archival identifier for a paper copy or a removable 
digital medium, or a URL for a file on a machine, in all cases the 
maintainers of the archive must guarantee that the identifier remains 
uniquely connected with the content. Otherwise, using a URL in a KB is 
simply inadequate.
The DOI organisation foresees penalties for users who change the content 
behind a URL associated with a DOI. There is no other solution.


DOI *automatically redirects* from the DOI URI to the guaranteed URL.

The property P190 has symbolic content is used to connect a 
machine-encodable information object to a KB-internal string. 
*Similarly*, we want to refer to the content of an information object 
via an *external* copy, digital or not, by means of a *URL or archival 
identifier*. Therefore we propose the following property:


**New proposal:**

*Pxxx has representative copy*

Domain:

E90 Symbolic Object

Range:

E25 Human-Made Feature

Subproperty of:

E90 Symbolic Object. P128i is carried by (carries): E18 Physical Thing

Quantification:

many to many (0,n:0,n)

Scope note:

This property associates an instance of E90 Symbolic Object with a 
complete, identifying representation of its content in the form of a 
sufficiently readable instance of E25 Human-Made Feature, including, in 
particular, representations on electronic media, regardless of whether 
they reside internally in clusters of electronic machines, such as in 
so-called cloud services, or on removable media.


This property only applies to instances of E73 Information Object that 
can be completely represented by discrete symbols, in contrast to 
analogue information. The representing object may be more specific than 
the symbolic level defining the identity condition of the represented 
object. This depends on the type of the information object represented. 
For instance, if a text has type "Sequence of Modern Greek characters and 
punctuation marks", it may be represented in a formatted file with 
particular fonts on a particular machine, meaning however only the 
sequence of Greek letters. Any additional analogue elements contained in 
the representing object will not be regarded as part of the represented 
object.


As another example, if the represented object has type "English words 
sequence", American English or British English spelling variants may be 
chosen to represent the English word "colour" without defining a 
different symbolic object.


In a knowledge base, typically, the represented object will appear as a 
URI without a corresponding file, whereas the representing one may 
appear by the URL of a binary encoded file existing outside the 
knowledge base proper, or by the archival identifier of a paper edition. 
A URL for identifying the copy itself in a knowledge base should only be 
used as long as the providers support the persistence of that copy under 
this URL, as is current practice for "Linked Open Data". Associating 
the referred copy with a checksum in the knowledge base may help 
safeguard the maintainers against unexpected change of content under 
this URL. If more tha
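
The checksum safeguard mentioned in the scope note could be sketched as 
follows. This is an illustration under stated assumptions, not part of 
the proposal: the record layout, the example.org identifiers and the 
choice of SHA-256 are all hypothetical, and the bytes of the copy are 
passed in directly rather than fetched from the URL.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of the given bytes as a hex string."""
    return hashlib.sha256(data).hexdigest()


def copy_unchanged(fetched: bytes, recorded_checksum: str) -> bool:
    """True if the bytes currently found under the URL match the recorded checksum."""
    return sha256_hex(fetched) == recorded_checksum


# Stand-in for the bytes of the representative copy at registration time.
original_bytes = b"stand-in bytes of the representative copy"

# Hypothetical KB record linking the represented object to its copy.
record = {
    "represented": "https://example.org/id/content/crm-7.1",  # URI without a file
    "representative_copy": "https://example.org/files/crm-7.1.pdf",  # external URL
    "sha256": sha256_hex(original_bytes),  # checksum stored in the KB
}

if __name__ == "__main__":
    # Same bytes: the URL still delivers the registered content.
    print(copy_unchanged(original_bytes, record["sha256"]))
    # Altered bytes: the content under the URL has changed.
    print(copy_unchanged(original_bytes + b"!", record["sha256"]))
```

A mismatch only signals that the bytes differ. Whether the symbolic 
object itself has changed still depends on the type of the represented 
object, as the scope note explains.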

[Crm-sig] Fwd: Issue 316: coreference statements to CRMinf

2023-09-15 Thread Eleni Tsouloucha via Crm-sig
*post by Martin concerning issue 316*


Dear All,

As Belief Adoption has been completely redesigned, the previous proposals
would interfere with the current solution. Also, I think a coreference
statement should primarily be a form of belief adoption, saying that
several propositions in different sources or at different places in the
same source refer to one item, identified by one URI. This means that we
believe that the author or authors meant one real item, whatever it was.
Therefore we would need some sort of mark-up on the text passages, and
one URI for this assumed real-world existence. The question is whether we
need a construct "identified as whatever was meant by the author at xxx"
to assign an identity to a URI with such a reservation with respect to
other known items.

Another question would be a "possibly same" statement between two URIs, or
a "possibly not same" one.

Best,

Martin
___
Crm-sig mailing list
Crm-sig@ics.forth.gr
http://lists.ics.forth.gr/mailman/listinfo/crm-sig