Re: [Crm-sig] Question: How to model a 'file'

2021-03-02 Thread Martin Doerr

Dear Daria,

I have the impression you never got an answer to this question, because 
it was filed by mistake under ISSUE 490, "How to model a file"!


The answer to your question is:

step 1 - modeling
step 2 - decision

are Activity Types (E55 Type). "Step 1" and "step 2" give positions in a 
framework schema, which is another E55 Type, "Daria's Modeling Workflow".


If you instantiate step 1 - modeling, you create a particular instance 
of E7 Activity that has type "Modeling (Step 1)", with a label such as 
"my first attempt".

The same holds for step 2.

Then you have the generic instruction "return to step 1", which means 
that you create a new instance of E7 Activity that has type 
"Modeling (Step 1)", with a label such as "my second attempt".


All the framework schema tells you is which inputs of the first attempt 
to reuse in the second, and what to change.
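
As an illustration only, the pattern described here can be sketched with 
plain Python tuples standing in for RDF triples. The "ex:" identifiers, the 
"ex:position_in" link between a step type and the workflow, and the helper 
names are assumptions made for this sketch; they are not CIDOC CRM terms.

```python
# Minimal sketch: plain tuples stand in for RDF triples (no RDF library).
# All "ex:" identifiers and the "ex:position_in" link are hypothetical.

triples = set()

def add(subject, predicate, obj):
    """Store one triple."""
    triples.add((subject, predicate, obj))

# The workflow and its steps are types (E55 Type), not activities.
add("ex:DariasModelingWorkflow", "rdf:type", "crm:E55_Type")
for step in ("ex:Modeling_Step1", "ex:Decision_Step2"):
    add(step, "rdf:type", "crm:E55_Type")
    add(step, "ex:position_in", "ex:DariasModelingWorkflow")  # assumed link

def perform(activity_id, step_type, label):
    """Record one concrete execution of a step as an E7 Activity."""
    add(activity_id, "rdf:type", "crm:E7_Activity")
    add(activity_id, "crm:P2_has_type", step_type)
    add(activity_id, "rdfs:label", label)
    return activity_id

perform("ex:activity/1", "ex:Modeling_Step1", "my first attempt")
perform("ex:activity/2", "ex:Decision_Step2", "decision on first attempt")
# "Return to step 1" does not reopen activity/1; it creates a new activity
# of the same type.
perform("ex:activity/3", "ex:Modeling_Step1", "my second attempt")

def activities_of_type(step_type):
    """List all activities that were typed with the given step type."""
    return sorted(s for (s, p, o) in triples
                  if p == "crm:P2_has_type" and o == step_type)

print(activities_of_type("ex:Modeling_Step1"))
```

The loop in the workflow never appears in the data as a loop: it shows up 
only as two distinct activities sharing one type.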


Please let me know if this answer is adequate!

Best wishes,

Martin

On 4/16/2020 11:30 AM, Дарья Юрьевна Гук wrote:

Dear friends,

I am not sure that I clearly understand the model of "Activity". Please 
check whether it covers a situation that repeats (as in reality):

step 1 - modeling
step 2 - decision
step 3 - return to step 1 (re-modeling, improvements or renovation)

The same thing exists in the digital world as in reality, and it is even 
fixed in state standards (in Russia).


With kind regards,
Daria Hookk

Senior Researcher of
the dept. of archaeology of
Eastern Europe and Siberia of
the State Hermitage Museum,
PhD, ICOMOS member

E-mail: ho...@hermitage.ru 
Skype: daria.hookk
https://hermitage.academia.edu/HookkDaria 



___
Crm-sig mailing list
Crm-sig@ics.forth.gr
http://lists.ics.forth.gr/mailman/listinfo/crm-sig



--
Dr. Martin Doerr
Honorary Head of the
Center for Cultural Informatics
Information Systems Laboratory
Institute of Computer Science
Foundation for Research and Technology - Hellas (FORTH)

N.Plastira 100, Vassilika Vouton,
GR70013 Heraklion, Crete, Greece

Vox: +30(2810)391625
Email: mar...@ics.forth.gr
Web-site: http://www.ics.forth.gr/isl



Re: [Crm-sig] Question: How to model a 'file'

2020-04-16 Thread Martin Doerr
...in addition, we should think about digital images of printed material 
providing the identity of content for a lost physical item...


Re: [Crm-sig] Question: How to model a 'file'

2020-04-16 Thread Martin Doerr

Dear George,

You are right, this is an open and important question.

I have repeatedly pointed to that issue in CRM-SIG Meetings, but with 
limited response so far.


One part of the discussion took place in CRMinf, about the equivalence of 
a file with a Proposition Set.


The other part came up when P190 has symbolic content was introduced. My 
remark that this must be discussed is not even in the minutes. Maybe I 
forgot a homework assignment to elaborate on this. Let me expand a bit on 
my current understanding here:


This: E41 Appellation p190 has symbolic content rdfs:Literal "file name 
value goes here" feeds only the name of the file, which identifies the 
file, into the Appellation. It could be just an rdf label, because the 
content of the Appellation is hardly ambiguous. On the other hand, 
reserving another node for the Appellation allows a type "filename" to be 
assigned to it. But a filename is not a good identifier in any case.


If the Digital Object is represented by a URI, e.g., a DOI, the 
remaining question is whether or not it resolves to, or can unambiguously 
be related to, external content.


If it does, then the identity of this Digital Object should be the 
"primitive" one, its binary identity. That is, a .pdf and a .doc of the 
same scientific publication would be different objects; even a change in 
the embedded metadata of a .doc would make it a different object.
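
One way to make "binary identity" concrete is to identify a file by a 
digest of its exact bytes. The sketch below assumes SHA-256 as the digest 
(my choice for illustration, not something proposed in this thread) and 
uses made-up byte strings in place of real files.

```python
import hashlib

def binary_identity(content: bytes) -> str:
    """Identify a digital object by its exact bytes (SHA-256 hex digest)."""
    return hashlib.sha256(content).hexdigest()

# Hypothetical byte strings standing in for real files of the same paper.
pdf_bytes = b"%PDF-1.7 ... same paper ..."
doc_bytes = b"\xd0\xcf\x11\xe0 ... same paper ..."
# The same .doc after a change to embedded metadata only.
doc_new_metadata = doc_bytes + b"<meta author='changed'>"

identities = {
    "paper.pdf": binary_identity(pdf_bytes),
    "paper.doc": binary_identity(doc_bytes),
    "paper_v2.doc": binary_identity(doc_new_metadata),
}

# Under binary identity these are three different objects, although a
# human reader would call all of them "the same publication".
print(len(set(identities.values())))
```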


If, however, we mean that the ontological identity is, for instance, that 
of the equivalence class of possible encodings of one particular 
publication following Springer rules or the like, then a URI pointing to 
a binary is misleading, because many files can represent the same 
publication. The different encodings will both /incorporate/ and 
/represent/ the respective publication, but neither property identifies 
the content.
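
The many-to-one relation between binaries and a publication can be 
pictured as follows. This is a rough sketch with invented identifiers 
("sha256:aaa", "ex:publication/42"); the function name is hypothetical and 
only illustrates why a single binary URI cannot stand for the whole 
equivalence class.

```python
from collections import defaultdict

# Hypothetical records: (binary identifier, publication it encodes).
encodings = [
    ("sha256:aaa", "ex:publication/42"),  # e.g. a .pdf
    ("sha256:bbb", "ex:publication/42"),  # e.g. a .doc of the same paper
    ("sha256:ccc", "ex:publication/7"),
]

# Group the binaries into equivalence classes, one class per publication.
by_publication = defaultdict(set)
for binary_id, publication in encodings:
    by_publication[publication].add(binary_id)

def binary_identifies(binary_id, publication):
    """True only if this binary is the sole known encoding of the publication."""
    return by_publication[publication] == {binary_id}

# Both binaries encode publication/42, so neither one identifies it.
print(binary_identifies("sha256:aaa", "ex:publication/42"))
print(binary_identifies("sha256:ccc", "ex:publication/7"))
```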


Therefore, a variation (not a subproperty) of P190 should do it. Again we 
have the problem that we need to form a common superclass with a 
Primitive Value.


Perhaps, once we have taken the great step of declaring some Primitive 
Values to be IsA Appellation, the most elegant solution would be to form a 
superclass of E62 String and Digital Object, and raise the range of P190 
to it. This would make clear that E62 String and Digital Object differ 
only in whether they are inside or outside the KB proper.
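
To visualise the proposed superclass, here is a small Python class 
hierarchy. It is a thought experiment, not an existing part of CIDOC CRM: 
the class names SymbolicContent, InlineString and ExternalDigitalObject 
and the example URI are invented for this sketch.

```python
from dataclasses import dataclass

class SymbolicContent:
    """Hypothetical common superclass; would serve as the range of P190."""

    def in_knowledge_base(self) -> bool:
        raise NotImplementedError

@dataclass(frozen=True)
class InlineString(SymbolicContent):
    """Stands in for E62 String: the content is stored inside the KB."""
    value: str

    def in_knowledge_base(self) -> bool:
        return True

@dataclass(frozen=True)
class ExternalDigitalObject(SymbolicContent):
    """Stands in for a Digital Object: the KB only holds a URI to it."""
    uri: str

    def in_knowledge_base(self) -> bool:
        return False

def has_symbolic_content(subject: str, content: SymbolicContent):
    """Build a P190 statement whose range is the assumed superclass."""
    return (subject, "crm:P190_has_symbolic_content", content)

statements = [
    has_symbolic_content("ex:appellation/1", InlineString("report.docx")),
    has_symbolic_content("ex:object/1",
                         ExternalDigitalObject("https://example.org/files/1")),
]
print([content.in_knowledge_base() for (_, _, content) in statements])
```

Both kinds of value are accepted by the same property; the only 
difference the sketch records is whether the content lives in the KB.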


If we do that, the range of P190 will again point to a URI, which, in 
this case, must be either the binary or a lower representation than the 
level of symbolic specificity given for the domain instance. In any 
case, we should arrive at a "tangible" binary, and a suitable type may be 
useful to distinguish whether the URI is meant to correspond to a real 
binary (even if it is no longer extant!) or to a higher level.


We should also answer the question of how this translates to analogue 
content, because we may copy files manually and re-encode them.


After that, we should think about Propositional Objects represented in 
files...


Any thoughts?

Best,

Martin

Re: [Crm-sig] Question: How to model a 'file'

2020-04-16 Thread Дарья Юрьевна Гук
Dear friends,


I am not sure that I clearly understand the model of "Activity". Please 
check whether it covers a situation that repeats (as in reality):

step 1 - modeling
step 2 - decision
step 3 - return to step 1 (re-modeling, improvements or renovation)

The same thing exists in the digital world as in reality, and it is even 
fixed in state standards (in Russia).

With kind regards,
Daria Hookk

Senior Researcher of
the dept. of archaeology of
Eastern Europe and Siberia of 
the State Hermitage Museum,
PhD, ICOMOS member

E-mail: ho...@hermitage.ru
Skype: daria.hookk
https://hermitage.academia.edu/HookkDaria


[Crm-sig] Question: How to model a 'file'

2020-04-15 Thread George Bruseker
Dear all,

Here is another humble modelling problem for which I don't feel that there
is a commonly agreed and documented answer, although it is a common
question. How do we connect an actual file with the semantic network? So
here is the scenario.

I have a file: a Word doc, a JPG image, a PowerPoint. I want to represent
it in CIDOC CRM and connect it to the semantic network, and do so in a way
that would be interoperable with all other well-formed instances of CIDOC
CRM. How do I do that?

Well, part of the answer is clear; part is unclear. Regarding the
representation of the fact that there is a digital object, we have two
choices. If we use pure CRMbase then we have

E73 p2 has type E55 "Digital Object"

If we use CRM extensions then we have

D1 Digital Object

Great. Now in the semantic network we can relate this in all sorts of
standard ways to other entities (p67 refers to, p129 is about, etc.). We
can use a creation event from CRMbase or a digital machine event from
CRMdig to document when the file was created, by whom, etc. Super. I can use
p1 is identified by E41 Appellation to indicate the name of that digital
object (which may differ from the file name) and give it a type with p2 has
type. All standard and wonderful.
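
For readers who prefer to see the two typing options side by side, here is
a rough sketch using Python tuples in place of RDF triples. The "ex:"
identifiers and the function name are invented for illustration; only the
class and property names come from the text above.

```python
# Sketch only: tuples stand in for RDF triples; "ex:" names are hypothetical.

def type_digital_object(node, use_crmdig):
    """Return the typing triples for a node that stands for a file."""
    if use_crmdig:
        # Option 2: the class from the CRMdig extension.
        return [(node, "rdf:type", "crmdig:D1_Digital_Object")]
    # Option 1: pure CRMbase, an E73 typed through an E55 Type.
    return [
        (node, "rdf:type", "crm:E73_Information_Object"),
        (node, "crm:P2_has_type", "ex:type/digital_object"),
        ("ex:type/digital_object", "rdf:type", "crm:E55_Type"),
        ("ex:type/digital_object", "rdfs:label", "Digital Object"),
    ]

base = type_digital_object("ex:file/1", use_crmdig=False)
dig = type_digital_object("ex:file/1", use_crmdig=True)
print(len(base), len(dig))
```

Either way the node only stands for the file; nothing in these triples yet
says where the bytes are.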

I still have to put the file itself, that actual digital object which I
want my user to be able to find and manipulate somehow in relation to the
semantic network.

How do people tend to do that? I have seen many variations but no common
method.

So what is the go-to solution and should it perhaps be documented on the
CIDOC CRM site because it is a really common pattern?

I have seen

the file = E73... just put the file as the URN of the semantic node. But
then this means your file is accessible via a URN, which is often not the
case, and anyhow you probably want to distinguish your semantic node, which
'stands for' the file, from the actual file itself.

I have seen and used E41 Appellation as a pattern: the D1 or E73 p1 is
identified by E41 Appellation p190 has symbolic content rdfs:Literal "file
name value goes here". The problem here is that you then also need to
store, somehow, a path by which to reach that file on some file system.
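
A small sketch of this pattern and of its weakness, again with tuples in
place of triples. The file paths and "ex:" identifiers are made up; the
point is only that the file name stored via p190 drops the directory part.

```python
from pathlib import PurePosixPath

# Hypothetical files: same name, different locations on a file system.
paths = [
    PurePosixPath("/archive/2019/report.docx"),
    PurePosixPath("/archive/2020/report.docx"),
]

def appellation_triples(node, path):
    """Name a file node with an E41 Appellation holding only the file name."""
    appellation = f"{node}/filename"
    return [
        (node, "crm:P1_is_identified_by", appellation),
        (appellation, "rdf:type", "crm:E41_Appellation"),
        (appellation, "crm:P190_has_symbolic_content", path.name),
    ]

graph = []
for index, path in enumerate(paths, start=1):
    graph += appellation_triples(f"ex:file/{index}", path)

# Collect the literal names that ended up in the graph.
names = {o for (_, p, o) in graph if p == "crm:P190_has_symbolic_content"}
print(names)  # two distinct files, but only one distinct name
```

Two different files end up with the same symbolic content, so the name
alone neither locates nor distinguishes them.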

I guess another alternative would be to use p190 has symbolic content and
then throw the file in there as a blob. I don't particularly like this
solution, as I would hope to find strings at the end of p190 and not blobs.

Would a subproperty of p190, 'is encoded in file', maybe be an option in
order to use the blob solution?

Anyhow, maybe there are already better solutions than the ones I lay out
above; I would be interested to hear of them. I also think it would be great
to identify the best practice and put it on the main site so that people
follow this strategy consistently.

Probably my examples hide multiple use cases requiring different patterns.
Anyhow, what do you think?

Best,

George