Re: [Crm-sig] ISSUE: E13 Attribute Assignment

Maximilian Schich Sat, 24 Mar 2018 18:15:15 +0200

Dear Martin,

My "recommendation" was just putting into question an aspect of Florian's suggestion, and not meant to replace it in a final way.

Regarding your points: The practical cases I am familiar with would use the E13 on the whole triple, i.e. the link/property-type including a specific source node and a particular target node. This means either the triple is stored as a quad, or the triple carries an ID or address, so one can refer to it. TEI standoff markup would be another practical example.
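
The second option (a triple that carries an ID one can refer to) could be sketched roughly as follows. This is only an illustration: the names `Statement`, `AttributeAssignment`, and `make_statement` are hypothetical and not part of the CIDOC CRM or of any library.

```python
from dataclasses import dataclass
from itertools import count

_ids = count(1)  # hypothetical ID generator, used only in this sketch


@dataclass(frozen=True)
class Statement:
    """A whole triple (source node, property type, target node) with its own ID."""
    subject: str
    predicate: str
    obj: str
    sid: int


@dataclass(frozen=True)
class AttributeAssignment:
    """E13-like record: which actor asserted which statement (addressed by ID)."""
    actor: str
    about: int  # ID of the Statement being attributed


def make_statement(subject: str, predicate: str, obj: str) -> Statement:
    """Create a triple and give it a fresh ID so it can be referred to."""
    return Statement(subject, predicate, obj, next(_ids))


residence = make_statement("Martin", "has residence", "Heraklion")
e13 = AttributeAssignment(actor="Martin", about=residence.sid)
```

The attribution points at the triple as a whole, not at its source or target node, which is the role a quad's fourth element would play in the first option.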

As an art historian/archaeologist and hopeless class-conceptualist, I do not believe in trusted sources. Everything comes with a probability. :)

a) Self-description is of course never perfect, yet depends on the density of information: A signature, as in "Martin performed Attr.Ass.512" or "A[lbrecht] D[ürer] fecit" is only one form of (self-)descriptive information, which is as good or bad as anything else, internal or external. Of course, it is better to see Dürer in detail or to hear Anne-Sophie Mutter actually play, rather than relying on a verbal statement of attribution.

b) I don't understand: Any graph-like description of a graph constitutes a forest of graphs with the original graph, i.e. a disconnected graph that contains the description of itself. If we generalize that statement to symbolic representation, you are in essence saying description is impossible.

c) I think in most cases "description within a set of information about its provenance" is the only thing we have. There is no default up the next source of source. Evolutionary biologists, material scientists, art historians working on renaissance drawings, and scholars of ancient manuscripts all rely on hysteresis, i.e. history of the object contained within the object. There never was a comprehensive DNS for organisms, manuscript fragments, or paintings, and there never will be. For the same reason we need to embed provenance in our data sets. Probably we should even block-chain it in with enough information, so we don't have to rely on simple signatures.

*/To make my case much simpler and shorter: "All of Wikipedia includes the full edit history"./* This is how it is produced, and how it should be analyzed. The same standard should apply to any cultural heritage data set. Any other practice would be like citing monographs without pagination. This is why E13 is really central, particularly in multi-authored data sets.



On 2018-03-24 15:01, Martin Doerr wrote:

Dear Maximilian,
This makes sense to me, but I do not agree with your recommendation as a general rule.
There is a fundamental epistemological problem, which has nothing to do with quantitative evidence. The latter, by the way, cannot detect an endless recursion anyhow, because people would break it.
The ramifications of this breaking are huge, as can be seen by your answer.
Let us start with a more fundamental construct, a simple CRM-compatible "knowledge graph" with one attribute:
"Martin" has residence "Heraklion".

Using an E13,
"Martin" performed "Attr.Ass.512".
has type: "has residence"
assigned: "Heraklion"
assigned to: "Martin"
Now, reading it, I know who the knowledge graph wants me to believe said "has residence", but I do not know who introduced these three more attributes. So, I reify the three new attributes with 9 more, and I am still none the wiser, nor will I be with any further iteration of it.
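
The arithmetic of this regress can be sketched in a few lines. The sketch takes the counting used above at face value (each reified attribute introduces three new attributes: 1, then 3, then 9); the function name and the `fanout` parameter are hypothetical.

```python
def unattributed_attributes(level: int, fanout: int = 3) -> int:
    """Attributes newly introduced at a reification level, none of which
    yet says who asserted it. Level 0 is the original single attribute."""
    if level < 0:
        raise ValueError("level must be non-negative")
    return fanout ** level


# Each level attributes the previous one but adds `fanout` times as many
# new attributes, so the count of unattributed attributes never reaches zero.
for level in range(4):
    print(level, unattributed_attributes(level))
```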
If I know that the knowledge graph *was produced by Martin as a trusted source as a whole*, I do not need the E13 in it.
Then, I can add metadata to the whole knowledge graph, e.g., as a Named Graph or "context" or on paper etc., but I am still in the same situation: who produced these metadata, are they trusted?
Hence, I conclude three things:
a) There is no completely self-descriptive information. The trusted source ("sender of the message" in Claude Shannon's sense) lies outside the information unit. It must always be the default. In order to characterize the default, we need semantics different from E13.
b) It makes no sense to describe the default in the graph itself.
c) Any description within a set of information about its provenance pushes the level where the default applies up to the next source of source. Hence, if a team decides to register actions of their members, the team as a whole pushes the default up to the trust in the registration, rather than in the primarily registered. I see all your examples as practices of this kind. There may be many reasons to do this, but in other cases also not to do it.
Such a rule cannot replace understanding the basic epistemology, which is always the same.
Does that make sense :-)?

All the best,

Martin



On 3/24/2018 12:10 PM, Maximilian Schich wrote:
Dear Florian and all,

Based on quantitative evidence, I'd object to the following part of your
suggestion:

"This fact must not individually be registered for all instances of properties
provided by the maintaining team, because it */would result in an endless recursion/* of
whose opinion was the description of an opinion."
=> This would only be correct if the maintaining team would add additional E13 Attribute Assignments to their own E13 statements. Otherwise, */in practice, the data would (a) more or less double, plus (b) acquire a non-exploding truncated tail of additional E13 correction statements, where the maintaining team corrects itself./*
=> Example for (a): In large data sets such as the "Census of Antique Works of Art and
Architecture", the "record history" approximately doubles the data set as a whole. Note: The
Census "record history" is the place where the maintaining team records their own E13-like
/attribute assertions/ (aka /assertions of database record authorship/). It is important to point out that the record
history, where an internal database curator implicitly claims authorship for, say, an artist attribution in the
Census, is conceptually in no way different from an external author providing a differing opinion (both usually
have PhDs in art history). Ergo there are two default cases: (1) The internal database curator claims authorship
for a */direct assertion/* via a single E13 Attribute Assignment in the record history; (2) The internal
database curator claims authorship for a */cited assertion/* via an E13 Attribute Assignment in the record
history on top of the */original assertion/* that connects the stated opinion to its external source via another
E13 Attribute Assignment.

=> Example for (b): In large data sets where the multiplicity of opinion is 
recorded, the number of competing assertions including both record history and 
external opinions, is usually characterized by a tailed frequency distribution*. 
This usually means in practice that the data set stays in the same order of 
magnitude relative to the case where the maintaining team decides to follow one of 
the alternative assertions.**
* The frequency distributions would look similar to Schich 2010 "Revealing 
Matrices" Fig. 14-8. Indeed, my pre-publication version of this figure had a column 
for the record history, not included in the article, as the networks were too large for 
the preceding figure.
** Yes, we should expect some "assertion cascades" to be exceedingly large, but we can
also expect the median cascade length to be very short, between 1 and 2 in cultural heritage
databases based on personal experience, and still short in very large scale cases, such as
spreading rumors on the Web (cf. Friggeri et al. 2014 "Rumour cascades" Fig. 5).
=> The recommendation, in my opinion, should be: */By default, the maintaining team should establish authorship by adding an E13 Attribute Assignment to each assertion in the data set. Yet, the maintaining team should _only_ add an E13 Attribute Assignment to their own E13 Attribute Assignments in the case of discernible modifications, updates, or corrections. To avoid comment cascades, such alternative E13 statements should be made in parallel(!), not recursively.*** This recommended procedure establishes a record history and a granular ability to cite data set contributions by author, yet also avoids a recursive explosion of E13 statements./*
*** "Parallel" means that E13 statements in the internal record history should never
be about statements in the record history itself. This can easily be maintained 
with users being logged in or recorded via IP and timestamp. Working example: 
The Wikipedia edit history.
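
A minimal sketch of the "parallel, not recursive" rule, under stated assumptions: the class `RecordHistory`, its methods, the ID scheme, and the sample assertions are all hypothetical and made up for illustration; they do not describe the Census or Wikipedia implementation.

```python
from dataclasses import dataclass, field


@dataclass
class RecordHistory:
    """Hypothetical record history accepting only 'parallel' E13-like entries."""
    assertions: dict[str, str] = field(default_factory=dict)  # assertion ID -> content
    entries: list[tuple[str, str, str]] = field(default_factory=list)  # (entry ID, author, target ID)

    def add_assertion(self, assertion_id: str, content: str, author: str) -> None:
        """Store an assertion together with one entry establishing its authorship."""
        self.assertions[assertion_id] = content
        self.attribute(author, assertion_id)

    def attribute(self, author: str, target_id: str) -> str:
        """Add an E13-like entry about an assertion; reject entries about entries."""
        # Parallel rule: the target must be an assertion, never a history entry.
        if target_id not in self.assertions:
            raise ValueError(f"{target_id!r} is not an assertion in the data set")
        entry_id = f"E13-{len(self.entries) + 1}"
        self.entries.append((entry_id, author, target_id))
        return entry_id


history = RecordHistory()
history.add_assertion("A1", "made-up sample assertion 1", author="curator_1")
history.add_assertion("A2", "made-up sample assertion 2", author="curator_1")
# A correction by a second curator sits next to the first entry for A1,
# not on top of it.
correction = history.attribute("curator_2", "A1")
```

With this guard, every assertion gets one authorship entry (the data roughly doubles), corrections add a short tail, and an entry about another entry is refused, so no recursion can start.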


Hope this makes sense.

Best, Max
--
--------------------------------------------------------------
  Dr. Martin Doerr              |  Vox:+30(2810)391625        |
  Research Director             |  Fax:+30(2810)391638        |
                                |  Email:mar...@ics.forth.gr  |
                                                              |
                Center for Cultural Informatics               |
                Information Systems Laboratory                |
                 Institute of Computer Science                |
    Foundation for Research and Technology - Hellas (FORTH)   |
                                                              |
                N.Plastira 100, Vassilika Vouton,             |
                 GR70013 Heraklion,Crete,Greece               |
                                                              |
              Web-site:http://www.ics.forth.gr/isl            |
--------------------------------------------------------------


_______________________________________________
Crm-sig mailing list
Crm-sig@ics.forth.gr
http://lists.ics.forth.gr/mailman/listinfo/crm-sig


--
*Dr. Maximilian Schich*
Associate Professor, Arts & Technology
Founding member, The Edith O'Donnell Institute of Art History

*/The University of Texas at Dallas/*
800 West Campbell Road, AT10
Richardson, Texas 75080 – USA
US phone: +1-214-673-3051
EU phone: +49-179-667-8041

www.utdallas.edu/atec/schich/ <http://www.utdallas.edu/atec/schich/>
www.schich.info <http://www.schich.info/>
www.cultsci.net <http://www.cultsci.net/>

Current location: Dallas, Texas
