Dear All,

I believe that TX5 Reading and TX6 Transcription should stand in a different relationship to each other.

In more detail, I propose to rename TX5 Reading to "TX5 Text Recognition", and to strictly separate, ontologically, observation from the inferred interpretation of meaning, since TX5 Reading is declared as a subclass of Observation and TX6 Transcription is not.

Note that one can perfectly well "read" a clear text written in a known script without understanding a single word. E.g., I can copy well-written or printed Chinese Han characters without understanding any Chinese, just by knowing the relevant structural features. I assume the same holds for cuneiform. Equally, I can copy a Latin inscription without understanding any of the abundant abbreviations. This is the observation proper.

Whether or not the result of this "reading" is documented in the same script and notation is a detail left to the reader. I'd argue, however, that the class TX5 *needs* a formal output, an instance of E90 Symbolic Object at least, in order to be useful. This is missing in the current model. Transcription in the sense of changing script or notation could be an internal, undocumented intermediate step of the text recognition ("transcribing text recognition", or adequate output properties), or an explicit step after the recognition of the Symbolic Object.

It is obviously true that text recognition typically involves arguments of understanding. I'd argue that this is *not* intrinsic to reading, but only applies to texts not clearly typed. Strictly speaking, any such process constitutes *ERROR CORRECTION* and text *COMPLETION*.

Therefore, I propose a new class "Meaning Comprehension", which would take *as input a recognized text* and interpret an assumed meaning in plain language, or even in formal propositions. This would be the final stage of the reading process, resulting in an information object. This class may reside in CRMinf or in CRMtex.

From "Text Recognition", "Transcription" and "Meaning Comprehension" we can then build combined and shortcut constructs, which would include "error correction", "resolution of recognition ambiguity" and "missing part completion", as is useful in practice for representing typical scholarly defaults.
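
To make the proposed separation concrete, here is a minimal sketch in Python. It is only an illustration under my own assumptions, not a CRM encoding: all class, function and parameter names are hypothetical stand-ins, and the sign mapping is an arbitrary example. It shows that recognition and transcription operate on signs alone, while the meaning enters only in the last step.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SymbolicObject:
    """Hypothetical stand-in for an instance of E90 Symbolic Object."""
    signs: str
    script: str


@dataclass(frozen=True)
class InformationObject:
    """Hypothetical stand-in for the output of "Meaning Comprehension"."""
    assumed_meaning: str
    based_on: SymbolicObject


def text_recognition(observed_signs: str, script: str) -> SymbolicObject:
    """Observation only: record the signs as seen, without interpretation."""
    return SymbolicObject(signs=observed_signs, script=script)


def transcription(source: SymbolicObject, mapping: dict[str, str],
                  target_script: str) -> SymbolicObject:
    """Change script or notation sign by sign; unmapped signs stay unchanged."""
    converted = "".join(mapping.get(sign, sign) for sign in source.signs)
    return SymbolicObject(signs=converted, script=target_script)


def meaning_comprehension(recognized: SymbolicObject,
                          interpretation: str) -> InformationObject:
    """Attach an assumed meaning, supplied by the interpreter, to a recognized text."""
    return InformationObject(assumed_meaning=interpretation, based_on=recognized)


# Recognition and transcription need no understanding of the text.
recognized = text_recognition("λογος", "Greek")
latin = transcription(
    recognized, {"λ": "l", "ο": "o", "γ": "g", "ς": "s"}, "Latin"
)
# Only this step involves an interpretation of meaning.
understood = meaning_comprehension(recognized, "word")
```

A shortcut construct in the above sense would simply be a composition of these steps whose intermediate results are not documented.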

I'd argue that the resolution of linguistic ambiguity using scholarly arguments about the likely context of reference of the text constitutes a scholarly interpretation process after "reading", regardless of whether error correction and completion used such arguments.

We need these separations in order to create a clear interface to "Belief Adoption" in CRMinf, which is about the assumed real-world truth of statements in texts.

Opinions?

All the best,

Martin


--
------------------------------------
 Dr. Martin Doerr
Honorary Head of the
 Center for Cultural Informatics
Information Systems Laboratory
 Institute of Computer Science
 Foundation for Research and Technology - Hellas (FORTH)
N.Plastira 100, Vassilika Vouton,
 GR70013 Heraklion,Crete,Greece
Vox:+30(2810)391625
 Email: mar...@ics.forth.gr
 Web-site: http://www.ics.forth.gr/isl

_______________________________________________
Crm-sig mailing list
Crm-sig@ics.forth.gr
http://lists.ics.forth.gr/mailman/listinfo/crm-sig
