[Crm-sig] NEW ISSUE: revise TX5 Reading versus TX6 Transcription

Martin Doerr via Crm-sig Mon, 06 Sep 2021 10:56:23 -0700

Dear All,

I believe that TX5 Reading and TX6 Transcription should stand in a different relationship.

In more detail, I propose to rename TX5 Reading to "TX5 Text Recognition", and to separate observation strictly, in ontological terms, from inferred interpretation of meaning, given that TX5 Reading is declared as a subclass of Observation and TX6 Transcription is not.

Note that one can perfectly "read" a clear text written in a known script without understanding any word. E.g., I can indeed copy well-written or printed Chinese Han characters without understanding any Chinese, just by knowledge of the relevant structural features. I assume the same holds for cuneiform. Equally, I can copy a Latin inscription without understanding any of the abundant abbreviations. This is indeed the proper observation.

Whether or not the result of this "reading" is documented in the same script and notation is a detail up to the reader. I'd argue, however, that the class TX5 *needs* a formal output, an instance of E90 Symbolic Object at least, in order to be useful. This is missing in the current model. Transcription in the sense of changing script or notation could be an internal, undocumented intermediate step of the text recognition ("transcribing text recognition", or adequate output properties), or an explicit step after the recognition of the Symbolic Object.

It is obviously true that text recognition typically includes arguments of understanding. I'd argue that this is *not* intrinsic to reading, but only applies to texts not clearly typed. Strictly speaking, any such process constitutes *ERROR CORRECTION* and text *COMPLETION*.

Therefore, I propose a new class "Meaning Comprehension", which would take *as input a recognized text* and interpret an assumed meaning in plain language, or even formal propositions. This would be the final stage of the reading process, resulting in an information object. This class may reside in CRMinf or in CRMtex.

From "Text Recognition", "Transcription" and "Meaning Comprehension" we can then build combined and short-cutting constructs, which would include "error correction", "resolution of recognition ambiguity" and "missing part completion", as is useful in practice for representing typical scholarly defaults.
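
As a rough illustration of the separation proposed above, the three activities can be sketched as plain Python data classes in which each step declares its input and output. This is only a sketch under stated assumptions: the field names (`signs`, `script`, `source`, `output`, `inscription_id`), the helper `read_chain` and the sample values are hypothetical and are not taken from CRMtex or CRMinf. Only the class names and the use of a Symbolic Object as the output of recognition follow the proposal.

```python
from dataclasses import dataclass
from typing import List, Optional, Union


@dataclass(frozen=True)
class SymbolicObject:
    """Stands in for an E90 Symbolic Object: the recognized signs."""
    signs: str
    script: str  # hypothetical label, e.g. "Latin"


@dataclass(frozen=True)
class InformationObject:
    """Stands in for the information object produced by comprehension."""
    content: str


@dataclass(frozen=True)
class TextRecognition:
    """Observation: signs are identified by their structural features only."""
    inscription_id: str  # hypothetical identifier of the inscribed carrier
    output: SymbolicObject


@dataclass(frozen=True)
class Transcription:
    """Optional explicit step: same signs in another script or notation."""
    source: SymbolicObject
    output: SymbolicObject


@dataclass(frozen=True)
class MeaningComprehension:
    """Interpretation: recognized text in, assumed meaning out."""
    source: SymbolicObject
    output: InformationObject


Step = Union[TextRecognition, Transcription, MeaningComprehension]


def read_chain(recognition: TextRecognition,
               comprehension: MeaningComprehension,
               transcription: Optional[Transcription] = None) -> List[Step]:
    """Return the steps in order, checking that each output feeds the next input."""
    steps: List[Step] = [recognition]
    current = recognition.output
    if transcription is not None:
        if transcription.source != current:
            raise ValueError("transcription must start from the recognized text")
        steps.append(transcription)
        current = transcription.output
    if comprehension.source != current:
        raise ValueError("comprehension must start from the preceding output")
    steps.append(comprehension)
    return steps


# Illustrative values only: a copied Latin abbreviation and its usual expansion.
recognized = SymbolicObject(signs="D M", script="Latin")
recognition = TextRecognition(inscription_id="example-1", output=recognized)
meaning = MeaningComprehension(
    source=recognized,
    output=InformationObject(content="Dis Manibus (to the spirits of the dead)"),
)
print([type(step).__name__ for step in read_chain(recognition, meaning)])
```

In this sketch the recognition step can be recorded without any comprehension step at all, which mirrors the point that a text can be copied without being understood. A "short-cutting construct" would correspond to a single record that hides the intermediate Symbolic Object.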

I'd argue that resolution of linguistic ambiguity using scholarly arguments about the likely context of reference of the text constitutes a scholarly interpretation process after "reading", regardless of whether error correction and completion used such arguments.

We need these separations in order to create a clear interface to "Belief Adoption" in CRMinf, which is about the assumed real world truth of statements in texts.


Opinions?

All the best,

Martin


--
------------------------------------
 Dr. Martin Doerr

Honorary Head of the

 Center for Cultural Informatics

Information Systems Laboratory

 Institute of Computer Science
 Foundation for Research and Technology - Hellas (FORTH)

N.Plastira 100, Vassilika Vouton,

 GR70013 Heraklion,Crete,Greece

Vox:+30(2810)391625

 Email: mar...@ics.forth.gr
 Web-site: http://www.ics.forth.gr/isl

_______________________________________________
Crm-sig mailing list
Crm-sig@ics.forth.gr
http://lists.ics.forth.gr/mailman/listinfo/crm-sig
