Dear All,
I believe that TX5 Reading and TX6 Transcription should stand in a
different relationship.
In more detail, I propose to rename TX5 Reading to "TX5 Text
Recognition" and to separate observation strictly, in ontological terms,
from the inferred interpretation of meaning, since TX5 Reading is
declared as a subclass of Observation and TX6 Transcription is not.
Note that one can perfectly "read" a clear text written in a known
script, without understanding any word. E.g., I can indeed copy
well-written or printed Chinese Han characters without understanding any
Chinese, just by knowledge of the relevant structural features. I assume
the same holds for cuneiform. Equally, I can copy a Latin inscription
without understanding any of the abundant abbreviations. This copying is
indeed the observation proper.
Whether or not the result of this "reading" is documented in the same
script and notation is a detail left to the reader. I'd argue, however,
that the class TX5 *needs* a formal output, an instance of E90 Symbolic
Object at least, in order to be useful. This is missing in the current
model. Transcription in the sense of changing script or notation could
be an internal, undocumented intermediate step of the text
recognition ("transcribing text recognition", or adequate output
properties), or an explicit step after the recognition of the Symbolic
Object.
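
As a rough illustration of this separation, here is a minimal Python
sketch. It is not a proposal for the formal model: all class and field
names (SymbolicObject, TextRecognition, "script", "observer") are
assumptions made up for the example, and only the distinction between
an observation with a formal output and a later transcription step is
taken from the text above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SymbolicObject:
    """Stands in for an instance of E90 Symbolic Object (hypothetical shape)."""
    content: str
    script: str  # illustrative attribute, e.g. "Han" or "Latin"


@dataclass(frozen=True)
class Observation:
    """Placeholder for the Observation superclass."""
    observer: str


@dataclass(frozen=True)
class TextRecognition(Observation):
    """Proposed 'TX5 Text Recognition': an observation with a formal output."""
    recognized: SymbolicObject


@dataclass(frozen=True)
class Transcription:
    """TX6 Transcription: changes script or notation; not an Observation."""
    source: SymbolicObject
    result: SymbolicObject


# Copying a character needs no understanding of its meaning.
reading = TextRecognition(observer="copyist",
                          recognized=SymbolicObject("山", "Han"))

# Transcription modelled as an explicit step after recognition.
explicit_step = Transcription(source=reading.recognized,
                              result=SymbolicObject("shan", "Latin"))
```

In this sketch the alternative, "transcribing text recognition", would
simply be a TextRecognition whose output is already in the target
script, with no separate Transcription instance recorded.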
It is obviously true that text recognition typically includes arguments
of understanding. I'd argue that this is *not* intrinsic to reading,
but only applies to texts that are not clearly typed. Strictly speaking,
any such process constitutes *ERROR CORRECTION* and text *COMPLETION*.
Therefore, I propose a new class "Meaning Comprehension", which would
take *as input a recognized text* and interpret its assumed meaning in
plain language, or even as formal propositions. This would be the
final stage of the reading process, resulting in an information object.
This class may reside in CRMinf or in CRMtex.
From "Text Recognition", "Transcription" and "Meaning Comprehension" we
can then build combined and shortcut constructs, which would include
"error correction", "resolution of recognition ambiguity" and "missing
part completion", as is useful in practice for representing typical
scholarly defaults.
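
To show what such a shortcut construct might look like, here is a small
Python sketch under stated assumptions. The function names
(text_recognition, meaning_comprehension, read), the Intervention
enumeration and the toy input are all hypothetical; the only point is
that the shortcut composes the two steps while still recording which
kind of intervention went into the recognized text.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, FrozenSet, Optional, Tuple


class Intervention(Enum):
    """Kinds of intervention named in the proposal (illustrative encoding)."""
    ERROR_CORRECTION = "error correction"
    AMBIGUITY_RESOLUTION = "resolution of recognition ambiguity"
    COMPLETION = "missing part completion"


@dataclass(frozen=True)
class RecognizedText:
    content: str
    interventions: FrozenSet[Intervention] = frozenset()


@dataclass(frozen=True)
class InformationObject:
    meaning: str


def text_recognition(signs: str, restored: Optional[str] = None) -> RecognizedText:
    """Pure observation unless a restored reading is supplied."""
    if restored is None:
        return RecognizedText(signs)
    return RecognizedText(restored, frozenset({Intervention.COMPLETION}))


def meaning_comprehension(text: RecognizedText,
                          interpret: Callable[[str], str]) -> InformationObject:
    """Takes a recognized text as input and yields an information object."""
    return InformationObject(interpret(text.content))


def read(signs: str, interpret: Callable[[str], str],
         restored: Optional[str] = None) -> Tuple[RecognizedText, InformationObject]:
    """Shortcut: recognition followed by comprehension, keeping both results."""
    recognized = text_recognition(signs, restored)
    return recognized, meaning_comprehension(recognized, interpret)


# Toy input with one illegible sign, completed by the reader.
recognized, info = read("hel_o", lambda t: f"greeting: {t}", restored="hello")
```

Returning the recognized text alongside the information object keeps the
intermediate result available, so the shortcut does not hide where
completion took place.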
I'd argue that resolution of linguistic ambiguity using scholarly
arguments about the likely context of reference of the text constitutes
a scholarly interpretation process after "reading", regardless of whether
error correction and completion used such arguments.
We need these separations in order to create a clear interface to
"Belief Adoption" in CRMinf, which is about the assumed real-world truth
of statements in texts.
Opinions?
All the best,
Martin
--
------------------------------------
Dr. Martin Doerr
Honorary Head of the
Center for Cultural Informatics
Information Systems Laboratory
Institute of Computer Science
Foundation for Research and Technology - Hellas (FORTH)
N.Plastira 100, Vassilika Vouton,
GR70013 Heraklion,Crete,Greece
Vox:+30(2810)391625
Email: mar...@ics.forth.gr
Web-site: http://www.ics.forth.gr/isl