Dear all,
having somehow started this discussion on a hot August evening, let me
remind you that the initial question was:

"When describing biographical information [in an archive] it’s common to
state that some person was fluent in some language, or languages, apart
from his/her native one. Using current archival description standards
[ISAD(G) 3.2.2; EAD <bioghist>] this is represented within a text, usually
a very long text string with information of distinct natures. So far we
have been able to decompose the different elements and represent them
adequately as instances of CIDOC-CRM classes and link them through the
suitable properties. We cannot link a Person (E21) to a language (E56),
nor use multiple instantiation, as has been suggested in other cases
(http://www.cidoc-crm.org/Issue/ID-258-p72-quantification), because Person
(E21) and Linguistic Object (E33) are disjoint."

I understand these bios consist of a text, to which metadata are added as
instances of various CIDOC-CRM classes. The question was how to indicate in
such metadata the knowledge of a language as reported in the bio: so not a
real quality of the person, but a documented fact.

My suggestion was to use E74 Group. I always prefer to use what is already
available and avoid the unnecessary proliferation of classes and
properties; in my opinion there are already (more than) enough. But in
doing so I try to maximize expressiveness, as otherwise one class (E1 CRM
Entity) and one property (P2 has type) would be sufficient for the whole
world: P2 is not a jack-of-all-trades.

Reportedly, the Group solution seemed to please the person who asked the
question. I don’t know whether "language spoken" is information usually
taken into account in CH; but in this case it was by the archivist,
otherwise no question would have been asked.

Best regards

Prof. Franco Niccolucci
Director, VAST-LAB
PIN - U. of Florence
Scientific Coordinator ARIADNEplus - PARTHENOS

Editor-in-Chief
ACM Journal of Computing and Cultural Heritage (JOCCH)

Piazza Ciardi 25
59100 Prato, Italy

> On 14 Oct 2019, at 22:39, Detlev Balzer <d...@balilabs.de> wrote:
>
> Dear George, Martin,
>
> this discussion made me curious whether or not I can confirm George's
> assertion that such statements are common in the cultural heritage field.
>
> EAC-CPF does have a language element, which is, however, only used to
> indicate in which language the name of a person or corporation is
> expressed.
>
> GND, the authority file for libraries in German-speaking countries, has a
> Language entity which is used for making statements about the "field of
> study" of a person. Other predicates for the person-language pair of
> entities do occur, but these are obvious data entry errors.
>
> Having extracted person-related data from a dozen or more cultural
> heritage projects, I don't remember any example where languages spoken or
> known by somebody have been considered in any other sense than relating
> to the documented activity, rather than to the (possibly un-instantiated)
> capacity of the person.
>
> Of course, this is just an observation that doesn't prove anything.
> Personally, I would tend towards Martin's view that there is little, if
> anything, to be gained by defining this kind of statement in a reference
> model such as the CIDOC CRM.
>
> Best wishes,
> Detlev
>
>> George Bruseker <george.bruse...@gmail.com> wrote on 14 October 2019 at
>> 19:45:
>>
>>
>> Dear Martin,
>>
>> The conversation began with a use case from an archive. I just note that
>> this is also found in all the projects I work on for memory institutions.
>> They find it in scope, so looking further afield for what anthropologists
>> do doesn't seem like a necessary step? Though highly fascinating!
>>
>> Best
>>
>> George
>>
>>
>>
>> On Mon, Oct 14, 2019, 6:58 PM Martin Doerr <mar...@ics.forth.gr> wrote:
>>
>>> Dear George, All,
>>>
>>> As a second thought:
>>>
>>> I think documentation formats such as LIDO are an adequate place to add
>>> such useful properties to characterize items in a more detailed way,
>>> which we would not put in the CRM analytically. Shapes, colors, etc.
>>> are typical examples.
>>>
>>> Question: Are there formats from the archival world that are used to
>>> describe the languages people speak? EAC-CPF?
>>> Libraries are interested in the languages someone publishes in, not
>>> the ones they speak.
>>>
>>> What are the anthropologists registering? Would they be interested in
>>> languages learned at school, or rather in the language used for
>>> communication in a typical group? Would they document people being
>>> incapable of communicating in that group?
>>> Or just infer language via group?
>>>
>>> How to distinguish native speakers from non-native?
>>>
>>> Would historians make cases of people that could not communicate in a
>>> given language, with societal effects?
>>>
>>> What about illiterate people? Speaking, not writing...? Maintaining oral
>>> history with great precision, etc.
>>>
>>> What about creoles?
>>>
>>> Best,
>>>
>>> Martin
>>>
>>> On 10/14/2019 7:33 PM, Martin Doerr wrote:
>>>
>>>
>>> Dear George,
>>>
>>> The first principle of all is: are there relevant queries that need that
>>> property for integrating disparate sources which indeed provide such
>>> data, and is that research one we would like to support with the CRM?
>>>
>>> Second, using P2 on E21 does the job, doesn't it? What is the added value
>>> of "knows language"?
>>>
>>> Next principle: keep the ontology small. Querying 1000 properties is
>>> already more than anybody can keep in mind. Each additional property has
>>> an implementation cost. We need strong arguments for relevance.
>>>
>>> It has been the most important success factor of the CRM to keep the
>>> ontology small and still expressive enough. If we lose this discipline,
>>> we will lose the whole project.
>>>
>>> Finally, we are not repeating in the CRM the way information systems
>>> typically document, but have always tried to find a more fundamental
>>> representation. With that argument, we could never have introduced
>>> events. They did NOT appear in any of the typical systems at that time.
>>> It is a principle *not* to model all the valuable description elements
>>> which are relevant to characterize an item but do not create interesting
>>> links across resources.
>>>
>>> I did not say that it is a personal opinion that someone speaks a
>>> language. I said this is observable. I document: Franco has spoken
>>> Latin, repeatedly? But talking about skills is another level; it
>>> introduces a quality which is hard to objectify, as Franco has pointed
>>> out. Actually, it is a typical classification problem, with all its
>>> boundary case questions, and the CRM is about relations between
>>> particulars.
>>>
>>> So, what is the *added value* against P2, and what are the typical
>>> research data and typical research questions for *integrating* such data
>>> that cannot be answered with P2?
>>>
>>> Best,
>>>
>>> Martin
>>>
>>>
>>>
>>>
>>> On 10/14/2019 4:24 PM, George Bruseker wrote:
>>>
>>> Dear Martin,
>>>
>>> Which is CEO’s proposition that you support? It gets lost in the string.
>>> Do you mean that a) a person speaking a language means being part of a
>>> group, or b) using P2 on E21 and then making types for ‘Speakers of...’?
>>>
>>> I am (still and very much) a supporter of a new property ‘knows
>>> language’. I do not think that the group solution works, because of the
>>> identity criteria of groups. I also don’t think the event solution is
>>> necessary (another suggestion that has floated in this conversation).
>>> It is often the case that for a person we do not know events of their
>>> acquisition or use of a language or a skill, but we do have the
>>> proposition that they had that language or skill! I also don’t support
>>> the ‘English Speakers’ type solution, since it provides a different URI
>>> than the URI for ‘English’ and forces more, and more obscure, modelling.
>>>
>>> Another CIDOC CRM principle is to model at the level of knowledge that is
>>> typically present in information systems. Again, I think the present case
>>> (people know languages) is identical to the case of
>>>
>>> E22 consists of E57 Material
>>>
>>> This is a typical piece of knowledge held about an object. It would be
>>> obtuse to insist that one should create an event node to indicate the
>>> manner of this material becoming the constituting material of the object
>>> when we don’t know this fact. This is why CRM represents such binary
>>> relations: because they are real, they are a level of knowledge and they
>>> are observable.
>>>
>>> If someone has entered into an information system George: English, Pot
>>> Making, it is unlikely that what they want to reconstruct are instances
>>> of me using English or performing pot making. Rather, they are interested
>>> in the fact that there is an individual with a particular formation,
>>> which means that he knows language x, knows skill x. This information is
>>> probably used in an actual integration to connect an instance of E21 via
>>> an instance of E56 Language to, for example, instances of E33 that use
>>> the same E56.
>>>
>>> It would seem we need some sort of hierarchy among the principles, which
>>> can also conflict.
>>>
>>>
>>> My approach is not documenting skills. My approach is documenting
>>> facts, rather than potentials. I take notice and may document that you
>>> spoke Latin, as I last did at school. I have a document stating
>>> my grade in Latin at high school.
>>> My grade at high school confirms a set of years of continued successful
>>> lessons, not that I could understand much Latin now ;-).
>>> Speaking a language can be documented as an extended (observed) activity,
>>> as in FRBRoo.
>>>
>>>
>>> It may be, but is it typically? I have never seen an information system,
>>> especially in a museum context, that would.
>>>
>>> For instance, someone writing books in a particular language. This falls
>>> under any kind of extended activity not further specified, such as an
>>> artist using a technique for some time, and avoids transforming actual
>>> activities into potentials.
>>>
>>> We can document someone's documented opinion about a potential of a
>>> person, as an information object.
>>>
>>>
>>> That would make this information mostly unusable, however. If our goal is
>>> to functionally use the observation person x speaks language y, then it
>>> needs to be semantically represented and not made a string.
>>>
>>>
>>> In the "Principles for Modelling Ontologies" we refer:
>>> "7.2 Avoid concepts depending on a personal/ spectator perspective"
>>>
>>> This could be elaborated more. In the CRM, we do not model concepts
>>> "because people use them", but because they can be used to integrate
>>> information related to them with URIs. Therefore, regarding your
>>> arguments, what I wanted to say is that "skill" is a bad concept for
>>> integration. What should be instantiated are the observable activities,
>>> which may or may not indicate skills.
>>>
>>>
>>> I don’t see that this principle applies. It is not a personal perspective
>>> that someone speaks a language, any more than it is a personal
>>> perspective that an object is constituted of a material. This fact can be
>>> documented and observed. Someone else can come and do the same. Don’t
>>> believe Franco can speak Latin? Watch him and see if he can.
>>> When someone writes in an information system, they probably typically
>>> mean: some evidence leads me to assert Person x knows language y. They
>>> do not mean to say that at some point in the past he learned it, or that
>>> at some point he performed it.
>>>
>>> In the case of documenting that someone knows a language, this can be
>>> used practically to integrate using URIs, provided that we use the same
>>> URI for English to describe a document and to describe the knowledge of
>>> the individual:
>>>
>>> E21 knows language E56 Language URI:AA
>>> E33 has language E56 Language URI:AA
>>>
>>> answers the query: who in this graph knew the language this document was
>>> written in?
>>>
>>> Functionally, the issue for me is: is there a good reason against adding
>>> a binary property off of person which can indicate their knowledge
>>> ability and connect to a well-known URI for a language?
>>>
>>> Best,
>>>
>>> George
>>>
>>>
>>> --
>>> ------------------------------------
>>> Dr. Martin Doerr
>>>
>>> Honorary Head of the
>>> Center for Cultural Informatics
>>>
>>> Information Systems Laboratory
>>> Institute of Computer Science
>>> Foundation for Research and Technology - Hellas (FORTH)
>>>
>>> N.Plastira 100, Vassilika Vouton,
>>> GR70013 Heraklion,Crete,Greece
>>>
>>> Vox:+30(2810)391625
>>> Email: mar...@ics.forth.gr
>>> Web-site: http://www.ics.forth.gr/isl
>>>
>>>
>>> _______________________________________________
>>> Crm-sig mailing list
>>> Crm-sig@ics.forth.gr
>>> http://lists.ics.forth.gr/mailman/listinfo/crm-sig
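
The integration query described in the thread ("who in this graph knew the
language this document was written in") can be sketched roughly as follows.
This is only an illustration under stated assumptions: the graph is a plain
Python list of triples, the identifiers person:A, person:B, doc:1, lang:AA
and lang:BB are invented placeholders, knows_language stands for the
property proposed in the discussion (it is not an existing CIDOC CRM
property), and has_language loosely stands for the language link on an E33
Linguistic Object.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


# Hypothetical toy graph. "knows_language" is the property proposed in the
# thread, not part of the CIDOC CRM; all identifiers are placeholders.
GRAPH = [
    Triple("person:A", "knows_language", "lang:AA"),
    Triple("person:B", "knows_language", "lang:BB"),
    Triple("doc:1", "has_language", "lang:AA"),
]


def who_knew_language_of(document: str, graph: list[Triple]) -> list[str]:
    """Return the persons who know at least one language of the document."""
    # Step 1: collect the language URIs attached to the document.
    languages = {
        t.obj
        for t in graph
        if t.subject == document and t.predicate == "has_language"
    }
    # Step 2: collect persons whose known language is one of those URIs.
    return sorted(
        {
            t.subject
            for t in graph
            if t.predicate == "knows_language" and t.obj in languages
        }
    )


print(who_knew_language_of("doc:1", GRAPH))  # ['person:A']
```

The join succeeds only because both statements point at the same language
URI. With a separate type such as 'English Speakers', which has its own
URI, an additional mapping from that type to the language would be needed
before the two statements could be connected.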