Hi,

On 13.01.2021 at 12:04, Erik Fäßler wrote:
>> :-)
>>
>> I was looking at the Person definition there, but didn't find matching
>> features.
> Oh, sorry, I did not express myself clearly enough: in my real use case I
> don’t have Person annotations but Organism annotations, which are derived from
> ConceptMentions. And ConceptMentions have the resourceEntryList feature.
> I apologize for the confusion. For the sake of simplicity I made up the
> Person example in my initial e-mail, and now it bit me in the a** ;-)

Ah no, all fine. When I prepared the first example rules, I wondered about
the range type of the id feature. I assumed you were using the JCoRe type
systems, since your question indicated a non-trivial real-world use case.
I had a quick look (1 min) to see whether I could identify the range of the
ids of Person annotations in these type systems, but failed... so I simply
used String as the range :-)

>>
>> In general, I find it better to create additional annotations for
>> complex structures instead of merging the information in an existing
>> annotation, simply for maintainability reasons. It's easier to
>> inspect unintended behavior several months later that way ...
> Great, I am with you here, feels like I did it the recommended way.
>>
>>> So actually, there is one step missing now: I need to replace merged
>>> Organism entries with the covering OrganismEnumeration (Person and
>>> PersonEnumeration in my example).
>>
>> I am not sure what the input/output behavior should be. Don't you have
>> two separate annotations and isn't the enum the merge of the semantics?
> You’re right. And I think I will leave it this way. I’m overcomplicating
> things.
>>
>> Labels and inlined rules are the two best language features I added in
>> Ruta, really useful. Let me know if you want to learn more about them
>> and if there is information missing in the documentation.
>>
> No, it’s all great. It’s just not that trivial and, honestly, while I had a
> look at the base syntax, I got quite far by cherry-picking what I needed
> from the documentation. I did not study the syntax in great detail
> because I could always make it work by just doing it. That’s my bad. But this
> time I didn’t know where to start so I asked. And you helped me a lot, thank
> you so much.
> RUTA is a great tool. My only trouble is regular exceptions in the
> Eclipse Workbench, but I got used to them, and I have probably combined the
> wrong versions of RUTA and Eclipse or something.

There were several reports of problems lately that were caused by the use
of different Java versions.

Best,

Peter

>
> Thank you!
>
> Erik
>
>>
>> Best,
>>
>> Peter
>>
>>> construction so this enumeration-annotation-merging might actually be easy
>>> and I just don’t see it.
>>>
>>> Thank you so much!
>>>
>>> Erik
>>>
>>>> On 10. Jan 2021, at 16:21, Peter Klügl <peter.klu...@averbis.com> wrote:
>>>>
>>>> Hi,
>>>>
>>>> On 07.01.2021 at 14:55, Erik Fäßler wrote:
>>>>> Hi Peter and thank you once again for your excellent support of your
>>>>> excellent RUTA software!
>>>> You are welcome :-)
>>>>
>>>>> Your second example was very much what I needed. Thank you so far!
>>>>> I have one last bump in the road:
>>>>>
>>>>> My Person#id feature is an FSArray with ID annotations instead of a plain
>>>>> uima.cas.String. So, one Person annotation might have multiple IDs per
>>>>> the type system.
>>>>> The ID type has a feature “entryId”.
>>>>> In my particular case I actually have only one entry in the id array.
>>>>> Still, I need to access this entry somehow.
>>>>> Is that at all possible in RUTA? I would need something like
>>>>>
>>>>> // collect ids of all covered Persons using an extra list
>>>>> STRINGLIST ids;
>>>>> pe:PersonEnumeration{-> pe.personIds = ids}
>>>>> <-{p:Person{-> ADD(ids,p.id[0].entryId)};};
>>>>>
>>>>> This does not seem to be covered by the FeatureExpression grammar in
>>>>> RUTA. Is there a workaround? Otherwise I will have to solve it some
>>>>> other way.
>>>> There are actually "indexed" expressions like Person.ids[0], but this is
>>>> not yet an "official" and stable feature. However, I think it's not even
>>>> necessary.
>>>>
>>>> Is your typesystem available somewhere? JCoRe?
>>>>
>>>> Is this a solution for you?
>>>>
>>>> PACKAGE uima.ruta;
>>>>
>>>> // mock types
>>>> DECLARE CC, EnumCC;
>>>> DECLARE Person (FSArray ids);
>>>> DECLARE PersonId (String personId);
>>>> DECLARE PersonEnumeration (StringArray personIds);
>>>>
>>>> // mock annotations
>>>> "Trump" -> Person;
>>>> "Biden" -> Person;
>>>> "and" -> CC;
>>>> INT counter = 1;
>>>> p:Person{-> pid:CREATE(PersonId, "personId" = "id_" + (counter)),
>>>> counter = counter +1, p.ids = pid};
>>>>
>>>> (COMMA? @CC){-> EnumCC};
>>>>
>>>> // identify enum span
>>>> (Person (COMMA Person)* EnumCC Person){-> PersonEnumeration};
>>>>
>>>> // collect ids of all covered Persons using an extra list
>>>> STRINGLIST ids;
>>>> pe:PersonEnumeration{-> pe.personIds = ids}
>>>> <-{p:Person{-> ADD(ids,p.ids.personId)};};
>>>>
>>>>
>>>> Best,
>>>>
>>>> Peter
>>>>
>>>>> Many thanks,
>>>>>
>>>>> Erik
>>>>>
>>>>>> On 7. Jan 2021, at 10:47, Peter Klügl <peter.klu...@averbis.com> wrote:
>>>>>>
>>>>>> Hi Erik,
>>>>>>
>>>>>> it depends on how you want to represent the information of the ids of
>>>>>> the covered Person annotations. You somehow need to represent the values
>>>>>> in the PersonEnumeration annotation. I assume that the ID feature of
>>>>>> Person is uima.cas.String? PersonEnumeration could either use one String
>>>>>> feature, a StringArray feature or an FSArray feature (pointing to the
>>>>>> Person annotations which provide the IDs).
>>>>>>
>>>>>> Here are two examples:
>>>>>>
>>>>>>
>>>>>> PACKAGE uima.ruta;
>>>>>>
>>>>>> // mock types
>>>>>> DECLARE CC, EnumCC;
>>>>>> DECLARE Person (STRING id);
>>>>>> DECLARE PersonEnumeration (FSArray persons);
>>>>>>
>>>>>> // mock annotations
>>>>>> "Trump" -> Person ("id" = "1");
>>>>>> "Biden" -> Person ("id" = "2");
>>>>>> "and" -> CC;
>>>>>>
>>>>>> COMMA? @CC{-> EnumCC};
>>>>>>
>>>>>> // identify enum span
>>>>>> (Person (COMMA Person)* EnumCC Person){-> PersonEnumeration};
>>>>>>
>>>>>> // collect all covered Persons
>>>>>> pe:PersonEnumeration{-> pe.persons = Person};
>>>>>>
>>>>>> ########################
>>>>>>
>>>>>> ########################
>>>>>>
>>>>>> PACKAGE uima.ruta;
>>>>>>
>>>>>> // mock types
>>>>>> DECLARE CC, EnumCC;
>>>>>> DECLARE Person (STRING id);
>>>>>> DECLARE PersonEnumeration (StringArray personIds);
>>>>>>
>>>>>> // mock annotations
>>>>>> "Trump" -> Person ("id" = "1");
>>>>>> "Biden" -> Person ("id" = "2");
>>>>>> "and" -> CC;
>>>>>>
>>>>>> COMMA? @CC{-> EnumCC};
>>>>>>
>>>>>> // identify enum span
>>>>>> (Person (COMMA Person)* EnumCC Person){-> PersonEnumeration};
>>>>>>
>>>>>> // collect ids of all covered Persons using an extra list
>>>>>> STRINGLIST ids;
>>>>>> pe:PersonEnumeration{-> pe.personIds = ids}
>>>>>> <-{p:Person{-> ADD(ids,p.id)};};
>>>>>>
>>>>>>
>>>>>> Best,
>>>>>>
>>>>>> Peter
>>>>>>
>>>>>>
>>>>>> On 06.01.2021 at 08:29, Erik Fäßler wrote:
>>>>>>> Hello everyone (and a happy new year :-)),
>>>>>>>
>>>>>>> I have been working on the following issue: Whenever there is a
>>>>>>> conjunction of two entities in the text (e.g. [...]Biden and Trump ran for
>>>>>>> president […]) I create a new annotation spanning both entities and the
>>>>>>> conjunction ([Biden and Trump]_coordination). I can do this fine.
>>>>>>> However, my entities - Biden and Trump - also have the ID feature. The
>>>>>>> new annotation should receive both IDs from the Biden and Trump
>>>>>>> annotations. But I couldn’t manage to do this.
>>>>>>>
>>>>>>> I have rules like this:
>>>>>>>
>>>>>>> (Person (
>>>>>>> "," (Person)
>>>>>>> ","? PennBioIEPOSTag.value=="CC"
>>>>>>> Person
>>>>>>> ) {->MARK(PersonEnumeration)};
>>>>>>>
>>>>>>> So an enumeration of Persons is covered with a new annotation of type
>>>>>>> “PersonEnumeration”.
>>>>>>> And now “PersonEnumeration” should receive all the
>>>>>>> ID features from the covered Person annotations. How can I do this?
>>>>>>>
>>>>>>> Best,
>>>>>>>
>>>>>>> Erik
>>>>>> --
>>>>>> Dr. Peter Klügl
>>>>>> Head of Text Mining/Machine Learning
>>>>>>
>>>>>> Averbis GmbH
>>>>>> Salzstr. 15
>>>>>> 79098 Freiburg
>>>>>> Germany
>>>>>>
>>>>>> Fon: +49 761 708 394 0
>>>>>> Fax: +49 761 708 394 10
>>>>>> Email: peter.klu...@averbis.com
>>>>>> Web: https://averbis.com
>>>>>>
>>>>>> Headquarters: Freiburg im Breisgau
>>>>>> Register Court: Amtsgericht Freiburg im Breisgau, HRB 701080
>>>>>> Managing Directors: Dr. med. Philipp Daumke, Dr. Kornél Markó