Re: RUTA: Copy features into new annotation

Peter Klügl Sun, 10 Jan 2021 07:22:06 -0800

Hi,


On 07.01.2021 at 14:55, Erik Fäßler wrote:
> Hi Peter and thank you once again for your excellent support of your 
> excellent RUTA software!


You are welcome :-)


>
> Your second example was very much what I needed. Thank you so far!
> I have one last bump in the road:
>
> My Person#id feature is an FSArray with ID annotations instead of a plain 
> uima.cas.String. So, one Person annotation might have multiple IDs per the 
> type system.
> The ID type has a feature “entryId”.
> In my particular case I actually have only one entry in the id array. Still, 
> I need to access this entry somehow.
> Is that at all possible in RUTA? I would need something like
>
>
> // collect ids of all covered Persons using an extra list
> STRINGLIST ids;
> pe:PersonEnumeration{-> pe.personIds = ids}
>     <-{p:Person{-> ADD(ids,p.id[0].entryId)};};
>
> This does not seem to be covered by the FeatureExpression grammar in RUTA. Is 
> there a workaround? Otherwise I will have to solve it some other way.


There are actually "indexed" expressions like Person.ids[0], but they are
not yet an "official" and stable feature. However, I don't think they are
even necessary here.


Is your type system available somewhere? JCoRe?

Is this a solution for you?


PACKAGE uima.ruta;

// mock types
DECLARE CC, EnumCC;
DECLARE Person (FSArray ids);
DECLARE PersonId (String personId);
DECLARE PersonEnumeration (StringArray personIds);

// mock annotations
"Trump" -> Person;
"Biden" -> Person;
"and" -> CC;
INT counter = 1;
p:Person{-> pid:CREATE(PersonId, "personId" = "id_" + (counter)),
counter = counter +1, p.ids = pid};

(COMMA? @CC){-> EnumCC};

// identify enum span
(Person (COMMA Person)* EnumCC Person){-> PersonEnumeration};

// collect ids of all covered Persons using an extra list
STRINGLIST ids;
pe:PersonEnumeration{-> pe.personIds = ids}
    <-{p:Person{-> ADD(ids,p.ids.personId)};};
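
To make the intent of the last rule explicit, here is a small language-neutral model of it in plain Python. This is only a sketch and not UIMA or Ruta API: the classes, the field names and the helper collect_ids are hypothetical stand-ins for the mock types above, and it assumes that p.ids.personId resolves to the personId of each element of the ids array.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PersonId:
    # stands in for the mock type PersonId (String personId)
    person_id: str


@dataclass
class Person:
    # begin/end are character offsets, as for any annotation
    begin: int
    end: int
    ids: List[PersonId] = field(default_factory=list)


@dataclass
class PersonEnumeration:
    begin: int
    end: int
    person_ids: List[str] = field(default_factory=list)


def collect_ids(enum: PersonEnumeration, persons: List[Person]) -> List[str]:
    """Copy the ids of all Persons covered by enum into enum.person_ids."""
    collected: List[str] = []
    for p in persons:
        # a Person is "covered" if its span lies inside the enumeration span
        if enum.begin <= p.begin and p.end <= enum.end:
            # one Person may carry several ids, so take all of them
            collected.extend(pid.person_id for pid in p.ids)
    enum.person_ids = collected
    return collected


# "Trump and Biden": Trump = [0, 5), Biden = [10, 15)
trump = Person(0, 5, [PersonId("id_1")])
biden = Person(10, 15, [PersonId("id_2")])
enum = PersonEnumeration(0, 15)
collect_ids(enum, [trump, biden])
```

Note that the model builds a fresh list for every enumeration. Whether the Ruta variable ids has to be reset between matches is something to check in the actual script if a document can contain more than one enumeration.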


Best,


Peter



>
> Many thanks,
>
> Erik
>
>> On 7. Jan 2021, at 10:47, Peter Klügl <peter.klu...@averbis.com> wrote:
>>
>> Hi Erik,
>>
>>
>> it depends on how you want to represent the information of the ids of
>> the covered Person annotations. You somehow need to represent the values
>> in the PersonEnumeration annotation. I assume that the ID feature of
>> Person is uima.cas.String? PersonEnumeration could either use one String
>> feature, a StringArray feature or an FSArray feature (pointing to the
>> Person annotations which provide the IDs).
>>
>> Here are two examples:
>>
>>
>> PACKAGE uima.ruta;
>>
>> // mock types
>> DECLARE CC, EnumCC;
>> DECLARE Person (STRING id);
>> DECLARE PersonEnumeration (FSArray persons);
>>
>> // mock annotations
>> "Trump" -> Person ("id" = "1");
>> "Biden" -> Person ("id" = "2");
>> "and" -> CC;
>>
>> COMMA? @CC{-> EnumCC};
>>
>> // identify enum span
>> (Person (COMMA Person)* EnumCC Person){-> PersonEnumeration};
>>
>> // collect all covered Persons
>> pe:PersonEnumeration{-> pe.persons = Person};
>>
>> ########################
>>
>> ########################
>>
>> PACKAGE uima.ruta;
>>
>> // mock types
>> DECLARE CC, EnumCC;
>> DECLARE Person (STRING id);
>> DECLARE PersonEnumeration (StringArray personIds);
>>
>> // mock annotations
>> "Trump" -> Person ("id" = "1");
>> "Biden" -> Person ("id" = "2");
>> "and" -> CC;
>>
>> COMMA? @CC{-> EnumCC};
>>
>> // identify enum span
>> (Person (COMMA Person)* EnumCC Person){-> PersonEnumeration};
>>
>> // collect ids of all covered Persons using an extra list
>> STRINGLIST ids;
>> pe:PersonEnumeration{-> pe.personIds = ids}
>>     <-{p:Person{-> ADD(ids,p.id)};};
>>
>>
>>
>>
>> Best,
>>
>>
>> Peter
>>
>>
>> On 06.01.2021 at 08:29, Erik Fäßler wrote:
>>> Hello everyone (and a happy new year :-)),
>>>
>>> I have been working on the following issue: Whenever two entities are 
>>> conjoined in text (e.g. [...] Biden and Trump ran for president [...]), I 
>>> create a new annotation spanning both entities and the conjunction ([Biden 
>>> and Trump]_coordination). I can do this fine.
>>> However, my entities - Biden and Trump - also have the ID feature. The new 
>>> annotation should receive both IDs from the Biden and Trump annotations. 
>>> But I couldn’t manage to do this.
>>>
>>> I have rules like this:
>>>
>>> (Person (
>>>    "," (Person)
>>>     ","? PennBioIEPOSTag.value=="CC"
>>> Person
>>> ) {->MARK(PersonEnumeration)};
>>>
>>> So an enumeration of Persons is covered by a new annotation of type 
>>> “PersonEnumeration”. And now “PersonEnumeration” should receive all the ID 
>>> features from the covered Person annotations. How can I do this?
>>>
>>> Best,
>>>
>>> Erik
>> -- 
>> Dr. Peter Klügl
>> Head of Text Mining/Machine Learning
>>
>> Averbis GmbH
>> Salzstr. 15
>> 79098 Freiburg
>> Germany
>>
>> Phone: +49 761 708 394 0
>> Fax: +49 761 708 394 10
>> Email: peter.klu...@averbis.com
>> Web: https://averbis.com
>>
>> Headquarters: Freiburg im Breisgau
>> Register Court: Amtsgericht Freiburg im Breisgau, HRB 701080
>> Managing Directors: Dr. med. Philipp Daumke, Dr. Kornél Markó
>>
>