[
https://issues.apache.org/jira/browse/UIMA-5757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16705516#comment-16705516
]
Miguel Alvarez commented on UIMA-5757:
--------------------------------------
Yes, that explains it perfectly! Thanks!
> Unable to extract features when annotation ends with HTML tag
> -------------------------------------------------------------
>
> Key: UIMA-5757
> URL: https://issues.apache.org/jira/browse/UIMA-5757
> Project: UIMA
> Issue Type: Bug
> Components: Ruta
> Affects Versions: 2.6.1ruta
> Environment: RUTA 2.6.1, Windows 10, Eclipse Mars, JDK 1.8.0_144
> Reporter: Miguel Alvarez
> Assignee: Peter Klügl
> Priority: Minor
>
> If there is an annotation that covers the whole sofa string, and the sofa
> string ends with an HTML tag, it seems like RUTA isn't able to extract the
> features for that annotation. For instance, let's suppose this document
> (represented as XMI):
>
> {code:java}
> // XMI document
> <?xml version="1.0" encoding="UTF-8"?>
> <xmi:XMI xmlns:xmi="http://www.omg.org/XMI"
> xmlns:cas="http:///uima/cas.ecore" xmlns:tcas="http:///uima/tcas.ecore"
> xmlns:types="http:///com/acme/uima/types.ecore" xmi:version="2.0">
> <cas:NULL xmi:id="0"/>
> <tcas:DocumentAnnotation xmi:id="8" sofa="1" begin="0" end="12"
> language="es"/>
> <types:MyDocument xmi:id="14" sofa="1" begin="0" end="12"
> documentId="test_docsize_39d5541c-5e7f-391c-95af-c82ce6306644"/>
> <cas:Sofa xmi:id="1" sofaNum="1" sofaID="_InitialView" mimeType="text"
> sofaString="ABCDEFGHIJ&lt;p&gt;"/>
> <cas:View sofa="1" members="8 14"/>
> </xmi:XMI>
> {code}
> And the following RUTA script:
>
>
> {code:java}
> // RUTA script
> STRING documentId = "Unknown";
> com.acme.uima.types.MyDocument{-> GETFEATURE("documentId", documentId)};
> LOG("Starting to process document: " + documentId);
> {code}
> The LOG action will output Unknown, the initial value of documentId. But as
> soon as the sofa string doesn't end with an HTML tag, it works fine.
>
> Any ideas what could be going on?
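
The explanation acknowledged in the comment above is not quoted in this message. One plausible reading, stated here as an assumption rather than as the assignee's answer: Ruta filters MARKUP tokens by default, and an annotation whose begin or end falls on a filtered token is not visible to rule matching, so the MyDocument rule never fires. Under that assumption, a minimal sketch of a workaround is to retain MARKUP before the rule runs. RETAINTYPE is an existing Ruta action; the remaining lines are copied from the script in the report.

{code:java}
// RUTA script (sketch, assuming default MARKUP filtering hides the annotation)
STRING documentId = "Unknown";
// Make MARKUP tokens visible so an annotation ending on "<p>" can be matched
Document{-> RETAINTYPE(MARKUP)};
com.acme.uima.types.MyDocument{-> GETFEATURE("documentId", documentId)};
LOG("Starting to process document: " + documentId);
{code}

If later rules rely on the default filter, a parameterless Document{-> RETAINTYPE}; after the GETFEATURE rule should reset the retained types.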