Michael Stenger created UIMA-6404:
-------------------------------------
Summary: Ruta: @ with quantifier ignores matches
Key: UIMA-6404
URL: https://issues.apache.org/jira/browse/UIMA-6404
Project: UIMA
Issue Type: Bug
Components: Ruta
Affects Versions: 3.1.0ruta, 2.8.1ruta
Reporter: Michael Stenger
Fix For: 2.9.0ruta, 3.1.1ruta
Hi,
it seems that combining the start anchor (@) with a min/max quantifier causes
the interpreter to miss what I would consider matches when @ is placed on an
inner rule element, like so:
{code:java}
(W @W W)[2,2];
// or
(W @W W W)[3,4];
// or
(W W @W W)[2,3];
{code}
On the other hand,
{code:java}
(W W W W)[2,2];
{code}
would match passages as expected. I suspect this is caused by the changed
matching order within the composed rule element when it is applied multiple
times.
Minimal Example:
Script:
{noformat}
DECLARE T1, T2;
(W @W W W)[2,2]{-> T1};
(W W @W W)[2,2]{-> T2};{noformat}
Text:
{noformat}
omega alpha beta gamma omega alpha beta gamma omega alpha{noformat}
Expected matches:
* T1, T2: omega alpha beta gamma omega alpha beta gamma
* T1, T2: alpha beta gamma omega alpha beta gamma omega
* T1, T2: beta gamma omega alpha beta gamma omega alpha
Actual matches:
* T2: beta gamma omega alpha beta gamma omega alpha
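To make the expected matches above reproducible, here is a small sketch in plain Python (not Ruta, and not how the interpreter works). It assumes every whitespace-separated token is a W and that the anchor only changes where matching starts, not which spans match, which is the expectation stated in this report rather than confirmed behaviour. The function name expected_windows and its parameters are hypothetical.

```python
def expected_windows(text, group_size=4, repeats=2):
    """Return every run of group_size * repeats consecutive words in text.

    Models the spans an unanchored rule such as (W W W W)[2,2] would cover,
    assuming each whitespace-separated token is annotated as W.
    """
    words = text.split()
    span = group_size * repeats  # 4 words per group, applied exactly twice
    # Slide a window of `span` words over the text, one word at a time.
    return [" ".join(words[i:i + span]) for i in range(len(words) - span + 1)]


text = "omega alpha beta gamma omega alpha beta gamma omega alpha"
for window in expected_windows(text):
    print(window)
```

For the sample text this yields the three eight-word spans listed under "Expected matches".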
Alternatively, since I could not find anything in the Guide about the intended
behaviour in such cases, the broader question is how the interpreter is
supposed to handle @ in a composed rule element that is also quantified. For
example, is it supposed to ignore the anchor from the second application (on
the same match) onwards?
Best, Michael
--
This message was sent by Atlassian Jira
(v8.20.1#820001)