Hi,

This looks like a bug. I would normally say that this is a situation where Ruta is better than similar rule languages, since it can handle these alternatives deep down in the rule matching. It should work just fine.

I will reproduce/fix it and get back to you (before Monday).

Best,

Peter

On 09.07.2015 at 16:31, Andreas Weber wrote:
Hi,

I just started with Ruta by trying some kind of fact extraction on my
own annotations.
Some problems occurred with composite rules (rules with more than one
rule element) when more than one annotation of the same kind occurs at
the same position in the input text.
It's hard to provide a reproducible example because the whole processing
is integrated in a bunch of our software, but I will try to explain it:

In my input text I use an annotation MY_ANNO which has a feature "STRING
type".
My composite rule should find occurrences of MY_ANNO with the feature
"type = person" followed by MY_ANNO with the feature "type = location"
in the same sentence:

MyAnno.type == "person" {
     -> MARK ...    // do some action
}
W*?
MyAnno.type == "location" {
     -> MARK ...   // do some action
};

This works fine when my input annotations look (simplified) like this:

1. W
    MY_ANNO (type: person)
2. W
3. W
    MY_ANNO (type: location)

(1, 2 and 3 are the positions of the annotations, e.g. at the first
position we have two annotations: "W" and "MY_ANNO (type: person)".)

The rule above doesn't match when my input has an additional MY_ANNO
annotation at the third position:

1. W
    MY_ANNO (type: person)
2. W
3. W
    MY_ANNO (type: somethingElse)
    MY_ANNO (type: location)

Even a simpler rule without the reluctant star operator (*?) doesn't
match for that input:

MyAnno.type == "person" {
     -> MARK ...    // do some action
}
W
MyAnno.type == "location" {
     -> MARK ...   // do some action
};

Maybe I am misinterpreting how the Ruta rule evaluation is supposed to
work.

However, I tried to find the reason by debugging a little bit in the
Ruta code (ruta-core 2.3.0) and found RutaRuleElement.continueMatch().
Here, the idea seems to be that when the "useAlternatives" flag is set
to true, the "ruleMatch" and "containerMatch" objects are copied so that
they stay unchanged while each alternative is processed. But the
original "containerMatch" object can still be changed by further
processing in the stepbackMatch() call.
In my case the matching failed for the first alternative, and that set
the containerMatch object to not matching. Although the second
alternative matched, the base of the "ruleMatch" was already marked as
non-matching, and so the evaluation of the whole ruleMatch was false.
Is there a reason for that or is this a bug?
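
To illustrate what I think is happening, here is a toy Java sketch. It
is not the Ruta code: ToyMatch and matchAny are hypothetical names made
up for this mail, and the real continueMatch()/stepbackMatch() logic is
far more involved. It only models one assumption, namely that a failed
alternative marks a shared base object as non-matching.

```java
import java.util.List;

// Toy model only; none of these names exist in ruta-core.
final class AlternativeMatchSketch {

    // Hypothetical stand-in for a match object that records success or failure.
    static final class ToyMatch {
        boolean matched = true;

        ToyMatch copy() {
            ToyMatch c = new ToyMatch();
            c.matched = this.matched;
            return c;
        }
    }

    // Tries every alternative annotation found at one position.
    // isolate == false: all alternatives write to the same base object (suspected behaviour).
    // isolate == true:  each alternative works on its own copy (expected behaviour).
    static boolean matchAny(List<String> alternatives, String wanted, boolean isolate) {
        ToyMatch container = new ToyMatch();
        boolean anyMatched = false;
        for (String type : alternatives) {
            ToyMatch base = isolate ? container.copy() : container;
            boolean elementMatched = type.equals(wanted);
            if (!elementMatched) {
                base.matched = false; // the failure is recorded on the base object
            }
            // An alternative only counts if its base is still marked as matching.
            anyMatched |= elementMatched && base.matched;
        }
        return anyMatched;
    }

    public static void main(String[] args) {
        List<String> atPositionThree = List.of("somethingElse", "location");
        System.out.println(matchAny(atPositionThree, "location", false));
        System.out.println(matchAny(atPositionThree, "location", true));
    }
}
```

With the shared base object, the failing "somethingElse" alternative
spoils the later "location" alternative, so the first call returns
false; with a copy per alternative, the second call returns true.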

Any help/hints/comments appreciated!  :)

Best regards,
   Andreas

