Hi,

capturing groups are not supported by the REGEXP condition, since it is essentially just a boolean function and cannot pass its internal match information to an action that creates annotations. However, there are many other ways to solve this.
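A side note on the pattern itself before getting to the alternatives: in ".*no|No (.*)" the alternation applies to the whole expression, so it reads as ".*no" OR "No (.*)", and group 1 only exists in the second branch. The following is a small sketch in plain Java, not Ruta. It assumes that the regexps follow java.util.regex semantics and that the pattern has to match the whole line; the class and method names are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of Ruta.
final class NegationRegexDemo {
    // Pattern from the question: parsed as ".*no" OR "No (.*)".
    static final String ORIGINAL = ".*no|No (.*)";
    // Grouped variant: the alternation is limited to the negation word.
    static final String GROUPED = ".*(?:no|No) (.*)";

    // True if the regex matches the complete line (assumed matching mode).
    static boolean matchesLine(String regex, String line) {
        return Pattern.compile(regex).matcher(line).matches();
    }

    // Group 1 of a whole-line match; null if there is no match
    // or if the matching branch does not contain group 1.
    static String group1(String regex, String line) {
        Matcher m = Pattern.compile(regex).matcher(line);
        return m.matches() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String[] lines = {
            "No other concurrent anticancer therapy",
            "Patient has no prior therapy",
            "Pregnant: no",
            "Not pregnant or nursing"
        };
        for (String line : lines) {
            System.out.println(line);
            System.out.println("  original: match=" + matchesLine(ORIGINAL, line)
                    + ", group1=" + group1(ORIGINAL, line));
            System.out.println("  grouped:  match=" + matchesLine(GROUPED, line)
                    + ", group1=" + group1(GROUPED, line));
        }
    }
}
```

Under these assumptions, the original pattern can match a line that merely ends in "no" without filling group 1, and it misses a lowercase "no " in the middle of a line, while the grouped pattern always puts the text after the negation word into group 1.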
There may be a problem with your regexp. I changed it to ".*(?:no|No) (.*)" in the following. You can, for example, use the simple regexp rule and restrict its matching context to each line:

... with a BLOCK:

    BLOCK(eachLine) Line{}{
        ".*(?:no|No) (.*)" -> Rule1NoPattern, 1=Group1;
    }

... with an inlined rule:

    Line->{".*(?:no|No) (.*)" -> Rule1NoPattern, 1=Group1;};

Some additional comments:

You should mention the type Line in the EXEC action for reindexing if you want to use these annotations in the following rules:

    Document{-> EXEC(PlainTextAnnotator, {Line})};

For your rules, it does not make a difference, but if you use other conditions like PARTOF, it will not work correctly.

From my experience, I'd recommend working directly with annotations instead of regexes for detecting the target of a negation. Then you can refactor the rules more easily. E.g., if you have a rule like

    Line->{PrefixNegationInd #{-> Group1};};

you can replace the wildcard with something better in the future, like ChunkNP.

(I just wanted to mention it. I know that your example was probably just an example to describe the problem with Ruta.)

Best,

Peter

On 08.02.2016 at 00:37, Bonnie MacKellar wrote:
> Hi,
>
> I am trying to write RUTA rules using regular expressions and capturing
> groups. I want the matches to be line by line. I can do this using the
> following script
>
> ENGINE utils.PlainTextAnnotator;
> TYPESYSTEM utils.PlainTextTypeSystem;
> Document{-> RETAINTYPE(BREAK)};
> Document{-> EXEC(PlainTextAnnotator)};
> DECLARE Rule1NoPattern, Group1, Group2;
> Line{REGEXP(".*no|No (.*)") -> Rule1NoPattern};
>
> Given this text
> Not pregnant or nursing
> Fertile patients must use effective contraception (hormonal contraception
> or intra-uterine device [IUD])
> No concurrent participation in another clinical trial that would preclude
> the interventions or outcome assessment of this clinical trial
> No other concurrent anticancer therapy
>
> it correctly matches the last two lines and annotates them with
> Rule1NoPattern
> The problem is, I want to use the capturing group information as well. I
> can do this using the simple regular expression syntax
> ".*no|No (.*)\n|S" -> Rule1NoPattern, 1=Group1;
>
> if I just give it one line, say
> No other concurrent anticancer therapy
>
> it will correctly annotate the entire line with Rule1NoPattern, and "other
> concurrent anticancer therapy" will be annotated with Group1.
> Is there a way, using the first rule variant
> Line{REGEXP(".*no|No (.*)") -> Rule1NoPattern};
>
> to annotate the text in capturing group?
>
> I have tried all kinds of syntax, but none of it seems to be correct
>
> thanks,
> Bonnie MacKellar