[ https://issues.apache.org/jira/browse/NIFI-1632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15197963#comment-15197963 ]

Joseph Witt commented on NIFI-1632:
-----------------------------------

all good. full clean build w/contrib check. Unit test is great. Did live test on old nifi and latest build and it broke before and works perfectly now. Merged in +1

> ExtractText throws NullPointerException if Regular Expression has optional
> Capturing Group that is not present
> --------------------------------------------------------------------------------------------------------------
>
>                 Key: NIFI-1632
>                 URL: https://issues.apache.org/jira/browse/NIFI-1632
>             Project: Apache NiFi
>          Issue Type: Bug
>            Reporter: Mark Payne
>            Assignee: Mark Payne
>             Fix For: 0.6.0
>
>         Attachments:
> 0001-NIFI-1632-Fixed-NPE-that-occurs-if-a-capturing-group.patch,
> 0001-NIFI-1632-Generated-unit-test-to-prove-broken-behavi.patch
>
>
> If we use ExtractText and configure it with a regular expression that
> contains a Capturing Group that is "optional" (ends with a ?), and the regex
> matches some content where the capturing group is not present, then it will
> throw a NullPointerException
>
> Conrad Crampton on the users mailing list reported:
>
> Hi,
> I don’t know if this is expected behaviour but I think I understand why this
> is happening now. I have a regexp in the ExtractText processors viz:
>
> (?s:^.+: (\d\d?)(\w\w\w)(\d{4})
> ([\d ]\d:\d\d:\d\d) Product=(.?) OriginIP=(.?) Origin=(.?) Action=(.?)
> SIP=(.?) Source=(.?) SPort=(\d+?) DIP=(.) Destination=(.?) DPort=(\d+?)
> Protocol=(.?)(?: ICMPType=(.?) ICMPCode=(.?))? IFName=(.?) IFDirection=(.?)
> Reason=(.?) Rule=(.?) PolicyName=(.?) Info=(.?) XlateSIP=(.?)
> XlateSPort=([\d]|-?) XlateDIP=(.?) XlateDPort=([\d]+|-?)(.*)$)
>
> With this (?: ICMPType=(.?) ICMPCode=(.?))? the problem I think. Because I
> have made a non capturing matching group optional, for those log lines that
> don’t have this section matching the dynamic variable can’t set the index
> correctly as the match is returning null for these capture groups. Obviously
> I haven’t gone too deep into the code, but if I have a RouteOnContent
> processor before this testing for this string and remove this from regexp
> (and have two ExtractText processors) then it works. It appeared that all the
> NPE were thrown for those lines that didn’t match the optional matching group.
> Has this been observed before?
> Thanks
> Conrad
>
> In looking at the code this line looks offensive:
> https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/ExtractText.java#L325

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
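
For readers unfamiliar with the underlying behaviour: in java.util.regex, Matcher.group(n) returns null when group n sits inside an optional part of the pattern that did not take part in the match. The sketch below illustrates that with a shortened, made-up pattern (not the one from the report) and a null check of the kind a fix would need. The class name, the extractGroups helper and the "log" attribute prefix are hypothetical and are not taken from ExtractText or from the attached patch.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class OptionalGroupDemo {

    // Hypothetical helper: copies each capture group into a map keyed
    // "<attributeName>.<groupIndex>", skipping groups that did not participate.
    static Map<String, String> extractGroups(Pattern pattern, String input, String attributeName) {
        Map<String, String> results = new LinkedHashMap<>();
        Matcher matcher = pattern.matcher(input);
        if (matcher.find()) {
            for (int i = 0; i <= matcher.groupCount(); i++) {
                String value = matcher.group(i);
                // group(i) is null for an optional group that was not matched;
                // using the value without this check is what leads to an NPE.
                if (value == null) {
                    continue;
                }
                results.put(attributeName + "." + i, value);
            }
        }
        return results;
    }

    public static void main(String[] args) {
        // Simplified stand-in for the reported regex: the ICMP section is optional.
        Pattern pattern = Pattern.compile(
                "Protocol=(\\w+)(?: ICMPType=(\\d+) ICMPCode=(\\d+))? IFName=(\\w+)");

        // No ICMP section: groups 2 and 3 are null and are skipped.
        System.out.println(extractGroups(pattern, "Protocol=tcp IFName=eth0", "log"));

        // ICMP section present: all groups are populated.
        System.out.println(extractGroups(pattern,
                "Protocol=icmp ICMPType=8 ICMPCode=0 IFName=eth0", "log"));
    }
}
```

Skipping the missing group is only one possible policy; storing an empty string instead would keep the attribute indexes contiguous. Which behaviour the merged patch chose is not stated in this message.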