[jira] [Commented] (NIFI-4789) Enhance ExtractGrok processor to handle multiple grok expressions

ASF GitHub Bot (JIRA) Mon, 29 Jan 2018 10:01:19 -0800

    [ https://issues.apache.org/jira/browse/NIFI-4789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16343718#comment-16343718 ]


ASF GitHub Bot commented on NIFI-4789:
--------------------------------------

Github user markap14 commented on a diff in the pull request:

    https://github.com/apache/nifi/pull/2411#discussion_r164503563
  
    --- Diff: nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/ExtractGrok.java ---
    @@ -181,15 +243,28 @@ public void onStopped() {
     
         @OnScheduled
         public void onScheduled(final ProcessContext context) throws GrokException {
    +        grokList.clear();
             for (int i = 0; i < context.getMaxConcurrentTasks(); i++) {
                 final int maxBufferSize = context.getProperty(MAX_BUFFER_SIZE).asDataSize(DataUnit.B).intValue();
                 final byte[] buffer = new byte[maxBufferSize];
                 bufferQueue.add(buffer);
             }
     
    -        grok = new Grok();
    -        grok.addPatternFromFile(context.getProperty(GROK_PATTERN_FILE).getValue());
    -        grok.compile(context.getProperty(GROK_EXPRESSION).getValue(), context.getProperty(NAMED_CAPTURES_ONLY).asBoolean());
    +        resultPrefix = context.getProperty(RESULT_PREFIX).getValue();
    +        breakOnFirstMatch = context.getProperty(BREAK_ON_FIRST_MATCH).asBoolean() ;
    +        matchedExpressionAttribute = context.getProperty(MATCHED_EXP_ATTR).getValue();
    +        expressionSeparator = context.getProperty(EXPRESSION_SEPARATOR).getValue();
    +
    +        String patterns  = context.getProperty(GROK_EXPRESSION).getValue();
    +        for (String patternName : patterns.split(expressionSeparator)) {
    +            Grok grok = new Grok();
    +            final String patternFileListString = context.getProperty(GROK_PATTERN_FILE).getValue();
    +            for (String patternFile : patternFileListString.split(PATTERN_FILE_LIST_SEPARATOR)) {
    +                grok.addPatternFromFile(patternFile);
    --- End diff --
    
    It probably makes sense to call trim() on each patternFile value before 
    passing it into the Grok object, so that if the user enters something like 
    "abc, xyz, 123" we don't pass in the surrounding whitespace.
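
A minimal sketch of the trim() suggestion above, kept separate from the processor so it runs on its own. The class name, the helper splitAndTrim, and the "," value used for PATTERN_FILE_LIST_SEPARATOR are assumptions for illustration (the diff does not show the constant's value), and skipping blank entries goes slightly beyond what the comment asks for.

```java
import java.util.ArrayList;
import java.util.List;

class PatternFileListExample {
    // Assumed value; the real PATTERN_FILE_LIST_SEPARATOR is not shown in the diff.
    static final String PATTERN_FILE_LIST_SEPARATOR = ",";

    // Hypothetical helper: splits the property value and trims each entry,
    // so "abc, xyz, 123" yields "abc", "xyz", "123" without leading spaces.
    static List<String> splitAndTrim(final String patternFileListString) {
        final List<String> patternFiles = new ArrayList<>();
        for (final String patternFile : patternFileListString.split(PATTERN_FILE_LIST_SEPARATOR)) {
            final String trimmed = patternFile.trim();
            // Skipping empty entries is an extra assumption, not part of the review comment.
            if (!trimmed.isEmpty()) {
                patternFiles.add(trimmed);
            }
        }
        return patternFiles;
    }

    public static void main(final String[] args) {
        // Each element would then be passed to grok.addPatternFromFile(...).
        System.out.println(splitAndTrim("abc, xyz, 123")); // prints [abc, xyz, 123]
    }
}
```

In the processor itself the smaller change would presumably be passing patternFile.trim() to grok.addPatternFromFile inside the existing loop.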


> Enhance ExtractGrok processor to handle multiple grok expressions
> -----------------------------------------------------------------
>
>                 Key: NIFI-4789
>                 URL: https://issues.apache.org/jira/browse/NIFI-4789
>             Project: Apache NiFi
>          Issue Type: New Feature
>          Components: Core Framework
>    Affects Versions: 1.2.0, 1.5.0
>         Environment: all
>            Reporter: Charles Porter
>            Priority: Minor
>              Labels: features
>
> Many flows require running several grok expressions against an input to 
> correctly tag and extract data. Using many separate grok processors to 
> accomplish this is unwieldy and hard to maintain. Supporting multiple grok 
> expressions delimited by a comma or a user-selected delimiter greatly 
> simplifies this.
> The feature is coded and tested, and ready for a pull request if it is approved.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
