[jira] [Commented] (NIFI-4789) Enhance ExtractGrok processor to handle multiple grok expressions

ASF GitHub Bot (JIRA) Mon, 29 Jan 2018 12:48:13 -0800

    [ 
https://issues.apache.org/jira/browse/NIFI-4789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16343995#comment-16343995
 ]


ASF GitHub Bot commented on NIFI-4789:
--------------------------------------

Github user charlesporter commented on a diff in the pull request:

    https://github.com/apache/nifi/pull/2411#discussion_r164558925
  
    --- Diff: 
nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/ExtractGrok.java
 ---
    @@ -70,33 +65,51 @@
     @Tags({"grok", "log", "text", "parse", "delimit", "extract"})
     @CapabilityDescription("Evaluates one or more Grok Expressions against the 
content of a FlowFile, " +
         "adding the results as attributes or replacing the content of the 
FlowFile with a JSON " +
    -    "notation of the matched content")
    +    "notation of the matched content\n" +
    +        "uses https://github.com/thekrakken/java-grok.")
     @WritesAttributes({
    -    @WritesAttribute(attribute = "grok.XXX", description = "When operating 
in flowfile-attribute mode, each of the Grok identifier that is matched in the 
flowfile " +
    -        "will be added as an attribute, prefixed with \"grok.\" For 
example," +
    -        "if the grok identifier \"timestamp\" is matched, then the value 
will be added to an attribute named \"grok.timestamp\"")})
    +    @WritesAttribute(attribute = "{result prefix}XXX", description = "When 
operating in flowfile-attribute mode, each of the Grok identifier that is 
matched in the flowfile " +
    +        "will be added as an attribute, prefixed with \"{result prefix}\" 
For example," +
    +        "if the grok identifier \"timestamp\" is matched, then the value 
will be added to an attribute named \"{result prefix}timestamp\""),
    +
    +        @WritesAttribute(attribute = "ExtractGrok.exception", description 
= "if an error occurs, an exception will be written to this attribute, " +
    +                "and the flow routed to 'unmatched' ")
    +})
     public class ExtractGrok extends AbstractProcessor {
     
         public static final String FLOWFILE_ATTRIBUTE = "flowfile-attribute";
         public static final String FLOWFILE_CONTENT = "flowfile-content";
    -    private static final String APPLICATION_JSON = "application/json";
    -
    +    public static final String APPLICATION_JSON = "application/json";
    +    public static final String GROK_EXPRESSION_KEY = "Grok Expression";
    +    public static final String GROK_PATTERN_FILE_KEY = "Grok Pattern file";
    +    public static final String DESTINATION_KEY = "Destination";
    +    public static final String CHARACTER_SET_KEY = "Character Set";
    +    public static final String MAXIMUM_BUFFER_SIZE_KEY = "Maximum Buffer 
Size";
    +    public static final String NAMED_CAPTURES_ONLY_KEY = "Named captures 
only";
    +    public static final String SINGLE_MATCH_KEY = "Single Match";
    +    public static final String RESULT_PREFIX_KEY = "result prefix";
    +    public static final String MATCHED_EXP_ATTR_KEY = "matched expression 
attribute";
    +    public static final String EXP_SEPARATOR_KEY = "expression-separator";
    +    public static final String PATTERN_FILE_LIST_SEPARATOR = ",";
    +
    +    //properties
         public static final PropertyDescriptor GROK_EXPRESSION = new 
PropertyDescriptor.Builder()
    -        .name("Grok Expression")
    -        .description("Grok expression")
    +        .name(GROK_EXPRESSION_KEY)
    --- End diff --
    
    Just because sometimes (maybe not in this processor) the literals for the 
properties do get re-used. I like consistency, even if in a particular 
situation it might not add much. If the NiFi style is not to do it this 
way, I will happily use literals.
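
    To illustrate the re-use being described, here is a minimal sketch. It 
is not code from the pull request: the class PropertyKeySketch and the helpers 
configure and attributeName are hypothetical, and plain maps stand in for 
NiFi's property handling. Only the two constant names and their values are 
taken from the diff.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch only: plain maps stand in for NiFi's property handling.
public class PropertyKeySketch {
    // Names and values copied from the diff; declared once, then reused.
    public static final String GROK_EXPRESSION_KEY = "Grok Expression";
    public static final String RESULT_PREFIX_KEY = "result prefix";

    // Hypothetical helper: builds a configuration keyed by the constants.
    public static Map<String, String> configure(String expression, String prefix) {
        Map<String, String> config = new HashMap<>();
        config.put(GROK_EXPRESSION_KEY, expression);
        config.put(RESULT_PREFIX_KEY, prefix);
        return config;
    }

    // Hypothetical helper: a second call site (for example a unit test)
    // that reads the same key without retyping the string literal.
    public static String attributeName(Map<String, String> config, String identifier) {
        return config.get(RESULT_PREFIX_KEY) + identifier;
    }
}
```

    With a constant, a typo in the key becomes a compile error at every call 
site, whereas a mistyped literal would only show up at run time as a missing 
value.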


> Enhance ExtractGrok processor to handle multiple grok expressions
> -----------------------------------------------------------------
>
>                 Key: NIFI-4789
>                 URL: https://issues.apache.org/jira/browse/NIFI-4789
>             Project: Apache NiFi
>          Issue Type: New Feature
>          Components: Core Framework
>    Affects Versions: 1.2.0, 1.5.0
>         Environment: all
>            Reporter: Charles Porter
>            Priority: Minor
>              Labels: features
>
> Many flows require running several grok expressions against an input to 
> correctly tag and extract data. Using many separate grok processors to 
> accomplish this is unwieldy and hard to maintain. Supporting multiple grok 
> expressions delimited by a comma or a user-selected delimiter greatly 
> simplifies this.
> The feature is coded and tested, and ready for a pull request if it is approved.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
