[ https://issues.apache.org/jira/browse/NIFI-4539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Norito Agetsuma updated NIFI-4539:
----------------------------------
    Description:
ExtractGrok should support a named-captures-only option.
Currently, ExtractGrok returns all matches for a grok pattern. In some cases, this is verbose.
The following example parses an Apache common access log.
{noformat}
83.149.9.216 - - [17/May/2015:10:05:03 +0000] "GET /presentations/logstash-monitorama-2013/images/kibana-search.png HTTP/1.1" 200 203023 "http://semicomplete.com/presentations/logstash-monitorama-2013/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/32.0.1700.77 Safari/537.36
{noformat}
With named captures only disabled:
{noformat}
{
  "grok.auth": "-",
  "grok.timestamp": "17/May/2015:10:05:03 +0000",
  "grok.httpversion": "1.1",
  "grok.HOUR": "10",
  "grok.ident": "-",
  "grok.SECOND": "03",
  "grok.HTTPD_COMMONLOG": "83.149.9.216 - - [17/May/2015:10:05:03 +0000] \"GET /presentations/logstash-monitorama-2013/images/kibana-search.png HTTP/1.1\" 200 203023",
  "grok.USERNAME": "[-, -]",
  "grok.IP": "83.149.9.216",
  "grok.clientip": "83.149.9.216",
  "grok.verb": "GET",
  "grok.EMAILADDRESS": "[null, null]",
  "grok.request": "/presentations/logstash-monitorama-2013/images/kibana-search.png",
  "grok.EMAILLOCALPART": "[null, null]",
  "grok.INT": "+0000",
  "grok.BASE10NUM": "[1.1, 200, 203023]",
  "grok.YEAR": "2015",
  "grok.IPV4": "83.149.9.216",
  "grok.MINUTE": "05",
  "grok.HOSTNAME": "[null, null, null]",
  "grok.USER": "[-, -]",
  "grok.response": "200",
  "grok.bytes": "203023",
  "grok.TIME": "10:05:03",
  "grok.MONTH": "May",
  "grok.MONTHDAY": "17"
}
{noformat}
With named captures only enabled:
{noformat}
{
  "grok.request": "/presentations/logstash-monitorama-2013/images/kibana-search.png",
  "grok.auth": "-",
  "grok.ident": "-",
  "grok.timestamp": "17/May/2015:10:05:03 +0000",
  "grok.httpversion": "1.1",
  "grok.clientip": "83.149.9.216",
  "grok.response": "200",
  "grok.bytes": "203023",
  "grok.verb": "GET"
}
{noformat}

  was:
ExtractGrok should support a named-captures-only option.
Currently, ExtractGrok returns all matches for a grok pattern. In some cases, this is verbose.
The following example parses an Apache common access log.
{noformat}
83.149.9.216 - - [17/May/2015:10:05:03 +0000] "GET /presentations/logstash-monitorama-2013/images/kibana-search.png HTTP/1.1" 200 203023 "http://semicomplete.com/presentations/logstash-monitorama-2013/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/32.0.1700.77 Safari/537.36
{noformat}
With named captures only disabled:
{noformat}
{
  "grok.auth": "-",
  "grok.timestamp": "17/May/2015:10:05:03 +0000",
  "grok.httpversion": "1.1",
  "grok.HOUR": "10",
  "grok.ident": "-",
  "grok.SECOND": "03",
  "grok.HTTPD_COMMONLOG": "83.149.9.216 - - [17/May/2015:10:05:03 +0000] \"GET /presentations/logstash-monitorama-2013/images/kibana-search.png HTTP/1.1\" 200 203023",
  "grok.USERNAME": "[-, -]",
  "grok.IP": "83.149.9.216",
  "grok.clientip": "83.149.9.216",
  "grok.verb": "GET",
  "grok.EMAILADDRESS": "[null, null]",
  "grok.request": "/presentations/logstash-monitorama-2013/images/kibana-search.png",
  "grok.EMAILLOCALPART": "[null, null]",
  "grok.INT": "+0000",
  "grok.BASE10NUM": "[1.1, 200, 203023]",
  "grok.YEAR": "2015",
  "grok.IPV4": "83.149.9.216",
  "grok.MINUTE": "05",
  "grok.HOSTNAME": "[null, null, null]",
  "grok.USER": "[-, -]",
  "grok.response": "200",
  "grok.bytes": "203023",
  "grok.TIME": "10:05:03",
  "grok.MONTH": "May",
  "grok.MONTHDAY": "17"
}
{noformat}
With named captures only enabled:
{noformat]
{
  "grok.request": "/presentations/logstash-monitorama-2013/images/kibana-search.png",
  "grok.auth": "-",
  "grok.ident": "-",
  "grok.timestamp": "17/May/2015:10:05:03 +0000",
  "grok.httpversion": "1.1",
  "grok.clientip": "83.149.9.216",
  "grok.response": "200",
  "grok.bytes": "203023",
  "grok.verb": "GET"
}
{noformat}


> ExtractGrok - Add support returning only named captures
> -------------------------------------------------------
>
>                 Key: NIFI-4539
>                 URL: https://issues.apache.org/jira/browse/NIFI-4539
>             Project: Apache NiFi
>          Issue Type: Improvement
>          Components: Extensions
>    Affects Versions: 1.4.0
>            Reporter: Norito Agetsuma

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)