[ https://issues.apache.org/jira/browse/DRILL-7293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16862968#comment-16862968 ]
ASF GitHub Bot commented on DRILL-7293:
---------------------------------------

arina-ielchiieva commented on pull request #1807: DRILL-7293: Convert the regex ("log") plugin to use EVF
URL: https://github.com/apache/drill/pull/1807#discussion_r293324511

 ##########
 File path: exec/java-exec/src/main/java/org/apache/drill/exec/store/log/README.md
 ##########
 @@ -11,26 +18,50 @@ If you wanted to analyze log files such as the MySQL log sample shown below usin
  070917 16:29:01 21 Query select * from location
  070917 16:29:12 21 Query select * from location where id = 1 LIMIT 1
  ```
 -This plugin will allow you to configure Drill to directly query logfiles of any configuration.
 +
 +Using this plugin, you can configure Drill to directly query log files of
 +any configuration.

  ## Configuration Options
 -* **`type`**: This tells Drill which extension to use. In this case, it must be `logRegex`. This field is mandatory.
 -* **`regex`**: This is the regular expression which defines how the log file lines will be split. You must enclose the parts of the regex in grouping parentheses that you wish to extract. Note that this plugin uses Java regular expressions and requires that shortcuts such as `\d` have an additional slash: ie `\\d`. This field is mandatory.
 -* **`extension`**: This option tells Drill which file extensions should be mapped to this configuration. Note that you can have multiple configurations of this plugin to allow you to query various log files. This field is mandatory.
 -* **`maxErrors`**: Log files can be inconsistent and messy. The `maxErrors` variable allows you to set how many errors the reader will ignore before halting execution and throwing an error. Defaults to 10.
 -* **`schema`**: The `schema` field is where you define the structure of the log file. This section is optional. If you do not define a schema, all fields will be assigned a column name of `field_n` where `n` is the index of the field. The undefined fields will be assigned a default data type of `VARCHAR`.
 +
 +* **`type`**: This tells Drill which extension to use. In this case, it must
 +be `logRegex`. This field is mandatory.

Review comment:
 ```suggestion
 be `logRegex`. This field is mandatory.
 ```

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

> Convert the regex ("log") plugin to use EVF
> -------------------------------------------
>
> Key: DRILL-7293
> URL: https://issues.apache.org/jira/browse/DRILL-7293
> Project: Apache Drill
> Issue Type: Improvement
> Affects Versions: 1.16.0
> Reporter: Paul Rogers
> Assignee: Paul Rogers
> Priority: Major
> Fix For: 1.17.0
>
>
> The "log" plugin (which uses a regex to define the row format) is the subject
> of Chapter 12 of the Learning Apache Drill book (though the version in the
> book is simpler than the one in the master branch.)
>
> The recently-completed "Enhanced Vector Framework" (EVF, AKA the "row set
> framework") gives Drill control over the size of batches created by readers,
> and allows readers to use the recently-added provided schema mechanism.
>
> We wish to use the log reader as an example for how to convert a Drill format
> plugin to use the EVF so that other developers can convert their own plugins.
>
> This PR provides the first set of log plugin changes to enable us to publish
> a tutorial on the EVF.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)