[ 
https://issues.apache.org/jira/browse/DRILL-7293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16862968#comment-16862968
 ] 

ASF GitHub Bot commented on DRILL-7293:
---------------------------------------

arina-ielchiieva commented on pull request #1807: DRILL-7293: Convert the regex 
("log") plugin to use EVF
URL: https://github.com/apache/drill/pull/1807#discussion_r293324511
 
 

 ##########
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/store/log/README.md
 ##########
 @@ -11,26 +18,50 @@ If you wanted to analyze log files such as the MySQL log 
sample shown below usin
 070917 16:29:01      21 Query       select * from location
 070917 16:29:12      21 Query       select * from location where id = 1 LIMIT 1
 ```
-This plugin will allow you to configure Drill to directly query logfiles of 
any configuration.
+
+Using this plugin, you can configure Drill to directly query log files of
+any configuration.
 
 ## Configuration Options
-* **`type`**:  This tells Drill which extension to use.  In this case, it must 
be `logRegex`.  This field is mandatory.
-* **`regex`**:  This is the regular expression which defines how the log file 
lines will be split.  You must enclose the parts of the regex in grouping 
parentheses that you wish to extract.  Note that this plugin uses Java regular 
expressions and requires that shortcuts such as `\d` have an additional slash:  
ie `\\d`.  This field is mandatory.
-* **`extension`**:  This option tells Drill which file extensions should be 
mapped to this configuration.  Note that you can have multiple configurations 
of this plugin to allow you to query various log files.  This field is 
mandatory.
-* **`maxErrors`**:  Log files can be inconsistent and messy.  The `maxErrors` 
variable allows you to set how many errors the reader will ignore before 
halting execution and throwing an error.  Defaults to 10.
-* **`schema`**:  The `schema` field is where you define the structure of the 
log file.  This section is optional.  If you do not define a schema, all fields 
will be assigned a column name of `field_n` where `n` is the index of the 
field. The undefined fields will be assigned a default data type of `VARCHAR`.
+
+* **`type`**:  This tells Drill which extension to use.  In this case, it must
+be `logRegex`.  This field is mandatory.
 
 Review comment:
   ```suggestion
   be `logRegex`. This field is mandatory.
   ```
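
For context on the `regex` option quoted in the hunk above, here is a minimal sketch of how a Java regular expression with grouping parentheses splits one of the sample MySQL log lines into fields. It is not the plugin's reader code. The class name `LogRegexSketch`, the method `extractFields`, the pattern itself, and the zero-based `field_n` numbering are assumptions made for illustration only.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Drill: shows capture groups becoming fields.
class LogRegexSketch {
    // In a Java string literal each regex backslash is doubled (\d is written \\d),
    // the same doubling the README asks for in the plugin configuration.
    static final Pattern LINE = Pattern.compile(
        "(\\d{6})\\s(\\d{2}:\\d{2}:\\d{2})\\s+(\\d+)\\s(\\w+)\\s+(.*)");

    // Returns one entry per capture group, or null if the line does not match.
    // Assumption: fields are named field_n with n starting at 0.
    static Map<String, String> extractFields(String line) {
        Matcher m = LINE.matcher(line);
        if (!m.matches()) {
            return null;
        }
        Map<String, String> fields = new LinkedHashMap<>();
        for (int i = 1; i <= m.groupCount(); i++) {
            fields.put("field_" + (i - 1), m.group(i));
        }
        return fields;
    }

    public static void main(String[] args) {
        String line =
            "070917 16:29:12      21 Query       select * from location where id = 1 LIMIT 1";
        System.out.println(extractFields(line));
    }
}
```

Each pair of parentheses yields one column; a line that does not match the whole pattern returns `null` here, which loosely corresponds to what the README counts against `maxErrors`.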
 
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Convert the regex ("log") plugin to use EVF
> -------------------------------------------
>
>                 Key: DRILL-7293
>                 URL: https://issues.apache.org/jira/browse/DRILL-7293
>             Project: Apache Drill
>          Issue Type: Improvement
>    Affects Versions: 1.16.0
>            Reporter: Paul Rogers
>            Assignee: Paul Rogers
>            Priority: Major
>             Fix For: 1.17.0
>
>
> The "log" plugin (which uses a regex to define the row format) is the subject 
> of Chapter 12 of the Learning Apache Drill book (though the version in the 
> book is simpler than the one in the master branch).
> The recently completed "Enhanced Vector Framework" (EVF, AKA the "row set 
> framework") gives Drill control over the size of batches created by readers, 
> and allows readers to use the recently added provided schema mechanism.
> We wish to use the log reader as an example of how to convert a Drill format 
> plugin to use the EVF so that other developers can convert their own plugins.
> This PR provides the first set of log plugin changes to enable us to publish 
> a tutorial on the EVF.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
