[ 
https://issues.apache.org/jira/browse/DRILL-7817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17247287#comment-17247287
 ] 

ASF GitHub Bot commented on DRILL-7817:
---------------------------------------

cgivre commented on a change in pull request #2122:
URL: https://github.com/apache/drill/pull/2122#discussion_r540206798



##########
File path: contrib/format-httpd/README.md
##########
@@ -1,35 +1,39 @@
 # Web Server Log Format Plugin (HTTPD)
 This plugin enables Drill to read and query httpd (Apache Web Server) and nginx access logs natively. This plugin uses the work by [Niels Basjes](https://github.com/nielsbasjes
-) which is available here: https://github.com/nielsbasjes/logparser.
+) which is available here: https://github.com/nielsbasjes/logparser .

Review comment:
       Nit:  Why the extra space?

##########
File path: contrib/format-httpd/src/main/java/org/apache/drill/exec/store/httpd/HttpdParser.java
##########
@@ -35,45 +36,61 @@
 import org.slf4j.Logger;
 import org.slf4j.LoggerFactory;
 
+import java.util.ArrayList;
 import java.util.EnumSet;
-import java.util.HashMap;
 import java.util.List;
 import java.util.Map;
+import java.util.TreeMap;
+
+import static nl.basjes.parse.core.Casts.DOUBLE;
+import static nl.basjes.parse.core.Casts.DOUBLE_ONLY;
+import static nl.basjes.parse.core.Casts.LONG;
+import static nl.basjes.parse.core.Casts.LONG_ONLY;
+import static nl.basjes.parse.core.Casts.STRING;
+import static nl.basjes.parse.core.Casts.STRING_ONLY;
 
 public class HttpdParser {
 
   private static final Logger logger = LoggerFactory.getLogger(HttpdParser.class);
 
   public static final String PARSER_WILDCARD = ".*";
-  public static final String REMAPPING_FLAG = "#";
   private final Parser<HttpdLogRecord> parser;
   private final List<SchemaPath> requestedColumns;
   private final Map<String, MinorType> mappedColumns;
+  private final Map<String, Casts> columnCasts;
   private final HttpdLogRecord record;
   private final String logFormat;
+  private final boolean parseUserAgent;
+  private final String logParserRemapping;
   private Map<String, String> requestedPaths;
-  private EnumSet<Casts> casts;
-
 
-  public HttpdParser(final String logFormat, final String timestampFormat, final boolean flattenWildcards, final EasySubScan scan) {
+  public HttpdParser(
+          final String logFormat,
+          final String timestampFormat,
+          final boolean flattenWildcards,
+          final boolean parseUserAgent,
+          final String logParserRemapping,
+          final EasySubScan scan) {
 
     Preconditions.checkArgument(logFormat != null && !logFormat.trim().isEmpty(), "logFormat cannot be null or empty");
 
     this.logFormat = logFormat;
+    this.parseUserAgent = parseUserAgent;
     this.record = new HttpdLogRecord(timestampFormat, flattenWildcards);
 
-    if (timestampFormat == null) {
-      this.parser = new HttpdLoglineParser<>(HttpdLogRecord.class, logFormat);
-    } else {
-      this.parser = new HttpdLoglineParser<>(HttpdLogRecord.class, logFormat, timestampFormat);

Review comment:
       Will the parser throw errors or warnings if `timestampFormat` is `null`? I know you don't recommend that users provide a custom `timestampFormat`, so the original `null` check was there to optimize for the default case.
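
       A minimal sketch of the null check in question. `SketchParser` and `create` are hypothetical stand-ins, not the real `HttpdLoglineParser`, whose behaviour on a `null` timestamp format is exactly what is being asked above. Treating a blank string like `null` is also an assumption. It only shows how the default case could stay on the two-argument path:

```java
import java.util.Optional;

public class TimestampFormatChoice {

  /** Hypothetical stand-in for a parser; it only records which constructor was used. */
  static final class SketchParser {
    final String logFormat;
    final Optional<String> timestampFormat;

    // Default path: no custom timestamp format supplied.
    SketchParser(String logFormat) {
      this.logFormat = logFormat;
      this.timestampFormat = Optional.empty();
    }

    // Custom path: the caller supplied an explicit timestamp format.
    SketchParser(String logFormat, String timestampFormat) {
      this.logFormat = logFormat;
      this.timestampFormat = Optional.of(timestampFormat);
    }
  }

  /** Picks the constructor so that a null or blank format never reaches the parser. */
  static SketchParser create(String logFormat, String timestampFormat) {
    if (timestampFormat == null || timestampFormat.trim().isEmpty()) {
      return new SketchParser(logFormat);
    }
    return new SketchParser(logFormat, timestampFormat);
  }

  public static void main(String[] args) {
    SketchParser byDefault = create("combined", null);
    SketchParser custom = create("combined", "dd/MMM/yyyy:HH:mm:ss ZZ");
    System.out.println("default has custom format: " + byDefault.timestampFormat.isPresent());
    System.out.println("custom has custom format: " + custom.timestampFormat.isPresent());
  }
}
```

       If the real parser already tolerates `null`, the branch is redundant and the single call in the new code is fine.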

##########
File path: contrib/format-httpd/src/main/resources/bootstrap-format-plugins.json
##########
@@ -5,7 +5,7 @@
       "formats": {
         "httpd" : {
           "type" : "httpd",
-          "logFormat" : "%h %l %u %t \"%r\" %s %b \"%{Referer}i\" \"%{User-agent}i\"",
+          "logFormat" : "common\ncombined",

Review comment:
       Nice! I like this a lot better than having to specify the exact config. Can we put something in the `README` about this? Are there other canned configs we could use?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]


> Add direct Yauaa support for HTTPD Format Plugin.
> -------------------------------------------------
>
>                 Key: DRILL-7817
>                 URL: https://issues.apache.org/jira/browse/DRILL-7817
>             Project: Apache Drill
>          Issue Type: New Feature
>            Reporter: Niels Basjes
>            Assignee: Niels Basjes
>            Priority: Minor
>
> Enhancement of having the Yauaa useragent parser immediately integrated with 
> the HTTPD logparser.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
