[GitHub] kkhatua commented on issue #1608: DRILL-6960: Auto Limit Wrapping should not apply to non-select query

2019-02-05 Thread GitBox
kkhatua commented on issue #1608: DRILL-6960: Auto Limit Wrapping should not 
apply to non-select query
URL: https://github.com/apache/drill/pull/1608#issuecomment-460930334
 
 
   Let me take a look at `WebUserConnection`. I want to make sure that we 
don't hold all the results indefinitely if we get a hint (the autoLimit) that 
the entire result set will not be consumed. From the description of the class, 
it looks like it might be able to do that.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] cgivre commented on a change in pull request #1530: DRILL-6582: SYSLOG (RFC-5424) Format Plugin

2019-02-05 Thread GitBox
cgivre commented on a change in pull request #1530: DRILL-6582: SYSLOG 
(RFC-5424) Format Plugin
URL: https://github.com/apache/drill/pull/1530#discussion_r254135268
 
 

 ##
 File path: contrib/format-syslog/src/main/resources/checkstyle-config.xml
 ##
 @@ -0,0 +1,40 @@
+
+
+    "http://www.puppycrawl.com/dtds/configuration_1_2.dtd">
+
+
 
 Review comment:
   Removed




[GitHub] cgivre commented on a change in pull request #1530: DRILL-6582: SYSLOG (RFC-5424) Format Plugin

2019-02-05 Thread GitBox
cgivre commented on a change in pull request #1530: DRILL-6582: SYSLOG 
(RFC-5424) Format Plugin
URL: https://github.com/apache/drill/pull/1530#discussion_r254135160
 
 

 ##
 File path: 
contrib/format-syslog/src/main/java/org/apache/drill/exec/store/syslog/SyslogFormatConfig.java
 ##
 @@ -0,0 +1,86 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.drill.exec.store.syslog;
+
+import com.fasterxml.jackson.annotation.JsonTypeName;
+import com.google.common.base.Objects;
+import org.apache.drill.common.logical.FormatPluginConfig;
+
+import java.util.Arrays;
+import java.util.List;
+import java.util.ArrayList;
+
+@JsonTypeName("syslog")
+public class SyslogFormatConfig implements FormatPluginConfig {
+
+  public List<String> extensions;
+  public int maxErrors = 10;
+  public boolean flattenStructuredData = false;
+
+  public boolean getFlattenStructuredData() {
+    return flattenStructuredData;
+  }
+
+  public int getMaxErrors() {
+    return maxErrors;
+  }
+
+  public List<String> getExtensions() {
+    return extensions;
+  }
+
+  public void setExtensions(List<String> ext) {
+    this.extensions = ext;
+  }
+
+  public void setExtension(String ext) {
+    if (this.extensions == null) {
+      this.extensions = new ArrayList<>();
+    }
+    this.extensions.add(ext);
+  }
+
+  public void setMaxErrors(int errors) {
+    this.maxErrors = errors;
+  }
+
+  public void setFlattenStructuredData(boolean flattenErrors) {
+    this.flattenStructuredData = flattenErrors;
+  }
+
+  @Override
+  public boolean equals(Object obj) {
+    if (this == obj) {
+      return true;
+    }
+    if (obj == null || getClass() != obj.getClass()) {
+      return false;
+    }
+    SyslogFormatConfig other = (SyslogFormatConfig) obj;
+    return Objects.equal(extensions, other.extensions) &&
+        Objects.equal(maxErrors, other.maxErrors) &&
+        Objects.equal(flattenStructuredData, other.flattenStructuredData);
+  }
+
+  @Override
+  public int hashCode() {
+    return Arrays.hashCode(new Object[]{maxErrors, flattenStructuredData, extensions});
+  }
+
+}
 
 Review comment:
   I think I got them all this time. 




[jira] [Created] (DRILL-7029) Move LogRegex Format Plugin to Contrib

2019-02-05 Thread Charles Givre (JIRA)
Charles Givre created DRILL-7029:


 Summary: Move LogRegex Format Plugin to Contrib
 Key: DRILL-7029
 URL: https://issues.apache.org/jira/browse/DRILL-7029
 Project: Apache Drill
  Issue Type: Task
Reporter: Charles Givre
Assignee: Charles Givre






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] cgivre commented on a change in pull request #1530: DRILL-6582: SYSLOG (RFC-5424) Format Plugin

2019-02-05 Thread GitBox
cgivre commented on a change in pull request #1530: DRILL-6582: SYSLOG 
(RFC-5424) Format Plugin
URL: https://github.com/apache/drill/pull/1530#discussion_r254134574
 
 

 ##
 File path: contrib/format-syslog/README.md
 ##
 @@ -0,0 +1,41 @@
+# Syslog Format Plugin
 
 Review comment:
   Hi @vdiravka,
   I meant `contrib`.
   Anyway, here's the link to the JIRA ticket: 
https://issues.apache.org/jira/browse/DRILL-7029.




[GitHub] kkhatua commented on a change in pull request #1608: DRILL-6960: Auto Limit Wrapping should not apply to non-select query

2019-02-05 Thread GitBox
kkhatua commented on a change in pull request #1608: DRILL-6960: Auto Limit 
Wrapping should not apply to non-select query
URL: https://github.com/apache/drill/pull/1608#discussion_r254118821
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/ops/QueryContext.java
 ##
 @@ -84,9 +85,14 @@
   private boolean closed = false;
   private DrillOperatorTable table;
 
-  public QueryContext(final UserSession session, final DrillbitContext 
drillbitContext, QueryId queryId) {
+  public QueryContext(UserSession session, DrillbitContext drillbitContext, 
QueryId queryId) {
+this(session, drillbitContext, queryId, null);
+  }
+
+  public QueryContext(UserSession session, DrillbitContext drillbitContext, 
QueryId queryId, Integer autoLimit) {
 
 Review comment:
   I'm effectively creating an additional constructor with a modified 
signature, so as not to change anything else. We can't go with the "query 
option" solution, so let me see if I can do without it.




[GitHub] kkhatua commented on a change in pull request #1608: DRILL-6960: Auto Limit Wrapping should not apply to non-select query

2019-02-05 Thread GitBox
kkhatua commented on a change in pull request #1608: DRILL-6960: Auto Limit 
Wrapping should not apply to non-select query
URL: https://github.com/apache/drill/pull/1608#discussion_r254115477
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/DrillSqlWorker.java
 ##
 @@ -183,4 +183,20 @@ private static PhysicalPlan getQueryPlan(QueryContext 
context, String sql, Point
 
 return handler.getPlan(sqlNode);
   }
+
+  //Wrap with Auto Limit
+  private static SqlNode wrapWithAutoLimit(SqlConverter parser, QueryContext 
context, String sql) {
+SqlNode sqlNode = parser.parse(sql);
 
 Review comment:
   This helps a lot! I've no knowledge of the Calcite SqlNodes, so this snippet 
helps.




[GitHub] kkhatua commented on a change in pull request #1608: DRILL-6960: Auto Limit Wrapping should not apply to non-select query

2019-02-05 Thread GitBox
kkhatua commented on a change in pull request #1608: DRILL-6960: Auto Limit 
Wrapping should not apply to non-select query
URL: https://github.com/apache/drill/pull/1608#discussion_r254106628
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/server/rest/profile/ProfileWrapper.java
 ##
 @@ -150,6 +150,14 @@ private long 
tallyMajorFragmentCost(List majorFragments) {
 return globalProcessNanos;
   }
 
+  public boolean hasAutoLimit() {
+return profile.hasAutoLimit();
+  }
+
+  public int getAutoLimit() {
+return profile.getAutoLimit();
 
 Review comment:
   My statement above... I don't think the query option is the right thing to 
do, since this is a WebUI interaction feature.




[GitHub] kkhatua commented on a change in pull request #1608: DRILL-6960: Auto Limit Wrapping should not apply to non-select query

2019-02-05 Thread GitBox
kkhatua commented on a change in pull request #1608: DRILL-6960: Auto Limit 
Wrapping should not apply to non-select query
URL: https://github.com/apache/drill/pull/1608#discussion_r254106698
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/work/user/UserWorker.java
 ##
 @@ -73,9 +73,13 @@ private static QueryId queryIdGenerator() {
   }
 
   public QueryId submitWork(UserClientConnection connection, RunQuery query) {
+return submitWork(connection, query, null);
+  }
+
+  public QueryId submitWork(UserClientConnection connection, RunQuery query, 
Integer autoLimitRowCount) {
 
 Review comment:
   Agreed. Let me take a look at modifying RunQuery.




[GitHub] kkhatua commented on a change in pull request #1608: DRILL-6960: Auto Limit Wrapping should not apply to non-select query

2019-02-05 Thread GitBox
kkhatua commented on a change in pull request #1608: DRILL-6960: Auto Limit 
Wrapping should not apply to non-select query
URL: https://github.com/apache/drill/pull/1608#discussion_r254106312
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/work/foreman/Foreman.java
 ##
 @@ -136,15 +136,30 @@
*/
   public Foreman(final WorkerBee bee, final DrillbitContext drillbitContext,
   final UserClientConnection connection, final QueryId queryId, final 
RunQuery queryRequest) {
+this(bee, drillbitContext, connection, queryId, queryRequest, null);
+  }
+
+  /**
+   * Constructor. Sets up the Foreman, but does not initiate any execution.
+   *
+   * @param bee work manager (runs fragments)
+   * @param drillbitContext drillbit context
+   * @param connection connection
+   * @param queryId the id for the query
+   * @param queryRequest the query to execute
+   * @param autoLimitRowCount the number of rows to limit in execution (used 
for WebRequests)
+   */
+  public Foreman(final WorkerBee bee, final DrillbitContext drillbitContext,
 
 Review comment:
   I was reluctant to make too many changes to the protobuf, so I avoided 
modifying RunQuery. I can look into what changing the protobuf for this would 
involve.






[GitHub] kkhatua commented on a change in pull request #1608: DRILL-6960: Auto Limit Wrapping should not apply to non-select query

2019-02-05 Thread GitBox
kkhatua commented on a change in pull request #1608: DRILL-6960: Auto Limit 
Wrapping should not apply to non-select query
URL: https://github.com/apache/drill/pull/1608#discussion_r254105360
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/ops/QueryContext.java
 ##
 @@ -273,6 +279,29 @@ public RemoteFunctionRegistry getRemoteFunctionRegistry() 
{
 return drillbitContext.getRemoteFunctionRegistry();
   }
 
+  /**
+   * Check if auto-limiting of resultset is enabled
+   * @return True if auto-limit is enabled
+   */
+  public boolean isAutoLimitEnabled() {
+return autoLimitRowCount != null;
+  }
+
+  /**
+   * Returns the maximum size of auto-limited resultset
+   * @return Maximum size of auto-limited resultSet
+   */
+  public Integer getAutoLimitRowCount() {
 
 Review comment:
   The feature is specifically for WebUI submitted queries. A query option 
doesn't make sense because WebUI queries are "session-less" and the only reason 
we want to wrap with a limit is to prevent the Drillbit from being overloaded 
with holding an entire result set that does not get completely consumed by the 
HTTP client.




[jira] [Created] (DRILL-7028) Reduce the planning time of queries on large Parquet tables with large metadata cache files

2019-02-05 Thread Venkata Jyothsna Donapati (JIRA)
Venkata Jyothsna Donapati created DRILL-7028:


 Summary: Reduce the planning time of queries on large Parquet 
tables with large metadata cache files
 Key: DRILL-7028
 URL: https://issues.apache.org/jira/browse/DRILL-7028
 Project: Apache Drill
  Issue Type: Improvement
  Components: Metadata
Reporter: Venkata Jyothsna Donapati
Assignee: Venkata Jyothsna Donapati
 Fix For: 1.16.0


If a Parquet table has a large number of small files, the metadata cache 
files grow large, and the planner has to read the large metadata cache file, 
which adds planning-time overhead. Most of the execution time is spent in the 
planning phase.





[jira] [Created] (DRILL-7027) TPCH Queries hit IOB when planner.enable_demux_exchange = true

2019-02-05 Thread Karthikeyan Manivannan (JIRA)
Karthikeyan Manivannan created DRILL-7027:
-

 Summary: TPCH Queries hit IOB when planner.enable_demux_exchange = 
true
 Key: DRILL-7027
 URL: https://issues.apache.org/jira/browse/DRILL-7027
 Project: Apache Drill
  Issue Type: Bug
  Components: Execution - Flow
Reporter: Karthikeyan Manivannan
Assignee: Karthikeyan Manivannan


Running TPCH queries on the SF100 dataset, a few queries (13, 14, and 19) hit 
an IOB exception:
{code}
 java.sql.SQLException: SYSTEM ERROR: IndexOutOfBoundsException: index 154

Fragment 7:0

Please, refer to logs for more information.

[Error Id: e312dc77-0cad-4bc0-b90e-fb0d477ef272 on ucs-node2.perf.lab:31010]
at 
org.apache.drill.jdbc.impl.DrillCursor.nextRowInternally(DrillCursor.java:528)
at org.apache.drill.jdbc.impl.DrillCursor.next(DrillCursor.java:632)
at 
org.apache.calcite.avatica.AvaticaResultSet.next(AvaticaResultSet.java:217)
at 
org.apache.drill.jdbc.impl.DrillResultSetImpl.next(DrillResultSetImpl.java:151)
at PipSQueak.fetchRows(PipSQueak.java:346)
at PipSQueak.runTest(PipSQueak.java:113)
at PipSQueak.main(PipSQueak.java:477)
Caused by: org.apache.drill.common.exceptions.UserRemoteException: SYSTEM 
ERROR: IndexOutOfBoundsException: index 154

Fragment 7:0

Please, refer to logs for more information.

[Error Id: e312dc77-0cad-4bc0-b90e-fb0d477ef272 on ucs-node2.perf.lab:31010]
at 
org.apache.drill.exec.rpc.user.QueryResultHandler.resultArrived(QueryResultHandler.java:123)
at org.apache.drill.exec.rpc.user.UserClient.handle(UserClient.java:422)
at org.apache.drill.exec.rpc.user.UserClient.handle(UserClient.java:96)
at 
org.apache.drill.exec.rpc.RpcBus$InboundHandler.decode(RpcBus.java:273)
at 
org.apache.drill.exec.rpc.RpcBus$InboundHandler.decode(RpcBus.java:243)
at 
io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:88)
at 
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356)
at 
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342)
at 
io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:335)
at 
io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:287)
at 
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356)
at 
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342)
at 
io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:335)
at 
io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:102)
at 
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356)
at 
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342)
at 
io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:335)
at 
io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:312)
at 
io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:286)
at 
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356)
at 
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342)
at 
io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:335)
at 
io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:86)
at 
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356)
at 
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342)
at 
io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:335)
at 
io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1294)
at 
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356)
at 
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342)
at 
io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:911)
at 
io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)
at 
io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:645)
at 

[GitHub] cgivre opened a new pull request #1635: DRILL-7021: HTTPD Throws NPE and Doesn't Recognize Timeformat

2019-02-05 Thread GitBox
cgivre opened a new pull request #1635: DRILL-7021: HTTPD Throws NPE and 
Doesn't Recognize Timeformat
URL: https://github.com/apache/drill/pull/1635
 
 
   This PR addresses several issues with the HTTPD format plugin:
   1.  Several fields were missing from the implementation, which caused an 
NPE if you included them explicitly in a query.
   2.  The format plugin was not parsing timestamps correctly or projecting 
the fields as timestamps.
   3.  The `request_user-agent` field had a dash in it which, if not escaped 
correctly, would cause problems in queries.
   4.  The unit tests were not satisfactory.
   5.  The default format specified in the `bootstrap.json` file omitted the 
user-agent string, which most HTTPD logs do include.
   
   This PR fixes all of these issues and updates the HTTPD log parser to 
version 5.2, which accepts Nginx logs.
   




[GitHub] kfaraaz opened a new pull request #1634: DRILL-6979: Added autofocus attribute to username on login page, and to query tex…

2019-02-05 Thread GitBox
kfaraaz opened a new pull request #1634: DRILL-6979: Added autofocus attribute 
to username on login page, and to query tex…
URL: https://github.com/apache/drill/pull/1634
 
 
   …tbox on Query tab
   
@sohami @kkhatua Can you please review?
   




Apache Drill Hangout - 05 Feb, 2019

2019-02-05 Thread Bohdan Kazydub
Hi Drillers,
The bi-weekly Apache Drill hangout is scheduled for today, Tuesday, Feb
5th, at 10 AM PST. The original plan is for Sorabh & Hanumath to talk about
Resource Management. If there are any other topics or questions, feel free
to reply or raise during the hangout.

The hangout link: https://meet.google.com/yki-iqdf-tai?authuser=3

P.S. Sorry for the late notification.


[GitHub] vvysotskyi commented on issue #1608: DRILL-6960: Auto Limit Wrapping should not apply to non-select query

2019-02-05 Thread GitBox
vvysotskyi commented on issue #1608: DRILL-6960: Auto Limit Wrapping should not 
apply to non-select query
URL: https://github.com/apache/drill/pull/1608#issuecomment-460669449
 
 
   Regarding this fix and the fix for DRILL-6050: I think allowing the web 
client to rewrite the query is not the best approach. A user may want to 
submit a query to see its profile, for example, but with this feature the 
profile will not correspond to the original query. Besides that, this fix 
makes a lot of changes for the web client on the server side.
   
   It would be better to address this issue by limiting rows which should be 
obtained from the result set on the client side (see 
[WebUserConnection.sendData()](https://github.com/apache/drill/blob/master/exec/java-exec/src/main/java/org/apache/drill/exec/server/rest/WebUserConnection.java#L138)
 method).
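   The idea of truncating on the receiving side rather than rewriting the SQL 
could be sketched roughly as below. This is a hypothetical stand-in, not the 
real `WebUserConnection` API: the actual class deals in `QueryDataBatch` / 
`RecordBatchLoader` objects rather than plain lists, and the class and method 
names here are invented for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical buffer for web results: stop retaining rows once the
// auto-limit is reached, instead of wrapping the submitted SQL in a LIMIT.
public class RowLimitingBuffer {
  private final int maxRows;
  private final List<String> rows = new ArrayList<>();

  public RowLimitingBuffer(int maxRows) {
    this.maxRows = maxRows;
  }

  // Called once per incoming batch; returns false when the caller should
  // stop feeding data (and could cancel the still-running query).
  public boolean addBatch(List<String> batch) {
    for (String row : batch) {
      if (rows.size() >= maxRows) {
        return false; // limit hit: drop the remainder of the result set
      }
      rows.add(row);
    }
    return rows.size() < maxRows;
  }

  public List<String> getRows() {
    return rows;
  }
}
```

   The trade-off vvysotskyi points out: the query plan and profile stay true 
to the submitted SQL, while the server still avoids holding an unbounded 
result set for the HTTP client.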




Re: [DISCUSS] Format plugins in contrib module

2019-02-05 Thread Vitalii Diravka
Absolutely agree with Arina.

I think the core format plugins for Parquet, JSON, and CSV/TSV/PSV files
(which are used for creating Drill tables) can be left in the current config
file, and the rest should be factored out into separate config files, along
with creating separate modules in Drill's *contrib* module.

That would make the process of creating new plugins more transparent.

Kind regards
Vitalii


On Tue, Feb 5, 2019 at 3:12 PM Charles Givre  wrote:

> I’d concur with Arina’s suggestion.  I do think this would be useful and
> make it easier to make plugins “pluggable”.
> In the meantime, should we recommend that developers of format-plugins
> include their plugins in the bootstrap-storage-plugins.json?  I was
> thinking also that we might want to have some guidelines for unit tests for
> format plugins.  I’m doing some work on the HTTPD format plugin and found
> some issues which cause it to throw NPEs.
> — C
>
>
> > On Feb 5, 2019, at 06:40, Arina Yelchiyeva 
> wrote:
> >
> > Hi all,
> >
> > Before we were adding new formats / plugins into the exec module.
> Eventually we came up to the point that exec package size is growing and
> adding plugin and format contributions is better to separate out in the
> different module.
> > Now we have contrib module where we add such contributions. Plugins are
> pluggable, they are added automatically by means of having
> drill-module.conf file which points to the scanning packages.
> > Format plugins are using the same approach, the only problem is that
> they are not added into bootstrap-storage-plugins.json. So when adding new
> format plugin, in order for it to automatically appear in Drill Web UI,
> developer has to update bootstrap file which is in the exec module.
> > My suggestion we implement some functionality that would merge format
> config with the bootstrap one. For example, each plugin would have to have
> bootstrap-format.json file with the information to which plugin format
> should be added (structure the same as in bootstrap-storage-plugins.json):
> > Example:
> >
> > {
> >  "storage":{
> >dfs: {
> >  formats: {
> >"psv" : {
> >  type: "msgpack",
> >  extensions: [ "mp" ]
> >}
> >  }
> >}
> >  }
> > }
> >
> > Then during Drill start up such bootstrap-format.json files will be
> merged with bootstrap-storage-plugins.json.
> >
> >
> > Current open PR for adding new format plugins:
> > Format plugin for LTSV files - https://github.com/apache/drill/pull/1627
> > SYSLOG (RFC-5424) Format Plugin -
> https://github.com/apache/drill/pull/1530
> > Msgpack format reader - https://github.com/apache/drill/pull/1500
> >
> > Any suggestions?
> >
> > Kind regards,
> > Arina
>
>
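The merge step Arina proposes above — folding each plugin's 
bootstrap-format.json into bootstrap-storage-plugins.json at startup — could 
work as a recursive deep-merge of the two JSON trees. A minimal sketch, using 
plain nested Maps to stand in for parsed JSON (Drill itself would presumably 
use Jackson's tree model; the class and method names here are hypothetical):

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class BootstrapMerger {

  // Recursively merge a per-plugin bootstrap-format tree (overlay) into the
  // main bootstrap-storage-plugins tree (base). Nested objects are merged
  // key by key; leaf values from the overlay win on conflict.
  @SuppressWarnings("unchecked")
  public static Map<String, Object> merge(Map<String, Object> base,
                                          Map<String, Object> overlay) {
    Map<String, Object> result = new LinkedHashMap<>(base);
    for (Map.Entry<String, Object> e : overlay.entrySet()) {
      Object existing = result.get(e.getKey());
      if (existing instanceof Map && e.getValue() instanceof Map) {
        result.put(e.getKey(),
            merge((Map<String, Object>) existing, (Map<String, Object>) e.getValue()));
      } else {
        result.put(e.getKey(), e.getValue());
      }
    }
    return result;
  }
}
```

With this shape, a format plugin's `{"storage": {"dfs": {"formats": {...}}}}` 
fragment would slot its entries into the existing `dfs.formats` object without 
disturbing the formats already registered there.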


[GitHub] vvysotskyi commented on a change in pull request #1608: DRILL-6960: Auto Limit Wrapping should not apply to non-select query

2019-02-05 Thread GitBox
vvysotskyi commented on a change in pull request #1608: DRILL-6960: Auto Limit 
Wrapping should not apply to non-select query
URL: https://github.com/apache/drill/pull/1608#discussion_r253866957
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/DrillSqlWorker.java
 ##
 @@ -183,4 +183,20 @@ private static PhysicalPlan getQueryPlan(QueryContext 
context, String sql, Point
 
 return handler.getPlan(sqlNode);
   }
+
+  //Wrap with Auto Limit
+  private static SqlNode wrapWithAutoLimit(SqlConverter parser, QueryContext 
context, String sql) {
+SqlNode sqlNode = parser.parse(sql);
 
 Review comment:
   Please replace this code with the following to avoid parsing the query 
several times.
   ```suggestion
   SqlNode sqlNode = parser.parse(sql);
   
   int fetch = context.getAutoLimitRowCount();
   
   if (sqlNode.getKind().belongsTo(SqlKind.QUERY)) {
 if (sqlNode.getKind() == SqlKind.ORDER_BY) {
   // merge SqlOrderBy
   SqlOrderBy orderBy = (SqlOrderBy) sqlNode;
   if (orderBy.fetch != null) {
 fetch = Math.min(Integer.valueOf(orderBy.fetch.toString()), fetch);
   }
   sqlNode = new SqlOrderBy(orderBy.getParserPosition(), orderBy.query, 
orderBy.orderList, orderBy.offset,
   SqlLiteral.createExactNumeric(Integer.toString(fetch), 
SqlParserPos.ZERO));
 } else {
   sqlNode = new SqlOrderBy(SqlParserPos.ZERO, sqlNode, 
SqlNodeList.EMPTY, null,
   SqlLiteral.createExactNumeric(Integer.toString(fetch), 
SqlParserPos.ZERO));
 }
   }
   ```




Re: [DISCUSS] Format plugins in contrib module

2019-02-05 Thread Charles Givre
I’d concur with Arina’s suggestion.  I do think this would be useful and make 
it easier to make plugins “pluggable”.  
In the meantime, should we recommend that developers of format-plugins include 
their plugins in the bootstrap-storage-plugins.json?  I was thinking also that 
we might want to have some guidelines for unit tests for format plugins.  I’m 
doing some work on the HTTPD format plugin and found some issues which cause it 
to throw NPEs.  
— C


> On Feb 5, 2019, at 06:40, Arina Yelchiyeva  wrote:
> 
> Hi all,
> 
> Before we were adding new formats / plugins into the exec module. Eventually 
> we came up to the point that exec package size is growing and adding plugin 
> and format contributions is better to separate out in the different module.
> Now we have contrib module where we add such contributions. Plugins are 
> pluggable, they are added automatically by means of having drill-module.conf 
> file which points to the scanning packages.
> Format plugins are using the same approach, the only problem is that they are 
> not added into bootstrap-storage-plugins.json. So when adding new format 
> plugin, in order for it to automatically appear in Drill Web UI, developer 
> has to update bootstrap file which is in the exec module.
> My suggestion we implement some functionality that would merge format config 
> with the bootstrap one. For example, each plugin would have to have 
> bootstrap-format.json file with the information to which plugin format should 
> be added (structure the same as in bootstrap-storage-plugins.json):
> Example:
> 
> {
>  "storage":{
>dfs: {
>  formats: {
>"psv" : {
>  type: "msgpack",
>  extensions: [ "mp" ]
>}
>  }
>}
>  }
> }
> 
> Then during Drill start up such bootstrap-format.json files will be merged 
> with bootstrap-storage-plugins.json.
> 
> 
> Current open PR for adding new format plugins:
> Format plugin for LTSV files - https://github.com/apache/drill/pull/1627
> SYSLOG (RFC-5424) Format Plugin - https://github.com/apache/drill/pull/1530 
> Msgpack format reader - https://github.com/apache/drill/pull/1500
> 
> Any suggestions?
> 
> Kind regards,
> Arina



[jira] [Resolved] (DRILL-4813) Need better error message for table function when table does not exist

2019-02-05 Thread Arina Ielchiieva (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-4813?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arina Ielchiieva resolved DRILL-4813.
-
Resolution: Fixed

> Need better error message for table function when table does not exist
> --
>
> Key: DRILL-4813
> URL: https://issues.apache.org/jira/browse/DRILL-4813
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Functions - Drill
>Affects Versions: 1.8.0
>Reporter: Krystal
>Priority: Major
> Fix For: 1.12.0
>
>
> Need a user-friendly error for the table function when the specified table does 
> not exist. Example shown below:
> select * from table(`drill-3149/blah.txt`(type=>'text', 
> lineDelimiter=>'\r\n'));
> Error: VALIDATION ERROR: null
> SQL Query null
> [Error Id: 6a021d48-bbec-4617-9b53-94bc116dce76]
>   (java.lang.NullPointerException) null
> 
> org.apache.drill.exec.planner.logical.DrillTranslatableTable.getRowType():49
> org.apache.calcite.sql.validate.ProcedureNamespace.validateImpl():68
> .
> .
> .



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (DRILL-5673) NPE from planner when using a table function with an invalid table

2019-02-05 Thread Arina Ielchiieva (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-5673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arina Ielchiieva resolved DRILL-5673.
-
Resolution: Fixed

> NPE from planner when using a table function with an invalid table
> --
>
> Key: DRILL-5673
> URL: https://issues.apache.org/jira/browse/DRILL-5673
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.10.0
>Reporter: Paul Rogers
>Priority: Major
> Fix For: 1.12.0
>
>
> Create a CSV file, with headers and call it "data_3.csv."
> Set up a storage plugin config with headers, delimiter of ",". Call it 
> "myws". Then read the file:
> {code}
> SELECT * FROM `dfs.myws`.`data_3.csv`
> {code}
> This query works.
> Try the same with a table function:
> {code}
> SELECT * FROM table(dfs.myws.`data_3.csv` (type => 'text', fieldDelimiter => 
> ',' , extractHeader => true))
> {code}
> This also works.
> Now, let's use an incorrect name:
> {code}
> SELECT * FROM `dfs.myws`.`data_33.csv`
> {code}
> (Note the "33" in the name.)
> This produces an error to the client:
> Now the bug. Do the same thing with the table function:
> {code}
> SELECT * FROM table(dfs.myws.`data_33.csv` (type => 'text', fieldDelimiter => 
> ',' , extractHeader => true))
> {code}
> This results in an NPE:
> {code}
> org.apache.drill.common.exceptions.UserRemoteException: VALIDATION ERROR: null
> SQL Query null
> [Error Id: cf151c28-9879-4ecc-893a-78d85a11c2f4 on 10.250.50.74:31010]
>   at 
> org.apache.drill.exec.rpc.user.QueryResultHandler.resultArrived(QueryResultHandler.java:123)
> ...
> Caused by: java.lang.NullPointerException: null
>   at 
> org.apache.drill.exec.planner.logical.DrillTranslatableTable.getRowType(DrillTranslatableTable.java:49)
>  ~[classes/:na]
>   at 
> org.apache.calcite.sql.validate.ProcedureNamespace.validateImpl(ProcedureNamespace.java:68)
>  ~[calcite-core-1.4.0-drill-r21.jar:1.4.0-drill-r21]
>   at 
> org.apache.calcite.sql.validate.AbstractNamespace.validate(AbstractNamespace.java:86)
>  ~[calcite-core-1.4.0-drill-r21.jar:1.4.0-drill-r21]
>   at 
> org.apache.calcite.sql.validate.SqlValidatorImpl.validateNamespace(SqlValidatorImpl.java:883)
>  ~[calcite-core-1.4.0-drill-r21.jar:1.4.0-drill-r21]
>   at 
> org.apache.calcite.sql.validate.SqlValidatorImpl.validateQuery(SqlValidatorImpl.java:869)
>  ~[calcite-core-1.4.0-drill-r21.jar:1.4.0-drill-r21]
>   at 
> org.apache.calcite.sql.validate.SqlValidatorImpl.validateFrom(SqlValidatorImpl.java:2806)
>  ~[calcite-core-1.4.0-drill-r21.jar:1.4.0-drill-r21]
>   at 
> org.apache.calcite.sql.validate.SqlValidatorImpl.validateFrom(SqlValidatorImpl.java:2791)
>  ~[calcite-core-1.4.0-drill-r21.jar:1.4.0-drill-r21]
>   at 
> org.apache.calcite.sql.validate.SqlValidatorImpl.validateSelect(SqlValidatorImpl.java:3014)
>  ~[calcite-core-1.4.0-drill-r21.jar:1.4.0-drill-r21]
>   at 
> org.apache.calcite.sql.validate.SelectNamespace.validateImpl(SelectNamespace.java:60)
>  ~[calcite-core-1.4.0-drill-r21.jar:1.4.0-drill-r21]
>   at 
> org.apache.calcite.sql.validate.AbstractNamespace.validate(AbstractNamespace.java:86)
>  ~[calcite-core-1.4.0-drill-r21.jar:1.4.0-drill-r21]
>   at 
> org.apache.calcite.sql.validate.SqlValidatorImpl.validateNamespace(SqlValidatorImpl.java:883)
>  ~[calcite-core-1.4.0-drill-r21.jar:1.4.0-drill-r21]
>   at 
> org.apache.calcite.sql.validate.SqlValidatorImpl.validateQuery(SqlValidatorImpl.java:869)
>  ~[calcite-core-1.4.0-drill-r21.jar:1.4.0-drill-r21]
>   at org.apache.calcite.sql.SqlSelect.validate(SqlSelect.java:210) 
> ~[calcite-core-1.4.0-drill-r21.jar:1.4.0-drill-r21]
> ...
> {code}
> This bug causes much user confusion as the user cannot immediately tell that 
> this is "user error" vs. something terribly wrong with Drill.
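The stack trace shows the NPE arising in `DrillTranslatableTable.getRowType()`, i.e. an unchecked table lookup. As a generic illustration of the guard pattern that avoids this class of failure (hypothetical names and types, not the actual Drill patch), the fix amounts to null-checking the resolved table and raising a descriptive error:

```java
import java.util.Map;

// Hypothetical sketch: guard a table lookup so a missing table yields a
// descriptive validation error instead of a NullPointerException deep
// inside the planner. Names here are illustrative, not actual Drill code.
public class TableLookupSketch {

  public static String rowTypeFor(Map<String, String> tables, String tableName) {
    String rowType = tables.get(tableName);
    if (rowType == null) {
      // Name the missing table so the user can tell this is "user error"
      throw new IllegalArgumentException(
          "VALIDATION ERROR: Table '" + tableName + "' not found");
    }
    return rowType;
  }

  public static void main(String[] args) {
    Map<String, String> tables = Map.of("data_3.csv", "(a, b, c)");
    System.out.println(rowTypeFor(tables, "data_3.csv")); // prints (a, b, c)
    try {
      rowTypeFor(tables, "data_33.csv");
    } catch (IllegalArgumentException e) {
      // prints VALIDATION ERROR: Table 'data_33.csv' not found
      System.out.println(e.getMessage());
    }
  }
}
```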





[GitHub] arina-ielchiieva commented on issue #1612: DRILL-6970: fix issue with logregex format plugin where drillbuf was overflowing

2019-02-05 Thread GitBox
arina-ielchiieva commented on issue #1612: DRILL-6970: fix issue with logregex 
format plugin where drillbuf was overflowing
URL: https://github.com/apache/drill/pull/1612#issuecomment-460618283
 
 
   @jcmcote is there any update?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[DISCUSS] Format plugins in contrib module

2019-02-05 Thread Arina Yelchiyeva
Hi all,

Previously we added new formats / plugins to the exec module. Eventually we 
reached the point where the exec package size was growing, and it is better to 
separate plugin and format contributions into a different module.
Now we have the contrib module where we add such contributions. Plugins are 
pluggable: they are added automatically by means of a drill-module.conf 
file which points to the packages to scan.
Format plugins use the same approach; the only problem is that they are 
not added into bootstrap-storage-plugins.json. So when adding a new format 
plugin, in order for it to automatically appear in the Drill Web UI, the 
developer has to update the bootstrap file, which is in the exec module.
My suggestion is that we implement some functionality to merge the format 
config with the bootstrap one. For example, each plugin would have a 
bootstrap-format.json file with information about the plugin to which the 
format should be added (same structure as in bootstrap-storage-plugins.json):
Example:

{
  "storage":{
dfs: {
  formats: {
"msgpack" : {
  type: "msgpack",
  extensions: [ "mp" ]
}
  }
}
  }
}

Then during Drill startup such bootstrap-format.json files will be merged with 
bootstrap-storage-plugins.json.


Current open PR for adding new format plugins:
Format plugin for LTSV files - https://github.com/apache/drill/pull/1627
SYSLOG (RFC-5424) Format Plugin - https://github.com/apache/drill/pull/1530 
Msgpack format reader - https://github.com/apache/drill/pull/1500

Any suggestions?

Kind regards,
Arina
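The merge proposed above can be sketched as a recursive deep-merge of the two parsed JSON trees: nested objects (such as "storage" → "dfs" → "formats") are merged key by key, so a plugin's bootstrap-format.json adds its format without clobbering existing entries. Class and method names below are illustrative only, not actual Drill code:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of the proposed merge: each plugin ships a
// bootstrap-format.json whose parsed tree is deep-merged into the main
// bootstrap-storage-plugins.json tree during Drill startup.
public class BootstrapMergeSketch {

  @SuppressWarnings("unchecked")
  public static Map<String, Object> mergeBootstrap(Map<String, Object> base,
                                                   Map<String, Object> addition) {
    Map<String, Object> merged = new LinkedHashMap<>(base);
    for (Map.Entry<String, Object> e : addition.entrySet()) {
      Object existing = merged.get(e.getKey());
      if (existing instanceof Map && e.getValue() instanceof Map) {
        // Recurse into nested objects so new formats are added without
        // clobbering existing ones.
        merged.put(e.getKey(),
            mergeBootstrap((Map<String, Object>) existing,
                           (Map<String, Object>) e.getValue()));
      } else {
        // Leaf values from the plugin's bootstrap file win.
        merged.put(e.getKey(), e.getValue());
      }
    }
    return merged;
  }

  public static void main(String[] args) {
    Map<String, Object> formats = new LinkedHashMap<>();
    formats.put("csv", "text");
    Map<String, Object> base = new LinkedHashMap<>();
    base.put("formats", formats);

    Map<String, Object> newFormats = new LinkedHashMap<>();
    newFormats.put("ltsv", "ltsv");
    Map<String, Object> addition = new LinkedHashMap<>();
    addition.put("formats", newFormats);

    System.out.println(mergeBootstrap(base, addition));
    // prints {formats={csv=text, ltsv=ltsv}}
  }
}
```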

[GitHub] vdiravka commented on a change in pull request #1627: DRILL-7014: Format plugin for LTSV files

2019-02-05 Thread GitBox
vdiravka commented on a change in pull request #1627: DRILL-7014: Format plugin 
for LTSV files
URL: https://github.com/apache/drill/pull/1627#discussion_r253824507
 
 

 ##
 File path: 
contrib/format-ltsv/src/main/java/org/apache/drill/exec/store/ltsv/LTSVFormatPlugin.java
 ##
 @@ -0,0 +1,82 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.store.ltsv;
+
+import org.apache.drill.common.exceptions.ExecutionSetupException;
+import org.apache.drill.common.expression.SchemaPath;
+import org.apache.drill.common.logical.StoragePluginConfig;
+import org.apache.drill.exec.ops.FragmentContext;
+import org.apache.drill.exec.proto.UserBitShared;
+import org.apache.drill.exec.server.DrillbitContext;
+import org.apache.drill.exec.store.RecordReader;
+import org.apache.drill.exec.store.RecordWriter;
+import org.apache.drill.exec.store.dfs.DrillFileSystem;
+import org.apache.drill.exec.store.dfs.easy.EasyFormatPlugin;
+import org.apache.drill.exec.store.dfs.easy.EasyWriter;
+import org.apache.drill.exec.store.dfs.easy.FileWork;
+import org.apache.hadoop.conf.Configuration;
+
+import java.io.IOException;
+import java.util.List;
+
+public class LTSVFormatPlugin extends EasyFormatPlugin<LTSVFormatPluginConfig> {
+
+private static final boolean IS_COMPRESSIBLE = false;
+private static final String DEFAULT_NAME = "ltsv";
+private LTSVFormatPluginConfig config;
+
+private static final org.slf4j.Logger logger = 
org.slf4j.LoggerFactory.getLogger(LTSVFormatPlugin.class);
+
+public LTSVFormatPlugin(String name, DrillbitContext context, 
Configuration fsConf, StoragePluginConfig storageConfig) {
+this(name, context, fsConf, storageConfig, new 
LTSVFormatPluginConfig());
+}
+
+public LTSVFormatPlugin(String name, DrillbitContext context, 
Configuration fsConf, StoragePluginConfig config, LTSVFormatPluginConfig 
formatPluginConfig) {
+super(name, context, fsConf, config, formatPluginConfig, true, false, 
false, IS_COMPRESSIBLE, formatPluginConfig.getExtensions(), DEFAULT_NAME);
+this.config = formatPluginConfig;
+}
+
+@Override
+public RecordReader getRecordReader(FragmentContext context, 
DrillFileSystem dfs, FileWork fileWork,
+List<SchemaPath> columns, String userName) throws ExecutionSetupException {
+return new LTSVRecordReader(context, fileWork.getPath(), dfs, columns, 
config);
+}
+
+
+@Override
+public int getReaderOperatorType() {
+// TODO Is it correct??
+return UserBitShared.CoreOperatorType.JSON_SUB_SCAN_VALUE;
 
 Review comment:
   It is used for Drill RPC. You can find more info about it here:
   https://github.com/paul-rogers/drill/wiki/RPC-Protocol
   
   In your case it is necessary for determining the proper type of reader for the 
`EasySubScan` operator once the latter is transferred to a Drill fragment.




[GitHub] vvysotskyi commented on a change in pull request #1608: DRILL-6960: Auto Limit Wrapping should not apply to non-select query

2019-02-05 Thread GitBox
vvysotskyi commented on a change in pull request #1608: DRILL-6960: Auto Limit 
Wrapping should not apply to non-select query
URL: https://github.com/apache/drill/pull/1608#discussion_r253814230
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/work/foreman/Foreman.java
 ##
 @@ -136,15 +136,30 @@
*/
   public Foreman(final WorkerBee bee, final DrillbitContext drillbitContext,
   final UserClientConnection connection, final QueryId queryId, final 
RunQuery queryRequest) {
+this(bee, drillbitContext, connection, queryId, queryRequest, null);
+  }
+
+  /**
+   * Constructor. Sets up the Foreman, but does not initiate any execution.
+   *
+   * @param bee work manager (runs fragments)
+   * @param drillbitContext drillbit context
+   * @param connection connection
+   * @param queryId the id for the query
+   * @param queryRequest the query to execute
+   * @param autoLimitRowCount the number of rows to limit in execution (used 
for WebRequests)
+   */
+  public Foreman(final WorkerBee bee, final DrillbitContext drillbitContext,
 
 Review comment:
   I'm not sure that changing the `Foreman` constructor signature or adding a new 
one is a good idea. Can `autoLimitRowCount` be set in the `RunQuery` object and 
then obtained where needed?




[GitHub] vvysotskyi commented on a change in pull request #1608: DRILL-6960: Auto Limit Wrapping should not apply to non-select query

2019-02-05 Thread GitBox
vvysotskyi commented on a change in pull request #1608: DRILL-6960: Auto Limit 
Wrapping should not apply to non-select query
URL: https://github.com/apache/drill/pull/1608#discussion_r253807881
 
 

 ##
 File path: exec/java-exec/src/main/resources/rest/static/js/querySubmission.js
 ##
 @@ -53,7 +53,18 @@ function doSubmitQueryWithUserName() {
 //Wrap & Submit Query (invoked directly if impersonation is not enabled)
 function wrapAndSubmitQuery() {
 //Wrap if required
-wrapQuery();
+var mustWrapWithLimit = $('input[name="forceLimit"]:checked').length > 0;
+//Clear field when submitting if not mustWrapWithLimit
+if (!mustWrapWithLimit) {
+  //Wipe out any numeric entry in the field before
+  $('#autoLimit').attr('value', '');
+} else {
+  let autoLimitValue=document.getElementById("autoLimit").value;
+  if (isNaN(autoLimitValue)) {
+    alert(autoLimitValue + " is not a number. Please fill in a valid number");
 
 Review comment:
   I disagree with leaving the alert as it is. Can an error message be shown, 
similar to the one displayed when a wrong storage plugin configuration is 
submitted?
   
   
![image](https://user-images.githubusercontent.com/20928429/52267594-177a9280-2942-11e9-8a55-c00353da8a81.png)
   




[GitHub] vvysotskyi commented on a change in pull request #1608: DRILL-6960: Auto Limit Wrapping should not apply to non-select query

2019-02-05 Thread GitBox
vvysotskyi commented on a change in pull request #1608: DRILL-6960: Auto Limit 
Wrapping should not apply to non-select query
URL: https://github.com/apache/drill/pull/1608#discussion_r253817340
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/work/user/UserWorker.java
 ##
 @@ -73,9 +73,13 @@ private static QueryId queryIdGenerator() {
   }
 
   public QueryId submitWork(UserClientConnection connection, RunQuery query) {
+return submitWork(connection, query, null);
+  }
+
+  public QueryId submitWork(UserClientConnection connection, RunQuery query, 
Integer autoLimitRowCount) {
 
 Review comment:
   If `RunQuery` contains `autoLimitRowCount`, there will be no need to add a 
new method.




[GitHub] vvysotskyi commented on a change in pull request #1608: DRILL-6960: Auto Limit Wrapping should not apply to non-select query

2019-02-05 Thread GitBox
vvysotskyi commented on a change in pull request #1608: DRILL-6960: Auto Limit 
Wrapping should not apply to non-select query
URL: https://github.com/apache/drill/pull/1608#discussion_r253814733
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/ops/QueryContext.java
 ##
 @@ -84,9 +85,14 @@
   private boolean closed = false;
   private DrillOperatorTable table;
 
-  public QueryContext(final UserSession session, final DrillbitContext 
drillbitContext, QueryId queryId) {
+  public QueryContext(UserSession session, DrillbitContext drillbitContext, 
QueryId queryId) {
+this(session, drillbitContext, queryId, null);
+  }
+
+  public QueryContext(UserSession session, DrillbitContext drillbitContext, 
QueryId queryId, Integer autoLimit) {
 
 Review comment:
   With the query option specified, there is no need to change the `QueryContext` 
constructor.




[GitHub] vvysotskyi commented on a change in pull request #1608: DRILL-6960: Auto Limit Wrapping should not apply to non-select query

2019-02-05 Thread GitBox
vvysotskyi commented on a change in pull request #1608: DRILL-6960: Auto Limit 
Wrapping should not apply to non-select query
URL: https://github.com/apache/drill/pull/1608#discussion_r253798677
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/ops/QueryContext.java
 ##
 @@ -273,6 +279,29 @@ public RemoteFunctionRegistry getRemoteFunctionRegistry() 
{
 return drillbitContext.getRemoteFunctionRegistry();
   }
 
+  /**
+   * Check if auto-limiting of resultset is enabled
+   * @return True if auto-limit is enabled
+   */
+  public boolean isAutoLimitEnabled() {
+return autoLimitRowCount != null;
+  }
+
+  /**
+   * Returns the maximum size of auto-limited resultset
+   * @return Maximum size of auto-limited resultSet
+   */
+  public Integer getAutoLimitRowCount() {
 
 Review comment:
   I don't think it is a good idea to expand the interface of 
`QueryContext` for such a minor thing. Please consider adding and using a query 
option instead. As an example, please see the fix for DRILL-6834.




[GitHub] vdiravka commented on a change in pull request #1530: DRILL-6582: SYSLOG (RFC-5424) Format Plugin

2019-02-05 Thread GitBox
vdiravka commented on a change in pull request #1530: DRILL-6582: SYSLOG 
(RFC-5424) Format Plugin
URL: https://github.com/apache/drill/pull/1530#discussion_r253767075
 
 

 ##
 File path: 
contrib/format-syslog/src/main/java/org/apache/drill/exec/store/syslog/SyslogRecordReader.java
 ##
 @@ -0,0 +1,403 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.drill.exec.store.syslog;
+
+import com.google.common.base.Charsets;
+import io.netty.buffer.DrillBuf;
+import org.apache.drill.common.exceptions.ExecutionSetupException;
+import org.apache.drill.common.exceptions.UserException;
+import org.apache.drill.common.expression.SchemaPath;
+import org.apache.drill.exec.exception.OutOfMemoryException;
+import org.apache.drill.exec.expr.holders.VarCharHolder;
+import org.apache.drill.exec.ops.FragmentContext;
+import org.apache.drill.exec.ops.OperatorContext;
+import org.apache.drill.exec.physical.impl.OutputMutator;
+import org.apache.drill.exec.store.AbstractRecordReader;
+import org.apache.drill.exec.store.dfs.DrillFileSystem;
+import org.apache.drill.exec.store.dfs.easy.FileWork;
+import org.apache.drill.exec.vector.complex.impl.VectorContainerWriter;
+import org.apache.drill.exec.vector.complex.writer.BaseWriter;
+import org.apache.hadoop.fs.Path;
+import org.realityforge.jsyslog.message.StructuredDataParameter;
+import org.realityforge.jsyslog.message.SyslogMessage;
+
+import java.io.BufferedReader;
+import java.io.InputStream;
+import java.io.InputStreamReader;
+import java.text.SimpleDateFormat;
+import java.util.List;
+import java.util.Map;
+import java.util.Iterator;
+
+public class SyslogRecordReader extends AbstractRecordReader {
+
+  private static final org.slf4j.Logger logger = 
org.slf4j.LoggerFactory.getLogger(SyslogRecordReader.class);
+  private static final int MAX_RECORDS_PER_BATCH = 4096;
+
+  private final DrillFileSystem fileSystem;
+  private final FileWork fileWork;
+  private final String userName;
+  private BufferedReader reader;
+  private DrillBuf buffer;
+  private VectorContainerWriter writer;
+  private SyslogFormatConfig config;
+  private int maxErrors;
+  private boolean flattenStructuredData;
+  private int errorCount;
+  private int lineCount;
+  private List<SchemaPath> projectedColumns;
+  private String line;
+
+  private SimpleDateFormat df;
+
+  public SyslogRecordReader(FragmentContext context,
+DrillFileSystem fileSystem,
+FileWork fileWork,
+List<SchemaPath> columns,
+String userName,
+SyslogFormatConfig config) throws 
OutOfMemoryException {
+
+this.fileSystem = fileSystem;
+this.fileWork = fileWork;
+this.userName = userName;
+this.config = config;
+this.maxErrors = config.getMaxErrors();
+this.df = getValidDateObject("-MM-dd'T'HH:mm:ss.SSS'Z'");
+this.errorCount = 0;
+this.buffer = context.getManagedBuffer(2048);
 
 Review comment:
   The constant for 4096 is created, but 2048 - the new value is introduced and 
used?




[GitHub] vdiravka commented on a change in pull request #1530: DRILL-6582: SYSLOG (RFC-5424) Format Plugin

2019-02-05 Thread GitBox
vdiravka commented on a change in pull request #1530: DRILL-6582: SYSLOG 
(RFC-5424) Format Plugin
URL: https://github.com/apache/drill/pull/1530#discussion_r253763788
 
 

 ##
 File path: contrib/format-syslog/README.md
 ##
 @@ -0,0 +1,41 @@
+# Syslog Format Plugin
 
 Review comment:
   Once you create a Jira, please post its number here.
   
   I think you meant `contrib`. It is not a strict division, but there are core 
components in `java-exec`. And almost all storage/format plugins are in the 
`contrib` module.




[GitHub] vdiravka commented on a change in pull request #1530: DRILL-6582: SYSLOG (RFC-5424) Format Plugin

2019-02-05 Thread GitBox
vdiravka commented on a change in pull request #1530: DRILL-6582: SYSLOG 
(RFC-5424) Format Plugin
URL: https://github.com/apache/drill/pull/1530#discussion_r253784583
 
 

 ##
 File path: 
contrib/format-syslog/src/main/java/org/apache/drill/exec/store/syslog/SyslogFormatConfig.java
 ##
 @@ -0,0 +1,86 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.drill.exec.store.syslog;
+
+import com.fasterxml.jackson.annotation.JsonTypeName;
+import com.google.common.base.Objects;
+import org.apache.drill.common.logical.FormatPluginConfig;
+
+import java.util.Arrays;
+import java.util.List;
+import java.util.ArrayList;
+
+@JsonTypeName("syslog")
+public class SyslogFormatConfig implements FormatPluginConfig {
+
+  public List<String> extensions;
+  public int maxErrors = 10;
+  public boolean flattenStructuredData = false;
+
+  public boolean getFlattenStructuredData() {
+return flattenStructuredData;
+  }
+
+  public int getMaxErrors() {
+return maxErrors;
+  }
+
+  public List<String> getExtensions() {
+return extensions;
+  }
+
+  public void setExtensions(List<String> ext) {
+this.extensions = ext;
+  }
+
+  public void setExtension(String ext) {
+if (this.extensions == null) {
+  this.extensions = new ArrayList<>();
+}
+this.extensions.add(ext);
+  }
+
+  public void setMaxErrors(int errors) {
+this.maxErrors = errors;
+  }
+
+  public void setFlattenStructuredData(boolean flattenErrors) {
+this.flattenStructuredData = flattenErrors;
+  }
+
+  @Override
+  public boolean equals(Object obj) {
+if (this == obj) {
+  return true;
+}
+if (obj == null || getClass() != obj.getClass()) {
+  return false;
+}
+SyslogFormatConfig other = (SyslogFormatConfig) obj;
+return Objects.equal(extensions, other.extensions) &&
+Objects.equal(maxErrors, other.maxErrors) &&
+Objects.equal(flattenStructuredData, other.flattenStructuredData);
+  }
+
+  @Override
+  public int hashCode() {
+return Arrays.hashCode(new Object[]{maxErrors, flattenStructuredData, 
extensions});
+  }
+
+}
 
 Review comment:
   Looks like the new line is still absent (in some other files too).
   ```suggestion
   }
   
   ```




[GitHub] shimamoto commented on issue #1627: DRILL-7014: Format plugin for LTSV files

2019-02-05 Thread GitBox
shimamoto commented on issue #1627: DRILL-7014: Format plugin for LTSV files
URL: https://github.com/apache/drill/pull/1627#issuecomment-460559287
 
 
   > Also, IMHO, you should add the config to the default Drill config so that 
people can query `ltsv` files by default.
   
   Is it correct to add `bootstrap-storage-plugins.json` into 
format-ltsv/src/main/resources? I tried adding 
`bootstrap-storage-plugins.json`, but it didn't work.
   Like this.
   ```
   {
     "storage": {
       dfs: {
         type: "ltsv",
         connection: "file:///",
         formats: {
           "ltsv": {
             "type": "ltsv",
             "extensions": [ "ltsv" ]
           }
         },
         enabled: true
       }
     }
   }
   ```
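For comparison, the nesting that Drill bootstrap files normally use keeps the `dfs` plugin's own `type` as `"file"` (the format's type goes under `formats`) and puts `enabled` at the plugin level. A sketch of the expected shape (untested, for illustration only):

```json
{
  "storage": {
    "dfs": {
      "type": "file",
      "connection": "file:///",
      "formats": {
        "ltsv": {
          "type": "ltsv",
          "extensions": ["ltsv"]
        }
      },
      "enabled": true
    }
  }
}
```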




[GitHub] shimamoto commented on a change in pull request #1627: DRILL-7014: Format plugin for LTSV files

2019-02-05 Thread GitBox
shimamoto commented on a change in pull request #1627: DRILL-7014: Format 
plugin for LTSV files
URL: https://github.com/apache/drill/pull/1627#discussion_r253772190
 
 

 ##
 File path: 
contrib/format-ltsv/src/main/java/org/apache/drill/exec/store/ltsv/LTSVFormatPlugin.java
 ##
 @@ -0,0 +1,82 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.store.ltsv;
+
+import org.apache.drill.common.exceptions.ExecutionSetupException;
+import org.apache.drill.common.expression.SchemaPath;
+import org.apache.drill.common.logical.StoragePluginConfig;
+import org.apache.drill.exec.ops.FragmentContext;
+import org.apache.drill.exec.proto.UserBitShared;
+import org.apache.drill.exec.server.DrillbitContext;
+import org.apache.drill.exec.store.RecordReader;
+import org.apache.drill.exec.store.RecordWriter;
+import org.apache.drill.exec.store.dfs.DrillFileSystem;
+import org.apache.drill.exec.store.dfs.easy.EasyFormatPlugin;
+import org.apache.drill.exec.store.dfs.easy.EasyWriter;
+import org.apache.drill.exec.store.dfs.easy.FileWork;
+import org.apache.hadoop.conf.Configuration;
+
+import java.io.IOException;
+import java.util.List;
+
+public class LTSVFormatPlugin extends EasyFormatPlugin<LTSVFormatPluginConfig> {
+
+private static final boolean IS_COMPRESSIBLE = false;
+private static final String DEFAULT_NAME = "ltsv";
+private LTSVFormatPluginConfig config;
+
+private static final org.slf4j.Logger logger = 
org.slf4j.LoggerFactory.getLogger(LTSVFormatPlugin.class);
+
+public LTSVFormatPlugin(String name, DrillbitContext context, 
Configuration fsConf, StoragePluginConfig storageConfig) {
+this(name, context, fsConf, storageConfig, new 
LTSVFormatPluginConfig());
+}
+
+public LTSVFormatPlugin(String name, DrillbitContext context, 
Configuration fsConf, StoragePluginConfig config, LTSVFormatPluginConfig 
formatPluginConfig) {
+super(name, context, fsConf, config, formatPluginConfig, true, false, 
false, IS_COMPRESSIBLE, formatPluginConfig.getExtensions(), DEFAULT_NAME);
+this.config = formatPluginConfig;
+}
+
+@Override
+public RecordReader getRecordReader(FragmentContext context, 
DrillFileSystem dfs, FileWork fileWork,
+List<SchemaPath> columns, String userName) throws ExecutionSetupException {
+return new LTSVRecordReader(context, fileWork.getPath(), dfs, columns, 
config);
+}
+
+
+@Override
+public int getReaderOperatorType() {
+// TODO Is it correct??
+return UserBitShared.CoreOperatorType.JSON_SUB_SCAN_VALUE;
 
 Review comment:
   Could you explain more about why I have to add the protobuf?




[GitHub] shimamoto commented on a change in pull request #1627: DRILL-7014: Format plugin for LTSV files

2019-02-05 Thread GitBox
shimamoto commented on a change in pull request #1627: DRILL-7014: Format 
plugin for LTSV files
URL: https://github.com/apache/drill/pull/1627#discussion_r253772028
 
 

 ##
 File path: 
contrib/format-ltsv/src/main/java/org/apache/drill/exec/store/ltsv/LTSVFormatPlugin.java
 ##
 @@ -0,0 +1,82 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.store.ltsv;
+
+import org.apache.drill.common.exceptions.ExecutionSetupException;
+import org.apache.drill.common.expression.SchemaPath;
+import org.apache.drill.common.logical.StoragePluginConfig;
+import org.apache.drill.exec.ops.FragmentContext;
+import org.apache.drill.exec.proto.UserBitShared;
+import org.apache.drill.exec.server.DrillbitContext;
+import org.apache.drill.exec.store.RecordReader;
+import org.apache.drill.exec.store.RecordWriter;
+import org.apache.drill.exec.store.dfs.DrillFileSystem;
+import org.apache.drill.exec.store.dfs.easy.EasyFormatPlugin;
+import org.apache.drill.exec.store.dfs.easy.EasyWriter;
+import org.apache.drill.exec.store.dfs.easy.FileWork;
+import org.apache.hadoop.conf.Configuration;
+
+import java.io.IOException;
+import java.util.List;
+
+public class LTSVFormatPlugin extends EasyFormatPlugin<LTSVFormatPluginConfig> {
+
+private static final boolean IS_COMPRESSIBLE = false;
+private static final String DEFAULT_NAME = "ltsv";
+private LTSVFormatPluginConfig config;
+
+private static final org.slf4j.Logger logger = 
org.slf4j.LoggerFactory.getLogger(LTSVFormatPlugin.class);
+
+public LTSVFormatPlugin(String name, DrillbitContext context, 
Configuration fsConf, StoragePluginConfig storageConfig) {
+this(name, context, fsConf, storageConfig, new 
LTSVFormatPluginConfig());
+}
+
+public LTSVFormatPlugin(String name, DrillbitContext context, 
Configuration fsConf, StoragePluginConfig config, LTSVFormatPluginConfig 
formatPluginConfig) {
+super(name, context, fsConf, config, formatPluginConfig, true, false, 
false, IS_COMPRESSIBLE, formatPluginConfig.getExtensions(), DEFAULT_NAME);
+this.config = formatPluginConfig;
+}
+
+@Override
+public RecordReader getRecordReader(FragmentContext context, 
DrillFileSystem dfs, FileWork fileWork,
+List<SchemaPath> columns, String userName) throws ExecutionSetupException {
+return new LTSVRecordReader(context, fileWork.getPath(), dfs, columns, 
config);
+}
+
+
+@Override
+public int getReaderOperatorType() {
+// TODO Is it correct??
+return UserBitShared.CoreOperatorType.JSON_SUB_SCAN_VALUE;
+}
+
+@Override
+public int getWriterOperatorType() {
 
 Review comment:
   done




[GitHub] shimamoto commented on a change in pull request #1627: DRILL-7014: Format plugin for LTSV files

2019-02-05 Thread GitBox
shimamoto commented on a change in pull request #1627: DRILL-7014: Format 
plugin for LTSV files
URL: https://github.com/apache/drill/pull/1627#discussion_r253771968
 
 

 ##
 File path: 
contrib/format-ltsv/src/main/java/org/apache/drill/exec/store/ltsv/LTSVRecordReader.java
 ##
 @@ -0,0 +1,127 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.store.ltsv;
+
+import io.netty.buffer.DrillBuf;
+import org.apache.drill.common.exceptions.ExecutionSetupException;
+import org.apache.drill.common.exceptions.UserException;
+import org.apache.drill.common.expression.SchemaPath;
+import org.apache.drill.exec.exception.OutOfMemoryException;
+import org.apache.drill.exec.ops.FragmentContext;
+import org.apache.drill.exec.ops.OperatorContext;
+import org.apache.drill.exec.physical.impl.OutputMutator;
+import org.apache.drill.exec.store.AbstractRecordReader;
+import org.apache.drill.exec.store.dfs.DrillFileSystem;
+import org.apache.drill.exec.vector.complex.impl.VectorContainerWriter;
+import org.apache.drill.exec.vector.complex.writer.BaseWriter;
+import org.apache.hadoop.fs.FSDataInputStream;
+import org.apache.hadoop.fs.Path;
+
+import java.io.BufferedReader;
+import java.io.IOException;
+import java.io.InputStreamReader;
+import java.text.ParseException;
+import java.util.List;
+
+public class LTSVRecordReader extends AbstractRecordReader {
+
+private static final org.slf4j.Logger logger = 
org.slf4j.LoggerFactory.getLogger(LTSVRecordReader.class);
+private static final int MAX_RECORDS_PER_BATCH = 8096;
+
+private String inputPath;
+private BufferedReader reader;
+private DrillBuf buffer;
+private VectorContainerWriter writer;
+private LTSVFormatPluginConfig config;
+private int lineCount;
+
+public LTSVRecordReader(FragmentContext fragmentContext, String inputPath, 
DrillFileSystem fileSystem,
+List<SchemaPath> columns, LTSVFormatPluginConfig config) throws OutOfMemoryException {
+try {
+FSDataInputStream fsStream = fileSystem.open(new Path(inputPath));
+this.inputPath = inputPath;
+this.lineCount = 0;
+this.reader = new BufferedReader(new 
InputStreamReader(fsStream.getWrappedStream(), "UTF-8"));
+this.config = config;
+this.buffer = fragmentContext.getManagedBuffer();
+setColumns(columns);
+
+} catch(IOException e){
+logger.debug("LTSV Plugin: " + e.getMessage());
+}
+}
+
+public void setup(final OperatorContext context, final OutputMutator 
output) throws ExecutionSetupException {
+this.writer = new VectorContainerWriter(output);
+}
+
+public int next() {
+this.writer.allocate();
+this.writer.reset();
+
+int recordCount = 0;
+
+try {
+BaseWriter.MapWriter map = this.writer.rootAsMap();
+String line = null;
+
+while(recordCount < MAX_RECORDS_PER_BATCH &&(line = 
this.reader.readLine()) != null){
+lineCount++;
+
+// Skip empty lines
+if(line.trim().length() == 0){
+continue;
+}
+
+this.writer.setPosition(recordCount);
+map.start();
+
+String[] fields = line.split("\t");
+for(String field: fields){
+int index = field.indexOf(":");
+if(index <= 0) {
+throw new ParseException(
+"Invalid LTSV format: " + inputPath + "\n" + 
lineCount + ":" + line, 0
+);
+}
+
+String fieldName  = field.substring(0, index);
+String fieldValue = field.substring(index + 1);
+
+byte[] bytes = fieldValue.getBytes("UTF-8");
+this.buffer.setBytes(0, bytes, 0, bytes.length);
+map.varChar(fieldName).writeVarChar(0, bytes.length, 
buffer);
 
 Review comment:
   done. Could you please check if my change makes sense?
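The loop quoted above splits each LTSV line on tabs and each field at its first colon before writing the value as a VarChar. A minimal standalone sketch of that parsing logic (the class and method names here are ours, not Drill's):

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Standalone sketch of LTSVRecordReader's per-line parsing:
// fields are tab-separated, each field is "label:value".
public class LtsvParseSketch {

  public static Map<String, String> parseLine(String line) {
    Map<String, String> record = new LinkedHashMap<>();
    for (String field : line.split("\t")) {
      int index = field.indexOf(':');
      if (index <= 0) {  // no colon, or an empty label
        throw new IllegalArgumentException("Invalid LTSV field: " + field);
      }
      record.put(field.substring(0, index), field.substring(index + 1));
    }
    return record;
  }

  public static void main(String[] args) {
    Map<String, String> r = parseLine("host:127.0.0.1\tident:-\tuser:frank");
    System.out.println(r);  // {host=127.0.0.1, ident=-, user=frank}
  }
}
```

Note that `indexOf(':')` rather than `split(":")` keeps colons inside the value intact, which matters for values such as IPv6 addresses or timestamps.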


[GitHub] shimamoto commented on a change in pull request #1627: DRILL-7014: Format plugin for LTSV files

2019-02-05 Thread GitBox
shimamoto commented on a change in pull request #1627: DRILL-7014: Format 
plugin for LTSV files
URL: https://github.com/apache/drill/pull/1627#discussion_r253770309
 
 

 ##
 File path: 
contrib/format-ltsv/src/main/java/org/apache/drill/exec/store/ltsv/LTSVRecordReader.java
 ##
 @@ -0,0 +1,127 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.store.ltsv;
+
+import io.netty.buffer.DrillBuf;
+import org.apache.drill.common.exceptions.ExecutionSetupException;
+import org.apache.drill.common.exceptions.UserException;
+import org.apache.drill.common.expression.SchemaPath;
+import org.apache.drill.exec.exception.OutOfMemoryException;
+import org.apache.drill.exec.ops.FragmentContext;
+import org.apache.drill.exec.ops.OperatorContext;
+import org.apache.drill.exec.physical.impl.OutputMutator;
+import org.apache.drill.exec.store.AbstractRecordReader;
+import org.apache.drill.exec.store.dfs.DrillFileSystem;
+import org.apache.drill.exec.vector.complex.impl.VectorContainerWriter;
+import org.apache.drill.exec.vector.complex.writer.BaseWriter;
+import org.apache.hadoop.fs.FSDataInputStream;
+import org.apache.hadoop.fs.Path;
+
+import java.io.BufferedReader;
+import java.io.IOException;
+import java.io.InputStreamReader;
+import java.text.ParseException;
+import java.util.List;
+
+public class LTSVRecordReader extends AbstractRecordReader {
+
+private static final org.slf4j.Logger logger = 
org.slf4j.LoggerFactory.getLogger(LTSVRecordReader.class);
+private static final int MAX_RECORDS_PER_BATCH = 8096;
+
+private String inputPath;
+private BufferedReader reader;
+private DrillBuf buffer;
+private VectorContainerWriter writer;
+private LTSVFormatPluginConfig config;
+private int lineCount;
+
+public LTSVRecordReader(FragmentContext fragmentContext, String inputPath, 
DrillFileSystem fileSystem,
+List<SchemaPath> columns, LTSVFormatPluginConfig config) throws OutOfMemoryException {
+try {
+FSDataInputStream fsStream = fileSystem.open(new Path(inputPath));
+this.inputPath = inputPath;
+this.lineCount = 0;
+this.reader = new BufferedReader(new 
InputStreamReader(fsStream.getWrappedStream(), "UTF-8"));
+this.config = config;
+this.buffer = fragmentContext.getManagedBuffer();
+setColumns(columns);
+
+} catch(IOException e){
+logger.debug("LTSV Plugin: " + e.getMessage());
+}
+}
+
+public void setup(final OperatorContext context, final OutputMutator 
output) throws ExecutionSetupException {
+this.writer = new VectorContainerWriter(output);
+}
+
+public int next() {
+this.writer.allocate();
+this.writer.reset();
+
+int recordCount = 0;
+
+try {
+BaseWriter.MapWriter map = this.writer.rootAsMap();
+String line = null;
+
+while(recordCount < MAX_RECORDS_PER_BATCH &&(line = 
this.reader.readLine()) != null){
+lineCount++;
+
+// Skip empty lines
+if(line.trim().length() == 0){
+continue;
+}
+
+this.writer.setPosition(recordCount);
+map.start();
+
+String[] fields = line.split("\t");
+for(String field: fields){
+int index = field.indexOf(":");
+if(index <= 0) {
+throw new ParseException(
+"Invalid LTSV format: " + inputPath + "\n" + 
lineCount + ":" + line, 0
+);
+}
+
+String fieldName  = field.substring(0, index);
+String fieldValue = field.substring(index + 1);
+
+byte[] bytes = fieldValue.getBytes("UTF-8");
+this.buffer.setBytes(0, bytes, 0, bytes.length);
+map.varChar(fieldName).writeVarChar(0, bytes.length, 
buffer);
+}
+
+map.end();
+recordCount++;
+}
+
+this.writer.setValueCount(recordCount);
+

[GitHub] shimamoto commented on a change in pull request #1627: DRILL-7014: Format plugin for LTSV files

2019-02-05 Thread GitBox
shimamoto commented on a change in pull request #1627: DRILL-7014: Format 
plugin for LTSV files
URL: https://github.com/apache/drill/pull/1627#discussion_r253770219
 
 

 ##
 File path: 
contrib/format-ltsv/src/main/java/org/apache/drill/exec/store/ltsv/LTSVRecordReader.java
 ##
 @@ -0,0 +1,127 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.store.ltsv;
+
+import io.netty.buffer.DrillBuf;
+import org.apache.drill.common.exceptions.ExecutionSetupException;
+import org.apache.drill.common.exceptions.UserException;
+import org.apache.drill.common.expression.SchemaPath;
+import org.apache.drill.exec.exception.OutOfMemoryException;
+import org.apache.drill.exec.ops.FragmentContext;
+import org.apache.drill.exec.ops.OperatorContext;
+import org.apache.drill.exec.physical.impl.OutputMutator;
+import org.apache.drill.exec.store.AbstractRecordReader;
+import org.apache.drill.exec.store.dfs.DrillFileSystem;
+import org.apache.drill.exec.vector.complex.impl.VectorContainerWriter;
+import org.apache.drill.exec.vector.complex.writer.BaseWriter;
+import org.apache.hadoop.fs.FSDataInputStream;
+import org.apache.hadoop.fs.Path;
+
+import java.io.BufferedReader;
+import java.io.IOException;
+import java.io.InputStreamReader;
+import java.text.ParseException;
+import java.util.List;
+
+public class LTSVRecordReader extends AbstractRecordReader {
+
+private static final org.slf4j.Logger logger = 
org.slf4j.LoggerFactory.getLogger(LTSVRecordReader.class);
+private static final int MAX_RECORDS_PER_BATCH = 8096;
+
+private String inputPath;
+private BufferedReader reader;
+private DrillBuf buffer;
+private VectorContainerWriter writer;
+private LTSVFormatPluginConfig config;
+private int lineCount;
+
+public LTSVRecordReader(FragmentContext fragmentContext, String inputPath, 
DrillFileSystem fileSystem,
+List<SchemaPath> columns, LTSVFormatPluginConfig config) throws OutOfMemoryException {
+try {
+FSDataInputStream fsStream = fileSystem.open(new Path(inputPath));
+this.inputPath = inputPath;
+this.lineCount = 0;
+this.reader = new BufferedReader(new 
InputStreamReader(fsStream.getWrappedStream(), "UTF-8"));
+this.config = config;
+this.buffer = fragmentContext.getManagedBuffer();
+setColumns(columns);
+
+} catch(IOException e){
+logger.debug("LTSV Plugin: " + e.getMessage());
+}
+}
+
+public void setup(final OperatorContext context, final OutputMutator 
output) throws ExecutionSetupException {
+this.writer = new VectorContainerWriter(output);
+}
+
+public int next() {
+this.writer.allocate();
+this.writer.reset();
+
+int recordCount = 0;
+
+try {
+BaseWriter.MapWriter map = this.writer.rootAsMap();
+String line = null;
+
+while(recordCount < MAX_RECORDS_PER_BATCH &&(line = 
this.reader.readLine()) != null){
+lineCount++;
+
+// Skip empty lines
+if(line.trim().length() == 0){
+continue;
+}
+
+this.writer.setPosition(recordCount);
+map.start();
+
+String[] fields = line.split("\t");
+for(String field: fields){
+int index = field.indexOf(":");
+if(index <= 0) {
+throw new ParseException(
+"Invalid LTSV format: " + inputPath + "\n" + 
lineCount + ":" + line, 0
 
 Review comment:
This code is enclosed in a try block. When an exception occurs, it is caught and wrapped in a Drill one.
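The catch-and-wrap pattern described here can be sketched generically as follows; the class name and the `RuntimeException` wrapper are ours (Drill's reader wraps into its own user-facing exception type):

```java
import java.text.ParseException;

// Generic sketch of catch-and-wrap: a checked ParseException raised
// inside the read loop is caught at the boundary and rethrown as an
// unchecked exception that carries the original as its cause.
public class WrapSketch {

  static void parseField(String field) throws ParseException {
    if (field.indexOf(':') <= 0) {
      throw new ParseException("Invalid LTSV field: " + field, 0);
    }
  }

  public static String readSafely(String field) {
    try {
      parseField(field);
      return field;
    } catch (ParseException e) {
      // Wrap the checked exception so callers see one failure type,
      // while the cause preserves the parse position and message.
      throw new RuntimeException("Failure while reading LTSV input", e);
    }
  }

  public static void main(String[] args) {
    System.out.println(readSafely("user:frank"));  // user:frank
  }
}
```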




[GitHub] shimamoto commented on a change in pull request #1627: DRILL-7014: Format plugin for LTSV files

2019-02-05 Thread GitBox
shimamoto commented on a change in pull request #1627: DRILL-7014: Format 
plugin for LTSV files
URL: https://github.com/apache/drill/pull/1627#discussion_r253769950
 
 

 ##
 File path: 
contrib/format-ltsv/src/main/java/org/apache/drill/exec/store/ltsv/LTSVFormatPluginConfig.java
 ##
 @@ -0,0 +1,59 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.store.ltsv;
+
+import org.apache.drill.common.logical.FormatPluginConfig;
+import org.apache.drill.shaded.guava.com.google.common.collect.ImmutableList;
+
+import com.fasterxml.jackson.annotation.JsonTypeName;
+import com.fasterxml.jackson.annotation.JsonInclude;
+
+import java.util.List;
+
+@JsonTypeName("ltsv")
+public class LTSVFormatPluginConfig implements FormatPluginConfig {
+public List<String> extensions;
+
+private static final List<String> DEFAULT_EXTS = ImmutableList.of("ltsv");
+
+@JsonInclude(JsonInclude.Include.NON_DEFAULT)
+public List<String> getExtensions() {
+if (extensions == null) {
+// when loading an old JSONFormatConfig that doesn't contain an 
"extensions" attribute
+return DEFAULT_EXTS;
+}
+return extensions;
+}
+
+@Override
+public int hashCode() {
+return 99;
+}
+
 
 Review comment:
   done




[GitHub] shimamoto commented on a change in pull request #1627: DRILL-7014: Format plugin for LTSV files

2019-02-05 Thread GitBox
shimamoto commented on a change in pull request #1627: DRILL-7014: Format 
plugin for LTSV files
URL: https://github.com/apache/drill/pull/1627#discussion_r25376
 
 

 ##
 File path: 
contrib/format-ltsv/src/main/java/org/apache/drill/exec/store/ltsv/LTSVFormatPluginConfig.java
 ##
 @@ -0,0 +1,59 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.store.ltsv;
+
+import org.apache.drill.common.logical.FormatPluginConfig;
+import org.apache.drill.shaded.guava.com.google.common.collect.ImmutableList;
+
+import com.fasterxml.jackson.annotation.JsonTypeName;
+import com.fasterxml.jackson.annotation.JsonInclude;
+
+import java.util.List;
+
+@JsonTypeName("ltsv")
+public class LTSVFormatPluginConfig implements FormatPluginConfig {
+public List<String> extensions;
+
+private static final List<String> DEFAULT_EXTS = ImmutableList.of("ltsv");
+
+@JsonInclude(JsonInclude.Include.NON_DEFAULT)
+public List<String> getExtensions() {
+if (extensions == null) {
+// when loading an old JSONFormatConfig that doesn't contain an 
"extensions" attribute
+return DEFAULT_EXTS;
+}
+return extensions;
+}
+
+@Override
+public int hashCode() {
+return 99;
+}
+
+@Override
+public boolean equals(Object obj) {
+if (this == obj) {
+return true;
+} else if (obj == null) {
+return false;
+} else if (getClass() == obj.getClass()) {
+return true;
 
 Review comment:
   done
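The `equals` quoted above returns true whenever the two objects have the same class, without comparing the `extensions` field, and `hashCode` returns a constant. A hedged sketch of state-based implementations for a config-style class (the `ConfigSketch` class is our own minimal stand-in, not Drill's actual `LTSVFormatPluginConfig`):

```java
import java.util.List;
import java.util.Objects;

// Minimal config-like class whose equals/hashCode include the
// extensions list, addressing the issue raised in the review.
public class ConfigSketch {
  private final List<String> extensions;

  public ConfigSketch(List<String> extensions) {
    this.extensions = extensions;
  }

  @Override
  public boolean equals(Object obj) {
    if (this == obj) {
      return true;
    }
    if (obj == null || getClass() != obj.getClass()) {
      return false;
    }
    ConfigSketch other = (ConfigSketch) obj;
    return Objects.equals(extensions, other.extensions);  // compare state, not just class
  }

  @Override
  public int hashCode() {
    return Objects.hash(extensions);  // stays consistent with equals
  }

  public static void main(String[] args) {
    ConfigSketch a = new ConfigSketch(List.of("ltsv"));
    ConfigSketch b = new ConfigSketch(List.of("ltsv"));
    System.out.println(a.equals(b) && a.hashCode() == b.hashCode());  // true
  }
}
```

Keeping `hashCode` derived from the same fields as `equals` matters here because Drill keys plugin configs in maps; a constant hash is legal but degrades lookups, and an `equals` that ignores state makes distinct configs collide.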

