[GitHub] [drill] cgivre commented on a change in pull request #1637: DRILL-7032: Ignore corrupt rows in a PCAP file

2019-03-19 Thread GitBox
cgivre commented on a change in pull request #1637: DRILL-7032: Ignore corrupt 
rows in a PCAP file
URL: https://github.com/apache/drill/pull/1637#discussion_r267160182
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/store/pcap/decoder/Packet.java
 ##
 @@ -324,7 +329,11 @@ public int getDst_port() {
 byte[] data = null;
 if (packetLength >= payloadDataStart) {
   data = new byte[packetLength - payloadDataStart];
-  System.arraycopy(raw, ipOffset + payloadDataStart, data, 0, data.length);
+  try {
+System.arraycopy(raw, ipOffset + payloadDataStart, data, 0, 
data.length);
+  } catch (Exception e) {
 
 Review comment:
   Hi @arina-ielchiieva 
   I didn't have a lot of test data to work with, but I do think it would be 
worth logging the exception.  My general approach was to be as broad as 
possible so that any parsing error would be caught and the packet would be 
flagged as corrupt. 
   
   I'll add a logging statement.
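
   A minimal sketch of that approach, for illustration only. The names `PayloadCopySketch`, `copyPayload`, and `Result` are hypothetical, and `java.util.logging` is used only to keep the example self-contained (Drill itself logs through slf4j); the actual `Packet.java` may flag corruption differently.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical helper for illustration; not the actual Packet.java code.
final class PayloadCopySketch {
  // Assumption: java.util.logging keeps the sketch dependency-free; Drill uses slf4j.
  private static final Logger logger = Logger.getLogger(PayloadCopySketch.class.getName());

  /** Outcome of a copy attempt: the payload bytes (or null) plus a corruption flag. */
  static final class Result {
    final byte[] data;
    final boolean corrupt;

    Result(byte[] data, boolean corrupt) {
      this.data = data;
      this.corrupt = corrupt;
    }
  }

  static Result copyPayload(byte[] raw, int ipOffset, int payloadDataStart, int packetLength) {
    if (packetLength < payloadDataStart) {
      // No payload present; data stays null, as in the original code path.
      return new Result(null, false);
    }
    byte[] data = new byte[packetLength - payloadDataStart];
    try {
      System.arraycopy(raw, ipOffset + payloadDataStart, data, 0, data.length);
      return new Result(data, false);
    } catch (Exception e) {
      // Broad catch on purpose: any failure here marks the packet as corrupt.
      logger.log(Level.WARNING, "Could not copy packet payload; treating packet as corrupt", e);
      return new Result(null, true);
    }
  }
}
```

   The broad `catch (Exception e)` mirrors the intent described above; catching `IndexOutOfBoundsException` alone would cover the truncated-payload case specifically.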


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [drill] amansinha100 commented on a change in pull request #1699: DRILL-7108: Improve selectivity estimates for (NOT)LIKE, NOT_EQUALS, IS NOT NULL predicates

2019-03-19 Thread GitBox
amansinha100 commented on a change in pull request #1699: DRILL-7108: Improve 
selectivity estimates for (NOT)LIKE, NOT_EQUALS, IS NOT NULL predicates
URL: https://github.com/apache/drill/pull/1699#discussion_r267158524
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/planner/cost/DrillRelMdSelectivity.java
 ##
 @@ -137,34 +138,33 @@ private double getScanSelectivityInternal(DrillTable 
table, RexNode predicate, L
 for (RexNode pred : RelOptUtil.conjunctions(predicate)) {
   double orSel = 0;
   for (RexNode orPred : RelOptUtil.disjunctions(pred)) {
-//CALCITE guess
-Double guess = RelMdUtil.guessSelectivity(pred);
+
 if (orPred.isA(SqlKind.EQUALS)) {
+  orSel += computeEqualsSelectivity(table, orPred, fieldNames);
+} else if (orPred.isA(SqlKind.NOT_EQUALS)) {
+  orSel += 1.0 - computeEqualsSelectivity(table, orPred, fieldNames);
+} else if (orPred.isA(SqlKind.LIKE)) {
+  // LIKE selectivity is 5% more than a similar equality predicate, 
capped at CALCITE guess
+  orSel +=  Math.min(computeEqualsSelectivity(table, orPred, 
fieldNames) + LIKE_PREDICATE_SELECTIVITY,
+  RelMdUtil.guessSelectivity(orPred));
+} else if (orPred.isA(SqlKind.NOT)) {
   if (orPred instanceof RexCall) {
-int colIdx = -1;
-RexInputRef op = findRexInputRef(orPred);
-if (op != null) {
-  colIdx = op.hashCode();
-}
-if (colIdx != -1 && colIdx < fieldNames.size()) {
-  String col = fieldNames.get(colIdx);
-  if (table.getStatsTable() != null
-  && table.getStatsTable().getNdv(col) != null) {
-orSel += 1.00 / table.getStatsTable().getNdv(col);
-  } else {
-orSel += guess;
-  }
+// LIKE selectivity is 5% more than a similar equality predicate, 
capped at CALCITE guess
+RexNode childOp = ((RexCall) orPred).getOperands().get(0);
+if (childOp.isA(SqlKind.LIKE)) {
+  orSel += 1.0 - Math.min(computeEqualsSelectivity(table, childOp, 
fieldNames) + LIKE_PREDICATE_SELECTIVITY,
+  RelMdUtil.guessSelectivity(childOp));
 } else {
-  orSel += guess;
-  if (logger.isDebugEnabled()) {
-logger.warn(String.format("No input reference $[%s] found for 
predicate [%s]",
-Integer.toString(colIdx), orPred.toString()));
-  }
+  orSel += 1.0 - RelMdUtil.guessSelectivity(orPred);
 }
   }
+} else if (orPred.isA(SqlKind.IS_NULL)) {
+  orSel += computeNullSelectivity(table, orPred, fieldNames);
+} else if (orPred.isA(SqlKind.IS_NOT_NULL)) {
+  orSel += 1.0 - computeNullSelectivity(table, orPred, fieldNames);
 
 Review comment:
   Why not have a computeIsNotNullSelectivity() function? Sorry for the 
nit-pick, but since we already have the exact NNRowCount to begin with, the 
clean formula without the subtraction would be preferred. 
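
   For illustration, such a function could look roughly like the sketch below. It takes the statistics as plain values rather than a `DrillTable`, so the class name, the parameter list, and the fallback to a caller-supplied guess are assumptions, not the code in this PR.

```java
// Hypothetical sketch; the real method would read these values from the stats table.
final class IsNotNullSelectivitySketch {

  /**
   * Direct IS NOT NULL estimate: non-null rows divided by total rows.
   * Falls back to the supplied guess when either statistic is unavailable.
   */
  static double computeIsNotNullSelectivity(Double nnRowCount, Double rowCount, double guess) {
    if (nnRowCount == null || rowCount == null || rowCount <= 0) {
      return guess;
    }
    double sel = nnRowCount / rowCount;
    // Statistics may be stale, so keep the result inside [0, 1].
    return Math.min(1.0, Math.max(0.0, sel));
  }
}
```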




[GitHub] [drill] amansinha100 commented on a change in pull request #1699: DRILL-7108: Improve selectivity estimates for (NOT)LIKE, NOT_EQUALS, IS NOT NULL predicates

2019-03-19 Thread GitBox
amansinha100 commented on a change in pull request #1699: DRILL-7108: Improve 
selectivity estimates for (NOT)LIKE, NOT_EQUALS, IS NOT NULL predicates
URL: https://github.com/apache/drill/pull/1699#discussion_r267156075
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/planner/cost/DrillRelMdSelectivity.java
 ##
 @@ -56,6 +56,12 @@
   private static final DrillRelMdSelectivity INSTANCE = new 
DrillRelMdSelectivity();
   static final org.slf4j.Logger logger = 
org.slf4j.LoggerFactory.getLogger(DrillRelMdSelectivity.class);
   public static final RelMetadataProvider SOURCE = 
ReflectiveRelMetadataProvider.reflectiveSource(BuiltInMethod.SELECTIVITY.method,
 INSTANCE);
+  /*
+   * For now, we are treating all LIKE predicates to have the same selectivity 
irrespective of the number or position
+   * of wildcard characters (%). This is no different than the present 
Drill/Calcite behaviour w.r.t to LIKE predicates.
 
 Review comment:
   There is a difference from the current behavior of Drill/Calcite because the 
selectivity number has been reduced to 5%. 




[GitHub] [drill] amansinha100 commented on a change in pull request #1699: DRILL-7108: Improve selectivity estimates for (NOT)LIKE, NOT_EQUALS, IS NOT NULL predicates

2019-03-19 Thread GitBox
amansinha100 commented on a change in pull request #1699: DRILL-7108: Improve 
selectivity estimates for (NOT)LIKE, NOT_EQUALS, IS NOT NULL predicates
URL: https://github.com/apache/drill/pull/1699#discussion_r267156795
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/planner/cost/DrillRelMdSelectivity.java
 ##
 @@ -173,6 +177,58 @@ private double getScanSelectivityInternal(DrillTable 
table, RexNode predicate, L
 return (sel > 1.0) ? 1.0 : sel;
   }
 
+  private double computeEqualsSelectivity(DrillTable table, RexNode orPred, 
List fieldNames) {
+String col = getColumn(orPred, fieldNames);
+if (col != null) {
+  if (table.getStatsTable() != null
+  && table.getStatsTable().getNdv(col) != null) {
+return 1.00 / table.getStatsTable().getNdv(col);
+  }
+}
+return guessSelectivity(orPred);
+  }
+
+  private double computeIsNullSelectivity(DrillTable table, RexNode orPred, 
List fieldNames) {
+String col = getColumn(orPred, fieldNames);
+if (col != null) {
+  if (table.getStatsTable() != null
+  && table.getStatsTable().getNNRowCount(col) != null) {
+// Cap selectivity between {1/Total_Rows, Calcite Guess}
+return Math.min(Math.max(1.0/table.getStatsTable().getRowCount(),
 
 Review comment:
   I missed the 1/getRowCount()... Why not just 
min((1 - NNRowCount/totalRowCount), guessSelectivity)?
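
   To make the comparison concrete, here is a small sketch of the two variants on plain numbers. The first method is an assumed reading of the truncated hunk above (floor at 1/rowCount, cap at the guess); the class and method names are made up for this example.

```java
// Hypothetical sketch comparing the two IS NULL estimates on plain values.
final class IsNullSelectivitySketch {

  /** Assumed reading of the hunk: floor at 1/rowCount, cap at the guess. */
  static double cappedBothSides(double nnRowCount, double rowCount, double guess) {
    double nullFraction = 1.0 - nnRowCount / rowCount;
    return Math.min(Math.max(1.0 / rowCount, nullFraction), guess);
  }

  /** Suggested simplification: cap at the guess only. */
  static double cappedAbove(double nnRowCount, double rowCount, double guess) {
    return Math.min(1.0 - nnRowCount / rowCount, guess);
  }
}
```

   The two differ only when the estimated null fraction falls below 1/rowCount, for example a column with no nulls: the floor keeps the estimate at one row, while the simplified form returns zero.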




Re: [DISCUSS] 1.16.0 release

2019-03-19 Thread Sorabh Hamirwasia
Hi All,
Thanks for your responses. We discussed the release in today's hangout. Most
of the JIRAs shared in this thread will be considered for 1.16, since they
need at most a week or two to wrap up.

   - DRILL-6540 adds Hadoop 3.1 compatibility, which poses a risk at this
   point in the release. It has been moved out to 1.17 so that we will have
   enough runtime with the new set of libraries.
   - DRILL-6642 upgraded the protobuf version from 2.5.0 to 3.6.1. This
   introduces a dependency on C++11 and requires a new ODBC driver to be
   built. Getting a new driver with the new protobuf may take around 2 months,
   as Simba has to upgrade their environment. The general agreement in the
   community was to revert the protobuf update on the 1.16.0 branch only, so
   that we can release a new ODBC driver quickly. Master will keep the
   original commit, so that by the 1.17 release timeline we will have a new
   driver with the updated protobuf.
   - Currently there are 23 Open and InProgress issues with 5 blocker bugs.
   Please work on these and try to wrap up ASAP.
   - Apart from the above, there are 14 issues under Review. Please ping the
   reviewers to help review your PR so that it can be merged ASAP.

We are currently looking at a first cut-off date at the end of the first week
of April. A few folks are working on the Parquet Metadata caching improvement
and will need a couple of days to estimate the remaining work. Once that is
done, we will finalize the cut-off date.

Thanks,
Sorabh

On Tue, Mar 19, 2019 at 5:24 AM Bohdan Kazydub 
wrote:

> Hello Sorabh,
>
> I'm currently working on a small bug when querying views from S3 storage
> with plain authentication [DRILL-7079].
> I'd like to include it into Drill 1.16. I'll need 1-2 days for making the
> PR as well.
>
> Thanks,
> Bohdan
>
>
> On Tue, Mar 19, 2019 at 1:55 PM Igor Guzenko 
> wrote:
>
> > Hello Sorabh,
> >
> > I'm currently working on small improvement for Hive schema show tables
> > [DRILL-7115].
> > I'd like to include it into Drill 1.16. I'll need 1-2 days for making the
> > PR.
> >
> > Thanks,
> > Igor
> >
> > On Mon, Mar 18, 2019 at 11:25 PM Kunal Khatua  wrote:
> > >
> > > Khurram
> > >
> > > Currently, for DRILL-7061, the feature has been disabled. If I'm able to
> > > get a commit within the week for DRILL-6960, this will be addressed in
> > > that.
> > >
> > > DRILL-7110 is something that is required as well, to allow DRILL-6960 to
> > > be accepted.
> > >
> > > ~ Kunal
> > > On 3/18/2019 1:20:16 PM, Khurram Faraaz  wrote:
> > > Can we also have the fix for DRILL-7061
> > > in Drill 1.16?
> > >
> > > Thanks,
> > > Khurram
> > >
> > > On Mon, Mar 18, 2019 at 11:40 AM Karthikeyan Manivannan
> > > kmanivan...@mapr.com> wrote:
> > >
> > > > Please include DRILL-7107
> > > > > <https://issues.apache.org/jira/browse/DRILL-7107>
> > > > I will open the PR today.
> > > > It is a small change and fixes a basic usability issue.
> > > >
> > > > Thanks.
> > > >
> > > > Karthik
> > > >
> > > > On Thu, Mar 14, 2019 at 4:50 PM Charles Givre wrote:
> > > >
> > > > > > One more… DRILL-7014 is basically done and I’d like to see that get into
> > > > > > Drill 1.16.
> > > > >
> > > > > > On Mar 14, 2019, at 19:44, Charles Givre wrote:
> > > > > >
> > > > > > Who should I add as a reviewer for 7032?
> > > > > >
> > > > > >> On Mar 14, 2019, at 19:42, Sorabh Hamirwasia
> > > > >
> > > > > wrote:
> > > > > >>
> > > > > >> Hi Charles,
> > > > > >> Can you please add reviewer for DRILL-7032 ?
> > > > > > >> For DRILL-6970 the PR is closed by the author, I have pinged in JIRA asking
> > > > > > >> to re-open so that it can be reviewed.
> > > > > >>
> > > > > >> Thanks,
> > > > > >> Sorabh
> > > > > >>
> > > > > >> On Thu, Mar 14, 2019 at 4:29 PM Charles Givre
> > > > wrote:
> > > > > >>
> > > > > >>> Hi Sorabh,
> > > > > > >>> I have 3 PRs that are almost done, awaiting final review. DRILL-7077,
> > > > > > >>> DRILL-7032, DRILL-7021. I owe @ariina some fixes for 7077, but I’m waiting
> > > > > > >>> for review of the others. Also, there is that DRILL-6970 about the buffer
> > > > > > >>> overflows in the logRegex reader that isn’t mine but I’d like to see
> > > > > > >>> included.
> > > > > >>> Thanks,
> > > > > >>> —C
> > > > > >>>
> > > > >  On Mar 14, 2019, at 13:13, Sorabh Hamirwasia
> > > > sohami.apa...@gmail.com
> > > > > >
> > > > > >>> wrote:
> > > > > 
> > > > >  Hi Arina,
> > > > >  Thanks for your

[GitHub] [drill] sohami commented on a change in pull request #1702: DRILL-7107 Unable to connect to Drill 1.15 through ZK

2019-03-19 Thread GitBox
sohami commented on a change in pull request #1702: DRILL-7107 Unable to 
connect to Drill 1.15 through ZK
URL: https://github.com/apache/drill/pull/1702#discussion_r267132228
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/server/Drillbit.java
 ##
 @@ -173,7 +175,11 @@ public Drillbit(
   coord = serviceSet.getCoordinator();
   storeProvider = new CachingPersistentStoreProvider(new 
LocalPersistentStoreProvider(config));
 } else {
-  coord = new ZKClusterCoordinator(config, context);
+  String clusterId = config.getString(ExecConstants.SERVICE_NAME);
+  String zkRoot = config.getString(ExecConstants.ZK_ROOT);
+  String drillClusterPath = "/" + zkRoot + "/" +  clusterId;
+  ACLProvider aclProvider = ZKACLProviderFactory.getACLProvider(config, 
drillClusterPath, context);
 
 Review comment:
   please use `final`




[GitHub] [drill] sohami commented on a change in pull request #1702: DRILL-7107 Unable to connect to Drill 1.15 through ZK

2019-03-19 Thread GitBox
sohami commented on a change in pull request #1702: DRILL-7107 Unable to 
connect to Drill 1.15 through ZK
URL: https://github.com/apache/drill/pull/1702#discussion_r267146504
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/coord/zk/ZKClusterCoordinator.java
 ##
 @@ -81,23 +78,20 @@
   private ConcurrentHashMap endpointsMap = new 
ConcurrentHashMap();
   private static final Pattern ZK_COMPLEX_STRING = 
Pattern.compile("(^.*?)/(.*)/([^/]*)$");
 
-  public ZKClusterCoordinator(DrillConfig config, String connect)
-  throws IOException, DrillbitStartupException {
-this(config, connect, null);
+  public ZKClusterCoordinator(DrillConfig config, String connect) {
+this(config, connect, new DefaultACLProvider());
   }
 
-  public ZKClusterCoordinator(DrillConfig config, BootStrapContext context)
-  throws IOException, DrillbitStartupException {
-this(config, null, context);
+  public ZKClusterCoordinator(DrillConfig config, ACLProvider aclProvider) {
+this(config, null, aclProvider);
 
 Review comment:
   Please add test for this codepath using `ClusterFixture`.




[GitHub] [drill] kkhatua commented on a change in pull request #1703: DRILL-7110: Skip writing profile when an ALTER SESSION is executed

2019-03-19 Thread GitBox
kkhatua commented on a change in pull request #1703: DRILL-7110: Skip writing 
profile when an ALTER SESSION is executed
URL: https://github.com/apache/drill/pull/1703#discussion_r267143781
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/ExecConstants.java
 ##
 @@ -838,7 +838,10 @@ private ExecConstants() {
* for any query.
*/
   public static final String ENABLE_QUERY_PROFILE_OPTION = 
"exec.query_profile.save";
-  public static final BooleanValidator ENABLE_QUERY_PROFILE_VALIDATOR = new 
BooleanValidator(ENABLE_QUERY_PROFILE_OPTION, null);
+  public static final BooleanValidator ENABLE_QUERY_PROFILE_VALIDATOR = new 
BooleanValidator(ENABLE_QUERY_PROFILE_OPTION, new OptionDescription("Save 
completed profiles to the persistent store"));
+  //Allow to skip writing Alter Session profiles
+  public static final String SKIP_ALTER_SESSION_QUERY_PROFILE = 
"exec.query_profile.alter_session.skip";
+  public static final BooleanValidator SKIP_SESSION_QUERY_PROFILE_VALIDATOR = 
new BooleanValidator(SKIP_ALTER_SESSION_QUERY_PROFILE, new 
OptionDescription("Skip saving ALTER SESSION profiles"));
 
 Review comment:
   No. No `ALTER SESSION ...` query will be logged, since the option(s) are set 
in SESSION scope. After a `RESET`, any subsequent query will log with no session 
options showing up in its profile. The altered session options always show up 
in the profiles of the queries they affect, so logging the `ALTER SESSION ...` 
query itself is redundant.
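
   As a rough sketch of the decision this option controls (the class, method, and parameter names are hypothetical; the real check lives elsewhere in the PR and reads `exec.query_profile.save` and `exec.query_profile.alter_session.skip` from the option manager):

```java
// Hypothetical sketch of the profile-writing decision; not the PR's actual code.
final class ProfileWriteSketch {

  /**
   * @param saveProfiles     value of exec.query_profile.save
   * @param skipAlterSession value of exec.query_profile.alter_session.skip
   * @param isAlterSession   whether the completed query was an ALTER SESSION statement
   * @return true if the completed query's profile should be persisted
   */
  static boolean shouldWriteProfile(boolean saveProfiles, boolean skipAlterSession, boolean isAlterSession) {
    if (!saveProfiles) {
      return false;
    }
    // Skip only the ALTER SESSION statement itself; later queries still record session options.
    return !(skipAlterSession && isAlterSession);
  }
}
```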




[GitHub] [drill] gparai commented on issue #1699: DRILL-7108: Improve selectivity estimates for (NOT)LIKE, NOT_EQUALS, IS NOT NULL predicates

2019-03-19 Thread GitBox
gparai commented on issue #1699: DRILL-7108: Improve selectivity estimates for 
(NOT)LIKE, NOT_EQUALS, IS NOT NULL predicates
URL: https://github.com/apache/drill/pull/1699#issuecomment-474630689
 
 
   @amansinha100 I have addressed your review comments. Please take a look.




[GitHub] [drill] gparai commented on a change in pull request #1699: DRILL-7108: Improve selectivity estimates for (NOT)LIKE, NOT_EQUALS, IS NOT NULL predicates

2019-03-19 Thread GitBox
gparai commented on a change in pull request #1699: DRILL-7108: Improve 
selectivity estimates for (NOT)LIKE, NOT_EQUALS, IS NOT NULL predicates
URL: https://github.com/apache/drill/pull/1699#discussion_r267142502
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/planner/common/DrillStatsTable.java
 ##
 @@ -134,6 +135,33 @@ public Double getRowCount() {
 return rowCount > 0 ? rowCount : null;
   }
 
+  /**
+   * Get the approximate number of distinct values of given column. If stats 
are not present for the
+   * given column, a null is returned.
+   *
+   * Note: returned data may not be accurate. Accuracy depends on whether the 
table data has changed
+   * after the stats are computed.
+   *
+   * @param col - column for which approximate count distinct is desired
+   * @return approximate count distinct of the column, if available. NULL 
otherwise.
+   */
+  public Double getNNRowCount(String col) {
+// Stats might not have materialized because of errors.
+if (!materialized) {
+  return null;
+}
+final String upperCol = col.toUpperCase();
+Long nnRowCntCol = nnRowCount.get(upperCol);
+if (nnRowCntCol == null) {
+  nnRowCntCol = 
nnRowCount.get(SchemaPath.getSimplePath(upperCol).toString());
 
 Review comment:
   I think the SchemaPath method was used to correctly construct names for 
nested scalar columns. Maybe it would also work for simple columns but I did 
not try it. I can make this change later on.
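
   For reference, the lookup pattern in the hunk is "try the upper-cased name, then fall back to the path-formatted name". The sketch below shows it with a plain map; `quotedPath` is a hypothetical stand-in for `SchemaPath.getSimplePath(name).toString()`, and the backtick quoting it produces is an assumption about that method's output.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the two-step key lookup; not the DrillStatsTable code.
final class StatsLookupSketch {

  // Assumption: stands in for SchemaPath.getSimplePath(name).toString().
  static String quotedPath(String name) {
    return "`" + name + "`";
  }

  /** Returns the stored count for the column, or null when neither key form is present. */
  static Long lookup(Map<String, Long> counts, String col) {
    String upperCol = col.toUpperCase();
    Long value = counts.get(upperCol);
    if (value == null) {
      // Fall back to the path-formatted key.
      value = counts.get(quotedPath(upperCol));
    }
    return value;
  }

  static Map<String, Long> sampleCounts() {
    Map<String, Long> counts = new HashMap<>();
    counts.put("A", 10L);
    counts.put("`B`", 20L);
    return counts;
  }
}
```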




[GitHub] [drill] gparai commented on a change in pull request #1699: DRILL-7108: Improve selectivity estimates for (NOT)LIKE, NOT_EQUALS, IS NOT NULL predicates

2019-03-19 Thread GitBox
gparai commented on a change in pull request #1699: DRILL-7108: Improve 
selectivity estimates for (NOT)LIKE, NOT_EQUALS, IS NOT NULL predicates
URL: https://github.com/apache/drill/pull/1699#discussion_r267141735
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/planner/cost/DrillRelMdSelectivity.java
 ##
 @@ -173,6 +173,61 @@ private double getScanSelectivityInternal(DrillTable 
table, RexNode predicate, L
 return (sel > 1.0) ? 1.0 : sel;
   }
 
+  private double computeEqualsSelectivity(DrillTable table, RexNode orPred, 
List fieldNames) {
+if (orPred instanceof RexCall) {
+  int colIdx = -1;
+  RexInputRef op = findRexInputRef(orPred);
+  if (op != null) {
+colIdx = op.hashCode();
+  }
+  if (colIdx != -1 && colIdx < fieldNames.size()) {
+String col = fieldNames.get(colIdx);
+if (table.getStatsTable() != null
+&& table.getStatsTable().getNdv(col) != null) {
+  return 1.00 / table.getStatsTable().getNdv(col);
 
 Review comment:
   Done




[GitHub] [drill] gparai commented on a change in pull request #1699: DRILL-7108: Improve selectivity estimates for (NOT)LIKE, NOT_EQUALS, IS NOT NULL predicates

2019-03-19 Thread GitBox
gparai commented on a change in pull request #1699: DRILL-7108: Improve 
selectivity estimates for (NOT)LIKE, NOT_EQUALS, IS NOT NULL predicates
URL: https://github.com/apache/drill/pull/1699#discussion_r267137297
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/planner/cost/DrillRelMdSelectivity.java
 ##
 @@ -173,6 +173,61 @@ private double getScanSelectivityInternal(DrillTable 
table, RexNode predicate, L
 return (sel > 1.0) ? 1.0 : sel;
   }
 
+  private double computeEqualsSelectivity(DrillTable table, RexNode orPred, 
List fieldNames) {
+if (orPred instanceof RexCall) {
+  int colIdx = -1;
+  RexInputRef op = findRexInputRef(orPred);
+  if (op != null) {
+colIdx = op.hashCode();
+  }
+  if (colIdx != -1 && colIdx < fieldNames.size()) {
+String col = fieldNames.get(colIdx);
+if (table.getStatsTable() != null
+&& table.getStatsTable().getNdv(col) != null) {
+  return 1.00 / table.getStatsTable().getNdv(col);
+}
+  } else {
+if (logger.isDebugEnabled()) {
+logger.warn(String.format("No input reference $[%s] found for 
predicate [%s]",
+Integer.toString(colIdx), orPred.toString()));
+}
+  }
+}
+if (logger.isDebugEnabled()) {
+  logger.warn(String.format("Using guess for predicate [%s]", 
orPred.toString()));
+}
+//CALCITE guess
+return RelMdUtil.guessSelectivity(orPred);
+  }
+
+  private double computeNullSelectivity(DrillTable table, RexNode orPred, 
List fieldNames) {
+if (orPred instanceof RexCall) {
+  int colIdx = -1;
+  RexInputRef op = findRexInputRef(orPred);
+  if (op != null) {
+colIdx = op.hashCode();
+  }
+  if (colIdx != -1 && colIdx < fieldNames.size()) {
+String col = fieldNames.get(colIdx);
+if (table.getStatsTable() != null
+&& table.getStatsTable().getNNRowCount(col) != null) {
+  // Cap selectivity between {1/Total_Rows, Calcite Guess}
 
 Review comment:
   Done




[jira] [Created] (DRILL-7122) TPCDS queries 29 25 17 are slower when Statistics is disabled.

2019-03-19 Thread Robert Hou (JIRA)
Robert Hou created DRILL-7122:
-

 Summary: TPCDS queries 29 25 17 are slower when Statistics is 
disabled.
 Key: DRILL-7122
 URL: https://issues.apache.org/jira/browse/DRILL-7122
 Project: Apache Drill
  Issue Type: Bug
Reporter: Robert Hou






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] [drill] gparai commented on a change in pull request #1699: DRILL-7108: Improve selectivity estimates for (NOT)LIKE, NOT_EQUALS, IS NOT NULL predicates

2019-03-19 Thread GitBox
gparai commented on a change in pull request #1699: DRILL-7108: Improve 
selectivity estimates for (NOT)LIKE, NOT_EQUALS, IS NOT NULL predicates
URL: https://github.com/apache/drill/pull/1699#discussion_r267136173
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/planner/cost/DrillRelMdSelectivity.java
 ##
 @@ -173,6 +173,61 @@ private double getScanSelectivityInternal(DrillTable 
table, RexNode predicate, L
 return (sel > 1.0) ? 1.0 : sel;
   }
 
+  private double computeEqualsSelectivity(DrillTable table, RexNode orPred, 
List fieldNames) {
+if (orPred instanceof RexCall) {
+  int colIdx = -1;
+  RexInputRef op = findRexInputRef(orPred);
+  if (op != null) {
+colIdx = op.hashCode();
+  }
+  if (colIdx != -1 && colIdx < fieldNames.size()) {
+String col = fieldNames.get(colIdx);
+if (table.getStatsTable() != null
+&& table.getStatsTable().getNdv(col) != null) {
+  return 1.00 / table.getStatsTable().getNdv(col);
+}
+  } else {
+if (logger.isDebugEnabled()) {
+logger.warn(String.format("No input reference $[%s] found for 
predicate [%s]",
+Integer.toString(colIdx), orPred.toString()));
+}
+  }
+}
+if (logger.isDebugEnabled()) {
+  logger.warn(String.format("Using guess for predicate [%s]", 
orPred.toString()));
+}
+//CALCITE guess
+return RelMdUtil.guessSelectivity(orPred);
+  }
+
+  private double computeNullSelectivity(DrillTable table, RexNode orPred, 
List fieldNames) {
 
 Review comment:
   Done




[GitHub] [drill] gparai commented on a change in pull request #1699: DRILL-7108: Improve selectivity estimates for (NOT)LIKE, NOT_EQUALS, IS NOT NULL predicates

2019-03-19 Thread GitBox
gparai commented on a change in pull request #1699: DRILL-7108: Improve 
selectivity estimates for (NOT)LIKE, NOT_EQUALS, IS NOT NULL predicates
URL: https://github.com/apache/drill/pull/1699#discussion_r267136048
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/planner/cost/DrillRelMdSelectivity.java
 ##
 @@ -137,34 +138,33 @@ private double getScanSelectivityInternal(DrillTable 
table, RexNode predicate, L
 for (RexNode pred : RelOptUtil.conjunctions(predicate)) {
   double orSel = 0;
   for (RexNode orPred : RelOptUtil.disjunctions(pred)) {
-//CALCITE guess
-Double guess = RelMdUtil.guessSelectivity(pred);
+
 if (orPred.isA(SqlKind.EQUALS)) {
+  orSel += computeEqualsSelectivity(table, orPred, fieldNames);
+} else if (orPred.isA(SqlKind.NOT_EQUALS)) {
+  orSel += 1.0 - computeEqualsSelectivity(table, orPred, fieldNames);
+} else if (orPred.isA(SqlKind.LIKE)) {
+  // LIKE selectivity is 5% more than a similar equality predicate, 
capped at CALCITE guess
+  orSel +=  Math.min(computeEqualsSelectivity(table, orPred, 
fieldNames) + LIKE_PREDICATE_SELECTIVITY,
+  RelMdUtil.guessSelectivity(orPred));
+} else if (orPred.isA(SqlKind.NOT)) {
   if (orPred instanceof RexCall) {
-int colIdx = -1;
-RexInputRef op = findRexInputRef(orPred);
-if (op != null) {
-  colIdx = op.hashCode();
-}
-if (colIdx != -1 && colIdx < fieldNames.size()) {
-  String col = fieldNames.get(colIdx);
-  if (table.getStatsTable() != null
-  && table.getStatsTable().getNdv(col) != null) {
-orSel += 1.00 / table.getStatsTable().getNdv(col);
-  } else {
-orSel += guess;
-  }
+// LIKE selectivity is 5% more than a similar equality predicate, 
capped at CALCITE guess
+RexNode childOp = ((RexCall) orPred).getOperands().get(0);
+if (childOp.isA(SqlKind.LIKE)) {
+  orSel += 1.0 - Math.min(computeEqualsSelectivity(table, childOp, 
fieldNames) + LIKE_PREDICATE_SELECTIVITY,
+  RelMdUtil.guessSelectivity(childOp));
 } else {
-  orSel += guess;
-  if (logger.isDebugEnabled()) {
-logger.warn(String.format("No input reference $[%s] found for 
predicate [%s]",
-Integer.toString(colIdx), orPred.toString()));
-  }
+  orSel += 1.0 - RelMdUtil.guessSelectivity(orPred);
 }
   }
+} else if (orPred.isA(SqlKind.IS_NULL)) {
+  orSel += computeNullSelectivity(table, orPred, fieldNames);
+} else if (orPred.isA(SqlKind.IS_NOT_NULL)) {
+  orSel += 1.0 - computeNullSelectivity(table, orPred, fieldNames);
 
 Review comment:
   The NullSelectivity is essentially computed as 1-NNRowCount/TotalRowCount. 
So 1-NullSelectivity = NNRowCount/TotalRowCount; I was just reusing the 
function so we get `Cap` checks for free.




[GitHub] [drill] gparai commented on a change in pull request #1699: DRILL-7108: Improve selectivity estimates for (NOT)LIKE, NOT_EQUALS, IS NOT NULL predicates

2019-03-19 Thread GitBox
gparai commented on a change in pull request #1699: DRILL-7108: Improve 
selectivity estimates for (NOT)LIKE, NOT_EQUALS, IS NOT NULL predicates
URL: https://github.com/apache/drill/pull/1699#discussion_r267135535
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/planner/cost/DrillRelMdSelectivity.java
 ##
 @@ -56,6 +56,7 @@
   private static final DrillRelMdSelectivity INSTANCE = new 
DrillRelMdSelectivity();
   static final org.slf4j.Logger logger = 
org.slf4j.LoggerFactory.getLogger(DrillRelMdSelectivity.class);
   public static final RelMetadataProvider SOURCE = 
ReflectiveRelMetadataProvider.reflectiveSource(BuiltInMethod.SELECTIVITY.method,
 INSTANCE);
+  private static final double LIKE_PREDICATE_SELECTIVITY = 0.05;
 
 Review comment:
   Done




[GitHub] [drill] gparai commented on a change in pull request #1699: DRILL-7108: Improve selectivity estimates for (NOT)LIKE, NOT_EQUALS, IS NOT NULL predicates

2019-03-19 Thread GitBox
gparai commented on a change in pull request #1699: DRILL-7108: Improve 
selectivity estimates for (NOT)LIKE, NOT_EQUALS, IS NOT NULL predicates
URL: https://github.com/apache/drill/pull/1699#discussion_r267135566
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/planner/common/DrillStatsTable.java
 ##
 @@ -134,6 +135,33 @@ public Double getRowCount() {
 return rowCount > 0 ? rowCount : null;
   }
 
+  /**
+   * Get the approximate number of distinct values of given column. If stats 
are not present for the
+   * given column, a null is returned.
+   *
+   * Note: returned data may not be accurate. Accuracy depends on whether the 
table data has changed
+   * after the stats are computed.
+   *
+   * @param col - column for which approximate count distinct is desired
+   * @return approximate count distinct of the column, if available. NULL 
otherwise.
+   */
+  public Double getNNRowCount(String col) {
+// Stats might not have materialized because of errors.
+if (!materialized) {
+  return null;
+}
+final String upperCol = col.toUpperCase();
+Long nnRowCntCol = nnRowCount.get(upperCol);
+if (nnRowCntCol == null) {
+  nnRowCntCol = 
nnRowCount.get(SchemaPath.getSimplePath(upperCol).toString());
+}
+// Ndv estimation techniques like HLL may over-estimate, hence cap it at 
rowCount
 
 Review comment:
   Done




[GitHub] [drill] gparai commented on a change in pull request #1699: DRILL-7108: Improve selectivity estimates for (NOT)LIKE, NOT_EQUALS, IS NOT NULL predicates

2019-03-19 Thread GitBox
gparai commented on a change in pull request #1699: DRILL-7108: Improve 
selectivity estimates for (NOT)LIKE, NOT_EQUALS, IS NOT NULL predicates
URL: https://github.com/apache/drill/pull/1699#discussion_r267135586
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/planner/common/DrillStatsTable.java
 ##
 @@ -134,6 +135,33 @@ public Double getRowCount() {
 return rowCount > 0 ? rowCount : null;
   }
 
+  /**
 
 Review comment:
   Done




[jira] [Created] (DRILL-7123) TPCDS query 83 runs slower when Statistics is disabled

2019-03-19 Thread Robert Hou (JIRA)
Robert Hou created DRILL-7123:
-

 Summary: TPCDS query 83 runs slower when Statistics is disabled
 Key: DRILL-7123
 URL: https://issues.apache.org/jira/browse/DRILL-7123
 Project: Apache Drill
  Issue Type: Bug
  Components: Query Planning & Optimization
Affects Versions: 1.16.0
Reporter: Robert Hou
Assignee: Gautam Parai
 Fix For: 1.16.0


Query is TPCDS 83 with sf 100:
{noformat}
WITH sr_items 
 AS (SELECT i_item_id   item_id, 
Sum(sr_return_quantity) sr_item_qty 
 FROM   store_returns, 
item, 
date_dim 
 WHERE  sr_item_sk = i_item_sk 
AND d_date IN (SELECT d_date 
   FROM   date_dim 
   WHERE  d_week_seq IN (SELECT d_week_seq 
 FROM   date_dim 
 WHERE 
  d_date IN ( '1999-06-30', 
  '1999-08-28', 
  '1999-11-18' 
))) 
AND sr_returned_date_sk = d_date_sk 
 GROUP  BY i_item_id), 
 cr_items 
 AS (SELECT i_item_id   item_id, 
Sum(cr_return_quantity) cr_item_qty 
 FROM   catalog_returns, 
item, 
date_dim 
 WHERE  cr_item_sk = i_item_sk 
AND d_date IN (SELECT d_date 
   FROM   date_dim 
   WHERE  d_week_seq IN (SELECT d_week_seq 
 FROM   date_dim 
 WHERE 
  d_date IN ( '1999-06-30', 
  '1999-08-28', 
  '1999-11-18' 
))) 
AND cr_returned_date_sk = d_date_sk 
 GROUP  BY i_item_id), 
 wr_items 
 AS (SELECT i_item_id   item_id, 
Sum(wr_return_quantity) wr_item_qty 
 FROM   web_returns, 
item, 
date_dim 
 WHERE  wr_item_sk = i_item_sk 
AND d_date IN (SELECT d_date 
   FROM   date_dim 
   WHERE  d_week_seq IN (SELECT d_week_seq 
 FROM   date_dim 
 WHERE 
  d_date IN ( '1999-06-30', 
  '1999-08-28', 
  '1999-11-18' 
))) 
AND wr_returned_date_sk = d_date_sk 
 GROUP  BY i_item_id) 
SELECT sr_items.item_id, 
   sr_item_qty, 
   sr_item_qty / ( sr_item_qty + cr_item_qty + wr_item_qty ) / 3.0 
* 
   100 sr_dev, 
   cr_item_qty, 
   cr_item_qty / ( sr_item_qty + cr_item_qty + wr_item_qty ) / 3.0 
* 
   100 cr_dev, 
   wr_item_qty, 
   wr_item_qty / ( sr_item_qty + cr_item_qty + wr_item_qty ) / 3.0 
* 
   100 wr_dev, 
   ( sr_item_qty + cr_item_qty + wr_item_qty ) / 3.0 
   average 
FROM   sr_items, 
   cr_items, 
   wr_items 
WHERE  sr_items.item_id = cr_items.item_id 
   AND sr_items.item_id = wr_items.item_id 
ORDER  BY sr_items.item_id, 
  sr_item_qty
LIMIT 100; 
{noformat}

The number of threads for major fragments 1 and 2 has changed when Statistics 
is disabled.  The number of minor fragments has been reduced from 10 and 15 
fragments down to 3 fragments.  Rowcount has changed for major fragment 2 from 
1439754.0 down to 287950.8.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (DRILL-7121) TPCH 4 takes longer

2019-03-19 Thread Robert Hou (JIRA)
Robert Hou created DRILL-7121:
-

 Summary: TPCH 4 takes longer
 Key: DRILL-7121
 URL: https://issues.apache.org/jira/browse/DRILL-7121
 Project: Apache Drill
  Issue Type: Bug
  Components: Query Planning & Optimization
Affects Versions: 1.16.0
Reporter: Robert Hou
Assignee: Gautam Parai
 Fix For: 1.16.0


Here is TPCH 4 with sf 100:
{noformat}
select
  o.o_orderpriority,
  count(*) as order_count
from
  orders o

where
  o.o_orderdate >= date '1996-10-01'
  and o.o_orderdate < date '1996-10-01' + interval '3' month
  and 
  exists (
select
  *
from
  lineitem l
where
  l.l_orderkey = o.o_orderkey
  and l.l_commitdate < l.l_receiptdate
  )
group by
  o.o_orderpriority
order by
  o.o_orderpriority;
{noformat}

The plan has changed when Statistics is disabled.   A Hash Agg and a Broadcast 
Exchange have been added.  These two operators expand the number of rows from 
the lineitem table from 137M to 9B rows.   This forces the hash join to use 6GB 
of memory instead of 30 MB.





[GitHub] [drill] HanumathRao commented on issue #1704: DRILL-7113: Fix creation of filter conditions for IS NULL and IS NOT …

2019-03-19 Thread GitBox
HanumathRao commented on issue #1704: DRILL-7113: Fix creation of filter 
conditions for IS NULL and IS NOT …
URL: https://github.com/apache/drill/pull/1704#issuecomment-474619585
 
 
   @amansinha100  Thanks for making the changes. Changes look good to me. +1




[jira] [Created] (DRILL-7120) Query fails with ChannelClosedException

2019-03-19 Thread Robert Hou (JIRA)
Robert Hou created DRILL-7120:
-

 Summary: Query fails with ChannelClosedException
 Key: DRILL-7120
 URL: https://issues.apache.org/jira/browse/DRILL-7120
 Project: Apache Drill
  Issue Type: Bug
  Components: Query Planning & Optimization
Affects Versions: 1.16.0
Reporter: Robert Hou
Assignee: Gautam Parai
 Fix For: 1.16.0


TPCH query 5 fails at sf100.  Here is the query:
{noformat}
select
  n.n_name,
  sum(l.l_extendedprice * (1 - l.l_discount)) as revenue
from
  customer c,
  orders o,
  lineitem l,
  supplier s,
  nation n,
  region r
where
  c.c_custkey = o.o_custkey
  and l.l_orderkey = o.o_orderkey
  and l.l_suppkey = s.s_suppkey
  and c.c_nationkey = s.s_nationkey
  and s.s_nationkey = n.n_nationkey
  and n.n_regionkey = r.r_regionkey
  and r.r_name = 'EUROPE'
  and o.o_orderdate >= date '1997-01-01'
  and o.o_orderdate < date '1997-01-01' + interval '1' year
group by
  n.n_name
order by
  revenue desc;
{noformat}

This is the error from drillbit.log:
{noformat}
2019-03-04 17:46:38,684 [23822b0a-b7bd-0b79-b905-1438f5b1d039:frag:6:64] INFO  
o.a.d.e.w.fragment.FragmentExecutor - 
23822b0a-b7bd-0b79-b905-1438f5b1d039:6:64: State change requested RUNNING --> 
FINISHED
2019-03-04 17:46:38,684 [23822b0a-b7bd-0b79-b905-1438f5b1d039:frag:6:64] INFO  
o.a.d.e.w.f.FragmentStatusReporter - 23822b0a-b7bd-0b79-b905-1438f5b1d039:6:64: 
State to report: FINISHED
2019-03-04 18:17:51,454 [BitServer-13] WARN  
o.a.d.exec.rpc.ProtobufLengthDecoder - Failure allocating buffer on incoming 
stream due to memory limits.  Current Allocation: 262144.
2019-03-04 18:17:51,454 [BitServer-13] ERROR o.a.drill.exec.rpc.data.DataServer 
- Out of memory in RPC layer.
2019-03-04 18:17:51,463 [BitServer-13] ERROR o.a.d.exec.rpc.RpcExceptionHandler 
- Exception in RPC communication.  Connection: /10.10.120.104:31012 <--> 
/10.10.120.106:53048 (data server).  Closing connection.
io.netty.handler.codec.DecoderException: 
org.apache.drill.exec.exception.OutOfMemoryException: Failure allocating buffer.
at 
io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:271)
 ~[netty-codec-4.0.48.Final.jar:4.0.48.Final]
at 
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356)
 [netty-transport-4.0.48.Final.jar:4.0.48.Final]
at 
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342)
 [netty-transport-4.0.48.Final.jar:4.0.48.Final]
at 
io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:335)
 [netty-transport-4.0.48.Final.jar:4.0.48.Final]
at 
io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:86)
 [netty-transport-4.0.48.Final.jar:4.0.48.Final]
at 
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356)
 [netty-transport-4.0.48.Final.jar:4.0.48.Final]
at 
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342)
 [netty-transport-4.0.48.Final.jar:4.0.48.Final]
at 
io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:335)
 [netty-transport-4.0.48.Final.jar:4.0.48.Final]
at 
io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1294)
 [netty-transport-4.0.48.Final.jar:4.0.48.Final]
at 
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356)
 [netty-transport-4.0.48.Final.jar:4.0.48.Final]
at 
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342)
 [netty-transport-4.0.48.Final.jar:4.0.48.Final]
at 
io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:911)
 [netty-transport-4.0.48.Final.jar:4.0.48.Final]
at 
io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)
 [netty-transport-4.0.48.Final.jar:4.0.48.Final]
at 
io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:645) 
[netty-transport-4.0.48.Final.jar:4.0.48.Final]
at 
io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:580)
 [netty-transport-4.0.48.Final.jar:4.0.48.Final]
at 
io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:497) 
[netty-transport-4.0.48.Final.jar:4.0.48.Final]
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:459) 
[netty-transport-4.0.48.Final.jar:4.0.48.Final]
at 
io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:131)
 [netty-common-4.0.48.Final.jar:4.0.48.Final]
at java.lang.Thread.run(Thread.java:745) [na:1.8.0_1

[GitHub] [drill] amansinha100 commented on issue #1704: DRILL-7113: Fix creation of filter conditions for IS NULL and IS NOT …

2019-03-19 Thread GitBox
amansinha100 commented on issue #1704: DRILL-7113: Fix creation of filter 
conditions for IS NULL and IS NOT …
URL: https://github.com/apache/drill/pull/1704#issuecomment-474612404
 
 
   @HanumathRao  could you please review?




[GitHub] [drill] amansinha100 opened a new pull request #1704: DRILL-7113: Fix creation of filter conditions for IS NULL and IS NOT …

2019-03-19 Thread GitBox
amansinha100 opened a new pull request #1704: DRILL-7113: Fix creation of 
filter conditions for IS NULL and IS NOT …
URL: https://github.com/apache/drill/pull/1704
 
 
   …NULL for MapR-DB format plugin




[GitHub] [drill] arina-ielchiieva commented on a change in pull request #1703: DRILL-7110: Skip writing profile when an ALTER SESSION is executed

2019-03-19 Thread GitBox
arina-ielchiieva commented on a change in pull request #1703: DRILL-7110: Skip 
writing profile when an ALTER SESSION is executed
URL: https://github.com/apache/drill/pull/1703#discussion_r267116610
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/ExecConstants.java
 ##
 @@ -838,7 +838,10 @@ private ExecConstants() {
* for any query.
*/
   public static final String ENABLE_QUERY_PROFILE_OPTION = 
"exec.query_profile.save";
-  public static final BooleanValidator ENABLE_QUERY_PROFILE_VALIDATOR = new 
BooleanValidator(ENABLE_QUERY_PROFILE_OPTION, null);
+  public static final BooleanValidator ENABLE_QUERY_PROFILE_VALIDATOR = new 
BooleanValidator(ENABLE_QUERY_PROFILE_OPTION, new OptionDescription("Save 
completed profiles to the persistent store"));
+  //Allow to skip writing Alter Session profiles
+  public static final String SKIP_ALTER_SESSION_QUERY_PROFILE = 
"exec.query_profile.alter_session.skip";
+  public static final BooleanValidator SKIP_SESSION_QUERY_PROFILE_VALIDATOR = 
new BooleanValidator(SKIP_ALTER_SESSION_QUERY_PROFILE, new 
OptionDescription("Skip saving ALTER SESSION profiles"));
 
 Review comment:
   How do we handle the reset command? Will it be logged?




[GitHub] [drill] arina-ielchiieva commented on a change in pull request #1703: DRILL-7110: Skip writing profile when an ALTER SESSION is executed

2019-03-19 Thread GitBox
arina-ielchiieva commented on a change in pull request #1703: DRILL-7110: Skip 
writing profile when an ALTER SESSION is executed
URL: https://github.com/apache/drill/pull/1703#discussion_r267114576
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/ops/QueryContext.java
 ##
 @@ -92,6 +93,7 @@ public QueryContext(final UserSession session, final 
DrillbitContext drillbitCon
 this.drillbitContext = drillbitContext;
 this.session = session;
 this.queryId = queryId;
+skipWritingProfile(false);
 
 Review comment:
   ```suggestion
   this.skipProfileWrite = false;
   ```




[GitHub] [drill] kkhatua opened a new pull request #1703: DRILL-7110: Skip writing profile when an ALTER SESSION is executed

2019-03-19 Thread GitBox
kkhatua opened a new pull request #1703: DRILL-7110: Skip writing profile when 
an ALTER SESSION is executed
URL: https://github.com/apache/drill/pull/1703
 
 
   By default, `ALTER SESSION SET =` queries are NOT written to the profile 
store. This avoids accumulating a large number of unnecessary profiles, since 
those changes are also reflected in the queries that follow.




[GitHub] [drill] HanumathRao commented on issue #1691: DRILL-7093 Batch Sizing in SingleSender

2019-03-19 Thread GitBox
HanumathRao commented on issue #1691: DRILL-7093 Batch Sizing in SingleSender
URL: https://github.com/apache/drill/pull/1691#issuecomment-474580674
 
 
   Code changes look good to me. I have a few minor comments. 




[GitHub] [drill] HanumathRao commented on a change in pull request #1691: DRILL-7093 Batch Sizing in SingleSender

2019-03-19 Thread GitBox
HanumathRao commented on a change in pull request #1691: DRILL-7093 Batch 
Sizing in SingleSender
URL: https://github.com/apache/drill/pull/1691#discussion_r267088994
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/record/SimpleRecordBatch.java
 ##
 @@ -88,6 +88,11 @@ public WritableBatch getWritableBatch() {
 throw new UnsupportedOperationException();
   }
 
+  @Override
+  public WritableBatch getWritableBatch(int start, int length) {
+throw new UnsupportedOperationException();
 
 Review comment:
   Should there be a message string here, along similar lines to the 
SchemalessBatch exception?
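
A minimal sketch of what the comment above asks for, assuming only standard Java. The class name, the `Object` return type, and the wording of the message are hypothetical stand-ins; the actual SchemalessBatch message is not shown in this thread.

```java
// Hypothetical stand-in for a batch type that cannot produce a partial writable batch.
final class UnsupportedBatchSketch {

  /** Always throws, but says which call was rejected and with which arguments. */
  static Object getWritableBatch(int start, int length) {
    throw new UnsupportedOperationException(String.format(
        "getWritableBatch(start=%d, length=%d) is not supported by this batch type",
        start, length));
  }
}
```

A descriptive message costs one line and makes the resulting stack trace self-explanatory when the method is reached unexpectedly.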




[GitHub] [drill] HanumathRao commented on a change in pull request #1691: DRILL-7093 Batch Sizing in SingleSender

2019-03-19 Thread GitBox
HanumathRao commented on a change in pull request #1691: DRILL-7093 Batch 
Sizing in SingleSender
URL: https://github.com/apache/drill/pull/1691#discussion_r267080211
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/mergereceiver/MergingRecordBatch.java
 ##
 @@ -838,4 +838,13 @@ public void dump() {
 container, outgoingPosition, Arrays.toString(incomingBatches), 
Arrays.toString(batchOffsets),
 Arrays.toString(tempBatchHolder), Arrays.toString(inputCounts), 
Arrays.toString(outputCounts));
   }
+
+  @Override
+  public WritableBatch getWritableBatch(int startIndex, int length) {
+VectorContainer partialContainer = new 
VectorContainer(context.getAllocator(), getSchema());
 
 Review comment:
   It looks like getWritableBatch in MergingRecordBatch, SpilledRecordBatch, 
AbstractRecordBatch and RecordBatchLoader is the same. Is there any way to write 
a common function in the hierarchy and use it in all these scenarios?
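
One possible shape for such a shared function, sketched with simplified stand-ins rather than Drill's real types. Everything here is an assumption for illustration: `SliceableBatchSketch`, `rows()`, and the use of `List<String>` in place of a vector container are hypothetical, and the four classes named above may not share a suitable parent. An interface default method is used because it does not require a common superclass.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical interface: each batch type only exposes its data.
interface SliceableBatchSketch {

  /** Stand-in for the batch contents; a real batch would expose vectors. */
  List<String> rows();

  /** Written once here, inherited by every implementing batch type. */
  default List<String> getWritableBatch(int startIndex, int length) {
    List<String> all = rows();
    if (startIndex < 0 || length < 0 || startIndex + length > all.size()) {
      throw new IndexOutOfBoundsException(
          "start=" + startIndex + ", length=" + length + ", size=" + all.size());
    }
    // Copy the requested window so the caller owns the result.
    return new ArrayList<>(all.subList(startIndex, startIndex + length));
  }
}

// Two hypothetical batch types that reuse the shared implementation unchanged.
final class MergingBatchSketch implements SliceableBatchSketch {
  @Override
  public List<String> rows() {
    return Arrays.asList("m0", "m1", "m2", "m3");
  }
}

final class SpilledBatchSketch implements SliceableBatchSketch {
  @Override
  public List<String> rows() {
    return Arrays.asList("s0", "s1");
  }
}
```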




[GitHub] [drill] arina-ielchiieva commented on a change in pull request #1637: DRILL-7032: Ignore corrupt rows in a PCAP file

2019-03-19 Thread GitBox
arina-ielchiieva commented on a change in pull request #1637: DRILL-7032: 
Ignore corrupt rows in a PCAP file
URL: https://github.com/apache/drill/pull/1637#discussion_r267070105
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/store/pcap/decoder/Packet.java
 ##
 @@ -324,7 +329,11 @@ public int getDst_port() {
 byte[] data = null;
 if (packetLength >= payloadDataStart) {
   data = new byte[packetLength - payloadDataStart];
-  System.arraycopy(raw, ipOffset + payloadDataStart, data, 0, data.length);
+  try {
+System.arraycopy(raw, ipOffset + payloadDataStart, data, 0, 
data.length);
+  } catch (Exception e) {
 
 Review comment:
   Which exception is thrown? Is it useful? Should we log a trace of it in 
case the user wants to get details?
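
For reference, `System.arraycopy` is documented to throw `IndexOutOfBoundsException` when the copy would access data outside the array bounds. The sketch below catches that narrower type and logs it. It is not the PR's code: the class and method names are hypothetical, and `java.util.logging` is used only to keep the example self-contained (the project's own logger would be used in practice).

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical helper illustrating a narrow catch plus a log trace.
final class PayloadCopySketch {
  private static final Logger logger =
      Logger.getLogger(PayloadCopySketch.class.getName());

  /** Returns the copied bytes, or null when the offsets point outside the raw buffer. */
  static byte[] copyPayload(byte[] raw, int offset, int length) {
    if (length < 0) {
      // new byte[length] would throw NegativeArraySizeException; treat as corrupt.
      return null;
    }
    byte[] data = new byte[length];
    try {
      System.arraycopy(raw, offset, data, 0, length);
      return data;
    } catch (IndexOutOfBoundsException e) {
      // Keep the details available for users who enable fine-grained logging.
      logger.log(Level.FINE,
          "Corrupt packet: offset=" + offset + ", length=" + length
              + ", raw.length=" + raw.length, e);
      return null;
    }
  }
}
```

Catching the specific exception keeps unrelated failures visible, while the caller can still flag the packet as corrupt when `null` comes back.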




[GitHub] [drill] arina-ielchiieva commented on a change in pull request #1637: DRILL-7032: Ignore corrupt rows in a PCAP file

2019-03-19 Thread GitBox
arina-ielchiieva commented on a change in pull request #1637: DRILL-7032: 
Ignore corrupt rows in a PCAP file
URL: https://github.com/apache/drill/pull/1637#discussion_r267074751
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/store/pcap/decoder/Packet.java
 ##
 @@ -53,6 +53,7 @@
   private int packetLength;
   protected int etherProtocol;
   protected int protocol;
+  protected boolean isCorrupt = false;
 
 Review comment:
   Maybe add the new field as the last one, not in the middle?




[GitHub] [drill] arina-ielchiieva commented on a change in pull request #1680: DRILL-7077: Add Function to Facilitate Time Series Analysis

2019-03-19 Thread GitBox
arina-ielchiieva commented on a change in pull request #1680: DRILL-7077: Add 
Function to Facilitate Time Series Analysis
URL: https://github.com/apache/drill/pull/1680#discussion_r267069017
 
 

 ##
 File path: 
contrib/udfs/src/test/java/org/apache/drill/exec/udfs/TestNearestDateFunctions.java
 ##
 @@ -0,0 +1,152 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.drill.exec.udfs;
+
+import org.apache.drill.categories.SqlFunctionTest;
+import org.apache.drill.categories.UnlikelyTest;
+import org.apache.drill.common.exceptions.DrillRuntimeException;
+import org.apache.drill.test.ClusterFixture;
+import org.apache.drill.test.ClusterFixtureBuilder;
+import org.apache.drill.test.ClusterTest;
+import org.junit.BeforeClass;
+import org.junit.Test;
+import org.junit.experimental.categories.Category;
+
+import java.time.LocalDateTime;
+
+import static org.junit.Assert.assertTrue;
+import static org.junit.Assert.fail;
+
+@Category({UnlikelyTest.class, SqlFunctionTest.class})
+public class TestNearestDateFunctions extends ClusterTest {
+
+  @BeforeClass
+  public static void setup() throws Exception {
+ClusterFixtureBuilder builder = ClusterFixture.builder(dirTestWatcher);
+startCluster(builder);
+  }
+
+  @Test
+  public void testNearestDate() throws Exception {
+String query = "SELECT nearestDate( TO_TIMESTAMP('2019-02-01 07:22:00', 
'-MM-dd HH:mm:ss'), 'YEAR') AS nearest_year, " +
+"nearestDate( TO_TIMESTAMP('2019-02-01 07:22:00', '-MM-dd 
HH:mm:ss'), 'QUARTER') AS nearest_quarter, " +
+"nearestDate( TO_TIMESTAMP('2019-02-15 07:22:00', '-MM-dd 
HH:mm:ss'), 'MONTH') AS nearest_month, " +
+"nearestDate( TO_TIMESTAMP('2019-02-15 07:22:00', '-MM-dd 
HH:mm:ss'), 'DAY') AS nearest_day, " +
+"nearestDate( TO_TIMESTAMP('2019-02-15 07:22:00', '-MM-dd 
HH:mm:ss'), 'WEEK_SUNDAY') AS nearest_week_sunday, " +
+"nearestDate( TO_TIMESTAMP('2019-02-15 07:22:00', '-MM-dd 
HH:mm:ss'), 'WEEK_MONDAY') AS nearest_week_monday, " +
+"nearestDate( TO_TIMESTAMP('2019-02-15 07:22:00', '-MM-dd 
HH:mm:ss'), 'HOUR') AS nearest_hour, " +
+"nearestDate( TO_TIMESTAMP('2019-02-15 07:42:00', '-MM-dd 
HH:mm:ss'), 'HALF_HOUR') AS nearest_half_hour, " +
+"nearestDate( TO_TIMESTAMP('2019-02-15 07:48:00', '-MM-dd 
HH:mm:ss'), 'QUARTER_HOUR') AS nearest_quarter_hour, " +
+"nearestDate( TO_TIMESTAMP('2019-02-15 07:22:00', '-MM-dd 
HH:mm:ss'), 'MINUTE') AS nearest_minute, " +
+"nearestDate( TO_TIMESTAMP('2019-02-15 07:22:22', '-MM-dd 
HH:mm:ss'), 'HALF_MINUTE') AS nearest_30second, " +
+"nearestDate( TO_TIMESTAMP('2019-02-15 07:22:22', '-MM-dd 
HH:mm:ss'), 'QUARTER_MINUTE') AS nearest_15second, " +
+"nearestDate( TO_TIMESTAMP('2019-02-15 07:22:31', '-MM-dd 
HH:mm:ss'), 'SECOND') AS nearest_second " +
+"FROM (VALUES(1))";
+testBuilder()
+.sqlQuery(query)
+.unOrdered()
+.baselineColumns("nearest_year",
+"nearest_quarter",
+"nearest_month",
+"nearest_day",
+"nearest_week_sunday",
+"nearest_week_monday",
+"nearest_hour",
+"nearest_half_hour",
+"nearest_quarter_hour",
+"nearest_minute",
+"nearest_30second",
+"nearest_15second",
+"nearest_second")
+.baselineValues(LocalDateTime.of(2019, 1, 1, 0, 0, 0),  //Year
+LocalDateTime.of(2019, 1, 1, 0, 0, 0),  //Quarter
+LocalDateTime.of(2019, 2, 1, 0, 0, 0), //Month
+LocalDateTime.of(2019, 2, 15, 0, 0, 0), //Day
+LocalDateTime.of(2019, 2, 10, 0, 0, 0), //Week Sunday
+LocalDateTime.of(2019, 2, 11, 0, 0, 0), //Week Monday
+LocalDateTime.of(2019, 2, 15, 7, 0, 0), //Hour
+LocalDateTime.of(2019, 2, 15, 7, 30, 0), //Half Hour
+LocalDateTime.of(2019, 2, 15, 7, 45, 0), //Q

[GitHub] [drill] arina-ielchiieva commented on a change in pull request #1680: DRILL-7077: Add Function to Facilitate Time Series Analysis

2019-03-19 Thread GitBox
arina-ielchiieva commented on a change in pull request #1680: DRILL-7077: Add 
Function to Facilitate Time Series Analysis
URL: https://github.com/apache/drill/pull/1680#discussion_r267068700
 
 

 ##
 File path: 
contrib/udfs/src/main/java/org/apache/drill/exec/udfs/NearestDateUtils.java
 ##
 @@ -0,0 +1,149 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.drill.exec.udfs;
+
+import org.apache.drill.common.exceptions.DrillRuntimeException;
+
+import java.time.temporal.TemporalAdjusters;
+import java.time.LocalDateTime;
+import java.time.DayOfWeek;
+import java.time.temporal.ChronoUnit;
+import java.util.Arrays;
+
+public class NearestDateUtils {
+  /**
+   * Specifies the time grouping to be used with the nearest date function
+   */
+  private enum TimeInterval {
+YEAR,
+QUARTER,
+MONTH,
+WEEK_SUNDAY,
+WEEK_MONDAY,
+DAY,
+HOUR,
+HALF_HOUR,
+QUARTER_HOUR,
+MINUTE,
+HALF_MINUTE,
+QUARTER_MINUTE,
+SECOND
+  }
+
+  private static final org.slf4j.Logger logger = 
org.slf4j.LoggerFactory.getLogger(NearestDateUtils.class);
 
 Review comment:
   ```suggestion
 private static final Logger logger = 
LoggerFactory.getLogger(NearestDateUtils.class);
   ```




[jira] [Created] (DRILL-7119) Modify selectivity calculations to use histograms

2019-03-19 Thread Aman Sinha (JIRA)
Aman Sinha created DRILL-7119:
-

 Summary: Modify selectivity calculations to use histograms
 Key: DRILL-7119
 URL: https://issues.apache.org/jira/browse/DRILL-7119
 Project: Apache Drill
  Issue Type: Sub-task
  Components: Query Planning & Optimization
Reporter: Aman Sinha
Assignee: Aman Sinha
 Fix For: 1.16.0


(Please see parent JIRA for the design document)
Once the t-digest based histogram is created, we need to read it back and 
modify the selectivity calculations such that they use the histogram buckets 
for range conditions.
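
To illustrate the idea (this is a sketch, not the design from the parent JIRA): assume the histogram has already been reduced to equi-depth bucket boundaries, so every bucket holds the same share of rows, and assume values are spread uniformly inside a bucket. The class and method names are hypothetical.

```java
// Hypothetical sketch: range selectivity from equi-depth histogram boundaries.
final class HistogramSelectivitySketch {

  /**
   * Estimated fraction of rows with value < v.
   * boundaries holds numBuckets + 1 ascending values; each bucket has equal row share.
   */
  static double lessThanSelectivity(double[] boundaries, double v) {
    int numBuckets = boundaries.length - 1;
    if (numBuckets < 1) {
      throw new IllegalArgumentException("need at least two boundaries");
    }
    if (v <= boundaries[0]) {
      return 0.0;
    }
    if (v >= boundaries[numBuckets]) {
      return 1.0;
    }
    for (int i = 0; i < numBuckets; i++) {
      double lo = boundaries[i];
      double hi = boundaries[i + 1];
      if (v < hi) {
        // Full buckets below v, plus a linearly interpolated part of bucket i.
        double within = hi > lo ? (v - lo) / (hi - lo) : 0.0;
        return (i + within) / numBuckets;
      }
    }
    return 1.0;
  }

  /** Estimated fraction of rows with low <= value < high. */
  static double rangeSelectivity(double[] boundaries, double low, double high) {
    if (high <= low) {
      return 0.0;
    }
    return lessThanSelectivity(boundaries, high) - lessThanSelectivity(boundaries, low);
  }
}
```

With boundaries `{0, 10, 20, 40}` (three buckets), a predicate `col < 15` covers one full bucket plus half of the second, so the estimate is 1.5 / 3 = 0.5.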





[jira] [Created] (DRILL-7118) Filter not getting pushed down on MapR-DB tables.

2019-03-19 Thread Hanumath Rao Maduri (JIRA)
Hanumath Rao Maduri created DRILL-7118:
--

 Summary: Filter not getting pushed down on MapR-DB tables.
 Key: DRILL-7118
 URL: https://issues.apache.org/jira/browse/DRILL-7118
 Project: Apache Drill
  Issue Type: Bug
  Components: Query Planning & Optimization
Affects Versions: 1.15.0
Reporter: Hanumath Rao Maduri
Assignee: Hanumath Rao Maduri
 Fix For: 1.16.0


A simple IS NULL filter is not being pushed down for MapR-DB tables. Here is 
a repro:
{code:java}
0: jdbc:drill:zk=local> explain plan for select * from dfs.`/tmp/js` where b is null;
ANTLR Tool version 4.5 used for code generation does not match the current runtime version 4.7.1
ANTLR Runtime version 4.5 used for parser compilation does not match the current runtime version 4.7.1
ANTLR Tool version 4.5 used for code generation does not match the current runtime version 4.7.1
ANTLR Runtime version 4.5 used for parser compilation does not match the current runtime version 4.7.1
+--+--+
| text | json |
+--+--+
| 00-00 Screen
00-01 Project(**=[$0])
00-02 Project(T0¦¦**=[$0])
00-03 SelectionVectorRemover
00-04 Filter(condition=[IS NULL($1)])
00-05 Project(T0¦¦**=[$0], b=[$1])
00-06 Scan(table=[[dfs, /tmp/js]], groupscan=[JsonTableGroupScan 
[ScanSpec=JsonScanSpec [tableName=/tmp/js, condition=null], columns=[`**`, 
`b`], maxwidth=1]])
{code}





[GitHub] [drill] asfgit closed pull request #1690: DRILL-7086: Output schema for row set mechanism

2019-03-19 Thread GitBox
asfgit closed pull request #1690: DRILL-7086: Output schema for row set 
mechanism
URL: https://github.com/apache/drill/pull/1690
 
 
   




[GitHub] [drill] asfgit closed pull request #1700: DRILL-7111: Fix table function execution for directories

2019-03-19 Thread GitBox
asfgit closed pull request #1700: DRILL-7111: Fix table function execution for 
directories
URL: https://github.com/apache/drill/pull/1700
 
 
   




[GitHub] [drill] asfgit closed pull request #1696: DRILL-7095: Expose table schema (TupleMetadata) to physical operator (EasySubScan)

2019-03-19 Thread GitBox
asfgit closed pull request #1696: DRILL-7095: Expose table schema 
(TupleMetadata) to physical operator (EasySubScan)
URL: https://github.com/apache/drill/pull/1696
 
 
   




[GitHub] [drill] asfgit closed pull request #1698: DRILL-7106: Fix Intellij warning for FieldSchemaNegotiator

2019-03-19 Thread GitBox
asfgit closed pull request #1698: DRILL-7106: Fix Intellij warning for 
FieldSchemaNegotiator
URL: https://github.com/apache/drill/pull/1698
 
 
   




[GitHub] [drill] ihuzenko commented on issue #1649: DRILL-6970: Issue with LogRegex format plugin where drillbuf was overflowing

2019-03-19 Thread GitBox
ihuzenko commented on issue #1649: DRILL-6970: Issue with LogRegex format 
plugin where drillbuf was overflowing 
URL: https://github.com/apache/drill/pull/1649#issuecomment-474511694
 
 
   @jcmcote I'd like to suggest redesigning the **LogRecordReader** a little bit. 
The main idea is to extract all try-catches from the _**load(int rowIndex, String 
value)**_ methods of each concrete **ColumnDefn** implementation and make one 
unified ```try-catch``` that wraps the whole loop used for reading and 
handling each line of the log file (the while loop inside 
```next()```). Such an approach will reduce boilerplate code and is expected to 
work faster, because the try-catch will be entered once per row batch 
instead of once for each column of each line (row). Note that this is not a requirement for 
this pull request, but rather a suggestion for future improvement. 
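
A minimal sketch of the structure suggested above, not the actual reader code. The class `BatchLevelTryCatchSketch`, the `IntColumn` example and the `readBatch` method are hypothetical stand-ins; only the `load(int rowIndex, String value)` shape comes from the comment. The sketch uses a for loop over already-split lines where the real reader has a while loop inside ```next()```, and it makes no claim about measured performance.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for the reader classes; not the actual Drill code.
class BatchLevelTryCatchSketch {

  /** Hypothetical column writer: parses and stores one value, with no try-catch of its own. */
  interface ColumnDefn {
    void load(int rowIndex, String value);
  }

  /** Example column that parses integers and lets NumberFormatException propagate. */
  static class IntColumn implements ColumnDefn {
    final List<Integer> values = new ArrayList<>();

    @Override
    public void load(int rowIndex, String value) {
      values.add(Integer.parseInt(value));
    }
  }

  /**
   * Loads every line of a batch inside a single try-catch.
   * Returns the number of rows loaded; on failure, wraps the cause
   * together with the index of the row that could not be loaded.
   */
  static int readBatch(List<String[]> lines, ColumnDefn[] columns) {
    int rowIndex = 0;
    try {
      for (String[] fields : lines) {
        for (int col = 0; col < columns.length; col++) {
          columns[col].load(rowIndex, fields[col]);
        }
        rowIndex++;
      }
    } catch (RuntimeException e) {
      throw new IllegalStateException("Failed to load row " + rowIndex, e);
    }
    return rowIndex;
  }
}
```

The trade-off in this layout is that one bad value ends the whole batch, so the error message has to carry the row index that the per-column handlers would otherwise have reported themselves.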




[jira] [Created] (DRILL-7117) Support creation of histograms for numeric data types (except Decimal)

2019-03-19 Thread Aman Sinha (JIRA)
Aman Sinha created DRILL-7117:
-

 Summary: Support creation of histograms for numeric data types 
(except Decimal)
 Key: DRILL-7117
 URL: https://issues.apache.org/jira/browse/DRILL-7117
 Project: Apache Drill
  Issue Type: Sub-task
  Components: Query Planning & Optimization
Reporter: Aman Sinha
Assignee: Aman Sinha
 Fix For: 1.16.0


This JIRA is specific to creating histograms for numeric data types: INT, 
BIGINT, FLOAT4, FLOAT8  and their corresponding nullable/non-nullable versions. 





[GitHub] [drill] bitblender opened a new pull request #1702: DRILL-7107 Unable to connect to Drill 1.15 through ZK

2019-03-19 Thread GitBox
bitblender opened a new pull request #1702: DRILL-7107 Unable to connect to 
Drill 1.15 through ZK
URL: https://github.com/apache/drill/pull/1702
 
 
   With these changes, the ACLProvider is passed to ZKClusterCoordinator to 
avoid issues that arise when a client instantiates a ZKClusterCoordinator in a 
secure setup.




Re: Roadmap for Drill 2.0 and beyond?

2019-03-19 Thread Sorabh Hamirwasia
You can find all the presentations here:
https://drive.google.com/drive/folders/1VjKkqCKghrrbmgAyDY7h2bhoF65QB57r

Thanks,
Sorabh

On Tue, Mar 19, 2019 at 9:33 AM Kisung Kim  wrote:

> Can you share the thread or the address where presentations are?
> I couldn't find any related thread.
>
> On Tue, Mar 19, 2019 at 8:22 AM Aman Sinha  wrote:
>
> > Please see the discussions under the thread for Drill Developer Day 2018
> ..
> > this was held in November.  Presentations for various planned projects
> for
> > 2.0 and beyond were also posted on google drive.  For 2.0, the Resource
> > Manager and Drill Metastore are actively being worked on.
> >
> >
> > On Tue, Mar 19, 2019 at 12:19 AM Nai Yan.  wrote:
> >
> > > Greetings,
> > >Was wondering if I can get a concept of Drill future, such as
> 2.0
> > > release and beyond?
> > >
> > >Thanks.
> > >
> > >
> > >
> > > Nai Yan
> > >
> > >
> >
>


Re: Roadmap for Drill 2.0 and beyond?

2019-03-19 Thread Kisung Kim
Can you share the thread or the address where presentations are?
I couldn't find any related thread.

On Tue, Mar 19, 2019 at 8:22 AM Aman Sinha  wrote:

> Please see the discussions under the thread for Drill Developer Day 2018 ..
> this was held in November.  Presentations for various planned projects for
> 2.0 and beyond were also posted on google drive.  For 2.0, the Resource
> Manager and Drill Metastore are actively being worked on.
>
>
> On Tue, Mar 19, 2019 at 12:19 AM Nai Yan.  wrote:
>
> > Greetings,
> >Was wondering if I can get a concept of Drill future, such as 2.0
> > release and beyond?
> >
> >Thanks.
> >
> >
> >
> > Nai Yan
> >
> >
>


[GitHub] [drill] arina-ielchiieva commented on issue #1649: DRILL-6970: Issue with LogRegex format plugin where drillbuf was overflowing

2019-03-19 Thread GitBox
arina-ielchiieva commented on issue #1649: DRILL-6970: Issue with LogRegex 
format plugin where drillbuf was overflowing 
URL: https://github.com/apache/drill/pull/1649#issuecomment-474445779
 
 
   @cgivre sure, please review.




[GitHub] [drill] cgivre commented on issue #1649: DRILL-6970: Issue with LogRegex format plugin where drillbuf was overflowing

2019-03-19 Thread GitBox
cgivre commented on issue #1649: DRILL-6970: Issue with LogRegex format plugin 
where drillbuf was overflowing 
URL: https://github.com/apache/drill/pull/1649#issuecomment-474445450
 
 
   @vdiravka  I can do the review if you'd like.  There was a similar PR open 
and I asked for a unit test which was addressed.




[GitHub] [drill] arina-ielchiieva commented on issue #1649: DRILL-6970: Issue with LogRegex format plugin where drillbuf was overflowing

2019-03-19 Thread GitBox
arina-ielchiieva commented on issue #1649: DRILL-6970: Issue with LogRegex 
format plugin where drillbuf was overflowing 
URL: https://github.com/apache/drill/pull/1649#issuecomment-474445276
 
 
   @sohami my bad, I am not sure why I decided that PR has been reviewed.
   @cgivre / @vdiravka could you please review?




[GitHub] [drill] sohami commented on issue #1649: DRILL-6970: Issue with LogRegex format plugin where drillbuf was overflowing

2019-03-19 Thread GitBox
sohami commented on issue #1649: DRILL-6970: Issue with LogRegex format plugin 
where drillbuf was overflowing 
URL: https://github.com/apache/drill/pull/1649#issuecomment-474442811
 
 
   @jcmcote @arina-ielchiieva - I don't see any review for this PR. How was it 
approved?




[jira] [Created] (DRILL-7116) Adapt statistics to use Drill Metastore API

2019-03-19 Thread Volodymyr Vysotskyi (JIRA)
Volodymyr Vysotskyi created DRILL-7116:
--

 Summary: Adapt statistics to use Drill Metastore API
 Key: DRILL-7116
 URL: https://issues.apache.org/jira/browse/DRILL-7116
 Project: Apache Drill
  Issue Type: Sub-task
Affects Versions: 1.16.0
Reporter: Volodymyr Vysotskyi
Assignee: Volodymyr Vysotskyi
 Fix For: 1.17.0


The current implementation of statistics assumes that statistics are stored in 
and read from files.
 The aim of this Jira is to adapt statistics to use the Drill Metastore API so that in 
the future they may be stored in other metastore implementations.

Implementation details:
 - Move statistics info into {{TableMetadata}}
 - Provide a way for obtaining {{TableMetadata}} in the places where statistics 
may be used (partially implemented in the scope of DRILL-7089)
 - Investigate and implement (if possible) lazy materialization of 
{{DrillStatsTable}}.
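
The lazy materialization mentioned in the last item could look roughly like the sketch below. This is an assumption about one possible shape, not the planned implementation: `LazyStatsSketch`, `Lazy`, `StatsTable` and `TableMetadataSketch` are hypothetical names, and the loader stands in for whatever would read statistics from the metastore.

```java
import java.util.function.Supplier;

// Hypothetical sketch; none of these classes are Drill's own.
class LazyStatsSketch {

  /** Thread-safe memoizing holder: the loader runs at most once, on the first get(). */
  static final class Lazy<T> implements Supplier<T> {
    private Supplier<T> loader;
    private T value;
    private boolean loaded;

    Lazy(Supplier<T> loader) {
      this.loader = loader;
    }

    @Override
    public synchronized T get() {
      if (!loaded) {
        value = loader.get();
        loaded = true;
        loader = null; // release the loader and anything it captured
      }
      return value;
    }
  }

  /** Stand-in for a statistics table. */
  static final class StatsTable {
    final double rowCount;

    StatsTable(double rowCount) {
      this.rowCount = rowCount;
    }
  }

  /** Stand-in for table metadata that carries statistics without loading them up front. */
  static final class TableMetadataSketch {
    private final Lazy<StatsTable> stats;

    TableMetadataSketch(Supplier<StatsTable> statsLoader) {
      this.stats = new Lazy<>(statsLoader);
    }

    StatsTable getStats() {
      return stats.get();
    }
  }
}
```

With this shape, queries that never consult statistics never pay for reading them, and repeated calls to `getStats()` reuse the first result.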





Re: Roadmap for Drill 2.0 and beyond?

2019-03-19 Thread Aman Sinha
Please see the discussions under the thread for Drill Developer Day 2018 ..
this was held in November.  Presentations for various planned projects for
2.0 and beyond were also posted on google drive.  For 2.0, the Resource
Manager and Drill Metastore are actively being worked on.


On Tue, Mar 19, 2019 at 12:19 AM Nai Yan.  wrote:

> Greetings,
>Was wondering if I can get a concept of Drill future, such as 2.0
> release and beyond?
>
>Thanks.
>
>
>
> Nai Yan
>
>


[GitHub] [drill] vvysotskyi commented on issue #1646: DRILL-6852: Adapt current Parquet Metadata cache implementation to use Drill Metastore API

2019-03-19 Thread GitBox
vvysotskyi commented on issue #1646: DRILL-6852: Adapt current Parquet Metadata 
cache implementation to use Drill Metastore API
URL: https://github.com/apache/drill/pull/1646#issuecomment-474411987
 
 
   Rebased onto the master and squashed commits.




[GitHub] [drill] cgivre commented on a change in pull request #1680: DRILL-7077: Add Function to Facilitate Time Series Analysis

2019-03-19 Thread GitBox
cgivre commented on a change in pull request #1680: DRILL-7077: Add Function to 
Facilitate Time Series Analysis
URL: https://github.com/apache/drill/pull/1680#discussion_r266902871
 
 

 ##
 File path: 
contrib/udfs/src/test/java/org/apache/drill/exec/udfs/TestNearestDateFunctions.java
 ##
 @@ -0,0 +1,158 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.drill.exec.udfs;
+
+import org.apache.drill.categories.SqlFunctionTest;
+import org.apache.drill.categories.UnlikelyTest;
+import org.apache.drill.common.exceptions.DrillRuntimeException;
+import org.apache.drill.test.ClusterFixture;
+import org.apache.drill.test.ClusterFixtureBuilder;
+import org.apache.drill.test.ClusterTest;
+import org.junit.BeforeClass;
+import org.junit.Test;
+import org.junit.experimental.categories.Category;
+
+import java.time.LocalDateTime;
+
+import static org.junit.Assert.assertTrue;
+
+@Category({UnlikelyTest.class, SqlFunctionTest.class})
+public class TestNearestDateFunctions extends ClusterTest {
+
+  @BeforeClass
+  public static void setup() throws Exception {
+ClusterFixtureBuilder builder = ClusterFixture.builder(dirTestWatcher);
+startCluster(builder);
+  }
+
+  @Test
+  public void testNearestDate() throws Exception {
+String query = "SELECT nearestDate( TO_TIMESTAMP('2019-02-01 07:22:00', 'yyyy-MM-dd HH:mm:ss'), 'YEAR') AS nearest_year, " +
+"nearestDate( TO_TIMESTAMP('2019-02-01 07:22:00', 'yyyy-MM-dd HH:mm:ss'), 'QUARTER') AS nearest_quarter, " +
+"nearestDate( TO_TIMESTAMP('2019-02-15 07:22:00', 'yyyy-MM-dd HH:mm:ss'), 'MONTH') AS nearest_month, " +
+"nearestDate( TO_TIMESTAMP('2019-02-15 07:22:00', 'yyyy-MM-dd HH:mm:ss'), 'DAY') AS nearest_day, " +
+"nearestDate( TO_TIMESTAMP('2019-02-15 07:22:00', 'yyyy-MM-dd HH:mm:ss'), 'WEEK_SUNDAY') AS nearest_week_sunday, " +
+"nearestDate( TO_TIMESTAMP('2019-02-15 07:22:00', 'yyyy-MM-dd HH:mm:ss'), 'WEEK_MONDAY') AS nearest_week_monday, " +
+"nearestDate( TO_TIMESTAMP('2019-02-15 07:22:00', 'yyyy-MM-dd HH:mm:ss'), 'HOUR') AS nearest_hour, " +
+"nearestDate( TO_TIMESTAMP('2019-02-15 07:42:00', 'yyyy-MM-dd HH:mm:ss'), 'HALF_HOUR') AS nearest_half_hour, " +
+"nearestDate( TO_TIMESTAMP('2019-02-15 07:48:00', 'yyyy-MM-dd HH:mm:ss'), 'QUARTER_HOUR') AS nearest_quarter_hour, " +
+"nearestDate( TO_TIMESTAMP('2019-02-15 07:22:00', 'yyyy-MM-dd HH:mm:ss'), 'MINUTE') AS nearest_minute, " +
+"nearestDate( TO_TIMESTAMP('2019-02-15 07:22:22', 'yyyy-MM-dd HH:mm:ss'), 'HALF_MINUTE') AS nearest_30second, " +
+"nearestDate( TO_TIMESTAMP('2019-02-15 07:22:22', 'yyyy-MM-dd HH:mm:ss'), 'QUARTER_MINUTE') AS nearest_15second, " +
+"nearestDate( TO_TIMESTAMP('2019-02-15 07:22:31', 'yyyy-MM-dd HH:mm:ss'), 'SECOND') AS nearest_second " +
+"FROM (VALUES(1))";
+testBuilder()
+.sqlQuery(query)
+.unOrdered()
+.baselineColumns("nearest_year",
+"nearest_quarter",
+"nearest_month",
+"nearest_day",
+"nearest_week_sunday",
+"nearest_week_monday",
+"nearest_hour",
+"nearest_half_hour",
+"nearest_quarter_hour",
+"nearest_minute",
+"nearest_30second",
+"nearest_15second",
+"nearest_second")
+.baselineValues(LocalDateTime.of(2019, 1, 1, 0, 0, 0),  //Year
+LocalDateTime.of(2019, 1, 1, 0, 0, 0),  //Quarter
+LocalDateTime.of(2019, 2, 1, 0, 0, 0), //Month
+LocalDateTime.of(2019, 2, 15, 0, 0, 0), //Day
+LocalDateTime.of(2019, 2, 10, 0, 0, 0), //Week Sunday
+LocalDateTime.of(2019, 2, 11, 0, 0, 0), //Week Monday
+LocalDateTime.of(2019, 2, 15, 7, 0, 0), //Hour
+LocalDateTime.of(2019, 2, 15, 7, 30, 0), //Half Hour
+LocalDateTime.of(2019, 2, 15, 7, 45, 0), //Quarter Hour
+LocalDateTime.o

[GitHub] [drill] cgivre commented on a change in pull request #1680: DRILL-7077: Add Function to Facilitate Time Series Analysis

2019-03-19 Thread GitBox
cgivre commented on a change in pull request #1680: DRILL-7077: Add Function to 
Facilitate Time Series Analysis
URL: https://github.com/apache/drill/pull/1680#discussion_r266902953
 
 

 ##
 File path: 
contrib/udfs/src/main/java/org/apache/drill/exec/udfs/NearestDateUtils.java
 ##
 @@ -0,0 +1,145 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.drill.exec.udfs;
+
+import org.apache.drill.common.exceptions.DrillRuntimeException;
+
+public class NearestDateUtils {
+  /**
+   * Specifies the time grouping to be used with the nearest date function
+   */
+  private enum TimeInterval {
+YEAR,
+QUARTER,
+MONTH,
+WEEK_SUNDAY,
+WEEK_MONDAY,
+DAY,
+HOUR,
+HALF_HOUR,
+QUARTER_HOUR,
+MINUTE,
+HALF_MINUTE,
+QUARTER_MINUTE,
+SECOND
+  }
+
+  private static final org.slf4j.Logger logger = org.slf4j.LoggerFactory.getLogger(NearestDateUtils.class);
+
+  /**
+   * This function takes a Java LocalDateTime object and an interval string, and returns
+   * the nearest date closest to that time.  For instance, if you specified the date as 2018-05-04 and YEAR, the function
+   * will return 2018-01-01
+   *
+   * @param d the original datetime before adjustments
+   * @param interval The interval string to deduct from the supplied date
+   * @return the modified LocalDateTime
+   */
+  public final static java.time.LocalDateTime getDate(java.time.LocalDateTime d, String interval) {
+java.time.LocalDateTime newDate = d;
+int year = d.getYear();
+int month = d.getMonth().getValue();
+int day = d.getDayOfMonth();
+int hour = d.getHour();
+int minute = d.getMinute();
+int second = d.getSecond();
+TimeInterval adjustmentAmount;
+try {
+  adjustmentAmount = TimeInterval.valueOf(interval.toUpperCase());
+} catch (IllegalArgumentException e) {
+  throw new DrillRuntimeException(interval + " is not a valid time statement.  Expecting YEAR, QUARTER, MONTH, WEEK_SUNDAY, " +
 
 Review comment:
   Fixed
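
   For readers following the diff, which is cut off above: the behaviour described in the Javadoc (2018-05-04 with YEAR gives 2018-01-01) is a truncation to the start of the interval rather than a rounding. Below is a minimal sketch of that idea using only `java.time`. It is not the code from this pull request; `NearestDateSketch` and `truncate` are hypothetical names, the enum repeats only part of the interval list from the diff, and the sub-minute intervals are left out.

```java
import java.time.DayOfWeek;
import java.time.LocalDateTime;
import java.time.temporal.ChronoUnit;
import java.time.temporal.TemporalAdjusters;
import java.util.Locale;

// Hypothetical sketch; not the implementation under review.
class NearestDateSketch {

  /** Subset of the intervals named in the diff. */
  enum Interval {
    YEAR, QUARTER, MONTH, WEEK_SUNDAY, WEEK_MONDAY, DAY, HOUR, HALF_HOUR, QUARTER_HOUR, MINUTE
  }

  /** Truncates d to the start of the given interval; unknown names raise IllegalArgumentException. */
  static LocalDateTime truncate(LocalDateTime d, String interval) {
    Interval unit = Interval.valueOf(interval.toUpperCase(Locale.ROOT));
    LocalDateTime midnight = d.truncatedTo(ChronoUnit.DAYS);
    LocalDateTime hourStart = d.truncatedTo(ChronoUnit.HOURS);
    switch (unit) {
      case YEAR:
        return midnight.withDayOfYear(1);
      case QUARTER:
        // First month of the quarter: 1, 4, 7 or 10.
        return midnight.withDayOfMonth(1).withMonth(((d.getMonthValue() - 1) / 3) * 3 + 1);
      case MONTH:
        return midnight.withDayOfMonth(1);
      case WEEK_SUNDAY:
        return midnight.with(TemporalAdjusters.previousOrSame(DayOfWeek.SUNDAY));
      case WEEK_MONDAY:
        return midnight.with(TemporalAdjusters.previousOrSame(DayOfWeek.MONDAY));
      case DAY:
        return midnight;
      case HOUR:
        return hourStart;
      case HALF_HOUR:
        return hourStart.withMinute((d.getMinute() / 30) * 30);
      case QUARTER_HOUR:
        return hourStart.withMinute((d.getMinute() / 15) * 15);
      case MINUTE:
        return d.truncatedTo(ChronoUnit.MINUTES);
      default:
        throw new IllegalArgumentException("Unsupported interval: " + interval);
    }
  }
}
```

   Setting the day of month to 1 before changing the month avoids any end-of-month adjustment when the source date falls on the 29th to 31st.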




[GitHub] [drill] cgivre commented on a change in pull request #1680: DRILL-7077: Add Function to Facilitate Time Series Analysis

2019-03-19 Thread GitBox
cgivre commented on a change in pull request #1680: DRILL-7077: Add Function to 
Facilitate Time Series Analysis
URL: https://github.com/apache/drill/pull/1680#discussion_r266874028
 
 

 ##
 File path: 
contrib/udfs/src/test/java/org/apache/drill/exec/udfs/TestNearestDateFunctions.java
 ##
 @@ -0,0 +1,158 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.drill.exec.udfs;
+
+import org.apache.drill.categories.SqlFunctionTest;
+import org.apache.drill.categories.UnlikelyTest;
+import org.apache.drill.common.exceptions.DrillRuntimeException;
+import org.apache.drill.test.ClusterFixture;
+import org.apache.drill.test.ClusterFixtureBuilder;
+import org.apache.drill.test.ClusterTest;
+import org.junit.BeforeClass;
+import org.junit.Test;
+import org.junit.experimental.categories.Category;
+
+import java.time.LocalDateTime;
+
+import static org.junit.Assert.assertTrue;
+
+@Category({UnlikelyTest.class, SqlFunctionTest.class})
+public class TestNearestDateFunctions extends ClusterTest {
+
+  @BeforeClass
+  public static void setup() throws Exception {
+ClusterFixtureBuilder builder = ClusterFixture.builder(dirTestWatcher);
+startCluster(builder);
+  }
+
+  @Test
+  public void testNearestDate() throws Exception {
+String query = "SELECT nearestDate( TO_TIMESTAMP('2019-02-01 07:22:00', 'yyyy-MM-dd HH:mm:ss'), 'YEAR') AS nearest_year, " +
+"nearestDate( TO_TIMESTAMP('2019-02-01 07:22:00', 'yyyy-MM-dd HH:mm:ss'), 'QUARTER') AS nearest_quarter, " +
+"nearestDate( TO_TIMESTAMP('2019-02-15 07:22:00', 'yyyy-MM-dd HH:mm:ss'), 'MONTH') AS nearest_month, " +
+"nearestDate( TO_TIMESTAMP('2019-02-15 07:22:00', 'yyyy-MM-dd HH:mm:ss'), 'DAY') AS nearest_day, " +
+"nearestDate( TO_TIMESTAMP('2019-02-15 07:22:00', 'yyyy-MM-dd HH:mm:ss'), 'WEEK_SUNDAY') AS nearest_week_sunday, " +
+"nearestDate( TO_TIMESTAMP('2019-02-15 07:22:00', 'yyyy-MM-dd HH:mm:ss'), 'WEEK_MONDAY') AS nearest_week_monday, " +
+"nearestDate( TO_TIMESTAMP('2019-02-15 07:22:00', 'yyyy-MM-dd HH:mm:ss'), 'HOUR') AS nearest_hour, " +
+"nearestDate( TO_TIMESTAMP('2019-02-15 07:42:00', 'yyyy-MM-dd HH:mm:ss'), 'HALF_HOUR') AS nearest_half_hour, " +
+"nearestDate( TO_TIMESTAMP('2019-02-15 07:48:00', 'yyyy-MM-dd HH:mm:ss'), 'QUARTER_HOUR') AS nearest_quarter_hour, " +
+"nearestDate( TO_TIMESTAMP('2019-02-15 07:22:00', 'yyyy-MM-dd HH:mm:ss'), 'MINUTE') AS nearest_minute, " +
+"nearestDate( TO_TIMESTAMP('2019-02-15 07:22:22', 'yyyy-MM-dd HH:mm:ss'), 'HALF_MINUTE') AS nearest_30second, " +
+"nearestDate( TO_TIMESTAMP('2019-02-15 07:22:22', 'yyyy-MM-dd HH:mm:ss'), 'QUARTER_MINUTE') AS nearest_15second, " +
+"nearestDate( TO_TIMESTAMP('2019-02-15 07:22:31', 'yyyy-MM-dd HH:mm:ss'), 'SECOND') AS nearest_second " +
+"FROM (VALUES(1))";
+testBuilder()
+.sqlQuery(query)
+.unOrdered()
+.baselineColumns("nearest_year",
+"nearest_quarter",
+"nearest_month",
+"nearest_day",
+"nearest_week_sunday",
+"nearest_week_monday",
+"nearest_hour",
+"nearest_half_hour",
+"nearest_quarter_hour",
+"nearest_minute",
+"nearest_30second",
+"nearest_15second",
+"nearest_second")
+.baselineValues(LocalDateTime.of(2019, 1, 1, 0, 0, 0),  //Year
+LocalDateTime.of(2019, 1, 1, 0, 0, 0),  //Quarter
+LocalDateTime.of(2019, 2, 1, 0, 0, 0), //Month
+LocalDateTime.of(2019, 2, 15, 0, 0, 0), //Day
+LocalDateTime.of(2019, 2, 10, 0, 0, 0), //Week Sunday
+LocalDateTime.of(2019, 2, 11, 0, 0, 0), //Week Monday
+LocalDateTime.of(2019, 2, 15, 7, 0, 0), //Hour
+LocalDateTime.of(2019, 2, 15, 7, 30, 0), //Half Hour
+LocalDateTime.of(2019, 2, 15, 7, 45, 0), //Quarter Hour
+LocalDateTime.o

Re: [DISCUSS] 1.16.0 release

2019-03-19 Thread Bohdan Kazydub
Hello Sorabh,

I'm currently working on a fix for a small bug that occurs when querying views
from S3 storage with plain authentication [DRILL-7079].
I'd like to include it in Drill 1.16. I'll need 1-2 days to prepare the
PR as well.

Thanks,
Bohdan


On Tue, Mar 19, 2019 at 1:55 PM Igor Guzenko 
wrote:

> Hello Sorabh,
>
> I'm currently working on small improvement for Hive schema show tables
> [DRILL-7115].
> I'd like to include it into Drill 1.16. I'll need 1-2 days for making the
> PR.
>
> Thanks,
> Igor
>
> On Mon, Mar 18, 2019 at 11:25 PM Kunal Khatua  wrote:
> >
> > Khurram
> >
> > Currently, for DRILL-7061, the feature has been disabled. If I'm able to
> get a commit within the week for DRILL-6960, this will be addressed in that.
> >
> > DRILL-7110 is something that is required as well, to allow DRILL-6960 to
> be accepted.
> >
> > ~ Kunal
> > On 3/18/2019 1:20:16 PM, Khurram Faraaz  wrote:
> > Can we also have the fix for DRILL-7061
> > in Drill 1.16 ?
> >
> > Thanks,
> > Khurram
> >
> > On Mon, Mar 18, 2019 at 11:40 AM Karthikeyan Manivannan
> > kmanivan...@mapr.com> wrote:
> >
> > > Please include DRILL-7107
> > > https://issues.apache.org/jira/browse/DRILL-7107>
> > > I will open the PR today.
> > > It is a small change and fixes a basic usability issue.
> > >
> > > Thanks.
> > >
> > > Karthik
> > >
> > > On Thu, Mar 14, 2019 at 4:50 PM Charles Givre wrote:
> > >
> > > > One more… DRILL-7014 is basically done and I’d like to see that get
> into
> > > > Drill 1.16.
> > > >
> > > > > On Mar 14, 2019, at 19:44, Charles Givre wrote:
> > > > >
> > > > > Who should I add as a reviewer for 7032?
> > > > >
> > > > >> On Mar 14, 2019, at 19:42, Sorabh Hamirwasia
> > > >
> > > > wrote:
> > > > >>
> > > > >> Hi Charles,
> > > > >> Can you please add reviewer for DRILL-7032 ?
> > > > >> For DRILL-6970 the PR is closed by the author, I have pinged in
> JIRA
> > > > asking
> > > > >> to re-open so that it can be reviewed.
> > > > >>
> > > > >> Thanks,
> > > > >> Sorabh
> > > > >>
> > > > >> On Thu, Mar 14, 2019 at 4:29 PM Charles Givre
> > > wrote:
> > > > >>
> > > > >>> Hi Sorabh,
> > > > >>> I have 3 PRs that are almost done, awaiting final review.
> > > Drill-7077,
> > > > >>> DRILL-7032, DRILL-7021. I owe @ariina some fixes for 7077, but
> I’m
> > > > waiting
> > > > >>> for review of the others. Also, there is that DRILL-6970 about
> the
> > > > buffer
> > > > >>> overflows in the logRegex reader that isn’t mine but I’d like to
> see
> > > > >>> included.
> > > > >>> Thanks,
> > > > >>> —C
> > > > >>>
> > > >  On Mar 14, 2019, at 13:13, Sorabh Hamirwasia
> > > sohami.apa...@gmail.com
> > > > >
> > > > >>> wrote:
> > > > 
> > > >  Hi Arina,
> > > >  Thanks for your response. With ETA of two weeks we are looking
> at
> > > end
> > > > of
> > > >  the month or beginning next month. I will wait until Monday for
> > > > others to
> > > >  respond and then will finalize on a cut-off date.
> > > > 
> > > >  Thanks,
> > > >  Sorabh
> > > > 
> > > >  On Wed, Mar 13, 2019 at 4:28 AM Arina Ielchiieva
> > > > >>> arina.yelchiy...@gmail.com>
> > > >  wrote:
> > > > 
> > > > > Hi Sorabh,
> > > > >
> > > > > thanks for volunteering to do the release.
> > > > >
> > > > > Paul and I are working on file schema provisioning for text
> file
> > > > storage
> > > > > which is aimed for 1.16. To wrap up the work we need to
> deliver two
> > > > >>> Jiras:
> > > > > DRILL-7095 and DRILL-7011. ETA: 2 weeks.
> > > > > Please plan the release date accordingly.
> > > > >
> > > > > Kind regards,
> > > > > Arina
> > > > >
> > > > > On Tue, Mar 12, 2019 at 9:16 PM Sorabh Hamirwasia
> > > > >>> sohami.apa...@gmail.com
> > > > >>
> > > > > wrote:
> > > > >
> > > > >> Hi All,
> > > > >> It's around two and a half month since we did 1.15.0 release
> for
> > > > Apache
> > > > >> Drill. Based on our 3 months release cadence I think it's
> time to
> > > > >>> discuss
> > > > >> our next release. I will volunteer to manage the next release.
> > > > >>
> > > > >> **Below is the current JIRA stats:*
> > > > >> *[1] Open#: 15*
> > > > >>
> > > > >> - Would be great if everyone can look into their assigned
> tickets
> > > > and
> > > > >> update the fix version as needed. Please keep the ones which
> you
> > > > find
> > > > >> must
> > > > >> have and can be completed sooner.
> > > > >>
> > > > >> *[2] InProgress#: 11*
> > > > >>
> > > > >> - If you think we *must* include any issues from this list
> then
> > > > >>> please
> > > > >> reply on this thread. Also would be great to know how much
> time
> > > you
> > > > >> think
> > > > >> is needed for these issues. Based on that we can take a call
> which
> > > > >>> one
> > > > >> to
> > > > >> target for this release.
> > > > >>
> > > > >> *[3] Reviewable#: 14*
> > > > >>
> > > > >> - All the review

Re: [DISCUSS] 1.16.0 release

2019-03-19 Thread Vitalii Diravka
Hi Sorabh,

I want to update the Hadoop libraries to version 3.1 (DRILL-6540). Currently I
am verifying it with Jetty 9.4 (DRILL-7051). I will open a PR this week.
Also I am planning to include DRILL-7098 in the 1.16.0 release (the PR will be
opened this or next week).

Kind regards
Vitalii


On Tue, Mar 19, 2019 at 1:55 PM Igor Guzenko 
wrote:

> Hello Sorabh,
>
> I'm currently working on small improvement for Hive schema show tables
> [DRILL-7115].
> I'd like to include it into Drill 1.16. I'll need 1-2 days for making the
> PR.
>
> Thanks,
> Igor
>
> On Mon, Mar 18, 2019 at 11:25 PM Kunal Khatua  wrote:
> >
> > Khurram
> >
> > Currently, for DRILL-7061, the feature has been disabled. If I'm able to
> get a commit within the week for DRILL-6960, this will be addressed in that.
> >
> > DRILL-7110 is something that is required as well, to allow DRILL-6960 to
> be accepted.
> >
> > ~ Kunal
> > On 3/18/2019 1:20:16 PM, Khurram Faraaz  wrote:
> > Can we also have the fix for DRILL-7061
> > in Drill 1.16 ?
> >
> > Thanks,
> > Khurram
> >
> > On Mon, Mar 18, 2019 at 11:40 AM Karthikeyan Manivannan
> > kmanivan...@mapr.com> wrote:
> >
> > > Please include DRILL-7107
> > > https://issues.apache.org/jira/browse/DRILL-7107>
> > > I will open the PR today.
> > > It is a small change and fixes a basic usability issue.
> > >
> > > Thanks.
> > >
> > > Karthik
> > >
> > > On Thu, Mar 14, 2019 at 4:50 PM Charles Givre wrote:
> > >
> > > > One more… DRILL-7014 is basically done and I’d like to see that get
> into
> > > > Drill 1.16.
> > > >
> > > > > On Mar 14, 2019, at 19:44, Charles Givre wrote:
> > > > >
> > > > > Who should I add as a reviewer for 7032?
> > > > >
> > > > >> On Mar 14, 2019, at 19:42, Sorabh Hamirwasia
> > > >
> > > > wrote:
> > > > >>
> > > > >> Hi Charles,
> > > > >> Can you please add reviewer for DRILL-7032 ?
> > > > >> For DRILL-6970 the PR is closed by the author, I have pinged in
> JIRA
> > > > asking
> > > > >> to re-open so that it can be reviewed.
> > > > >>
> > > > >> Thanks,
> > > > >> Sorabh
> > > > >>
> > > > >> On Thu, Mar 14, 2019 at 4:29 PM Charles Givre
> > > wrote:
> > > > >>
> > > > >>> Hi Sorabh,
> > > > >>> I have 3 PRs that are almost done, awaiting final review.
> > > Drill-7077,
> > > > >>> DRILL-7032, DRILL-7021. I owe @ariina some fixes for 7077, but
> I’m
> > > > waiting
> > > > >>> for review of the others. Also, there is that DRILL-6970 about
> the
> > > > buffer
> > > > >>> overflows in the logRegex reader that isn’t mine but I’d like to
> see
> > > > >>> included.
> > > > >>> Thanks,
> > > > >>> —C
> > > > >>>
> > > >  On Mar 14, 2019, at 13:13, Sorabh Hamirwasia
> > > sohami.apa...@gmail.com
> > > > >
> > > > >>> wrote:
> > > > 
> > > >  Hi Arina,
> > > >  Thanks for your response. With ETA of two weeks we are looking
> at
> > > end
> > > > of
> > > >  the month or beginning next month. I will wait until Monday for
> > > > others to
> > > >  respond and then will finalize on a cut-off date.
> > > > 
> > > >  Thanks,
> > > >  Sorabh
> > > > 
> > > >  On Wed, Mar 13, 2019 at 4:28 AM Arina Ielchiieva
> > > > >>> arina.yelchiy...@gmail.com>
> > > >  wrote:
> > > > 
> > > > > Hi Sorabh,
> > > > >
> > > > > thanks for volunteering to do the release.
> > > > >
> > > > > Paul and I are working on file schema provisioning for text
> file
> > > > storage
> > > > > which is aimed for 1.16. To wrap up the work we need to
> deliver two
> > > > >>> Jiras:
> > > > > DRILL-7095 and DRILL-7011. ETA: 2 weeks.
> > > > > Please plan the release date accordingly.
> > > > >
> > > > > Kind regards,
> > > > > Arina
> > > > >
> > > > > On Tue, Mar 12, 2019 at 9:16 PM Sorabh Hamirwasia
> > > > >>> sohami.apa...@gmail.com
> > > > >>
> > > > > wrote:
> > > > >
> > > > >> Hi All,
> > > > >> It's around two and a half month since we did 1.15.0 release
> for
> > > > Apache
> > > > >> Drill. Based on our 3 months release cadence I think it's
> time to
> > > > >>> discuss
> > > > >> our next release. I will volunteer to manage the next release.
> > > > >>
> > > > >> **Below is the current JIRA stats:*
> > > > >> *[1] Open#: 15*
> > > > >>
> > > > >> - Would be great if everyone can look into their assigned
> tickets
> > > > and
> > > > >> update the fix version as needed. Please keep the ones which
> you
> > > > find
> > > > >> must
> > > > >> have and can be completed sooner.
> > > > >>
> > > > >> *[2] InProgress#: 11*
> > > > >>
> > > > >> - If you think we *must* include any issues from this list
> then
> > > > >>> please
> > > > >> reply on this thread. Also would be great to know how much
> time
> > > you
> > > > >> think
> > > > >> is needed for these issues. Based on that we can take a call
> which
> > > > >>> one
> > > > >> to
> > > > >> target for this release.
> > > > >>
> > > > >> *[3] Reviewable#

[jira] [Created] (DRILL-7115) Improve Hive schema show tables performance

2019-03-19 Thread Igor Guzenko (JIRA)
Igor Guzenko created DRILL-7115:
---

 Summary: Improve Hive schema show tables performance
 Key: DRILL-7115
 URL: https://issues.apache.org/jira/browse/DRILL-7115
 Project: Apache Drill
  Issue Type: Improvement
  Components: Storage - Hive, Storage - Information Schema
Reporter: Igor Guzenko
Assignee: Igor Guzenko


In Sqlline (Drill), "show tables" on a Hive schema takes roughly 15 to 20 
minutes. The schema has about 8000 tables.
In contrast, the same command in beeline (Hive) returns the result in a split 
second (~0.2 secs).

I reproduced this in my test cluster by creating 6000 empty tables in Hive 
and then running "show tables" in Drill. It took more than 2 minutes (~140 secs).
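
To make the reproduction above easier to repeat, here is a minimal sketch of how the empty Hive tables could be generated. It is only an illustration under assumptions: the class name, the table prefix drill_7115_t and the single INT column are hypothetical and do not come from the issue; only the count of 6000 is taken from the description.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Hypothetical helper, not part of Drill or Hive: builds the DDL needed to
 * create many empty Hive tables so that "show tables" can be timed afterwards.
 */
public class HiveTableDdlGenerator {

    /** Returns one CREATE TABLE statement per table, named prefix0 .. prefix(count-1). */
    public static List<String> createTableStatements(String prefix, int count) {
        List<String> statements = new ArrayList<>(count);
        for (int i = 0; i < count; i++) {
            // A single INT column (an assumption) keeps every table empty and cheap to create.
            statements.add("CREATE TABLE IF NOT EXISTS " + prefix + i + " (id INT);");
        }
        return statements;
    }

    public static void main(String[] args) {
        // 6000 matches the table count used in the test described above.
        for (String ddl : createTableStatements("drill_7115_t", 6000)) {
            System.out.println(ddl);
        }
    }
}
```

The printed statements can be saved to a file and run against Hive (for example with beeline's -f option), after which "show tables" can be timed in both beeline and Sqlline.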



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: [DISCUSS] 1.16.0 release

2019-03-19 Thread Igor Guzenko
Hello Sorabh,

I'm currently working on a small improvement for Hive schema show tables
[DRILL-7115].
I'd like to include it in Drill 1.16. I'll need 1-2 days to prepare the PR.

Thanks,
Igor

On Mon, Mar 18, 2019 at 11:25 PM Kunal Khatua  wrote:
>
> Khurram
>
> Currently, for DRILL-7061, the feature has been disabled. If I'm able to get 
> a commit within the week for DRILL-6960, this will be addressed in that.
>
> DRILL-7110 is something that is required as well, to allow DRILL-6960 to be 
> accepted.
>
> ~ Kunal
> On 3/18/2019 1:20:16 PM, Khurram Faraaz  wrote:
> Can we also have the fix for DRILL-7061
> in Drill 1.16 ?
>
> Thanks,
> Khurram
>
> On Mon, Mar 18, 2019 at 11:40 AM Karthikeyan Manivannan
> kmanivan...@mapr.com> wrote:
>
> > Please include DRILL-7107
> > https://issues.apache.org/jira/browse/DRILL-7107>
> > I will open the PR today.
> > It is a small change and fixes a basic usability issue.
> >
> > Thanks.
> >
> > Karthik
> >
> > On Thu, Mar 14, 2019 at 4:50 PM Charles Givre wrote:
> >
> > > One more… DRILL-7014 is basically done and I’d like to see that get into
> > > Drill 1.16.
> > >
> > > > On Mar 14, 2019, at 19:44, Charles Givre wrote:
> > > >
> > > > Who should I add as a reviewer for 7032?
> > > >
> > > >> On Mar 14, 2019, at 19:42, Sorabh Hamirwasia
> > >
> > > wrote:
> > > >>
> > > >> Hi Charles,
> > > >> Can you please add reviewer for DRILL-7032 ?
> > > >> For DRILL-6970 the PR is closed by the author, I have pinged in JIRA
> > > asking
> > > >> to re-open so that it can be reviewed.
> > > >>
> > > >> Thanks,
> > > >> Sorabh
> > > >>
> > > >> On Thu, Mar 14, 2019 at 4:29 PM Charles Givre
> > wrote:
> > > >>
> > > >>> Hi Sorabh,
> > > >>> I have 3 PRs that are almost done, awaiting final review.
> > DRILL-7077,
> > > >>> DRILL-7032, DRILL-7021. I owe @ariina some fixes for 7077, but I’m
> > > waiting
> > > >>> for review of the others. Also, there is that DRILL-6970 about the
> > > buffer
> > > >>> overflows in the logRegex reader that isn’t mine but I’d like to see
> > > >>> included.
> > > >>> Thanks,
> > > >>> —C
> > > >>>
> > >  On Mar 14, 2019, at 13:13, Sorabh Hamirwasia
> > sohami.apa...@gmail.com
> > > >
> > > >>> wrote:
> > > 
> > >  Hi Arina,
> > >  Thanks for your response. With ETA of two weeks we are looking at
> > end
> > > of
> > >  the month or beginning next month. I will wait until Monday for
> > > others to
> > >  respond and then will finalize on a cut-off date.
> > > 
> > >  Thanks,
> > >  Sorabh
> > > 
> > >  On Wed, Mar 13, 2019 at 4:28 AM Arina Ielchiieva
> > > >>> arina.yelchiy...@gmail.com>
> > >  wrote:
> > > 
> > > > Hi Sorabh,
> > > >
> > > > thanks for volunteering to do the release.
> > > >
> > > > Paul and I are working on file schema provisioning for text file
> > > storage
> > > > which is aimed for 1.16. To wrap up the work we need to deliver two
> > > >>> Jiras:
> > > > DRILL-7095 and DRILL-7011. ETA: 2 weeks.
> > > > Please plan the release date accordingly.
> > > >
> > > > Kind regards,
> > > > Arina
> > > >
> > > > On Tue, Mar 12, 2019 at 9:16 PM Sorabh Hamirwasia
> > > >>> sohami.apa...@gmail.com
> > > >>
> > > > wrote:
> > > >
> > > >> Hi All,
> > > > >> It's been around two and a half months since we did the 1.15.0 release for
> > > > Apache
> > > > >> Drill. Based on our 3-month release cadence I think it's time to
> > > >>> discuss
> > > >> our next release. I will volunteer to manage the next release.
> > > >>
> > > >> **Below is the current JIRA stats:*
> > > >> *[1] Open#: 15*
> > > >>
> > > >> - Would be great if everyone can look into their assigned tickets
> > > and
> > > >> update the fix version as needed. Please keep the ones which you
> > > find
> > > >> must
> > > >> have and can be completed sooner.
> > > >>
> > > >> *[2] InProgress#: 11*
> > > >>
> > > >> - If you think we *must* include any issues from this list then
> > > >>> please
> > > >> reply on this thread. Also would be great to know how much time
> > you
> > > >> think
> > > >> is needed for these issues. Based on that we can take a call which
> > > >>> one
> > > >> to
> > > >> target for this release.
> > > >>
> > > >> *[3] Reviewable#: 14*
> > > >>
> > > >> - All the reviewers and authors should try to close these as soon
> > as
> > > >> possible. If you think that any of the PR needs rework or should
> > be
> > > >> postponed to next release then please update the status and fix
> > > > version
> > > >> for
> > > >> those JIRA's as well.
> > > >>
> > > >> After above JIRA's are reviewed by everyone and based on their
> > > inputs
> > > >>> we
> > > >> can define a cut off date.
> > > >>
> > > >> *Approximate numbers as it can change based on JIRA updates.
> > > >>
> > > >> Thanks,
> > > >> Sorabh
> > > >>
> >

[GitHub] [drill] arina-ielchiieva commented on issue #1649: DRILL-6970: Issue with LogRegex format plugin where drillbuf was overflowing

2019-03-19 Thread GitBox
arina-ielchiieva commented on issue #1649: DRILL-6970: Issue with LogRegex 
format plugin where drillbuf was overflowing 
URL: https://github.com/apache/drill/pull/1649#issuecomment-474314187
 
 
   @jcmcote I reopened the PR myself; you still need to resolve the conflicts and 
squash the commits.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [drill] jcmcote opened a new pull request #1649: DRILL-6970: Issue with LogRegex format plugin where drillbuf was overflowing

2019-03-19 Thread GitBox
jcmcote opened a new pull request #1649: DRILL-6970: Issue with LogRegex format 
plugin where drillbuf was overflowing 
URL: https://github.com/apache/drill/pull/1649
 
 
   


[GitHub] [drill] arina-ielchiieva commented on issue #1649: DRILL-6970 Issue with LogRegex format plugin where drillbuf was overflowing

2019-03-19 Thread GitBox
arina-ielchiieva commented on issue #1649: DRILL-6970 Issue with LogRegex 
format plugin where drillbuf was overflowing 
URL: https://github.com/apache/drill/pull/1649#issuecomment-474313363
 
 
   @jcmcote you closed the PR before the merge, so it still has not 
been merged. Can you please reopen it and squash the commits?


[GitHub] [drill] jcmcote commented on issue #1649: DRILL-6970 Issue with LogRegex format plugin where drillbuf was overflowing

2019-03-19 Thread GitBox
jcmcote commented on issue #1649: DRILL-6970 Issue with LogRegex format plugin 
where drillbuf was overflowing 
URL: https://github.com/apache/drill/pull/1649#issuecomment-474289593
 
 
   it looks like it was approved but not yet merged?
   
   On Mon, Mar 18, 2019 at 12:08 PM Sorabh Hamirwasia 
   wrote:
   
   > @jcmcote  - Are you planning to re-open this
   > PR ?
   


[jira] [Resolved] (DRILL-6430) Drill Should Not Fail If It Sees Deprecated Options Stored In Zookeeper Or Locally

2019-03-19 Thread Bohdan Kazydub (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bohdan Kazydub resolved DRILL-6430.
---
   Resolution: Done
Fix Version/s: (was: 1.17.0)
   1.16.0

> Drill Should Not Fail If It Sees Deprecated Options Stored In Zookeeper Or 
> Locally
> --
>
> Key: DRILL-6430
> URL: https://issues.apache.org/jira/browse/DRILL-6430
> Project: Apache Drill
>  Issue Type: Improvement
>Reporter: Timothy Farkas
>Assignee: Bohdan Kazydub
>Priority: Major
> Fix For: 1.16.0
>
>
> This is required for resource management since we will likely remove many 
> options.



Roadmap for Drill 2.0 and beyond?

2019-03-19 Thread Nai Yan.
Greetings, 
   I was wondering whether I could get an idea of Drill's future roadmap, 
such as the 2.0 release and beyond?

   Thanks. 



Nai Yan
 


Re: Re: Any update for Superset support for Drill?

2019-03-19 Thread Nai Yan.