[GitHub] kkhatua commented on issue #1608: DRILL-6960: Auto Limit Wrapping should not apply to non-select query
kkhatua commented on issue #1608: DRILL-6960: Auto Limit Wrapping should not apply to non-select query
URL: https://github.com/apache/drill/pull/1608#issuecomment-460460710

@ihuzenko I've updated with the changes, then squashed and rebased onto the current master. Please mark the Apache JIRA with a `ready-to-commit` tag if this looks alright.

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] kkhatua commented on a change in pull request #1608: DRILL-6960: Auto Limit Wrapping should not apply to non-select query
kkhatua commented on a change in pull request #1608: DRILL-6960: Auto Limit Wrapping should not apply to non-select query
URL: https://github.com/apache/drill/pull/1608#discussion_r253686311

## File path: exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/DrillSqlWorker.java

```diff
@@ -47,6 +47,7 @@ public class DrillSqlWorker {
   private static final org.slf4j.Logger logger = org.slf4j.LoggerFactory.getLogger(DrillSqlWorker.class);
   private static final ControlsInjector injector = ControlsInjectorFactory.getInjector(DrillSqlWorker.class);
+  private static final String NEW_LINE = System.getProperty("line.separator");
```

Review comment: Done
[GitHub] kkhatua commented on a change in pull request #1608: DRILL-6960: Auto Limit Wrapping should not apply to non-select query
kkhatua commented on a change in pull request #1608: DRILL-6960: Auto Limit Wrapping should not apply to non-select query
URL: https://github.com/apache/drill/pull/1608#discussion_r253686299

## File path: exec/java-exec/src/main/java/org/apache/drill/exec/server/rest/QueryResources.java

```diff
@@ -83,10 +83,12 @@ public QueryResult submitQueryJSON(QueryWrapper query) throws Exception {
   @Consumes(MediaType.APPLICATION_FORM_URLENCODED)
   @Produces(MediaType.TEXT_HTML)
   public Viewable submitQuery(@FormParam("query") String query,
-                              @FormParam("queryType") String queryType) throws Exception {
+                              @FormParam("queryType") String queryType,
+                              @FormParam("autoLimit") String autoLimit
+                              ) throws Exception {
    try {
      final String trimmedQueryString = CharMatcher.is(';').trimTrailingFrom(query.trim());
-     final QueryResult result = submitQueryJSON(new QueryWrapper(trimmedQueryString, queryType));
+     final QueryResult result = submitQueryJSON(new QueryWrapper(trimmedQueryString, queryType, autoLimit)); //QueryWrapper will validate autoLimit
```

Review comment: Ok
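The diff above passes the raw `autoLimit` form value through to `QueryWrapper`, which is said to validate it. As a rough illustration only — the helper below is hypothetical and is not Drill's actual `QueryWrapper` logic — such validation might turn the raw string into either a positive row count or `null`:

```java
public class AutoLimitParamDemo {

    // Hypothetical validation of the "autoLimit" REST form parameter.
    // Drill's QueryWrapper does its own validation; this only illustrates
    // the idea of mapping a raw string to a positive row count or null.
    public static Integer validateAutoLimit(String autoLimit) {
        if (autoLimit == null || autoLimit.trim().isEmpty()) {
            return null; // parameter absent: no auto-limit
        }
        try {
            int rows = Integer.parseInt(autoLimit.trim());
            return rows > 0 ? rows : null; // non-positive counts disable the limit
        } catch (NumberFormatException e) {
            return null; // malformed input is treated as "no limit"
        }
    }

    public static void main(String[] args) {
        System.out.println(validateAutoLimit("1000")); // 1000
        System.out.println(validateAutoLimit("abc"));  // null
    }
}
```

Returning `null` for absent or malformed input matches the convention implied elsewhere in this thread, where `QueryContext.isAutoLimitEnabled()` simply checks `autoLimitRowCount != null`.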
[GitHub] kkhatua commented on a change in pull request #1608: DRILL-6960: Auto Limit Wrapping should not apply to non-select query
kkhatua commented on a change in pull request #1608: DRILL-6960: Auto Limit Wrapping should not apply to non-select query
URL: https://github.com/apache/drill/pull/1608#discussion_r253686115

## File path: exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/DrillSqlWorker.java

```diff
@@ -183,4 +184,23 @@ private static PhysicalPlan getQueryPlan(QueryContext context, String sql, Point
     return handler.getPlan(sqlNode);
   }
+
+  //Wrap with Auto Limit
+  private static SqlNode wrapWithAutoLimit(SqlConverter parser, QueryContext context, String sql) {
+    SqlNode sqlNode = parser.parse(sql);
+    if (context.isAutoLimitEnabled() && sqlNode.getKind().belongsTo(SqlKind.QUERY)) {
+      int autoLimitRowCount = context.getAutoLimitRowCount();
```

Review comment: +1
[GitHub] kkhatua commented on a change in pull request #1608: DRILL-6960: Auto Limit Wrapping should not apply to non-select query
kkhatua commented on a change in pull request #1608: DRILL-6960: Auto Limit Wrapping should not apply to non-select query
URL: https://github.com/apache/drill/pull/1608#discussion_r253685826

## File path: exec/java-exec/src/main/java/org/apache/drill/exec/ops/QueryContext.java

```diff
@@ -273,6 +279,29 @@ public RemoteFunctionRegistry getRemoteFunctionRegistry() {
     return drillbitContext.getRemoteFunctionRegistry();
   }
+
+  /**
+   * Check if auto-limiting of resultset is enabled
+   * @return
```

Review comment: Done
[GitHub] kkhatua commented on a change in pull request #1608: DRILL-6960: Auto Limit Wrapping should not apply to non-select query
kkhatua commented on a change in pull request #1608: DRILL-6960: Auto Limit Wrapping should not apply to non-select query
URL: https://github.com/apache/drill/pull/1608#discussion_r253685784

## File path: exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/DrillSqlWorker.java

```diff
@@ -183,4 +184,23 @@ private static PhysicalPlan getQueryPlan(QueryContext context, String sql, Point
     return handler.getPlan(sqlNode);
   }
+
+  //Wrap with Auto Limit
+  private static SqlNode wrapWithAutoLimit(SqlConverter parser, QueryContext context, String sql) {
+    SqlNode sqlNode = parser.parse(sql);
+    if (context.isAutoLimitEnabled() && sqlNode.getKind().belongsTo(SqlKind.QUERY)) {
```

Review comment: Done
[GitHub] kkhatua commented on a change in pull request #1608: DRILL-6960: Auto Limit Wrapping should not apply to non-select query
kkhatua commented on a change in pull request #1608: DRILL-6960: Auto Limit Wrapping should not apply to non-select query
URL: https://github.com/apache/drill/pull/1608#discussion_r253685842

## File path: exec/java-exec/src/main/java/org/apache/drill/exec/ops/QueryContext.java

```diff
@@ -273,6 +279,29 @@ public RemoteFunctionRegistry getRemoteFunctionRegistry() {
     return drillbitContext.getRemoteFunctionRegistry();
   }
+
+  /**
+   * Check if auto-limiting of resultset is enabled
+   * @return
+   */
+  public boolean isAutoLimitEnabled() {
+    return autoLimitRowCount != null;
+  }
+
+  /**
+   * Returns the maximum size of auto-limited resultset
+   * @return
```

Review comment: Done.
[jira] [Created] (DRILL-7026) Support Resource Management across queries in Drill
Sorabh Hamirwasia created DRILL-7026:

Summary: Support Resource Management across queries in Drill
Key: DRILL-7026
URL: https://issues.apache.org/jira/browse/DRILL-7026
Project: Apache Drill
Issue Type: Task
Components: Execution - Flow, Query Planning & Optimization
Affects Versions: 1.16.0
Reporter: Sorabh Hamirwasia
Assignee: Sorabh Hamirwasia
Fix For: Future

Tracks support for dynamic memory resource management across queries using the current cluster state. It will also improve multi-tenancy support by allowing multiple named queues to be configured for a Drill cluster.

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (DRILL-6972) Error: Could not find or load main class sqlline.SqlLine
[ https://issues.apache.org/jira/browse/DRILL-6972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Volodymyr Vysotskyi resolved DRILL-6972.
Resolution: Fixed

Fixed in the scope of DRILL-6985.

> Error: Could not find or load main class sqlline.SqlLine
>
> Key: DRILL-6972
> URL: https://issues.apache.org/jira/browse/DRILL-6972
> Project: Apache Drill
> Issue Type: Bug
> Components: Functions - Drill
> Affects Versions: 1.15.0
> Environment: Windows 10 Education
> 64-bit OS
> Reporter: Elena Campean
> Assignee: Volodymyr Vysotskyi
> Priority: Major
> Labels: apache, drill, start
> Fix For: 1.16.0
>
> I am running in cmd the command: sqlline.bat -u "jdbc:drill:zk=local", in
> order to start drill, but I get the result:
> _DRILL_ARGS - " -u jdbc:drill:zk=local"_
> _The system cannot find the path specified._
> _HADOOP_HOME not detected..._
> _HBASE_HOME not detected..._
> _Calculating Drill classpath..._
> _Error: Could not find or load main class sqlline.SqlLine_
[GitHub] ihuzenko commented on a change in pull request #1608: DRILL-6960: Auto Limit Wrapping should not apply to non-select query
ihuzenko commented on a change in pull request #1608: DRILL-6960: Auto Limit Wrapping should not apply to non-select query
URL: https://github.com/apache/drill/pull/1608#discussion_r253517907

## File path: exec/java-exec/src/main/java/org/apache/drill/exec/server/rest/QueryResources.java

```diff
@@ -83,10 +83,12 @@ public QueryResult submitQueryJSON(QueryWrapper query) throws Exception {
   @Consumes(MediaType.APPLICATION_FORM_URLENCODED)
   @Produces(MediaType.TEXT_HTML)
   public Viewable submitQuery(@FormParam("query") String query,
-                              @FormParam("queryType") String queryType) throws Exception {
+                              @FormParam("queryType") String queryType,
+                              @FormParam("autoLimit") String autoLimit
+                              ) throws Exception {
    try {
      final String trimmedQueryString = CharMatcher.is(';').trimTrailingFrom(query.trim());
-     final QueryResult result = submitQueryJSON(new QueryWrapper(trimmedQueryString, queryType));
+     final QueryResult result = submitQueryJSON(new QueryWrapper(trimmedQueryString, queryType, autoLimit)); //QueryWrapper will validate autoLimit
```

Review comment: I think the comment is redundant and may be removed:
```suggestion
      final QueryResult result = submitQueryJSON(new QueryWrapper(trimmedQueryString, queryType, autoLimit));
```
[GitHub] ihuzenko commented on a change in pull request #1608: DRILL-6960: Auto Limit Wrapping should not apply to non-select query
ihuzenko commented on a change in pull request #1608: DRILL-6960: Auto Limit Wrapping should not apply to non-select query
URL: https://github.com/apache/drill/pull/1608#discussion_r253514227

## File path: exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/DrillSqlWorker.java

```diff
@@ -183,4 +184,23 @@ private static PhysicalPlan getQueryPlan(QueryContext context, String sql, Point
     return handler.getPlan(sqlNode);
   }
+
+  //Wrap with Auto Limit
+  private static SqlNode wrapWithAutoLimit(SqlConverter parser, QueryContext context, String sql) {
+    SqlNode sqlNode = parser.parse(sql);
+    if (context.isAutoLimitEnabled() && sqlNode.getKind().belongsTo(SqlKind.QUERY)) {
```

Review comment: `context.isAutoLimitEnabled()` is now checked before the method invocation and may be removed here:
```suggestion
    if (sqlNode.getKind().belongsTo(SqlKind.QUERY)) {
```
[GitHub] ihuzenko commented on a change in pull request #1608: DRILL-6960: Auto Limit Wrapping should not apply to non-select query
ihuzenko commented on a change in pull request #1608: DRILL-6960: Auto Limit Wrapping should not apply to non-select query
URL: https://github.com/apache/drill/pull/1608#discussion_r253517282

## File path: exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/DrillSqlWorker.java

```diff
@@ -183,4 +184,23 @@ private static PhysicalPlan getQueryPlan(QueryContext context, String sql, Point
     return handler.getPlan(sqlNode);
   }
+
+  //Wrap with Auto Limit
+  private static SqlNode wrapWithAutoLimit(SqlConverter parser, QueryContext context, String sql) {
+    SqlNode sqlNode = parser.parse(sql);
+    if (context.isAutoLimitEnabled() && sqlNode.getKind().belongsTo(SqlKind.QUERY)) {
+      int autoLimitRowCount = context.getAutoLimitRowCount();
```

Review comment: The StringBuilder code within the `if () { ... }` block may be simplified to:
```java
String wrappedSql = String.format("-- [autoLimit: %2$d rows]%nSELECT * FROM (%n %1$s %n) LIMIT %2$d", sql, context.getAutoLimitRowCount());
logger.debug("Query text to execute: {}", wrappedSql);
sqlNode = parser.parse(wrappedSql);
```
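The suggestion above relies on two `String.format` features: positional specifiers (`%1$s`, `%2$d`) let the same argument be reused without passing it twice, and `%n` expands to the platform line separator, which is what makes the separate `NEW_LINE` constant unnecessary. A standalone sketch (the class name is invented for illustration):

```java
public class AutoLimitFormatDemo {

    // Mirrors the reviewer's suggested format string: %1$s is the original
    // SQL, %2$d is the auto-limit row count (used twice via argument index),
    // and %n is the platform line separator (replacing a NEW_LINE constant).
    public static String wrapWithAutoLimit(String sql, int autoLimitRowCount) {
        return String.format(
                "-- [autoLimit: %2$d rows]%nSELECT * FROM (%n %1$s %n) LIMIT %2$d",
                sql, autoLimitRowCount);
    }

    public static void main(String[] args) {
        System.out.println(wrapWithAutoLimit("SELECT id FROM t", 1000));
    }
}
```

With an input of `SELECT id FROM t` and a limit of 1000, the wrapped text starts with the `-- [autoLimit: 1000 rows]` comment and ends with `LIMIT 1000`.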
[GitHub] ihuzenko commented on a change in pull request #1608: DRILL-6960: Auto Limit Wrapping should not apply to non-select query
ihuzenko commented on a change in pull request #1608: DRILL-6960: Auto Limit Wrapping should not apply to non-select query
URL: https://github.com/apache/drill/pull/1608#discussion_r253497690

## File path: exec/java-exec/src/main/java/org/apache/drill/exec/ops/QueryContext.java

```diff
@@ -273,6 +279,29 @@ public RemoteFunctionRegistry getRemoteFunctionRegistry() {
     return drillbitContext.getRemoteFunctionRegistry();
   }
+
+  /**
+   * Check if auto-limiting of resultset is enabled
+   * @return
+   */
+  public boolean isAutoLimitEnabled() {
+    return autoLimitRowCount != null;
+  }
+
+  /**
+   * Returns the maximum size of auto-limited resultset
+   * @return
```

Review comment: A description of the returned value usually follows `@return`. Please check the existing Drill javadocs and add yours.
[GitHub] ihuzenko commented on a change in pull request #1608: DRILL-6960: Auto Limit Wrapping should not apply to non-select query
ihuzenko commented on a change in pull request #1608: DRILL-6960: Auto Limit Wrapping should not apply to non-select query
URL: https://github.com/apache/drill/pull/1608#discussion_r253497482

## File path: exec/java-exec/src/main/java/org/apache/drill/exec/ops/QueryContext.java

```diff
@@ -273,6 +279,29 @@ public RemoteFunctionRegistry getRemoteFunctionRegistry() {
     return drillbitContext.getRemoteFunctionRegistry();
   }
+
+  /**
+   * Check if auto-limiting of resultset is enabled
+   * @return
```

Review comment: A description of the returned value usually follows `@return`. Please check the existing Drill javadocs and add yours.
[GitHub] ihuzenko commented on a change in pull request #1608: DRILL-6960: Auto Limit Wrapping should not apply to non-select query
ihuzenko commented on a change in pull request #1608: DRILL-6960: Auto Limit Wrapping should not apply to non-select query
URL: https://github.com/apache/drill/pull/1608#discussion_r253520647

## File path: exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/DrillSqlWorker.java

```diff
@@ -47,6 +47,7 @@ public class DrillSqlWorker {
   private static final org.slf4j.Logger logger = org.slf4j.LoggerFactory.getLogger(DrillSqlWorker.class);
   private static final ControlsInjector injector = ControlsInjectorFactory.getInjector(DrillSqlWorker.class);
+  private static final String NEW_LINE = System.getProperty("line.separator");
```

Review comment: May be removed after changing `StringBuilder` to `String.format` in the `wrapWithAutoLimit(parser, context, sql)` method. See the comments below for details.
[GitHub] ihuzenko commented on a change in pull request #1608: DRILL-6960: Auto Limit Wrapping should not apply to non-select query
ihuzenko commented on a change in pull request #1608: DRILL-6960: Auto Limit Wrapping should not apply to non-select query
URL: https://github.com/apache/drill/pull/1608#discussion_r252687106

## File path: exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/DrillSqlWorker.java

```diff
@@ -183,4 +183,24 @@ private static PhysicalPlan getQueryPlan(QueryContext context, String sql, Point
     return handler.getPlan(sqlNode);
   }
+
```

Review comment: I'd like to replace the constant and method defined below with:
```suggestion
  private static boolean applyAutoLimit(SqlNode sqlNode, SqlKind kind, Integer autoLimit) {
    if (kind == SqlKind.SELECT) {
      SqlNumericLiteral fetch = SqlNumericLiteral.createExactNumeric(autoLimit.toString(), sqlNode.getParserPosition());
      ((SqlSelect) sqlNode).setFetch(fetch);
      return true;
    } else if (sqlNode instanceof SqlWith) {
      SqlNode withBody = ((SqlWith) sqlNode).body;
      return applyAutoLimit(withBody, withBody.getKind(), autoLimit);
    }
    return false;
  }
```
The suggested method avoids double parsing of the SQL string and handles WITH queries.
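The key point of the suggested method is the recursion into the body of a WITH node, so that a `WITH ... SELECT` query still gets the limit applied to its inner SELECT, while non-query statements return false and are left untouched. The toy model below (simplified stand-in classes, not Calcite's actual `SqlNode` hierarchy) shows the same control flow:

```java
// Simplified stand-ins for Calcite's SqlNode/SqlSelect/SqlWith, used only
// to illustrate the recursion pattern in the suggested applyAutoLimit.
class Node { }

class Select extends Node {
    Integer fetch; // analogous to SqlSelect's fetch (LIMIT) attribute
}

class With extends Node {
    final Node body; // a WITH statement wraps an inner query body
    With(Node body) { this.body = body; }
}

public class AutoLimitRecursionDemo {

    // Returns true if a limit was applied: directly for a SELECT,
    // recursively through the body for a WITH, and false for any other
    // node (DDL, SET, ...), which is why non-select queries are never
    // wrapped with an auto limit.
    public static boolean applyAutoLimit(Node node, int autoLimit) {
        if (node instanceof Select) {
            ((Select) node).fetch = autoLimit;
            return true;
        } else if (node instanceof With) {
            return applyAutoLimit(((With) node).body, autoLimit);
        }
        return false;
    }

    public static void main(String[] args) {
        Select select = new Select();
        System.out.println(applyAutoLimit(new With(select), 100)); // limit reaches the inner SELECT
        System.out.println(applyAutoLimit(new Node(), 100));       // non-query node: untouched
    }
}
```

Note that the recursion terminates because each step strictly descends into the WITH body; a single level is the common case, but nested WITH wrappers are handled the same way.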
[GitHub] ihuzenko commented on a change in pull request #1608: DRILL-6960: Auto Limit Wrapping should not apply to non-select query
ihuzenko commented on a change in pull request #1608: DRILL-6960: Auto Limit Wrapping should not apply to non-select query
URL: https://github.com/apache/drill/pull/1608#discussion_r252682867

## File path: exec/java-exec/src/main/java/org/apache/drill/exec/ops/QueryContext.java

```diff
@@ -273,6 +279,21 @@ public RemoteFunctionRegistry getRemoteFunctionRegistry() {
     return drillbitContext.getRemoteFunctionRegistry();
   }
+
+  public boolean isAutoLimitEnabled() {
+    return (autoLimitRowCount != null);
+  }
+
+  public Integer getAutoLimitRowCount() {
```

Review comment: add javadoc
[GitHub] arina-ielchiieva commented on a change in pull request #1633: DRILL-7024: Refactor ColumnWriter to simplify type-conversion shim
arina-ielchiieva commented on a change in pull request #1633: DRILL-7024: Refactor ColumnWriter to simplify type-conversion shim
URL: https://github.com/apache/drill/pull/1633#discussion_r253486041

## File path: exec/java-exec/src/main/java/org/apache/drill/exec/physical/rowSet/impl/ColumnState.java

```diff
@@ -47,19 +48,19 @@
    * Column metadata is hosted on the writer.
    */
-  public static class PrimitiveColumnState extends ColumnState implements ColumnWriterListener {
+  public static class PrimitiveColumnState extends ColumnState implements AbstractScalarWriterImpl.ColumnWriterListener {

     public PrimitiveColumnState(LoaderInternals loader,
         AbstractObjectWriter colWriter,
         VectorState vectorState) {
       super(loader, colWriter, vectorState);
-      ScalarWriter scalarWriter;
+      WriterEvents scalarWriter;
       if (colWriter.type() == ObjectType.ARRAY) {
-        scalarWriter = writer.array().scalar();
+        scalarWriter = writer.array().entry().events();
       } else {
-        scalarWriter = writer.scalar();
+        scalarWriter = writer.events();
       }
-      scalarWriter.bindListener(this);
+      ((AbstractScalarWriterImpl) scalarWriter).bindListener(this);
```

Review comment: Can the cast be avoided?
[GitHub] arina-ielchiieva commented on a change in pull request #1633: DRILL-7024: Refactor ColumnWriter to simplify type-conversion shim
arina-ielchiieva commented on a change in pull request #1633: DRILL-7024: Refactor ColumnWriter to simplify type-conversion shim URL: https://github.com/apache/drill/pull/1633#discussion_r253485911 ## File path: exec/vector/src/main/java/org/apache/drill/exec/vector/accessor/writer/AbstractScalarWriterImpl.java ## @@ -0,0 +1,186 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.drill.exec.vector.accessor.writer; + +import org.apache.drill.exec.record.metadata.ColumnMetadata; +import org.apache.drill.exec.vector.BaseDataValueVector; +import org.apache.drill.exec.vector.accessor.ColumnConversionFactory; +import org.apache.drill.exec.vector.accessor.ColumnWriter; +import org.apache.drill.exec.vector.accessor.ColumnWriterIndex; +import org.apache.drill.exec.vector.accessor.ObjectType; +import org.apache.drill.exec.vector.accessor.ScalarWriter; +import org.apache.drill.exec.vector.accessor.impl.HierarchicalFormatter; + +/** + * Column writer implementation that acts as the basis for the + * generated, vector-specific implementations. All set methods + * throw an exception; subclasses simply override the supported + * method(s). 
+ */ + +public abstract class AbstractScalarWriterImpl extends AbstractScalarWriter implements WriterEvents { + + /** + * Wraps a scalar writer and its event handler to provide a uniform + * JSON-like interface for all writer types. + * + * The client sees only the scalar writer API. But, internals need + * visibility into a rather complex set of events to orchestrate + * vector events: mostly sent to the writer, but some times sent + * from the writer, such as vector overflow. Separating the two + * concepts makes it easier to add type-conversion shims on top of + * the actual vector writer. + */ + public static class ScalarObjectWriter extends AbstractObjectWriter { + +private final WriterEvents writerEvents; +private ScalarWriter scalarWriter; + +public ScalarObjectWriter(AbstractScalarWriterImpl scalarWriter) { + final ColumnMetadata metadata = scalarWriter.schema(); + final ColumnConversionFactory factory = metadata.typeConverter(); + writerEvents = scalarWriter; + if (factory == null) { +this.scalarWriter = scalarWriter; + } else { +this.scalarWriter = factory.newWriter(metadata, scalarWriter); + } +} + +@Override +public ScalarWriter scalar() { return scalarWriter; } + +@Override +public ColumnWriter writer() { return scalarWriter; } + +@Override +public WriterEvents events() { return writerEvents; } + +@Override +public void dump(HierarchicalFormatter format) { + format +.startObject(this) +.attribute("scalarWriter"); + writerEvents.dump(format); + format.endObject(); +} + } + + /** + * Listener (callback) for vector overflow events. To be optionally + * implemented and bound by the client code of the writer. If no + * listener is bound, and a vector overflows, then an exception is + * thrown. + */ + + public static interface ColumnWriterListener { + +/** + * Alert the listener that a vector has overflowed. Upon return, + * all writers must have a new set of buffers available, ready + * to accept the in-flight value that triggered the overflow. 
+ * + * @param writer the writer that triggered the overflow + */ + +void overflowed(ScalarWriter writer); + +/** + * A writer wants to expand its vector. Allows the listener to + * either allow the growth, or trigger and overflow to limit + * batch size. + * + * @param writer the writer that wishes to grow its vector + * @param delta the amount by which the vector is to grow + * @return true if the vector can be grown, false if the writer + * should instead trigger an overflow by calling + * overflowed() + */ + +boolean canExpand(ScalarWriter writer, int delta); + } + + protected ColumnMetadata schema; + + /** + * Indicates the position in the vector to write. Set via an object so that + * all writers (within the same subtree) can agree on the write position. + * For example, all top-l
[GitHub] arina-ielchiieva commented on a change in pull request #1618: DRILL-6950: Row set-based scan framework
arina-ielchiieva commented on a change in pull request #1618: DRILL-6950: Row set-based scan framework
URL: https://github.com/apache/drill/pull/1618#discussion_r253460539

## File path: exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/scan/TestFileMetadataColumnParser.java

```diff
@@ -154,93 +148,85 @@ public void testWildcard() {
         Lists.newArrayList(metadataManager.projectionParser()));
     List cols = scanProj.columns();
-    assertEquals(7, cols.size());
+    assertEquals(1, cols.size());
     assertEquals(UnresolvedColumn.WILDCARD, cols.get(0).nodeType());
-    for (int i = 0; i < 4; i++) {
-      assertEquals(FileMetadataColumn.ID, cols.get(1+i).nodeType());
-    }
-    assertEquals(PartitionColumn.ID, cols.get(5).nodeType());
-    assertEquals(PartitionColumn.ID, cols.get(6).nodeType());
   }

   /**
-   * Drill 1.1 - 1.11 and Drill 1.13 or later put metadata columns after
-   * data columns. Drill 1.12 moved them before data columns. For testing
-   * and compatibility, the client can request to use the Drill 1.12 position,
-   * though the after-data position is the default.
+   * Combine wildcard and file metadata columms. The wildcard expands
+   * table columns but not metadata columns.
    */
   @Test
-  public void testDrill1_12Wildcard() {
+  public void testWildcardAndFileMetaata() {
     Path filePath = new Path("hdfs:///w/x/y/z.csv");
     FileMetadataManager metadataManager = new FileMetadataManager(
-        fixture.getOptionManager(), true,
+        fixture.getOptionManager(),
         new Path("hdfs:///w"),
         Lists.newArrayList(filePath));
     ScanLevelProjection scanProj = new ScanLevelProjection(
-        RowSetTestUtils.projectAll(),
-        Lists.newArrayList(metadataManager.projectionParser()),
-        true);
+        RowSetTestUtils.projectList(
+            SchemaPath.DYNAMIC_STAR,
+            ScanTestUtils.FILE_NAME_COL,
+            ScanTestUtils.SUFFIX_COL),
+        Lists.newArrayList(metadataManager.projectionParser()));
     List cols = scanProj.columns();
-    assertEquals(7, cols.size());
-    for (int i = 0; i < 4; i++) {
-      assertEquals(FileMetadataColumn.ID, cols.get(i).nodeType());
-    }
-    assertEquals(PartitionColumn.ID, cols.get(4).nodeType());
-    assertEquals(PartitionColumn.ID, cols.get(5).nodeType());
-    assertEquals(UnresolvedColumn.WILDCARD, cols.get(6).nodeType());
+    assertEquals(3, cols.size());
+    assertEquals(UnresolvedColumn.WILDCARD, cols.get(0).nodeType());
+    assertEquals(FileMetadataColumn.ID, cols.get(1).nodeType());
+    assertEquals(FileMetadataColumn.ID, cols.get(2).nodeType());
   }

   /**
-   * Can't explicitly list file metadata columns with a wildcard in
-   * "legacy" mode: that is, when the wildcard already includes partition
-   * and file metadata columns.
+   * As above, but include implicit columns before and after the
+   * wildcard.
    */
   @Test
-  public void testErrorWildcardLegacyAndFileMetaata() {
-
+  public void testWildcardAndFileMetaataMixed() {
```

Review comment: `Metaata` -> `Metadata`
[GitHub] arina-ielchiieva commented on a change in pull request #1618: DRILL-6950: Row set-based scan framework
arina-ielchiieva commented on a change in pull request #1618: DRILL-6950: Row set-based scan framework
URL: https://github.com/apache/drill/pull/1618#discussion_r253463714

## File path: exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/scan/TestColumnsArray.java

```diff
@@ -54,7 +54,7 @@ public void testColumnsArray() {
     Path filePath = new Path("hdfs:///w/x/y/z.csv");
```

Review comment: Shouldn't we use `file:///` in tests rather than `hdfs:///`?
[GitHub] arina-ielchiieva commented on a change in pull request #1618: DRILL-6950: Row set-based scan framework
arina-ielchiieva commented on a change in pull request #1618: DRILL-6950: Row set-based scan framework
URL: https://github.com/apache/drill/pull/1618#discussion_r253463105

## File path: exec/java-exec/src/test/java/org/apache/drill/exec/physical/rowSet/impl/RowSetTestUtils.java

```diff
@@ -30,11 +29,9 @@ private RowSetTestUtils() { }

   public static List projectList(String... names) {
     List selected = new ArrayList<>();
-    for (String name: names) {
-
-      // Parse from string does not handle wildcards.
-
-      if (name.equals(SchemaPath.DYNAMIC_STAR)) {
+    for (String name : names) {
+      if (name.equals(SchemaPath.DYNAMIC_STAR) ||
+          name.equals("*")) {
```

Review comment: As far as I understand, the star column is replaced with `**` in Calcite, so do we need to check for `*` as well?
[GitHub] arina-ielchiieva commented on a change in pull request #1618: DRILL-6950: Row set-based scan framework
arina-ielchiieva commented on a change in pull request #1618: DRILL-6950: Row set-based scan framework
URL: https://github.com/apache/drill/pull/1618#discussion_r253460360

## File path: exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/scan/TestFileMetadataColumnParser.java

```diff
@@ -154,93 +148,85 @@ public void testWildcard() {
         Lists.newArrayList(metadataManager.projectionParser()));
     List cols = scanProj.columns();
-    assertEquals(7, cols.size());
+    assertEquals(1, cols.size());
     assertEquals(UnresolvedColumn.WILDCARD, cols.get(0).nodeType());
-    for (int i = 0; i < 4; i++) {
-      assertEquals(FileMetadataColumn.ID, cols.get(1+i).nodeType());
-    }
-    assertEquals(PartitionColumn.ID, cols.get(5).nodeType());
-    assertEquals(PartitionColumn.ID, cols.get(6).nodeType());
   }

   /**
-   * Drill 1.1 - 1.11 and Drill 1.13 or later put metadata columns after
-   * data columns. Drill 1.12 moved them before data columns. For testing
-   * and compatibility, the client can request to use the Drill 1.12 position,
-   * though the after-data position is the default.
+   * Combine wildcard and file metadata columms. The wildcard expands
+   * table columns but not metadata columns.
    */
   @Test
-  public void testDrill1_12Wildcard() {
+  public void testWildcardAndFileMetaata() {
```

Review comment: `Metaata` -> `Metadata`
[GitHub] arina-ielchiieva commented on a change in pull request #1618: DRILL-6950: Row set-based scan framework
arina-ielchiieva commented on a change in pull request #1618: DRILL-6950: Row set-based scan framework URL: https://github.com/apache/drill/pull/1618#discussion_r253459701 ## File path: exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/scan/ScanTestUtils.java ## @@ -104,4 +108,23 @@ public static TupleMetadata schema(ResolvedTuple output) { } return schema; } + + public static List expandMetadata(int dirCount) { +List selected = Lists.newArrayList( +FULLY_QUALIFIED_NAME_COL, +FILE_PATH_COL, +FILE_NAME_COL, +SUFFIX_COL); + +for (int i = 0; i < dirCount; i++) { + selected.add(PARTITION_COL + Integer.toString(i)); +} +return RowSetTestUtils.projectList(selected); + } + + public static List projectAllWithMetadata(int dirCount) { +return RowSetTestUtils.concat( +RowSetTestUtils.projectAll(), +expandMetadata(2)); Review comment: Maybe we should pass `dirCount` instead of `2`?
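The reviewer's suggestion is to thread `dirCount` through rather than hard-coding `2`. A simplified sketch of the corrected helpers, using plain strings in place of Drill's column-name constants (all names here are illustrative):

```java
import java.util.ArrayList;
import java.util.List;

public class MetadataSketch {
  // Stand-ins for FULLY_QUALIFIED_NAME_COL, FILE_PATH_COL, etc.
  public static List<String> expandMetadata(int dirCount) {
    List<String> selected = new ArrayList<>(
        List.of("fqn", "filepath", "filename", "suffix"));
    // One partition column per directory level: dir0, dir1, ...
    for (int i = 0; i < dirCount; i++) {
      selected.add("dir" + i);
    }
    return selected;
  }

  public static List<String> projectAllWithMetadata(int dirCount) {
    List<String> selected = new ArrayList<>();
    selected.add("**"); // the wildcard projection
    selected.addAll(expandMetadata(dirCount)); // pass dirCount, not 2
    return selected;
  }

  public static void main(String[] args) {
    // 1 wildcard + 4 metadata columns + 3 partition columns
    System.out.println(projectAllWithMetadata(3).size()); // 8
  }
}
```

With the parameter threaded through, callers requesting a different directory depth get a consistent projection list instead of silently always getting two partition columns.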
[GitHub] arina-ielchiieva commented on a change in pull request #1618: DRILL-6950: Row set-based scan framework
arina-ielchiieva commented on a change in pull request #1618: DRILL-6950: Row set-based scan framework URL: https://github.com/apache/drill/pull/1618#discussion_r253465368 ## File path: exec/java-exec/src/test/java/org/apache/drill/exec/store/easy/text/compliant/TestCsv.java ## @@ -151,12 +151,77 @@ private String makeStatement(String fileName) { return "SELECT * FROM `dfs.data`.`" + fileName + "`"; } - private void buildFile(String fileName, String[] data) throws IOException { + private static void buildFile(String fileName, String[] data) throws IOException { try (PrintWriter out = new PrintWriter(new FileWriter(new File(testDir, fileName)))) { for (String line : data) { out.println(line); } } } + /** + * Verify that the wildcard expands columns to the header names, including + * case + */ + @Test + public void testWildcard() throws IOException { +String sql = "SELECT * FROM `dfs.data`.`%s`"; +RowSet actual = client.queryBuilder().sql(sql, CASE2_FILE_NAME).rowSet(); + +BatchSchema expectedSchema = new SchemaBuilder() +.add("a", MinorType.VARCHAR) +.add("b", MinorType.VARCHAR) +.add("c", MinorType.VARCHAR) +.build(); +assertTrue(expectedSchema.isEquivalent(actual.batchSchema())); + +RowSet expected = new RowSetBuilder(client.allocator(), expectedSchema) +.addRow("10", "foo", "bar") +.build(); +RowSetUtilities.verify(expected, actual); + } + + /** + * Verify that implicit columns are recognized and populated. Sanity test + * of just one implicit column. + */ + @Test + public void testImplicitColsExplicitSelect() throws IOException { +String sql = "SELECT A, filename FROM `dfs.data`.`%s`"; +RowSet actual = client.queryBuilder().sql(sql, CASE2_FILE_NAME).rowSet(); + +BatchSchema expectedSchema = new SchemaBuilder() +.add("A", MinorType.VARCHAR) +.addNullable("filename", MinorType.VARCHAR) Review comment: Why do we add the implicit file column as nullable? Should it be required?
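The diff above introduces a `buildFile` test helper that writes rows to a file with try-with-resources. A self-contained sketch of that pattern, with illustrative paths and class names (this is not the Drill test class itself):

```java
import java.io.File;
import java.io.FileWriter;
import java.io.IOException;
import java.io.PrintWriter;
import java.nio.file.Files;

public class BuildFileSketch {
  // Write each line of data to dir/fileName; try-with-resources guarantees
  // the writer is flushed and closed even if println throws.
  public static void buildFile(File dir, String fileName, String[] data)
      throws IOException {
    try (PrintWriter out = new PrintWriter(new FileWriter(new File(dir, fileName)))) {
      for (String line : data) {
        out.println(line);
      }
    }
  }

  public static void main(String[] args) throws IOException {
    File dir = Files.createTempDirectory("csv-test").toFile();
    buildFile(dir, "case2.csv", new String[] {"a,b,c", "10,foo,bar"});
    // Read the file back to confirm the rows round-trip.
    System.out.println(Files.readAllLines(new File(dir, "case2.csv").toPath()));
  }
}
```

Making the helper `static`, as the diff does, lets `@BeforeClass`-style setup code build fixture files once per test class rather than per instance.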
Re: January Apache Drill board report
Report has been submitted. Kind regards, Arina On Fri, Feb 1, 2019 at 3:40 PM Arina Ielchiieva wrote: > Thanks everybody for the feedback. Please see below updated report: > > = > > ## Description: > - Drill is a Schema-free SQL Query Engine for Hadoop, NoSQL and Cloud > Storage. > > ## Issues: > - There are no issues requiring board attention at this time. > > ## Activity: > - Since the last board report, Drill has released version 1.15.0, >including the following enhancements: >- Add capability to do index based planning and execution >- CROSS join support >- INFORMATION_SCHEMA FILES and FUNCTIONS were added >- Support for TIMESTAMPADD and TIMESTAMPDIFF functions >- Ability to secure znodes with custom ACLs >- Upgrade to SQLLine 1.6 >- Parquet filter pushdown for VARCHAR and DECIMAL data types >- Support JPPD (Join Predicate Push Down) >- Lateral join functionality was enabled by default >- Multiple Web UI improvements to simplify the use of options and > submit queries >- Query performance with the semi-join functionality was improved >- Support for aliases in the GROUP BY clause >- Option to prevent Drill from returning a result set for DDL > statements >- Storage plugin names became case-insensitive > > - Drill Developer Day was held on November 14, 2018: a variety of > technical design issues were discussed, including Apache Arrow > integration, > Metadata and Resource management, Storage plugins, etc. > > - Drill User Meetup was held on November 14, 2018: use cases of Drill and > indexing support were presented. > > ## Health report: > - The project is healthy. Development activity >as reflected in the pull requests and JIRAs is good. > - Activity on the dev and user mailing lists are stable. > - Three committers were added in the last period. > > ## PMC changes: > > - Currently 23 PMC members. 
> - No new PMC members added in the last 3 months > - Last PMC addition was Charles Givre on Mon Sep 03 2018 > > ## Committer base changes: > > - Currently 51 committers. > - New committers: > - Hanumath Rao Maduri was added as a committer on Thu Nov 01 2018 > - Karthikeyan Manivannan was added as a committer on Fri Dec 07 2018 > - Salim Achouche was added as a committer on Mon Dec 17 2018 > > ## Releases: > > - 1.15.0 was released on Mon Dec 31 2018 > > ## Mailing list activity: > > - dev@drill.apache.org: > - 415 subscribers (down 12 in the last 3 months): > - 2066 emails sent to list (2653 in previous quarter) > > - iss...@drill.apache.org: > - 18 subscribers (up 0 in the last 3 months): > - 2480 emails sent to list (3228 in previous quarter) > > - u...@drill.apache.org: > - 592 subscribers (down 5 in the last 3 months): > - 249 emails sent to list (310 in previous quarter) > > > ## JIRA activity: > > - 196 JIRA tickets created in the last 3 months > - 171 JIRA tickets closed/resolved in the last 3 months > > > On Fri, Feb 1, 2019 at 12:22 PM Bohdan Kazydub > wrote: > >> Hi, Arina >> >> Just to clarify, one of the options, the one "to return null for empty >> string" (`drill.exec.functions.cast_empty_string_to_null`), >> was present for some time (from Drill 0.8.0 at least) and affected CASTs >> from string to numeric types only (if the option is set to true, empty >> string in CAST is treated as NULL). >> In 1.15 the option was expanded to affect CASTs to other types and TO_* >> functions. >> >> >> On Fri, Feb 1, 2019 at 9:24 AM Abhishek Girish >> wrote: >> >> > +1. Looks good! >> > >> > On Thu, Jan 31, 2019 at 9:15 AM Vitalii Diravka >> > wrote: >> > >> > > +1 >> > > >> > > Kind regards >> > > Vitalii >> > > >> > > >> > > On Thu, Jan 31, 2019 at 6:18 PM Aman Sinha >> wrote: >> > > >> > > > Thanks for putting this together, Arina. >> > > > The Drill Developer Day and Meetup were separate events, so you can >> > split >> > > > them up. 
>> > > > - A half day Drill Developer Day was held on Nov 14. A variety of >> > > > technical design issues were discussed. >> > > > - A Drill user meetup was held on the same evening. 2 >> presentations >> > - >> > > > one on use case for Drill and one about indexing support in Drill >> were >> > > > presented. >> > > > >> > > > Rest of the report LGTM. >> > > > >> > > > -Aman >> > > > >> > > > >> > > > On Thu, Jan 31, 2019 at 7:58 AM Arina Ielchiieva >> > > wrote: >> > > > >> > > > > Hi all, >> > > > > >> > > > > please take a look at the draft board report for the last quarter >> and >> > > let >> > > > > me know if you have any comments. >> > > > > >> > > > > Thanks, >> > > > > Arina >> > > > > >> > > > > = >> > > > > >> > > > > ## Description: >> > > > > - Drill is a Schema-free SQL Query Engine for Hadoop, NoSQL and >> > Cloud >> > > > > Storage. >> > > > > >> > > > > ## Issues: >> > > > > - There are no issues requiring board attention at this time. >> > > > > >> > > > > ## Activity: >> > > > > - Since the last board r